fix(ci): disable telemetry in release workflow to fix test failures #1442
The release workflow's "Update snapshots" step fails because withCommandRunTelemetry triggers real filesystem and network I/O (createClient → getOrCreateInstallationId → mkdir/writeFile, plus OtelMetricSink construction and flush) that doesn't resolve within the 50-100ms test delays. build-and-test.yml already sets AGENTCORE_TELEMETRY_DISABLED=1 at the workflow level, preventing this. The release workflow was missing it.
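The timing problem can be modeled with a tiny virtual clock. This is only an illustrative sketch, not code from the CLI or its tests: `simulate`, the `Task` type, and the 250 ms I/O latency are assumptions chosen to show why a fixed 50 ms wait observes the loading state when telemetry I/O is slower than the test delay.

```typescript
// Hypothetical model of the race; uses virtual time instead of real timers.
type Task = { at: number; run: () => void };

function simulate(ioLatencyMs: number, assertDelayMs: number): string {
  let state = 'loading';
  let observed = '';
  const tasks: Task[] = [
    // Telemetry I/O completes, after which the component leaves its loading state.
    { at: ioLatencyMs, run: () => { state = 'ready'; } },
    // The test's setTimeout fires and reads whatever state the component is in.
    { at: assertDelayMs, run: () => { observed = state; } },
  ];
  // Run the tasks in virtual-time order.
  tasks.sort((a, b) => a.at - b.at);
  for (const task of tasks) task.run();
  return observed;
}

// Telemetry disabled: no I/O latency, so the 50 ms assertion sees 'ready'.
console.log(simulate(0, 50));
// Real I/O (assumed 250 ms here) outlasts the 50 ms delay: assertion sees 'loading'.
console.log(simulate(250, 50));
```

The model also suggests why raising the delays would only shrink the window rather than close it: any I/O latency above the chosen delay reproduces the failure.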
Claude Security Review: no high-confidence findings.
Package Tarball
How to install
gh release download pr-1442-tarball --repo aws/agentcore-cli --pattern "*.tgz" --dir /tmp/pr-tarball
npm install -g /tmp/pr-tarball/aws-agentcore-0.16.0.tgz
agentcore-cli-automation
left a comment
Looks good to merge.
The change is minimal, surgical, and brings parity with build-and-test.yml, which already sets the same env var at the workflow level. Root cause analysis in the PR description is accurate: withCommandRunTelemetry triggers real fs/network I/O via getOrCreateInstallationId() and OtelMetricSink construction that doesn't resolve within the short setTimeout delays in the affected TUI tests.
One follow-up worth considering (not blocking): the tests really shouldn't depend on a workflow-level env var to avoid hitting the network and writing to ~/.agentcore/config.json. A vitest setupFiles entry that sets AGENTCORE_TELEMETRY_DISABLED=1 (or stubs the telemetry client) would make this immune to the env getting dropped from a future workflow. But unblocking releases now with this CI parity fix is the right call.
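A minimal sketch of that follow-up, assuming the env-var route rather than stubbing the client. The file path and the `disableTelemetry` helper are hypothetical. The only names taken from this PR are the variable and its value. The file would be listed under `test.setupFiles` in the Vitest config so it runs before each test file.

```typescript
// Hypothetical setup file, e.g. test/setup-telemetry.ts (the path is an assumption).
// Intended to be referenced from the Vitest config's `test.setupFiles` array.

function disableTelemetry(env: Record<string, string | undefined>): void {
  // Same variable and value that build-and-test.yml sets at the workflow level.
  env.AGENTCORE_TELEMETRY_DISABLED = '1';
}

// Applied to the real process environment when the setup file is loaded.
disableTelemetry(process.env);
```

With something like this in place, the workflow-level variable becomes a second line of defense instead of the only one.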
Summary
- Add AGENTCORE_TELEMETRY_DISABLED: '1' to the release workflow, matching what build-and-test.yml already sets

Root Cause
The prepare-release job runs npm run test:update-snapshots (all unit tests, unsharded) without AGENTCORE_TELEMETRY_DISABLED. This causes withCommandRunTelemetry in the TUI hooks (useLogsFlow, useRemoveResource) to trigger real I/O during createClient():
- getOrCreateInstallationId() → access() + mkdir() + writeFile() to ~/.agentcore/config.json
- OtelMetricSink construction (network connection attempt)
- client.flush() in the finally block (actual export attempt)

This I/O doesn't resolve within the 50-100ms setTimeout delays used by the LogsScreen and useRemove tests, so assertions fire while components are still in their loading state. build-and-test.yml avoids this by setting AGENTCORE_TELEMETRY_DISABLED=1 at the workflow level.

Test plan
- Tests pass with AGENTCORE_TELEMETRY_DISABLED=1, fail without it
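To make the pass/fail difference concrete, here is a rough sketch of how an env-gated telemetry wrapper could skip all I/O when the flag is set. This is not the CLI's implementation: `createClientSketch`, `withTelemetrySketch`, the `TelemetryClient` interface, and the `ioCalls` counter are hypothetical, and treating only the value '1' as "disabled" is an assumption. The flush-in-finally shape follows the PR description.

```typescript
// Hypothetical stand-in for the real telemetry client.
interface TelemetryClient {
  record(command: string): void;
  flush(): void;
}

// Counts operations that would start real filesystem or network I/O.
let ioCalls = 0;

function createClientSketch(env: Record<string, string | undefined>): TelemetryClient {
  if (env.AGENTCORE_TELEMETRY_DISABLED === '1') {
    // Disabled: return a no-op client, so nothing touches disk or network.
    return { record: () => {}, flush: () => {} };
  }
  // Enabled: each call stands in for an I/O-triggering operation.
  return { record: () => { ioCalls++; }, flush: () => { ioCalls++; } };
}

function withTelemetrySketch<T>(
  env: Record<string, string | undefined>,
  command: string,
  fn: () => T,
): T {
  const client = createClientSketch(env);
  try {
    client.record(command);
    return fn();
  } finally {
    // Mirrors the flush in the finally block mentioned in the root cause.
    client.flush();
  }
}

withTelemetrySketch({ AGENTCORE_TELEMETRY_DISABLED: '1' }, 'logs', () => 'ok');
console.log(ioCalls); // no I/O was started on the disabled path
```

In this model the disabled path never starts I/O, so nothing is left pending when a short setTimeout-based assertion runs.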