feat: launch telemetry#1430
Conversation
Package TarballHow to installgh release download pr-1430-tarball --repo aws/agentcore-cli --pattern "*.tgz" --dir /tmp/pr-tarball
npm install -g /tmp/pr-tarball/aws-agentcore-0.16.0.tgz |
|
Claude Security Review: no high-confidence findings. (run) |
agentcore-cli-automation
left a comment
There was a problem hiding this comment.
Overall the launch-telemetry changes look solid — moving telemetry preferences under a generic config command, adding a Result-based read API, and validating endpoints/UUIDs are good cleanups. I flagged a few items below that I think need to be addressed before merging — primarily missing telemetry instrumentation on the new config command itself (which is required per src/cli/telemetry/README.md), and stale entries in COMMAND_SCHEMAS for the removed telemetry enable/disable subcommands.
| const result = await resolveAction(key, value)(); | ||
| printResult(result); | ||
| if (!result.success) process.exit(1); | ||
| }); |
There was a problem hiding this comment.
Missing telemetry instrumentation for the new config command.
This PR is titled "launch telemetry" and the README at src/cli/telemetry/README.md requires every new top-level command to (a) declare an attribute schema in src/cli/telemetry/schemas/command-run.ts under COMMAND_SCHEMAS, and (b) be wrapped with runCliCommand (or withCommandRunTelemetry) so a cli.command_run metric is emitted. The action handler here just runs resolveAction(...), prints, and exits — no metric is ever recorded for agentcore config ... invocations, which is precisely the kind of usage we'd want visibility into now that config is the documented way to opt in/out of telemetry.
A few options:
- Add three command keys (
config.list,config.get,config.set) toCOMMAND_SCHEMAS(most likely allNoAttrs, orconfig.setcould carry a sanitizedkeyenum if we want to track which keys users touch most), and wrap each branch withrunCliCommand. - Add a single
configkey toCOMMAND_SCHEMASand dispatch all three branches through onerunCliCommand('config', ...)call, with attrs distinguishing the action.
Option 1 is more consistent with how dataset.*, import.*, etc. are modeled today.
There was a problem hiding this comment.
will follow-up with this, I don't think its a blocker and I'd like to sync the backend schemas once I have a shape.
| export async function readGlobalConfig(configFile = GLOBAL_CONFIG_FILE): Promise<GlobalConfig> { | ||
| export function validateGlobalConfig(data: unknown): { success: boolean; error?: z.ZodError } { | ||
| return GlobalConfigSchemaStrict.safeParse(data); | ||
| } |
There was a problem hiding this comment.
validateGlobalConfig returns { success: boolean; error?: z.ZodError }, which loses the discriminated union that safeParse provides. Callers like handleConfigSet only branch on success, so this works, but in the success case any caller that wants the parsed/coerced value gets nothing — and TS won't tell you that the data field is gone. Either:
- Return
z.SafeParseReturnType<unknown, GlobalConfig>(i.e., justGlobalConfigSchemaStrict.safeParse(data)typed as-is), or - Return an explicit
Result<{ data: GlobalConfig }, ZodError>for consistency with the rest of the file.
This matters because today handleConfigSet writes partial (the user-supplied object) rather than validation.data — so if we ever add a .transform() or coercion to the schema, the validated/normalized form would be silently discarded.
There was a problem hiding this comment.
this is intentional. validate is used for validation, read is used for reading. Validate is also used to verify that new (unwritten) config shapes are valid. The purpose of the function is "is this shape a valid config", so a success boolean is sufficient.
|
Claude Security Review: no high-confidence findings. (run) |
| } from '../../lib/schemas/io/global-config'; | ||
| import { createTempConfig } from './helpers/temp-config'; | ||
| import { readFile, writeFile } from 'fs/promises'; | ||
| import assert from 'node:assert'; |
There was a problem hiding this comment.
note: we import and use assert here because it provides type narrowing and avoids verbosity in follow-up assertions on result types.
| export async function handleConfigGet(key: string): Promise<ConfigResult> { | ||
| const read = await readGlobalConfig(); | ||
| if (!read.success) { | ||
| return { success: false, error: new Error(`Error: Unable to parse config file at ${GLOBAL_CONFIG_FILE}`) }; |
There was a problem hiding this comment.
note: planning to follow-up with some telemetry on the config command and make this a custom error. Could use ValidationError, but may want something more specific to the file system.
|
|
||
| For more information on what exactly is captured, see the schemas, which | ||
| include all attributes and metrics captured: | ||
| https://github.com/aws/agentcore-cli/tree/main/src/cli/telemetry/schemas |
There was a problem hiding this comment.
once we have proper documentation, I'll replace the source code link with a docs link.
| try { | ||
| sinks.push(new OtelMetricSink({ endpoint, resource })); | ||
| } catch { | ||
| // Telemetry is best-effort — skip the network sink rather than crash. |
There was a problem hiding this comment.
this is needed for the case where a customer provides an invalid url in their config or via env var.
| validate: 'Validate agentcore/ config files.', | ||
| 'config-bundle': '[preview] Manage configuration bundle versions and diffs.', | ||
| archive: '[preview] Archive (delete) a batch evaluation or recommendation on the service and clear local history.', | ||
| config: 'Adjust global configuration settings such as telemetry opt-out status', |
There was a problem hiding this comment.
note: should move these somewhere more intuitive, but OOS here.
| const result = await resolveAction(key, value)(); | ||
| printResult(result); | ||
| if (!result.success) process.exit(1); | ||
| }); |
There was a problem hiding this comment.
will follow-up with this, I don't think its a blocker and I'd like to sync the backend schemas once I have a shape.
| export async function readGlobalConfig(configFile = GLOBAL_CONFIG_FILE): Promise<GlobalConfig> { | ||
| export function validateGlobalConfig(data: unknown): { success: boolean; error?: z.ZodError } { | ||
| return GlobalConfigSchemaStrict.safeParse(data); | ||
| } |
There was a problem hiding this comment.
this is intentional. validate is used for validation, read is used for reading. Validate is also used to verify that new (unwritten) config shapes are valid. The purpose of the function is "is this shape a valid config", so a success boolean is sufficient.
| describe('config command', () => { | ||
| afterAll(() => rm(testConfigDir, { recursive: true, force: true })); | ||
|
|
||
| it('lists config with only installationId when fresh', async () => { |
There was a problem hiding this comment.
the "only" is misleading - this test checks installationId is there but not that the config only has that key
There was a problem hiding this comment.
good catch, updated.
| @@ -214,8 +214,6 @@ export const COMMAND_SCHEMAS = { | |||
| 'dataset.publish-version': NoAttrs, | |||
| 'dataset.remove-version': NoAttrs, | |||
| 'telemetry.disable': NoAttrs, | |||
|
Claude Security Review: no high-confidence findings. (run) |
Description
This is a composite PR of a few changes.
.jsonfile extension is misleading for.jsonlfiles, invalid configs silently overwritten. (ex. [bug]: invalid agentcore config is silently overwritten. #1337).Future Work (non-launch blocking):
Related Issue
Closes #
Documentation PR
Will follow-up with telemetry documentation.
Type of Change
Testing
How have you tested the change?
npm run test:unitandnpm run test:integnpm run typechecknpm run lintsrc/assets/, I rannpm run test:update-snapshotsand committed the updated snapshotsChecklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the
terms of your choice.