
Classify agents on every command invocation #296

Merged
gtsiolis merged 2 commits into main from pro-276-classify-agents-on-every-command-invocation on Jun 11, 2026

@gtsiolis gtsiolis commented Jun 10, 2026

Member

Tags every command invocation in CLI telemetry so we can answer "are agents using lstk?". Detection is environment-based (the strongest signal) with a TTY fallback for humans, and is attached to every telemetry event. A detected agent is also forwarded into the emulator container as AI_AGENT.
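
The detection order described above (environment markers first, TTY check as the fallback) could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `marker` table, the `CLAUDECODE` variable name, the `Detection` type, and the `"env"`/`"tty"`/`"none"` method labels are all assumptions made for the example.

```go
package main

import "os"

// marker pairs an environment variable with the agent identity it implies.
type marker struct {
	envVar   string
	identity string
}

// agentMarkers is illustrative only; the PR's real list is not shown in this thread.
var agentMarkers = []marker{
	{"CLAUDECODE", "claude-code"}, // assumed variable name
}

// Detection is a hypothetical result type for this sketch.
type Detection struct {
	AgentIdentity string // empty when no agent marker is set
	Method        string // "env", "tty", or "none" (assumed labels)
}

// detect checks environment markers first and only then falls back to the
// TTY signal. getenv is injected so the function can be tested without
// touching the real process environment.
func detect(getenv func(string) string, stdinIsTTY bool) Detection {
	for _, m := range agentMarkers {
		if getenv(m.envVar) != "" {
			return Detection{AgentIdentity: m.identity, Method: "env"}
		}
	}
	if stdinIsTTY {
		// An interactive terminal with no agent marker suggests a human.
		return Detection{Method: "tty"}
	}
	return Detection{Method: "none"}
}

// stdinIsTTY is a rough standard-library check: it reports whether stdin is
// a character device, which is true for terminals but also for /dev/null.
func stdinIsTTY() bool {
	info, err := os.Stdin.Stat()
	if err != nil {
		return false
	}
	return info.Mode()&os.ModeCharDevice != 0
}
```

In a real command this would be called once per invocation as `detect(os.Getenv, stdinIsTTY())`, with the result attached to each telemetry event.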

Agent and CI are tracked as independent axes — an agent can run inside CI, so each is recorded separately rather than collapsed into one mutually exclusive value. Every event carries is_ci plus agent_identity / ci_identity when detected, and a derived caller_type (agent/ci/human) for single-label segmentation. This mirrors the legacy CLI (localstack/localstack-cli-standalone#19), where is_ci is its own field independent of agent detection.
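
As a rough sketch of the two-axis model, the event fields named above could be built like this. The `Type` constants match the diff excerpt in this thread; the `Caller` struct, its methods, and the choice to derive `is_ci` from a non-empty CI identity are assumptions for illustration, not the PR's actual implementation.

```go
package main

// Type is the derived single-label caller classification.
type Type string

const (
	TypeHuman Type = "human"
	TypeAgent Type = "agent"
	TypeCI    Type = "ci"
)

// Caller is a hypothetical holder for the two independent axes.
type Caller struct {
	AgentIdentity string // e.g. "claude-code"; empty when no agent is detected
	CIIdentity    string // e.g. "github-actions"; empty when not in a known CI
}

// IsCI assumes CI is detected exactly when a known provider was identified.
func (c Caller) IsCI() bool { return c.CIIdentity != "" }

// CallerType collapses both axes into one label: agent wins over ci,
// and ci wins over human.
func (c Caller) CallerType() Type {
	switch {
	case c.AgentIdentity != "":
		return TypeAgent
	case c.IsCI():
		return TypeCI
	default:
		return TypeHuman
	}
}

// Properties returns the telemetry fields named in the description.
// Identity fields are included only when detected.
func (c Caller) Properties() map[string]any {
	props := map[string]any{
		"is_ci":       c.IsCI(),
		"caller_type": string(c.CallerType()),
	}
	if c.AgentIdentity != "" {
		props["agent_identity"] = c.AgentIdentity
	}
	if c.CIIdentity != "" {
		props["ci_identity"] = c.CIIdentity
	}
	return props
}
```

With this shape, an agent running inside CI yields `caller_type` of `agent` while `is_ci` and `ci_identity` are still present on the event, so neither signal is lost.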

See PRO-276 for more context.

Notes

  • Orthogonal axes: an agent running inside CI is recorded on both axes (agent_identity and is_ci/ci_identity); caller_type resolves to agent (precedence: agent > ci > human), so segmentation stays single-label without discarding the CI signal.
  • is_ci is sourced from the single caller classifier (14 known CI providers), not a separate bare CI env lookup.
  • AI_AGENT on the container is session-origin — it reflects who started the container, not who makes later calls into it, and is forwarded whenever an agent is detected (including inside CI).
  • Detection-only, no free-text override, so no user-supplied value reaches telemetry and no sanitization is needed.
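
The session-origin forwarding in the notes above might look something like the sketch below. The function name `containerEnv` and the choice to pass the detected identity as the value of AI_AGENT are assumptions; the thread only states that a detected agent is forwarded into the container as AI_AGENT.

```go
package main

// containerEnv returns the "KEY=VALUE" entries to hand to the emulator
// container at start time. AI_AGENT is appended only when an agent was
// detected for the session that starts the container; whether CI was also
// detected plays no part in the decision.
func containerEnv(base []string, agentIdentity string) []string {
	// Copy so the caller's slice is never mutated.
	env := append([]string(nil), base...)
	if agentIdentity != "" {
		// Assumed value format: the detected identity itself.
		env = append(env, "AI_AGENT="+agentIdentity)
	}
	return env
}
```

Because the value is fixed when the container starts, later calls into the same container from a different caller would not change it, which matches the session-origin behaviour described in the notes.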


gtsiolis commented Jun 10, 2026

Member Author

Comment thread internal/caller/caller.go Outdated
TypeHuman Type = "human"
TypeAgent Type = "agent"
TypeCI Type = "ci"
)

Collaborator

question: why do we have a type with mutually exclusive values (a CI run can also be an agent, but we can't represent that here) when we could instead have two binary fields, agent=yes/no and ci=yes/no? The data model could become something like:

  is_agent / agent_identity   (cursor, claude-code, …)
  is_ci    / ci_identity      (github-actions, jenkins, …)
  detection_method

Then we could also remove the other lookup in internal/telemetry/client.go line 77 and instead use the value from internal/caller/caller.go

@gtsiolis gtsiolis Jun 11, 2026

Member Author

Nice catch, @carole-lavillonniere! ⚾

I was trying to mirror the approach from localstack/localstack-cli-standalone#19, but you're right! I made some changes in 27b8021 and updated the PR description. Could you take another look and take it over from here? 🙏

gtsiolis and others added 2 commits June 11, 2026 11:15
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@gtsiolis gtsiolis force-pushed the pro-276-classify-agents-on-every-command-invocation branch from 8d4d900 to 27b8021 Compare June 11, 2026 08:44
@gtsiolis gtsiolis merged commit deae609 into main Jun 11, 2026
12 checks passed
@gtsiolis gtsiolis deleted the pro-276-classify-agents-on-every-command-invocation branch June 11, 2026 11:48

2 participants