
feat(telemetry): fix command/exit-code tracking and enrich events #206

Merged
rhuanbarreto merged 6 commits into main from feat/telemetry-improvements on Apr 14, 2026

rhuanbarreto commented Apr 14, 2026

Summary

Three bugs made the PostHog data close to unusable, and we had no way to tell which repos are actually running archgate. This PR fixes the bugs, enriches every event, and plumbs repo identification through — gated on confirmed-public visibility so we never leak private repos.

Bugs fixed

  • command_completed.command always "root" — the Commander preAction / postAction callback's first arg is the hookedCommand (root program), not the invoked command. Switched to the second arg (actionCommand). Verified in PostHog: 202/202 command_completed events in the last 14d had command=root.
  • command_completed.exit_code always 0 — hard-coded 0 in the postAction hook, AND ~50 direct process.exit(N) call sites bypass the hook and the main() flush entirely. Introduced helpers/exit.ts (exitWith / beginCommand / finalizeCommand) that records the real exit code + outcome tag, flushes PostHog + Sentry, then exits. All command call sites migrated to await exitWith(code). Verified in PostHog: 474/474 command_completed events had exit_code=0.
  • adr_count / has_project stale after archgate init — the project context was cached on first call (before init created .archgate/adrs/), so post-init events reported 0. Dropped the cache — the readdirSync is cheap enough to re-run per event.
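The first fix can be illustrated with a small sketch. Commander calls `preAction` / `postAction` hooks with two arguments: the command the hook is attached to, and the command whose action is actually running. `CommandLike` and `commandPath` below are hypothetical names used for illustration only, not the code in this PR; the interface mirrors the `name()` method and `parent` property that Commander's `Command` exposes.

```typescript
// Minimal structural stand-in for Commander's Command class.
interface CommandLike {
  name(): string;
  parent: CommandLike | null;
}

// Build a telemetry label such as "adr create" from the invoked command,
// walking up the parent chain and leaving out the root program itself.
function commandPath(actionCommand: CommandLike): string {
  const parts: string[] = [];
  for (let c: CommandLike | null = actionCommand; c && c.parent; c = c.parent) {
    parts.unshift(c.name());
  }
  // Only the root program has no parent, so an empty path means "root".
  return parts.length > 0 ? parts.join(" ") : "root";
}

// With Commander, the wiring would look roughly like this (assumed usage):
//   program.hook("postAction", (_hookedCommand, actionCommand) => {
//     track("command_completed", { command: commandPath(actionCommand) });
//   });
```

Passing the first hook argument to a helper like this always yields "root", which matches the symptom described above.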

Event enrichments

Common props (every event):

  • ci_provider — GitHub Actions / GitLab CI / CircleCI / Jenkins / Azure Pipelines / Buildkite / TeamCity / AWS CodeBuild / Bitbucket Pipelines / Travis / other
  • shell (bash / zsh / pwsh / ...), locale
  • adr_domains_count
  • Repo context (non-identifying): repo_is_git, repo_host (github / gitlab / bitbucket / azure-devops / other), repo_id (sha256(normalized remote URL)[0:16]), git_default_branch
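As a rough illustration of how a non-identifying `repo_id` could be derived, the sketch below hashes a normalized remote URL with SHA-256 and keeps the first 16 hex characters. The normalization rules (lower-casing, dropping scheme and credentials, folding the SCP-style SSH form, stripping `.git`) and the function names are assumptions for this example; the PR's actual normalization may differ.

```typescript
import { createHash } from "node:crypto";

// Hypothetical normalization so HTTPS and SSH remotes of the same repo agree.
function normalizeRemoteUrl(remote: string): string {
  let s = remote.trim().toLowerCase();
  s = s.replace(/^[a-z][a-z0-9+.-]*:\/\//, ""); // scheme, e.g. https:// or ssh://
  s = s.replace(/^[^@/]+@/, ""); // user info, e.g. git@
  s = s.replace(/^([^/:]+):(?!\d+\/)/, "$1/"); // SCP-style host:path, but not host:port/
  s = s.replace(/\/+$/, "").replace(/\.git$/, ""); // trailing slashes, then .git
  return s;
}

// sha256(normalized remote URL), truncated to 16 hex characters.
function repoId(remote: string): string {
  return createHash("sha256")
    .update(normalizeRemoteUrl(remote))
    .digest("hex")
    .slice(0, 16);
}
```

Under these assumptions, `https://github.com/acme/widget.git` and `git@github.com:acme/widget.git` both normalize to `github.com/acme/widget` and therefore share one `repo_id`.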

Per-event additions:

  • command_executed: command_depth, options_used
  • command_completed: outcome (success / user_error / internal_error / cancelled), error_kind bucket
  • check_completed: files_scanned, load_duration_ms, check_duration_ms
  • login_completed: failure_reason (network / tls / denied / other)
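The `outcome` tag pairs naturally with the `exitWith` helper described under "Bugs fixed". The following is a minimal sketch, not the contents of helpers/exit.ts: the exit-code convention (0 success, 1 user error, 130 cancelled, anything else internal error), the `outcomeFor` name, and the injected `record` / `flushers` / `exit` parameters are all assumptions made so the example is self-contained.

```typescript
type Outcome = "success" | "user_error" | "internal_error" | "cancelled";

// Hypothetical mapping from exit code to outcome; the real convention may differ.
function outcomeFor(exitCode: number): Outcome {
  if (exitCode === 0) return "success";
  if (exitCode === 1) return "user_error";
  if (exitCode === 130) return "cancelled"; // 128 + SIGINT (2)
  return "internal_error";
}

type Flush = () => Promise<void>;

// Record the real exit code and outcome, flush telemetry sinks, then exit.
// Dependencies are injected here purely to keep the sketch testable.
async function exitWith(
  code: number,
  record: (props: { exit_code: number; outcome: Outcome }) => void,
  flushers: Flush[] = [],
  exit: (code: number) => void = (c) => process.exit(c),
): Promise<void> {
  record({ exit_code: code, outcome: outcomeFor(code) });
  // A rejected flush must not prevent the process from exiting.
  await Promise.allSettled(flushers.map((flush) => flush()));
  exit(code);
}
```

Call sites would then use `await exitWith(code, ...)` instead of `process.exit(code)`, so the event is recorded and flushed before the process terminates.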

New events:

  • project_initialized — one-time event on archgate init. Always carries repo_host, repo_is_git, repo_id, and repo_public (true / false / null). For repos confirmed public via their host's unauthenticated API, it additionally carries remote_url, repo_owner, repo_name.
  • telemetry_preference_changed — fires once on enable/disable.

Repo identity: confirmed-public only

The rule is intentionally simple:

Share identity iff the host confirms the repo is public. Otherwise don't.

No identity-specific flag, no identity-specific env var. Users who want zero events use the existing telemetry opt-out (ARCHGATE_TELEMETRY=0 or archgate telemetry disable), which suppresses the event entirely — a dedicated identity knob would be redundant and create two-source-of-truth drift.
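The rule reduces to a strict check on a tri-state value. A minimal sketch, assuming `repo_public` is modelled as `true | false | null`; `shouldShareRepoIdentity` is named in this PR, while `projectInitializedProps` and the `RepoIdentity` shape are hypothetical helpers for illustration.

```typescript
// true = host confirmed public, false = confirmed not public, null = unknown.
type RepoVisibility = boolean | null;

// Share identity only when the host positively confirmed the repo is public.
function shouldShareRepoIdentity(repoPublic: RepoVisibility): boolean {
  return repoPublic === true;
}

interface RepoIdentity {
  remote_url: string;
  repo_owner: string;
  repo_name: string;
}

// Hypothetical helper: identity fields are attached only when sharing is allowed.
function projectInitializedProps(
  repoPublic: RepoVisibility,
  identity: RepoIdentity,
): Record<string, unknown> {
  const base = { repo_public: repoPublic };
  return shouldShareRepoIdentity(repoPublic) ? { ...base, ...identity } : base;
}
```

Comparing against `true` rather than testing truthiness keeps `null` ("unknown") on the safe side by construction.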

Public-visibility probes (unauthenticated, 3s timeout, cached per process):

  • GitHub: api.github.com/repos/{o}/{r} → private: false
  • GitLab: gitlab.com/api/v4/projects/{p} → visibility: public
  • Bitbucket: api.bitbucket.org/2.0/repositories/{o}/{r} → is_private: false
  • Azure DevOps: dev.azure.com/{org}/_apis/projects/{project} → visibility: public

Timeouts, 401/403/429 responses, network errors, and self-hosted hosts all return null (= "unknown") rather than falling through to share.
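A sketch of what such a probe might look like for GitHub only, using the global `fetch` and `AbortSignal.timeout`. The function names are hypothetical, and treating 404 as "not public" is an assumption inferred from the test plan (which expects `repo_public=false` for a private GitHub repo); the other hosts would follow the same shape with their own fields.

```typescript
// Interpret an unauthenticated GET api.github.com/repos/{o}/{r} response.
function interpretGitHub(status: number, body: unknown): boolean | null {
  if (status === 200) {
    const isPrivate = (body as { private?: unknown } | null)?.private;
    return typeof isPrivate === "boolean" ? !isPrivate : null;
  }
  if (status === 404) return false; // assumption: invisible anonymously means not public
  return null; // 401 / 403 / 429 / anything else stays "unknown"
}

async function probeGitHub(
  owner: string,
  repo: string,
  fetchImpl: typeof fetch,
): Promise<boolean | null> {
  try {
    const res = await fetchImpl(`https://api.github.com/repos/${owner}/${repo}`, {
      headers: { Accept: "application/vnd.github+json" },
      signal: AbortSignal.timeout(3000), // 3s bound per call
    });
    const body = res.status === 200 ? await res.json() : null;
    return interpretGitHub(res.status, body);
  } catch {
    return null; // timeout, network error, or malformed JSON
  }
}

// Per-process cache: each repo is probed at most once.
const probeCache = new Map<string, Promise<boolean | null>>();

function isPublicGitHubRepo(
  owner: string,
  repo: string,
  fetchImpl: typeof fetch = fetch,
): Promise<boolean | null> {
  const key = `${owner}/${repo}`.toLowerCase();
  let hit = probeCache.get(key);
  if (!hit) {
    hit = probeGitHub(owner, repo, fetchImpl);
    probeCache.set(key, hit);
  }
  return hit;
}
```

Every non-definitive branch returns `null`, so a failure can never be mistaken for a confirmed-public answer.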

Azure DevOps URL parsing handles three forms — modern HTTPS (dev.azure.com/{org}/{project}/_git/{repo}), legacy ({org}.visualstudio.com/{project}/_git/{repo}), and SSH (ssh.dev.azure.com:v3/{org}/{project}/{repo}) — and normalises all three to the same repo_id so it's stable regardless of how origin is set.
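The three URL shapes can be folded into one canonical string with a small set of patterns. This is a sketch under assumptions: the regular expressions, the lower-casing, and the names `parseAzureRemote` / `normalizeAzureRemote` are illustrative, and legacy URLs with a collection segment are not covered. The canonical `dev.azure.com/{org}/{project}/{repo}` form is the one named in the commit message.

```typescript
interface AzureRepo {
  org: string;
  project: string;
  repo: string;
}

const AZURE_PATTERNS: RegExp[] = [
  // Modern HTTPS: https://[user@]dev.azure.com/{org}/{project}/_git/{repo}
  /^https?:\/\/(?:[^@/]+@)?dev\.azure\.com\/([^/]+)\/([^/]+)\/_git\/([^/]+?)\/?$/i,
  // Legacy HTTPS: https://{org}.visualstudio.com/{project}/_git/{repo}
  /^https?:\/\/(?:[^@/]+@)?([^./]+)\.visualstudio\.com\/([^/]+)\/_git\/([^/]+?)\/?$/i,
  // SSH: git@ssh.dev.azure.com:v3/{org}/{project}/{repo}
  /^(?:ssh:\/\/)?(?:[^@/]+@)?ssh\.dev\.azure\.com:v3\/([^/]+)\/([^/]+)\/([^/]+?)\/?$/i,
];

// Returns the parsed parts, or null when the remote is not an Azure DevOps URL.
function parseAzureRemote(remote: string): AzureRepo | null {
  const input = remote.trim();
  for (const pattern of AZURE_PATTERNS) {
    const m = pattern.exec(input);
    if (m) {
      return {
        org: m[1].toLowerCase(),
        project: m[2].toLowerCase(),
        repo: m[3].toLowerCase(),
      };
    }
  }
  return null;
}

// Canonical form that would be fed into the repo_id hash.
function normalizeAzureRemote(remote: string): string | null {
  const r = parseAzureRemote(remote);
  return r ? `dev.azure.com/${r.org}/${r.project}/${r.repo}` : null;
}
```

Because all three shapes produce the same canonical string, hashing that string gives the same `repo_id` however `origin` is configured.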

Privacy docs rewritten in both en and pt-br (ADR GEN-002 i18n parity).

Test hygiene

tests/helpers/telemetry.test.ts was sending real events to the production PostHog project — this is why we saw check, adr create, login_completed etc. with install_method=global-pm, has_project=true in the data. Tests now set NODE_ENV=test in beforeEach, and trackEvent() is a no-op under that env. Matches the Sentry convention in ARCH-005.
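The guard could look roughly like the sketch below. The signature is an assumption (the real `trackEvent()` presumably talks to PostHog directly); here the sender and the environment are injected, and the boolean return value exists only to make the behaviour observable in the example.

```typescript
type Props = Record<string, unknown>;
type Sender = (event: string, props: Props) => void;
type Env = Record<string, string | undefined>;

// Returns true when the event was handed to the sender, false when suppressed.
function trackEvent(
  event: string,
  props: Props,
  send: Sender,
  env: Env = process.env,
): boolean {
  if (env.NODE_ENV === "test") return false; // never emit from test runs
  if (env.ARCHGATE_TELEMETRY === "0") return false; // existing telemetry opt-out
  send(event, props);
  return true;
}
```

Setting `NODE_ENV=test` in `beforeEach` is then enough to keep test runs out of the production PostHog project.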

Commits

  1. feat(telemetry): fix command/exit-code tracking and enrich events — the three core bug fixes + common-prop/per-event enrichment + hashed repo_id on every event
  2. feat(telemetry): gate repo identity on confirmed-public repos + Azure DevOps — the public-visibility probe across four hosts; Azure DevOps added
  3. refactor(telemetry): drop ARCHGATE_SHARE_REPO_IDENTITY env + --no-share-repo-identity flag — simplification: one rule, share iff confirmed public; rely on the existing telemetry opt-out for "no events at all"

Test plan

  • bun run validate (lint + typecheck + format + 639 tests + 21 ADR rules + build:check) passes
  • New tests cover parseRemoteUrl for all four hosts (including Azure DevOps HTTPS / SSH / legacy URL shapes all hashing to the same repo_id), isPublicRepo for every host's success/404/401/403/error branches, and the shouldShareRepoIdentity public-only rule
  • Manually dry-run archgate init on a public GitHub repo → confirm project_initialized carries owner/name, identity_shared=true, repo_public=true (follow-up once merged)
  • Manually dry-run archgate init on a private GitHub repo → confirm identity_shared=false, repo_public=false, no owner/name fields (follow-up)
  • Manually dry-run archgate init on a public Azure DevOps repo → confirm same behaviour (follow-up)
  • With ARCHGATE_TELEMETRY=0 set, confirm project_initialized is not sent at all (follow-up)

🤖 Generated with Claude Code

Fixes three bugs that made the PostHog data thin:

- command_completed.command was always "root" because the Commander
  preAction / postAction hook callback's first arg is the hookedCommand
  (the root program), not the invoked command. Switched to the second arg.
- exit_code was always 0 because ~50 direct process.exit(N) sites bypass
  the postAction hook and the main() flush. Introduced helpers/exit.ts
  (exitWith / beginCommand / finalizeCommand) that records the real exit
  code + outcome tag, flushes PostHog + Sentry, then exits. All command
  call sites migrated.
- adr_count / has_project went stale post-init because the project
  context was cached before init ran. Dropped the cache — readdirSync
  is cheap enough to re-run per event.

Enrichments:

- Common props: ci_provider (github-actions/gitlab/...), shell, locale,
  adr_domains_count, repo_is_git, repo_host, repo_id (sha256 hash of
  normalized remote URL, 16 hex), git_default_branch.
- command_executed: command_depth, options_used.
- command_completed: outcome (success/user_error/internal_error/cancelled),
  error_kind.
- check_completed: files_scanned, load_duration_ms, check_duration_ms.
- login_completed: failure_reason bucket.
- New event project_initialized on `archgate init` with opt-in repo
  identity (--share-repo-identity flag or ARCHGATE_SHARE_REPO_IDENTITY
  env var). Hashed repo_id ships with every event; owner/name/remote URL
  only on explicit opt-in.
- New event telemetry_preference_changed on enable/disable.

Test hygiene: telemetry tests now force NODE_ENV=test so trackEvent is a
no-op, matching the Sentry convention in ARCH-005. This stops the prod
PostHog project from receiving real events from bun test runs.

Privacy docs updated (en + pt-br) to cover the new fields and the opt-in
repo-identity path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

cloudflare-workers-and-pages bot commented Apr 14, 2026

Deploying archgate-cli with Cloudflare Pages

Latest commit: 2c2d07d
Status: ✅  Deploy successful!
Preview URL: https://6b435675.archgate-cli.pages.dev
Branch Preview URL: https://feat-telemetry-improvements.archgate-cli.pages.dev


rhuanbarreto and others added 5 commits April 14, 2026 00:34
… DevOps

Tighten the repo-identity path so only repositories whose hosts confirm
public visibility ever have owner / name / remote_url leave the machine,
and add Azure DevOps to the supported host list.

Changes:

- `isPublicRepo(repo)` — unauthenticated API probe against the host:
    - GitHub:       api.github.com/repos/{o}/{r}      → `private: false`
    - GitLab:       gitlab.com/api/v4/projects/{p}    → `visibility: public`
    - Bitbucket:    api.bitbucket.org/2.0/...         → `is_private: false`
    - Azure DevOps: dev.azure.com/{org}/_apis/...     → `visibility: public`
  Returns null on self-hosted, 403/429 rate-limit, timeout, or network
  failure so "unknown" never collapses to "share". Bounded by a 3s
  per-call timeout. Cached per process.

- `shouldShareRepoIdentity(flag, repoPublic)` is now opt-out-for-public:
  default ON for confirmed public repos, OFF for private / unknown /
  self-hosted regardless of flag. Flag flipped to `--no-share-repo-identity`
  (Commander `--no-` pattern), and env var semantics inverted to accept
  `0/false/no/off` as opt-out.

- Azure DevOps URL parsing:
    - HTTPS modern:  dev.azure.com/{org}/{project}/_git/{repo}
    - HTTPS legacy:  {org}.visualstudio.com/{project}/_git/{repo}
    - SSH:           ssh.dev.azure.com:v3/{org}/{project}/{repo}
  All three normalise to the same `dev.azure.com/{org}/{project}/{repo}`
  form so the hashed `repo_id` is stable across URL styles.

- Split `helpers/repo.ts` (local git + URL parsing) from
  `helpers/repo-probe.ts` (network probes) to keep both under the
  max-lines cap and to keep the network surface out of the fast path for
  every command — the probe only fires from `archgate init`.

- `project_initialized` event now carries `repo_public: true|false|null`
  so analytics can segment adoption by confirmed-public vs private/unknown
  without relying on whether identity was shared.

- Docs (en + pt-br) rewritten: section explains that identity ships only
  for confirmed-public repos, lists all four hosts, documents the
  `--no-share-repo-identity` / `ARCHGATE_SHARE_REPO_IDENTITY=0` opt-out.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…re-repo-identity flag

Redundant with the existing telemetry opt-out: disabling telemetry
(`ARCHGATE_TELEMETRY=0` or `archgate telemetry disable`) already suppresses
every event including `project_initialized`. A second identity-specific knob
on top of that just means two ways to achieve the same outcome, plus a
surface for users and docs to misalign on.

Simplifies the rule to:
  share identity iff the repo is confirmed public by its host.

`shouldShareRepoIdentity(repoPublic)` is now a one-liner, the init command
loses `--no-share-repo-identity`, and the docs (en + pt-br) point users at
the telemetry opt-out instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@rhuanbarreto rhuanbarreto enabled auto-merge (squash) April 14, 2026 01:30
@rhuanbarreto rhuanbarreto merged commit d798061 into main Apr 14, 2026
12 of 14 checks passed
@rhuanbarreto rhuanbarreto deleted the feat/telemetry-improvements branch April 14, 2026 01:33
@github-actions github-actions bot mentioned this pull request Apr 14, 2026