
fix(cli): skip identity stitch in CI to stop ephemeral-env identify spam #5366

Merged: jgoux merged 4 commits into develop from pamela/growth-886-cli-gate-stitch-on-isci on May 28, 2026


@pamelachia
Contributor

What

Gate the auto-stitch path on !isCI so CI runners and other ephemeral environments stop firing $identify / $create_alias once per CLI invocation.

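// NeedsIdentityStitch reports whether the auto-stitch path should run.
// It is suppressed in CI, where the persisted DistinctID does not survive between invocations.
func (s *Service) NeedsIdentityStitch() bool {
    return s != nil && s.state.DistinctID == "" && s.canSend() && !s.isCI
}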

canSend() is unchanged, so cli_* capture events still fire from CI. login.go calls StitchLogin directly without this guard, so an explicit supabase login still identifies in CI.
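
A minimal sketch of how the gate and the explicit login path interact. Only NeedsIdentityStitch mirrors the snippet above; the `Service` fields, the `enabled` flag standing in for whatever canSend() really checks, the `stitches` counter, and the `autoStitch` helper are all hypothetical, so this illustrates the behaviour rather than the actual service.go code.

```go
package main

import "fmt"

// State is a simplified stand-in for the persisted telemetry state.
type State struct {
	DistinctID string
}

// Service is a hypothetical, trimmed-down telemetry service.
type Service struct {
	state    State
	enabled  bool // assumption: stands in for the real canSend() conditions
	isCI     bool
	stitches int // counts identify calls, for illustration only
}

func (s *Service) canSend() bool { return s.enabled }

// NeedsIdentityStitch matches the gate from this PR.
func (s *Service) NeedsIdentityStitch() bool {
	return s != nil && s.state.DistinctID == "" && s.canSend() && !s.isCI
}

// StitchLogin identifies unconditionally, like the explicit login path.
func (s *Service) StitchLogin(id string) {
	s.state.DistinctID = id
	s.stitches++
}

// autoStitch models the per-invocation hook: it stitches only when the gate allows it.
func autoStitch(s *Service, id string) {
	if s.NeedsIdentityStitch() {
		s.StitchLogin(id)
	}
}

func main() {
	ci := &Service{enabled: true, isCI: true}
	autoStitch(ci, "user-123")
	fmt.Println("CI after auto-stitch:", ci.stitches)
	ci.StitchLogin("user-123") // explicit login bypasses the gate
	fmt.Println("CI after explicit login:", ci.stitches)

	local := &Service{enabled: true}
	autoStitch(local, "user-123")
	autoStitch(local, "user-123") // DistinctID is now set, so this is a no-op
	fmt.Println("local after two auto-stitches:", local.stitches)
}
```

In this sketch the CI service never auto-stitches, an explicit StitchLogin call still identifies, and a non-CI service stitches once and then stops because DistinctID is populated.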

Why

PostHog alert "total identify events increase" fired on 2026-05-22 (244.5% day-over-day). I traced it to the Go CLI's first credentialed production deploy (#5329 at 2026-05-21 08:24 UTC, the binary that combined #5054's identity-stitch logic with #5314's credential wiring). The hour-by-hour change point matches that deploy within minutes, and the spike comes entirely from $lib = 'posthog-go'.

The persistence code is correct. SaveState writes ~/.supabase/telemetry.json synchronously after a successful stitch. The break is environmental: CI runners, Docker containers, and npx supabase wrappers wipe the home directory between invocations, so every fresh process re-stitches.
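
The failure mode can be reproduced in isolation. The sketch below simulates repeated CLI invocations against a temporary home directory, optionally wiping `.supabase` before each run. The `loadState`, `saveState`, and `countStitches` helpers are hypothetical, and the JSON layout is an assumption (only the file path and the `distinct_id` key come from this PR).

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// state is an assumed shape for telemetry.json; only distinct_id is modelled.
type state struct {
	DistinctID string `json:"distinct_id"`
}

// loadState returns the zero state when the file is missing or unreadable.
func loadState(home string) state {
	var st state
	data, err := os.ReadFile(filepath.Join(home, ".supabase", "telemetry.json"))
	if err != nil {
		return st
	}
	_ = json.Unmarshal(data, &st)
	return st
}

// saveState writes the state synchronously, creating .supabase if needed.
func saveState(home string, st state) error {
	dir := filepath.Join(home, ".supabase")
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return err
	}
	data, err := json.Marshal(st)
	if err != nil {
		return err
	}
	return os.WriteFile(filepath.Join(dir, "telemetry.json"), data, 0o600)
}

// countStitches runs n simulated invocations and returns how many of them stitched.
func countStitches(n int, wipeHome bool) (int, error) {
	home, err := os.MkdirTemp("", "home-*")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(home)

	stitches := 0
	for i := 0; i < n; i++ {
		if wipeHome { // ephemeral environment: persisted state is gone
			if err := os.RemoveAll(filepath.Join(home, ".supabase")); err != nil {
				return 0, err
			}
		}
		if loadState(home).DistinctID == "" {
			stitches++
			if err := saveState(home, state{DistinctID: "user-123"}); err != nil {
				return 0, err
			}
		}
	}
	return stitches, nil
}

func main() {
	stable, _ := countStitches(5, false)
	ephemeral, _ := countStitches(5, true)
	fmt.Println("stable home:", stable, "ephemeral home:", ephemeral)
}
```

With a stable home only the first invocation stitches; with the state wiped before every run, each of the five invocations does.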

Cohort breakdown over 6 days post-deploy:

| Days active in window | Users | Identifies / user |
| --- | --- | --- |
| 1 day only | 48,387 | 6.5 |
| 6 days (daily) | 1,747 | 95.7 |

Single-day users look like CI runs. The daily, 96-identifies-per-user cohort looks like engineers whose own CI runs many Supabase workflows.

Daily posthog-go $identify volume went from ~15K to 638K/day and was still growing.

Why not gate canSend() itself

cli_* capture events are heavily used:

| Event | CI share |
| --- | --- |
| cli_stack_started | 85% |
| cli_project_linked | 57% |
| cli_command_executed | 31% |
| cli_login_completed | 14% |

Six existing PostHog insights consume cli_command_executed, including the Agent-Led Growth dashboards. Killing CI capture would blind us to the dominant CLI use case.

Identity stitches in CI have no analytical value because each ephemeral run mints a fresh device_id and immediately discards it. Capture events ARE valuable because the is_ci property already segments them cleanly in PostHog.
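
To make the asymmetry concrete, here is a rough sketch in which capture events always fire and carry an is_ci property, while the automatic $identify is skipped in CI. The `Event` and `Client` types and the `AutoIdentify` and `countByName` helpers are hypothetical and do not reflect the posthog-go API or the real telemetry package.

```go
package main

import "fmt"

// Event is a hypothetical, simplified analytics event.
type Event struct {
	Name       string
	Properties map[string]any
}

// Client is an illustrative stand-in, not the real telemetry client.
type Client struct {
	isCI       bool
	distinctID string
	sent       []Event
}

// Capture always sends, tagging the event with is_ci so it can be segmented later.
func (c *Client) Capture(name string) {
	c.sent = append(c.sent, Event{
		Name:       name,
		Properties: map[string]any{"is_ci": c.isCI},
	})
}

// AutoIdentify sends $identify only outside CI and only when no identity is stored.
func (c *Client) AutoIdentify(id string) {
	if c.isCI || c.distinctID != "" {
		return
	}
	c.distinctID = id
	c.sent = append(c.sent, Event{Name: "$identify"})
}

// countByName tallies sent events per event name.
func countByName(events []Event) map[string]int {
	counts := make(map[string]int)
	for _, e := range events {
		counts[e.Name]++
	}
	return counts
}

func main() {
	ci := &Client{isCI: true}
	for i := 0; i < 3; i++ { // three simulated CI invocations sharing one recorder
		ci.AutoIdentify("user-123")
		ci.Capture("cli_command_executed")
	}
	counts := countByName(ci.sent)
	fmt.Println("identify:", counts["$identify"], "captures:", counts["cli_command_executed"])
}
```

The CI client here records three cli_command_executed events and no $identify events, which is the split this PR aims for.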

Test plan

  • TestServiceNeedsIdentityStitch adds a subtest covering IsCI: true (passes locally)
  • Full internal/telemetry/... package tests pass
  • gofmt -d clean, go vet ./internal/telemetry/... clean
  • Watch posthog-go $identify daily volume drop back toward the pre-spike ~15K/day baseline within a day of release

Linked

GROWTH-886

The OnGotrueID hook in cmd/root.go calls StitchLogin once per process when
NeedsIdentityStitch() returns true. SaveState persists distinct_id to
~/.supabase/telemetry.json synchronously, which works fine on a stable
machine. In CI runners, Docker, and npx wrappers the home directory is
wiped between invocations, so every fresh process sees an empty
DistinctID and re-stitches. Daily $identify volume from posthog-go went
from ~15K to ~640K/day after the Go CLI's first credentialed deploy and
kept growing.

Gate NeedsIdentityStitch on !isCI so the auto-stitch from the X-Gotrue-Id
response header is suppressed in CI. canSend() is left alone, so cli_*
capture events (cli_command_executed, cli_stack_started, cli_project_linked)
still fire from CI, preserving the 31-85% of CLI usage that runs in CI and
the dashboards built on it. login.go calls StitchLogin directly without the
guard, so an explicit supabase login still identifies in CI.
@pamelachia pamelachia requested a review from a team as a code owner May 27, 2026 10:54
@pamelachia pamelachia requested a review from seanoliver May 27, 2026 10:58
@coveralls

coveralls commented May 27, 2026

Coverage Report for CI Build 26586287373

Warning

No base build found for commit f03d18e on develop.
Coverage changes can't be calculated without a base build.
If a base build is processing, this comment will update automatically when it completes.

Coverage: 63.747%

Details

  • Patch coverage: No coverable lines changed in this PR.

Uncovered Changes

No uncovered changes found.

Coverage Regressions

Requires a base build to compare against.


Coverage Stats

Coverage Status
Relevant Lines: 15745
Covered Lines: 10037
Line Coverage: 63.75%
Coverage Strength: 7.07 hits per line

💛 - Coveralls

@jgoux
Contributor

jgoux commented May 27, 2026

Hey @pamelachia, thanks for that. As we're porting commands, we also have the same telemetry implementation on the TypeScript side. Can you also apply this change to the telemetry implementation we're using in TypeScript so it's 1:1 for ported and non-ported commands?

Contributor

@seanoliver seanoliver left a comment


LGTM! Left one non-blocking question inline!

Comment thread apps/cli-go/internal/telemetry/service.go Outdated
@pamelachia pamelachia marked this pull request as draft May 27, 2026 14:59
@pamelachia pamelachia marked this pull request as ready for review May 28, 2026 15:26
@jgoux jgoux merged commit ca47a15 into develop May 28, 2026
17 checks passed
@jgoux jgoux deleted the pamela/growth-886-cli-gate-stitch-on-isci branch May 28, 2026 17:26