feat: observability — performance tab, country capture, recovery email #2270
Merged
Bundled per Al: items 1-7 + perf tab on /ops/observability.
Schema
- migration adds users.signup_country (varchar 2) and jobs.card_count (int)
- two indexes for the new query paths
Country capture
- extractCountryFromRequest reads CloudFront-Viewer-Country (with cf-ipcountry
and x-vercel-ip-country fallbacks); guarded against missing headers
- captured on register/loginWithGoogle/loginWithNotion for new users only
- lazy backfill on getLocals when the user has no stored country
- all writes wrapped in try/catch — country is additive context
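The header fallback order described above can be sketched as follows. This is a minimal illustration, not the PR's implementation: the `HeaderBag` type, the lowercase header keys (Node lowercases incoming header names), and the two-letter validation are assumptions.

```typescript
// Hypothetical request shape; the real handler presumably receives a framework request object.
type HeaderBag = Record<string, string | string[] | undefined>;

// Checked in priority order: CloudFront first, then the two fallbacks.
const COUNTRY_HEADERS = [
  'cloudfront-viewer-country',
  'cf-ipcountry',
  'x-vercel-ip-country',
] as const;

function extractCountryFromRequest(
  req: { headers?: HeaderBag } | undefined
): string | null {
  const headers = req?.headers;
  if (!headers) return null; // guard: missing request or missing headers

  for (const name of COUNTRY_HEADERS) {
    const raw = headers[name];
    const value = Array.isArray(raw) ? raw[0] : raw;
    if (typeof value !== 'string') continue;

    // Assumption: only accept two ASCII letters, matching a varchar(2) column.
    const code = value.trim().toUpperCase();
    if (/^[A-Z]{2}$/.test(code)) return code;
  }
  return null;
}
```

Returning `null` instead of throwing keeps the caller's try/catch as a second line of defence rather than the only one.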
Job timing
- CompleteJobUseCase persists cardCount through updateJobStatus
- duration is computed from existing created_at and last_edited_time
in PerformanceMetricsService — no new timing columns
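To illustrate the idea, here is an in-memory sketch of deriving a duration from the two existing timestamps and summarising it with nearest-rank percentiles. The PR does this in SQL inside PerformanceMetricsService; the `JobRow` shape and the nearest-rank method below are assumptions made only for the example.

```typescript
// Hypothetical row shape; only the two timestamp columns named in the commit are used.
interface JobRow {
  created_at: Date;
  last_edited_time: Date | null;
}

// Duration in milliseconds, or null when the job has no end timestamp
// or the timestamps are out of order.
function durationMs(job: JobRow): number | null {
  if (job.last_edited_time == null) return null;
  const ms = job.last_edited_time.getTime() - job.created_at.getTime();
  return ms >= 0 ? ms : null;
}

// Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
function percentile(values: number[], p: number): number | null {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}
```

A database would typically compute the same summary with its own percentile aggregate; the sketch only shows what p50/p95/p99 mean for a set of durations.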
Performance tab — /ops/performance
- PerformanceMetricsService runs four parallel queries: 24h+7d duration
percentiles (p50/p95/p99), terminal-status breakdown last 24h, slowest 20
done jobs last 24h, signup country breakdown last 7d
- new GetPerformanceMetricsUseCase wires it through to OpsController
- /api/ops/performance/metrics behind RequireOpsAccess (404, not 403)
- React tab renders tables + simple bar visualisations, 30s refetch
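The "404, not 403" behaviour of RequireOpsAccess can be pictured with the sketch below. It is framework-free and hypothetical: the `OpsUser.isOps` flag, the `HttpResult` shape, and the function signature are illustrative assumptions, not the middleware's actual API.

```typescript
// Hypothetical types for illustration only.
interface OpsUser {
  isOps: boolean;
}

interface HttpResult {
  status: number;
  body: unknown;
}

// Answer 404 for anyone without ops access, so the response looks the same
// as a route that does not exist.
function requireOpsAccess(
  user: OpsUser | null | undefined,
  handler: () => HttpResult
): HttpResult {
  if (!user || !user.isOps) {
    return { status: 404, body: { error: 'Not Found' } };
  }
  return handler();
}
```

A 403 would confirm that /api/ops/performance/metrics exists; a 404 gives an unauthorised caller nothing to distinguish it from any other unknown path.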
Abandoned-checkout recovery
- new EmailService.sendAbandonedCheckoutRecoveryEmail + template (mascot,
dark-mode, responsive, footer — per email-templates rules)
- SendAbandonedCheckoutRecoveryUseCase dedupes + validates emails,
dry-run by default, records per-email failures
- /api/ops/send-abandoned-checkout-recovery for one-shot triggering with
{ emails: string[], dryRun?: boolean }
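A rough sketch of the dedupe, validate, and dry-run-by-default steps, using the request shape given above. The `planRecovery` name, the trim-and-lowercase normalisation, and the placeholder validator are assumptions; the actual use case also sends the emails and records per-email failures, which is omitted here.

```typescript
interface RecoveryRequest {
  emails: string[];
  dryRun?: boolean;
}

// Hypothetical result shape for the planning step only.
interface RecoveryPlan {
  dryRun: boolean;
  candidates: string[];
  invalid: string[];
}

// Placeholder check (exactly one '@' with text on both sides); the PR uses its own EMAIL_RE.
const looksLikeEmail = (value: string): boolean => {
  const at = value.indexOf('@');
  return at > 0 && at === value.lastIndexOf('@') && at < value.length - 1;
};

function planRecovery(
  req: RecoveryRequest,
  isValid: (email: string) => boolean = looksLikeEmail
): RecoveryPlan {
  const seen = new Set<string>();
  const candidates: string[] = [];
  const invalid: string[] = [];

  for (const raw of req.emails) {
    // Assumption: addresses are compared case-insensitively after trimming.
    const email = raw.trim().toLowerCase();
    if (seen.has(email)) continue; // dedupe
    seen.add(email);
    (isValid(email) ? candidates : invalid).push(email);
  }

  // Dry-run unless the caller explicitly passes dryRun: false.
  return { dryRun: req.dryRun ?? true, candidates, invalid };
}
```

Defaulting to a dry run means an accidental call with no `dryRun` field only reports the candidate list and sends nothing.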
US-localized pricing
- PricingPage receives signupCountry from /api/users/debug/locals
- 'US' renders an MCAT/USMLE/bar-exam framed hero; identical 100-card cap
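As a small illustration of the rule above, the hero framing can vary by country while the cap stays constant. The function name, variant labels, and constant below are hypothetical, not taken from PricingPage.

```typescript
// The same cap applies regardless of country.
const MONTHLY_CARD_CAP = 100;

type HeroVariant = 'us-exams' | 'default';

// Only the hero framing depends on signupCountry; the cap never does.
function pickPricingHero(
  signupCountry: string | null | undefined
): { variant: HeroVariant; cardCap: number } {
  const variant: HeroVariant = signupCountry === 'US' ? 'us-exams' : 'default';
  return { variant, cardCap: MONTHLY_CARD_CAP };
}
```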
Voice + docs
- VOICE.md protected string moves to "Your monthly limit: 100 cards" for
in-product display; marketing copy on /pricing keeps "100 cards per month"
- Documentation/post-mortems/2024-10-subscriber-data-deletion.md captures
the worker-threads → conversion → deletion regression lesson
Notes
- Sonar scanner not installed locally; expect a possible bounce on push
- 786 → 12K is a distribution/channel goal that this PR doesn't solve;
this is the instrumentation layer that lets the next reliability wins
be measured the way Sep 2024's worker-threads spike could only be
measured retroactively
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Extract resolveSignupCountry helper from getLocals to drop cognitive
complexity from 20 to under 15 (S3776 CRITICAL)
- Flip != null ternaries in PerformanceMetricsService to == null first
branch — lead-with-positive (S7735, 5 minor)
- replace(/,/g, ' ') → replaceAll(',', ' ') in PerformanceTab formatter
(S7781 minor)
- Add multicriteria waivers for email-template HTML rules (Web:S1827,
Web:S5257, Web:S6819, css:S7924) scoped to
src/services/EmailService/templates/** — table-based layout and
deprecated presentational attrs are mandatory for email-client
compatibility (Gmail/Outlook/Apple Mail)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- EMAIL_RE in SendAbandonedCheckoutRecoveryUseCase: cap each segment with
explicit upper bounds (64/255/63) and length-check the full string against
RFC 5321's 320-char limit. The original pattern was already linear-time
(disjoint negated classes + literal anchors), but Sonar's S5852
conservatively flags unbounded `+` quantifiers; explicit bounds remove
the hotspot.
- Lift the duration-ms SQL fragment and the "done jobs since" WHERE clause
to module constants in PerformanceMetricsService: both the percentiles
query and the slowest-jobs query were ~30 lines of near-identical SQL
(Sonar flagged the 20% duplication on new code).
- Sonar cpd.exclusions: add src/data_layer/public/** (Kanel-generated,
Initializer/Mutator interfaces repeat the same column list 3x) and
web/src/pages/OpsPage/performanceTypes.ts (intentional duplication of
the server response shape because web cannot import from server).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
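A bounded pattern of the kind the EMAIL_RE change describes might look like the sketch below. The exact regex in the PR is not shown, so this pattern, the mapping of 64/255/63 to local part, domain, and final label, and the `isValidEmail` name are all assumptions.

```typescript
// Assumed reading of the limits: 64 for the local part, 255 for the domain,
// 63 for the final label, and 320 for the whole address (64 + '@' + 255).
const MAX_EMAIL_LENGTH = 320;

// Every quantifier carries an explicit upper bound, so no unbounded `+` remains.
const EMAIL_RE = /^[^\s@]{1,64}@[^\s@]{1,255}\.[^\s@.]{1,63}$/;

function isValidEmail(value: string): boolean {
  // Cheap length check first, so oversized input never reaches the regex.
  if (value.length > MAX_EMAIL_LENGTH) return false;
  return EMAIL_RE.test(value);
}
```

This is a shape check, not full RFC validation; it only filters out obviously malformed entries before an email is queued.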
Summary
Bundles items 1–7 from today's action-item list plus the Performance tab under /ops/performance. Single PR per Al's instruction.
- Country capture: reads CloudFront-Viewer-Country (with cf-ipcountry / x-vercel-ip-country fallbacks). Stored once on users.signup_country; lazy backfill on getLocals for existing users.
- Job timing: jobs.card_count persisted by CompleteJobUseCase; duration is computed from existing created_at ↔ last_edited_time, so no new timing columns.
- Performance tab: /ops/performance. Job duration percentiles (24h + 7d, p50 / p95 / p99), terminal-status breakdown, top 20 slowest jobs in last 24h, and signup-country breakdown over 7d. 30s auto-refresh.
- Abandoned-checkout recovery: email template (per email-templates.md), EmailService.sendAbandonedCheckoutRecoveryEmail, SendAbandonedCheckoutRecoveryUseCase (dedupes, validates, dry-run-by-default), and POST /api/ops/send-abandoned-checkout-recovery so Al can fire it once the 234 candidates are extracted from the Stripe export.
- US-localized pricing: when signupCountry === 'US' the /pricing intro switches to MCAT / USMLE / bar-exam framing. Identical 100-card cap everywhere, with no silent geo variance.
- Voice + docs: "100 cards per month" (marketing on /pricing) + "Your monthly limit: N cards" (in-product display).
- Post-mortem: Documentation/post-mortems/2024-10-subscriber-data-deletion.md retroactively captures the 472 → 206 paid-signup collapse caused by the subscriber-data deletion regression (PR #1591, "feat: Implement worker threads for upload processing", on Sep 1 lifted conversion; a missing await on Sep 28–Oct 14 ate the gains).
Why one PR
Per Al: bundle all of items 1–7 plus observability instrumentation. The pieces are independent in code but share one schema migration and one round of Kanel regen — splitting would have made the migration order brittle.
Out of scope (deferred follow-ups)
Test plan
- pnpm tsc --noEmit (server) ✓
- pnpm typecheck (web) ✓
- pnpm jest (server): 167 suites passed, 1161 tests passed
- pnpm --filter 2anki-web test (vitest): 67 files passed, 469 tests passed
- pnpm --filter 2anki-web lint (Biome) ✓
- Open /ops/performance in dev and confirm the four panels render (empty state acceptable until traffic hits new code)
- POST to /api/ops/send-abandoned-checkout-recovery with a dry-run body and confirm candidate count comes back
- Load /pricing while signed in from a US IP (or with a US row in users.signup_country) and confirm the MCAT/USMLE hero renders
Notes
🤖 Generated with Claude Code