docs: product telemetry taxonomy (#341)#741
Conversation
Define event naming convention (noun.verb), categories, per-event property contracts, privacy guardrails, and R1/R2/R3 launch-gate telemetry anchors. Telemetry is opt-in and disabled by default.
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
There was a problem hiding this comment.
Code Review
This pull request introduces a comprehensive telemetry taxonomy for Taskdeck, establishing strict privacy guardrails and standardized event naming conventions. The feedback focuses on improving consistency across the document, specifically recommending the use of bucketing for item counts to enhance privacy, standardizing duration units to milliseconds, and ensuring HTTP status codes are consistently represented as numeric types.
| ### `proposal.list_loaded` | ||
| Review/proposals list loaded successfully. | ||
|
|
||
| Required: `pending_count: number` |
There was a problem hiding this comment.
Inconsistency in count collection. While capture.inbox_loaded (line 115) and board.loaded (line 181) use bucketing (e.g., item_count_bucket) to protect user privacy, proposal.list_loaded uses a raw number. To align with the "privacy-first" stance and prevent potential fingerprinting via exact item counts, this should also be bucketed. This recommendation also applies to approved_count (line 165), rejected_count (line 170), and proposals_created (line 322).
| Required: `pending_count: number` | |
| Required: pending_count_bucket: string — bucketed: empty, small (1–10), medium (11–50), large (>50) |
| User explicitly approved a proposal. | ||
|
|
||
| Required: `proposal_id: string`, `proposal_risk_level: string` | ||
| Optional: `time_to_decision_seconds: number` — seconds from `proposal.opened` to approval |
There was a problem hiding this comment.
Inconsistency in duration units. Most events in this taxonomy use _ms (milliseconds), but proposal.approved and proposal.rejected use seconds. Standardizing on milliseconds across all events ensures consistency in the telemetry pipeline and avoids unit conversion errors during analysis. Please update line 149 as well for consistency.
| Optional: `time_to_decision_seconds: number` — seconds from `proposal.opened` to approval | |
| Optional: time_to_decision_ms: number — milliseconds from proposal.opened to approval |
| ### `auth.login_failed` | ||
| Login attempt failed. | ||
|
|
||
| Required: `error_code: string` — use HTTP status code as string (e.g. `401`, `429`). Do not include credential content. |
There was a problem hiding this comment.
Inconsistency in HTTP status code representation. auth.login_failed uses error_code as a string, while error.api_request_failed (line 354) uses status_code as a number. It is recommended to use status_code: number for HTTP status codes to maintain consistency with the rest of the taxonomy and reserve error_code: string for application-specific error identifiers.
| Required: `error_code: string` — use HTTP status code as string (e.g. `401`, `429`). Do not include credential content. | |
| Required: status_code: number — HTTP status code (e.g. 401, 429). Do not include credential content. |
Clarify that capture.triage_clicked item_id is opaque UUID only. Enumerate surface values for error.unhandled to match page.loaded enum plus component-level values, closing adversarial review gaps.
Adversarial self-reviewPII protection: All event properties are opaque UUIDs, counts, durations, booleans, or fixed-enum values. No card/board names, usernames, email addresses, or free-text fields appear anywhere. The `input_length_bucket` pattern avoids exact-length inference. The "What NOT to Collect" table is concrete with examples. Pass. Naming convention consistency: All 35+ events follow `noun.verb` with lowercase underscores for multi-word nouns. No abbreviations. Consistent throughout. Pass. Core flow coverage: The primary Taskdeck loop (capture → proposal → approve → execute → board) is fully instrumented. The R1 launch gate explicitly calls out the `capture.submitted → proposal.approved → proposal.executed` funnel as a required observable signal. Pass. Gaps identified and fixed (follow-up commit f4f40fe):
Governance checks: `check-docs-governance.mjs` and `check-github-ops-governance.mjs` both pass. No further issues found. |
There was a problem hiding this comment.
Pull request overview
Adds a new canonical documentation spec for Taskdeck’s product telemetry taxonomy, aiming to standardize event naming, property contracts, and privacy guardrails to support later implementation and release-gate validation.
Changes:
- Defines a
noun.verbtelemetry naming convention plus universal envelope fields and privacy “never collect” rules. - Documents 7 telemetry event categories (Capture, Proposal/Review, Board, Auth, Navigation, Agent, Error) with required/optional properties.
- Introduces R1/R2/R3 launch-gate “minimum signal coverage” anchors tied to specific telemetry signals.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| - **Verb**: the action or state transition (e.g. `created`, `approved`, `loaded`, `failed`) | ||
|
|
||
| Examples: | ||
| - `card.created` | ||
| - `proposal.approved` | ||
| - `session.started` | ||
| - `agent_run.completed` | ||
|
|
||
| Rules: | ||
| - All names are lowercase with underscores for multi-word nouns (e.g. `agent_run`, `first_run`). | ||
| - Verbs use past tense for completed actions (`created`, `approved`, `failed`) and present tense for states (`loaded`, `opened`). |
There was a problem hiding this comment.
The naming rules here don’t match the event list below: many verbs use underscores (e.g. modal_opened, inbox_loaded, login_failed) even though the rule only mentions underscores for multi-word nouns, and the “present tense for states” examples (loaded, opened) are not present tense. Please tighten the convention to reflect the actual intended grammar/format (e.g., explicitly allow underscores in both segments and clarify tense/voice for state events).
| - **Verb**: the action or state transition (e.g. `created`, `approved`, `loaded`, `failed`) | |
| Examples: | |
| - `card.created` | |
| - `proposal.approved` | |
| - `session.started` | |
| - `agent_run.completed` | |
| Rules: | |
| - All names are lowercase with underscores for multi-word nouns (e.g. `agent_run`, `first_run`). | |
| - Verbs use past tense for completed actions (`created`, `approved`, `failed`) and present tense for states (`loaded`, `opened`). | |
| - **Verb**: the action, outcome, or resulting state transition (e.g. `created`, `approved`, `loaded`, `failed`) | |
| Examples: | |
| - `card.created` | |
| - `proposal.approved` | |
| - `session.started` | |
| - `agent_run.completed` | |
| - `inbox.marked_read` | |
| Rules: | |
| - Event names consist of two dot-separated segments: `noun.verb`. | |
| - All segments are lowercase. Use underscores within any multi-word segment for legibility (e.g. `agent_run`, `first_run`, `marked_read`). | |
| - Use action/outcome names in past tense or past-participle form for completed actions and resulting state events (e.g. `created`, `approved`, `loaded`, `opened`, `failed`). |
| ### `capture.submitted` | ||
| User submitted a capture item. | ||
|
|
||
| Required: `has_attachment: boolean` |
There was a problem hiding this comment.
capture.submitted doesn’t include any source/channel field, but the R3 launch-gate anchors later require measuring capture.submitted from inbound/import paths (source: import). Add a property contract that makes this measurable (e.g., optional/required source with an explicit enum) or adjust the R3 anchor to use an event that actually carries that distinction.
| Required: `has_attachment: boolean` | |
| Required: `has_attachment: boolean`, `source: string` — submission origin: `manual`, `import` |
| ### `page.loaded` | ||
| A route-level page loaded successfully. | ||
|
|
||
| Required: `page: string` — enumerated: `home`, `today`, `inbox`, `review`, `board`, `metrics`, `agents`, `agent_run_detail`, `knowledge`, `settings`, `help` |
There was a problem hiding this comment.
The page.loaded page enum includes values that don’t currently exist as route-level pages in the web app (e.g. agents, agent_run_detail, knowledge, help). This can lead to inconsistent instrumentation and dashboards. Consider either aligning the enum to current router surfaces (or route.name values), or explicitly marking these as reserved/future values with a pointer to the owning issue/PR.
| Required: `page: string` — enumerated: `home`, `today`, `inbox`, `review`, `board`, `metrics`, `agents`, `agent_run_detail`, `knowledge`, `settings`, `help` | |
| Required: `page: string` — current route-level values: `home`, `today`, `inbox`, `review`, `board`, `metrics`, `settings` | |
| Reserved/future values (do not emit until corresponding router surfaces exist and instrumentation is added; tracked via Related: #341, #77): `agents`, `agent_run_detail`, `knowledge`, `help` |
| ### `session.started` | ||
| User started a new authenticated session. | ||
|
|
||
| Required: `auth_method: string` — `password`, `oauth_google`, `oauth_github` | ||
|
|
||
| ### `session.ended` | ||
| User explicitly signed out. | ||
|
|
||
| Required: *(envelope only)* | ||
|
|
||
| ### `session.expired` |
There was a problem hiding this comment.
The session.* event namespace is easy to confuse with the universal envelope’s session_id (app session) since both use “session” terminology but refer to different concepts here (auth session lifecycle vs app session identifier). Consider renaming the auth events to something less ambiguous (e.g. auth_session.started / auth_session.ended or login.succeeded / logout.completed).
| ### `session.started` | |
| User started a new authenticated session. | |
| Required: `auth_method: string` — `password`, `oauth_google`, `oauth_github` | |
| ### `session.ended` | |
| User explicitly signed out. | |
| Required: *(envelope only)* | |
| ### `session.expired` | |
| ### `auth_session.started` | |
| User started a new authenticated session. | |
| Required: `auth_method: string` — `password`, `oauth_google`, `oauth_github` | |
| ### `auth_session.ended` | |
| User explicitly signed out. | |
| Required: *(envelope only)* | |
| ### `auth_session.expired` |
- Fix naming convention rules: clarify underscores allowed in verb segments, standardize on past-tense/past-participle (not "present tense for states"), update examples to match - Add `source` field to `capture.submitted` to satisfy R3 anchor which requires measuring import-path submissions - Bucket `pending_count`, `approved_count`, `rejected_count`, and `proposals_created` to match privacy stance applied elsewhere - Standardize duration units: rename `time_to_decision_seconds` to `time_to_decision_ms` across proposal.approved and proposal.rejected - Fix `auth.login_failed` HTTP status type: use `status_code: number` to match `error.api_request_failed` (was `error_code: string`) - Rename `session.*` auth events to `auth_session.*` to avoid semantic clash with the universal envelope's `session_id` - Mark `agents`, `agent_run_detail`, `knowledge`, `help` as reserved/future in page.loaded enum (routes not yet shipped) - Add `mcp_tool.invoked` and `mcp_tool.failed` events for shipped MCP server (PR #739) - Document that `settings.telemetry.enabled` and the telemetry service are not yet implemented; taxonomy must precede instrumentation
Updates STATUS.md, TESTING_GUIDE.md, MANUAL_TEST_CHECKLIST.md, and IMPLEMENTATION_MASTERPLAN.md to reflect the 9 PRs merged in the 2026-04-04 post-adversarial-review hardening wave. - STATUS.md: adds new wave entry for PRs #741–#756 covering product telemetry taxonomy, two bug fixes (#683/#744, #680/#754), six frontend regression test additions, and two backend webhook test additions; updates Last Updated date - TESTING_GUIDE.md: updates frontend count from 1496 to 1592, backend estimate to ~3010+, combined total to ~4600+; marks #710 and #726 as delivered in wave table; adds resolved entries to Key Gaps section; updates telemetry taxonomy note to delivered; appends three new coverage sections for webhook HMAC, SSRF/delivery reliability, and frontend regression wave - MANUAL_TEST_CHECKLIST.md: adds auth-flow toast check (Section A), WIP-limit dedup check (Section B), manual card provenance empty state check (Section B), board header presence label check (Section B), and D2 router auth guard and workspace state section - IMPLEMENTATION_MASTERPLAN.md: adds item 128 for the post-adversarial hardening wave with all 9 issues, fixes, and wave progress update
Summary
noun.verb), 7 event categories, per-event property contracts, and hard privacy guardrails for Taskdeck product telemetryTest plan
GOLDEN_PRINCIPLES.md(GP-06 review-first, GP-08 product legibility, GP-09 traceable agent expansion)noun.verbconvention throughout