docs(integrations): planning doc for Datadog integration (#15) #36
Merged
Pre-implementation plan for the Datadog integration that resolves the five open questions from #15, proposes a 5-PR breakdown, and surfaces the decisions that still need user input before slice 3 ships.

Headline decisions documented (all overridable on review):
- DORA Metrics API v2 (event-level), not the generic Metrics API
- Vercel Cron for the daily pull, not Supabase Edge Functions or a standalone worker
- Postgres pgcrypto with INTEGRATIONS_ENCRYPTION_KEY for at-rest credential encryption; KMS deferred to v2
- service→repo matched via Datadog's repository_url field with manual override when unmatched
- Initial backfill capped at 30 days to bound first-sync API cost

Five user-facing decisions are flagged at the end (backfill window, cron cadence, test Datadog key, first correlation card, Stage 3 opening). PR 1 (UI skeleton) ships independent of all of them; PRs 2-5 gate on these answers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
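The service→repo decision above can be illustrated with a minimal sketch. It is not the code from this PR: the function names, the `known_repos` lookup table, and the `overrides` mapping are hypothetical, and the normalization rules (lowercasing, dropping a `.git` suffix, accepting scp-style SSH remotes) are assumptions about what makes two `repository_url` values comparable.

```python
from urllib.parse import urlparse


def normalize_repo_url(url: str) -> str:
    """Reduce a repository URL to a comparable 'host/owner/name' key (assumed rules)."""
    url = url.strip()
    if "://" not in url:
        if "@" in url and ":" in url:
            # scp-style SSH remote, e.g. git@github.com:owner/repo.git
            user_host, path = url.split(":", 1)
            url = f"ssh://{user_host}/{path}"
        else:
            # Bare host/path with no scheme
            url = f"https://{url}"
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    path = parsed.path.strip("/")
    if path.endswith(".git"):
        path = path[: -len(".git")]
    return f"{host}/{path.lower()}"


def match_service_to_repo(service, repository_url, known_repos, overrides):
    """Return the repo id for a service, or None when it stays unmatched.

    known_repos: dict of normalized URL -> repo id (hypothetical shape)
    overrides:   dict of service name -> repo id set manually by an operator
    """
    # A manual override always wins over automatic matching.
    if service in overrides:
        return overrides[service]
    if not repository_url:
        return None
    return known_repos.get(normalize_repo_url(repository_url))
```

A service that returns `None` here would be the kind of entry counted as unmatched and left for a manual override.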
trentas added a commit that referenced this pull request on May 13, 2026
Bumps platform/package.json (1.0.0 → 1.0.6, catching up from the initial scaffold), pyproject.toml (1.0.5 → 1.0.6), and iris/cli.py:VERSION (v1.0.5 → v1.0.6).

Adds the CHANGELOG entry for v1.0.6 covering the Datadog integration end-to-end across slices 1-5 (PRs #36, #37, #39, #40, #41, #42). Highlights:
- Connect flow + encrypted credentials (slice 2)
- Daily Vercel Cron sync into external_deployments / _commits / _incidents (slice 3)
- Engine consumes events and emits 18 new dora_* fields including CFR / MTTR per-deploy / MTTR per-incident / rollback rate / lead time / deploy frequency / by-origin breakdowns (slice 4 + 5)
- Dashboard DORA section with the "Datadog" badge and the AI-vs-human correlation card (slice 5)
- Setup docs at docs/integrations/datadog.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
trentas added a commit that referenced this pull request on May 13, 2026
#15) (#42)

* feat(dashboard): DORA section + CFR-by-origin correlation + setup docs (#15)

Slice 5: closes the Datadog integration loop. The dashboard now surfaces the dora_* metric family with a "Datadog" badge, an AI-vs-human CFR correlation card backed by a per-commit join, and a silent-decay guard on the integration detail page. Final piece: customer-facing setup documentation.

Engine (Python):
- iris/analysis/dora_real.py: new ``cfr_by_origin`` / ``rollback_rate_by_origin`` breakdowns when the aggregator passes the local commit-origin map. Per-commit join: each commit on each evaluated deploy is bucketed by its origin; commits not present in the local window are dropped silently but reflected in coverage_pct so the dashboard can warn when attribution is thin.
- iris/metrics/aggregator.py: passes ``origin_map`` through to ``analyze_dora_real``.
- iris/models/metrics.py: adds ``dora_cfr_by_origin`` and ``dora_rollback_rate_by_origin``.
- tests/test_dora_real.py: 4 new tests covering the per-commit join, unknown-commit handling with coverage reporting, rollback filtering, and the no-origin-map default.

Platform:
- src/types/metrics.ts: TS mirrors of the two new dora_* fields.
- src/types/org-summary.ts: new OrgDORA aggregation type.
- lib/queries/org-summary.ts: computeDORA() sums deploys / failures / rollbacks across repos, weights CFR by evaluated deploys, and aggregates the by-origin breakdown. Returns null when no repo has an active integration.
- src/app/[tenant]/dashboard/sections/DORAOverview.tsx: headline cards (CFR, MTTR per failed deploy, deploy frequency, lead time) plus a "Datadog" badge, a fact strip (deploys / rollback rate / pending), and the CFR-by-origin + rollback-rate-by-origin correlation tables. The correlation card stays hidden until the org has ≥ 10 failed deploys (per §9.6; was 10 incidents pre-revision).
- src/app/[tenant]/dashboard/page.tsx: wires the new section in.
- src/components/integrations/datadog-connect-form.tsx + src/app/[tenant]/settings/integrations/[provider]/page.tsx: the §9.8 silent-decay hint, "last incident registered X days ago", on the detail page. Days are server-computed to keep the client component pure.
- platform/lib/translations.ts: full en + pt-br copy for the new surfaces.
- platform/tests/dora-aggregation.test.ts: 4 tests for computeDORA().

Docs:
- docs/integrations/datadog.md: customer setup guide. Covers the Application Key scope, regional sites, the connect flow, the cron schedule, what we read / don't read, repository matching, the disconnect behavior, and operational notes (backfill window, rate limits, encryption rotation).
- docs/METRICS.md: adds the two new dora_*_by_origin fields and the module-map row.

Verified:
- python -m pytest tests/ -q → 113 passed (4 new)
- platform: npx tsc --noEmit → clean
- platform: npx vitest run → 175 passed (4 new)
- platform: npx eslint → clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(integrations): coverage_pct math + dead code + form gating on error (#15)

Three issues surfaced in the slice 5 audit:

1. `iris/analysis/dora_real.py`: the per-origin `coverage_pct` divided each origin's commits by `(this origin + ALL unknowns)`, so every origin's coverage dropped by the full unknown count. The right semantic is org-wide attribution coverage. Hoisted to a single result field `cfr_by_origin_coverage_pct` and removed from each per-origin dict.
2. Same file: `_referenced` was assigned and immediately popped from the dict; dead code, dropped.
3. `platform/src/components/integrations/datadog-connect-form.tsx`: the connected card only rendered when `status === "active"`, so an integration in `status: "error"` fell through to the connect form and lost the very surfaces (last_sync_at, last_error, unmatched count, days-since-last-incident) the operator needs to debug. Now renders the status card for both `active` and `error`, with the shield icon and copy switched to an error variant when the sync is failing.

Schema / TS / docs aligned:
- `iris/models/metrics.py` adds `dora_cfr_by_origin_coverage_pct`.
- `iris/metrics/aggregator.py` wires it.
- `platform/src/types/metrics.ts` drops `coverage_pct` from the per-origin shape and adds the new top-level field.
- `docs/METRICS.md` updates the field table and the explanatory blurb; module-map row picks up the new field.
- `platform/lib/translations.ts`: en + pt-br copy for the new error state.

Tests:
- `tests/test_dora_real.py`: old `coverage_pct` assertion replaced by two focused tests (mixed known/unknown drops org-wide coverage; full attribution reports 100%; no origin map → None).
- `platform/tests/dora-aggregation.test.ts`: adjusts mock payloads to drop the (now-removed) `coverage_pct` field on per-origin entries.

Verified: pytest 115 passed (16 dora_real tests), tsc clean, vitest 175 passed, eslint clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(platform): restore build version in footer (#15)

The footer's `process.env.NEXT_PUBLIC_BUILD_VERSION || "dev"` lookup fell back to "dev" on every Vercel deploy because the env var was never wired up. Reads from `package.json` at config-load time and appends the Vercel commit SHA (`VERCEL_GIT_COMMIT_SHA`) when present so production / preview deploys carry a unique identifier between releases.

Loaded via fs instead of an ESM JSON import to stay portable across Next's TS loader and direct Node ESM execution (the latter requires `with { type: "json" }`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(release): v1.0.6 - Datadog DORA integration (#15)

Bumps platform/package.json (1.0.0 → 1.0.6, catching up from the initial scaffold), pyproject.toml (1.0.5 → 1.0.6), and iris/cli.py:VERSION (v1.0.5 → v1.0.6).

Adds the CHANGELOG entry for v1.0.6 covering the Datadog integration end-to-end across slices 1-5 (PRs #36, #37, #39, #40, #41, #42). Highlights:
- Connect flow + encrypted credentials (slice 2)
- Daily Vercel Cron sync into external_deployments / _commits / _incidents (slice 3)
- Engine consumes events and emits 18 new dora_* fields including CFR / MTTR per-deploy / MTTR per-incident / rollback rate / lead time / deploy frequency / by-origin breakdowns (slice 4 + 5)
- Dashboard DORA section with the "Datadog" badge and the AI-vs-human correlation card (slice 5)
- Setup docs at docs/integrations/datadog.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
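The `coverage_pct` fix described in the squashed commits is easier to follow with numbers. The sketch below is illustrative only, not the contents of `iris/analysis/dora_real.py`: the `(commit_shas, failed)` deploy shape, the function name, and the per-commit definition of CFR are assumptions. The one thing it takes from the commit message is that coverage is computed once, org-wide, as known commits over all commits seen on evaluated deploys.

```python
from collections import defaultdict


def cfr_by_origin_sketch(deploys, origin_map=None):
    """Bucket each deploy's commits by origin and report org-wide coverage.

    deploys:    iterable of (commit_shas, failed) pairs  (hypothetical shape)
    origin_map: dict of sha -> origin label, or None when no map is available
    """
    if origin_map is None:
        # No origin map: no breakdown and no coverage figure.
        return {"by_origin": {}, "coverage_pct": None}

    totals = defaultdict(lambda: {"commits": 0, "failed": 0})
    known = unknown = 0
    for commit_shas, failed in deploys:
        for sha in commit_shas:
            origin = origin_map.get(sha)
            if origin is None:
                # Commit is outside the local window: dropped from the
                # breakdown but still counted against coverage.
                unknown += 1
                continue
            known += 1
            totals[origin]["commits"] += 1
            if failed:
                totals[origin]["failed"] += 1

    by_origin = {
        origin: {**t, "cfr": t["failed"] / t["commits"]}
        for origin, t in totals.items()
    }
    seen = known + unknown
    # One org-wide figure: known commits over every commit seen.
    coverage = round(100.0 * known / seen, 1) if seen else None
    return {"by_origin": by_origin, "coverage_pct": coverage}
```

With two `ai` commits, one `human` commit, and one unknown commit, this reports a single coverage of 75.0. Under the pre-fix formula, `(this origin) / (this origin + ALL unknowns)`, the same data would have given 66.7% for `ai` and 50% for `human`, each penalized by the full unknown count.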
Summary
Pre-implementation plan for #15. Doesn't ship code — resolves the five open questions in the issue, lays out a 5-PR breakdown, and flags the decisions that still need explicit user input before ingestion slices can land.
Decisions made (all overridable on review):
- `commit_sha`/`repository_url`: needed for attribution.
- `pgcrypto` + env master key
- `organizations` access.
- `repository_url` + manual override

PR plan (each ~independent, depends only on the prior):
The user can stop at any boundary: slice 2 alone gives a "Connect Datadog" UX useful for marketing/sales validation, slice 3 starts producing data, and so on.
Decisions still needed from the user (block slice 3+)
PR 1 ships independent of all of these.
Test plan
🤖 Generated with Claude Code