Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
82 changes: 82 additions & 0 deletions docs/PLAN-datadog.md
Original file line number Diff line number Diff line change
Expand Up @@ -319,3 +319,85 @@ PR 1 (skeleton), but PR 3 (ingestion) needs all of them resolved.
Datadog integration is shipped end-to-end.

Once these are answered, PR 1 can ship.

---

## 8. Revision (2026-05-13) — list endpoint is POST-with-body, not GET

The original §1 picked **DORA Metrics API v2 (read endpoints)** as the
source for raw deployment and failure events. That was directionally
correct. An earlier draft of this section claimed the API was
write-only based on GET probes that returned 404 — that conclusion was
wrong. Datadog's list-with-filters endpoints accept POST with a JSON
body (the filter object doesn't fit cleanly as URL query params),
which is a known but non-REST-standard pattern. Re-probing with POST:

```
POST /api/v2/dora/deployments (no auth, valid body shape) → HTTP 401
POST /api/v2/dora/failures (no auth, valid body shape) → HTTP 401
GET /api/v2/dora/deployments/test-id (no auth) → HTTP 401
```

401 (not 404) means the endpoints exist and reject only for missing
auth — confirming the original §1 plan. The 404s observed earlier were
Datadog returning "no GET handler at this path" instead of HTTP 405
Method Not Allowed.

Slice 1 (UI skeleton) was unaffected. The rest of the original plan
stands; only the HTTP verb and request shape change for the list calls.

### The four endpoints that matter

| Endpoint | Method | Purpose |
|---|---|---|
| `/api/v2/dora/deployments` | POST | List deployment events with filters (`from`, `to`, `query`, `limit`, `sort`). Body wraps the filters under `data.attributes`, `data.type = "dora_deployments_list_request"`. |
| `/api/v2/dora/failures` | POST | List failure/incident events with the same filter shape; `data.type = "dora_failures_list_request"`. |
| `/api/v2/dora/deployments/{id}` | GET | Single deployment detail (used to enrich a row on demand). |
| `/api/v2/dora/failures/{id}` | GET | Single failure detail. |

Each deployment event exposes — per Datadog's published reference —
`service`, `team`, `env`, `version`, `repository_id`, `change_failure`
(bool), `recovery_time`, `avg_change_lead_time`, commit count, and the
constituent lead-time stages (review, merge, time-to-deploy). That's
more than enough to populate `external_deployments` with commit-linked
detail and to compute per-origin CFR — the original correlation card
goal in §7 #4 is back on the table.

**One real caveat carried over** from the doc you shared: DORA event
retention on Datadog is **2 years**. Initial backfill window is fine
inside that envelope (the §7 default was 30 days); we shouldn't try to
go further back than 24 months on first sync.

### Impact on the rest of the plan

- **§1 row 1** ("DORA Metrics API v2 (read endpoints)") — the spirit
stays, but the wording was off. Read = `POST .../deployments` and
`POST .../failures` (list with body filters) plus `GET .../{id}`
(single record). Treat those as the canonical read surface.
- **§3 schemas** — no change. `external_deployments` and
`external_incidents` were modeled on the same fields the DORA event
payload provides.
- **§4 PR breakdown** — no slice changes. Slice 3 stays one chunk.
- **§7 decision #4** — "CFR by code origin" is feasible after all;
keep as the first correlation card.
- **§7 decision #6** introduced in the earlier (incorrect) draft of
this §8 — drop it. There's no A/B/C choice; we go with the
Option B / "raw events" path that was the original intent.

### Validation status

The two probe runs (GET → 404, POST → 401) confirm endpoints exist.
The dev keys provided in this session returned 403 on
`/api/v1/validate` so we don't yet have a happy-path response to
inspect — but the doc and the structural probes agree, and the user
provided a concrete `curl` example that matches the request shape
documented above. Before slice 3 starts (raw event ingestion), we need
working keys to:

- Run a real `POST /api/v2/dora/deployments` and confirm the response
envelope shape matches what the schema in §3 expects.
- Confirm pagination cursor field name and behavior on real responses
(the doc's `attributes.next_token` vs an envelope-level cursor — the
exact spelling has changed between API versions before).
- Decide whether `query` should default to `*` or to `env:production`
when the customer hasn't customized it.
Loading