feat(alerting): handle Sentry resource:issue webhooks (Internal Integration surface)#1291

Merged
zbigniewsobiecki merged 2 commits into dev from feat/sentry-issue-lifecycle-trigger
May 9, 2026

@zbigniewsobiecki
Member

Summary

Closes the silent-skip path for Sentry's "Internal Integration" / Custom Webhook surface. Prod 2026-05-09: a wedged-lock-canary alert fired in the cascade Sentry project, the alerting agent was enabled, but no agent ran. Webhook log id fbdc6d87-b962-444c-8a2a-a9452a74ff71 shows processed=false, decisionReason="Event unparseable or not processable" — the trigger was never invoked.

Root cause: src/router/adapters/sentry.ts:31 whitelisted only ['event_alert', 'metric_alert'] (Sentry Alert Rule surfaces). The webhook arrived with Sentry-Hook-Resource: issue (Internal Integration default surface — the natural way users wire Sentry → cascade). Spec 019 was scoped to event_alert; the issue-lifecycle path was deferred and never landed. Users who configured Sentry the natural way got silent skips for every issue.
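The allowlist behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the adapter's actual code: only `PROCESSABLE_RESOURCES`, the resource names, and the `decisionReason` string come from this PR, while `isProcessableResource`, `decide`, and `RouteDecision` are hypothetical names.

```typescript
// Sketch of the router allowlist; helper names are assumptions for illustration.
const PROCESSABLE_RESOURCES = ['event_alert', 'metric_alert', 'issue'] as const;
type ProcessableResource = (typeof PROCESSABLE_RESOURCES)[number];

// Narrow the raw Sentry-Hook-Resource header value to a known resource.
function isProcessableResource(value: string | undefined): value is ProcessableResource {
  return value !== undefined && PROCESSABLE_RESOURCES.some((r) => r === value);
}

interface RouteDecision {
  processed: boolean;
  decisionReason?: string;
}

// Anything outside the allowlist is skipped with the reason seen in the webhook log.
function decide(hookResource: string | undefined): RouteDecision {
  if (!isProcessableResource(hookResource)) {
    return { processed: false, decisionReason: 'Event unparseable or not processable' };
  }
  return { processed: true };
}
```

Before this change the list lacked `'issue'`, so a delivery with `Sentry-Hook-Resource: issue` would have taken the skip branch.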

Changes

This is Option B end-to-end — extend cascade to natively process resource: issue webhooks, mirroring the existing event_alert shape.

Code

| File | Change |
| --- | --- |
| src/router/adapters/sentry.ts:31 | PROCESSABLE_RESOURCES += 'issue' |
| src/sentry/types.ts | Fix SentryIssuePayload to match actual webhook shape (nested data.issue.{...} instead of flattened data.{...} — the captured prod fixture confirmed the existing type was wrong) |
| src/integrations/alerting/_shared/types.ts | AlertSource += 'sentry-issue' (distinct literal so the partial-unique index on pr_work_items doesn't collide if the same Sentry issue arrives via both surfaces) |
| src/integrations/alerting/_shared/format.ts | Add formatSentryIssueLifecycleCardBody — builds AlertHints from data.issue.{title, web_url, level, shortId, culprit, metadata.{filename, function}} |
| src/triggers/sentry/alerting-issue-lifecycle.ts | NEW SentryIssueLifecycleTrigger — matches resource: 'issue' + action: 'created', fires the alerting agent. Resolved/archived/etc. lifecycle actions are deferred (would auto-close the cascade card; out of scope) |
| src/triggers/sentry/register.ts | Register the new handler |
| src/triggers/sentry/webhook-handler.ts | Extract a materializeSentryAlertWorkItem helper with three branches (event_alert / metric_alert / issue-lifecycle) keyed on agentInput.triggerEvent. Same AlertSlotMissingError graceful-skip + transient-PM-error retry semantics as before |
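The three-branch dispatch in the last row might look something like the sketch below. It is an assumption-laden illustration: `planMaterialization` and `MaterializePlan` are hypothetical names, the pairing of event_alert with `'alerting:issue-alert'` is inferred from the event names mentioned in review, and the metric formatter's name is not given in this PR, so it is left as `null`.

```typescript
type AlertSource = 'sentry' | 'sentry-metric' | 'sentry-issue';

// Event names as they appear in this PR's discussion of alerting.yaml.
type TriggerEvent =
  | 'alerting:issue-alert'
  | 'alerting:metric-alert'
  | 'alerting:issue-lifecycle';

interface MaterializePlan {
  source: AlertSource;
  // Formatter name, or null where this PR does not name it.
  formatter: string | null;
}

// Pick the AlertSource and card formatter from the trigger event.
function planMaterialization(triggerEvent: TriggerEvent): MaterializePlan {
  switch (triggerEvent) {
    case 'alerting:issue-alert':
      return { source: 'sentry', formatter: 'formatSentryCardBody' };
    case 'alerting:metric-alert':
      return { source: 'sentry-metric', formatter: null };
    case 'alerting:issue-lifecycle':
      return { source: 'sentry-issue', formatter: 'formatSentryIssueLifecycleCardBody' };
  }
}
```

Keying on a closed union means the compiler flags any future trigger event that lacks a branch.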

Tests

  • NEW tests/unit/triggers/sentry/issue-lifecycle-format.test.ts (7 tests) — pins formatSentryIssueLifecycleCardBody against the captured prod fixture
  • NEW tests/unit/triggers/sentry/issue-lifecycle.test.ts (18 tests) — pins handler matches() (issue-resource only, action=created only, distinct from event_alert) and handle() (alertIssueId, alertOrgId, alertTitle, lockKey/coalesceKey namespacing, deferred materialisation, slot-missing pre-flight)
  • Extended tests/unit/triggers/sentry-webhook-handler.test.ts (+5 tests) — pins materializer-dispatch picks formatSentryIssueLifecycleCardBody + 'sentry-issue' AlertSource for triggerEvent: 'alerting:issue-lifecycle'; existing event_alert + metric_alert paths unchanged
  • Updated tests/unit/router/adapters/sentry.test.ts — flips the previously-asserting-rejection cases to assert acceptance for resource: 'issue'
  • Updated tests/unit/triggers/builtins.test.ts — bumps registered-handler count 24 → 25
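The matching rule those handler tests pin can be sketched as a plain predicate. This is a simplified stand-in for the handler's matches(), assuming the webhook has already been reduced to its resource and action; `SentryWebhookSummary` and `matchesIssueLifecycle` are hypothetical names.

```typescript
// Simplified view of an incoming Sentry webhook (hypothetical shape).
interface SentryWebhookSummary {
  resource: string;
  action: string;
}

// Only newly created issues on the `issue` resource fire the alerting agent.
// Other lifecycle actions (resolved, archived, ...) are deliberately not matched.
function matchesIssueLifecycle(hook: SentryWebhookSummary): boolean {
  return hook.resource === 'issue' && hook.action === 'created';
}
```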

Drive-by

Refactored materializeAlertWorkItem (per discussion in review): extracted reuseOrLazyHealMapping and pollForConcurrentWinner helpers, bringing the parent under the cognitive-complexity ceiling. No behavioural change — same idempotency contract (existing tests at tests/unit/triggers/sentry/alerting-issue-materializer.test.ts continue to pass unchanged).

Distinctness from event_alert

Both surfaces can deliver for the same Sentry issue ID. The new 'sentry-issue' literal isolates the materializer's dedup namespace:

event_alert  → AlertSource='sentry'        → external_id=<issueId>
metric_alert → AlertSource='sentry-metric' → external_id=<orgSlug>:<ruleTitle>
issue        → AlertSource='sentry-issue'  → external_id=<issueId>   ← NEW

Two surfaces, same Sentry issue ID → two cards materialize (one per surface). That's the safe default; collapsing across surfaces would need a separate decision.

lockKey / coalesceKey also use a sentry-issue: namespace distinct from the existing sentry: (event_alert) so concurrent deliveries via both surfaces don't lock-contend.
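To make the namespacing concrete, here is a small in-memory sketch. It is not the materializer: `WorkItemIndex` is a hypothetical stand-in for the partial-unique index on (project_id, external_source, external_id), and the exact lockKey format after the prefix is an assumption.

```typescript
type AlertSource = 'sentry' | 'sentry-metric' | 'sentry-issue';

interface WorkItemKey {
  projectId: string;
  externalSource: AlertSource;
  externalId: string;
}

// In-memory stand-in for the unique index: one work item per distinct key.
class WorkItemIndex {
  private readonly items = new Map<string, string>();
  private nextId = 1;

  // Returns the existing work item id for the key, or creates a new one.
  materialize(key: WorkItemKey): string {
    const k = JSON.stringify([key.projectId, key.externalSource, key.externalId]);
    const existing = this.items.get(k);
    if (existing !== undefined) return existing;
    const id = `wi-${this.nextId++}`;
    this.items.set(k, id);
    return id;
  }
}

// Assumed key format: the source literal as prefix, then the external id.
function lockKeyFor(source: AlertSource, externalId: string): string {
  return `${source}:${externalId}`;
}
```

With this shape, the same issue ID arriving as `'sentry'` and as `'sentry-issue'` yields two work items, while a redelivery on the same surface returns the existing one.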

Operator pre-req (no code change)

Cascade project's PM lists.alerts (Trello) / statuses.alerts (JIRA, Linear) must be configured for materialisation to actually create a card. The pre-flight validation rule at src/triggers/shared/integration-validation.ts already emits a pm-category error when alerting is enabled but the slot is unset — unchanged; the same message will fire for the new 'sentry-issue' source. Configure via the dashboard's PM wizard "Status Mapping" → "Alerts" row.

Test plan

  • npm test: 9112 / 9112 passing (3 new test files + extensions)
  • npm run lint — clean (0 errors, 0 warnings; the prior materialize.ts complexity warning is gone after the drive-by refactor)
  • npm run typecheck — clean
  • Pre-push hook ran the full suite green
  • Captured prod fixture (fbdc6d87-b962-444c-8a2a-a9452a74ff71) used as regression baseline in the new tests
  • Existing event_alert and metric_alert flows pinned unchanged (regression net in extended sentry-webhook-handler tests)
  • Idempotency contract pinned: same fixture twice → same workItemId via the partial-unique-index path (existing materializer tests cover this)

Out of scope (deferred)

  • Other Sentry actions for resource: issue ('resolved', 'unresolved', 'archived', 'assigned'). First cut handles 'created' only. Auto-closing the cascade work item on Sentry resolution is spec-019 §7's deferred concern (likely a future spec 020).
  • Coalescing or deduping across surfaces (event_alert + issue for the same Sentry issue ID).
  • Operator-UX for the lists.alerts slot (separate ergonomics ticket).

🤖 Generated with Claude Code

…ration surface)

Closes the silent-skip path for Sentry's "Internal Integration" / Custom
Webhook surface. Prod 2026-05-09: a wedged-lock-canary alert fired in the
cascade Sentry project, the alerting agent was enabled, but no agent ran.
Webhook log id `fbdc6d87-b962-444c-8a2a-a9452a74ff71` shows
`processed=false, decisionReason="Event unparseable or not processable"` —
the trigger was never invoked.

Root cause: `src/router/adapters/sentry.ts:31` whitelisted only
`['event_alert', 'metric_alert']` (Sentry Alert Rule surfaces). The webhook
arrived with `Sentry-Hook-Resource: issue` (Internal Integration default
surface — the natural way users wire Sentry → cascade). Spec 019 was
scoped to event_alert; the issue-lifecycle path was deferred and never
landed. Users who configured Sentry the natural way got silent skips for
every issue.

This adds end-to-end support, mirroring the event_alert pattern:

- Router adapter accepts `'issue'` resource + new test asserting it parses.
- New `SentryIssueLifecycleTrigger` (matches `resource: 'issue'` +
  `action: 'created'`) fires the alerting agent. Resolved/archived/etc.
  actions are deferred (would auto-close the cascade card; out of scope).
- Distinct AlertSource literal `'sentry-issue'` so the
  `(project_id, external_source, external_id)` partial-unique index on
  `pr_work_items` doesn't collide if the same Sentry issue arrives via
  both surfaces (event_alert and issue) — each surface materializes its
  own card.
- New `formatSentryIssueLifecycleCardBody` builds AlertHints from
  `data.issue.{title, web_url, level, shortId, culprit, metadata.{filename, function}}`.
  Mirror of `formatSentryCardBody` adapted to the issue-lifecycle payload shape.
- Worker-side `processSentryWebhook` extends the materializer dispatch with
  a third branch keyed on `agentInput.triggerEvent === 'alerting:issue-lifecycle'`.
  Same AlertSlotMissingError graceful-skip + transient-PM-error retry semantics
  as the existing two branches.
- `SentryIssuePayload` type updated to match the actual Sentry webhook shape
  (nested `data.issue.{...}` instead of flattened `data.{...}` — the captured
  prod fixture confirmed the existing type was wrong).
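The nested payload shape and the hint extraction described in the list above could be sketched like this. The field paths under `data.issue` are the ones listed in this commit message; which fields are optional, the `AlertHints` field names, and the helper name `hintsFromIssue` are assumptions made for illustration.

```typescript
// Nested shape: fields live under data.issue, not directly under data.
interface SentryIssuePayload {
  action: string;
  data: {
    issue: {
      title: string;
      web_url: string;
      level: string;
      shortId: string;
      culprit?: string;
      metadata?: { filename?: string; function?: string };
    };
  };
}

// Hypothetical hint shape; the real AlertHints type may differ.
interface AlertHints {
  title: string;
  url: string;
  level: string;
  shortId: string;
  culprit: string | null;
  filename: string | null;
  functionName: string | null;
}

// Read every hint from data.issue, tolerating absent optional fields.
function hintsFromIssue(payload: SentryIssuePayload): AlertHints {
  const issue = payload.data.issue;
  return {
    title: issue.title,
    url: issue.web_url,
    level: issue.level,
    shortId: issue.shortId,
    culprit: issue.culprit ?? null,
    filename: issue.metadata?.filename ?? null,
    functionName: issue.metadata?.function ?? null,
  };
}
```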

Drive-by lint cleanup (per request): refactored `materializeAlertWorkItem`
to extract `reuseOrLazyHealMapping` and `pollForConcurrentWinner` helpers,
bringing the parent function under the cognitive-complexity ceiling.

Verification:
- 9112 unit tests passing (3 new test files: issue-lifecycle-format,
  issue-lifecycle handler, plus extensions to sentry-webhook-handler and
  the router/adapters/sentry tests). Captured live prod fixture used as
  the regression baseline.
- Lint clean (0 errors, 0 warnings).
- Typecheck clean.

Operator notes:
- Cascade project's PM `lists.alerts` (Trello) / `statuses.alerts`
  (JIRA, Linear) must be configured for materialisation to actually
  create a card. The pre-flight validation rule at
  `src/triggers/shared/integration-validation.ts` already emits a
  `pm`-category error when alerting is enabled but the slot is unset —
  unchanged; same message will fire for the new `'sentry-issue'` source.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Collaborator

@nhopeatall nhopeatall left a comment


Summary

Requesting changes: the new Sentry issue lifecycle handler is registered, but its enablement event is not declared in the alerting agent definition, so normal resource: issue webhooks still skip before enqueueing an agent run.

Code Issues

Blocking

  • src/triggers/sentry/alerting-issue-lifecycle.ts:53 — The handler gates on alerting:issue-lifecycle, but src/agents/definitions/alerting.yaml only declares alerting:issue-alert and alerting:metric-alert. checkTriggerEnabledWithParams calls getResolvedTriggerConfig, which returns null when the event is absent from the YAML definition before considering any DB override. The runtime path is: Sentry sends resource: issue / action: created -> this handler matches -> line 53 resolves an unknown trigger event -> enabled=false -> handler returns null -> router logs no trigger matched and no job is queued. That leaves the prod incident path effectively unfixed unless the event is added to the alerting definition/catalog (with the expected context pipeline/default) or this handler intentionally reuses the existing alerting:issue-alert event.

🕵️ codex · gpt-5.5 · run details

const triggerConfig = await checkTriggerEnabledWithParams(
ctx.project.id,
'alerting',
'alerting:issue-lifecycle',
Collaborator


This event is not declared in src/agents/definitions/alerting.yaml (nor the trigger event catalog). Because checkTriggerEnabledWithParams goes through getResolvedTriggerConfig, an unknown event returns null and this handler always sees enabled=false, even if the alerting agent itself is enabled. A resource: issue / created webhook will still return null here and no worker job will be queued unless alerting:issue-lifecycle is added to the alerting definition/catalog (or this intentionally gates on an existing alerting event).

Collaborator


Fixed. Added alerting:issue-lifecycle as a declared trigger in src/agents/definitions/alerting.yaml with defaultEnabled: true, providers: [sentry], and contextPipeline: [alertingIssue, directoryListing, contextFiles] (matching the existing alerting:issue-alert entry).

Without this entry, getResolvedTriggerConfig couldn't find the event in the YAML definition and returned null, so checkTriggerEnabledWithParams always saw enabled=false — leaving resource: issue webhooks silently skipped even after the router fix.

Also added TRIGGER_EVENTS.ALERTING.ISSUE_LIFECYCLE to the event catalog (events.ts) and updated the handler + webhook-handler.ts to reference the constant instead of raw strings, as required by the trigger-event-string consistency static guard for new handlers.
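The failure mode discussed in this thread can be reproduced with a simplified model of the lookup. The functions below are hypothetical stand-ins for getResolvedTriggerConfig and checkTriggerEnabledWithParams, whose real signatures are not shown in this PR; the sketch only assumes what the review states, namely that an undeclared event resolves to null before any DB override is considered.

```typescript
interface TriggerDefinition {
  event: string;
  defaultEnabled: boolean;
}

// Per-event enable/disable overrides, standing in for DB configuration.
type Overrides = Partial<Record<string, boolean>>;

// Stand-in for getResolvedTriggerConfig: undeclared events resolve to null
// before overrides are consulted.
function resolveTriggerConfig(
  declared: TriggerDefinition[],
  overrides: Overrides,
  event: string,
): { enabled: boolean } | null {
  const definition = declared.find((d) => d.event === event);
  if (definition === undefined) return null;
  return { enabled: overrides[event] ?? definition.defaultEnabled };
}

// Stand-in for checkTriggerEnabledWithParams: a null config reads as disabled.
function isTriggerEnabled(
  declared: TriggerDefinition[],
  overrides: Overrides,
  event: string,
): boolean {
  return resolveTriggerConfig(declared, overrides, event)?.enabled ?? false;
}
```

Under this model an override alone cannot enable `alerting:issue-lifecycle`; the event has to be declared in the definition first, which is what the fix does.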

@codecov

codecov Bot commented May 9, 2026

Codecov Report

❌ Patch coverage is 98.07692% with 4 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/integrations/alerting/_shared/format.ts | 90.47% | 0 Missing and 2 partials ⚠️ |
| src/integrations/alerting/_shared/materialize.ts | 97.56% | 1 Missing ⚠️ |
| src/triggers/sentry/alerting-issue-lifecycle.ts | 98.66% | 0 Missing and 1 partial ⚠️ |


…inition

Adds `alerting:issue-lifecycle` to `src/agents/definitions/alerting.yaml`
so `getResolvedTriggerConfig` can find the event. Without this entry the
lookup returned `null`, causing `checkTriggerEnabledWithParams` to always
see `enabled=false` — leaving the prod incident path effectively unfixed
even though the router now accepts `resource: issue` webhooks.

Also adds `TRIGGER_EVENTS.ALERTING.ISSUE_LIFECYCLE` to the events catalog
and updates the handler + webhook-handler to use the constant (required by
the trigger-event-string consistency static guard for new handlers).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Collaborator

@nhopeatall nhopeatall left a comment


LGTM - verified the new Sentry issue lifecycle path is registered in the trigger catalog/agent definition, routes resource=issue through the Sentry adapter, materializes via the distinct sentry-issue source, and preserves the existing event_alert and metric_alert branches. Focused unit tests passed locally: 5 files, 72 tests.

🕵️ codex · gpt-5.5 · run details

@zbigniewsobiecki zbigniewsobiecki merged commit ce051a4 into dev May 9, 2026
9 checks passed

3 participants