
Mute poll failure alerts #43

Merged: dantheuber merged 2 commits into main from mute-poll-failure-alerts on Mar 5, 2026

@dantheuber

Member

Summary

Adds the ability to mute poll failure alerts for specific services. Previously, mutes only targeted individual dependency instances (dependency_id) or all instances of a dependency type (canonical_name). Poll failure events (poll_error) are service-level and don't carry a dependencyId, so they bypassed the existing mute system entirely. This adds service_id as a third mutually-exclusive mute scope — service mutes only suppress poll_error alerts and do not affect status_change alerts for dependencies within that service.
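
The scoping rule described above can be sketched as a small predicate. This is an illustration only, not the PR's code: the `AlertEvent` and `Mute` shapes and the `isMutedBy` name are assumptions. Only the field names `dependency_id`, `canonical_name`, `service_id` and the event types `poll_error` and `status_change` come from the description.

```typescript
type AlertEventType = "status_change" | "poll_error";

// Hypothetical event shape: poll_error events carry no dependency fields.
interface AlertEvent {
  type: AlertEventType;
  serviceId: string;
  dependencyId?: string;
  canonicalName?: string;
}

// Hypothetical mute shape: exactly one of the three targets is set.
interface Mute {
  dependency_id?: string;
  canonical_name?: string;
  service_id?: string;
}

function isMutedBy(event: AlertEvent, mute: Mute): boolean {
  if (mute.service_id !== undefined) {
    // Service mutes suppress poll failures only, never status changes.
    return event.type === "poll_error" && event.serviceId === mute.service_id;
  }
  if (mute.dependency_id !== undefined) {
    // Never matches a poll_error, because that event has no dependencyId.
    return event.dependencyId === mute.dependency_id;
  }
  if (mute.canonical_name !== undefined) {
    return event.canonicalName === mute.canonical_name;
  }
  return false;
}
```

Under these assumptions, a `poll_error` for a muted service is suppressed, while a `status_change` for a dependency of that same service still goes through.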

Changes

Database

  • New migration 032_add_service_mutes — recreates alert_mutes with service_id TEXT column (FK → services), updated CHECK constraint enforcing exactly one of three targets, and unique index on (team_id, service_id)
  • Registered migration in migrate.ts
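
The "exactly one of three targets" constraint might look roughly like the SQL below, held in a TypeScript constant as a migration file could do. This is a sketch that assumes SQLite, which the table recreation suggests but the description does not state. The `id` column, the `services(id)` reference target, and the index name are assumptions, and columns not mentioned in the description are left out.

```typescript
// Sketch only: assumes SQLite. Columns the PR description does not name are omitted.
const createAlertMutesSketch = `
CREATE TABLE alert_mutes (
  id             TEXT PRIMARY KEY,              -- assumed key column
  team_id        TEXT NOT NULL,
  dependency_id  TEXT,
  canonical_name TEXT,
  service_id     TEXT REFERENCES services(id),  -- assumed reference target
  -- Each IS NOT NULL test evaluates to 0 or 1 in SQLite,
  -- so the sum equals 1 only when exactly one target is set.
  CHECK (
    (dependency_id IS NOT NULL)
    + (canonical_name IS NOT NULL)
    + (service_id IS NOT NULL) = 1
  )
);

-- Hypothetical index name; at most one service mute per team and service.
CREATE UNIQUE INDEX idx_alert_mutes_team_service
  ON alert_mutes (team_id, service_id);
`;
```

SQLite treats NULL values as distinct in a unique index, so rows that target a dependency or canonical name (with `service_id` NULL) do not collide with each other here.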

Server

  • AlertMute / CreateAlertMuteInput types updated with service_id field
  • IAlertMuteStore + AlertMuteStore: new isServiceMuted() method, create() now includes service_id
  • AlertService.processEvent(): service mute check added before dependency mute check for poll_error events — records history as muted and returns early
  • validateMuteCreate(): accepts service_id as third scope, validates exactly one target provided
  • Mute routes updated: create verifies service ownership, list/admin enrich service_name, delete includes service_id in audit log
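
The ordering of checks in `AlertService.processEvent()` can be sketched as follows. This is a simplified stand-in under stated assumptions, not the actual method: the description names only `isServiceMuted()`, so its signature, the `isDependencyMuted` helper, and the `recordHistory` and `dispatch` callbacks are all hypothetical.

```typescript
type AlertEventType = "status_change" | "poll_error";

// Hypothetical event shape.
interface AlertEvent {
  type: AlertEventType;
  teamId: string;
  serviceId: string;
  dependencyId?: string;
}

// Only isServiceMuted is named in the PR; both signatures are assumed.
interface MuteStore {
  isServiceMuted(teamId: string, serviceId: string): boolean;
  isDependencyMuted(teamId: string, dependencyId: string): boolean;
}

type Outcome = "muted" | "dispatched";

function processEvent(
  event: AlertEvent,
  mutes: MuteStore,
  recordHistory: (event: AlertEvent, outcome: Outcome) => void,
  dispatch: (event: AlertEvent) => void,
): Outcome {
  // Service mutes are checked first and apply to poll_error events only.
  if (event.type === "poll_error" && mutes.isServiceMuted(event.teamId, event.serviceId)) {
    recordHistory(event, "muted");
    return "muted"; // return early: nothing is dispatched
  }
  // Dependency-level check, reached only by events that carry a dependencyId.
  if (event.dependencyId !== undefined && mutes.isDependencyMuted(event.teamId, event.dependencyId)) {
    recordHistory(event, "muted");
    return "muted";
  }
  dispatch(event);
  recordHistory(event, "dispatched");
  return "dispatched";
}
```

Because the service check is gated on the event type, a `status_change` for a dependency of a muted service falls through to the dependency-level check and is dispatched unless that dependency is itself muted.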

Client

  • AlertMute / CreateAlertMuteInput types updated
  • AlertMutes.tsx: new "Service (poll failures)" scope option, service ID input field, table shows "Service" type and service name for service mutes
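
How the table might derive its type and target cells can be sketched as a pure function. This is not the component's code: the `AlertMuteRow` shape and `describeMute` name are assumptions, and so are the "Instance" and "Canonical" labels and the fallback to the raw ID. The description confirms only the "Service" type label and the display of the service name.

```typescript
// Hypothetical row shape; service_name is the value enriched by the list endpoint.
interface AlertMuteRow {
  dependency_id?: string | null;
  canonical_name?: string | null;
  service_id?: string | null;
  service_name?: string | null;
}

interface MuteCells {
  type: string;
  target: string;
}

function describeMute(mute: AlertMuteRow): MuteCells {
  if (mute.service_id) {
    // Assumed fallback: show the raw ID when no name was enriched.
    return { type: "Service", target: mute.service_name ?? mute.service_id };
  }
  if (mute.canonical_name) {
    return { type: "Canonical", target: mute.canonical_name }; // label is assumed
  }
  return { type: "Instance", target: mute.dependency_id ?? "" }; // label is assumed
}
```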

Docs

  • 02-data-model.md: updated schema, indexes, and migration history
  • 04-api-reference.md: added alert mute endpoint documentation

Testing

  • New/updated tests included
  • All tests pass (npm test)
  • Linting passes (npm run lint)

New server tests:

  • Service mute CRUD via API (create, reject bad ownership, reject nonexistent service, reject multi-target, list enrichment)
  • AlertService: poll_error suppressed when service muted, poll_error dispatched when not muted, status_change not suppressed by service mute
  • Validation: updated expectations for three-target error message
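
The multi-target rejection these tests exercise can be sketched as an exactly-one-of-three check. This is a minimal stand-in for `validateMuteCreate()`, not the real function: the result type, the treatment of blank strings as missing, and the error wording are all assumptions.

```typescript
// Hypothetical input shape, reduced to the three scope fields.
interface CreateAlertMuteInput {
  dependency_id?: string;
  canonical_name?: string;
  service_id?: string;
}

const SCOPES = ["dependency_id", "canonical_name", "service_id"] as const;
type Scope = (typeof SCOPES)[number];

type ValidationResult =
  | { ok: true; scope: Scope }
  | { ok: false; error: string };

function validateMuteCreateSketch(input: CreateAlertMuteInput): ValidationResult {
  // Collect the scopes that carry a non-blank string value.
  const provided = SCOPES.filter((key) => {
    const value = input[key];
    return typeof value === "string" && value.trim() !== "";
  });
  const [scope] = provided;
  if (provided.length !== 1 || scope === undefined) {
    // Assumed wording; the actual three-target message may differ.
    return {
      ok: false,
      error: "Exactly one of dependency_id, canonical_name, or service_id must be provided",
    };
  }
  return { ok: true, scope };
}
```

With this sketch, `{ service_id: "svc-1" }` passes, while an empty input or one that sets both `service_id` and `dependency_id` is rejected.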

New client tests:

  • Service mute renders correctly in table (type + name)
  • Service scope form shows Service ID input

Results: 2987 server tests passed, 1442 client tests passed

Checklist

@dantheuber dantheuber merged commit 978cbe0 into main Mar 5, 2026
@dantheuber dantheuber deleted the mute-poll-failure-alerts branch March 5, 2026 16:50