feat: implement escalated notifications for monitors by realnoob007 · Pull Request #199 · ramfam101/Checkmate

realnoob007 · 2026-04-09T01:32:51Z

Describe your changes

Add escalation rules that allow users to define time-based alert escalation when incidents persist. Users can configure multiple escalation rules per monitor, each with a wait time and notification channels to alert after the specified duration of downtime.

Frontend:

- Add escalation rules UI section to monitor create/edit page
- Add EscalationRule type, form validation, and i18n translations
- Support dynamic add/remove of escalation rules with time and notification channel selection

Backend:

- Add EscalationRule type and escalationRules field to Monitor schema
- Add escalation checking in heartbeat job on every tick when monitor is down
- Add escalation message builder for escalation-specific notifications
- Track sent escalations per incident to prevent duplicate alerts
- Clear escalation tracking on incident resolution
- Map escalationRules in repository toEntity methods
- Support comma-separated CLIENT_HOST for multiple CORS origins

Write your issue number after "Fixes "

Fixes #123

Please ensure all items are checked off before requesting a review. "Checked off" means you need to add an "x" character between brackets so they turn into checkmarks.

(Do not skip this or your PR will be closed) I deployed the application locally.
(Do not skip this or your PR will be closed) I have performed a self-review and testing of my code.
I have included the issue # in the PR.
I have added i18n support to visible strings (instead of <div>Add</div>, use):

const { t } = useTranslation();
<div>{t('add')}</div>

I have not included any files that are not related to my pull request, including package-lock and package-json if dependencies have not changed
I didn't use any hardcoded values (otherwise it will not scale, and will make it difficult to maintain consistency across the application).
I made sure font sizes, color choices etc are all referenced from the theme. I don't have any hardcoded dimensions.
My PR is granular and targeted to one specific feature.
I ran npm run format in server and client directories, which automatically formats your code.
I took a screenshot or a video and attached to this PR if there is a UI change.

Copilot

Pull request overview

Implements time-based escalation notifications for monitors so additional notification channels can be alerted if an incident remains unresolved past configured thresholds.

Changes:

- Adds escalationRules to monitor types/schemas and monitor create/edit validation.
- Triggers escalation checks on each heartbeat tick while a monitor is down/breached and clears escalation tracking on recovery.
- Adds escalation-specific notification message building and enables comma-separated CORS origins via CLIENT_HOST.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| server/src/validation/monitorValidation.ts | Validates `escalationRules` payload for create/edit monitor requests. |
| server/src/validation/envValidation.ts | Relaxes `CLIENT_HOST` validation to support non-URL / comma-separated values. |
| server/src/types/notificationMessage.ts | Adds new notification type `"escalation"`. |
| server/src/types/monitor.ts | Introduces `EscalationRule` and `Monitor.escalationRules`. |
| server/src/service/infrastructure/SuperSimpleQueue/SuperSimpleQueueHelper.ts | Runs escalation checks during heartbeat processing and clears tracking on resolution. |
| server/src/service/infrastructure/notificationsService.ts | Implements escalation sending and deduplication tracking. |
| server/src/service/infrastructure/notificationMessageBuilder.ts | Builds escalation-specific notification messages. |
| server/src/repositories/monitors/MongoMonitorsRepository.ts | Maps `escalationRules` from DB documents into monitor entities. |
| server/src/db/models/Monitor.ts | Adds `escalationRules` to the Mongoose monitor schema. |
| server/src/app.ts | Splits `CLIENT_HOST` into multiple CORS origins when comma-separated. |
| client/src/Validation/monitor.ts | Adds client-side validation for `escalationRules`. |
| client/src/Types/Monitor.ts | Adds `EscalationRule` and `Monitor.escalationRules` on the client. |
| client/src/Pages/CreateMonitor/index.tsx | Adds escalation rules UI with add/remove and notification channel selection. |
| client/src/locales/en.json | Adds i18n strings for the escalation rules UI. |
| client/src/Hooks/useMonitorForm.ts | Adds escalation rules defaults for the monitor form. |

Copilot · 2026-04-09T01:36:02Z

server/src/service/infrastructure/SuperSimpleQueue/SuperSimpleQueueHelper.ts

+		const activeIncident = await this.incidentsRepository.findActiveByMonitorId(monitor.id, monitor.teamId);
+		if (!activeIncident) {
+			return;
+		}
+
+		const incidentStartTime = new Date(parseInt(activeIncident.startTime));
+		await this.notificationsService.handleEscalations(monitor, incidentStartTime);


activeIncident.startTime is an ISO string (see the incidents repository's toDateString), so parseInt(activeIncident.startTime) parses only the leading year (e.g. 2026). new Date(2026) is then read as 2026 milliseconds after the Unix epoch, which yields an incorrect incidentStartTime in 1970, makes incidentDuration huge, and causes every rule to escalate immediately. Parse the value as a date string instead (e.g. new Date(activeIncident.startTime)) and handle invalid dates explicitly.
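To make the failure mode concrete, here is a minimal sketch comparing the two parsing approaches. The helper name `parseIncidentStart` is hypothetical and not part of the PR; it assumes `startTime` is always an ISO-8601 string.

```typescript
// Hypothetical helper (not in the PR): parse an ISO-8601 start time and
// return null for invalid input instead of a silently wrong timestamp.
function parseIncidentStart(startTime: string): Date | null {
	const parsed = new Date(startTime);
	return Number.isNaN(parsed.getTime()) ? null : parsed;
}

const iso = "2026-04-09T01:32:51Z";

// parseInt stops at the first non-digit, so only the year survives and
// Date treats it as milliseconds since the Unix epoch.
const wrong = new Date(parseInt(iso, 10));
const right = parseIncidentStart(iso);

console.log(wrong.toISOString()); // 1970-01-01T00:00:02.026Z
console.log(right?.toISOString()); // 2026-04-09T01:32:51.000Z
console.log(parseIncidentStart("not a date")); // null
```

A caller could skip the escalation check and log a warning when the helper returns null, rather than escalating against a bogus start time.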

Copilot · 2026-04-09T01:36:02Z

server/src/service/infrastructure/notificationsService.ts

+			// Check if enough time has elapsed
+			if (incidentDuration >= rule.waitTime) {
+				this.escalationsSent.add(ruleKey);
+
+				const notifications = await this.notificationsRepository.findNotificationsByIds(rule.notificationIds);
+				if (notifications.length === 0) {
+					continue;
+				}
+


handleEscalations adds ruleKey to escalationsSent before verifying there are any matching notifications and before knowing whether any send succeeded. If notificationIds is empty/invalid or all sends fail, the rule will be permanently suppressed for this incident. Only mark the rule as sent after a successful send (or at least after a non-empty notification set and an attempted send) depending on desired retry semantics.
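One possible ordering, sketched under the assumption that each send reports success as a boolean: the rule is recorded only after at least one channel was actually delivered. The function name `recordEscalationAttempt` is hypothetical, and "at least one success" is just one of the retry semantics the comment mentions.

```typescript
// Module-level set standing in for NotificationsService.escalationsSent.
const escalationsSent = new Set<string>();

// Hypothetical helper: call this after the sends have been attempted.
// sendResults holds one boolean per notification channel (true = delivered).
function recordEscalationAttempt(ruleKey: string, sendResults: boolean[]): boolean {
	// An empty list (no matching notifications) or all-failed sends leave the
	// rule unmarked, so the next heartbeat tick can retry it.
	const delivered = sendResults.some((ok) => ok);
	if (delivered) {
		escalationsSent.add(ruleKey);
	}
	return delivered;
}

recordEscalationAttempt("monitor-1:2026-04-09T01:00:00Z:300000", []); // not marked
recordEscalationAttempt("monitor-1:2026-04-09T01:00:00Z:300000", [false, true]); // marked
```

With this ordering a rule whose notificationIds resolve to nothing is retried on every tick, so it may be worth logging that case once to avoid silent misconfiguration.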

Copilot · 2026-04-09T01:36:02Z

server/src/service/infrastructure/notificationsService.ts

+		for (const rule of escalationRules) {
+			const ruleKey = `${incidentKey}:${rule.waitTime}`;
+
+			// Skip if already escalated for this rule in this incident
+			if (this.escalationsSent.has(ruleKey)) {
+				continue;
+			}


The escalation deduplication key is built only from the incident key (monitorId and incidentStartTime) plus waitTime. If a monitor has two escalation rules with the same waitTime but different notificationIds, only one will ever send because both collide on the same ruleKey. Either enforce unique waitTime values in validation/UI or include additional rule identity (e.g. the rule index or a stable hash of notificationIds) in the key.
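A small sketch of the second option, assuming a rule is identified by its waitTime together with its set of notification IDs. `buildRuleKey` is a hypothetical name; the sorted, joined IDs stand in for the "stable hash" mentioned above.

```typescript
interface EscalationRule {
	waitTime: number; // milliseconds
	notificationIds: string[];
}

// Hypothetical key builder: sorting a copy of the IDs makes the key
// independent of the order in which channels were selected.
function buildRuleKey(incidentKey: string, rule: EscalationRule): string {
	const channels = [...rule.notificationIds].sort().join(",");
	return `${incidentKey}:${rule.waitTime}:${channels}`;
}

const incidentKey = "monitor-1:2026-04-09T01:00:00Z";
const emailRule: EscalationRule = { waitTime: 300000, notificationIds: ["email-1"] };
const pagerRule: EscalationRule = { waitTime: 300000, notificationIds: ["pager-1"] };

// Same waitTime, different channels: the keys no longer collide.
console.log(buildRuleKey(incidentKey, emailRule) !== buildRuleKey(incidentKey, pagerRule)); // true
```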

Copilot · 2026-04-09T01:36:02Z

server/src/service/infrastructure/notificationsService.ts

+	// Tracks which escalation rules have already been sent, keyed by "monitorId:incidentStartTime:waitTime"
+	private escalationsSent: Set<string> = new Set();
+


Escalation tracking is stored only in an in-memory Set. In a multi-process / multi-instance deployment or after a restart, escalations can be re-sent for the same incident because the set is empty again. If the intention is to prevent duplicates per incident across restarts/replicas, persist this state (e.g. on the Incident record) or derive it from stored notification history.
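As a rough sketch of the "persist on the Incident record" option: if sent-rule keys were stored with the incident, the pending rules could be derived from that stored state on each tick instead of from process memory. The `sentEscalationKeys` field and both function names are assumptions for illustration, not part of the PR or the existing Incident schema.

```typescript
interface EscalationRule {
	waitTime: number; // milliseconds of downtime before the rule fires
	notificationIds: string[];
}

// Hypothetical incident shape: sentEscalationKeys is an assumed field.
interface IncidentRecord {
	startTime: string; // ISO-8601
	sentEscalationKeys: string[];
}

// Hypothetical per-rule key within one incident.
function escalationKey(rule: EscalationRule): string {
	return `${rule.waitTime}:${[...rule.notificationIds].sort().join(",")}`;
}

// Rules that are due and not yet recorded on the incident. Because the
// state lives on the record, a restart or second replica sees the same answer.
function pendingEscalations(incident: IncidentRecord, rules: EscalationRule[], now: Date): EscalationRule[] {
	const elapsed = now.getTime() - new Date(incident.startTime).getTime();
	const alreadySent = new Set(incident.sentEscalationKeys);
	return rules.filter((rule) => elapsed >= rule.waitTime && !alreadySent.has(escalationKey(rule)));
}

const rules: EscalationRule[] = [
	{ waitTime: 5 * 60 * 1000, notificationIds: ["email-1"] },
	{ waitTime: 15 * 60 * 1000, notificationIds: ["pager-1"] },
];
const incident: IncidentRecord = { startTime: "2026-04-09T01:00:00Z", sentEscalationKeys: [] };
const tenMinutesIn = new Date("2026-04-09T01:10:00Z");

console.log(pendingEscalations(incident, rules, tenMinutesIn).length); // 1
```

After a successful send, the caller would append the rule's key to the stored list with an atomic update so that concurrent workers do not both send.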

Copilot · 2026-04-09T01:36:03Z

server/src/app.ts

+	const allowedOrigin = envSettings.clientHost.includes(",")
+		? envSettings.clientHost.split(",").map((s) => s.trim())
+		: envSettings.clientHost;


CLIENT_HOST is now allowed to be comma-separated and is split for CORS, but the same clientHost is also used to build links in emails/notifications (e.g. password reset, invite links, incident URLs). If CLIENT_HOST contains commas, those links will be invalid. Consider introducing a separate env var for CORS origins (or parse and pick a single canonical base URL for link building).
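A minimal sketch of the "parse and pick a single canonical base URL" option, assuming the first entry in CLIENT_HOST is treated as canonical. `parseClientHost` and the `ClientHostConfig` shape are hypothetical names, not existing settings in the codebase.

```typescript
interface ClientHostConfig {
	corsOrigins: string[]; // every allowed origin, for the CORS middleware
	baseUrl: string; // single origin used when building links in emails
}

// Hypothetical parser: splits a comma-separated CLIENT_HOST value and treats
// the first entry as the canonical base URL (an assumption of this sketch).
function parseClientHost(raw: string): ClientHostConfig {
	const corsOrigins = raw
		.split(",")
		.map((s) => s.trim())
		.filter((s) => s.length > 0);
	const [baseUrl] = corsOrigins;
	if (baseUrl === undefined) {
		throw new Error("CLIENT_HOST must contain at least one origin");
	}
	return { corsOrigins, baseUrl };
}

const config = parseClientHost("https://app.example.com, https://staging.example.com");
console.log(config.corsOrigins); // [ 'https://app.example.com', 'https://staging.example.com' ]
console.log(`${config.baseUrl}/incidents/123`); // https://app.example.com/incidents/123
```

A separate environment variable for CORS origins would avoid the ordering convention altogether, at the cost of one more setting to document.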

Copilot · 2026-04-09T01:36:03Z

server/src/service/infrastructure/notificationMessageBuilder.ts

+		const details = [
+			`URL: ${monitor.url}`,
+			`Status: Down`,
+			`Type: ${monitor.type}`,
+			`Downtime Duration: ${durationMinutes} minute(s)`,
+			`Escalation Threshold: ${waitMinutes} minute(s)`,
+		];


Escalation message details hard-code Status: Down, but escalations are also triggered for status === "breached". This will send incorrect status information for threshold incidents. Use monitor.status (or map to a friendly label) in the message content.
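One way to derive the status line from the monitor, sketched with an assumed label table. The label text for "breached" and the helper name `statusDetail` are illustrative choices, not strings from the PR.

```typescript
// Assumed friendly labels; unknown statuses fall back to the raw value.
const STATUS_LABELS: Record<string, string> = {
	down: "Down",
	breached: "Threshold breached",
};

// Hypothetical helper producing the "Status: ..." detail line.
function statusDetail(status: string): string {
	return `Status: ${STATUS_LABELS[status] ?? status}`;
}

console.log(statusDetail("down")); // Status: Down
console.log(statusDetail("breached")); // Status: Threshold breached
```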

Copilot · 2026-04-09T01:36:03Z

server/src/validation/monitorValidation.ts

+	escalationRules: z.array(z.object({
+		waitTime: z.number().min(1000, "Wait time must be at least 1 second"),
+		notificationIds: z.array(z.string()).min(1, "At least one notification is required"),


Server-side validation allows waitTime as low as 1s, but the client-side schema/UI enforce a minimum of 1 minute and only present minute+ options. This mismatch can lead to confusing behavior (e.g. API accepts values UI can’t set; logs that show 0 minute(s)). Align server validation with the client/UI constraints (or update the client/UI to match the intended minimum).

Copilot · 2026-04-09T01:36:03Z

server/src/validation/monitorValidation.ts

+	escalationRules: z.array(z.object({
+		waitTime: z.number().min(1000, "Wait time must be at least 1 second"),
+		notificationIds: z.array(z.string()).min(1, "At least one notification is required"),


Same as create validation: server allows waitTime down to 1s while the client enforces 1 minute. Please keep create/edit validation consistent with the UI and the escalation scheduler expectations.
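One way to keep create, edit, and client validation aligned is a single shared minimum. The sketch below uses a plain function rather than the zod schemas in the PR, and assumes the intended minimum is the client's 1 minute; `MIN_ESCALATION_WAIT_MS` and `validateEscalationRule` are hypothetical names.

```typescript
// Assumed shared constant: 1 minute, matching the client-side minimum.
const MIN_ESCALATION_WAIT_MS = 60 * 1000;

interface EscalationRuleInput {
	waitTime: number; // milliseconds
	notificationIds: string[];
}

// Hypothetical validator returning a list of error messages (empty = valid).
// Both create and edit paths would call the same function.
function validateEscalationRule(rule: EscalationRuleInput): string[] {
	const errors: string[] = [];
	if (!Number.isFinite(rule.waitTime) || rule.waitTime < MIN_ESCALATION_WAIT_MS) {
		errors.push("Wait time must be at least 1 minute");
	}
	if (rule.notificationIds.length === 0) {
		errors.push("At least one notification is required");
	}
	return errors;
}

console.log(validateEscalationRule({ waitTime: 1000, notificationIds: ["email-1"] }));
// [ 'Wait time must be at least 1 minute' ]
```

In the zod schemas the same constant could be passed to `.min(...)` in both the create and edit definitions so the two cannot drift apart.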

Copilot · 2026-04-09T01:36:04Z

server/src/service/infrastructure/SuperSimpleQueue/SuperSimpleQueueHelper.ts

+				// Step 6.5. Handle escalation notifications (check on every heartbeat when monitor is down)
+				if (statusChangeResult.monitor.status === "down" || statusChangeResult.monitor.status === "breached") {
+					this.handleEscalationCheck(statusChangeResult.monitor).catch((error: unknown) => {
+						this.logger.error({
+							message: `Error handling escalations for monitor ${statusChangeResult.monitor.id}: ${error instanceof Error ? error.message : "Unknown error"}`,
+							service: SERVICE_NAME,
+							method: "getMonitorJob",
+							stack: error instanceof Error ? error.stack : undefined,
+						});
+					});
+				}
+
+				// Step 6.6. Clear escalation tracking when monitor recovers
+				if (decision.shouldResolveIncident) {
+					this.notificationsService.clearEscalationTracking(statusChangeResult.monitor.id);
+				}


New escalation handling in the heartbeat job isn’t covered by existing tests in server/test/superSimpleQueueHelper.test.ts. Adding tests for (1) escalation checks being triggered only when status is down/breached and (2) clearEscalationTracking being called on incident resolution would help prevent regressions.

Copilot · 2026-04-09T01:36:04Z

client/src/Pages/CreateMonitor/index.tsx

+								<IconButton
+									size="small"
+									onClick={() => removeEscalation(index)}
+									aria-label="Remove escalation rule"
+									sx={{ mt: theme.spacing(SPACING.LG) }}


The aria-label for the remove button is hard-coded in English. Since the page uses i18n for visible strings, this label should also be translated to ensure consistent localization and accessibility.

Copilot AI review requested due to automatic review settings April 9, 2026 01:32

realnoob007 wants to merge 1 commit into ramfam101:develop from realnoob007:feat/escalated-notifications
