v3.0.0 - A Month Of Sundays
[3.0.0] - 2026-06-16
Important
- Major release — 2.11.0 → 3.0.0, no breaking changes. This version rolls up a codebase-wide correctness and security hardening pass (the
code-review-* series spanning the SQL schema, collectors, and views, the installer, the Lite and Dashboard services, and the shared libraries); a major UI-responsiveness overhaul that moves the data path off the WPF dispatcher in both apps; new object- and index-level collection (per-table / per-index size, growth, usage, and locking/contention); the rebuilt Recommendations / Apply Fix engine (advise-and-act, with safe and destructive fixes appliable behind informed two-sided consent); and a batch of smaller fixes and features. Nothing here is a breaking change — existing installations upgrade in place via upgrades/2.11.0-to-3.0.0/ (typed blocked-process columns, a nullable host-CPU column, the TRANSACTION_MUTEX ignored wait, and new server-health columns), and the Dashboard and Lite apps auto-update over the top.
Fixed
-
Lite and Dashboard: Azure SQL Database shows its real product name in FinOps → Server Inventory — the Edition column displayed the legacy
SQL Azure value that SERVERPROPERTY('Edition') returns for Azure SQL DB; it now reads Azure SQL Database (<service tier>) (e.g. Azure SQL Database (General Purpose)), derived from DATABASEPROPERTYEX(DB_NAME(), 'Edition'), for any engine-edition-5 instance. Normalized at every edition display/storage site across both apps — the live inventory queries (Lite + Dashboard) and the SQL-side collectors (install/42, install/53) plus Lite's server_properties collector — so the value is consistent app-wide; on-prem editions are unchanged. (The licensing-recommendation queries are left raw and identical in both apps: they only do an Enterprise substring check and never display the edition for Azure.)
-
Dashboard: "Deadlocks Cleared" no longer flaps right after every deadlock (#1091) — deadlock detection is edge-triggered off a delta against the cumulative perfmon counter, so the check immediately after a deadlock saw a zero delta and fired a "Deadlocks Cleared" notification ~one interval (≈60s) after every "Deadlock Detected". The alert now stays active and clears only once a deadlock-quiet window (1 hour) has elapsed since the last new deadlock, so the detect/clear pair lines up with Lite, whose rolling 1-hour count drains about an hour after the last deadlock. Each new deadlock resets the window. The clear message is now "No deadlocks in the last hour" (was "No deadlocks since last check"). Covered by
DeadlockAlertClearPolicyTests
-
Lite: blocking and deadlock alerts no longer re-fire for the same events every cooldown (#1091) — the overview alert engine treated the blocking and deadlock counts as a level: each check compared the rolling 1-hour count against the threshold, so a single deadlock (or blocked-process report) kept the count above the threshold for the whole hour it lingered in the window, and the alert re-fired every cooldown (the reporter saw the same "2 deadlocks in the last hour" notification every five minutes for an hour). The Dashboard already edge-triggers off a delta; Lite now does too. Both alerts are gated by a new
RollingCountAlertGate that fires only when the rolling count climbs above the count recorded at the last fired alert — a genuinely new event. The watermark decays as old events age out of the window (so a later rise re-alerts), resets when the window empties, and advances only when an alert actually fires (so an event arriving during a cooldown is reported once the cooldown elapses rather than being swallowed). Covered by RollingCountAlertGateTests
-
Lite and Dashboard: low-disk (Volume Free Space) alert no longer re-fires every cooldown for a standing full volume — a breached volume is a sustained condition, but the alert engine treated free space as a level and re-fired every
AlertCooldownMinutes (default 5) for as long as the volume stayed below threshold. Besides the repeated tray/email, every cycle wrote a fresh Alert-History row, so dismissing the alert appeared not to work — the dismissed row was immediately replaced by an identical, newer one. The alert is now gated by a shared LowDiskAlertGate that notifies only on a fresh breach or one that has worsened by at least 1 percentage point of free space below the last-alerted level, and clears its watermark when the volume recovers — mirroring the failed-job watermark and the #1091 rolling-count edge trigger. Fixed identically in both apps. Covered by LowDiskAlertGateTests
-
Lite and Dashboard: low-disk and failed-Agent-job conditions now light the server tab badge (#754/#749) — the per-server tab badge was driven only by blocking, deadlocks, CPU, and memory, so a server whose only problem was a full volume or a failed Agent job showed no tab indicator — you couldn't tell which server was affected at a glance (the alert surfaced only as a one-shot tray toast and an Alert-History row). Both apps now fold the alert engine's active low-disk / failed-job state into the badge: it lights while the breach (or a failure within the lookback window) is active, auto-clears when the disk recovers or the failure ages out, and acknowledges/silences exactly like the other badges. This is the persistent-indicator complement to the low-disk re-fire fix above — alert once, then stay quietly flagged instead of re-nagging. Covered by
AlertBadgeConditionTests
-
Lite: blocking/deadlock XE sessions now self-heal and failures are surfaced (#1086) — the
PerformanceMonitor_BlockedProcess and PerformanceMonitor_Deadlock Extended Events sessions were created only when a server tab was opened; the recurring background collection loop never created or retried them. A server monitored without an open tab (e.g. app minimized to tray after a restart), or a first attempt that failed (connection not ready, missing ALTER ANY EVENT SESSION), left blocking/deadlock capture permanently dead — while the collectors read the non-existent ring buffer, got zero rows, and reported OK. The session ensure now runs inside the collector itself on every cycle (cheap existence check once created), so both the tab-open path and the background loop create/start/retry it. A failed ensure can no longer be masked: it fails the collector run, shows in the status-bar collector health (including permission failures, which previously didn't count as "erroring"), and fires a one-time tray notification ("Capture Not Running") on the transition. The Azure SQL DB database-scoped sessions also gain STARTUP_STATE = ON so they restart automatically after a failover
-
Dashboard: blocking/deadlock XE sessions self-heal, Azure SQL DB sessions are actually created, and a missing session raises a Capture Down alert — same silent-failure family as #1086, worse on the Dashboard side. (1) The server-scoped sessions were created once at install and never re-ensured: if later stopped or dropped,
collect.blocked_process_xml_collector and collect.deadlock_xml_collector swallowed the missing-session error and logged SUCCESS with zero rows forever. Both procs now ensure (create/start) the session at the top of every run. (2) On Azure SQL DB, the code comments claimed the database-scoped sessions were "auto-created by the collection procedures" — nothing anywhere created them, so blocking/deadlock capture was 100% non-functional on Azure SQL DB; the procs now create and start them (database_xml_deadlock_report for deadlocks — the Azure read also filtered on the wrong event name and would have returned nothing even with a session present). (3) Honest logging: when the session is genuinely absent and can't be created (typically missing ALTER ANY EVENT SESSION on-prem / CREATE ANY DATABASE EVENT SESSION on Azure SQL DB), the run logs SESSION_MISSING with the real error instead of SUCCESS. (4) The alert engine reads that status and raises a Capture Down alert through the standard pipeline — snoozable tray notification, email, webhook, alert history, cooldown, and mute — with a Capture Restored clear when the session comes back. Note: on Azure SQL DB the blocked-process threshold cannot be set via sp_configure and Microsoft documents no default, so the blocked-process session may exist yet capture nothing there; deadlock capture has no such dependency
-
Blocked-process and deadlock XML processors no longer loop on un-parseable events — the second-phase parsers (
collect.process_blocked_process_xml → sp_HumanEventsBlockViewer, and collect.process_deadlock_xml → sp_BlitzLock) only marked a captured event processed when the parse produced at least one row. Events that legitimately yield zero — a self-block or non-lock wait (e.g. a memory-grant RESOURCE_SEMAPHORE wait that tripped blocked_process_threshold, which SQL Server reports as a session blocked by itself), or a deadlock graph the parser can't reconstruct — were never marked, so every collection cycle re-ran the CPU-intensive parser over the same dead events and re-logged a perpetual NO_RESULTS while the staging table never drained. Both processors now mark events processed after any clean parse run and log SUCCESS; genuine parse failures still roll back and retry. Separately, the blocked-process processor's parse window was half-open (event_time < @end_date), so a batch of reports sharing one timestamp — the common case, since a blocked-process monitor loop emits every report at a single instant — fell outside [MIN, MAX) and was silently dropped; the upper bound is now inclusive (matching the deadlock processor). Covered by tools/test_blocked_process_processor.sql using real self-block and two-session samples
Lite and Dashboard UI no longer goes blank or disappears after sleep/wake (#1050) — closing a laptop lid (or locking the screen) and then resuming could leave the app running with no usable window: notifications kept firing but the window was gone from the desktop and taskbar, and relaunching showed an empty window until a full exit/restart. Two causes, both fixed. (1) WPF's GPU render thread can lose its rendering surface across a sleep/wake or RDP reconnect and never recover, leaving a live-but-blank window; both apps now use software rendering (
RenderOptions.ProcessRenderMode = SoftwareOnly) to remove the GPU dependency — charts are unaffected because ScottPlot already renders via SkiaSharp. (2) When Windows turned the sleep-driven minimize into a hidden window, the minimize-to-tray logic left it hidden with no automatic way back; a new shared resume guard now restores the window from the tray on resume/unlock if it was visible beforehand (a window the user deliberately sent to the tray is left alone) -
"Silence All Alerts" now suppresses email too (#1035) — right-clicking a monitored instance and choosing Silence All Alerts hid tray notifications and Alerts-tab badges, but two email paths ignored the silenced state and kept sending: connection up/down emails (Server Unreachable / Server Restored) and analysis-finding emails (the narrative findings from the analysis engine, which include CPU/memory/blocking stories). Only the threshold-alert path (High CPU, blocking, deadlocks, etc.) honored silencing. Both gaps are closed — a silenced server now produces no tray, email, or alert-history row from any path. The analysis path was the likely source of the reporter's "High CPU" email, since the threshold-based High CPU alert was already suppressed. The shared
AnalysisNotificationService (used by Lite too) gains an optional per-server silence predicate; Lite has no silencing feature and passes none
Dashboard time labels are now consistently 24-hour (#1012) — the time-range header at the top of each tab (e.g. "Original: May 28, 11:30 PM – May 29, 1:30 AM (PST)") and the Query Performance heatmap x-axis tick labels used
h:mm tt, while every other timestamp in the app (footer "Last refresh", DataGrid columns, slicer, tooltips, logs) already used 24-hour HH:mm / HH:mm:ss. The AM/PM marker was also being truncated in the column shown by the reporter. Normalized the four outliers to HH:mm to match the rest of the app. The Lite heatmap had the same h:mm tt straggler — fixed alongside
Lite UI no longer freezes during archival (#979) — archival held DuckDB's exclusive write lock across the entire export-to-Parquet step, blocking every UI query (tab switches showed the spinning wheel, worse with more monitored servers). Export-to-Parquet only reads the database, so it now runs under a shared read lock concurrently with the UI; only the brief
DELETE takes the exclusive write lock
Lite FinOps no longer recommends an edition downgrade on an Availability Group secondary (#980) — the licensing recommendations suggested "downgrade to Standard to save $X/mo" for any Enterprise instance, with no AG awareness. On a secondary replica that advice is misleading — every replica in an AG must run the same edition. FinOps now detects the AG replica role and, on a secondary, shows an informational note instead of the downgrade/savings estimate
-
Lite alert emails no longer re-fire after an app restart (#981) — the per-metric email cooldown lived only in memory, so restarting Lite cleared it and an alert sent minutes earlier could be sent again immediately. The cooldown is now seeded from
config_alert_log (the most recent successful send for that server/metric) the first time each alert is evaluated, so it survives restarts
Dashboard alert emails no longer re-fire after an app restart — brings Dashboard
EmailAlertService to parity with the Lite-side persistence introduced in #981. The cooldown is now seeded from the in-memory alert log (loaded from alert_history.json on startup) the first time each {serverId}:{metricName} key is evaluated
-
Analysis-finding notification cooldowns now persist across restarts on both Lite and Dashboard — the per-finding re-notification cooldown in
AnalysisNotificationService lived only in memory, so restarting either app cleared it and a finding that had just fired (and entered its AnalysisNotifyCooldownMinutes cooldown) could re-notify immediately. The cooldown now seeds lazily from the alert log (Lite: config_alert_log; Dashboard: alert_history.json) on first lookup per finding, mirroring the email-cooldown pattern from #981. Entries past 2× the cooldown window are pruned on each notify cycle so the dictionary stays bounded
-
Data Retention job no longer fails with
xp_delete_file error 22049 (#972) — the trace-file cleanup added in v2.11.0 passed a wildcard path to xp_delete_file, raising an uncatchable Msg 22049 that failed the entire PerformanceMonitor - Data Retention Agent job on every run once any Monitor_LongQueries_*.trc files existed. xp_delete_file also cannot delete .trc files at all — it only accepts SQL Server backup files and Maintenance Plan report files — so that cleanup step has been removed from config.data_retention
-
Codebase-wide correctness and security hardening pass — a broad review (the
code-review-* PR series, #1093–#1108) fixed defects across the stack without changing behavior users depend on:
  - Shared libraries — defects in the extracted PerformanceMonitor.Analysis / .PlanAnalysis / .Ui / .Common code
  - Dashboard — timezone and CPU-path defects
  - Lite — services, analysis, and UI defects, plus ArchiveService data-loss / corruption fixes
  - Installer — CLI version-detection and failure-handling
  - SQL — high-impact collector defects, view / analyzer crashes (including a Linux CPU gap), and schema / job / validation defects
-
FinOps no longer recommends downgrading to Standard Edition on a server running Availability Groups (#1085) — an Enterprise instance with no TDE was told to "review whether Standard Edition would meet workload requirements" even when it was running AGs, which Standard supports only in the limited Basic Availability Groups form. FinOps now counts advanced (non-basic) AGs via
sys.availability_groups.basic_features and, when any are present, appends a caveat naming the AG count and Standard's Basic-AG limitations (two replicas, one database per group, no readable secondary), retitles the finding to "review Availability Group requirements before downgrading," and lowers its confidence — the savings estimate is retained. The Dashboard, which previously had no AG awareness at all, was brought to full parity and also gains the #980 AG-secondary informational note it never received
Server-tab alert badge is now clearable (#1092) — the red alert badge on a server tab could previously only be cleared through an undiscoverable right-click menu. Left-clicking the badge now acknowledges and clears it (hand cursor, "Click to dismiss · Right-click for options" tooltip), and Alert History Dismiss All clears the matching server badge(s) too. A follow-up (#1122) closed the last gap: Dismiss Selected now also clears the badge for every distinct server represented in the dismissed rows. On the Dashboard, which already had richer auto-resolving badges, this added the missing left-click affordance for parity
-
Long-running-query alert no longer constantly trips on CDC capture jobs (#1096) — the Change Data Capture capture job runs as a continuous SQL Agent session (
sp_MScdc_capture_job → sp_cdc_scan), so its elapsed time permanently exceeded the long-running-query threshold and the alert fired non-stop; none of the four existing wait_type-based exclusions caught it. Both apps gain an Exclude CDC capture jobs toggle (default on) that identifies the capture session server-side by decoding its Agent program_name to a job_id and matching msdb.dbo.cdc_jobs (job_type = 'capture'), falling back to a whole-text match when msdb is unreadable or cdc_jobs doesn't yet exist — so it stays CDC-specific and never hides unrelated Agent jobs. Dashboard filters the live DMV query inline; Lite computes a per-row is_cdc_capture flag in the collector (its snapshots store only statement-level text) and filters on read
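
The rolling-count gate described in the #1091 Lite entry above can be sketched roughly as follows. This is a Python illustration of the behavior the entry describes, not the project's C# RollingCountAlertGate: the method name `should_fire`, the constructor parameters, and treating the threshold as an inclusive minimum are all assumptions made for the example.

```python
class RollingCountAlertGate:
    """Edge-triggered gate over a rolling event count (illustrative sketch)."""

    def __init__(self, threshold: int, cooldown_seconds: float):
        self.threshold = threshold              # assumed: eligible when count >= threshold
        self.cooldown_seconds = cooldown_seconds
        self.watermark = 0                      # count recorded at the last fired alert
        self.last_fired_at = None               # time of the last fired alert, if any

    def should_fire(self, rolling_count: int, now: float) -> bool:
        # Window emptied: reset so the next event counts as new.
        if rolling_count == 0:
            self.watermark = 0
            return False
        # Decay: as old events age out, the watermark follows the count down,
        # so a later rise is seen as a new event.
        self.watermark = min(self.watermark, rolling_count)
        if rolling_count < self.threshold or rolling_count <= self.watermark:
            return False
        in_cooldown = (
            self.last_fired_at is not None
            and now - self.last_fired_at < self.cooldown_seconds
        )
        if in_cooldown:
            # Watermark is NOT advanced here, so the event is reported
            # once the cooldown elapses instead of being swallowed.
            return False
        self.watermark = rolling_count
        self.last_fired_at = now
        return True


gate = RollingCountAlertGate(threshold=1, cooldown_seconds=300)
decisions = [
    gate.should_fire(2, now=0),     # two new events: fire
    gate.should_fire(2, now=300),   # same events still in the window: quiet
    gate.should_fire(3, now=400),   # a new event, cooldown elapsed: fire
    gate.should_fire(4, now=500),   # a new event during the cooldown: held
    gate.should_fire(4, now=700),   # cooldown elapsed: the held event fires
    gate.should_fire(1, now=4000),  # old events aged out: watermark decays
    gate.should_fire(2, now=4100),  # later rise: fire again
    gate.should_fire(0, now=8000),  # window empty: reset
]
print(decisions)
```

The point of comparing against a watermark rather than the threshold is that a lingering count cannot re-trigger on its own; only a count higher than the one already reported does.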
Changed
- Plan parsing / analysis extracted to shared library
PerformanceMonitor.PlanAnalysis — the previously duplicated ShowPlanParser, PlanAnalyzer, BenefitScorer, PlanLayoutEngine, and PlanModels pairs across Dashboard/Services + Dashboard/Models and Lite/Services + Lite/Models are now one copy referenced by both apps via <ProjectReference>. The new library targets net10.0 (no WPF) and has zero dependency on PerformanceMonitor.Analysis — the two shared libraries are independent. ~5,100 LOC of byte-equivalent duplication eliminated. The planalyzer-sync-checker agent is retired (no copies to sync). ActualPlanExecutor stays per-app this release because it calls ReproScriptBuilder (Class B, drifted between Lite and Dashboard); both will be extracted in a follow-up PR once ReproScriptBuilder is reconciled and a logging abstraction is designed
- PlanIconMapper split to break a shared-library WPF dependency — ShowPlanParser calls PlanIconMapper.GetIconName to populate PlanNode.IconName during parse, but the rest of PlanIconMapper is WPF-bound (GetIcon returns BitmapImage). The pure-data half (the IconMap dictionary + the GetIconName lookup) is now IconNameMapper inside PerformanceMonitor.PlanAnalysis. The per-app PlanIconMapper.GetIcon(string iconName) is unchanged; the per-app GetIconName forwarder is gone (ShowPlanParser calls IconNameMapper.GetIconName directly, and there were no other callers)
- Analysis engine extracted to shared library
PerformanceMonitor.Analysis — the previously duplicated FactScorer, RelationshipGraph, InferenceEngine, AnalysisModels, IFactCollector, IPlanFetcher, and BlockingChainReconstructor pairs across Dashboard/Analysis/ and Lite/Analysis/ are now one copy referenced by both apps and both test projects via <ProjectReference>. The new library targets net10.0 (no WPF) so it can be picked up by future non-WPF consumers without a multi-target rewrite. The blocking-reconstructor-sync-checker agent is retired (no copies to sync). BlockingChainReconstructorTests ported to Dashboard.Tests (10 tests) as part of the same change — Dashboard now exercises the same reconstruction coverage as Lite. AnalysisService and the DB-bound adapters (*FactCollector, *DrillDownCollector, *FindingStore, *AnomalyDetector, *BaselineProvider, *PlanFetcher) stay per-app because they bind to DuckDBConnection vs SqlConnection. PlanAnalyzer and its planalyzer-sync-checker are outside this extraction's scope and stay
- Trace files are now bounded at the source (#972) —
collect.trace_management_collector creates the long-query trace with a rollover file-count cap (@filecount, via the new @max_files parameter, default 5), so SQL Server itself deletes the oldest .trc file as the trace rolls. The scheduled collector also now issues START instead of RESTART: it keeps one trace running rather than tearing it down and spawning a fresh timestamped trace — and a fresh batch of orphaned files — every cycle
- Blocked-process reports expose blocker-side fields as typed columns —
collect.blocking_BlockedProcessReport now carries blocking_spid, blocking_last_tran_started, blocking_status, blocked_sql_text, and blocking_sql_text populated at insert time from blocked_process_report_xml. Existing rows are backfilled idempotently by the 2.11.0 → 3.0.0 upgrade script
- Blocking-chain reconstruction now reads typed columns from
collect.blocking_BlockedProcessReport instead of re-parsing blocked_process_report_xml on every analysis cycle — eliminates up to 5000 XElement.Parse calls per BLOCKING_CHAIN fact collection. The Dashboard BlockedProcessXmlParser has been deleted; the Lite collection-time parser is unchanged (Lite has no SQL-side staging table and still parses once at collect time)
- Analysis minimum-data threshold lowered to 24 hours —
Lite/Analysis/AnalysisService.cs and Dashboard/Analysis/AnalysisService.cs now require 24 hours of collected data before analysis runs, down from 72. Validated empirically as sufficient for fraction-of-period calculations, so a fresh install starts producing findings after one day instead of three
- Major UI-responsiveness overhaul — the data path now runs off the WPF dispatcher in both apps — DuckDB.NET is synchronous, so in Lite
await _dataService.X() completed on the calling (UI) thread, and a single DuckDB connection open under load is ~750 ms; the result was multi-hundred-millisecond to multi-second UI freezes on the per-minute pipeline, refreshes, and alert checks. The fix moves the work onto pool threads (Task.Run) across the board: Lite's background collect/checkpoint/archive pipeline, the full-refresh fan-out, the 60-second sub-tab refreshes, picker charts, the overview sweep, timeline lanes, connect, and the FinOps and Recommendations reads; the Dashboard's ServerTab row materialization and its execution-plan parse/analyze; and — found later by wall-clock thread-time profiling under a HammerDB TPC-C load — the alert-check / overview-sweep DuckDB queries that were still on the dispatcher (#1121, which cut the worst measured dispatcher stall from ~1.2 s to under 10 ms). Lite also skips the heavy refresh for non-selected (hidden) server tabs, the shared crosshair/hover hot path was made cheaper for both apps, and Dashboard timers gained re-entrancy guards. A cluster of long-session memory leaks that progressively degraded responsiveness was fixed alongside (#1116): an Alerts-tab DispatcherTimer that kept ticking after the tab closed, unbounded per-run alert-key dictionaries, a tray-service handler re-subscribed on every theme change, and plan-viewer controls leaked through a static theme event. (The related sleep/wake blank-window and software-rendering fix is tracked separately under #1050 above.) Net effect: the UI stays responsive under heavy collection and query load
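
The UI-responsiveness entry above rests on one pattern: awaiting a method whose body is really synchronous still runs it on the calling thread, and the remedy is to hand the blocking call to a worker thread. The sketch below shows the same pattern in Python's asyncio as an analogy only; it is not the apps' C# code, where the fix uses Task.Run. `slow_query`, `load_inline`, and `load_offloaded` are hypothetical names, and the 50 ms sleep stands in for a synchronous database call.

```python
import asyncio
import threading
import time


def slow_query() -> int:
    """Stand-in for a synchronous database call; returns the thread it ran on."""
    time.sleep(0.05)
    return threading.get_ident()


async def load_inline() -> int:
    # Looks asynchronous to the caller, but the blocking call runs on the
    # caller's thread, so the event loop is stalled for the whole call.
    return slow_query()


async def load_offloaded() -> int:
    # The blocking call runs on a worker thread; the event loop stays free.
    return await asyncio.to_thread(slow_query)


async def main() -> tuple[int, int, int]:
    loop_thread = threading.get_ident()
    inline_thread = await load_inline()
    offloaded_thread = await load_offloaded()
    return loop_thread, inline_thread, offloaded_thread


loop_thread, inline_thread, offloaded_thread = asyncio.run(main())
print("inline ran on the loop thread:", inline_thread == loop_thread)
print("offloaded ran on the loop thread:", offloaded_thread == loop_thread)
```

In the WPF case the dispatcher thread plays the role of the event-loop thread here: work that completes on it blocks input and rendering for its full duration.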
Added
- tools/Remove-OrphanedTraceFiles.ps1 (#972) — one-time cleanup script for Monitor_LongQueries_*.trc files left on disk by versions through 2.11.0. Run it on the SQL Server host; it skips files belonging to a running trace and files that are in use
- FactAdvice and FactRemediation in PerformanceMonitor.Analysis — new shared-library data layer that maps every scorable fact-key to a Headline / Investigation / Remediation advice block, plus a copy-paste-ready sp_query_store_force_plan T-SQL generator for PLAN_REGRESSION findings (gated to that single fact-key in v1; PARAMETER_SENSITIVITY deliberately does not generate plan-force T-SQL because forcing locks in the wrong plan for some parameter values). Drill-down collectors now also project best_plan_id (via MAX(plan_id) in the plan-dedup CTE) so the generated EXEC carries the integer ID sp_query_store_force_plan actually accepts, not just the hash. Lite's BuildContext now mirrors Dashboard's — both apps emit a Diagnosis card at Details[0] carrying Story / Severity / Notify threshold / Confidence / Facts / Database / Window before the drill-down items. The rendering surfaces that consume this data (email HTML, plain-text email, Teams + Slack webhook payloads, in-app Alert Details window) ship in a separate follow-up PR
- Object- and index-level collection: sizes, growth, usage, and locking/contention (#1103) — both apps gain a daily collector that snapshots per-table and per-index storage (
sys.dm_db_partition_stats), index usage (sys.dm_db_index_usage_stats — seeks/scans/lookups/updates), and per-object locking/latch/escalation (sys.dm_db_index_operational_stats — row-lock waits, page-latch waits, lock-escalation attempts), all from stock DMVs verified stable from SQL Server 2016 through 2025 and on Azure SQL DB / Managed Instance. On-prem and MI iterate user databases (honoring the collector exclusion list); Azure SQL DB uses its single-database branch. Dashboard collects into collect.index_object_stats (install/55_collect_index_object_stats.sql, scheduled daily with 90-day retention picked up by dynamic retention); Lite collects into DuckDB with archival registered. Three new FinOps sub-tabs in each app — Object Sizes & Growth (per-table size plus 7-/30-day growth and daily rate), Index Usage (Unused / Write-only / Active classification), and Locking & Contention (top-contended indexes) — plus MCP read tools (get_table_index_sizes, get_index_usage, get_object_locking). Because the daily snapshots are cumulative, the new object-growth (ANOMALY_OBJECT_GROWTH, a table grew >100 MB and ≥20% day-over-day) and lock-contention (ANOMALY_OBJECT_CONTENTION, an index gained ≥60s of new row-lock wait) alerts are delta-based (the two most recent snapshots, reset-guarded) and flow through the existing anomaly → AnalysisNotificationService pipeline. Thresholds are fixed constants in this release; making them user-configurable is a follow-up
- Recommendations / Apply Fix engine (advise-and-act rebuild) — the analysis engine's advisory output is now a first-class Recommendations surface in both apps, alongside Critical Issues. Each finding renders as a card with a plain-language Headline / Investigation / Remediation block (from the
FactAdvice / FactRemediation shared-library data layer) and routes the reader into the relevant in-app view or MCP tool instead of dumping raw DMV queries. Advise-only recommendations include server-config advisories (MAXDOP / cost threshold for parallelism / max server memory), per-database config (autogrowth, percent-growth on large files), server-health facts (Lock Pages in Memory, Instant File Initialization, recent memory dumps), and missing-index / plan-warning recommendations mined from collected plans — missing-index CREATE statements are surfaced as copy-paste text. A subset is appliable in place behind informed, two-sided consent: always-safe ALTER DATABASE SET config fixes, and the destructive RCSI (enable read-committed snapshot) and clear cached plan (DBCC FREEPROCCACHE / unforce) fixes, which gate behind an acknowledge-each-risk dialog that quantifies both the risk of changing and the risk of doing nothing from the finding's own monitoring data. The advice and remediation T-SQL also render across every notification surface — email (HTML and plain text), Teams and Slack webhook payloads, and the in-app Alert Details window — and through the analyze_server and get_analysis_findings MCP tools
- Low volume free-space alert (#754) in both apps — a new Volume Free Space alert (default on) fires when a monitored server's disk volume drops below a free-space percentage or a fixed GB amount (set either threshold to
0 to disable that dimension; if both are set, either breach fires). It reads the per-volume size/free data already collected by the database-size collector, evaluates every volume on the server, and fires one alert per server naming the worst (lowest-free) volume with up to five breaching volumes in the context — with the same cooldown, mute, alert-history, tray, and email plumbing as the existing tempdb-space alert. Defaults: 10% / 5 GB. Azure SQL DB has no volume data, so the alert never fires there
- Failed SQL Agent job alert (#749) in both apps — complements the existing job-duration alerts with a Failed Agent Job alert (default on) that issues a live
msdb.dbo.sysjobhistoryquery at alert-check time for job-outcome rows (step_id = 0,run_status = 0) that failed within a configurable look-back window (default 60 minutes). The read degrades gracefully when the login lacks msdb /SQLAgentReaderRoleaccess (returns empty, never faults the alert cycle) and is skipped entirely on Azure SQL DB, which has no SQL Agent - Installer: optional custom data/log file locations (#768) — two optional CLI flags,
--data-pathand--log-path(both--flag VALUEand--flag=VALUEforms accepted), place thePerformanceMonitordatabase's.mdf/.ldfon specific server-side volumes at install time; an omitted flag falls back to the instance default path as before. The paths apply only on first creation (the create block is guarded byIF DB_ID(N'PerformanceMonitor') IS NULL), and Azure SQL Managed Instance ignores them. The path is validated and escaped (control characters and the dangerous filename characters are rejected; single quotes are doubled in both the C# injection layer and the dynamicCREATE DATABASE) because a data-fileFILENAMEliteral cannot be parameterized