Release v1.7.0 — wait stats as warnings#245
Conversation
Each wait stat recorded on a statement now emits a first-class PlanWarning alongside the existing wait-stats card, so waits sort by benefit % in the warnings list next to operator findings instead of living only in their own panel. - WaitStatsKnowledge: per-wait description + how-to-fix content covering PAGEIOLATCH_*, PAGELATCH_*, LATCH_*, LCK_M_*, MEMORY_ALLOCATION_EXT, RESOURCE_SEMAPHORE*, CX*/HT*/BMPBUILD parallel + batch-mode families, SOS_SCHEDULER_YIELD, THREADPOOL, ASYNC_NETWORK_IO, WRITELOG, IO_COMPLETION, LOGBUFFER, EXECSYNC, PREEMPTIVE_OS_WRITEFILEGATHER, with prefix fallbacks. - PAGEIOLATCH_SH/EX/UP/DT warnings include effective latency per wait (wait_ms / wait_count) in the message. - Severity tiers off benefit %: >=50 Critical, >=10 Warning, else Info. - Flows through existing PlanWarnings pipeline so web viewer, HTML export, JSON/text, and Avalonia Plan Warnings expander all pick it up with no renderer changes. Wait-stats chart stays as a complementary view. - Version bump 1.6.0 -> 1.7.0. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
erikdarlingdata
left a comment
There was a problem hiding this comment.
Review summary
Release PR for v1.7.0 surfacing wait stats as first-class PlanWarnings. Four files, ~370 LOC, mostly the new WaitStatsKnowledge lookup table plus an EmitWaitStatWarnings pass in BenefitScorer. No XAML / Avalonia surface area changes, no MVVM creep, no contrast/brush regressions to flag.
What this does
BenefitScorer.Scoregets a new pass that, for everyWaitStatInfoon a statement, emits aPlanWarningwith description + how-to-fix fromWaitStatsKnowledge, severity bucketed by the per-waitMaxBenefitPercentfromScoreWaitStats(>=50Critical />=10Warning / else Info).WaitStatsKnowledgeis a static lookup with ~50 exact-match entries plus prefix fallbacks (LCK_M_, PAGEIOLATCH_, PAGELATCH_, LATCH_, HT, CX, PREEMPTIVE_) and a generic default. Never returns null.PAGEIOLATCH_*entries setShowEffectiveLatency = true, so their warning message includeswait_ms / wait_countformatted viaFormatLatency.- Plumbed into the Web project via a
<Compile Include=... Link=...>reference inPlanViewer.Web.csproj(correct pattern matching the rest of the file).
What's good
- Lookup falls back gracefully and is null-safe.
FormatLatencycovers s/ms/µs ranges sensibly; division byWaitCountis guarded bywait.WaitCount > 0.- Severity bucketing on
Nullable<double>via pattern switch is clean — null benefit cleanly defaults to Info. - Change is additive — no renderer/UI changes needed because everything flows through the existing
PlanWarningspipeline, which matches the PR description.
What needs attention
-
Base branch is
main. Project convention is to targetdev. Confirmed this is intentional because it's a release PR (v1.6.0 → v1.7.0 inPlanViewer.App.csproj), but flagging per the standing review checklist. -
PerformanceMonitor sync. This PR changes
BenefitScorer(scoring) and adds theWaitStatsKnowledgerule data — both live in the analyzer surface that the PerformanceMonitor Dashboard and PerformanceMonitor Lite repos keep in sync with Studio. The same wait-warning emission + knowledge table needs to land there too, otherwise those products will diverge from Studio on what wait stats produce as warnings. (Also worth deciding whether the severity thresholds —>=50/>=10— should be a shared constant rather than duplicated in three repos.) -
No tests. New logic in
src/PlanViewer.Core/Services/(the entireWaitStatsKnowledge.Lookupfallback chain, severity bucketing,FormatLatency, message assembly) has no corresponding test intests/PlanViewer.Core.Tests/. The prefix-fallback ordering and theExact["MEMORY_ALLOCATION_EXT"]direct access (see inline) are exactly the kind of thing a 30-line table-driven test would have caught any regression on. At minimum: a few[InlineData]cases verifying exact match wins over prefix,LATCH_FOOvsPAGELATCH_FOOvsPAGEIOLATCH_FOOgo to the right entry, unknown returnsDefault, severity buckets pick the right label. -
Inline issues (left on the lines):
WaitStatsKnowledge.cs:303—Exact["MEMORY_ALLOCATION_EXT"]is a hard dictionary read in the fallback path; will throw if the key is ever removed.WaitStatsKnowledge.cs:281and:288—StartsWith("HT")andStartsWith("CX")prefix fallbacks are too broad given the full known wait-type sets are already inExact.BenefitScorer.cs:43—EmitWaitStatWarningsruns even whenQueryTimeStatsis null, in which case all severities silently collapse toInfo. Confirm intended.BenefitScorer.cs:62— no minimum-benefit threshold; every nonzero wait emits a warning, which can flood the Plan Warnings expander on busy plans.
-
Release-PR checklist — the unchecked items in the description (web viewer + desktop expander visual checks post-deploy) are exactly the surfaces this change lights up. Worth doing before merging to main rather than after.
Not approving / not requesting changes — comments only.
Generated by Claude Code
| }; | ||
|
|
||
| if (wt.Contains("MEMORY_ALLOCATION")) | ||
| return Exact["MEMORY_ALLOCATION_EXT"]; |
There was a problem hiding this comment.
Exact["MEMORY_ALLOCATION_EXT"] is a direct dictionary access — if anyone later removes or renames that key in the Exact initializer above, this throws KeyNotFoundException at runtime instead of falling back to Default. Either TryGetValue it, or just inline a fresh Entry { ... } here so the fallback isn't coupled to the exact-match table.
Generated by Claude Code
| HowToFix = "Check sys.dm_os_latch_stats for the specific latch class — these are usually concurrency signals rather than query-tuning problems." | ||
| }; | ||
|
|
||
| if (wt.StartsWith("HT") && wt.Length > 2) |
There was a problem hiding this comment.
StartsWith("HT") && Length > 2 is too broad — it will swallow any future/unrelated wait type whose name happens to begin with HT and mislabel it as a batch-mode hash wait with batch-mode-hash how-to-fix advice. The known batch-mode hash family is HTBUILD/HTREPARTITION/HTDELETE/HTMEMO/HTREINIT (all already in Exact), so the prefix fallback adds no value today and risks wrong attribution tomorrow. Consider either dropping this branch entirely (let unknown HT* fall through to Default) or tightening to a known prefix like HTREP / explicit list.
Generated by Claude Code
| HowToFix = "Reduce input rows and skew for the batch-mode hash. Confirm the smaller side is on the build, and check that batch mode is the right choice for this plan shape." | ||
| }; | ||
|
|
||
| if (wt.StartsWith("CX")) |
There was a problem hiding this comment.
Same issue as the HT prefix above: StartsWith("CX") matches anything starting with CX. The full known parallel-exchange set (CXPACKET/CXCONSUMER/CXSYNC_PORT/CXSYNC_CONSUMER) is already in Exact. A bare CX prefix gives a generic "parallel exchange" answer for anything that happens to start with those two letters — same false-positive risk as HT. Either tighten to CXSYNC_ / CXPACKET / CXCONSUMER, or drop the fallback and let unknowns hit Default.
Generated by Claude Code
| ScoreWaitStats(stmt); | ||
|
|
||
| if (stmt.WaitStats.Count > 0) | ||
| EmitWaitStatWarnings(stmt); |
There was a problem hiding this comment.
EmitWaitStatWarnings runs even when QueryTimeStats == null (line 39's ScoreWaitStats is skipped in that case). With no WaitBenefits populated, benefitByType is empty and every wait gets severity = Info regardless of how dominant the wait is. That may be intentional (estimated/incomplete plans don't have elapsed to score against) but worth confirming — if it is, a one-line comment here saying "no QueryTimeStats → all severities pinned to Info" would save the next reader a trip through the lookup chain.
Also: the two adjacent if (stmt.WaitStats.Count > 0) blocks can collapse into a single guarded block, slightly cheaper and reads more clearly:
if (stmt.WaitStats.Count > 0)
{
if (stmt.QueryTimeStats != null) ScoreWaitStats(stmt);
EmitWaitStatWarnings(stmt);
}Generated by Claude Code
|
|
||
| foreach (var wait in stmt.WaitStats) | ||
| { | ||
| if (wait.WaitTimeMs <= 0) continue; |
There was a problem hiding this comment.
The only filter on emitting a warning is WaitTimeMs <= 0. Every nonzero wait — including 1ms SOS_SCHEDULER_YIELD chatter on a 30s query — becomes a first-class PlanWarning in the expander. On busy real-world plans (10–30 distinct wait types) this can flood the Plan Warnings UI with low-signal Info rows and make the genuinely actionable ones harder to find.
Consider a minimum-relevance threshold, e.g. skip emission if wait.WaitTimeMs / elapsedMs < 0.005 and benefitPct < 1, or drop pure Info waits below some absolute floor.
Generated by Claude Code
Summary
PAGEIOLATCH_*warnings include effective latency per wait (wait_ms / wait_count).PlanWarningspipeline → web viewer, HTML export, JSON/text, Avalonia Plan Warnings expander all updated with no renderer changes.See PR #244 for full details.
Test plan
20260415_1.sqlplanandNXfso4jC0L.sqlplan🤖 Generated with Claude Code