Skip to content

Release v1.7.2 — operator self-time fix + decimal benefit %#256

Merged
erikdarlingdata merged 6 commits intomainfrom
dev
Apr 22, 2026
Merged

Release v1.7.2 — operator self-time fix + decimal benefit %#256
erikdarlingdata merged 6 commits intomainfrom
dev

Conversation

@erikdarlingdata
Copy link
Copy Markdown
Owner

Summary

Hotfix for #215 Joe feedback:

  • C5: operator self-time was wrong for parents of Compute Scalar children (bug since v1.5.0). Fixed via look-through helper. Node-6 Hash Spill on Joe's plan went from a bogus 95% to the correct 17.4%.
  • C2: benefit % now shows one decimal instead of rounding tiny waits to "up to 0%".

See PR #255 for details.

Test plan

  • Joe's private plan math matches
  • Regression check on `20260415_1.sqlplan`
  • Web viewer visual post-deploy

🤖 Generated with Claude Code

erikdarlingdata and others added 6 commits April 21, 2026 17:09
- PlanShare dashboard: rebuild referrer table with DOM + textContent instead
  of innerHTML, so legacy rows containing hostile strings can't execute script.
- PlanShare /api/event: drop the referrer entirely when Uri.TryCreate rejects
  it, instead of persisting the raw client-supplied string.
- WindowsCredentialService: save credentials with Enterprise scope
  (current-user, DPAPI-encrypted) instead of LocalMachine (readable by every
  user on the box).
- PlanShare RateLimiter: add Sweep() that prunes and TryRemoves empty keys,
  invoked from CleanupService's existing hourly tick so the dictionary no
  longer grows unbounded across unique IPs.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- ReproScriptBuilder: double ']' in database names before emitting USE, so
  identifiers like `cool]stuff` produce valid T-SQL instead of unbalanced
  brackets.
- ReproScriptBuilder: whitelist SET TRANSACTION ISOLATION LEVEL against the
  five valid T-SQL names; unknown/hostile strings drop silently instead of
  interpolating into the script.
- ShowPlanParser.ParseLong: pass NumberStyles.Integer + InvariantCulture to
  match sibling ParseDouble; makes numeric parsing culture-independent.
- KeychainCredentialService: read stdout and stderr concurrently via
  Task.WhenAll to avoid the classic "sync stderr while stdout is async"
  deadlock, which could surface on `security dump-keychain` with large
  keychains.
- QueryStoreHistoryWindow: override OnClosed to cancel + dispose the
  in-flight fetch CancellationTokenSource, so SqlConnection doesn't sit
  open on the server when the dialog is closed mid-load.
- QuerySessionControl: on DetachedFromVisualTree, cancel + dispose
  _statusClearCts alongside the existing TextMate teardown, so the
  fire-and-forget status-clear dispatch can't touch a dead control.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Swap the predictable `ssms_plan_{yyyyMMdd_HHmmss}.sqlplan` name for a random
suffix produced by Path.GetRandomFileName, and open with FileMode.CreateNew
so we refuse to overwrite a pre-existing file. Closes the race where another
local process could predict the filename and plant a symlink at that path
before the extension's File.WriteAllText landed.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Added Services/AtomicFile.WriteAllText which writes to a sibling .tmp and
then renames into place. File.Move(src, dst, overwrite: true) maps to
MoveFileEx(MOVEFILE_REPLACE_EXISTING) on Windows and rename(2) on Unix —
both atomic when source and destination share a filesystem, which is
always the case here.

Swapped File.WriteAllText for AtomicFile.WriteAllText at every config-save
site:
  - ConnectionStore.Save (saved server list — the worst-case loss)
  - AppSettingsService.Save (recent/open plans, slicer days-back, etc.)
  - SqlFormatSettingsService.Save (format options)
  - AboutWindow.SaveMcpSettings (MCP enable + port)

A crash between the .tmp write and the rename leaves the original file
intact and a stray .tmp sibling; the next save overwrites .tmp before
renaming, so no manual cleanup is needed.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Passwords given on the command line are visible in process listings, shell
history, and audit logs. Add --password-stdin to analyze and query-store so
scripts and humans can feed the password through a pipe instead. When
--password is used inline, emit a stderr warning pointing at the safer
alternatives (stdin, env var, credential store).

Helper `PasswordResolver.TryResolve` centralizes:
 - Mutual exclusion between --password and --password-stdin.
 - Conflict detection between --stdin (plan XML) and --password-stdin.
 - Validation that stdin is actually redirected when --password-stdin is set.
 - Fallback to PLANVIEW_PASSWORD env var.
 - The inline warning for --password.

CredentialCommand already reads from redirected stdin when --password is
omitted; kept that behavior and added the inline warning for symmetry.

Also fixed a pre-existing bug where `Environment.ExitCode = 1` set by
handlers was clobbered by `return await root.InvokeAsync(args)`. Program.cs
now returns `code != 0 ? code : Environment.ExitCode` so scripts can tell
a validation failure from a successful run.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…decimal on benefit % (#255)

C5 (operator self-time look-through):
GetSerialOwnElapsed subtracted each direct child's ActualElapsedMs, but
Compute Scalar operators don't carry runtime stats (ActualElapsedMs = 0).
For an operator whose direct children are Compute Scalars — like the node-6
Hash Match in Joe's private plan — that meant "children elapsed = 0" and the
parent kept its full elapsed as self-time.

Effect: node-6 Hash Spill reported 95% benefit / 8,829ms self-time. With
look-through through the Compute Scalars it's 1,609ms self-time / 17.4% —
matching Joe's math of 8.829 - 7.121 - 0.099.

New GetEffectiveChildElapsedMs helper walks through pass-through operators
(Compute Scalar and anything else missing runtime stats) to the first
descendant with real stats. Parallelism exchange children keep existing
max-child behavior because exchange times are unreliable.

Applies to every operator-time-based benefit score (spills, key lookups,
bare scans, scan-with-predicate, scan cardinality, eager index spool, etc).

C2 (decimal precision on benefit %):
Benefit % was rounded to N0 everywhere, so tiny waits showed as "up to 0%".
Switched to one decimal except for 100 (shown as "100"), matching the
existing pattern in ComparisonFormatter. Applied to web strip, wait-stats
card, HTML export (both surfaces), text formatter (both paths), Avalonia
Plan Warnings + Node Warnings headers.

Version bump 1.7.1 -> 1.7.2.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Owner Author

@erikdarlingdata erikdarlingdata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What this PR does

Release v1.7.2 rollup. Two user-visible fixes from Joe's feedback on #215:

  • C5 operator self-time: PlanAnalyzer.GetSerialOwnElapsed now looks through pass-through operators (Compute Scalar, Parallelism exchange) that don't carry reliable runtime stats, via a new GetEffectiveChildElapsedMs helper. The Hash Spill-parent-of-Compute-Scalar case that was reporting ~95% self-time now reports the correct ~17%.
  • C2 benefit %: wait-stat / warning "up to X%" headers render with one decimal below 100 via a FormatBenefitPercent helper, so tiny waits no longer round to "up to 0%." Applied consistently across PlanViewerControl, HtmlExporter, TextFormatter, and Web/Index.razor.

Also bundles previously-merged fixes: dashboard XSS hardening, rate-limiter key eviction, atomic settings writes, random SSMS temp-plan filename, --password-stdin, keychain stderr-deadlock fix, isolation-level allowlist in ReproScriptBuilder, culture-invariant ParseLong, CredentialPersistence.Enterprise.

Good

  • Benefit-% helper is applied everywhere the old N0 format appeared — no surface missed.
  • Sweep() is wired through DI and invoked by CleanupService rather than a separate timer; nice.
  • Dashboard switches from innerHTML to textContent + createElement — correct XSS fix.
  • ReproScriptBuilder isolation-level allowlist and databaseName ]-doubling close two real SQL-injection shapes in a user-facing script generator.
  • KeychainCredentialService concurrent stream read fixes a real deadlock.

Needs attention

  1. Base branch is main. Typical release flow, but flagging per convention — non-release PRs should target dev.
  2. PlanAnalyzer rule change must mirror into PerformanceMonitor Dashboard + Lite. Those separate repos share Studio's rule set; if they don't get the same GetEffectiveChildElapsedMs update they'll keep reporting the bogus self-time. Inline on PlanAnalyzer.cs:1634.
  3. Rate-limiter Sweep race. IsAllowed can hold a reference to a list that Sweep just evicted from the dictionary; the next timestamp lands on an orphaned list. Inline on Program.cs:479.
  4. Credential persistence change is breaking for existing users. LocalMachineEnterprise won't read back credentials written under the prior scope. Inline on WindowsCredentialService.cs:19.
  5. No test changes in tests/PlanViewer.Core.Tests/. The self-time fix is exactly the kind of thing that should land with a regression test — e.g. a .sqlplan fixture reproducing the Hash-Spill-over-Compute-Scalar case, asserting GetSerialOwnElapsed returns the corrected value. PasswordResolver, AtomicFile, ReproScriptBuilder isolation gating, and ParseLong invariant-culture all lack coverage too.
  6. Nit on PasswordResolver line 49 — redundant TrimEnd after ReadLine.
  7. Nit on AtomicFile — crash-safe but not safe against two concurrent writers to the same path.

Generated by Claude Code

sum += GetEffectiveChildElapsedMs(grandchild);
return sum;
}

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Plan-analysis rule change. Confirm the equivalent look-through fix is landing in PerformanceMonitor's Dashboard and Lite copies — those repos must stay in sync with Studio's self-time math, otherwise the same Hash Spill / Compute Scalar parent will continue to report the stale ~95% figure there.

Minor: GetEffectiveChildElapsedMs treats ActualElapsedMs == 0 as "no stats" and recurses, but a genuinely instantaneous operator with recorded 0ms would also recurse and over-count via grandchildren. Probably fine in practice (sub-ms timings don't aggregate into anything user-visible) but worth noting if you later hit a weird case.


Generated by Claude Code

lock (kvp.Value)
{
kvp.Value.RemoveAll(t => t < cutoff);
if (kvp.Value.Count == 0 && _requests.TryRemove(kvp))
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Race with IsAllowed: a caller that already resolved _requests.GetOrAdd(key, ...) → timestamps can be blocked on lock (timestamps) while Sweep holds the same lock, observes Count == 0, and TryRemoves the kvp. When the caller then acquires the lock it adds its timestamp to a list that's no longer in the dictionary — the next IsAllowed(key) call creates a fresh list, so that one request effectively bypasses the rate limit.

Not catastrophic at 10/min, but worth either (a) re-checking that _requests[key] still points at kvp.Value after the remove, or (b) adding a sentinel/"tombstoned" flag on the list so a late arrival can detect the eviction and retry via GetOrAdd.


Generated by Claude Code

secret: password,
comment: "planview credential",
persistence: CredentialPersistence.LocalMachine);
persistence: CredentialPersistence.Enterprise);
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Persistence scope change from LocalMachineEnterprise is a behavior change for existing installs. Credentials previously stored under the machine-wide scope won't be returned by a read that's targeting the enterprise (roaming) scope — upgrading users may silently lose saved SQL logins and be forced to re-enter them. If that's intentional (DPAPI scope hardening from #249), worth a release-notes line; if not, reading from both scopes on load and migrating once would avoid the regression.


Generated by Claude Code

}
password = Console.In.ReadLine()?.TrimEnd('\r', '\n') ?? "";
return true;
}
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Console.In.ReadLine() already strips the trailing line terminator, so the .TrimEnd('\r', '\n') is redundant — and if a caller pipes a password that legitimately ends in \r (unlikely but possible on Windows-CRLF input that's been partially normalized), you'd silently strip a real character. Safe to drop the TrimEnd.


Generated by Claude Code

{
var tmp = path + ".tmp";
File.WriteAllText(tmp, contents);
// File.Move with overwrite:true maps to MoveFileEx(MOVEFILE_REPLACE_EXISTING)
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two concurrent writers (two Studio processes saving settings at once) both race on path + ".tmp"File.WriteAllText on the second one truncates the first's in-progress temp, then both File.Move calls succeed in some order, so the final content is well-formed but may be either process's payload rather than the later write. Settings don't change often enough for this to matter, but the docstring's "atomic with respect to process crashes" is accurate — not "atomic with respect to concurrent writers." A unique suffix (e.g. path + "." + Guid.NewGuid().ToString("N") + ".tmp") would close it if you ever need it.


Generated by Claude Code

@erikdarlingdata erikdarlingdata merged commit fdf490d into main Apr 22, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant