B3 PR-A: approval-gated Apply Fix — privileged execution core (no UI) by erikdarlingdata · Pull Request #1032 · erikdarlingdata/PerformanceMonitor

erikdarlingdata · 2026-05-30T23:37:47Z

PR-A — Approval-gated "Apply Fix": privileged execution core (no UI)

PR-A of B3: the security-critical core that mutates a monitored, likely-production SQL Server via sp_query_store_force_plan. No UI, no MCP exposure, no mutating caller — PR-B adds the gated Apply button over this already-reviewed core. Do not merge until the security-reviewer pass on this code.

Plan (spec): C:\Users\edarl\.claude\plans\b3-phase1-implementation.md. Baseline origin/dev b404c76. Three commits: core, gate-split, real-server fixes.

The six security invariants and how each is enforced

1. Structured execution only. sp_query_store_force_plan / sp_query_store_unforce_plan is issued with typed @query_id/@plan_id parameters (SqlDbType.BigInt), never the rendered SQL text. The database is applied only as InitialCatalog, built solely through SqlConnectionStringBuilder (never concatenated). No general Execute(sql) exists. → DatabaseService.Remediation.cs.
2. Single-connection self-gating (R2-MOD-1). ForcePlanAsync/UnforcePlanAsync open one retargeted SqlConnection and run everything on that same open connection with no re-open: first the identity/permission gate (DB_NAME()==target assert (A5) + DB-scoped ALTER check), then the Query Store freshness read, then, only if all pass, the EXEC. The outcome carries the @@SPID at the gate read and at the EXEC; the real-server harness asserts they are equal (observed 72==72). ApplyAsync takes no preflight disposition and re-derives its own gate, so the gate cannot be bypassed.
3. No elevation. Runs under the existing per-server monitoring connection. No credential prompt, no SecureString/SqlCredential, no elevated path. has_alter=0 fails closed with map-then-grant guidance. used_elevated_cred is always 0.
4. Audit-table-absent = HARD BLOCK (R2-MOD-2). If OBJECT_ID('config.remediation_action_log') IS NULL (checked on the monitoring connection), every target is blocked with no mutation attempted.
5. Audited apply + unapply. One config.remediation_action_log row per attempt (success/skip/error/abort), written after both apply and unapply, on the monitoring connection.
6. Never automatic. PR-A has no caller that triggers a mutation (A4 no-caller test). PR-B adds the gated UI.
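
The ordering in invariants 2 and 4 can be sketched as a small, language-neutral model. This is not the project's C# code: `FakeConnection`, `Disposition`, and `apply_force_plan` are hypothetical names, the choice to check the audit table first is an assumption (the description only says it blocks before any mutation), and the skip conditions are simplified to "plan missing or already forced".

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Disposition(Enum):
    APPLIED = "applied"
    BLOCKED = "blocked"
    PERMISSION_DENIED = "permission_denied"
    SKIPPED = "skipped"


@dataclass
class FakeConnection:
    """Hypothetical stand-in for ONE open, retargeted connection."""
    spid: int
    db_name: str
    has_alter: Optional[int]  # 1, 0, or None (SQL NULL)
    plan_present: bool = True
    is_forced: bool = False
    qs_reads: int = 0  # how many times Query Store state was read
    execs: int = 0     # how many times the mutating EXEC ran

    def read_gate_identity(self) -> dict:
        # Models the always-accessible intrinsics (DB name, ALTER check, SPID).
        return {"db_name": self.db_name, "has_alter": self.has_alter, "spid": self.spid}

    def read_query_store_state(self) -> dict:
        self.qs_reads += 1
        return {"plan_present": self.plan_present, "is_forced": self.is_forced}

    def exec_force_plan(self, query_id: int, plan_id: int) -> int:
        self.execs += 1
        self.is_forced = True
        return self.spid  # SPID observed at the EXEC


def apply_force_plan(
    conn: FakeConnection, target_db: str, query_id: int, plan_id: int,
    audit_table_present: bool,
) -> Tuple[Disposition, Optional[int], Optional[int]]:
    """Returns (disposition, gate_spid, exec_spid); every failure stops before the EXEC."""
    if not audit_table_present:                 # hard block, nothing attempted
        return Disposition.BLOCKED, None, None
    gate = conn.read_gate_identity()
    if gate["db_name"] != target_db:            # wrong database after retarget
        return Disposition.BLOCKED, gate["spid"], None
    if gate["has_alter"] != 1:                  # 0 or NULL: fail closed, no QS read
        return Disposition.PERMISSION_DENIED, gate["spid"], None
    qs = conn.read_query_store_state()          # only after ALTER passes
    if not qs["plan_present"] or qs["is_forced"]:
        return Disposition.SKIPPED, gate["spid"], None
    exec_spid = conn.exec_force_plan(query_id, plan_id)
    return Disposition.APPLIED, gate["spid"], exec_spid
```

Because the gate and the EXEC go through the same `conn` object, the two returned SPIDs are equal by construction, which is the property the real-server harness checks with @@SPID.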

Real-server findings — two bugs the faked-executor unit tests could not reach

Caught by running the actual DatabaseServiceRemediationExecutor + ForcePlanHandler against sql2022 (the faked IRemediationExecutor can't exercise live T-SQL):

1. ALTER-permission form (security-gate correctness). The plan's O5 specified HAS_PERMS_BY_NAME(NULL, NULL, 'ALTER'), but that server-scoped form returns NULL even for sysadmin (verified on sql2022), so the gate would fail closed for every login and Apply could never run. Fixed to the DB-scoped form HAS_PERMS_BY_NAME(DB_NAME(), 'DATABASE', 'ALTER') (1 for ALTER holders incl. sysadmin/db_owner, 0 otherwise). Using DB_NAME() rather than a literal keeps it correct after the catalog retarget.
2. Unforce delegation. UnforcePlanAsync delegated with isUnforce: false (which would re-force on un-apply); corrected to isUnforce: true.
Related fix (gate split). The per-target gate is split into two reads so that has_alter=0 fails closed before any Query Store catalog read (a least-privilege login lacks VIEW DATABASE STATE and would otherwise hit error 297 reading sys.query_store_plan). The identity and ALTER checks use only always-accessible intrinsics; the QS read runs only after ALTER passes, on the same open connection (R2-MOD-1 intact).
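
To illustrate why the server-scoped form locked everyone out, here is a minimal sketch of a fail-closed reading of the permission scalar. `holds_alter` is a hypothetical helper, not a function from this PR; the two sample values are the ones reported above for a sysadmin login on sql2022.

```python
from typing import Optional


def holds_alter(has_perms_result: Optional[int]) -> bool:
    """Fail closed: only an explicit 1 passes; 0 and NULL (None) both deny."""
    return has_perms_result == 1


# Values as reported in the findings above for a sysadmin login on sql2022.
server_scoped_result = None  # HAS_PERMS_BY_NAME(NULL, NULL, 'ALTER')
db_scoped_result = 1         # HAS_PERMS_BY_NAME(DB_NAME(), 'DATABASE', 'ALTER')
```

With this reading, `holds_alter(server_scoped_result)` denies even a sysadmin, while `holds_alter(db_scoped_result)` lets an ALTER holder through. Treating NULL as a denial is the safe choice, but it means a wrong permission form shows up as "nobody can apply" rather than as an error.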

Structured-params persistence

RemediationAction/ForcePlanTarget live in PerformanceMonitor.Analysis and include LatestCpuPerExecUs/BestCpuPerExecUs for render stability (M1). FactRemediation is refactored to extract the targets once (ExtractPlanRegressionTargets) and then call BuildAction; the rendered preview is byte-for-byte unchanged. The action round-trips via an optional member on AlertDetailItem/AlertContextDto/AlertContextSerializer in the existing contextJson. Old contexts without that member deserialize to null (no Apply, no crash).
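
The optional-member round trip can be sketched as follows. This is an illustrative model only: the JSON key `remediationAction`, the `summary` field, and the simplified `ForcePlanTarget` shape are assumptions, not the real contextJson schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class ForcePlanTarget:
    # Simplified, hypothetical shape: data only, no SQL text.
    database: str
    query_id: int
    plan_id: int


@dataclass
class AlertContext:
    summary: str
    remediation: Optional[List[ForcePlanTarget]] = None  # absent in legacy contexts


def serialize(ctx: AlertContext) -> str:
    payload = {"summary": ctx.summary}
    if ctx.remediation is not None:
        # The member is written only when present, so old readers see no change.
        payload["remediationAction"] = [asdict(t) for t in ctx.remediation]
    return json.dumps(payload)


def deserialize(text: str) -> AlertContext:
    raw = json.loads(text)
    targets = raw.get("remediationAction")  # missing key -> None, never an error
    remediation = None if targets is None else [ForcePlanTarget(**t) for t in targets]
    return AlertContext(summary=raw.get("summary", ""), remediation=remediation)
```

A caller would offer Apply only when `remediation` is not `None`, which is how a legacy context ends up with no Apply and no crash.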

Upgrade script

upgrades/2.11.0-to-2.12.0/02_create_remediation_action_log.sql (+ upgrade.txt), config schema, idempotent (OBJECT_ID guard). 2.12.0 is the in-flight version (tag v2.11.0, csprojs 2.11.0) so it appends to the existing folder. No install/*.sql edited.

Tests (unit)

Golden render-stability (byte-for-byte incl. the two (cpu/exec … us) lines).
Serializer round-trip (Dashboard + Lite); legacy JSON without the field → null.
Handler gate vs faked executor: has_alter=0 fail-closed; audit-table-absent block; already-forced/QS-off/stale/wrong-DB dispositions; per-target independence; gate re-derivation; applied-but-unlogged; per-outcome audit; un-apply restricted to prior B3 forces.
A4 no-caller: ForcePlanAsync/UnforcePlanAsync referenced only in the executor seam.
Suites green: Dashboard.Tests 68/0, Lite.Tests 310/0.
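
A no-caller guard of the kind described for A4 can be approximated by scanning source text for guarded identifiers outside an allow-list. The sketch below is an assumption about the general technique, not the project's actual test; `find_violations` and the file names in the usage note are hypothetical.

```python
import re
from typing import Dict, Iterable, List, Set, Tuple

# The two method names the A4 guard looks for, per the test list above.
GUARDED_NAMES = ("ForcePlanAsync", "UnforcePlanAsync")


def find_violations(
    sources: Dict[str, str],
    allowed_files: Set[str],
    names: Iterable[str] = GUARDED_NAMES,
) -> List[Tuple[str, str]]:
    """Return (file, identifier) pairs for guarded names used outside allowed files."""
    # Word boundaries keep e.g. IRemediationHandler from matching inside a longer name.
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    hits: List[Tuple[str, str]] = []
    for path in sorted(sources):
        if path in allowed_files:
            continue
        for match in pattern.finditer(sources[path]):
            hits.append((path, match.group(1)))
    return hits
```

For example, with the executor file allow-listed, a hypothetical `MainWindow.cs` containing `await db.ForcePlanAsync(target);` would be reported, while a file that never names a guarded identifier would not. A guard like this only sees the names it is given, so the `names` argument has to cover every type through which the executor can be reached.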

Real-server verification (sql2022, Query Store DB)

Installer @2.12.0 (temporary <Version>/<AssemblyVersion>/<FileVersion>/<InformationalVersion> bump in both installer csprojs — reverted, not in this PR; server is the bare hostname SQL2022): Existing installation detected: v2.11.0.0 → Found 1 upgrade(s) → applied 2.11.0-to-2.12.0 (02_create_remediation_action_log.sql - Success), RC=0 "Installation completed successfully". Verified: config.remediation_action_log PRESENT (15 cols), history recorded 2.12.0.0 : SUCCESS. Second run: No pending upgrades found (idempotent).
Executor/handler harness (throwaway, passed 1/1; not committed). Audit table ground-truth (queried via sqlcmd):
- force/success, executing_login=sa, operator_identity=HARNESS\tester — is_forced_plan=1.
- R2-MOD-1: gate @@SPID == EXEC @@SPID (72 == 72).
- unforce/success, executing_login=sa — is_forced_plan=0.
- Absent-table → Blocked, no mutation.
- Scoped login (b3_noalter, VIEW DATABASE STATE but no ALTER) → has_alter=False, PermissionDenied, fail closed, no mutation. (Preflight surfaced the login name correctly; the no-ALTER message login string had a harness-logging quirk — the audit table and preflight both record the correct login.)

Not UI-reachable

A4 no-caller test confirms nothing in the running app reaches the executor. DatabaseService.Remediation.* methods are internal, reachable only via DatabaseServiceRemediationExecutor.

Notes / deviations

Executor sets standard ANSI options (incl. ARITHABORT ON) before the gate/EXEC — sp_query_store_force_plan errors 1934 otherwise (Microsoft.Data.SqlClient defaults ARITHABORT OFF). Same open connection, so R2-MOD-1 holds.
RemediationIdentity carries an optional SourceAlertRef (audit traceback) — minor superset of the plan's { OperatorIdentity }.
Plan O5 was wrong about the HAS_PERMS_BY_NAME form (see above); corrected with real-server evidence.
All six invariants are met; none was left unmet.

Do NOT merge — awaiting security-reviewer pass; PR-B follows.

🤖 Generated with Claude Code

The security-critical core for the "Apply Fix" feature: structured remediation params, the audited force-plan execution path, and its self-gating handler. No UI, no MCP exposure, no mutating caller; PR-B wires the gated UI later.

Shared libs:
- RemediationAction / ForcePlanTarget (PerformanceMonitor.Analysis): typed, data-only payload. FactRemediation refactored to extract once (ExtractPlanRegressionTargets) + BuildAction; the rendered preview is byte-for-byte unchanged (golden test).
- AlertContext round-trips RemediationAction in the existing contextJson; legacy contexts without it deserialize to null (no Apply). Populated in FindingMessageFormatter.BuildContext.

Dashboard execution core (dead code until PR-B):
- IRemediationExecutor seam + IRemediationHandler + RemediationHandlerRegistry + one ForcePlanHandler (PLAN_REGRESSION).
- DatabaseService.Remediation.cs: structured sp_query_store_force_plan / unforce with typed BigInt params; DB applied only as InitialCatalog via SqlConnectionStringBuilder. The authoritative gate (DB_NAME assert + HAS_PERMS_BY_NAME ALTER + freshness) and the EXEC run on ONE open connection (R2-MOD-1). No elevation, existing monitoring connection only; has_alter=0 fails closed with grant guidance.
- Audit-table-absent is a HARD BLOCK before any mutation (R2-MOD-2). Every apply/unapply attempt writes config.remediation_action_log.

Schema: upgrades/2.11.0-to-2.12.0/02_create_remediation_action_log.sql (config schema, idempotent). Coupled to the 2.12.0 schema upgrade.

Tests: golden render-stability, serializer round-trip + legacy-null, faked-executor handler gate (fail-closed perms, absent-table block, freshness/skip, per-target independence, gate re-derivation, applied-but-unlogged), un-apply restriction, and an A4 no-caller guard.

Plan: C:\Users\edarl\.claude\plans\b3-phase1-implementation.md

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>


Real-server verification on sql2022 caught that a least-privilege login (no ALTER, no VIEW DATABASE STATE) hit error 297 reading sys.query_store_plan / sys.database_query_store_options inside the gate query, so the intended clean "PermissionDenied + grant guidance" never surfaced (it came back as a wrong-DB Blocked with a null gate row instead).

Fix: split the per-target gate into two reads on the SAME open connection (R2-MOD-1 preserved):
1. ReadGateIdentityAsync: DB_NAME / SUSER_SNAME / HAS_PERMS_BY_NAME / @@spid (always-accessible intrinsics). DB-match (A5) and ALTER are checked here.
2. ReadQueryStoreStateAsync: qs_state / plan_present / is_forced / force_failure_count, run only after the ALTER check passes (and, in preflight, only when ALTER is held and the DB matches).

So has_alter=0 fails closed with the map-then-grant message without ever touching Query Store catalog views the login can't read.

Verified on sql2022: NO-ALTER scoped login -> PermissionDenied (login surfaced), no mutation; force/unforce/audit and the single-connection gate (gate SPID == exec SPID) still pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>


…r findings)

Two bugs the faked-executor unit tests could not reach, caught by real-server verification on sql2022:
1. HAS_PERMS_BY_NAME form. The plan's O5 specified HAS_PERMS_BY_NAME(NULL, NULL, 'ALTER'), but that server-scoped form returns NULL even for a sysadmin (verified on sql2022), so the gate would fail closed for EVERY login and Apply could never run. Switched to the DB-scoped form HAS_PERMS_BY_NAME(DB_NAME(), 'DATABASE', 'ALTER'), which returns 1 for a principal holding ALTER on the connected DB (sysadmin / db_owner / granted) and 0 otherwise. That is the permission sp_query_store_force_plan actually needs. DB_NAME() (not a literal) stays correct after the catalog retarget.
2. UnforcePlanAsync delegated to ForceOrUnforceAsync with isUnforce: false, so un-apply would re-force instead of unforce. Corrected to isUnforce: true.

Verified end-to-end on sql2022 (throwaway harness, not committed):
- force -> is_forced_plan=1 + audit force/success (executing_login=sa);
- single connection for gate+EXEC (gate @@spid == exec @@spid, e.g. 72==72);
- unforce -> is_forced_plan=0 + audit unforce/success;
- audit-table-absent -> Blocked, no mutation;
- scoped login without ALTER -> PermissionDenied, fail closed, no mutation.

config.remediation_action_log rows confirmed on the server.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…eview LOW-1)

The PR-A no-caller guard greps only ForcePlanAsync/UnforcePlanAsync, but PR-B reaches the privileged executor via the handler/registry types (registry.TryGet(...).ApplyAsync()) without ever typing those method names, so the guard that protects PR-A's "not UI-reachable" invariant would pass a PR-B wiring silently. Extend the guard to the whole machinery (RemediationHandlerRegistry / DatabaseServiceRemediationExecutor / ForcePlanHandler / IRemediationExecutor / IRemediationHandler) so it actually fires when a future surface wires it in. Still green today (nothing outside the core references these).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

erikdarlingdata and others added 4 commits May 30, 2026 19:37

erikdarlingdata merged commit 7432f45 into dev May 31, 2026
2 checks passed

erikdarlingdata deleted the feature/b3-pr-a-apply-fix-core branch May 31, 2026 14:18

erikdarlingdata mentioned this pull request May 31, 2026

B3 PR-B: approval-gated Apply Fix UI over the privileged core #1033

Merged
