B3 PR-A: approval-gated Apply Fix — privileged execution core (no UI)#1032
Merged
Conversation
The security-critical core for the "Apply Fix" feature: structured remediation params, the audited force-plan execution path, and its self-gating handler. No UI, no MCP exposure, no mutating caller — PR-B wires the gated UI later. Shared libs: - RemediationAction / ForcePlanTarget (PerformanceMonitor.Analysis): typed, data-only payload. FactRemediation refactored to extract once (ExtractPlanRegressionTargets) + BuildAction; the rendered preview is byte-for-byte unchanged (golden test). - AlertContext round-trips RemediationAction in the existing contextJson; legacy contexts without it deserialize to null (no Apply). Populated in FindingMessageFormatter.BuildContext. Dashboard execution core (dead code until PR-B): - IRemediationExecutor seam + IRemediationHandler + RemediationHandlerRegistry + one ForcePlanHandler (PLAN_REGRESSION). - DatabaseService.Remediation.cs: structured sp_query_store_force_plan / unforce with typed BigInt params; DB applied only as InitialCatalog via SqlConnectionStringBuilder. The authoritative gate (DB_NAME assert + HAS_PERMS_BY_NAME ALTER + freshness) and the EXEC run on ONE open connection (R2-MOD-1). No elevation, existing monitoring connection only; has_alter=0 fails closed with grant guidance. - Audit-table-absent is a HARD BLOCK before any mutation (R2-MOD-2). Every apply/unapply attempt writes config.remediation_action_log. Schema: upgrades/2.11.0-to-2.12.0/02_create_remediation_action_log.sql (config schema, idempotent). Coupled to the 2.12.0 schema upgrade. Tests: golden render-stability, serializer round-trip + legacy-null, faked- executor handler gate (fail-closed perms, absent-table block, freshness/skip, per-target independence, gate re-derivation, applied-but-unlogged), un-apply restriction, and an A4 no-caller guard. Plan: C:\Users\edarl\.claude\plans\b3-phase1-implementation.md Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Real-server verification on sql2022 caught that a least-privilege login (no ALTER, no VIEW DATABASE STATE) hit error 297 reading sys.query_store_plan / sys.database_query_store_options inside the gate query — so the intended clean "PermissionDenied + grant guidance" never surfaced (it came back as a wrong-DB Blocked with a null gate row instead). Fix: split the per-target gate into two reads on the SAME open connection (R2-MOD-1 preserved): 1. ReadGateIdentityAsync — DB_NAME / SUSER_SNAME / HAS_PERMS_BY_NAME / @@spid (always-accessible intrinsics). DB-match (A5) and ALTER are checked here. 2. ReadQueryStoreStateAsync — qs_state / plan_present / is_forced / force_failure_count, run only after the ALTER check passes (and, in preflight, only when ALTER is held and the DB matches). So has_alter=0 fails closed with the map-then-grant message without ever touching Query Store catalog views the login can't read. Verified on sql2022: NO-ALTER scoped login -> PermissionDenied (login surfaced), no mutation; force/ unforce/audit and the single-connection gate (gate SPID == exec SPID) still pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…r findings) Two bugs the faked-executor unit tests could not reach, caught by real-server verification on sql2022: 1. HAS_PERMS_BY_NAME form. The plan's O5 specified HAS_PERMS_BY_NAME(NULL, NULL, 'ALTER') — but that server-scoped form returns NULL even for a sysadmin (verified on sql2022), so the gate would fail closed for EVERY login and Apply could never run. Switched to the DB-scoped form HAS_PERMS_BY_NAME(DB_NAME(), 'DATABASE', 'ALTER'), which returns 1 for a principal holding ALTER on the connected DB (sysadmin / db_owner / granted) and 0 otherwise — the permission sp_query_store_force_plan actually needs. DB_NAME() (not a literal) stays correct after the catalog retarget. 2. UnforcePlanAsync delegated to ForceOrUnforceAsync with isUnforce: false, so un-apply would re-force instead of unforce. Corrected to isUnforce: true. Verified end-to-end on sql2022 (throwaway harness, not committed): force -> is_forced_plan=1 + audit force/success (executing_login=sa); single connection for gate+EXEC (gate @@spid == exec @@spid, e.g. 72==72); unforce -> is_forced_plan=0 + audit unforce/success; audit-table-absent -> Blocked, no mutation; scoped login without ALTER -> PermissionDenied, fail closed, no mutation. config.remediation_action_log rows confirmed on the server. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…eview LOW-1) The PR-A no-caller guard greps only ForcePlanAsync/UnforcePlanAsync, but PR-B reaches the privileged executor via the handler/registry types (registry.TryGet(...).ApplyAsync()) without ever typing those method names — so the guard that protects PR-A's "not UI-reachable" invariant would pass a PR-B wiring silently. Extend the guard to the whole machinery (RemediationHandlerRegistry / DatabaseServiceRemediationExecutor / ForcePlanHandler / IRemediationExecutor / IRemediationHandler) so it actually fires when a future surface wires it in. Still green today (nothing outside the core references these). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
PR-A — Approval-gated "Apply Fix": privileged execution core (no UI)
PR-A of B3: the security-critical core that mutates a monitored, likely-production SQL Server via
sp_query_store_force_plan. No UI, no MCP exposure, no mutating caller — PR-B adds the gated Apply button over this already-reviewed core. Do not merge until the security-reviewer pass on this code.Plan (spec):
C:\Users\edarl\.claude\plans\b3-phase1-implementation.md. Baselineorigin/devb404c76. Three commits: core, gate-split, real-server fixes.The six security invariants and how each is enforced
sp_query_store_force_plan/unforce_planis issued with typed@query_id/@plan_idasSqlDbType.BigInt; never the rendered SQL text.databaseis applied only asInitialCatalog, built solely throughSqlConnectionStringBuilder(never concatenated). No generalExecute(sql)exists. →DatabaseService.Remediation.cs.ForcePlanAsync/UnforcePlanAsyncopen one retargetedSqlConnectionand run, on that same open connection with no re-open: identity/permission gate (DB_NAME()==targetassert (A5) + DB-scoped ALTER check), then the Query Store freshness read, then — only if all pass — the EXEC. The outcome carries the@@SPIDat the gate read and at the EXEC; the real-server harness asserts they are equal (observed72==72).ApplyAsynctakes no preflight disposition and re-derives its own gate — unbypassable.SecureString/SqlCredential, no elevated path.has_alter=0fails closed with map-then-grant guidance.used_elevated_credalways0.OBJECT_ID('config.remediation_action_log') IS NULL(monitoring connection) blocks every target with no mutation attempted.config.remediation_action_logrow per attempt (success/skip/error/abort), after both apply and unapply, on the monitoring connection.Real-server findings — two bugs the faked-executor unit tests could not reach
Caught by running the actual
DatabaseServiceRemediationExecutor+ForcePlanHandleragainst sql2022 (the fakedIRemediationExecutorcan't exercise live T-SQL):HAS_PERMS_BY_NAME(NULL, NULL, 'ALTER'), but that server-scoped form returns NULL even for sysadmin (verified on sql2022) — so the gate would fail closed for every login and Apply could never run. Fixed to the DB-scoped formHAS_PERMS_BY_NAME(DB_NAME(), 'DATABASE', 'ALTER')(1 for ALTER-holders incl. sysadmin/db_owner, 0 otherwise).DB_NAME()keeps it correct after the catalog retarget.UnforcePlanAsyncdelegated withisUnforce: false(would re-force on un-apply); corrected toisUnforce: true.has_alter=0fails closed before any Query Store catalog read (a least-privilege login lacks VIEW DATABASE STATE and would otherwise error 297 readingsys.query_store_plan). Identity+ALTER use only always-accessible intrinsics; the QS read runs only after ALTER passes — on the same open connection (R2-MOD-1 intact).Structured-params persistence
RemediationAction/ForcePlanTarget(.Analysis, incl.LatestCpuPerExecUs/BestCpuPerExecUsfor render stability — M1).FactRemediationrefactored to extract once (ExtractPlanRegressionTargets) +BuildAction; rendered preview byte-for-byte unchanged. Round-trips via an optional member onAlertDetailItem/AlertContextDto/AlertContextSerializerin the existingcontextJson. Old contexts without it → null (no Apply, no crash).Upgrade script
upgrades/2.11.0-to-2.12.0/02_create_remediation_action_log.sql(+upgrade.txt),configschema, idempotent (OBJECT_IDguard). 2.12.0 is the in-flight version (tagv2.11.0, csprojs 2.11.0) so it appends to the existing folder. Noinstall/*.sqledited.Tests (unit)
(cpu/exec … us)lines).has_alter=0fail-closed; audit-table-absent block; already-forced/QS-off/stale/wrong-DB dispositions; per-target independence; gate re-derivation; applied-but-unlogged; per-outcome audit; un-apply restricted to prior B3 forces.ForcePlanAsync/UnforcePlanAsyncreferenced only in the executor seam.Real-server verification (sql2022, Query Store DB)
<Version>/<AssemblyVersion>/<FileVersion>/<InformationalVersion>bump in both installer csprojs — reverted, not in this PR; server is the bare hostnameSQL2022):Existing installation detected: v2.11.0.0→Found 1 upgrade(s)→ applied2.11.0-to-2.12.0(02_create_remediation_action_log.sql - Success), RC=0 "Installation completed successfully". Verified:config.remediation_action_logPRESENT (15 cols), history recorded 2.12.0.0 : SUCCESS. Second run:No pending upgrades found(idempotent).force/success,executing_login=sa,operator_identity=HARNESS\tester—is_forced_plan=1.@@SPID== EXEC@@SPID(72 == 72).unforce/success,executing_login=sa—is_forced_plan=0.Blocked, no mutation.b3_noalter, VIEW DATABASE STATE but no ALTER) →has_alter=False,PermissionDenied, fail closed, no mutation. (Preflight surfaced the login name correctly; the no-ALTER message login string had a harness-logging quirk — the audit table and preflight both record the correct login.)Not UI-reachable
A4 no-caller test confirms nothing in the running app reaches the executor.
DatabaseService.Remediation.*methods areinternal, reachable only viaDatabaseServiceRemediationExecutor.Notes / deviations
ARITHABORT ON) before the gate/EXEC —sp_query_store_force_planerrors 1934 otherwise (Microsoft.Data.SqlClientdefaultsARITHABORT OFF). Same open connection, so R2-MOD-1 holds.RemediationIdentitycarries an optionalSourceAlertRef(audit traceback) — minor superset of the plan's{ OperatorIdentity }.HAS_PERMS_BY_NAMEform (see above); corrected with real-server evidence.Do NOT merge — awaiting security-reviewer pass; PR-B follows.
🤖 Generated with Claude Code