
feat(v1.100 PR-26): code-D post-restore evidence record (§39.3 / §48.6 lock)#517

Merged
itcmsgr merged 2 commits into main from
feat/v1.100-pr26-code-d-evidence-record
Apr 28, 2026

@itcmsgr itcmsgr commented Apr 28, 2026

PR-26-code-D adds post-restore evidence recording

After every --mode=restore execution, the dispatcher writes a structured JSON evidence record under /var/lib/nftban/state/restore-evidence/ so the post-mutation state is captured truthfully and machine-auditably without consulting private memory or live host introspection.

Key guarantees

  • Single evidence writer helper: writeRestoreEvidenceRecord(ctx, exec, record, log). Exactly one WriteFileAtomic call in restore_evidence.go; structurally enforced by G4-RESTORE-EVIDENCE-RECORD.
  • Evidence path locked to /var/lib/nftban/state/restore-evidence/ via the restoreEvidenceDir named constant. No write outside this directory is permitted.
  • Schema version "1.0.0" pinned by restoreEvidenceSchemaVersion. Filename: restore-evidence-<UTC-RFC3339-basic>-<short-random-hex>.json.
  • No update-history.json writes. §19.2 layer-4 invariant retained — main.go:132 mode-gate untouched. File-scan test TestDispatcher_PR26D_NoUpdateHistoryWrite_FileScan pins this.
  • No decision re-entry. Recording-only invariant (operator design call): no restore.Decide, no restore.PlanFromDecision, no uninstall.Probe, no detect.DetectPanel. Structurally enforced by both the file-scan test and G4-RESTORE-EVIDENCE-RECORD.
  • No validator / module-health sweep. Evidence is recorded from values the dispatcher already has (target / execRes / VerifyResult) plus a small set of read-only kernel/service queries (NftTableExists, detect.SSHPortWithSource).
  • Evidence write failure after StateRestoreExecuted downgrades to StateRestoreDegraded. The state model already supports StateRestoreDegraded (no machine change). Successful restore that cannot be recorded is success-with-warnings, not clean success. Pinned by TestRunRestoreExecutionFromProceed_PR26D_ExecutedPlusEvidenceFail_DowngradesToDegraded.
  • Evidence write failure after non-success restore terminals is warning-only. StateRestoreFailedExecution and StateRestoreFailedVerification terminals are preserved unchanged; evidence failure is logged but does not alter the outcome. Pinned by tests #2 and #3 of the dispatcher semantics suite.
  • G4-RESTORE-EVIDENCE-RECORD structural gate added (§46). Pins the named-constant + single-helper invariant + recording-only forbidden-symbol scan + dispatcher-consumption check.

Authority

Behavior delta

| Outcome | Before (PR-26-code-C) | After (PR-26-code-D) |
| --- | --- | --- |
| StateRestoreExecuted + clean write | terminal preserved | terminal preserved + evidence file written |
| StateRestoreExecuted + write failure | terminal preserved (no evidence) | downgraded to StateRestoreDegraded with warning |
| StateRestoreFailedExecution + any write outcome | terminal preserved | terminal preserved (evidence failure warning-only) |
| StateRestoreFailedVerification + any write outcome | terminal preserved | terminal preserved (evidence failure warning-only) |
| update-history.json | unchanged for restore mode | unchanged for restore mode (file-scan-pinned) |

Schema (§48.6 lock)

{
  "schema_version": "1.0.0",
  "timestamp_utc": "2026-04-28T13:45:00Z",
  "mode": "restore",
  "phase": "post_restore_verify",
  "target": { "kind": "...", "firewall_type": "csf", "panel": "..." },
  "result": { "state": "...", "exit_code": 7, "stage": "...", "success": true },
  "verification": {
    "target_firewall_active": true,
    "authority_class": "external",
    "safety_net_removal_safe": true,
    "emergency_table_present_after": false,
    "nftban_tables_present_after": false,
    "ssh_port": 22,
    "ssh_port_source": "ss"
  },
  "history_gate": {
    "update_history_unchanged": true,
    "restore_mode_history_write_forbidden": true
  },
  "warnings": [""]
}

Files changed (6)

  • NEW cmd/nftban-installer/restore_evidence.go — schema types + sentinels + single-helper writer + recording-only assembler
  • NEW cmd/nftban-installer/restore_evidence_test.go — 10 writer/builder/lock-pin tests
  • internal/installer/detect/ssh.go — read-only SSHPortWithSource helper (§51.5-A2)
  • cmd/nftban-installer/restore_decide.go — Step D added (build → write → downgrade-on-failure-after-Executed)
  • cmd/nftban-installer/restore_decide_test.go — 5 existing dispatcher tests updated to inject MockExecutor; 5 new PR26D_* dispatcher semantics tests added
  • .github/workflows/ci-restore-canonization.yml — new G4-RESTORE-EVIDENCE-RECORD structural gate

Out of scope (and untouched)

  • Destructive real-host CSF soak (PR-26-code-E)
  • A.4 cron changes (already shipped in code-C)
  • Executor new mutation methods (Stat is read-only, scoped to this PR)
  • IptablesRuleExists / iptables introspection (Option B lock)
  • main.go history gate / state-machine / exit codes
  • Restore planner / TargetAuthority / PR-24 lattice
  • contract.md
  • Repo hygiene / UX / GOTH / metrics / module cleanup

Lab2 verification (head 8b713085)

  • go build ./... clean
  • go test ./... PASS (full repo, 64 packages)
  • go test -race -count=1 cmd + restore + state + switchop + detect PASS
  • go vet ./... clean
  • go mod tidy no-op
  • 10 evidence-module tests + 5 dispatcher semantics tests = 15 new PR-26-code-D tests, all PASS
  • All 3 G4 gates (NO-OUT-OF-TARGET + CRON-MANIFEST-INTEGRITY + EVIDENCE-RECORD) local replay: FAIL=0

Test plan

  • Restore Canonization Gate matrix (ubuntu-24.04 + almalinux-9 + summary) green
  • G4-RESTORE-EXEC-NO-OUT-OF-TARGET green
  • G4-RESTORE-CRON-MANIFEST-INTEGRITY green
  • G4-RESTORE-EVIDENCE-RECORD (NEW) green
  • G4-RESTORE-NO-IMPLICIT-EXEC green
  • Architecture Policy / Policy Gates green
  • Go Build & Test + race + full DEB+RPM matrix + CodeQL / Semgrep / Secure Go / OSV / Gitleaks / GitGuardian green
  • ShellCheck / Bash Validation / Docs Quality green

No extra labs needed for code-D — GitHub CI carries the matrix; audit any failures before merge.

🤖 Generated with Claude Code

itcmsgr and others added 2 commits April 28, 2026 16:58
…6 lock)

PR-26-code-D — restore verification / evidence hardening, slice D.
Adds the structured post-restore evidence-record writer per §39.3
+ §48.6 operator lock. Recording-only — does NOT re-run PR-24
decisions, rebuild TargetAuthority, or add validator/module-health
probes (operator design call).

Authority:
- PR #512 / contract.md Part IV §§37-50
- PR #513 / §51 lock record
- PR #514 / code-A merge 4e98ff5
- PR #515 / code-B merge 45fc63e
- PR #516 / code-C merge 6d8386d
- §39 Q1 BLOCKING evidence rows
- §39.3 evidence-record file requirement
- §46 CI gate requirements
- §48.6 (operator-locked at this commit's open):
    - path:    /var/lib/nftban/state/restore-evidence/
    - filename: restore-evidence-<UTC-RFC3339-basic>-<short-random>.json
    - schema:   1.0.0
    - writer helper: writeRestoreEvidenceRecord(ctx, exec, record)
    - path constant: restoreEvidenceDir
- §51.5-A2 (read-only typed introspection outside mutation cap)

Files added (2):

cmd/nftban-installer/restore_evidence.go
- Constants:
    restoreEvidenceSchemaVersion  = "1.0.0"
    restoreEvidenceDir            = "/var/lib/nftban/state/restore-evidence"
    restoreEvidenceFilenamePrefix = "restore-evidence-"
    restoreEvidenceMode           = 0o640
    restoreEvidenceDirMode        = 0o750
- Schema types: RestoreEvidenceRecord (schema_version, timestamp_utc,
  mode, phase, target, result, verification, history_gate, warnings) +
  the 4 nested structs.
- Sentinels: ErrEvidenceWriteFailed, ErrEvidenceNilExecutor,
  ErrEvidenceNilRecord.
- writeRestoreEvidenceRecord — the SINGLE helper. MkdirAll, marshal,
  WriteFileAtomic. Filename: prefix + UTC RFC3339-basic stamp +
  "-" + 8-hex random suffix + ".json".
- buildRestoreEvidenceRecord — recording-only assembler. Sources:
  target.Kind/FirewallType/Panel, execRes.Terminal/Stage/VerifyResult,
  exec.NftTableExists for emergency + nftban tables, detect.SSHPortWithSource.
  No re-derivation; no Probe / Decide / DetectPanel calls.
- evidenceShortRandom — crypto/rand-backed 8-hex suffix to avoid
  same-second filename collisions.

cmd/nftban-installer/restore_evidence_test.go
- 10 tests:
    1. WriteRestoreEvidence_HappyPath — filename pattern + single write
    2. WriteRestoreEvidence_RoundTripsJSON — schema_version + mode +
       phase + history_gate flags
    3. WriteRestoreEvidence_NilExecutor — defensive guard
    4. WriteRestoreEvidence_NilRecord — defensive guard
    5. WriteRestoreEvidence_OnlyHelperWritesUnderEvidenceDir_FileScan —
       single-WriteFileAtomic invariant
    6. WriteRestoreEvidence_NoForbiddenSurfaces_FileScan —
       recording-only invariant pin
    7. BuildRestoreEvidenceRecord_RecordedPriorHappy — full happy
       path with ss-listener SSH port resolution
    8. BuildRestoreEvidenceRecord_NftbanTablesPresent_Recorded —
       post-mutation kernel observation
    9. BuildRestoreEvidenceRecord_AuthorityClassDivergenceWarning —
       ObservedAuthority diverging from AuthorityExternal surfaces
       in warnings
    10. RestoreEvidenceConstants_LockPin — §48.6 path/version/prefix
        pinned exactly

Files modified (4):

internal/installer/detect/ssh.go
- Added detect.SSHPortWithSource (read-only). Same 4-source priority
  chain as detect.SSHPort but also returns the source name (ss /
  sshd_config / state / config) — required by the §48.6 schema's
  ssh_port_source enum. Per §51.5-A2 outside the mutation cap.
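
A priority chain that also reports which source answered can be sketched generically. The example assumes the sources are tried in the order listed above and uses stub lookups; portProbe and firstPort are hypothetical names, and the real detect.SSHPortWithSource signature is not shown in this PR text.

```go
package main

import "fmt"

// portProbe pairs a source label with a lookup that may or may not find a port.
type portProbe struct {
	source string
	lookup func() (int, bool)
}

// firstPort returns the first valid port in priority order,
// together with the label of the source that produced it.
func firstPort(probes []portProbe) (int, string, bool) {
	for _, p := range probes {
		if port, ok := p.lookup(); ok && port > 0 && port <= 65535 {
			return port, p.source, true
		}
	}
	return 0, "", false
}

func main() {
	miss := func() (int, bool) { return 0, false }
	probes := []portProbe{
		{source: "ss", lookup: miss}, // stub: no listener found
		{source: "sshd_config", lookup: func() (int, bool) { return 2222, true }},
		{source: "state", lookup: miss},
		{source: "config", lookup: func() (int, bool) { return 22, true }},
	}
	port, source, ok := firstPort(probes)
	fmt.Println(port, source, ok)
}
```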

cmd/nftban-installer/restore_decide.go
- runRestoreExecutionFromProceed gains a Step D (between Execute
  and Transition):
    1. buildRestoreEvidenceRecord(target, execRes)
    2. writeRestoreEvidenceRecord(ctx, exec, rec, log)
- §48.6 downgrade rule: if evidence-write fails AFTER a successful
  StateRestoreExecuted, downgrade to StateRestoreDegraded
  (state.machine.go:152 already supports this terminal). The state
  model supports the downgrade; no contract amendment needed.
- Operator-facing log line on Degraded now includes the evidence-
  write failure reason.
- No state-machine / exit-code / history-gate change. main.go:132
  mode-gate untouched.

cmd/nftban-installer/restore_decide_test.go
- TestRunRestoreExecutionFromProceed_FakeDeps_HappyPath_PersistsExecuted
  + 4 other dispatcher tests updated: pass executor.NewMockExecutor()
  instead of nil so the new evidence-write step succeeds and the
  terminal stays at StateRestoreExecuted (fake happy path). The 3
  tests that pass nil exec via _ = runRestoreExecutionFromProceed
  do not assert on sf.State so they still pass under the downgrade.

.github/workflows/ci-restore-canonization.yml
- New gate G4-RESTORE-EVIDENCE-RECORD (§46). Structural — pins the
  named-constant + single-helper invariant:
    * restore_evidence.go declares restoreEvidenceDir,
      restoreEvidenceSchemaVersion, restoreEvidenceFilenamePrefix
      verbatim + locked values
    * restore_evidence.go declares writeRestoreEvidenceRecord +
      buildRestoreEvidenceRecord + RestoreEvidenceRecord struct
    * exactly ONE WriteFileAtomic call in restore_evidence.go
      (the single-helper invariant — locked by §48.6)
    * forbidden-symbol scan: restore.Decide /
      restore.PlanFromDecision / uninstall.Probe / detect.DetectPanel
      / writeHistory / update-history.json / mutation primitives /
      direct OS bypass (recording-only invariant)
    * dispatcher (restore_decide.go) calls BOTH
      writeRestoreEvidenceRecord AND buildRestoreEvidenceRecord
      (proves evidence is consumed, not just imported)
- §46.1 line-skipping discipline applied (production-code-only,
  comment-stripped).
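
The structural checks above can be approximated in a few lines. The actual gate lives in the CI workflow and its implementation is not shown here, so treat this Go version as an illustration only: it drops lines that begin with //, then counts WriteFileAtomic( occurrences and scans for forbidden symbols. The function names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// stripCommentLines drops lines whose first non-space characters are "//",
// so that prose mentioning a forbidden symbol does not trip the scan.
func stripCommentLines(src string) string {
	var kept []string
	for _, line := range strings.Split(src, "\n") {
		if strings.HasPrefix(strings.TrimSpace(line), "//") {
			continue
		}
		kept = append(kept, line)
	}
	return strings.Join(kept, "\n")
}

// checkEvidenceSource returns one message per violated invariant:
// exactly one WriteFileAtomic call, and no forbidden symbols.
func checkEvidenceSource(src string, forbidden []string) []string {
	code := stripCommentLines(src)
	var failures []string
	if n := strings.Count(code, "WriteFileAtomic("); n != 1 {
		failures = append(failures, fmt.Sprintf("want exactly 1 WriteFileAtomic call, found %d", n))
	}
	for _, sym := range forbidden {
		if strings.Contains(code, sym) {
			failures = append(failures, "forbidden symbol present: "+sym)
		}
	}
	return failures
}

func main() {
	src := "// restore.Decide must never be called here\n" +
		"func write() error { return exec.WriteFileAtomic(ctx, p, data, mode) }\n"
	forbidden := []string{"restore.Decide", "uninstall.Probe", "update-history.json"}
	fmt.Println(len(checkEvidenceSource(src, forbidden)), "failures")
}
```

A plain substring count is deliberately crude; it would also count a method definition named WriteFileAtomic, which is acceptable only because the scanned file is expected to contain calls and no definitions.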

Recording-only invariant (operator design call) honored:
- No restore.Decide / restore.PlanFromDecision calls
- No uninstall.Probe call
- No detect.DetectPanel call (only detect.SSHPortWithSource —
  read-only typed introspection)
- No validator full-sweep / module-health probe
- No update-history.json write (§19.2 layer 4 / main.go:132 retained)
- No new mutation primitive

Constraints honored (per operator scope):

IN:
- evidence record type + schema ✓ (§48.6 lock)
- evidence writer helper ✓ (single helper writeRestoreEvidenceRecord)
- production write after restore execution path ✓ (dispatcher Step D)
- structural CI gate G4-RESTORE-EVIDENCE-RECORD ✓
- tests proving all writes stay under restoreEvidenceDir ✓
- tests proving update-history is untouched ✓ (HistoryGate flags +
  no writeHistory references in evidence module)

OUT:
- destructive soak (PR-26-code-E)
- A.4 cron changes (already shipped in code-C)
- executor new mutation methods (Stat is read-only, shipped in code-C)
- iptables introspection (Option B lock)
- main.go history gate changes (untouched)
- state/exit-code changes — only the existing StateRestoreDegraded
  is consumed, no new state added
- repo hygiene / UX / GOTH / metrics / module cleanup

Verified on lab2 (Ubuntu 24.04, go1.22.2):
- go build ./... clean
- go test ./... PASS (full repo, 64 packages)
- go test -race -count=1 cmd + restore + state + switchop + detect PASS
- go vet ./... clean
- go mod tidy no-op
- 10 new TestWriteRestoreEvidence_* / TestBuildRestoreEvidenceRecord_* /
  TestRestoreEvidenceConstants_LockPin tests all PASS
- existing 5 dispatcher fake-deps tests updated + still PASS
- All 3 G4 gates (NO-OUT-OF-TARGET / CRON-MANIFEST-INTEGRITY /
  EVIDENCE-RECORD) local replay: FAIL=0

Awaiting auditor pass before push.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…mantics tests (auditor checkpoint)

Auditor focused-audit on 849b372 flagged that PR-26-code-D's Step D
introduces a real operator-visible terminal transition:

  StateRestoreExecuted + evidence write failure → StateRestoreDegraded

The 10 unit tests already covered the writer + builder + recording
invariants but did NOT pin the dispatcher-level downgrade semantics.
This commit adds 5 dispatcher-level tests to close that gap.

Tests added:

cmd/nftban-installer/restore_decide_test.go

  1. PR26D_ExecutedPlusEvidenceFail_DowngradesToDegraded
     fake deps return StateRestoreExecuted; writeFailExec wrapper
     forces evidence WriteFileAtomic to fail. Asserts:
     - sf.State == StateRestoreDegraded (downgrade fires)
     - exit code == StateRestoreDegraded.ExitCode()
     - sf.State != StateRestoreExecuted (no false claim)
     Note: sf.FailureReason stays empty by design (Transition only
     populates FailureReason on .IsFailed() states; Degraded is
     success-with-warnings). The downgrade reason surfaces via
     log.Result, which is the authoritative operator channel for
     Degraded outcomes.

  2. PR26D_FailedExecutionPlusEvidenceFail_TerminalPreserved
     fake.mutateErr forces FailedExecution; writeFailExec forces
     evidence-write failure. Asserts:
     - sf.State == StateRestoreFailedExecution (terminal preserved)
     - exit == StateRestoreFailedExecution.ExitCode()
     Evidence failure is warning-only on non-Executed terminals.

  3. PR26D_FailedVerificationPlusEvidenceFail_TerminalPreserved
     fake.activeRet=false forces inline-verify SafeToRemove=false →
     FailedVerification; writeFailExec forces evidence-write fail.
     Asserts terminal + exit code unchanged from FailedVerification.

  4. PR26D_ExecutedPlusEvidenceOk_PreservesExecuted
     Plain MockExecutor (writes succeed). Asserts:
     - sf.State == StateRestoreExecuted (no downgrade on clean write)
     - exit == StateRestoreExecuted.ExitCode()
     - exactly one file written under restoreEvidenceDir
     - no writes outside restoreEvidenceDir

  5. PR26D_NoUpdateHistoryWrite_FileScan
     File-scan against restore_decide.go. Strips line-leading // per
     §46.1; asserts no production-code reference to writeHistory(
     or update-history.json. Pins the §19.2 layer-4 invariant stays
     untouched after PR-26-code-D adds Step D.

writeFailExec wrapper (test-only):
  Wraps *executor.MockExecutor and overrides only WriteFileAtomic
  to fail. Avoids changing the production MockExecutor; uses the
  same composition pattern as flakyCSFActiveExec (introduced in
  PR-25 4B-3-csf for analogous test purposes).

Verified on lab2 (Ubuntu 24.04, go1.22.2):
- go build ./... clean
- go test ./cmd/nftban-installer/... PASS
- 5 new TestRunRestoreExecutionFromProceed_PR26D_* /
  TestDispatcher_PR26D_* tests all PASS
- go test -race -count=1 cmd + restore + state PASS
- existing PR-25 + PR-26-code-A/B/C tests still PASS

No production code change. No CI workflow change. No contract
amendment needed. Restore semantics from §48.6 lock + §19.2 layer-4
invariant are both now structurally pinned by tests.

Awaiting auditor sign-off + push signal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Contributor

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

@itcmsgr itcmsgr merged commit 9f148bb into main Apr 28, 2026
63 checks passed
@itcmsgr itcmsgr deleted the feat/v1.100-pr26-code-d-evidence-record branch April 28, 2026 14:34