feat: v1.98.0 Phase 1 — lifecycle bridge + auto-heal wiring + validate re-check #458
Merged
Wire lifecycle event emission into existing installer phases:
- lifecycleBridge: observes installer decisions, emits lifecycle events
- observeDetect(): records authority + detection at DETECT completion
- observePlan(): records authority action from installer decision
- observeResult(): maps installer StateFile to lifecycle outcome
- v1.96 recovery marker read for last_operation truth

Integration points in runInstall():
- After phaseDetect: emit detect + plan observations
- On phase failure: emit result with failure outcome
- After phaseValidate: emit final result

INV-I-004 ENFORCED: Lifecycle is OBSERVATIONAL ONLY.
The bridge mirrors decisions and does NOT influence installer execution.
Installer logic remains the source of execution truth.

No behavior change. Additive lifecycle logging only.

Contract: V198_INSTALL_CANONIZATION_CONTRACT.md §4.1

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
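To make the observational-only idea concrete, here is a minimal sketch of how a bridge could record events at the integration points listed above without feeding anything back into the installer. The `Event`, `Bridge`, and `Installer` types, the exported method names, and the authority/action strings are all hypothetical; the real lifecycleBridge types are not shown in this PR text.

```go
package main

import "fmt"

// Event is a hypothetical lifecycle record; the real schema is not shown in the PR.
type Event struct {
	Kind   string // "detect", "plan", or "result"
	Detail string
}

// Bridge only appends events. It returns nothing to the caller,
// so installer decisions cannot depend on it (the INV-I-004 idea).
type Bridge struct {
	events []Event
}

func (b *Bridge) ObserveDetect(authority string) {
	b.events = append(b.events, Event{"detect", authority})
}

func (b *Bridge) ObservePlan(action string) {
	b.events = append(b.events, Event{"plan", action})
}

func (b *Bridge) ObserveResult(err error) {
	outcome := "success"
	if err != nil {
		outcome = "failure: " + err.Error()
	}
	b.events = append(b.events, Event{"result", outcome})
}

// Events returns a copy so callers cannot mutate the recorded history.
func (b *Bridge) Events() []Event {
	return append([]Event(nil), b.events...)
}

// Installer stands in for the real phases (hypothetical shape).
type Installer struct {
	Detect   func() (authority, action string, err error)
	Validate func() error
}

// runInstall mirrors the three integration points from the commit message:
// after detect, on phase failure, and after validate.
func runInstall(in Installer, b *Bridge) error {
	authority, action, err := in.Detect()
	if err != nil {
		b.ObserveResult(err) // on phase failure: emit failure outcome
		return err
	}
	b.ObserveDetect(authority) // after phaseDetect: detect + plan observations
	b.ObservePlan(action)

	err = in.Validate()
	b.ObserveResult(err) // after phaseValidate: emit final result
	return err
}

func main() {
	b := &Bridge{}
	in := Installer{
		Detect:   func() (string, string, error) { return "none", "install", nil },
		Validate: func() error { return nil },
	}
	if err := runInstall(in, b); err != nil {
		fmt.Println("install failed:", err)
	}
	for _, e := range b.Events() {
		fmt.Printf("%s: %s\n", e.Kind, e.Detail)
	}
}
```

In this sketch the installer's return value is computed before the bridge is called, and the bridge methods have no return values, which is one simple way to keep the bridge from influencing execution.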
The /usr/sbin/nftban permission drift on DEB (root:root 755 instead of root:nftban 750) is already handled by the existing permissions module:

nftban_permissions_enforce_all() → perms_enforce_sbin()
uses $PERMS_SBIN from NFTBAN_SBIN_DIR (distro-config based, not hardcoded)

Replace hardcoded fix with comment documenting the existing path.

The permissions module (nftban_permissions.sh:230) already:
- uses distro-aware path ($PERMS_SBIN)
- creates nftban group if missing
- sets root:nftban 0750 on /usr/sbin/nftban*

Verified on lab2 (Ubuntu 24.04): nftban health fix permissions correctly fixes the drift via the existing module.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
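As an illustration of the drift described above (root:root 755 observed where root:nftban 750 is expected), the comparison can be sketched as a pure function. This is not the shell permissions module; the `Perm` type and `drift` function are hypothetical names used only for this example.

```go
package main

import (
	"fmt"
	"io/fs"
)

// Perm is a hypothetical description of ownership and mode for one path.
type Perm struct {
	Owner string
	Group string
	Mode  fs.FileMode
}

// drift returns one human-readable finding per attribute that differs
// between the expected and the observed permissions.
func drift(want, got Perm) []string {
	var findings []string
	if want.Owner != got.Owner {
		findings = append(findings, fmt.Sprintf("owner: want %s, got %s", want.Owner, got.Owner))
	}
	if want.Group != got.Group {
		findings = append(findings, fmt.Sprintf("group: want %s, got %s", want.Group, got.Group))
	}
	if want.Mode.Perm() != got.Mode.Perm() {
		findings = append(findings, fmt.Sprintf("mode: want %04o, got %04o",
			uint32(want.Mode.Perm()), uint32(got.Mode.Perm())))
	}
	return findings
}

func main() {
	want := Perm{Owner: "root", Group: "nftban", Mode: 0o750} // expected per the commit message
	got := Perm{Owner: "root", Group: "root", Mode: 0o755}    // drift seen on DEB
	for _, f := range drift(want, got) {
		fmt.Println(f)
	}
}
```

For the DEB case this reports a group finding and a mode finding and no owner finding, since the owner is root in both.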
Add ExecStartPost to nftban-health.service that triggers nftban-health-fix.service after each health check cycle.

This closes the auto-heal gap:
- Health CHECK runs periodically as User=nftban (timer)
- Health CHECK can fix services/nftables (polkit + CAP_NET_ADMIN)
- Health CHECK cannot fix root-owned file permissions
- Health FIX runs as root and CAN fix permissions/ownership
- Previously: health FIX was manual-only, never auto-triggered
- Now: health CHECK triggers health FIX on every cycle

The fix service is idempotent: if no permission issues exist, it completes instantly with no changes. Uses --no-block to avoid blocking the health check timer. The `-` prefix on ExecStartPost makes it non-fatal if the fix service fails or is already running.

Install/update path already calls RunPermissionsEnforce() in phaseValidate, so this only affects the background periodic path.

Contract: V198_INSTALL_CANONIZATION_CONTRACT.md INV-I-010
Evidence: V198_PR08_PR09_PARITY_EVIDENCE.md (DEB-PERM-001)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…-validate

Add VALIDATE_1 → safe auto-fix → VALIDATE_2 flow to phaseValidate:

If initial assertions fail:
1. Log failed assertions (VALIDATE_1)
2. Run 'nftban health fix all' (one-shot, INV-I-012)
3. Re-run assertions (VALIDATE_2, INV-I-013)
4. Only VALIDATE_2 result determines final outcome

This closes the operational gap where install could leave safe-fixable drift (e.g. DEB /usr/sbin/nftban permissions) that would cause DEGRADED when a single auto-fix pass would have corrected it.

The auto-fix runs at most ONCE per install (INV-I-012).
Re-validation is mandatory (INV-I-013).
Only allowlisted safe fixes are applied (INV-I-011).
If VALIDATE_2 still fails → DEGRADED (INV-I-008, no false success).

Contract: V198_INSTALL_CANONIZATION_CONTRACT.md INV-I-010 through INV-I-013

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Dependency Review: ✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found.
Scanned files: None
Blocker #1 (phases.go:295): installer validate called 'health fix all', which runs 9 unbounded steps including disabling UFW/firewalld/fail2ban, triggering rebuild, GeoIP download, and panel enable. Violates INV-I-011 (allowlist scope) and INV-I-006 (authority takeover).

Fix: Replace with 'permissions enforce', which is bounded, safe, and idempotent. It only fixes ownership/mode on NFTBan-managed paths and does not cross authority boundaries or mutate external firewall state.

Blocker #2 (nftban-health.service ExecStartPost): unconditionally triggered 'nftban-health-fix.service' (which runs fix all) on every health check timer cycle. Violates INV-I-012 (one-shot) and ships unbounded root remediation to every host.

Fix: Remove ExecStartPost trigger. Root-level permission fixes now run only during install/update (phaseValidate → permissions enforce) or manual operator invocation. Document the rationale for future bounded safe-fix target.

Audit: V198_FOUNDATION_BATCH_AUDIT.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
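One way to keep a regression like Blocker #1 from reappearing is an explicit allowlist check in front of any fix invocation. The sketch below is an assumption about how such a guard could look, not code from this PR; `safeFixes` and `isSafeFix` are hypothetical names, and the exact argv used by the installer is not shown in the source.

```go
package main

import (
	"fmt"
	"strings"
)

// safeFixes is a hypothetical allowlist in the spirit of INV-I-011.
// Only the bounded 'permissions enforce' fix is listed.
var safeFixes = map[string]bool{
	"permissions enforce": true,
}

// isSafeFix reports whether the given subcommand words form an allowlisted fix.
func isSafeFix(args ...string) bool {
	return safeFixes[strings.Join(args, " ")]
}

func main() {
	fmt.Println(isSafeFix("permissions", "enforce")) // allowlisted
	fmt.Println(isSafeFix("health", "fix", "all"))   // unbounded, rejected
}
```

An exact-match lookup is deliberately strict here: a prefix match would also admit longer commands that merely start with an allowlisted one.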
Policy gate requires FHS spec version to match VERSION file. Regenerated via build/generate-fhs-outputs.sh. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ate module

Add bounded safe fix state machine as a testable module:
- validate.RunWithBoundedFix(): VALIDATE_1 → permissions enforce → VALIDATE_2
- Only calls 'permissions enforce' (INV-I-011), never 'health fix all'
- Fix runs at most once (INV-I-012)
- Only VALIDATE_2 result determines final outcome (INV-I-013)

NB-6 test cases (from V198_PR13_GO_DECISION.md §11):
- Test 1: V1 passes → no fix called → success
- Test 2: V1 fails → fix runs → V2 passes → COMMITTED
- Test 3: V1 fails → fix runs → V2 still fails → DEGRADED
- Test 4: permissions enforce called at most once
- Test 5: no destructive side-effects (no service disable, no package removal)

MockExecutor enhanced with:
- OnCommand(): register callbacks for simulating side-effects
- CommandCalled(): assert command was/wasn't executed
- CommandCallCount(): assert execution count bounds
- Callback firing in Run() for command simulation

Contract: V198_INSTALL_CANONIZATION_CONTRACT.md INV-I-010 through INV-I-013
Audit closure: NB-6 from V198_FOUNDATION_BATCH_AUDIT.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
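The VALIDATE_1 → permissions enforce → VALIDATE_2 state machine and the mock hooks described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual validate package: the `Executor` interface, the `Outcome` constants, the lowercase function names, and the mock's internals are hypothetical, and the sketch reports COMMITTED for both the "V1 passes" and "V2 passes" cases.

```go
package main

import "fmt"

// Outcome is a hypothetical result type; names follow the NB-6 test cases.
type Outcome string

const (
	Committed Outcome = "COMMITTED"
	Degraded  Outcome = "DEGRADED"
)

// fixCommand is the only fix this sketch ever runs (INV-I-011).
const fixCommand = "permissions enforce"

// Executor is a hypothetical abstraction over command execution.
type Executor interface {
	Run(cmd string) error
}

// mockExecutor counts calls and fires registered callbacks from Run.
type mockExecutor struct {
	calls     map[string]int
	callbacks map[string]func()
}

func newMock() *mockExecutor {
	return &mockExecutor{calls: map[string]int{}, callbacks: map[string]func(){}}
}

func (m *mockExecutor) OnCommand(cmd string, fn func()) { m.callbacks[cmd] = fn }

func (m *mockExecutor) CommandCallCount(cmd string) int { return m.calls[cmd] }

func (m *mockExecutor) Run(cmd string) error {
	m.calls[cmd]++
	if fn, ok := m.callbacks[cmd]; ok {
		fn() // simulate the command's side-effect
	}
	return nil
}

// runWithBoundedFix runs assertions, applies the fix at most once (INV-I-012),
// then re-runs assertions; only the second result decides the outcome (INV-I-013).
func runWithBoundedFix(ex Executor, assert func() []string) (Outcome, []string) {
	if failed := assert(); len(failed) == 0 { // VALIDATE_1
		return Committed, nil
	}
	_ = ex.Run(fixCommand) // fix error is not decisive; VALIDATE_2 is
	if failed := assert(); len(failed) > 0 { // VALIDATE_2
		return Degraded, failed
	}
	return Committed, nil
}

func main() {
	m := newMock()
	fixed := false
	m.OnCommand(fixCommand, func() { fixed = true })
	assert := func() []string {
		if fixed {
			return nil
		}
		return []string{"sbin-permissions"} // hypothetical assertion name
	}
	out, failed := runWithBoundedFix(m, assert)
	fmt.Println(out, failed, m.CommandCallCount(fixCommand))
}
```

Because there is no loop around the fix call, the at-most-once bound holds by construction rather than by a counter, which keeps the property easy to assert with `CommandCallCount`.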
Summary
v1.98 Phase 1: Architecture batch (PR-07 through PR-12). No default path change yet.
What this PR adds
Key invariants enforced
Auto-heal gap closed
NOT in this PR (Phase 2, after G2 gate)
Phase 2 requires G2 parity gate on real hosts before proceeding.
Lab Validation
Contract
V198_INSTALL_CANONIZATION_CONTRACT.md (13 invariants, INV-I-001 through INV-I-013)
V198_PR08_PR09_PARITY_EVIDENCE.md (detect + FHS evidence)
Test plan
🤖 Generated with Claude Code