feat(nico): add tenant-transition data sanitization validations (SEC21-04/05/06)#458
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughAdds a NICo sanitization CLI that derives neutral lifecycle tokens from machine status/history and emits per-machine sanitization records; wires a new ChangesTenant-transition sanitization validation (SEC21-04/05/06)
Sequence Diagram(s)sequenceDiagram
participant NicoAPI as NICo API
participant QueryScript as query_sanitization.py
participant ValidationCheck as MemorySanitizationCheck
participant Report as Test Report
QueryScript->>NicoAPI: GET machines (org, site)
NicoAPI-->>QueryScript: machine records (status, statusHistory, capabilities, identity)
QueryScript->>QueryScript: ordered_history_statuses() -> status_token()
QueryScript->>QueryScript: evaluate_transitions() -> (served_tenant, sanitized)
QueryScript->>QueryScript: machine_record() (flags, transitions, firmware)
QueryScript-->>ValidationCheck: emit JSON step_output with machines array
ValidationCheck->>ValidationCheck: validate step_output.success and machines list
ValidationCheck->>ValidationCheck: per-machine evaluate_sanitization()
ValidationCheck->>Report: emit per-machine subtests and aggregated pass/fail summary
🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers:
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
|
/ok to test 3ca8e7c |
🔐 TruffleHog Secret Scan✅ No secrets or credentials found! Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉 🕐 Last updated: 2026-06-09 20:28:10 UTC | Commit: 3ca8e7c |
|
/ok to test 2b005a1 |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@isvctl/tests/test_nico_provider.py`:
- Around line 902-903: The timestamp formatting in the list comprehension that
constructs entries from history_statuses uses f"2026-01-01T00:0{i}:00Z" which
breaks for i >= 10; update the "created" format in that comprehension (the
expression producing {"status": s, "message": "", "created": ...}) to zero-pad
the minute value (e.g., use a two-digit integer format like {i:02d} or
equivalent datetime formatting) so timestamps remain valid ISO 8601 for i >= 10.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: ad35a962-e9f7-4b09-bfdf-059147804aa9
📒 Files selected for processing (8)
isvctl/configs/providers/nico/config/bare_metal.yamlisvctl/configs/providers/nico/scripts/sanitization/query_sanitization.pyisvctl/configs/suites/README.mdisvctl/configs/suites/bare_metal.yamlisvctl/tests/test_nico_provider.pyisvtest/src/isvtest/validations/__init__.pyisvtest/src/isvtest/validations/sanitization.pyisvtest/tests/test_sanitization.py
…1-04/05/06) Add three provider-agnostic checks that assert a host which served a tenant is not returned to the allocatable pool until it passes through a sanitizing lifecycle stage: - MemorySanitizationCheck (SEC21-04): host (RAM) memory sanitization. - GpuMemorySanitizationCheck (SEC21-05): SRAM/GPU memory sanitization, scoped to GPU-equipped hosts. - FirmwareResetCheck (SEC21-06/SEC22): TPM clear and BIOS/UEFI recommit on transition, plus report-only firmware identity. Wire a NICo query_sanitization.py step that maps machine status + statusHistory (Reset -> sanitizing, InUse -> in_use, Ready -> available) into the neutral contract, into the bare_metal suite and NICo provider config. Ships unreleased; released_tests.json intentionally untouched. Closes #312, #313, #314 Signed-off-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Alexandre Begnoche <abegnoche@users.noreply.github.com>
The tenant-transition sanitization checks concatenated every failing machine's full message (incl. its entire state-transition history) into the top-level failure summary, which pytest and the orchestrator then echoed several times -- a wall of text for large fleets. Summarize with a count plus a few offending machine IDs instead; the full per-machine reason (including transitions) is still emitted per subtest and in the JUnit XML, so no diagnostic detail is lost. Update the tests to assert the concise summary plus the detail on the subtest. Signed-off-by: Alexandre Begnoche <abegnoche@nvidia.com>
2b005a1 to
62579f5
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
isvctl/tests/test_nico_provider.py (1)
293-302: ⚡ Quick winKeep API-base contract tests aligned with all configured NICo scripts.
Line 293 and Line 316 add
query_sanitizationbut still omitquery_ib_tenant_isolationand
query_ib_keys, which are configured test steps inbare_metal.yamland also consume--api-base.
Adding them to both parametrizations prevents quiet coverage drift.♻️ Proposed test update
`@pytest.mark.parametrize`( "step_name", [ "verify_ingestion", "check_dpu_health", "query_governance_metrics", "query_host_health", "query_health_aggregation", + "query_ib_tenant_isolation", + "query_ib_keys", "query_sanitization", ], ) @@ `@pytest.mark.parametrize`( ("script_name", "load_script"), [ ("verify_ingestion.py", _load_ingestion_script), ("check_dpu_health.py", _load_dpu_health_script), ("query_metrics.py", _load_governance_metrics_script), ("query_host_health.py", _load_host_health_script), ("query_health_aggregation.py", _load_health_aggregation_script), + ("query_ib_tenant_isolation.py", _load_ib_tenant_isolation_script), + ("query_ib_keys.py", _load_ib_keys_script), ("query_sanitization.py", _load_sanitization_script), ], )Also applies to: 316-325
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@isvctl/tests/test_nico_provider.py` around lines 293 - 302, The parametrized test lists for "step_name" (the pytest.mark.parametrize blocks in isvctl/tests/test_nico_provider.py that include values like "query_sanitization") are missing two configured NICo script steps; update both parametrizations that define "step_name" to include "query_ib_tenant_isolation" and "query_ib_keys" alongside the existing entries (e.g., "verify_ingestion", "check_dpu_health", etc.) so the API-base contract tests remain aligned with bare_metal.yaml.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@isvctl/tests/test_nico_provider.py`:
- Around line 293-302: The parametrized test lists for "step_name" (the
pytest.mark.parametrize blocks in isvctl/tests/test_nico_provider.py that
include values like "query_sanitization") are missing two configured NICo script
steps; update both parametrizations that define "step_name" to include
"query_ib_tenant_isolation" and "query_ib_keys" alongside the existing entries
(e.g., "verify_ingestion", "check_dpu_health", etc.) so the API-base contract
tests remain aligned with bare_metal.yaml.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: c8213171-8cd8-45d6-ab7c-673d41042b49
📒 Files selected for processing (8)
isvctl/configs/providers/nico/config/bare_metal.yamlisvctl/configs/providers/nico/scripts/sanitization/query_sanitization.pyisvctl/configs/suites/README.mdisvctl/configs/suites/bare_metal.yamlisvctl/tests/test_nico_provider.pyisvtest/src/isvtest/validations/__init__.pyisvtest/src/isvtest/validations/sanitization.pyisvtest/tests/test_sanitization.py
✅ Files skipped from review due to trivial changes (1)
- isvctl/configs/suites/README.md
🚧 Files skipped from review as they are similar to previous changes (4)
- isvtest/src/isvtest/validations/init.py
- isvctl/configs/suites/bare_metal.yaml
- isvtest/src/isvtest/validations/sanitization.py
- isvtest/tests/test_sanitization.py
|
/ok to test 62579f5 |
Signed-off-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Alexandre Begnoche <abegnoche@users.noreply.github.com>
|
/ok to test 02c5639 |
Summary
Adds three provider-agnostic validations for the "Data Sanitization" requirement (SEC21/SEC22), wired for NICo via the
bare_metalsuite. Each asserts the same provider-neutral invariant: a host that has served a tenant must pass through a dedicated sanitizing lifecycle stage before it becomes allocatable to a new tenant again.MemorySanitizationCheck(SEC21-04): host (RAM) memory is sanitized between tenants.GpuMemorySanitizationCheck(SEC21-05): SRAM/GPU memory is sanitized between tenants (scoped to GPU-equipped hosts).FirmwareResetCheck(SEC21-06 / SEC22): TPM is cleared and BIOS/UEFI is recommitted during tenant transitions or hardware replacement, plus report-only firmware identity (vendor / product / BIOS version).The checks fail a host that went from
in_useback toavailablewithout an interveningsanitizingstage, or that is offered to new tenants while still bound to a prior tenant.How it maps to NICo
NICo runs its cleanup/sanitization workflow between tenants (host/RAM cleanup + UEFI
MemoryOverwriteRequestControl, NVMe/HDD secure erase, InfiniBand cleanup, TPM clear, BIOS/UEFI recommit). At the REST level this whole workflow is the machineResetstatus — a released host movesInUse -> Reset -> ... -> Readyand must not return toReadyuntil cleanup completes (per infra-controllerdocs/operations/tenant-lifecycle-cleanup.mdand the managed-host state machine).query_sanitization.pyreads each machine'sstatus+statusHistoryand maps the NICo lifecycle into a provider-neutral token sequence (Reset -> sanitizing,InUse -> in_use,Ready -> available), computingserved_tenant,sanitized,available, andstale_tenant_binding, plushas_gpuand firmware identity. The mapping is robust to truncated history: a violation is only flagged on positive evidence of anin_use -> availabletransition without an interveningsanitizing.Changes
isvtest/src/isvtest/validations/sanitization.py: new validations (shared_TenantSanitizationCheckbase + the three checks).isvtest/src/isvtest/validations/__init__.py: register/export the three checks.isvctl/configs/providers/nico/scripts/sanitization/query_sanitization.py: NICo step that emits the neutral per-machine contract.isvctl/configs/suites/bare_metal.yaml: addmemory_sanitization,gpu_memory_sanitization,firmware_resetvalidation groups (only run for providers that implement thequery_sanitizationstep).isvctl/configs/providers/nico/config/bare_metal.yaml: wire thequery_sanitizationstep.isvctl/configs/suites/README.md: document the step's key JSON fields.isvtest/tests/test_sanitization.py+isvctl/tests/test_nico_provider.py: unit tests for the validations and the NICo script (token mapping, history ordering, gate logic, record building, and end-to-end script → validation contract).isvtest/src/isvtest/released_tests.jsonintentionally untouched (new checks land in a separate release commit; exercise withISVTEST_INCLUDE_UNRELEASED=1).Validation
Exercised end-to-end through the
isvctlorchestrator with an echo-based step (no cloud credentials needed): a clean fleet passes all three checks and a host that wentin_use -> available(skippingsanitizing) fails all three with an actionable message.sec21_sanitization_e2e.log
Programmatic checks:
make test(964 + 68 passed),make lint(clean),make demo-test(clean),uvx pre-commit run -a(clean).Issues
Closes #312
Closes #313
Closes #314
To show artifacts inline, enable in settings.
Summary by CodeRabbit
New Features
Tests
Documentation