docs(slice-c): plan host intelligence as the slice after Slice B by remyluslosius · Pull Request #417 · Hanalyx/OpenWatch

remyluslosius · 2026-05-28T16:49:32Z

Summary

Planning doc for Slice C (Host Intelligence), the slice that follows Slice B. It closes the visibility gap identified during Slice B's B.1c planning: today we can answer "what's the compliance score" but not "which hosts have package X installed", which asset management (ISO 27001 A.8, FedRAMP CM-8) and vulnerability correlation both require.

Companion to the boundary doc and stage_2_slice_a.md. Project-committed (not gitignored).

Locked design decisions

| Decision | Rationale |
| --- | --- |
| OpenWatch-owned, not a Kensa extension | Boundary doc § 5.2 keeps Kensa pure-compliance |
| Separate scheduled probe, not piggybacking on Kensa scans | Decoupled cadence; failure-independent |
| Storage: write-on-change for state facts, snapshot-based for metrics | Write-on-change is the wrong model for time-series — articulated during B.1c planning |
| Reuses Slice B trunk unchanged | Same scheduler / queue / credential / SSH discipline |
| No on-host agent | SSH + stock OS utilities (`rpm`, `dpkg`, `systemctl`, `ss`, `ip`, `nft`, `getent`, `/proc`, `/sys`) |
| Privacy-first collector design | Explicit per-collector allowlist; no shell history, no `/proc/<pid>/cmdline`, no log forwarding |
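
One way to picture the per-collector allowlist is an exact-match check on the command a collector wants to run. This is only a sketch under stated assumptions, not the collector charter from the doc: the collector names, the specific invocations, and the `is_allowed` helper are all hypothetical.

```python
# Hypothetical per-collector allowlist: each collector may run only the exact
# argv tuples listed here. Collector names and invocations are illustrative
# assumptions, not taken from the planning doc.
COLLECTOR_ALLOWLIST: dict[str, frozenset[tuple[str, ...]]] = {
    "packages": frozenset({("rpm", "-qa"), ("dpkg", "-l")}),
    "services": frozenset({("systemctl", "list-units", "--type=service")}),
    "users": frozenset({("getent", "passwd")}),
}


def is_allowed(collector: str, argv: tuple[str, ...]) -> bool:
    """Return True only if argv is an exact allowlist entry for this collector.

    Unknown collectors have an empty allowlist, so everything is denied.
    """
    return argv in COLLECTOR_ALLOWLIST.get(collector, frozenset())
```

Because the check is exact-match rather than prefix-match, reading shell history or `/proc/<pid>/cmdline` is denied by default: no entry permits it.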

Sub-slices

| Wave | Components |
| --- | --- |
| C.1 — probe trunk | runner + `host_facts` writer + `host_metrics` writer |
| C.2 — core state collectors | packages, services, users, hardware |
| C.3 — network + metrics | network interfaces / routes / firewall + metrics sampler |
| C.4 — visibility surface | read API, fleet rollups, vulnerability-correlation queries |
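
To make the C.4 goal concrete, the "which hosts have package X installed" question could reduce to a lookup like the one below. This is a rough sketch: the `host_facts` table name comes from the doc, but its columns (`host_id`, `fact_type`, `fact_key`, `fact_value`), the sample rows, and both helper functions are assumptions for illustration, and SQLite stands in for whatever database OpenWatch actually uses.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with an assumed host_facts shape and demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE host_facts ("
        "host_id TEXT, fact_type TEXT, fact_key TEXT, fact_value TEXT)"
    )
    conn.executemany(
        "INSERT INTO host_facts VALUES (?, ?, ?, ?)",
        [
            ("web-01", "package", "openssl", "3.0.7"),
            ("web-02", "package", "openssl", "1.1.1k"),
            ("db-01", "package", "postgresql", "15.4"),
            ("web-01", "service", "sshd", "active"),
        ],
    )
    return conn


def hosts_with_package(conn: sqlite3.Connection, package_name: str) -> list[str]:
    """Return the sorted host IDs that report the given package as installed."""
    rows = conn.execute(
        "SELECT DISTINCT host_id FROM host_facts "
        "WHERE fact_type = 'package' AND fact_key = ? "
        "ORDER BY host_id",
        (package_name,),
    ).fetchall()
    return [row[0] for row in rows]
```

With the demo rows, `hosts_with_package(build_demo_db(), "openssl")` returns `["web-01", "web-02"]`.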

14 specs planned, ~150 ACs total. Comparable to Slice A's footprint.

Sequencing

Slice C cannot start until Slice B (B.1 through B.4) ships. Estimated 6-8 weeks once B is complete.

What's in the doc (~380 lines)

- Why this slice exists (with framework citations: ISO 27001 A.8, FedRAMP CM-8, CMMC CA.L2-3.12.4, NIST SP 800-53 CM-8)
- Six locked design decisions with rationale
- Data model: `host_facts`, `host_fact_state`, `host_metrics`, `intel_probes` tables
- Sub-slices (waves) and their order
- Keep/change/drop audit against the Python `system_info/` package
- Spec inventory (the 14 planned specs)
- OpenAPI surface preview (16 new endpoints)
- Privacy and security (collector charter, what we do NOT collect, audit emission)
- Performance budget (≤ 16s wall-clock per probe; ~1 RPC/sec sustained for 1000 hosts)
- Out of scope (auditd forwarding, process monitoring, network traffic analysis, configuration management)
- Six open questions for resolution before C.1 specs land
- Slice B entry criteria
- What "Slice C done" means concretely
- Trade-off note: why two storage shapes (state facts vs metrics)
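
The two storage shapes in the trade-off note can be contrasted with a small in-memory sketch. Everything here is an assumption made for illustration: the `FactStore` and `MetricStore` classes, the digest-based change detection, and the tuple layouts are hypothetical stand-ins for the `host_facts` and `host_metrics` writers, not their actual design.

```python
import hashlib
import json


class FactStore:
    """Write-on-change: a new history row is stored only when the value differs."""

    def __init__(self) -> None:
        self.current: dict[tuple[str, str], str] = {}  # (host, fact_type) -> digest
        self.history: list[tuple[str, str, float, object]] = []

    def record(self, host: str, fact_type: str, value: object, observed_at: float) -> bool:
        # Canonical JSON so that key order does not cause false "changes".
        digest = hashlib.sha256(json.dumps(value, sort_keys=True).encode()).hexdigest()
        key = (host, fact_type)
        if self.current.get(key) == digest:
            return False  # unchanged: nothing written
        self.current[key] = digest
        self.history.append((host, fact_type, observed_at, value))
        return True


class MetricStore:
    """Snapshot-based: every sample is stored, even if the value repeats."""

    def __init__(self) -> None:
        self.samples: list[tuple[str, str, float, float]] = []

    def record(self, host: str, metric: str, value: float, observed_at: float) -> None:
        self.samples.append((host, metric, observed_at, value))
```

Recording the same package list twice leaves one history row, while recording the same CPU reading twice leaves two samples. A repeated metric value is still a data point in a time series, which is why write-on-change would lose information there.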

Open questions surfaced at the bottom

- Probe runner concurrency — does it share Slice B's per-host guard?
- Probe cadence policy shape — global / per-fact-type / per-host?
- Retention defaults — 90d / 1y for facts / metrics?
- Backoff after collection failure — share or separate from scan backoff?
- Idempotency for manual refresh — debounce window?
- Migration of Python-era data — backfill or start fresh?
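
For the manual-refresh question, one possible shape of a debounce window is sketched below. It is purely illustrative and does not answer the open question: the `RefreshDebouncer` class, the per-host keying, and the 60-second window in the usage note are assumptions, and the clock is passed in explicitly so the behavior is easy to test.

```python
class RefreshDebouncer:
    """Accept at most one manual refresh per host within a time window."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._last_accepted: dict[str, float] = {}  # host_id -> timestamp

    def should_enqueue(self, host_id: str, now: float) -> bool:
        """Return True and remember `now` if no refresh was accepted recently."""
        last = self._last_accepted.get(host_id)
        if last is not None and now - last < self.window_seconds:
            return False  # still inside the window: treat as a duplicate
        self._last_accepted[host_id] = now
        return True
```

With a 60-second window, a second refresh for the same host 30 seconds later is dropped, while a refresh for a different host is accepted immediately.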

(Plus one item not yet a question: how intel_schedule policy structurally relates to schedules policy. Worth a § 4 followup once you've reviewed.)

What this PR is NOT

- Not specs yet (those land per-wave once Slice B is in flight or done)
- Not implementation (waits for Slice B to ship)
- Not a roadmap update — happy to follow up if you want `openwatch_roadmap.md` to reference this

Test plan

- Doc renders cleanly in GitHub markdown
- Cross-references resolve (`docs/KENSA_OPENWATCH_BOUNDARY.md`, `stage_2_slice_a.md`)
- User review for content + open questions

Slice C scope: collect package, service, user, network, hardware, and metrics state from hosts via SSH so OpenWatch can answer asset-management and vulnerability-correlation queries.

Closes the visibility gap identified during Slice B planning: today (after Slice A+B) we can answer "what's the compliance score" and "which rules failed", not "which hosts have package X installed".

Architectural decisions locked in this doc:

- OpenWatch-owned, NOT a Kensa extension (boundary doc § 5.2 keeps Kensa pure-compliance)
- Separate scheduled probe, NOT piggybacking on Kensa scans (decoupled cadence; failure-independent)
- Storage: write-on-change for state facts (host_facts), snapshot-based for continuous metrics (host_metrics). Write-on-change is the wrong model for time-series; this split was articulated during B.1c planning
- Reuses Slice B trunk unchanged: scheduler dispatches a new intel_probe job type alongside scan jobs; same SKIP LOCKED, same HMAC payload, same credential resolver, same SSH known_hosts policy
- No on-host agent (SSH + stock OS utilities only)
- Privacy-first collector design: explicit allowlist of files / commands per collector; no shell history, no /proc/*/cmdline, no log forwarding

Sub-slices (waves):

- C.1 probe trunk: runner + host_facts writer + host_metrics writer
- C.2 core state collectors: packages, services, users, hardware
- C.3 network + metrics sampler
- C.4 read API and fleet rollup queries

14 specs planned across all four waves, ~150 ACs (comparable to Slice A's spec footprint).

Sequencing: Slice C cannot start until Slice B (B.1 through B.4) ships. Estimated effort: 6-8 weeks once B is complete.

Six open questions surface at the bottom of the doc for resolution before C.1 trunk specs land.

github-actions Bot added the size/L label May 28, 2026

This was referenced May 28, 2026

feat(scheduler): B.1a — system-scheduler implementation (15/15 ACs, 100%) #418

Merged

feat(transactionlog): B.1c — system-transaction-log-writer (15/15 ACs, 100%) #420

Merged

remyluslosius wants to merge 1 commit into main from docs/slice-c-plan
