diff --git a/.gitignore b/.gitignore index 0917ea5..bebf72a 100644 --- a/.gitignore +++ b/.gitignore @@ -17,3 +17,5 @@ env/ dist/ build/ *.egg-info/ + +.claude/scheduled_tasks.lock diff --git a/.specify/feature.json b/.specify/feature.json index 8c64834..a6b33c4 100644 --- a/.specify/feature.json +++ b/.specify/feature.json @@ -1,3 +1,3 @@ { - "feature_directory": "specs/007-log-attachment-offsets" + "feature_directory": "specs/008-event-ingestion-follow" } diff --git a/CLAUDE.md b/CLAUDE.md index f1ea727..6915941 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -1,7 +1,7 @@ For additional context about technologies to be used, project structure, shell commands, and other important information, read the current plan: -`specs/007-log-attachment-offsets/plan.md`. +`specs/008-event-ingestion-follow/plan.md`. # AgentTower Agent Context diff --git a/docs/architecture.md b/docs/architecture.md index 6770057..bea2bbd 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -108,6 +108,44 @@ Opensoft namespace: Bench containers mount the daemon socket and, preferably, the state log directory. No network listener is required for MVP. +### 4.1 UID-mapping invariant for bench containers + +The host daemon reads pane log files written by in-container `tmux +pipe-pane` processes via the bind mount. For the daemon's read to +succeed, **the UID that wrote the log file must be reachable from the +daemon's UID** — i.e., the host UID and the in-container UID must be +the same value (or the host daemon must be able to read files owned +by the mapped UID). + +Two supported configurations: + +1. **No userns remap** (default): the bench container runs with + `--user $(id -u):$(id -g)` so the in-container process and the host + daemon share the same UID. The daemon reads its own files; SO_PEERCRED + on the mounted socket sees the same UID on both sides. This is the + recommended setup. + +2. 
**Userns remap**: if the operator deploys with Docker's + `--userns=remap:` (or `userns_mode=host` with a manual + `--user X:Y` that diverges from the host UID), the in-container + process writes log files with the *remapped host UID* (e.g., + `100000`). The host daemon (running as e.g. UID `1000`) cannot read + these files; the FEAT-008 reader surfaces a per-attachment + `failure_class=PermissionError` (FR-038), but the root cause is + silent at the spec level. + + Operators using userns-remapped containers MUST either: + - Map the in-container UID to the daemon's host UID (e.g., add + `userns-remap` config that rewrites `0` → daemon-UID), OR + - Run the daemon as `root` (NOT recommended — undermines the + local-first least-privilege constitution). + +The `agenttower config doctor` command (FEAT-005) currently checks +socket reachability but **does not yet** check log-file readability; +operators encountering EACCES errors should manually verify +`stat $(ls ~/.local/state/opensoft/agenttower/logs/*/*.log | head -1)` +matches the daemon's UID. + ## 5. Components ### 5.1 Host Daemon @@ -414,6 +452,12 @@ Classification is rule-based for MVP. It should be conservative and transparent. The daemon may emit uncertain events, but it must not turn uncertain classification into automatic command execution. +The pipeline above is implemented in FEAT-008. See +`specs/008-event-ingestion-follow/plan.md` for the implementation +detail (atomic SQLite + offset commit, FR-029 JSONL retry watermark, +FR-040 buffered-retry on degraded SQLite, FR-014 debounce semantics, +FR-013 / FR-016 synthesized event types). + ## 14. Routing Model AgentTower routes structured messages between agents. 
The main MVP routing path diff --git a/docs/mvp-feature-sequence.md b/docs/mvp-feature-sequence.md index a98c0b9..c06007f 100644 --- a/docs/mvp-feature-sequence.md +++ b/docs/mvp-feature-sequence.md @@ -226,6 +226,14 @@ Out of scope: ## FEAT-008: Event Ingestion, Classification, and Follow CLI +**Status: implemented.** See +`specs/008-event-ingestion-follow/plan.md` for the implementation +record. Acceptance items below are tested by integration tests under +`tests/integration/test_events_us{1..6}*.py` plus +`test_lifecycle_separation.py`. Carry-over obligations from FEAT-007 +(T175 truncation, T176 recreation, T177 round-trip) land in +`test_events_us4_carryover.py`. + Goal: convert pane logs into durable, inspectable AgentTower events. Build: diff --git a/pyproject.toml b/pyproject.toml index 826807b..612b435 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -18,7 +18,14 @@ agenttower = "agenttower.cli:main" agenttowerd = "agenttower.daemon:main" [project.optional-dependencies] -test = ["pytest>=7"] +test = [ + "pytest>=7", + # FEAT-008 T005: JSON Schema validator for the FR-027 / FR-032 + # event schema. Test-only — runtime stays stdlib-only per the plan. + # Upper-bound pinned to <5 so a major-version bump can't silently + # break CI (review CRIT #2). + "jsonschema>=4,<5", +] [tool.hatch.build.targets.wheel] packages = ["src/agenttower"] diff --git a/specs/008-event-ingestion-follow/checklists/carryover.md b/specs/008-event-ingestion-follow/checklists/carryover.md new file mode 100644 index 0000000..ba3c33c --- /dev/null +++ b/specs/008-event-ingestion-follow/checklists/carryover.md @@ -0,0 +1,73 @@ +# FEAT-007 Carry-Over Integration Requirements Checklist: Event Ingestion, Classification, and Follow CLI + +**Purpose**: Validate that the FEAT-007 carry-over obligations (T175/T176/T177, file-change classification, no-replay invariant, lifecycle separation) are complete, clear, consistent, and measurable. 
This checklist tests the **requirements writing**, not the implementation. +**Created**: 2026-05-10 +**Feature**: [spec.md](../spec.md) +**Depth**: Formal release gate + +## Requirement Completeness + +- [ ] CHK001 Are integration test deliverables (T175 truncation, T176 recreation, T177 missing→recreated→re-attach) explicitly required as in-scope for FEAT-008? [Completeness, Spec §FR-043] +- [ ] CHK002 Is the "no-replay invariant" defined precisely enough to be the form of an assertion (e.g., "no event whose `byte_range_start` is below the post-reset offset")? [Completeness, Spec §FR-043] +- [ ] CHK003 Is the requirement to call `reader_cycle_offset_recovery` "exactly once per cycle BEFORE reading bytes" testable at the unit level (call-count assertion)? [Completeness, Spec §FR-002, FR-041] +- [ ] CHK004 Are FR-004's prohibitions on production use of `advance_offset_for_test` enforced by an existing AST gate that this feature must continue to pass? [Completeness, Spec §FR-004, SC-008] +- [ ] CHK005 Are requirements covering the file-change classifier obligation (`detect_file_change`) explicit about the prohibition on re-implementation? [Completeness, Spec §FR-042] +- [ ] CHK006 Is the dispatcher mapping (`unchanged | truncated | recreated | missing | reappeared`) referenced in the FEAT-008 reader requirements as the canonical taxonomy? [Completeness, Spec §FR-002, FR-041] +- [ ] CHK007 Is the optional consolidated lifecycle-surface assertion (FR-044) defined with a clear scope vs the dedicated per-class FEAT-007 tests it consolidates? [Completeness, Spec §FR-044] +- [ ] CHK008 Are requirements defined for the audit-row append (`log_attachment_change`) idempotence under retry? [Completeness, Gap] + +## Requirement Clarity + +- [ ] CHK009 Is "≤ 1 reader cycle (≤ 1 s wall-clock at MVP scale)" measurable with a deterministic injected test clock (no real-time sleeps in tests)? 
[Clarity, Spec §FR-043] +- [ ] CHK010 Is "no durable event whose excerpt comes from pre-reset bytes" precisely defined for the truncate-then-write-same-bytes case (excerpt-content vs source-byte-range distinction)? [Clarity, Spec §FR-043, US4 AS1] +- [ ] CHK011 Is "operator-explicit re-attach" distinguished from automatic recovery in the requirements (which path applies in which scenario)? [Clarity, Spec §US4 AS5] +- [ ] CHK012 Is "delegating to `reader_cycle_offset_recovery`" in FR-023 specific enough to prevent the reader from inlining or duplicating the helper's logic? [Ambiguity, Spec §FR-023] + +## Requirement Consistency + +- [ ] CHK013 Does FR-041's helper-ownership claim agree with FR-003's prohibition on direct `log_attachments` / `log_offsets` row mutation? [Consistency, Spec §FR-003, FR-041] +- [ ] CHK014 Are the row-status transitions referenced in US4 AS3/AS4 consistent with FEAT-007's documented `active → stale → active` state machine? [Consistency, Spec §US4] +- [ ] CHK015 Is the `log_rotation_detected` / `log_file_missing` / `log_file_returned` lifecycle separation in FR-026 consistent with US4's per-scenario "exactly one" emission counts? [Consistency, Spec §FR-026, US4] +- [ ] CHK016 Is the "(suppression-keyed by `(agent_id, log_path, file_inode)`)" rule in US4 AS4 consistent with FEAT-007's documented suppression key shape (FR-061 reference)? [Consistency, Spec §US4 AS4, FR-041] + +## Acceptance Criteria Quality + +- [ ] CHK017 Is SC-004's "T175 promoted to FEAT-008 integration coverage" tied to a specific test-file path or naming convention so it can be located? [Measurability, Spec §SC-004] +- [ ] CHK018 Is SC-006's "100 runs of the integration test" reproducibility assured by a documented seed, clock-injection, or fixed-fixture strategy? [Measurability, Spec §SC-006] +- [ ] CHK019 Are timing assertions for ≤ 1 reader cycle measurable without flakiness on slow CI runners (e.g., logical-clock model rather than wall-clock)? 
[Measurability, Gap] +- [ ] CHK020 Is the assertion "exactly one `log_rotation_detected` lifecycle event" in US4 AS1/AS2 measurable against a deterministic lifecycle-event sink? [Measurability, Spec §US4 AS1, AS2] + +## Scenario Coverage + +- [ ] CHK021 Is the "missing→reappear→re-attach" round-trip required to be covered as a single end-to-end test, not three independent tests? [Coverage, Spec §FR-043, US4 AS5] +- [ ] CHK022 Are requirements specified for a reader that observes RECREATED in the same cycle as a pending event from the previous (now-truncated) inode? [Coverage, Gap] +- [ ] CHK023 Are requirements specified for the case where re-attach succeeds but the file is missing again before the next cycle? [Coverage, Gap] +- [ ] CHK024 Is the deletion → permanent-missing case (no recreation) covered by separate requirements from deletion → recreation? [Coverage, Spec §US4 AS3, AS4] +- [ ] CHK025 Are requirements defined for the no-replay invariant under ALL four file-change kinds (truncated, recreated, missing, reappeared), not just truncated/recreated? [Coverage, Spec §FR-043] + +## Edge Case Coverage + +- [ ] CHK026 Is the case "inode reuse within a short window" (OS-level inode recycling) addressed by the file-change classifier requirements? [Edge Case, Gap] +- [ ] CHK027 Is the case "file size returns to identical pre-truncate value with new bytes" (size-only check would miss this) addressed? [Edge Case, Gap] +- [ ] CHK028 Is the case "MISSING followed by REAPPEARED in adjacent cycles before any operator action" required to emit no durable event? [Edge Case, Spec §Edge Cases] +- [ ] CHK029 Is `log_file_returned` suppression-keyed by `(agent_id, log_path, file_inode)` enforced for the duration of a single stale period (does the same key fire again across stale → active → stale cycles)? 
[Edge Case, Spec §US4 AS4] +- [ ] CHK030 Are requirements defined for the case where FEAT-007 and FEAT-008 disagree on the row's expected state at cycle entry (defensive read)? [Edge Case, Gap] + +## Non-Functional Requirements + +- [ ] CHK031 Is the test-runtime budget for the round-trip integration tests bounded (so SC-006's 100 runs is feasible on CI)? [NFR, Gap] +- [ ] CHK032 Are flake-rate budgets for SC-006's 100 runs documented (e.g., 0% flake target)? [NFR, Gap] + +## Dependencies & Assumptions + +- [ ] CHK033 Is the version-pin of FEAT-007's `reader_recovery` API surface documented (so a FEAT-007 patch cannot silently change FEAT-008 behavior)? [Dependency, Spec §FR-041] +- [ ] CHK034 Is the assumption that FEAT-007 lifecycle suppression (FR-061 reference) is in place documented as an explicit precondition? [Assumption, Spec §FR-041] +- [ ] CHK035 Is the dependency on FEAT-007's audit-row append idempotence documented? [Dependency, Gap] + +## Ambiguities & Conflicts + +- [ ] CHK036 Could FR-044's "MAY add a single integration test" leave the FR-026 lifecycle-separation requirement under-tested if the feature opts not to add the test? [Conflict, Spec §FR-026, FR-044] +- [ ] CHK037 Is "the same byte sequence appears twice in distinct cycles" (Edge Cases) the same scenario as US3 AS2, or a different scenario? [Ambiguity, Spec §Edge Cases, US3 AS2] +- [ ] CHK038 Is FR-018's "if the same pane id reappears later... it counts as a new lifecycle once the attachment is re-bound" precisely defined in terms of which event triggers the new lifecycle counter? [Ambiguity, Spec §FR-018] +- [ ] CHK039 Is "at most one reader cycle" (US4 AS1/AS2) consistent with the sometimes-stricter "one reader cycle" used elsewhere (US4 AS3, AS4)? [Ambiguity, Spec §US4] +- [ ] CHK040 Is the FR-043 "no-replay invariant" requirement scoped to the test suite alone, or also a normative reader-behavior requirement (would the bug be caught outside the named tests)? 
[Ambiguity, Spec §FR-043] diff --git a/specs/008-event-ingestion-follow/checklists/classifier.md b/specs/008-event-ingestion-follow/checklists/classifier.md new file mode 100644 index 0000000..ba7f221 --- /dev/null +++ b/specs/008-event-ingestion-follow/checklists/classifier.md @@ -0,0 +1,70 @@ +# Classifier Rule Catalogue Requirements Checklist: Event Ingestion, Classification, and Follow CLI + +**Purpose**: Validate that classifier rule, redaction, and debounce requirements are complete, clear, consistent, and measurable. This checklist tests the **requirements writing**, not the implementation. +**Created**: 2026-05-10 +**Feature**: [spec.md](../spec.md) +**Depth**: Formal release gate + +## Requirement Completeness + +- [ ] CHK001 Are all 10 event types in the FR-008 catalogue required to have at least one matching rule documented in the rule catalogue? [Completeness, Spec §FR-008] +- [ ] CHK002 Is the rule-priority order required to be expressed in a deterministic, testable form (priority table or ordered list, not prose)? [Completeness, Spec §FR-008, Spec §Edge Cases] +- [ ] CHK003 Is `long_running` eligibility required to be defined as a complete state-transition table (which prior `event_type` values qualify, which do not)? [Completeness, Spec §FR-013] +- [ ] CHK004 Are explicit rules required for what triggers `completed`? [Gap, Spec §FR-008] +- [ ] CHK005 Are explicit rules required for what triggers `waiting_for_input`? [Gap, Spec §FR-008] +- [ ] CHK006 Are explicit rules required for what triggers `manual_review_needed`? [Gap, Spec §FR-008] +- [ ] CHK007 Are explicit rules required for distinguishing `error` from `test_failed` (separate matchers, no overlap by construction)? [Completeness, Spec §FR-008] +- [ ] CHK008 Are explicit rules required for what triggers `test_passed`? [Gap, Spec §FR-008] +- [ ] CHK009 Is the integration with the FEAT-007 redaction utility required at the rule level (every rule emits a redacted excerpt, never the raw bytes)? 
[Completeness, Spec §FR-012] +- [ ] CHK010 Are the per-attachment "last output at" data lifecycle requirements (initialization, update, reset on restart) specified? [Completeness, Spec §FR-013, FR-015] + +## Requirement Clarity + +- [ ] CHK011 Is "rule-based only" measurable in code (e.g., a regex/matcher list with no learned components, no network calls)? [Clarity, Spec §FR-007] +- [ ] CHK012 Is "conservative" defined operationally beyond "default to activity" (e.g., a precise decision rule for ambiguity)? [Clarity, Spec §FR-011] +- [ ] CHK013 Is "ongoing work following waiting_for_input is ineligible for long_running" precisely defined for the eligibility table? [Clarity, Spec §FR-013] +- [ ] CHK014 Is the `swarm_member_reported` regex shape exhaustive on whitespace, key ordering, escaping, and quoting? [Clarity, Spec §FR-009] +- [ ] CHK015 Is "redaction runs before truncation" clear about which redactor and which truncation marker apply? [Clarity, Spec §Edge Cases] +- [ ] CHK016 Is "exactly one event per debounce window" unambiguous about which record's excerpt is preserved (latest? first? configurable?)? [Clarity, Spec §FR-014] + +## Requirement Consistency + +- [ ] CHK017 Are the 10 event types in FR-008 the same set referenced by FR-014's collapse-eligible / one-to-one classification? [Consistency, Spec §FR-008, FR-014] +- [ ] CHK018 Does FR-009's strict-parse rule (malformed → `activity`) align with FR-011's conservative-default rule (ambiguous → `activity`) without rule overlap? [Consistency, Spec §FR-009, FR-011] +- [ ] CHK019 Are `classifier_rule_id` values required to be stable across the catalogue and consistent between SQLite and JSONL output? [Consistency, Spec §FR-027] +- [ ] CHK020 Are `pane_exited` requirements consistent between FR-016 (must be inferred from FEAT-004 state + grace), FR-017 (grace window), and FR-018 (one per lifecycle)? 
[Consistency, Spec §FR-016, FR-017, FR-018] + +## Acceptance Criteria Quality + +- [ ] CHK021 Is SC-007's "100% accuracy on every fixture line" measurable against a documented, version-pinned fixture set? [Measurability, Spec §SC-007] +- [ ] CHK022 Are negative test fixtures required (lines that MUST NOT classify as a domain-specific type)? [Acceptance Criteria, Gap] +- [ ] CHK023 Is "documented ambiguous line" defined with explicit fixture entries rather than leaving "ambiguous" up to test author judgment? [Measurability, Spec §SC-007] +- [ ] CHK024 Are acceptance criteria defined for the `debounce` object's `window_id`, `collapsed_count`, `window_started_at`, `window_ended_at` fields' shape and population rules? [Acceptance Criteria, Spec §FR-027] + +## Scenario Coverage + +- [ ] CHK025 Are requirements defined for multi-line records (continuation lines, line continuations from shells)? [Coverage, Gap] +- [ ] CHK026 Are requirements defined for ANSI escape sequences in lines (color codes, cursor motion, OSC sequences)? [Coverage, Gap] +- [ ] CHK027 Are requirements defined for rule matching across the per-cycle byte-cap (FR-019) truncation boundary? [Coverage, Spec §FR-019, Edge Cases] +- [ ] CHK028 Are requirements specified for protecting against catastrophic regex backtracking (ReDoS)? [Coverage, Gap, NFR] +- [ ] CHK029 Are requirements specified for the case where a classifier rule depends on prior reader-state (e.g., `long_running`) and that state is unavailable on first cycle? [Coverage, Spec §FR-013, FR-015] + +## Edge Case Coverage + +- [ ] CHK030 Is overlap between `error` and `test_failed` resolved by a documented priority order in the spec, not just the catalogue? [Edge Case, Spec §FR-008, Edge Cases] +- [ ] CHK031 Is the malformed-`AGENTTOWER_SWARM_MEMBER` case explicitly required to fall through to `activity`, not silently dropped? 
[Edge Case, Spec §FR-009] +- [ ] CHK032 Is `pane_exited` required to NOT be emitted when the log text mentions "exited" or similar (FR-016: "MUST NOT be inferred from log text alone")? [Edge Case, Spec §FR-016] +- [ ] CHK033 Is the case "rule matches partial trailing bytes" excluded by FR-005's complete-record rule? [Edge Case, Spec §FR-005] +- [ ] CHK034 Are debounce-collapse semantics defined when the latest record's excerpt is empty or whitespace-only? [Edge Case, Spec §FR-014] +- [ ] CHK035 Is the "secret pattern split across the truncation boundary" redaction guarantee documented as a hard requirement, not best-effort? [Edge Case, Spec §Edge Cases] + +## Non-Functional Requirements + +- [ ] CHK036 Are classifier latency requirements quantified per record (e.g., a per-record budget that fits within the cycle wall-clock cap at upper-bound throughput)? [NFR, Gap] +- [ ] CHK037 Are FR-010's pure-function purity requirements verifiable by static analysis or property test (no I/O, no clock reads inside the rule fn)? [NFR, Spec §FR-010] +- [ ] CHK038 Are memory bounds defined for per-attachment classifier state (last-output-at, debounce window) across the upper-bound 50-agent scale? [NFR, Gap] + +## Dependencies & Assumptions + +- [ ] CHK039 Is the dependency on FEAT-004 pane discovery for `pane_exited` documented and version-pinned? [Dependency, Spec §FR-016] +- [ ] CHK040 Is the assumption that `\n` is the record boundary documented (FR-005, Assumptions) AND the classifier explicitly required to break on it? 
[Assumption, Spec §FR-005, Assumptions] diff --git a/specs/008-event-ingestion-follow/checklists/cli.md b/specs/008-event-ingestion-follow/checklists/cli.md new file mode 100644 index 0000000..3d5076f --- /dev/null +++ b/specs/008-event-ingestion-follow/checklists/cli.md @@ -0,0 +1,70 @@ +# CLI Contract & JSON Schema Stability Checklist: Event Ingestion, Classification, and Follow CLI + +**Purpose**: Validate that CLI surface, error contract, pagination, and JSONL stable-schema requirements are complete, clear, consistent, and measurable. This checklist tests the **requirements writing**, not the implementation. +**Created**: 2026-05-10 +**Feature**: [spec.md](../spec.md) +**Depth**: Formal release gate + +## Requirement Completeness + +- [ ] CHK001 Are all `agenttower events` flags enumerated with their pairwise interactions defined (e.g., `--cursor` × `--reverse`, `--since` × `--follow`)? [Completeness, Spec §FR-030, FR-033] +- [ ] CHK002 Is the JSONL stable schema in FR-027 complete enough that every field is present (or required-null) for all 10 event types? [Completeness, Spec §FR-027] +- [ ] CHK003 Are exit-code requirements defined for each error class (`agent_not_found`, daemon-unavailable, follow-stream-failure, invalid-args)? [Completeness, Spec §FR-034, FR-035a] +- [ ] CHK004 Are pagination cursor encoding requirements specified (encoding format, opacity, version compatibility, expiration semantics)? [Completeness, Spec §FR-030] +- [ ] CHK005 Are requirements defined for `--type ` invalid-value behavior (unknown type → error vs ignore)? [Gap, Spec §FR-030] +- [ ] CHK006 Are requirements defined for `--since` / `--until` invalid ISO-8601 input handling? [Gap, Spec §FR-030] +- [ ] CHK007 Are requirements defined for `--limit` upper bound, zero, and below-zero handling? [Gap, Spec §FR-030] +- [ ] CHK008 Is `--target` agent-id syntactic validation specified BEFORE registry lookup (so syntactic errors are distinguishable from `agent_not_found`)? 
[Completeness, Gap] +- [ ] CHK009 Are requirements defined for `--type` repeatability semantics (multi-value OR vs intersect)? [Completeness, Spec §FR-030] +- [ ] CHK010 Is the `schema_version` field's required initial value and bump semantics documented (FR-027 says non-breaking only — defined how)? [Completeness, Spec §FR-027] + +## Requirement Clarity + +- [ ] CHK011 Is "stable contract for scripting consumers" measurable across schema-version bumps (a non-breaking-change rule is documented)? [Clarity, Spec §FR-032] +- [ ] CHK012 Is "documented MVP page size (≤ 50)" specifying default vs maximum unambiguously? [Ambiguity, Spec §FR-030] +- [ ] CHK013 Is "one JSON object per event, one event per line" explicitly required to be NDJSON-compatible (no embedded newlines, terminating `\n`)? [Clarity, Spec §FR-032] +- [ ] CHK014 Is "closed-set `agent_not_found` error" defined as a specific machine-readable code (string identifier? integer? both)? [Clarity, Spec §FR-035a] +- [ ] CHK015 Is "non-zero status" specified as an exact exit code per error class, or any non-zero value? [Ambiguity, Spec §FR-034, FR-035a] +- [ ] CHK016 Is "human output not contractually stable" explicit about which fields/columns may change vs which must stay? [Clarity, Spec §FR-031] + +## Requirement Consistency + +- [ ] CHK017 Is the JSONL schema in FR-027 exactly the same as the `--json` CLI output schema in FR-032 (same field set, types, ordering)? [Consistency, Spec §FR-027, FR-032] +- [ ] CHK018 Is `--target` semantics consistent between `events` and `events --follow` for both the empty-result and `agent_not_found` cases? [Consistency, Spec §FR-035a] +- [ ] CHK019 Are error-message conventions consistent across `events`, `events --follow`, and the daemon-unreachable surface (same code names, same exit codes)? [Consistency, Spec §FR-034, FR-035a] +- [ ] CHK020 Is the ordering contract in FR-028 consistent with the cursor encoding in FR-030 (cursor encodes the same tuple used for sorting)? 
[Consistency, Spec §FR-028, FR-030] +- [ ] CHK021 Is `event_id`'s integer-backed but CLI-opaque treatment in FR-030 consistent with its JSON-number serialization in FR-027? [Consistency, Spec §FR-027, FR-030, Clarifications] + +## Acceptance Criteria Quality + +- [ ] CHK022 Is SC-011's "zero schema validation failures" tied to a concrete schema artifact (JSON Schema file path, version pin)? [Measurability, Spec §SC-011] +- [ ] CHK023 Is SC-012's "identical output (modulo pagination cursor)" defined byte-for-byte or field-for-field? [Measurability, Spec §SC-012] +- [ ] CHK024 Are acceptance criteria defined for human-output formatting stability vs explicit non-stability (which fields are columns, which are free-form)? [Acceptance Criteria, Spec §FR-031] +- [ ] CHK025 Are acceptance criteria specified for cursor round-trip integrity (cursor from page N+1 returns the next page after N, never overlap)? [Acceptance Criteria, Gap] + +## Scenario Coverage + +- [ ] CHK026 Are requirements specified for `--follow` against an agent that becomes unregistered mid-stream? [Coverage, Gap] +- [ ] CHK027 Are requirements specified for `--follow` when the daemon restarts mid-stream? [Coverage, Gap] +- [ ] CHK028 Are requirements defined for `events --json --follow --since` (interleaving rules between bounded backlog and live stream)? [Coverage, Spec §FR-033] +- [ ] CHK029 Are requirements defined for empty-result vs error in `events --type ` vs `events --type `? [Coverage, Gap] +- [ ] CHK030 Are requirements specified for `--target` agent registered but with attachment in `stale` (not `active`) status? [Coverage, Spec §US1 AS4, US4] +- [ ] CHK031 Are requirements specified for the host-vs-container parity contract (FR-035) at the failure surface (same exit codes from both)? [Coverage, Spec §FR-035, SC-012] + +## Edge Case Coverage + +- [ ] CHK032 Is the case "agent registered, no attachment ever created" distinguished from "agent registered, attachment was deleted" in CLI behavior? 
[Edge Case, Gap] +- [ ] CHK033 Are requirements defined for excerpts containing characters that need JSON escaping (control chars, surrogates, invalid UTF-8 bytes)? [Edge Case, Gap] +- [ ] CHK034 Is the case "events from before a schema_version bump" required to remain readable by the new CLI? [Edge Case, Gap] +- [ ] CHK035 Are requirements defined for stdout vs stderr separation in `events` failure output (so machine-parsing of stdout is safe)? [Edge Case, Gap] +- [ ] CHK036 Are requirements specified for SIGPIPE behavior when piping `events --follow` into `head` or similar? [Edge Case, Gap] +- [ ] CHK037 Is the case "`--limit 0`" specified (empty result with success vs validation error)? [Edge Case, Gap] + +## Non-Functional Requirements + +- [ ] CHK038 Are `events` query latency requirements quantified for large event tables (millions of rows)? [NFR, Gap] +- [ ] CHK039 Is follow-stream backpressure behavior specified when the operator's terminal is slow (drop? buffer? error)? [NFR, Gap] + +## Dependencies & Assumptions + +- [ ] CHK040 Is the dependency on FEAT-005 thin-client routing version-pinned to a specific socket protocol surface? [Dependency, Spec §FR-035] diff --git a/specs/008-event-ingestion-follow/checklists/configuration.md b/specs/008-event-ingestion-follow/checklists/configuration.md new file mode 100644 index 0000000..2047285 --- /dev/null +++ b/specs/008-event-ingestion-follow/checklists/configuration.md @@ -0,0 +1,69 @@ +# Configuration & Defaults Requirements Checklist: Event Ingestion, Classification, and Follow CLI + +**Purpose**: Validate that FR-045 default-value coverage, override precedence, `agenttower config paths` exposure, and configurability requirements are complete, clear, consistent, and measurable. This checklist tests the **requirements writing**, not the implementation. 
+**Created**: 2026-05-10 +**Feature**: [spec.md](../spec.md) +**Depth**: Formal release gate + +## Requirement Completeness + +- [ ] CHK001 Is every spec-named MVP default (reader cycle cap, debounce window, `pane_exited` grace, per-event excerpt cap, per-cycle byte cap, default page size) required to have a documented value somewhere in the spec or plan? [Completeness, Spec §FR-045] +- [ ] CHK002 Are the plan-only defaults (`long_running_grace_seconds`, `excerpt_truncation_marker`, `max_page_size`, `follow_long_poll_max_seconds`, `follow_session_idle_timeout_seconds`) required to be normative or are they implementation-specific? [Completeness, Plan §"Defaults locked"] +- [ ] CHK003 Are override-precedence requirements specified (env > config.toml > built-in, or some other order)? [Completeness, Gap] +- [ ] CHK004 Are requirements specified for `agenttower config paths` exposing the resolved values for every FR-045 setting? [Completeness, Spec §FR-045] +- [ ] CHK005 Are configuration-validation requirements specified (out-of-range values, malformed types, unknown keys)? [Completeness, Gap] +- [ ] CHK006 Are requirements specified for the boundary between "MVP cap" (≤ 5 s for debounce, ≤ 30 s for grace, ≤ 50 page size) and "default value" (the implementer-chosen specific value within the cap)? [Completeness, Spec §FR-014, FR-017, FR-030] +- [ ] CHK007 Are requirements specified for the configuration-reload behavior (does daemon need restart, or is hot-reload supported)? [Completeness, Gap] +- [ ] CHK008 Are requirements specified for the documentation surface where defaults are published (FR-045 says "documented in the FEAT-008 plan" — is that sufficient or should they also be in operator-facing docs)? [Completeness, Spec §FR-045] +- [ ] CHK009 Are requirements specified for what happens when `[events]` section is absent from `config.toml` (built-in defaults apply)? 
[Completeness, Gap] +- [ ] CHK010 Are requirements specified for what happens when an unknown key appears in `[events]` (warn, error, or ignore)? [Completeness, Gap] + +## Requirement Clarity + +- [ ] CHK011 Is "must have a documented MVP default" precise enough to be testable (e.g., "the value must be a literal in `events/__init__.py`")? [Clarity, Spec §FR-045] +- [ ] CHK012 Is "configurable through the FEAT-001 configuration surface" precise about the configuration mechanism (TOML key shape, type, units)? [Clarity, Spec §FR-045] +- [ ] CHK013 Is "must be observable in test (e.g., via configuration injection)" precise about the test seam path? [Clarity, Spec §FR-019] +- [ ] CHK014 Is "≤ 5 seconds at MVP scale" specifying a maximum cap or a default value (the spec leaves room for either)? [Ambiguity, Spec §FR-014] +- [ ] CHK015 Is "documented MVP default sufficient to drain typical interactive output without starving other attachments" measurable enough to derive a specific number? [Clarity, Spec §FR-019] + +## Requirement Consistency + +- [ ] CHK016 Are the spec's six FR-045-named defaults consistent with the plan's eleven-row defaults table (no missing, no contradicting values)? [Consistency, Spec §FR-045, Plan §"Defaults locked"] +- [ ] CHK017 Is the per-cycle byte cap consistent between the spec ("documented MVP default sufficient...") and the plan (64 KiB)? [Consistency, Spec §FR-019, Plan] +- [ ] CHK018 Are the debounce window default (5 s) and the `pane_exited` grace (30 s) consistent with the upper-bound caps stated in the spec (≤ 5 s, ≤ 30 s)? [Consistency, Spec §FR-014, FR-017] +- [ ] CHK019 Is `default_page_size = 50` consistent with `max_page_size = 50` (i.e., the cap and the default are deliberately equal in MVP)? [Consistency, Plan §"Defaults locked"] +- [ ] CHK020 Are the configuration units consistent across all settings (seconds for time, bytes for size; no millisecond/kilobyte mixing)? 
[Consistency, Plan §"Defaults locked"] + +## Acceptance Criteria Quality + +- [ ] CHK021 Is there an SC requiring `agenttower config paths` to surface every FR-045 default? [Measurability, Spec §FR-045] +- [ ] CHK022 Are acceptance criteria specified for the configuration-injection test seam (FR-019: "MUST be observable in test (e.g., via configuration injection)")? [Measurability, Spec §FR-019] +- [ ] CHK023 Are acceptance criteria specified for what `agenttower config paths` should output for an `[events]` section that has overrides applied? [Measurability, Gap] + +## Scenario Coverage + +- [ ] CHK024 Are requirements defined for the all-defaults scenario (no `[events]` overrides; built-in values apply)? [Coverage, Gap] +- [ ] CHK025 Are requirements defined for the partial-override scenario (some keys in `[events]`, others fall back to defaults)? [Coverage, Gap] +- [ ] CHK026 Are requirements defined for the all-overrides scenario (every key in `[events]` set explicitly)? [Coverage, Gap] +- [ ] CHK027 Are requirements defined for the env-var override scenario (if env vars are part of the precedence chain)? [Coverage, Gap] +- [ ] CHK028 Are requirements defined for the bad-value scenario (malformed type, out-of-range, negative number)? [Coverage, Gap] + +## Edge Case Coverage + +- [ ] CHK029 Is the case "configurable value at the absolute MVP cap (e.g., debounce = exactly 5 s)" addressed as still acceptable? [Edge Case, Spec §FR-014] +- [ ] CHK030 Is the case "configurable value above the MVP cap (e.g., debounce > 5 s)" addressed (rejected at startup, or accepted with warning)? [Edge Case, Gap] +- [ ] CHK031 Is the case "zero or negative value for a positive-duration setting" addressed? [Edge Case, Gap] +- [ ] CHK032 Is the case "config.toml file is unreadable or invalid TOML" addressed by FEAT-001 (the owner of the configuration surface), and is the FEAT-008 layering explicit about delegating to it? 
[Edge Case, Gap] +- [ ] CHK033 Is the case "`per_cycle_byte_cap_bytes` smaller than `per_event_excerpt_cap_bytes`" addressed (likely an invalid combination — lines longer than the byte cap can never be ingested)? [Edge Case, Gap] + +## Non-Functional Requirements + +- [ ] CHK034 Are requirements specified for the configuration's startup-time impact (single TOML parse vs lazy)? [NFR, Gap] +- [ ] CHK035 Are requirements specified for thread-safety of the configuration object during reader cycles (read-only after startup, or mutable)? [NFR, Gap] +- [ ] CHK036 Are requirements specified for backwards compatibility of the `[events]` section schema (adding a key in a future feature must not break a v6 daemon reading the v6 config)? [NFR, Gap] + +## Dependencies & Assumptions + +- [ ] CHK037 Is the dependency on FEAT-001's configuration surface version-pinned for stable `[events]` parsing? [Dependency, Spec §FR-045] +- [ ] CHK038 Is the assumption that operators understand the difference between MVP cap and default-value-within-cap documented? [Assumption, Gap] +- [ ] CHK039 Is the assumption that `agenttower config paths` is the operator's authoritative view of resolved values documented as a hard contract? [Assumption, Spec §FR-045] diff --git a/specs/008-event-ingestion-follow/checklists/failure.md b/specs/008-event-ingestion-follow/checklists/failure.md new file mode 100644 index 0000000..10ebde8 --- /dev/null +++ b/specs/008-event-ingestion-follow/checklists/failure.md @@ -0,0 +1,73 @@ +# Failure Surface & Observability Requirements Checklist: Event Ingestion, Classification, and Follow CLI + +**Purpose**: Validate that per-attachment failure isolation, daemon `status` surfacing, and FEAT-007 lifecycle separation requirements are complete, clear, consistent, and measurable. This checklist tests the **requirements writing**, not the implementation. 
+**Created**: 2026-05-10 +**Feature**: [spec.md](../spec.md) +**Depth**: Formal release gate + +## Requirement Completeness + +- [ ] CHK001 Are all per-attachment failure classes enumerated (EACCES, ENOENT outside FEAT-007 path, missing offset row, degraded SQLite, degraded JSONL, other I/O)? [Completeness, Spec §FR-038, FR-039] +- [ ] CHK002 Is the `agenttower status` failure-surface schema specified (object shape, field types, required vs optional)? [Completeness, Spec §FR-037, Gap] +- [ ] CHK003 Are requirements defined for which failure classes must ALSO surface as FEAT-007 lifecycle events vs status-only? [Completeness, Spec §FR-037] +- [ ] CHK004 Are requirements defined for clearing a failure indicator after recovery (auto-clear vs operator action)? [Completeness, Spec §FR-040, Gap] +- [ ] CHK005 Is the "degraded condition" object/structure defined consistently for both SQLite (FR-040) and JSONL (FR-029) paths? [Completeness, Spec §FR-029, FR-040] +- [ ] CHK006 Are operator-observable conditions defined for "missing offset row for active attachment" (what does the operator see, where)? [Completeness, Spec §FR-039] +- [ ] CHK007 Are requirements specified for failure-counter, last-seen-at, and first-seen-at metadata in the `status` failure surface? [Completeness, Gap] +- [ ] CHK008 Is the failure-record retention policy defined (does a transient failure remain visible after auto-clear)? [Completeness, Gap] + +## Requirement Clarity + +- [ ] CHK009 Is "MUST NOT cause loss of the attachment row" measurable as a row-existence assertion across all failure classes? [Clarity, Spec §FR-038] +- [ ] CHK010 Is "the reader skips that attachment for the cycle and surfaces the inconsistency" precisely defined as a test assertion (no offset advance, status visibility within N cycles)? [Clarity, Spec §FR-039] +- [ ] CHK011 Is "visible failure that points to the underlying degraded condition" (US6 AS2) measurable beyond "operator can see something"? 
[Clarity, Spec §US6 AS2, FR-040] +- [ ] CHK012 Is "diagnostic surface FEAT-007 already uses" (US6 AS1) enumerated to a specific surface, or is the prose abstract? [Ambiguity, Spec §US6 AS1] +- [ ] CHK013 Is "(or an equivalent inspect path)" in FR-037 precise enough to constrain the implementation, or is it a punt to plan-time? [Ambiguity, Spec §FR-037] + +## Requirement Consistency + +- [ ] CHK014 Are failure-surface requirements consistent across FR-037 (status surface), FR-026 (FEAT-007 lifecycle separation), FR-029, and FR-040 (degraded conditions)? [Consistency, Spec §FR-026, FR-029, FR-037, FR-040] +- [ ] CHK015 Do FR-038 (attachment-row preservation) and FR-039 (skip cycle on missing offset row) align on what constitutes a "lost" vs "skipped" attachment? [Consistency, Spec §FR-038, FR-039] +- [ ] CHK016 Is the FR-026 lifecycle/event separation consistent with FR-044's optional consolidated assertion? [Consistency, Spec §FR-026, FR-044] +- [ ] CHK017 Is FR-040's clarified "buffer + retry + visible status" pattern consistent with US6 AS2's older "(a) retry OR (b) surface failure" wording (post-clarification, both paths apply)? [Consistency, Spec §US6 AS2, FR-040, Clarifications] + +## Acceptance Criteria Quality + +- [ ] CHK018 Is SC-009's "zero FEAT-007 lifecycle event classes appear in the JSONL events history" measurable against a documented set of FEAT-007 event-type names? [Measurability, Spec §SC-009, FR-026] +- [ ] CHK019 Is SC-010's "100% of test iterations" specified with a concrete iteration count and a documented strategy for inducing the failure? [Measurability, Spec §SC-010] +- [ ] CHK020 Are acceptance criteria defined for the `status` surface response shape (so a script can parse failure details deterministically)? [Acceptance Criteria, Gap] +- [ ] CHK021 Are acceptance criteria defined for failure-to-status-visibility latency (within N reader cycles)? 
[Acceptance Criteria, Gap] + +## Scenario Coverage + +- [ ] CHK022 Are requirements defined for one attachment in failure while many others are healthy? [Coverage, Spec §US6 AS1] +- [ ] CHK023 Are requirements defined for ALL attachments in failure simultaneously (e.g., disk full)? [Coverage, Gap] +- [ ] CHK024 Is the case "permission-restored after EACCES" required to clear the failure indicator within a bounded number of cycles? [Coverage, Gap] +- [ ] CHK025 Are requirements specified for failures that occur during the FR-040 buffered-retry path (failure on retry attempt)? [Coverage, Gap] +- [ ] CHK026 Are requirements specified for the case where `status` itself fails to render the failure surface (e.g., daemon-unreachable while degraded)? [Coverage, Gap] + +## Edge Case Coverage + +- [ ] CHK027 Is the case "offset row exists but `byte_offset` > current file size" addressed as an inconsistency, distinct from truncation? [Edge Case, Gap] +- [ ] CHK028 Is the case "attachment row exists but `log_offsets` row missing AND the file is also missing" handled by exactly one of the failure paths (no double-classification)? [Edge Case, Spec §FR-039] +- [ ] CHK029 Are repeated identical failures required to NOT spam the failure surface or lifecycle log (rate limiting / suppression)? [Edge Case, Gap] +- [ ] CHK030 Is "ENOENT outside the FEAT-007 missing/recreated path" precisely defined (which paths are FEAT-007's scope vs not)? [Edge Case, Spec §FR-038] +- [ ] CHK031 Are requirements defined for the case where the FEAT-007 lifecycle logger itself fails (where does that diagnostic land)? [Edge Case, Gap] + +## Non-Functional Requirements + +- [ ] CHK032 Are observability-latency requirements specified for failure-to-status-visibility (e.g., visible within K reader cycles, default K=2)? [NFR, Gap] +- [ ] CHK033 Are requirements for failure-message redaction specified (no secret leakage in error text shown via `status`)? 
[NFR, Spec §FR-012, Gap] +- [ ] CHK034 Are isolation requirements (one-attachment failure must not delay other attachments' cycles by more than X) quantified? [NFR, Spec §FR-036, Gap] + +## Dependencies & Assumptions + +- [ ] CHK035 Is the dependency on the FEAT-007 lifecycle logger surface version-pinned and stable? [Dependency, Spec §FR-037] +- [ ] CHK036 Is the assumption that `agenttower status` already exists (FEAT-002) documented as a hard precondition? [Assumption, Gap] +- [ ] CHK037 Is the dependency on FEAT-002's daemon-unreachable surface explicit and version-pinned? [Dependency, Spec §FR-034] + +## Ambiguities & Conflicts + +- [ ] CHK038 Could FR-040's clarified pattern (mandatory buffer + retry + visible status) be misread as the older "(a) retry OR (b) surface failure" wording in US6 AS2 if a reader skips the Clarifications section? [Conflict, Spec §US6 AS2, FR-040, Clarifications] +- [ ] CHK039 Is "(or the same diagnostic surface FEAT-007 already uses)" in US6 AS1 a single concrete surface, or is the spec leaving the implementation a choice? [Ambiguity, Spec §US6 AS1] +- [ ] CHK040 Is the boundary between "per-attachment failure" (visible via status) and "daemon degraded condition" (visible globally) precisely defined? [Ambiguity, Spec §FR-037, FR-040] diff --git a/specs/008-event-ingestion-follow/checklists/migration.md b/specs/008-event-ingestion-follow/checklists/migration.md new file mode 100644 index 0000000..f64d1b5 --- /dev/null +++ b/specs/008-event-ingestion-follow/checklists/migration.md @@ -0,0 +1,68 @@ +# Schema Migration Safety Requirements Checklist: Event Ingestion, Classification, and Follow CLI + +**Purpose**: Validate that v5 → v6 migration, rollback, forward-version refusal, mixed-state recovery, and backwards-compatibility requirements are complete, clear, consistent, and measurable. This checklist tests the **requirements writing**, not the implementation. 
+**Created**: 2026-05-10 +**Feature**: [spec.md](../spec.md) +**Depth**: Formal release gate + +## Requirement Completeness + +- [ ] CHK001 Is the schema-version bump (v5 → v6) explicitly required at the spec level, or is it only a plan-level decision? [Completeness, Plan §R1, Spec Gap] +- [ ] CHK002 Are migration idempotence requirements specified (running the migration on an already-v6 DB is a no-op)? [Completeness, Plan §R1] +- [ ] CHK003 Are migration atomicity requirements specified (single `BEGIN IMMEDIATE` or equivalent — partial failure leaves the DB at v5)? [Completeness, Plan §R1] +- [ ] CHK004 Are forward-version refusal requirements specified (v6 daemon refuses to open v7+ DB)? [Completeness, Plan §R1] +- [ ] CHK005 Are backwards-version refusal requirements specified (v6 daemon CAN read v5, applies migration; CANNOT read v4 or earlier without prior FEAT-006 / FEAT-007 migrations)? [Completeness, Gap] +- [ ] CHK006 Are requirements specified for the events table's initial state on first open after migration (empty)? [Completeness, Gap] +- [ ] CHK007 Are requirements specified for the indexes' creation order (PK first, then four indexes — `IF NOT EXISTS` semantics)? [Completeness, Plan §2.5] +- [ ] CHK008 Are requirements specified for handling a migration interrupted by daemon kill (PRAGMA `journal_mode=WAL` recovery on reopen)? [Completeness, Gap] +- [ ] CHK009 Are requirements specified for the backwards-compatibility test scope (every FEAT-001..007 CLI surface must produce byte-identical output)? [Completeness, Plan §R12] +- [ ] CHK010 Is the `schema_version` field on each event row required to start at `1` and bump only on non-breaking JSONL/SQLite shape additions? [Completeness, Spec §FR-027] + +## Requirement Clarity + +- [ ] CHK011 Is "single `BEGIN IMMEDIATE` transaction" precise enough (vs `BEGIN DEFERRED` with same effect)? 
[Clarity, Plan §R1] +- [ ] CHK012 Is "refuses to serve the daemon on rollback" precise about what the operator sees (exit code, stderr message)? [Clarity, Plan §R1] +- [ ] CHK013 Is "non-breaking schema-version bump" defined operationally (which kinds of change qualify; new optional field qualifies, renamed field does not)? [Clarity, Spec §FR-027] +- [ ] CHK014 Is the migration-test reproducibility scheme defined unambiguously (use a fixture v5 DB, not a freshly-created one)? [Clarity, Gap] + +## Requirement Consistency + +- [ ] CHK015 Is the migration shape consistent with the FEAT-007 v4 → v5 migration pattern (same `_apply_pending_migrations` pathway, same forward-refusal behavior)? [Consistency, Plan §R1] +- [ ] CHK016 Are the events-table CHECK constraints consistent with the documented JSONL schema enums (`event_type` closed-set in both)? [Consistency, Plan §2.1, Spec §FR-008, FR-027] +- [ ] CHK017 Is the `schema_version` column default value (1) consistent with the JSON schema's `schema_version` initial value? [Consistency, Plan §2.1, Contracts §event-schema] +- [ ] CHK018 Are migration failure semantics consistent between FEAT-008 and the prior FEAT-006 / FEAT-007 migrations (same lifecycle event, same exit-code, same rollback)? [Consistency, Gap] + +## Acceptance Criteria Quality + +- [ ] CHK019 Is there a test-for-test SC item explicitly requiring the migration to be reproducible against a v5-only DB fixture? [Measurability, Plan §"Testing"] +- [ ] CHK020 Is there an SC item explicitly requiring the v6-already-current re-open path to be a no-op? [Measurability, Plan §"Testing"] +- [ ] CHK021 Is there an SC item explicitly requiring the forward-version refusal path to surface a documented error? [Measurability, Plan §"Testing"] +- [ ] CHK022 Is the backwards-compatibility test's coverage scope auditable (an explicit list of FEAT-001..007 commands that must produce byte-identical output)? 
[Measurability, Plan §R12] + +## Scenario Coverage + +- [ ] CHK023 Are requirements defined for the FRESH-INSTALL scenario (no prior DB; create at v6 directly)? [Coverage, Gap] +- [ ] CHK024 Are requirements defined for the V5-UPGRADE scenario (existing v5 DB; apply v5 → v6 migration)? [Coverage, Plan §R1] +- [ ] CHK025 Are requirements defined for the V6-NOOP scenario (existing v6 DB; reopen as no-op)? [Coverage, Plan §R1] +- [ ] CHK026 Are requirements defined for the V7-FUTURE scenario (existing v7 DB on a v6 daemon; forward-refuse)? [Coverage, Plan §R1] +- [ ] CHK027 Are requirements defined for the V4-OR-EARLIER scenario (older DB on v6 daemon; chain through v5 first, or refuse)? [Coverage, Gap] +- [ ] CHK028 Are requirements defined for the INTERRUPTED-MIGRATION scenario (daemon killed mid-migration; SQLite WAL recovery on next open)? [Coverage, Gap] + +## Edge Case Coverage + +- [ ] CHK029 Is the case "v5 DB has FEAT-007 audit rows in `log_attachment_change` that reference now-stale `attachment_id` values" addressed (events table FK shape only — no enforced FK)? [Edge Case, Plan §2.2] +- [ ] CHK030 Is the case "FEAT-007 redaction utility was updated between FEAT-007 and FEAT-008 deploys" addressed (existing v5-era audit rows already redacted; no retroactive re-redaction)? [Edge Case, Gap] +- [ ] CHK031 Is the case "operator runs `agenttower events` against an unmigrated v5 DB before `agenttowerd` is upgraded" addressed (CLI surfaces a clear error)? [Edge Case, Gap] +- [ ] CHK032 Is the case "concurrent connections to the DB during migration" addressed (`BEGIN IMMEDIATE` lock semantics)? [Edge Case, Plan §R1] + +## Non-Functional Requirements + +- [ ] CHK033 Are migration runtime requirements specified (the migration is O(1) — empty table, no backfill)? [NFR, Plan §R1] +- [ ] CHK034 Are migration disk-space requirements specified (the indexes are empty post-migration; no rebuild)? 
[NFR, Gap] +- [ ] CHK035 Are migration logging requirements specified (single audit-log line on success/failure)? [NFR, Gap] + +## Dependencies & Assumptions + +- [ ] CHK036 Is the assumption that the FEAT-007 v4 → v5 migration is already applied at FEAT-008 deploy time documented? [Assumption, Plan §R1] +- [ ] CHK037 Is the dependency on SQLite's `IF NOT EXISTS` semantics for idempotence documented? [Dependency, Plan §R1] +- [ ] CHK038 Is the assumption that no manual SQL has been applied to the v5 DB between deploys documented (operator must not hand-edit the events SQLite file)? [Assumption, Gap] diff --git a/specs/008-event-ingestion-follow/checklists/performance.md b/specs/008-event-ingestion-follow/checklists/performance.md new file mode 100644 index 0000000..62d4175 --- /dev/null +++ b/specs/008-event-ingestion-follow/checklists/performance.md @@ -0,0 +1,70 @@ +# Performance Requirements Checklist: Event Ingestion, Classification, and Follow CLI + +**Purpose**: Validate that latency, throughput, scale, memory, and concurrency requirements are complete, clear, consistent, and measurable. This checklist tests the **requirements writing**, not the implementation. +**Created**: 2026-05-10 +**Feature**: [spec.md](../spec.md) +**Depth**: Formal release gate + +## Requirement Completeness + +- [ ] CHK001 Are end-to-end latency requirements specified (write → reader cycle → SQLite commit → CLI render) for all three CLI surfaces (`events`, `events --follow`, `events --json`)? [Completeness, Spec §SC-001, SC-002] +- [ ] CHK002 Is the per-record classifier latency budget specified at the spec level (the plan §"Performance Goals" cites ≤ 1 ms; is this a spec-grade requirement)? [Completeness, Gap] +- [ ] CHK003 Are SQLite commit latency budgets specified (per emitted event under FR-006 atomicity)? [Completeness, Gap] +- [ ] CHK004 Are JSONL append latency budgets specified (per FR-025 / FR-029 success path)? 
[Completeness, Gap] +- [ ] CHK005 Are query-latency requirements specified for `events.list` against large tables (millions of rows)? [Completeness, Gap] +- [ ] CHK006 Are throughput requirements specified for the reader (events/second per attachment under upper-bound load)? [Completeness, Gap] +- [ ] CHK007 Are concurrency requirements specified for simultaneous `events --follow` sessions (target N, max N)? [Completeness, Gap] +- [ ] CHK008 Are memory bounds specified for per-attachment cycle state, debounce state, and the FR-040 in-memory buffer? [Completeness, Spec §FR-019, FR-040] +- [ ] CHK009 Are memory bounds specified for the follow-session registry under upper-bound concurrent-follower load? [Completeness, Gap] +- [ ] CHK010 Are degradation-under-load requirements specified (what happens at 2× / 5× MVP scale; graceful or hard-cap)? [Completeness, Gap] +- [ ] CHK011 Are CPU budgets specified for one reader cycle (% of one core under upper-bound load)? [Completeness, Gap] + +## Requirement Clarity + +- [ ] CHK012 Is "≤ 1 reader cycle (≤ 1 s wall-clock at MVP scale)" precise enough about whether the cycle bound is real wall-clock or a logical-clock model in tests? [Clarity, Spec §FR-001] +- [ ] CHK013 Is "within one reader cycle of the underlying log write" measurable from a deterministic wall-clock event (e.g., `fsync` completion) rather than from a soft notion like "log was written"? [Clarity, Spec §SC-002] +- [ ] CHK014 Is "MVP scale" defined unambiguously (≤ 50 agents, ≤ a few KB/s per agent — but what is "few", exactly)? [Clarity, Spec §Assumptions] +- [ ] CHK015 Is "documented MVP page size (≤ 50)" precise about default vs maximum (the plan locks both as 50)? [Clarity, Spec §FR-030, Plan] +- [ ] CHK016 Is "follow_long_poll_max_seconds" defined at the spec level as a contract, or is it solely a plan-level default? 
[Clarity, Plan §"Defaults locked", Gap] + +## Requirement Consistency + +- [ ] CHK017 Are the SC-001 (5 s) and SC-002 (1 s) latency targets consistent with the FR-001 (1 s) reader cycle cap (i.e., 1 reader cycle + commit + render fits within 5 s, easily)? [Consistency, Spec §SC-001, SC-002, FR-001] +- [ ] CHK018 Is the per-cycle byte cap (FR-019) consistent with the per-event excerpt cap (Edge Cases) for the worst-case fan-out (one cycle's bytes producing N events at excerpt cap)? [Consistency, Spec §FR-019, Edge Cases] +- [ ] CHK019 Are reader memory bounds consistent between the cycle buffer (≤ 64 KiB), the FR-040 degraded deque (≤ 64 KiB), and the upper-bound 50-agent scale (≤ 6.4 MiB total)? [Consistency, Plan §"Performance Goals"] +- [ ] CHK020 Is the follow-session idle timeout (5 min) consistent with the long-poll budget (30 s) such that a healthy follower never times out (idle ≤ poll budget × N polls between activity)? [Consistency, Plan §R9] + +## Acceptance Criteria Quality + +- [ ] CHK021 Is SC-001 measurable with a deterministic trigger (e.g., fixture line written → CLI returns within 5 s under controlled load)? [Measurability, Spec §SC-001] +- [ ] CHK022 Is SC-002 measurable without flaky timing assertions on slow CI runners? [Measurability, Spec §SC-002] +- [ ] CHK023 Are acceptance criteria specified for `events.list` query latency at large-table scale (e.g., median ≤ 50 ms at 1M rows)? [Measurability, Gap] +- [ ] CHK024 Are acceptance criteria specified for the reader's per-cycle CPU budget (e.g., median ≤ X% of one core)? [Measurability, Gap] +- [ ] CHK025 Are acceptance criteria specified for memory ceiling under upper-bound concurrent-follower load? [Measurability, Gap] + +## Scenario Coverage + +- [ ] CHK026 Are performance requirements defined for the BURST scenario (50 agents × peak rate simultaneously)? 
[Coverage, Gap] +- [ ] CHK027 Are performance requirements defined for the COLD-START scenario (daemon restart with N pending JSONL retries — FR-029)? [Coverage, Gap] +- [ ] CHK028 Are performance requirements defined for the LARGE-BACKLOG scenario (`events --follow --since` printing thousands of backlog rows before live)? [Coverage, Gap] +- [ ] CHK029 Are performance requirements defined for the DEGRADED-RECOVERY scenario (FR-040: how many pending events flush per recovery cycle)? [Coverage, Gap] +- [ ] CHK030 Are performance requirements defined for the QUERY-FILTER scenario (compound filter on `--target` + `--type` + `--since` + `--until`)? [Coverage, Gap] + +## Edge Case Coverage + +- [ ] CHK031 Are requirements specified for the case where one attachment's classify+commit exceeds half the cycle budget (per-attachment fairness)? [Edge Case, Plan §"Constraints"] +- [ ] CHK032 Are requirements defined for query latency when `idx_events_observedat_eventid` is contended with a writer cycle? [Edge Case, Gap] +- [ ] CHK033 Are requirements specified for the SIGPIPE case (downstream consumer closes the pipe; CLI must exit promptly without burning CPU)? [Edge Case, Plan §"Stream-flush behavior"] +- [ ] CHK034 Is the case "many attachments, all idle (no new bytes)" required to consume O(attachments) work per cycle, not O(log_size)? [Edge Case, Gap] +- [ ] CHK035 Is the case "one attachment producing very long lines (> excerpt cap × many)" bounded so it cannot starve other attachments? [Edge Case, Spec §FR-019, Gap] + +## Non-Functional Requirements + +- [ ] CHK036 Are SQLite WAL configuration requirements specified (e.g., WAL mode enabled, busy-timeout configured)? [NFR, Gap] +- [ ] CHK037 Are index-coverage requirements specified for every documented filter combination (the plan describes which index serves which query — is this a normative requirement)? 
[NFR, Plan §2.5] +- [ ] CHK038 Is the assumption of local-SSD storage performance documented (rotational disks would miss SC-002)? [NFR, Assumption] +- [ ] CHK039 Are requirements specified for the daemon's startup time impact (one new schema migration + reader thread spawn)? [NFR, Gap] + +## Dependencies & Assumptions + +- [ ] CHK040 Is the assumption that "Per-MVP scale: ≤ 50 attached agents, ≤ a few KB/s per agent" measurable enough to trigger a spec amendment if real load exceeds it (e.g., quantified upper bound for "few")? [Assumption, Spec §Assumptions] diff --git a/specs/008-event-ingestion-follow/checklists/reliability.md b/specs/008-event-ingestion-follow/checklists/reliability.md new file mode 100644 index 0000000..3af5fcb --- /dev/null +++ b/specs/008-event-ingestion-follow/checklists/reliability.md @@ -0,0 +1,73 @@ +# Reliability & Durability Requirements Checklist: Event Ingestion, Classification, and Follow CLI + +**Purpose**: Validate that reliability, durability, restart-resume, and degraded-mode requirements are complete, clear, consistent, and measurable. This checklist tests the **requirements writing**, not the implementation. +**Created**: 2026-05-10 +**Feature**: [spec.md](../spec.md) +**Depth**: Formal release gate + +## Requirement Completeness + +- [ ] CHK001 Are reader-cycle wall-clock cap requirements quantified for both the documented MVP default and any tighter implementation floor? [Completeness, Spec §FR-001] +- [ ] CHK002 Are atomic-commit boundaries explicitly specified for the SQLite event row + `log_offsets` advance pair? [Completeness, Spec §FR-006] +- [ ] CHK003 Is the JSONL append-after-SQLite ordering requirement documented as a hard sequencing rule, not a recommendation? [Completeness, Spec §FR-006, FR-029] +- [ ] CHK004 Are restart-resume requirements specified for every persisted state element the reader consumes (offsets, audit rows, lifecycle suppression keys)? 
[Completeness, Spec §FR-020] +- [ ] CHK005 Are duplicate-suppression requirements expressed in terms of byte ranges rather than event identity alone? [Completeness, Spec §FR-021] +- [ ] CHK006 Are bounds defined for the in-flight cycle buffer used by FR-040's degraded-mode retry path? [Completeness, Spec §FR-040] +- [ ] CHK007 Are degraded-mode "clear conditions" defined for both the SQLite path (FR-040) and the JSONL path (FR-029)? [Completeness, Spec §FR-029, FR-040] +- [ ] CHK008 Are requirements defined for what happens to the in-memory degraded-mode buffer if the daemon stops while degraded? [Completeness, Gap] +- [ ] CHK009 Is the watermark used for JSONL retry (FR-029) defined as a persisted artifact or only an in-memory hint? [Completeness, Spec §FR-029] + +## Requirement Clarity + +- [ ] CHK010 Is "exactly once per cycle BEFORE reading any bytes" measurable with a unit-level assertion (e.g., call-count check)? [Clarity, Spec §FR-002] +- [ ] CHK011 Is "single atomic commit per emitted event (or per cycle batch within a single transaction)" unambiguous about which path the implementer must choose? [Ambiguity, Spec §FR-006] +- [ ] CHK012 Is "monotonically-increasing event_id" defined unambiguously across daemon restarts (sequence preserved? gaps allowed?)? [Clarity, Spec §FR-028, Key Entities] +- [ ] CHK013 Is "byte range begins at or after the persisted byte_offset" precise enough to handle partial-line carry-over scenarios? [Clarity, Spec §FR-021] +- [ ] CHK014 Is FR-040's "buffer the in-flight cycle's classified events in memory" specific about whether the buffer survives a process restart? [Clarity, Spec §FR-040] +- [ ] CHK015 Is "next cycle once the degraded state clears" measurable as a deterministic test condition? [Clarity, Spec §FR-040] + +## Requirement Consistency + +- [ ] CHK016 Do FR-006 (atomic SQLite+offset commit) and FR-040 (in-memory buffer on degraded SQLite) reconcile when the SQLite write itself is the failure? 
[Consistency, Spec §FR-006, FR-040] +- [ ] CHK017 Is the restart-resume contract internally consistent across FR-020 (offsets authoritative), FR-022 (JSONL not load-bearing), and FR-023 (delegate to `reader_cycle_offset_recovery`)? [Consistency, Spec §FR-020, FR-022, FR-023] +- [ ] CHK018 Does FR-015 (debounce state does not span restarts) align with FR-021 (no event below persisted byte_offset) when an `activity` debounce window is interrupted by a restart? [Consistency, Spec §FR-014, FR-015, FR-021] +- [ ] CHK019 Does the FR-029 (JSONL degraded retry) pattern match the FR-040 (SQLite degraded retry) pattern in terms of operator-visible signal and clear-condition semantics? [Consistency, Spec §FR-029, FR-040] + +## Acceptance Criteria Quality + +- [ ] CHK020 Are SC-003's "10 consecutive daemon restarts with no intervening log writes" reproducibility conditions documented (clock injection, deterministic offsets)? [Measurability, Spec §SC-003] +- [ ] CHK021 Are SC-004 and SC-005 distinguishable by test assertion alone (truncation vs recreation)? [Measurability, Spec §SC-004, SC-005] +- [ ] CHK022 Is SC-006's "100% of test iterations across 100 runs" achievable without an explicit randomness/clock-control strategy in the requirements? [Measurability, Spec §SC-006] +- [ ] CHK023 Are observable success criteria defined for FR-040 buffered-retry behavior (e.g., events appear after recovery within N cycles)? [Acceptance Criteria, Gap] + +## Scenario Coverage + +- [ ] CHK024 Are requirements specified for daemon-stop after byte read but before SQLite commit? [Coverage, Spec §US3 AS2] +- [ ] CHK025 Are requirements specified for daemon-stop mid-batch (some events committed, others not, single transaction model)? [Coverage, Spec §FR-006] +- [ ] CHK026 Are requirements defined for daemon-stop while events are sitting in the FR-040 in-memory degraded buffer? 
[Coverage, Gap] +- [ ] CHK027 Are requirements specified for the case where SQLite recovers but JSONL is still failing (both degraded paths active simultaneously)? [Coverage, Spec §FR-029, FR-040] + +## Edge Case Coverage + +- [ ] CHK028 Is the empty-log post-restart case (no bytes appended while down) explicitly required to produce zero events? [Edge Case, Spec §US3 AS1] +- [ ] CHK029 Is the "bytes appended while daemon is down" case explicitly required to NOT replay any pre-restart events? [Edge Case, Spec §US3 AS3] +- [ ] CHK030 Are requirements defined for SQLite WAL recovery on restart (e.g., trust the WAL, no manual replay logic)? [Edge Case, Gap] +- [ ] CHK031 Are requirements specified for JSONL truncation, corruption, or partial last-line between restarts? [Edge Case, Gap] +- [ ] CHK032 Is the "same byte sequence in distinct cycles" deduplication scenario covered as a normative requirement, not just an edge-case note? [Edge Case, Spec §Edge Cases, US3 AS2] +- [ ] CHK033 Are requirements documented for clock skew across restart (e.g., system time moves backwards; `observed_at` ordering implications)? [Edge Case, Gap] + +## Non-Functional Requirements + +- [ ] CHK034 Is the per-cycle wall-clock cap (≤ 1 second) bounded under high-throughput load (≤ 50 agents at upper-bound write rates)? [NFR, Spec §FR-001] +- [ ] CHK035 Is the in-memory FR-040 buffer's worst-case memory footprint bounded by an observable cap (e.g., per-cycle byte cap × N agents)? [NFR, Spec §FR-040] +- [ ] CHK036 Are concurrency requirements specified for `events` SQLite reads while a writer cycle is mid-commit (snapshot isolation expectations)? [NFR, Spec §Edge Cases] + +## Dependencies & Assumptions + +- [ ] CHK037 Is the assumption of SQLite read-after-write consistency within a single transaction documented? [Assumption, Gap] +- [ ] CHK038 Is the assumption that the daemon process clock is monotonic enough for `observed_at` ordering documented? 
[Assumption, Spec §Assumptions] +- [ ] CHK039 Are dependencies on FEAT-007's `reader_cycle_offset_recovery` semantics version-pinned to a specific helper API surface? [Dependency, Spec §FR-002, FR-041] + +## Ambiguities & Conflicts + +- [ ] CHK040 Is "next cycle" in FR-040 unambiguous when a degraded mode persists across many cycles (does the buffer accumulate or stop reading)? [Ambiguity, Spec §FR-040] diff --git a/specs/008-event-ingestion-follow/checklists/requirements.md b/specs/008-event-ingestion-follow/checklists/requirements.md new file mode 100644 index 0000000..d83f77d --- /dev/null +++ b/specs/008-event-ingestion-follow/checklists/requirements.md @@ -0,0 +1,44 @@ +# Specification Quality Checklist: Event Ingestion, Classification, and Follow CLI + +**Purpose**: Validate specification completeness and quality before proceeding to planning +**Created**: 2026-05-09 +**Feature**: [spec.md](../spec.md) + +## Content Quality + +- [x] No implementation details (languages, frameworks, APIs) +- [x] Focused on user value and business needs +- [x] Written for non-technical stakeholders +- [x] All mandatory sections completed + +## Requirement Completeness + +- [x] No [NEEDS CLARIFICATION] markers remain +- [x] Requirements are testable and unambiguous +- [x] Success criteria are measurable +- [x] Success criteria are technology-agnostic (no implementation details) +- [x] All acceptance scenarios are defined +- [x] Edge cases are identified +- [x] Scope is clearly bounded +- [x] Dependencies and assumptions identified + +## Feature Readiness + +- [x] All functional requirements have clear acceptance criteria +- [x] User scenarios cover primary flows +- [x] Feature meets measurable outcomes defined in Success Criteria +- [x] No implementation details leak into specification + +## Notes + +- Items marked incomplete require spec updates before `/speckit.clarify` or `/speckit.plan`. 
+- The "Integration Contracts with FEAT-007" requirement block (FR-041..FR-044) + names specific helper functions + (`agenttower.logs.reader_recovery.reader_cycle_offset_recovery`, + `agenttower.state.log_offsets.detect_file_change`, + `agenttower.state.log_offsets.advance_offset_for_test` test seam) by symbol + path. This is intentional carry-over from FEAT-007 per + `docs/mvp-feature-sequence.md` and the user request, not stray implementation detail; FEAT-007 + shipped the helpers and unit coverage and required FEAT-008 to consume them + unchanged. Reviewers should evaluate FR-041..FR-044 as integration contracts, + not as design directives. diff --git a/specs/008-event-ingestion-follow/checklists/security.md b/specs/008-event-ingestion-follow/checklists/security.md new file mode 100644 index 0000000..d8870a8 --- /dev/null +++ b/specs/008-event-ingestion-follow/checklists/security.md @@ -0,0 +1,70 @@ +# Security Requirements Checklist: Event Ingestion, Classification, and Follow CLI + +**Purpose**: Validate that security, redaction, authentication, error-message-leakage, and DoS-resistance requirements are complete, clear, consistent, and measurable. This checklist tests the **requirements writing**, not the implementation. +**Created**: 2026-05-10 +**Feature**: [spec.md](../spec.md) +**Depth**: Formal release gate + +## Requirement Completeness + +- [ ] CHK001 Are redaction obligations specified for every durable surface (SQLite excerpt, JSONL excerpt, CLI human output, CLI `--json` output, `agenttower status` degraded surface)? [Completeness, Spec §FR-012] +- [ ] CHK002 Are peer-uid authentication requirements specified for the four new socket methods (`events.list`, `events.follow_open`, `events.follow_next`, `events.follow_close`)? [Completeness, Gap] +- [ ] CHK003 Are file-mode requirements (0o600 / 0o700) re-stated as binding for FEAT-008 additions, or is inheritance from FEAT-001 documented as an explicit precondition?
[Completeness, Gap] +- [ ] CHK004 Are requirements specified for the events SQLite table's access path being daemon-only (no direct file read by clients)? [Completeness, Gap] +- [ ] CHK005 Are requirements specified for redacting any log content that might appear in the `events_persistence.degraded_*` error fields surfaced by `agenttower status`? [Completeness, Gap] +- [ ] CHK006 Are session_id unguessability requirements specified (entropy source, length, no monotonic component)? [Completeness, Gap] +- [ ] CHK007 Are requirements specified for what the `agent_not_found` error message MAY include vs MUST NOT include (e.g., must NOT echo back unrelated agent ids; must NOT include log paths)? [Completeness, Gap] +- [ ] CHK008 Are requirements specified for path-traversal validation of stored `log_path` values, or is this delegated to FEAT-007 with an explicit reference? [Completeness, Gap] +- [ ] CHK009 Are CLI input-validation obligations (`agent_id` shape, `--type` enum, ISO-8601, `--limit` bounds) explicitly required as client-side enforcement before daemon dispatch? [Completeness, Spec §FR-035a] +- [ ] CHK010 Are requirements specified for the test-seam production guard so `AGENTTOWER_TEST_*_FAKE` cannot be honored by a production daemon? [Completeness, Gap] +- [ ] CHK011 Is there a requirement that the `excerpt` field cannot exceed `per_event_excerpt_cap_bytes` regardless of input size (DoS via long lines)? [Completeness, Spec §FR-019, Edge Cases] + +## Requirement Clarity + +- [ ] CHK012 Is "redacted excerpt" defined unambiguously at the spec level (which patterns; which utility version)? [Clarity, Spec §FR-012] +- [ ] CHK013 Is "redaction runs before truncation" precise enough to be testable for a secret pattern split exactly at the cap boundary? [Clarity, Spec §Edge Cases] +- [ ] CHK014 Is "opaque at the CLI boundary" defined operationally for cursors (e.g., cannot be hand-derived without a daemon round-trip)? 
[Clarity, Spec §FR-030, Clarifications] +- [ ] CHK015 Is "closed-set `agent_not_found` error" precisely defined as a stable string identifier with documented payload shape? [Clarity, Spec §FR-035a] +- [ ] CHK016 Is "in-memory buffer" in FR-040 explicit about the buffer NOT being written to disk and NOT spanning daemon restart? [Clarity, Spec §FR-040, Clarifications] +- [ ] CHK017 Is the `classifier_rule_id` pattern required to be ASCII-only and matchable to a closed registry (preventing rule-id injection from log content)? [Clarity, Gap] + +## Requirement Consistency + +- [ ] CHK018 Are redaction requirements consistent between FR-012 (classifier emits redacted excerpt) and FR-027 (JSONL stable schema's excerpt field) — i.e., the same redaction utility runs at the same point? [Consistency, Spec §FR-012, FR-027] +- [ ] CHK019 Do the FEAT-001 file-mode contracts (0o600 / 0o700) apply transitively to FEAT-008 additions without explicit re-statement, or is FEAT-008's reuse of the path explicit enough? [Consistency, Gap] +- [ ] CHK020 Is peer-uid mismatch handling consistent between the new `events.*` methods and the existing FEAT-002 socket methods (same `socket_peer_uid_mismatch` lifecycle event)? [Consistency, Gap] +- [ ] CHK021 Is the closed-set error code naming consistent with FEAT-002 conventions (snake_case, no leading underscore, lowercase only)? [Consistency, Spec §FR-035a] +- [ ] CHK022 Are redaction-before-truncation guarantees consistent between the byte-cap split case (Edge Cases) and the excerpt-cap truncation case (Edge Cases)? [Consistency, Spec §Edge Cases] + +## Acceptance Criteria Quality + +- [ ] CHK023 Are SC items measurable for redaction guarantees (e.g., "100% of secret patterns in fixture set produce a redacted excerpt")? [Measurability, Gap] +- [ ] CHK024 Is SC-009 (FEAT-007 lifecycle classes do not appear in FEAT-008 events stream) strong enough to prevent log_path leakage via lifecycle event excerpts, or is a stronger requirement needed? 
[Acceptance Criteria, Spec §SC-009] +- [ ] CHK025 Are acceptance criteria specified for the maximum size of an error message body (DoS via crafted error text)? [Measurability, Gap] +- [ ] CHK026 Are acceptance criteria specified for the redaction fixture set's coverage (which categories of secret are tested: JWT, env-var-style, API key, password, generic high-entropy)? [Measurability, Gap] + +## Scenario Coverage + +- [ ] CHK027 Are requirements defined for secret patterns SPLIT across the per-cycle byte cap (first half on cycle N, second half on cycle N+1)? [Coverage, Gap] +- [ ] CHK028 Are requirements defined for secret patterns SPLIT across the per-event excerpt cap (redaction runs before truncation, but is the post-truncate marker boundary itself secret-safe)? [Coverage, Spec §Edge Cases] +- [ ] CHK029 Are requirements defined for the case where a redacted secret appears in an `activity` debounce window's collapsed `latest_excerpt`? [Coverage, Gap] +- [ ] CHK030 Are requirements specified for follow-session isolation between concurrent operators (one operator's `session_id` cannot subscribe to another operator's filter without re-auth)? [Coverage, Gap] +- [ ] CHK031 Are requirements specified for the case where the daemon emits an `agent_not_found` error for an `agent_id` shape-validated client-side (no echo of attacker-controlled value beyond bounded length)? [Coverage, Gap] + +## Edge Case Coverage + +- [ ] CHK032 Is the case "ANSI escape sequences in excerpts that could affect human-output rendering or be interpreted by downstream terminals" addressed (e.g., must be stripped or escaped in human output)? [Edge Case, Gap] +- [ ] CHK033 Is the case "log_path contains characters that need quoting for log aggregators / shell evaluation" addressed at the JSONL/CLI output boundary? 
[Edge Case, Gap] +- [ ] CHK034 Is `session_id` collision avoidance defined under the birthday-paradox bound for the configured 12-hex shape (16^12 ≈ 2.8 × 10^14 combinations; collision probability at 50 sessions is negligible, but is this called out)? [Edge Case, Gap] +- [ ] CHK035 Are requirements specified for the boundary case where the FEAT-007 redaction utility is updated to redact a NEW pattern AFTER FEAT-008 events have already persisted that pattern unredacted (no retroactive redaction in MVP)? [Edge Case, Gap] +- [ ] CHK036 Is the case "operator pipes `events --json` into a downstream JSON parser and an excerpt contains crafted JSON-injection characters" addressed (the JSON encoder handles it — is this written as a requirement)? [Edge Case, Gap] + +## Non-Functional Requirements + +- [ ] CHK037 Are ReDoS-protection requirements specified for the classifier regex catalogue (e.g., no nested-quantifier patterns; pinned at the contract level)? [NFR, Spec §FR-007] +- [ ] CHK038 Are memory-exhaustion bounds specified for the per-cycle buffer AND the FR-040 in-memory degraded buffer (both bounded by `per_cycle_byte_cap_bytes`)? [NFR, Spec §FR-019, FR-040] +- [ ] CHK039 Are requirements specified for the maximum number of concurrent follow sessions (DoS via session-creation flood)? [NFR, Gap] + +## Dependencies & Assumptions + +- [ ] CHK040 Is the dependency on FEAT-007's `redact_one_line` version-pinned for stable security guarantees, AND is the assumption that local Unix socket peer-uid is sufficient authentication explicitly documented (not just implied by FEAT-002 inheritance)?
[Dependency / Assumption, Spec §FR-012, Gap] diff --git a/specs/008-event-ingestion-follow/contracts/classifier-catalogue.md b/specs/008-event-ingestion-follow/contracts/classifier-catalogue.md new file mode 100644 index 0000000..b47a416 --- /dev/null +++ b/specs/008-event-ingestion-follow/contracts/classifier-catalogue.md @@ -0,0 +1,170 @@ +# Classifier Rule Catalogue Contract + +**Branch**: `008-event-ingestion-follow` | **Date**: 2026-05-10 +**Plan**: [../plan.md](../plan.md) | **Spec**: [../spec.md](../spec.md) + +The MVP rule catalogue is closed (FR-007). Each rule is defined by: + +- `rule_id`: stable, dotted, with `vN` suffix (e.g., + `error.traceback.v1`). +- `event_type`: one of the ten closed-set values. +- `priority`: integer, lower = higher priority. Walked in ascending + order; first match wins (FR-008 deterministic tie-break). +- `matcher`: a compiled `re.Pattern` against ONE complete record + (post-redaction). +- `extract`: optional callable producing structured fields (only + used by `swarm_member.v1` to record parsed parent/pane/label etc. + in tests; not exposed in MVP JSONL). + +`pane_exited` and `long_running` are synthesized by the reader and +do NOT appear in the matcher catalogue (`research.md` §R11). Their +synthetic rule ids (`pane_exited.synth.v1`, `long_running.synth.v1`) +appear in `events.classifier_rules` output for completeness but are +returned under a separate `synthetic_rule_ids` array. + +--- + +## Catalogue + +| # | priority | `rule_id` | `event_type` | Matcher (post-redaction, ASCII flag where noted) | +|---|---:|---|---|---| +| 1 | 10 | `swarm_member.v1` | `swarm_member_reported` | `^AGENTTOWER_SWARM_MEMBER parent=(?Pagt_[0-9a-f]{12}) pane=(?P%[0-9]+) label=(?P