Merged
2 changes: 2 additions & 0 deletions .gitignore
@@ -17,3 +17,5 @@ env/
dist/
build/
*.egg-info/

.claude/scheduled_tasks.lock
2 changes: 1 addition & 1 deletion .specify/feature.json
@@ -1,3 +1,3 @@
{
- "feature_directory": "specs/007-log-attachment-offsets"
+ "feature_directory": "specs/008-event-ingestion-follow"
}
2 changes: 1 addition & 1 deletion CLAUDE.md
@@ -1,7 +1,7 @@
<!-- SPECKIT START -->
For additional context about technologies to be used, project structure,
shell commands, and other important information, read the current plan:
- `specs/007-log-attachment-offsets/plan.md`.
+ `specs/008-event-ingestion-follow/plan.md`.
<!-- SPECKIT END -->

# AgentTower Agent Context
44 changes: 44 additions & 0 deletions docs/architecture.md
@@ -108,6 +108,44 @@ Opensoft namespace:
Bench containers mount the daemon socket and, preferably, the state log
directory. No network listener is required for MVP.

### 4.1 UID-mapping invariant for bench containers

The host daemon reads pane log files written by in-container `tmux
pipe-pane` processes via the bind mount. For the daemon's read to
succeed, **the daemon's UID must be able to read files owned by the
UID that wrote the log file**. In practice the host UID and the
in-container UID must be the same value, or the host daemon must
otherwise have read access to files owned by the mapped UID.

Two supported configurations:

1. **No userns remap** (default): the bench container runs with
`--user $(id -u):$(id -g)` so the in-container process and the host
daemon share the same UID. The daemon reads its own files; SO_PEERCRED
on the mounted socket sees the same UID on both sides. This is the
recommended setup.

2. **Userns remap**: if the operator runs the Docker daemon with
`--userns-remap=<user>` (or uses `userns_mode=host` with a manual
`--user X:Y` that diverges from the host UID), the in-container
process writes log files with the *remapped host UID* (e.g.,
`100000`). The host daemon (running as e.g. UID `1000`) cannot read
these files; the FEAT-008 reader surfaces a per-attachment
`failure_class=PermissionError` (FR-038), but the spec does not call
out the UID mismatch as the root cause.

Operators using userns-remapped containers MUST either:
- Map the in-container UID to the daemon's host UID (e.g., add
`userns-remap` config that rewrites `0` → daemon-UID), OR
- Run the daemon as `root` (NOT recommended — undermines the
local-first least-privilege constitution).

The `agenttower config doctor` command (FEAT-005) currently checks
socket reachability but **does not yet** check log-file readability;
operators encountering EACCES errors should manually verify that the
owner UID reported by
`stat $(ls ~/.local/state/opensoft/agenttower/logs/*/*.log | head -1)`
matches the daemon's UID.
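
Until `config doctor` covers this, the manual check above can be scripted. The following is a minimal sketch, not part of AgentTower: the function name `logs_with_foreign_owner` is hypothetical, and it only compares file ownership, so it does not account for group or ACL based read access.

```python
from pathlib import Path


def logs_with_foreign_owner(log_root: Path, daemon_uid: int) -> list[tuple[Path, int]]:
    """Return (path, owner_uid) for every */*.log file not owned by daemon_uid.

    An ownership mismatch is only a heuristic for the userns-remap case.
    """
    mismatches = []
    for log_file in sorted(log_root.glob("*/*.log")):
        owner_uid = log_file.stat().st_uid
        if owner_uid != daemon_uid:
            mismatches.append((log_file, owner_uid))
    return mismatches
```

Calling it with `Path.home() / ".local/state/opensoft/agenttower/logs"` and `os.getuid()` (run as the daemon's user) lists the files whose owner differs from the daemon's UID.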

## 5. Components

### 5.1 Host Daemon
@@ -414,6 +452,12 @@ Classification is rule-based for MVP. It should be conservative and
transparent. The daemon may emit uncertain events, but it must not turn
uncertain classification into automatic command execution.

The pipeline above is implemented in FEAT-008. See
`specs/008-event-ingestion-follow/plan.md` for the implementation
detail (atomic SQLite + offset commit, FR-029 JSONL retry watermark,
FR-040 buffered-retry on degraded SQLite, FR-014 debounce semantics,
FR-013 / FR-016 synthesized event types).
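
As an illustration of what "atomic SQLite + offset commit" can mean, the sketch below writes events and advances the read offset inside one transaction, so a failure leaves both untouched. It is a sketch under stated assumptions only: the tables `demo_events` and `demo_offsets` and the function `commit_events_and_offset` are hypothetical and are not the project's schema or API.

```python
import sqlite3

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS demo_events (
    attachment_id TEXT NOT NULL,
    event_type    TEXT NOT NULL,
    excerpt       TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS demo_offsets (
    attachment_id TEXT PRIMARY KEY,
    byte_offset   INTEGER NOT NULL
);
"""


def commit_events_and_offset(conn, attachment_id, events, new_offset):
    """Insert events and advance the offset in a single transaction."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.executemany(
            "INSERT INTO demo_events (attachment_id, event_type, excerpt) "
            "VALUES (?, ?, ?)",
            [(attachment_id, e["event_type"], e["excerpt"]) for e in events],
        )
        conn.execute(
            "UPDATE demo_offsets SET byte_offset = ? WHERE attachment_id = ?",
            (new_offset, attachment_id),
        )
```

Because the offset only moves when the event rows are committed with it, a crash between the two statements cannot produce events without an offset advance or the reverse.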

## 14. Routing Model

AgentTower routes structured messages between agents. The main MVP routing path
8 changes: 8 additions & 0 deletions docs/mvp-feature-sequence.md
@@ -226,6 +226,14 @@ Out of scope:

## FEAT-008: Event Ingestion, Classification, and Follow CLI

**Status: implemented.** See
`specs/008-event-ingestion-follow/plan.md` for the implementation
record. Acceptance items below are tested by integration tests under
`tests/integration/test_events_us{1..6}*.py` plus
`test_lifecycle_separation.py`. Carry-over obligations from FEAT-007
(T175 truncation, T176 recreation, T177 round-trip) land in
`test_events_us4_carryover.py`.

Goal: convert pane logs into durable, inspectable AgentTower events.

Build:
9 changes: 8 additions & 1 deletion pyproject.toml
@@ -18,7 +18,14 @@ agenttower = "agenttower.cli:main"
agenttowerd = "agenttower.daemon:main"

[project.optional-dependencies]
- test = ["pytest>=7"]
+ test = [
+ "pytest>=7",
+ # FEAT-008 T005: JSON Schema validator for the FR-027 / FR-032
+ # event schema. Test-only — runtime stays stdlib-only per the plan.
+ # Upper-bound pinned to <5 so a major-version bump can't silently
+ # break CI (review CRIT #2).
+ "jsonschema>=4,<5",
+ ]

[tool.hatch.build.targets.wheel]
packages = ["src/agenttower"]
73 changes: 73 additions & 0 deletions specs/008-event-ingestion-follow/checklists/carryover.md
@@ -0,0 +1,73 @@
# FEAT-007 Carry-Over Integration Requirements Checklist: Event Ingestion, Classification, and Follow CLI

**Purpose**: Validate that the FEAT-007 carry-over obligations (T175/T176/T177, file-change classification, no-replay invariant, lifecycle separation) are complete, clear, consistent, and measurable. This checklist tests the **requirements writing**, not the implementation.
**Created**: 2026-05-10
**Feature**: [spec.md](../spec.md)
**Depth**: Formal release gate

## Requirement Completeness

- [ ] CHK001 Are integration test deliverables (T175 truncation, T176 recreation, T177 missing→recreated→re-attach) explicitly required as in-scope for FEAT-008? [Completeness, Spec §FR-043]
- [ ] CHK002 Is the "no-replay invariant" defined precisely enough to be expressed as an assertion (e.g., "no event whose `byte_range_start` is below the post-reset offset")? [Completeness, Spec §FR-043]
- [ ] CHK003 Is the requirement to call `reader_cycle_offset_recovery` "exactly once per cycle BEFORE reading bytes" testable at the unit level (call-count assertion)? [Completeness, Spec §FR-002, FR-041]
- [ ] CHK004 Are FR-004's prohibitions on production use of `advance_offset_for_test` enforced by an existing AST gate that this feature must continue to pass? [Completeness, Spec §FR-004, SC-008]
- [ ] CHK005 Are requirements covering the file-change classifier obligation (`detect_file_change`) explicit about the prohibition on re-implementation? [Completeness, Spec §FR-042]
- [ ] CHK006 Is the dispatcher mapping (`unchanged | truncated | recreated | missing | reappeared`) referenced in the FEAT-008 reader requirements as the canonical taxonomy? [Completeness, Spec §FR-002, FR-041]
- [ ] CHK007 Is the optional consolidated lifecycle-surface assertion (FR-044) defined with a clear scope vs the dedicated per-class FEAT-007 tests it consolidates? [Completeness, Spec §FR-044]
- [ ] CHK008 Are requirements defined for the audit-row append (`log_attachment_change`) idempotence under retry? [Completeness, Gap]
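
For CHK002, the example wording can be turned into a test helper directly. This is a minimal sketch assuming events expose `byte_range_start`; the `Event` dataclass and the helper names are hypothetical, not the project's types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # Hypothetical event shape; only byte_range_start matters here.
    event_type: str
    byte_range_start: int
    byte_range_end: int


def replayed_events(events_after_reset, post_reset_offset):
    """Return the events that start below the post-reset offset."""
    return [e for e in events_after_reset if e.byte_range_start < post_reset_offset]


def assert_no_replay(events_after_reset, post_reset_offset):
    """Fail if any event emitted after the reset covers pre-reset bytes."""
    offenders = replayed_events(events_after_reset, post_reset_offset)
    assert not offenders, f"replayed pre-reset bytes: {offenders}"
```

Returning the offenders separately keeps the failure message specific about which events violated the invariant.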

## Requirement Clarity

- [ ] CHK009 Is "≤ 1 reader cycle (≤ 1 s wall-clock at MVP scale)" measurable with a deterministic injected test clock (no real-time sleeps in tests)? [Clarity, Spec §FR-043]
- [ ] CHK010 Is "no durable event whose excerpt comes from pre-reset bytes" precisely defined for the truncate-then-write-same-bytes case (excerpt-content vs source-byte-range distinction)? [Clarity, Spec §FR-043, US4 AS1]
- [ ] CHK011 Is "operator-explicit re-attach" distinguished from automatic recovery in the requirements (which path applies in which scenario)? [Clarity, Spec §US4 AS5]
- [ ] CHK012 Is "delegating to `reader_cycle_offset_recovery`" in FR-023 specific enough to prevent the reader from inlining or duplicating the helper's logic? [Ambiguity, Spec §FR-023]

## Requirement Consistency

- [ ] CHK013 Does FR-041's helper-ownership claim agree with FR-003's prohibition on direct `log_attachments` / `log_offsets` row mutation? [Consistency, Spec §FR-003, FR-041]
- [ ] CHK014 Are the row-status transitions referenced in US4 AS3/AS4 consistent with FEAT-007's documented `active → stale → active` state machine? [Consistency, Spec §US4]
- [ ] CHK015 Is the `log_rotation_detected` / `log_file_missing` / `log_file_returned` lifecycle separation in FR-026 consistent with US4's per-scenario "exactly one" emission counts? [Consistency, Spec §FR-026, US4]
- [ ] CHK016 Is the "(suppression-keyed by `(agent_id, log_path, file_inode)`)" rule in US4 AS4 consistent with FEAT-007's documented suppression key shape (FR-061 reference)? [Consistency, Spec §US4 AS4, FR-041]

## Acceptance Criteria Quality

- [ ] CHK017 Is SC-004's "T175 promoted to FEAT-008 integration coverage" tied to a specific test-file path or naming convention so it can be located? [Measurability, Spec §SC-004]
- [ ] CHK018 Is SC-006's "100 runs of the integration test" reproducibility assured by a documented seed, clock-injection, or fixed-fixture strategy? [Measurability, Spec §SC-006]
- [ ] CHK019 Are timing assertions for ≤ 1 reader cycle measurable without flakiness on slow CI runners (e.g., logical-clock model rather than wall-clock)? [Measurability, Gap]
- [ ] CHK020 Is the assertion "exactly one `log_rotation_detected` lifecycle event" in US4 AS1/AS2 measurable against a deterministic lifecycle-event sink? [Measurability, Spec §US4 AS1, AS2]
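
CHK009 and CHK019 both point toward a logical clock rather than wall-clock sleeps. One way to make "≤ 1 reader cycle" measurable is to count cycles against a manually advanced clock, as in this sketch. `LogicalClock` and `cycles_until` are hypothetical names, and the 1 s cycle length is taken from the "≤ 1 s wall-clock at MVP scale" wording.

```python
class LogicalClock:
    """Manually advanced clock; never reads real time."""

    def __init__(self, start: float = 0.0) -> None:
        self._now = start

    def now(self) -> float:
        return self._now

    def advance(self, seconds: float) -> None:
        self._now += seconds


def cycles_until(predicate, run_cycle, clock, cycle_seconds=1.0, max_cycles=10):
    """Run reader cycles until predicate() is true; return the cycles used."""
    for cycle in range(1, max_cycles + 1):
        clock.advance(cycle_seconds)  # logical time only, no sleep
        run_cycle()
        if predicate():
            return cycle
    raise AssertionError(f"condition not met within {max_cycles} cycles")
```

A test can then assert `cycles_until(...) == 1`, which holds or fails identically on fast and slow CI runners.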

## Scenario Coverage

- [ ] CHK021 Is the "missing→reappear→re-attach" round-trip required to be covered as a single end-to-end test, not three independent tests? [Coverage, Spec §FR-043, US4 AS5]
- [ ] CHK022 Are requirements specified for a reader that observes RECREATED in the same cycle as a pending event from the previous (now-truncated) inode? [Coverage, Gap]
- [ ] CHK023 Are requirements specified for the case where re-attach succeeds but the file is missing again before the next cycle? [Coverage, Gap]
- [ ] CHK024 Is the deletion → permanent-missing case (no recreation) covered by separate requirements from deletion → recreation? [Coverage, Spec §US4 AS3, AS4]
- [ ] CHK025 Are requirements defined for the no-replay invariant under ALL four file-change kinds (truncated, recreated, missing, reappeared), not just truncated/recreated? [Coverage, Spec §FR-043]

## Edge Case Coverage

- [ ] CHK026 Is the case "inode reuse within a short window" (OS-level inode recycling) addressed by the file-change classifier requirements? [Edge Case, Gap]
- [ ] CHK027 Is the case "file size returns to identical pre-truncate value with new bytes" (size-only check would miss this) addressed? [Edge Case, Gap]
- [ ] CHK028 Is the case "MISSING followed by REAPPEARED in adjacent cycles before any operator action" required to emit no durable event? [Edge Case, Spec §Edge Cases]
- [ ] CHK029 Is `log_file_returned` suppression-keyed by `(agent_id, log_path, file_inode)` enforced for the duration of a single stale period (does the same key fire again across stale → active → stale cycles)? [Edge Case, Spec §US4 AS4]
- [ ] CHK030 Are requirements defined for the case where FEAT-007 and FEAT-008 disagree on the row's expected state at cycle entry (defensive read)? [Edge Case, Gap]
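
The question in CHK029 is easier to discuss against a concrete model. The sketch below keys suppression on `(agent_id, log_path, file_inode)` and assumes, purely for illustration, that keys are cleared when a stale period ends; whether that reset should happen is exactly what CHK029 leaves open. The class and method names are hypothetical.

```python
class ReturnedEventSuppressor:
    """Tracks which (agent_id, log_path, file_inode) keys already fired."""

    def __init__(self) -> None:
        self._emitted: set[tuple[str, str, int]] = set()

    def should_emit(self, agent_id: str, log_path: str, file_inode: int) -> bool:
        """Return True the first time a key is seen, False afterwards."""
        key = (agent_id, log_path, file_inode)
        if key in self._emitted:
            return False
        self._emitted.add(key)
        return True

    def end_stale_period(self, agent_id: str, log_path: str) -> None:
        """Assumed reset point: forget keys for this attachment."""
        self._emitted = {k for k in self._emitted if k[:2] != (agent_id, log_path)}
```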

## Non-Functional Requirements

- [ ] CHK031 Is the test-runtime budget for the round-trip integration tests bounded (so SC-006's 100 runs is feasible on CI)? [NFR, Gap]
- [ ] CHK032 Are flake-rate budgets for SC-006's 100 runs documented (e.g., 0% flake target)? [NFR, Gap]

## Dependencies & Assumptions

- [ ] CHK033 Is the version-pin of FEAT-007's `reader_recovery` API surface documented (so a FEAT-007 patch cannot silently change FEAT-008 behavior)? [Dependency, Spec §FR-041]
- [ ] CHK034 Is the assumption that FEAT-007 lifecycle suppression (FR-061 reference) is in place documented as an explicit precondition? [Assumption, Spec §FR-041]
- [ ] CHK035 Is the dependency on FEAT-007's audit-row append idempotence documented? [Dependency, Gap]

## Ambiguities & Conflicts

- [ ] CHK036 Could FR-044's "MAY add a single integration test" leave the FR-026 lifecycle-separation requirement under-tested if the feature opts not to add the test? [Conflict, Spec §FR-026, FR-044]
- [ ] CHK037 Is "the same byte sequence appears twice in distinct cycles" (Edge Cases) the same scenario as US3 AS2, or a different scenario? [Ambiguity, Spec §Edge Cases, US3 AS2]
- [ ] CHK038 Is FR-018's "if the same pane id reappears later... it counts as a new lifecycle once the attachment is re-bound" precisely defined in terms of which event triggers the new lifecycle counter? [Ambiguity, Spec §FR-018]
- [ ] CHK039 Is "at most one reader cycle" (US4 AS1/AS2) consistent with the sometimes-stricter "one reader cycle" used elsewhere (US4 AS3, AS4)? [Ambiguity, Spec §US4]
- [ ] CHK040 Is the FR-043 "no-replay invariant" requirement scoped to the test suite alone, or also a normative reader-behavior requirement (would the bug be caught outside the named tests)? [Ambiguity, Spec §FR-043]
70 changes: 70 additions & 0 deletions specs/008-event-ingestion-follow/checklists/classifier.md
@@ -0,0 +1,70 @@
# Classifier Rule Catalogue Requirements Checklist: Event Ingestion, Classification, and Follow CLI

**Purpose**: Validate that classifier rule, redaction, and debounce requirements are complete, clear, consistent, and measurable. This checklist tests the **requirements writing**, not the implementation.
**Created**: 2026-05-10
**Feature**: [spec.md](../spec.md)
**Depth**: Formal release gate

## Requirement Completeness

- [ ] CHK001 Are all 10 event types in the FR-008 catalogue required to have at least one matching rule documented in the rule catalogue? [Completeness, Spec §FR-008]
- [ ] CHK002 Is the rule-priority order required to be expressed in a deterministic, testable form (priority table or ordered list, not prose)? [Completeness, Spec §FR-008, Spec §Edge Cases]
- [ ] CHK003 Is `long_running` eligibility required to be defined as a complete state-transition table (which prior `event_type` values qualify, which do not)? [Completeness, Spec §FR-013]
- [ ] CHK004 Are explicit rules required for what triggers `completed`? [Gap, Spec §FR-008]
- [ ] CHK005 Are explicit rules required for what triggers `waiting_for_input`? [Gap, Spec §FR-008]
- [ ] CHK006 Are explicit rules required for what triggers `manual_review_needed`? [Gap, Spec §FR-008]
- [ ] CHK007 Are explicit rules required for distinguishing `error` from `test_failed` (separate matchers, no overlap by construction)? [Completeness, Spec §FR-008]
- [ ] CHK008 Are explicit rules required for what triggers `test_passed`? [Gap, Spec §FR-008]
- [ ] CHK009 Is the integration with the FEAT-007 redaction utility required at the rule level (every rule emits a redacted excerpt, never the raw bytes)? [Completeness, Spec §FR-012]
- [ ] CHK010 Are the per-attachment "last output at" data lifecycle requirements (initialization, update, reset on restart) specified? [Completeness, Spec §FR-013, FR-015]
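
CHK002 and CHK007 ask for a deterministic priority order with `test_failed` and `error` separated. A minimal sketch of that shape is an ordered rule tuple where the first match wins and anything unmatched falls back to `activity`. The regex patterns and rule ids below are invented for illustration and are not the FR-008 catalogue.

```python
import re
from typing import NamedTuple


class Rule(NamedTuple):
    rule_id: str
    event_type: str
    pattern: re.Pattern


# Order is the priority: earlier rules win. Patterns are hypothetical.
RULES = (
    Rule("test_failed.v1", "test_failed", re.compile(r"\b\d+ failed\b")),
    Rule("error.v1", "error", re.compile(r"\b(?:Error|Traceback)\b")),
    Rule("test_passed.v1", "test_passed", re.compile(r"\b\d+ passed\b")),
)


def classify(line: str) -> tuple[str, str]:
    """Return (event_type, classifier_rule_id) for one complete record."""
    for rule in RULES:
        if rule.pattern.search(line):
            return rule.event_type, rule.rule_id
    # Conservative default: unmatched or ambiguous lines are plain activity.
    return "activity", "activity.default.v1"
```

Keeping the order in data rather than in branching code makes the priority table itself something a test can inspect.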

## Requirement Clarity

- [ ] CHK011 Is "rule-based only" measurable in code (e.g., a regex/matcher list with no learned components, no network calls)? [Clarity, Spec §FR-007]
- [ ] CHK012 Is "conservative" defined operationally beyond "default to activity" (e.g., a precise decision rule for ambiguity)? [Clarity, Spec §FR-011]
- [ ] CHK013 Is "ongoing work following waiting_for_input is ineligible for long_running" precisely defined for the eligibility table? [Clarity, Spec §FR-013]
- [ ] CHK014 Is the `swarm_member_reported` regex shape exhaustive on whitespace, key ordering, escaping, and quoting? [Clarity, Spec §FR-009]
- [ ] CHK015 Is "redaction runs before truncation" clear about which redactor and which truncation marker apply? [Clarity, Spec §Edge Cases]
- [ ] CHK016 Is "exactly one event per debounce window" unambiguous about which record's excerpt is preserved (latest? first? configurable?)? [Clarity, Spec §FR-014]

## Requirement Consistency

- [ ] CHK017 Are the 10 event types in FR-008 the same set referenced by FR-014's collapse-eligible / one-to-one classification? [Consistency, Spec §FR-008, FR-014]
- [ ] CHK018 Does FR-009's strict-parse rule (malformed → `activity`) align with FR-011's conservative-default rule (ambiguous → `activity`) without rule overlap? [Consistency, Spec §FR-009, FR-011]
- [ ] CHK019 Are `classifier_rule_id` values required to be stable across the catalogue and consistent between SQLite and JSONL output? [Consistency, Spec §FR-027]
- [ ] CHK020 Are `pane_exited` requirements consistent between FR-016 (must be inferred from FEAT-004 state + grace), FR-017 (grace window), and FR-018 (one per lifecycle)? [Consistency, Spec §FR-016, FR-017, FR-018]

## Acceptance Criteria Quality

- [ ] CHK021 Is SC-007's "100% accuracy on every fixture line" measurable against a documented, version-pinned fixture set? [Measurability, Spec §SC-007]
- [ ] CHK022 Are negative test fixtures required (lines that MUST NOT classify as a domain-specific type)? [Acceptance Criteria, Gap]
- [ ] CHK023 Is "documented ambiguous line" defined with explicit fixture entries rather than leaving "ambiguous" up to test author judgment? [Measurability, Spec §SC-007]
- [ ] CHK024 Are acceptance criteria defined for the `debounce` object's `window_id`, `collapsed_count`, `window_started_at`, `window_ended_at` fields' shape and population rules? [Acceptance Criteria, Spec §FR-027]
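
To make CHK016 and CHK024 concrete, the sketch below collapses same-type records into one event per window and fills the four `debounce` fields. It assumes the latest excerpt is the one preserved and that `window_id` is a simple counter; both are illustrative choices that the spec questions above leave open, and `collapse` is a hypothetical name.

```python
def collapse(records, window_seconds):
    """Collapse (timestamp, excerpt) records into one event per window.

    records must be sorted by timestamp and share one collapse-eligible
    event type. The latest excerpt in each window is kept (an assumption).
    """
    events = []
    for ts, excerpt in records:
        current = events[-1] if events else None
        if current and ts - current["debounce"]["window_started_at"] < window_seconds:
            # Same window: overwrite the excerpt and extend the window.
            current["excerpt"] = excerpt
            current["debounce"]["collapsed_count"] += 1
            current["debounce"]["window_ended_at"] = ts
        else:
            # New window: start a fresh event.
            events.append({
                "excerpt": excerpt,
                "debounce": {
                    "window_id": len(events) + 1,
                    "collapsed_count": 1,
                    "window_started_at": ts,
                    "window_ended_at": ts,
                },
            })
    return events
```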

## Scenario Coverage

- [ ] CHK025 Are requirements defined for multi-line records (continuation lines, line continuations from shells)? [Coverage, Gap]
- [ ] CHK026 Are requirements defined for ANSI escape sequences in lines (color codes, cursor motion, OSC sequences)? [Coverage, Gap]
- [ ] CHK027 Are requirements defined for rule matching across the per-cycle byte-cap (FR-019) truncation boundary? [Coverage, Spec §FR-019, Edge Cases]
- [ ] CHK028 Are requirements specified for protecting against catastrophic regex backtracking (ReDoS)? [Coverage, Gap, NFR]
- [ ] CHK029 Are requirements specified for the case where a classifier rule depends on prior reader-state (e.g., `long_running`) and that state is unavailable on first cycle? [Coverage, Spec §FR-013, FR-015]

## Edge Case Coverage

- [ ] CHK030 Is overlap between `error` and `test_failed` resolved by a documented priority order in the spec, not just the catalogue? [Edge Case, Spec §FR-008, Edge Cases]
- [ ] CHK031 Is the malformed-`AGENTTOWER_SWARM_MEMBER` case explicitly required to fall through to `activity`, not silently dropped? [Edge Case, Spec §FR-009]
- [ ] CHK032 Is `pane_exited` required to NOT be emitted when the log text mentions "exited" or similar (FR-016: "MUST NOT be inferred from log text alone")? [Edge Case, Spec §FR-016]
- [ ] CHK033 Is the case "rule matches partial trailing bytes" excluded by FR-005's complete-record rule? [Edge Case, Spec §FR-005]
- [ ] CHK034 Are debounce-collapse semantics defined when the latest record's excerpt is empty or whitespace-only? [Edge Case, Spec §FR-014]
- [ ] CHK035 Is the "secret pattern split across the truncation boundary" redaction guarantee documented as a hard requirement, not best-effort? [Edge Case, Spec §Edge Cases]

## Non-Functional Requirements

- [ ] CHK036 Are classifier latency requirements quantified per record (e.g., a per-record budget that fits within the cycle wall-clock cap at upper-bound throughput)? [NFR, Gap]
- [ ] CHK037 Are FR-010's pure-function purity requirements verifiable by static analysis or property test (no I/O, no clock reads inside the rule fn)? [NFR, Spec §FR-010]
- [ ] CHK038 Are memory bounds defined for per-attachment classifier state (last-output-at, debounce window) across the upper-bound 50-agent scale? [NFR, Gap]

## Dependencies & Assumptions

- [ ] CHK039 Is the dependency on FEAT-004 pane discovery for `pane_exited` documented and version-pinned? [Dependency, Spec §FR-016]
- [ ] CHK040 Is the assumption that `\n` is the record boundary documented (FR-005, Assumptions) AND the classifier explicitly required to break on it? [Assumption, Spec §FR-005, Assumptions]
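
The complete-record rule behind CHK033 and CHK040 can be sketched as a split that hands only newline-terminated records to the classifier and holds back any partial tail for the next cycle. The function names are hypothetical; only the `\n` record boundary comes from the spec.

```python
def split_complete_records(buffer: bytes) -> tuple[list[bytes], bytes]:
    """Split on the newline byte; return (complete_records, partial_tail).

    The tail is whatever follows the last newline (possibly empty) and
    must not be classified until its terminating newline arrives.
    """
    *complete, tail = buffer.split(b"\n")
    return complete, tail


def consumed_bytes(buffer: bytes) -> int:
    """Number of bytes the offset may advance by: everything but the tail."""
    _, tail = split_complete_records(buffer)
    return len(buffer) - len(tail)
```

Advancing the offset by `consumed_bytes` rather than `len(buffer)` means the partial tail is re-read, complete, on a later cycle.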