feat(daemon): enforce Slack reply before clean exit (silent-exit gate) #68
Merged
Detect Slack-triggered sessions that exit cleanly but never call chat.postMessage / chat.update. Treat as failure → spawn a retry whose job is to read the previous session's audit-log slice and post the summary.

Today (2026-05-24) two sessions opened PRs + filed Linear issues but never posted to the thread:
- f31a21c5 (opencode-tools-integration): 31 tool calls, PR #66, 0 posts
- 624e27ec (opus-thread-budget-review): 59 tool calls, consult_opus + PR #67, 0 posts

Both journals claimed Sam posted. Neither did. The ✅ reaction fired anyway because it's keyed on exit_code=0, not on 'did the operator actually get a reply.' Operator's framing: 'it gave green check' but no Slack reply.

Detection:
- _classify_tool_use now returns a 5th value, slack_posted, set when any bash call contains 'chat.postMessage' or 'chat.update' (curl or python3 -c both register; setStatus / reactions.add / conversations.replies / other housekeeping endpoints do NOT).
- SessionResult.slack_posted carries the flag through.
- SessionResult.silent_exit(message=...) returns True when: (a) session not failed, (b) not scheduled, (c) not retry, (d) slack_posted is False.

Enforcement:
- Daemon's _worker treats silent_exit the same as failed → spawns retry with retry_context={'silent_exit': True, 'previous_session_id'}.
- silent_exit() returns False for retries, preventing recursion.
- If the retry ITSELF silent-exits, operator-alert fires directly (the daemon checks slack_posted on retry_result, bypassing the retry-context guard).

Retry shape:
- New SILENT_EXIT_INTRO / SILENT_EXIT_OUTRO templates in prompts.py.
- _format_silent_exit_message() in IncomingMessage dispatches on retry_context.silent_exit, distinct from the failure-narration retry path.
- The retry agent is told: read /data/tool_calls/<today>.jsonl filtered by previous_session_id, reconstruct what got done, post ONE summary. Audit log is ground truth; do not infer from journal text (the journal is the failure mode this retry exists to surface).

Documentation:
- src/capabilities/slack.md gets a new subsection naming the gate and the structural enforcement. So Sam reads about the rule from its own source on every session.

Tests: 15 new in test_silent_exit.py + 97 existing all pass (112).
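The detection and enforcement rules above can be sketched as a small predicate plus a dispatch function. This is a minimal illustration, not the daemon's code: the class below is a hypothetical stand-in for `SessionResult`, the `scheduled` / `is_retry` keyword arguments replace the real `message=...` parameter, and `next_action` with its string return values is invented for the example.

```python
from dataclasses import dataclass


@dataclass
class SessionResult:
    """Hypothetical stand-in for the daemon's SessionResult (fields assumed)."""
    failed: bool
    slack_posted: bool

    def silent_exit(self, *, scheduled: bool, is_retry: bool) -> bool:
        # True only when all four conditions hold:
        # (a) not failed, (b) not scheduled, (c) not a retry, (d) nothing posted.
        return (
            not self.failed
            and not scheduled
            and not is_retry
            and not self.slack_posted
        )


def next_action(result: SessionResult, *, scheduled: bool, is_retry: bool) -> str:
    """Illustrative worker decision after a session ends."""
    if is_retry:
        # silent_exit() is always False for retries (no recursion), so the
        # flag is checked directly; other retry outcomes are out of scope here.
        return "done" if result.slack_posted else "alert_operator"
    if result.failed or result.silent_exit(scheduled=scheduled, is_retry=False):
        # A silent exit is treated the same as a failure: spawn a retry.
        return "retry"
    return "done"
```

Because the retry branch never calls `silent_exit()`, a retry that also stays silent escalates to the operator instead of spawning another retry.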
Operator caught the hole before merge: the previous gate counted 'at least one chat.postMessage = closed.' That false-passes the ACK-then-work-no-reply pattern: an ACK at index 0, then gh pr create at index 5, then silent exit. Operator sees 'got it' + green check but never hears about the PR.

New rule (timing-based):

closed_loop iff (chat.postMessage / chat.update was issued AFTER the last substantive outward-facing tool call)

Substantive outward-facing tool calls:
- worker / parallel_workers (delegated work)
- fetch_url
- consult_opus
- non-Slack bash (gh, git, curl-non-slack, etc.)
- write_file / edit_file outside /data/journal/

Inward (operator doesn't need a report for these):
- read_file, grep, glob_files
- write_file / edit_file to /data/journal/
- Slack housekeeping bash (setStatus, reactions.add, conversations.replies)

Field renamed slack_posted → closed_loop everywhere (PR not merged yet, no external consumers). Docstring + slack.md text rewritten to name the timing rule explicitly.

Tests: 22 in test_silent_exit.py (up from 15), including 4 explicit ACK-then-work cases (the headline failure mode). 119 total pass.
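The timing rule can be sketched as a single pass over the session's tool calls, remembering the index of the last Slack post and the last outward-facing call. The record shape (`tool`, `args`, `command`, `path` keys) and the function names are assumptions for illustration; the categories follow the lists above. Treating unrecognized tools as outward is also an assumption, chosen because it errs toward requiring a report.

```python
import re

SLACK_POST = re.compile(r"chat\.(postMessage|update)")
SLACK_HOUSEKEEPING = ("setStatus", "reactions.add", "conversations.replies")
INWARD_TOOLS = {"read_file", "grep", "glob_files"}
FILE_WRITERS = {"write_file", "edit_file"}


def classify(call: dict) -> str:
    """Return 'post', 'outward', or 'inward' for one tool-call record."""
    tool = call["tool"]
    args = call.get("args", {})
    if tool == "bash":
        command = args.get("command", "")
        if SLACK_POST.search(command):
            return "post"
        if any(name in command for name in SLACK_HOUSEKEEPING):
            return "inward"
        return "outward"  # gh, git, non-Slack curl, etc.
    if tool in INWARD_TOOLS:
        return "inward"
    if tool in FILE_WRITERS:
        path = args.get("path", "")
        return "inward" if path.startswith("/data/journal/") else "outward"
    # worker, parallel_workers, fetch_url, consult_opus, and (by assumption)
    # anything unrecognized count as outward-facing work.
    return "outward"


def closed_loop(calls: list) -> bool:
    """True iff a Slack post was issued after the last outward-facing call."""
    last_post = last_outward = -1
    for index, call in enumerate(calls):
        kind = classify(call)
        if kind == "post":
            last_post = index
        elif kind == "outward":
            last_outward = index
    return last_post > last_outward
```

With no outward work at all, `last_outward` stays at -1, so any single post closes the loop; a session with no post never does.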
spashii added a commit that referenced this pull request on May 24, 2026
## What this enables

Sam reads a new principle when proposing source fixes: identity / capability / skill changes are prompts (not deterministic). For recurring behavioral failures, look for the systemic lever (a daemon check, a runtime gate) before reaching for prose.

## Consequences

- Sam stops shipping Tier-1 prose patches as the *whole* fix for behaviors the LLM already 'knew' about and failed on anyway (today's silent-exit class).
- When Sam DOES ship prose, the PR description names whether there's a paired Tier-3 enforcement, or honestly flags 'no systemic lever available; expect partial compliance.'
- The two-layer pattern (system enforces, prose explains why) becomes the default mental model for behavioral fixes.

## Lives in self-maintenance.md

Inserted between *'Where does a change belong?'* (which picks the target tier) and *'The flow'* (mechanics). Chose self-maintenance.md over identity.md because the principle is about *how to choose where to fix something*, which is exactly self-maintenance's domain. Sam reads this when proposing changes, not on every session.

## The silent-exit example is baked in

The PR-#68 silent-exit gate is named explicitly as the canonical case: prose-only would have been 'add a stronger rule to slack.md.' The real fix is the daemon's audit-log check + retry. Future-Sam reading this principle has a concrete case to anchor on.

## Tier

Tier 1 (capability prose). 21 lines added. Source-integrity tests pass.

## Verify

`pytest tests/runtime/test_source_integrity.py`: 25 passed (frontmatter intact, no dangling refs).
spashii added a commit that referenced this pull request on May 24, 2026
… (v2) (#71)

## What this enables

Sam can break out of a session mid-work for genuine unknown unknowns, post a question to Slack, exit cleanly. Operator replies whenever; daemon detects the continuation, Sam picks up via the audit log.

**Slack reactions are the source of truth** for paused state; no in-memory daemon map, no parallel state store, no special-case boot recovery.

## How the lifecycle works

```
Sam pauses
  → ask_operator tool posts question + adds 💬 atomically
  → session exits clean, ✅ on inbound
  → ledger entry has ask_operator_called: true

... operator replies whenever (works across daemon restarts) ...

Operator reply
  → daemon fetches thread_history (already happens)
  → finds the bot message with 💬 from this bot
  → looks up session_id from sessions.jsonl
    (filter thread_ts + ask_operator_called + ts_start strictly before post)
  → injects paused_session_id into IncomingMessage
  → calls reactions.remove on the question post (marks resolved)
  → continuation prompt fires: read audit log, apply answer, continue
```

## Consequences

- New ADK tool `ask_operator(question: str)` on the main agent only; workers/pro_executor/mentor cannot escalate to operator directly.
- Tool atomically posts question + adds `💬` (race-mitigation per your design call).
- `SessionLedgerEntry.ask_operator_called: bool` is the new ledger field the daemon's lookup correlates on.
- `_find_active_paused_question` + `_lookup_paused_session_id` replace the deleted `_paused_threads` map. Reactions on Slack + ledger on GCS both persist across daemon restarts.
- Daily-maintenance §6 handles abandoned questions (>24h): Sam enumerates 💬 via reactions.list, decides per-question whether to remind / mark abandoned / escalate. Daemon mechanical, Sam judgment.

## Composition with existing gates

- silent-exit gate (PR #68): `ask_operator` counts as a post → closes the loop for that turn.
- ack-first rule (PR #44): unaffected.
- retry / silent-exit retry: takes precedence over continuation (failure narration wins over resumption; this defends against a failed pause spawning a continuation loop).
- recovered=True (boot recovery): paused_session_id takes precedence (more specific signal).

## Tier

Tier 3 (`src/runtime/`) + Tier 1 (`src/capabilities/slack.md`, `src/skills/daily-maintenance/skill.md`). Both layers: system enforces routing; prose explains the rule and Sam's review responsibility.

## Test plan

- [x] `pytest tests/`: 159 passed (24 new in `test_ask_operator.py`, no regressions)
- [x] `_find_active_paused_question` defended for: empty history, no bot messages, no reaction, reaction from another user, multiple paused (most-recent wins), missing reactions field
- [x] `_lookup_paused_session_id` defended for: missing ledger, matching session, future-skipping, most-recent-match
- [x] SessionLedgerEntry field defaulting + plumbing
- [ ] Live: Sam mid-work calls ask_operator → 💬 appears → operator reply → continuation runs + clears reaction

Closes the async-question class of bug. Three new tickets queued for follow-up work (24h reminder, introspect tool, skill descriptions audit) tracked separately.
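The two lookups in the lifecycle (find the bot's question still carrying 💬, then correlate it to a ledger entry) might look roughly like the sketch below. It is an illustration under stated assumptions, not the daemon's implementation: the function names drop the leading underscore, thread history is assumed to be a list of Slack message dicts where bot messages carry the bot's `user` id and reactions use the name `speech_balloon`, and ledger entries are assumed to be already-parsed dicts from sessions.jsonl with `ts_start` in the same epoch-seconds form as Slack timestamps.

```python
def find_active_paused_question(history, bot_user_id, emoji="speech_balloon"):
    """Return the ts of the newest bot message carrying the bot's own reaction."""
    paused = [
        msg["ts"]
        for msg in history
        if msg.get("user") == bot_user_id
        and any(
            reaction.get("name") == emoji
            and bot_user_id in reaction.get("users", [])
            for reaction in msg.get("reactions", [])  # field may be missing
        )
    ]
    # Several questions may be open at once: the most recent wins.
    return max(paused, key=float) if paused else None


def lookup_paused_session_id(ledger, thread_ts, question_ts):
    """Return the session that asked the question, or None if no entry matches."""
    matches = [
        entry
        for entry in ledger
        if entry.get("thread_ts") == thread_ts
        and entry.get("ask_operator_called")
        # The asking session must have started strictly before the post.
        and float(entry["ts_start"]) < float(question_ts)
    ]
    if not matches:
        return None
    return max(matches, key=lambda entry: float(entry["ts_start"]))["session_id"]
```

Both functions are pure reads over data that already persists (Slack reactions and the ledger), which is what lets the paused state survive a daemon restart without an in-memory map.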
## What this enables

Slack-triggered sessions can't exit cleanly anymore without closing the loop, defined as: a `chat.postMessage` / `chat.update` must come AFTER the last substantive outward-facing tool call in the session. An ACK at the start followed by work + silence does NOT close the loop.

## The corrected rule (timing-based, not 'any post')

Substantive outward-facing (operator expects a report):
- `gh pr create / edit / merge`, other non-Slack bash
- `consult_opus`
- `worker` / `parallel_workers`
- `edit_file` / `write_file` outside `/data/journal/`
- `fetch_url`

Inward (operator doesn't need a report):
- `read_file`, `grep`, `glob_files`
- Journal writes (`edit_file` / `write_file` to `/data/journal/`)
- Slack housekeeping (`setStatus`, `reactions.add`, `conversations.replies`)

`closed_loop` is True iff a post comes AFTER the last outward call (or no outward work happened at all: a question Sam answered without tools).

If the gate fires, the daemon spawns a retry whose only job is to read the previous session's audit-log slice and post the summary. The retry agent is told explicitly not to trust the journal (since the journal-claiming-without-evidence pattern is the parent failure mode).

## Consequences

- The two silent sessions from 2026-05-24 (`f31a21c5` after 'Can you update to add these tools?' and `624e27ec` after 'Use opus to review this') both opened PRs (feat(runtime): support opencode skill structure with symlinks #66, docs(identity): forbid unsourced quantitative or factual claims #67) but never posted. After this PR: caught and retried.
- Sessions with no outward work (`read_file` only or no tools at all) just need any single post; they're 'Sam answered a question' shapes.

## How to verify

- `pytest tests/`: 119 passed (22 new in `test_silent_exit.py` + 97 existing, no regressions). The new tests include 4 explicit ACK-then-work-then-silent cases that would have false-passed the earlier rule.
- When the gate fires, expect `session exited cleanly but never posted to Slack; spawning retry to narrate (session=...)`, and the retry will post the summary in-thread.

## Tier

Tier 3 (`src/runtime/session.py`, `src/runtime/daemon.py`, `src/runtime/prompts.py`) + Tier 1 (`src/capabilities/slack.md` documents the gate in Sam's source).

Closes the silent-exit class of bug surfaced by the 2026-05-24 14:30 and 15:55 sessions.
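For reference, the audit-log slice that the retry reads can be pictured as a filter over a JSON Lines file. This is a hedged sketch only: the helper name `audit_slice` and the `session_id` key on each record are assumptions, since the PR text states only that /data/tool_calls/<today>.jsonl is filtered by the previous session's id.

```python
import json


def audit_slice(lines, session_id):
    """Return the tool-call records belonging to one session, in log order."""
    calls = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        record = json.loads(line)
        # "session_id" is an assumed field name for the illustration.
        if record.get("session_id") == session_id:
            calls.append(record)
    return calls
```

The function accepts any iterable of lines, so an open file handle can be passed directly, for example `with open(path) as handle: audit_slice(handle, previous_session_id)`.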