feat(capability/slack): ACK-first rule on tool-using sessions (SAM-32) by spashii · Pull Request #44 · Dembrane/sam

spashii · 2026-05-23T14:33:02Z

Summary

Adds a behavioral rule to src/capabilities/slack.md: any Sam session that'll involve real work posts a substantive first reply within ~5s that restates the ask, names the approach, and commits to a follow-up — before the long work begins.

Why now

Today's README + image ask exposed the gap concretely. The operator sent two messages, waited ~4 minutes seeing only :eyes: → :hourglass: reactions, and only then got a reply. That reply was wrong — Sam had spent 11 minutes going down a misread "ur in gcp" hint into the wrong metadata-server token path. The operator had no window to course-correct because Sam never signaled its plan.

The ACK is what closes that gap:

"Got it — drafting PR against Dembrane/sam with the image under docs/, back in a few minutes."

In ~5 seconds, the operator now knows what Sam heard, what Sam plans to do, and roughly when to expect the result. If the plan is wrong, the operator can redirect before Sam burns 10 minutes acting on it.

The change

One section added to src/capabilities/slack.md, between "Posting to Slack" and "Live UX," which is where the Sam-as-coworker behavior is described.

The rule is explicit about:

- Three parts: restate ask, name approach, commit follow-up.
- Not preamble: "Sure!" / "Got it!" alone don't count — the ACK earns its place via information content (restated understanding + plan).
- Skip conditions: <10s reply, one-tool read-and-respond, non-tool-using replies (reaction suffices).
- The ACK is a commitment to a direction, not a contract — if work surfaces something different, the end-of-session reply can say "I switched to Y because…" honestly.
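The rule itself is prose for Sam to follow, not code, but the skip conditions above can be read as a small decision procedure. The sketch below is illustrative only: `SessionPlan`, `needs_ack`, and `is_substantive` are hypothetical names that do not appear in the PR, and the way "one-tool read-and-respond" is modeled (exactly one call, read-only) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionPlan:
    """Hypothetical summary of what Sam expects a session to involve."""
    expected_seconds: float  # rough time until the final reply
    tool_calls: int          # tool calls Sam expects to make
    read_only: bool          # True if every expected call only reads


def needs_ack(plan: SessionPlan) -> bool:
    """Return True when the ACK-first rule applies to this session."""
    if plan.tool_calls == 0:
        return False  # non-tool-using reply: a reaction suffices
    if plan.tool_calls == 1 and plan.read_only:
        return False  # one-tool read-and-respond
    if plan.expected_seconds < 10:
        return False  # the full reply arrives in under 10s anyway
    return True


def is_substantive(restated_ask: str, approach: str, follow_up: str) -> bool:
    """An ACK counts only if all three parts carry content."""
    return all(part.strip() for part in (restated_ask, approach, follow_up))
```

Under these assumptions, a bare "Got it!" fails `is_substantive` because it carries no approach and no follow-up commitment.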

The existing "Status indicator — always" rule (which sets the :thinking: pill) is unchanged. That rule covers "something is happening"; this new rule covers "here's what, and roughly how long."

Validation

Deferred to the SAM-26 eval harness. Golden-set ambiguous-bucket sessions can be scored on:

- Did Sam post an ACK within 10s? (binary)
- Did the ACK restate the ask? (0–3)
- Did the ACK name an approach? (0–3)
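One way to picture the three rubric items is as a per-session score record plus an aggregate rate. This is a sketch, not the SAM-26 harness: `AckScore` and `ack_rate` are hypothetical names, and treating "within 10s" as latency less than or equal to 10.0 seconds is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class AckScore:
    """Hypothetical per-session score for the three rubric items."""
    ack_latency_s: Optional[float]  # None when no ACK was posted
    restates_ask: int               # graded 0-3
    names_approach: int             # graded 0-3

    def __post_init__(self) -> None:
        # Reject grades outside the 0-3 scale.
        for name in ("restates_ask", "names_approach"):
            value = getattr(self, name)
            if not 0 <= value <= 3:
                raise ValueError(f"{name} must be in 0..3, got {value}")

    @property
    def acked_in_time(self) -> bool:
        """Binary item: was an ACK posted within 10 seconds?"""
        return self.ack_latency_s is not None and self.ack_latency_s <= 10.0


def ack_rate(scores: Sequence[AckScore]) -> float:
    """Fraction of scored sessions with a timely ACK (0.0 if none scored)."""
    if not scores:
        return 0.0
    return sum(s.acked_in_time for s in scores) / len(scores)
```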

If the ACK rate is consistently low after this lands, SAM-32 has a runtime-floor fallback (Option C — daemon posts a templated ACK if Sam hasn't within N seconds).
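Option C is not part of this PR, but the idea of a runtime floor can be sketched as a timer that races Sam's own ACK. Everything here is an assumption for illustration: `ack_floor`, the `sam_acked` event, the `post` callback, and the template text are hypothetical, and N is left as a parameter because the ticket's value is not given here.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical template; the real wording is not specified in this PR.
DEFAULT_TEMPLATE = "Working on this now. I'll reply in this thread when done."


async def ack_floor(
    sam_acked: asyncio.Event,
    post: Callable[[str], Awaitable[None]],
    timeout_s: float,
    template: str = DEFAULT_TEMPLATE,
) -> bool:
    """Post a templated ACK if Sam has not ACKed within timeout_s.

    Returns True when the fallback fired, False when Sam ACKed first.
    """
    try:
        await asyncio.wait_for(sam_acked.wait(), timeout=timeout_s)
        return False  # Sam posted its own ACK in time
    except asyncio.TimeoutError:
        await post(template)
        return True
```

A templated ACK cannot restate the ask or name an approach, so a floor like this would only guarantee the "something specific is coming" signal, not the substantive content the rule asks for.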

Test plan

- Live trigger after merge: a real tool-using Slack mention to Sam should produce an ACK within 10s, then the substantive reply at session close.
- Audit log check: the ACK Slack post lands in tool_calls/<date>.jsonl early in the session_id's trace, not at the end.
- Conversational check: a non-tool-using reply ("thanks" → reaction) should not produce an ACK.
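The audit log check could be scripted roughly as follows. The record schema is an assumption: the PR names the file (tool_calls/<date>.jsonl) and the `session_id` key, but the `tool` field and the `slack_post` tool name used below are hypothetical placeholders to adapt to the real log format.

```python
import json
from typing import Iterable, Optional, Tuple


def first_post_position(
    lines: Iterable[str],
    session_id: str,
    post_tool: str = "slack_post",  # hypothetical tool name
) -> Optional[Tuple[int, int]]:
    """Locate the first Slack post within one session's tool-call trace.

    Returns (index_of_first_post, total_calls_in_session), or None if the
    session never posted. An ACK-first session should have a small index.
    """
    calls = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the JSONL file
        record = json.loads(line)
        if record.get("session_id") == session_id:
            calls.append(record)
    for index, record in enumerate(calls):
        if record.get("tool") == post_tool:
            return index, len(calls)
    return None
```

Passing an open file object as `lines` works, since iterating a text file yields one line at a time.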

Tier

2 — prose change to src/capabilities/.

Refs

Closes: SAM-32
Related: SAM-31 (mid-flight steering depends on the ACK existing to anchor against)
Related: SAM-26 (validation harness)

🤖 Generated with Claude Code

Add "First reply on a tool-using task" section to src/capabilities/slack.md.

Closes the operator-UX gap surfaced today: on the README + image ask, the operator waited ~4 minutes seeing only 👀 + ⌛ with no signal of what Sam understood, what approach Sam was taking, or whether to redirect. By the time Sam replied, Sam had spent 11 min going down a misread "ur in gcp" hint into the wrong metadata-server token path — the operator had no window to course-correct.

The rule: any session that'll involve real work posts a substantive first reply within ~5s that restates the ask, names the approach, and commits to a follow-up. Then the work proceeds. The end-of-session substantive reply is unchanged.

Sits under "Posting to Slack" / before "Live UX" in slack.md so it lands on the Sam-as-coworker behavior in the right narrative spot. Distinct from the existing "Status indicator — always" rule (which only covers the status pill, not a substantive restatement of intent).

Validation is deferred to the SAM-26 eval harness — golden-set ambiguous-bucket sessions can score whether Sam ACKed and whether the ACK was substantive. If the rate is consistently low after this lands, SAM-32 has a follow-on (Option C in the ticket — daemon-driven floor).

Refs: SAM-32. Related: SAM-31 (mid-flight steering depends on the ACK existing to anchor against), SAM-26 (validation).

linear · 2026-05-23T14:33:07Z

SAM-32

… (v2) (#71)

## What this enables

Sam can break out of a session mid-work for genuine unknown unknowns, post a question to Slack, exit cleanly. Operator replies whenever — daemon detects the continuation, Sam picks up via the audit log.

**Slack reactions are the source of truth** for paused state; no in-memory daemon map, no parallel state store, no special-case boot recovery.

## How the lifecycle works

```
Sam pauses
  → ask_operator tool posts question + adds 💬 atomically
  → session exits clean, ✅ on inbound
  → ledger entry has ask_operator_called: true

... operator replies whenever (works across daemon restarts) ...

Operator reply
  → daemon fetches thread_history (already happens)
  → finds the bot message with 💬 from this bot
  → looks up session_id from sessions.jsonl
    (filter thread_ts + ask_operator_called + ts_start strictly before post)
  → injects paused_session_id into IncomingMessage
  → calls reactions.remove on the question post (marks resolved)
  → continuation prompt fires: read audit log, apply answer, continue
```

## Consequences

- New ADK tool `ask_operator(question: str)` on the main agent only — workers/pro_executor/mentor cannot escalate to operator directly.
- Tool atomically posts question + adds `💬` (race-mitigation per your design call).
- `SessionLedgerEntry.ask_operator_called: bool` is the new ledger field the daemon's lookup correlates on.
- `_find_active_paused_question` + `_lookup_paused_session_id` replace the deleted `_paused_threads` map. Reactions on Slack + ledger on GCS both persist across daemon restarts.
- Daily-maintenance §6 handles abandoned questions (>24h) — Sam enumerates 💬 via reactions.list, decides per-question whether to remind / mark abandoned / escalate. Daemon mechanical, Sam judgment.

## Composition with existing gates

- silent-exit gate (PR #68): `ask_operator` counts as a post → closes the loop for that turn.
- ack-first rule (PR #44): unaffected.
- retry / silent-exit retry: takes precedence over continuation (failure narration wins over resumption — defends against a failed pause spawning a continuation loop).
- recovered=True (boot recovery): paused_session_id takes precedence (more specific signal).

## Tier

Tier 3 (`src/runtime/`) + Tier 1 (`src/capabilities/slack.md`, `src/skills/daily-maintenance/skill.md`). Both layers: system enforces routing; prose explains the rule and Sam's review responsibility.

## Test plan

- [x] `pytest tests/` — 159 passed (24 new in `test_ask_operator.py`, no regressions)
- [x] `_find_active_paused_question` defended for: empty history, no bot messages, no reaction, reaction from another user, multiple paused (most-recent wins), missing reactions field
- [x] `_lookup_paused_session_id` defended for: missing ledger, matching session, future-skipping, most-recent-match
- [x] SessionLedgerEntry field defaulting + plumbing
- [ ] Live: Sam mid-work calls ask_operator → 💬 appears → operator reply → continuation runs + clears reaction

Closes the async-question class of bug. Three new tickets queued for follow-up work (24h reminder, introspect tool, skill descriptions audit) tracked separately.
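The two lookups described in the #71 summary can be sketched as below. This is not the code from that PR: `find_paused_question` and `lookup_paused_session_id` are hypothetical stand-ins for `_find_active_paused_question` and `_lookup_paused_session_id`, whose real signatures are not shown here. The sketch assumes thread history is a list of Slack-style message dicts (`user`, `ts`, optional `reactions` with `name` and `users`), that 💬 appears under Slack's short name `speech_balloon`, and that ledger entries are dicts with numeric `ts_start`.

```python
from typing import Any, Dict, List, Optional

PAUSE_REACTION = "speech_balloon"  # Slack's short name for 💬


def find_paused_question(
    history: List[Dict[str, Any]], bot_user_id: str
) -> Optional[Dict[str, Any]]:
    """Most recent bot message that still carries the bot's own 💬."""
    candidates = [
        msg
        for msg in history
        if msg.get("user") == bot_user_id
        and any(
            r.get("name") == PAUSE_REACTION and bot_user_id in r.get("users", [])
            for r in msg.get("reactions", [])  # tolerate a missing field
        )
    ]
    return max(candidates, key=lambda m: float(m["ts"]), default=None)


def lookup_paused_session_id(
    ledger: List[Dict[str, Any]], thread_ts: str, question_ts: float
) -> Optional[str]:
    """Latest session in this thread that called ask_operator and started
    strictly before the question was posted."""
    matches = [
        entry
        for entry in ledger
        if entry.get("thread_ts") == thread_ts
        and entry.get("ask_operator_called")
        and entry["ts_start"] < question_ts
    ]
    best = max(matches, key=lambda e: e["ts_start"], default=None)
    return best["session_id"] if best else None
```

Because both inputs come from Slack and the ledger rather than process memory, a lookup of this shape would survive a daemon restart, which is the property the summary emphasizes.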

dembrane-sam-bot enabled auto-merge May 23, 2026 14:34

dembrane-sam-bot self-requested a review May 23, 2026 14:43

dembrane-sam-bot approved these changes May 23, 2026

dembrane-sam-bot added this pull request to the merge queue May 23, 2026

Merged via the queue into main with commit ebec7ce May 23, 2026
2 checks passed

dembrane-sam-bot deleted the sam/sam-32-ack-first branch May 23, 2026 14:47

spashii mentioned this pull request May 24, 2026

feat: ask_operator tool — async pause-and-resume for unknown unknowns (v2) #71

Merged

5 tasks

dembrane-sam-bot merged 1 commit into main from sam/sam-32-ack-first

2 participants
