Skip to content

Allow natural stop while background tasks are in flight#91

Merged
SihaoLiu merged 13 commits intodevfrom
do-not-block-stop-when-background-running
Apr 17, 2026
Merged

Allow natural stop while background tasks are in flight#91
SihaoLiu merged 13 commits intodevfrom
do-not-block-stop-when-background-running

Conversation

@SihaoLiu
Copy link
Copy Markdown
Contributor

Summary

  • Makes the RLCR stop hook exit 0 (instead of blocking) when the current Claude session has in-flight background work (Agent run_in_background=true / Bash run_in_background=true), and surfaces a user-visible systemMessage explaining the pause. State on disk is left intact so the next stop after completion re-enters the normal review flow.
  • Adds cross-session isolation around a new bg-pending.marker: a second Claude session in the same repo cannot hijack or mutate a loop parked by another session. Ambiguous callers (no session_id forwarded) exit silently; foreign sessions see a dedicated "parked by another session" notice; the owning session cleans up the marker only after its own transcript authoritatively shows no pending bg events.
  • Extends find_active_loop with an opt-in allow_bg_marker_fallback argument so validators keep strict session isolation while only the stop hook participates in marker-based adoption.
  • Recognises both completion formats on current Claude Code transcripts: structured type:"system", subtype:"task_notification" SDK records and the legacy queue-operation XML <task-notification> block.
  • Expands leading ~ in transcript_path before the file check and scopes the transcript scan with a loop-start boundary derived from the loop dir basename (converted from local wall clock to real UTC), so pre-loop session-wide background work cannot pin a fresh loop.
  • Fixes a jq object-collapse bug in scripts/rlcr-stop-gate.sh where a plain select(length > 0) field filter wiped the entire hook-input object whenever any forwarded field (e.g. session_id) was empty; replaces with explicit if/then/else nulls.
  • Splits the new bg-handling logic into a dedicated hooks/lib/loop-bg-tasks.sh (6 helpers + a single handle_bg_task_short_circuit entry point) so the stop hook drops ~100 lines and the helpers live in a cohesive module; loop-common.sh sources the new file so every existing consumer picks up the helpers without code changes.
  • Adds a large acceptance-criteria-driven suite (tests/test-stop-hook-bg-allow.sh, 39 tests covering AC-1 to AC-22b) and a stop-gate regression test that freezes the transcript_path survival fix.

Test plan

  • bash tests/test-stop-hook-bg-allow.sh -> 39 passed
  • bash tests/test-stop-gate.sh -> 12 passed (2 new regression tests)
  • bash tests/run-all-tests.sh -> 1720 passed, 0 failed

SihaoLiu added 13 commits April 16, 2026 13:35
Add a background-task short-circuit to the RLCR stop hook so it exits 0
with a user-facing systemMessage whenever the Claude Code transcript
shows at least one pending background dispatch (Agent run_in_background,
or Bash run_in_background) that has not yet produced a completion
queue-operation notification. The short-circuit precedes every other
gate (phase detection, state parse, branch consistency, plan integrity,
git cleanliness, summary, BitLesson, max iterations, Codex review) and
touches no on-disk state; when the background work finishes, the next
natural stop re-enters the normal review flow.

Supporting helpers in loop-common.sh:
  - extract_transcript_path mirrors extract_session_id
  - list_pending_background_task_ids parses the transcript jsonl and
    diffs launched ids against queue-operation completion task-ids
  - has_pending_background_tasks + count_pending_background_tasks wrap
    that list with fail-closed semantics when the transcript path is
    missing or jq is unavailable

The rlcr-stop-gate wrapper had two latent bugs newly exposed by
forwarding transcript_path end-to-end: select(length>0) on an object
field collapsed the entire JSON when session_id was empty, and a
wrapped response with no decision field was treated as Unexpected
instead of ALLOW. Both paths are fixed alongside this change.

tests/test-stop-hook-bg-allow.sh covers AC-1..AC-9 end-to-end (11
cases including two missing-path variants and one wrapper smoke test)
and is registered in run-all-tests.sh. Full suite: 1690 passed, 0
failed.

Branch targets dev; version stays at 1.16.0 as requested.
Round 0 shipped the natural-stop short-circuit using a literal `-f`
check on the transcript path, which silently failed for the
"~/.claude/projects/<encoded-cwd>/<session>.jsonl" form Claude Code
actually publishes. A real session landed on the fall-through path
and still ran the full Codex review, defeating the feature.

Add a small `expand_leading_tilde` helper in loop-common.sh (no
eval, handles bare "~" and "~/..." only, leaves everything else
verbatim) and apply it in both extract_transcript_path and
list_pending_background_task_ids. The stop hook and the
rlcr-stop-gate wrapper inherit the fix through the shared helpers
without any call-site edits.

Regression coverage in tests/test-stop-hook-bg-allow.sh:

  AC-10  hook input with transcript_path="~/...\" under $HOME still
         triggers the short-circuit (exit 0 + systemMessage).
  AC-10b direct list_pending_background_task_ids on the same tilde
         form returns the pending id, so a future regression in the
         helper cannot be masked by additional normalization in the
         hook itself.

Under-$HOME fixture dir is cleaned alongside $TEST_DIR via an
extended EXIT trap.

Full suite: 1692 passed, 0 failed (+2 vs Round 0).

Branch still targets dev; version stays at 1.16.0.
Two test-harness gaps flagged in the previous Codex review:

  1. The tilde-path fixture was created with mktemp -d "$HOME/...",
     which aborts on sandboxed or read-only-HOME environments and
     forces callers to override HOME externally.
  2. AC-9 with the literal "~/..." --transcript-path form was named
     as a target for the round but never made it into automation;
     the checked-in wrapper coverage still used an absolute path.

Fix both inside tests/test-stop-hook-bg-allow.sh only -- no
production code changes. The helper, hook, and wrapper already
handle the tilde form correctly; the missing piece was portable,
checked-in proof.

  - Introduce FAKE_HOME="$TEST_DIR/fake-home" rooted inside the
    existing test temp dir, so cleanup rides on the setup_test_dir
    EXIT trap. The previous cleanup_all / HOME_FIXTURE_DIR
    extension is removed.
  - Extend run_stop_hook_with_input with an optional 3rd argument
    that exports HOME inside the hook subshell when non-empty.
    Absent, behavior is verbatim.
  - Refactor AC-10 and AC-10b to point at
    "$FAKE_HOME/session-data/ac10.jsonl", and pass the tilde-form
    literal "~/session-data/ac10.jsonl" verbatim. The previous
    dynamic derivation "~/${AC10_TRANSCRIPT#$HOME/}" is gone.
  - Add AC-10c: exercises scripts/rlcr-stop-gate.sh with
    --transcript-path "~/..." under HOME=$FAKE_HOME and asserts
    exit 0 + "^ALLOW:" + "background task".

Validation:

  - bash tests/test-stop-hook-bg-allow.sh -> 14 passed, 0 failed
    (baseline 13; +1 for AC-10c).
  - bash tests/run-all-tests.sh -> 1693 passed, 0 failed
    (baseline 1692; +1).
  - Portability: HOME=/nonexistent/readonly bash tests/test-stop-hook-bg-allow.sh
    -> 14 passed, 0 failed.

Branch still targets dev; version stays at 1.16.0.
Codex review P2 on the background-task short-circuit: allowing the
natural stop at the early-exit site returned control to the user
while state.md was still bound to the current session_id. If the
user closed Claude before the background-task completion
notification arrived, find_active_loop in any later session would
reject the loop (stored session_id did not match), so the loop was
stranded and recovery required manual cancellation.

Resolve without touching state.md by introducing a narrow
cross-session adoption signal:

  - Early-exit block now also runs
    `: > "$LOOP_DIR/bg-pending.marker"` before emitting the
    systemMessage JSON. Failure is tolerated so the short-circuit
    never blocks on a flaky filesystem.
  - find_active_loop's session-filter branch gains a second
    acceptance step: on a stored-vs-filter session_id mismatch, if
    the dir has `bg-pending.marker` AND an active state file, it
    is still accepted. Terminal loops with a stale marker are
    ignored.

Three new regression cases in tests/test-stop-hook-bg-allow.sh:

  AC-11  session_id mismatch + marker + pending bg -> short-circuit
         fires; state.md stays byte-identical.
  AC-11b session_id mismatch + no marker -> hook takes the
         existing "no active loop" exit-0 path; confirms the marker
         is the only cross-session adoption signal.
  AC-11c same-session short-circuit really writes the marker, so
         AC-11 is grounded in real hook behavior rather than a
         synthetic setup.

Validation:

  - bash tests/test-stop-hook-bg-allow.sh -> 17 passed, 0 failed
  - bash tests/run-all-tests.sh -> 1696 passed, 0 failed
  - HOME=/nonexistent/readonly bash tests/test-stop-hook-bg-allow.sh
    -> 17 passed, 0 failed

No systemMessage wording change. No state.md mutation. Branch still
targets dev; version stays at 1.16.0.
Codex review flagged two regressions from the previous round's
bg-pending.marker change:

  [P1] find_active_loop could return the newest loop dir that had a
       marker before the scan reached an older dir whose stored
       session_id actually matched the caller. With multiple active
       RLCR loops in one repo that breaks session isolation and lets
       a stop hook attach to the wrong conversation.

  [P2] The short-circuit wrote bg-pending.marker but nothing cleared
       it when the next stop saw no pending background task. A later
       stop from a different session_id would keep being adopted
       through the stale marker long after the bg had resolved.

Both fixes are in the files already touched by this feature:

  * find_active_loop's session-filter branch now makes exact
    stored-vs-filter match win over marker fallback. Marker
    candidates are recorded while the scan walks newest-to-oldest
    and only returned after the whole listing fails to yield an
    exact match. Zombie-loop protection still wins for the
    caller's own session. This preserves Round 3 orphan recovery
    while restoring isolation across concurrent sessions.

  * After has_pending_background_tasks returns false, the stop
    hook now handles the marker: if HOOK_SESSION_ID differs from
    the stored session_id in the active state file, it rewrites
    that line with portable sed -i.bak; then the marker is
    removed unconditionally. Rewrite failure is logged but
    non-fatal.

Regressions in tests/test-stop-hook-bg-allow.sh:

  AC-12   find_active_loop returns the older exact-match dir over
          a newer foreign-session dir that has a marker.
  AC-12b  find_active_loop does not touch a foreign session's
          marker.
  AC-13   Same-session resume with a stale marker + empty bg
          transcript clears the marker.
  AC-13b  Same-session resume leaves state.md session_id
          unchanged.
  AC-14   Cross-session resume clears the marker.
  AC-14b  Cross-session resume rewrites state.md session_id to
          the caller's session.

Validation:

  - bash tests/test-stop-hook-bg-allow.sh -> 23 passed, 0 failed
  - bash tests/run-all-tests.sh -> 1702 passed, 0 failed
  - HOME=/nonexistent/readonly bash tests/test-stop-hook-bg-allow.sh
    -> 23 passed, 0 failed

systemMessage wording unchanged. Version stays at 1.16.0.
Two Codex review findings on the previous round:

  [P1] The completion parser in list_pending_background_task_ids only
       understood the legacy queue-operation XML blocks. Current
       Claude Code transcripts emit background-task completions as
       SDKTaskNotificationMessage records:
         type: "system", subtype: "task_notification", task_id: ...
       Without SDK-format recognition, launched ids stayed pending
       forever and the short-circuit would fire on every stop.

  [P2] The non-short-circuit path cleared bg-pending.marker
       unconditionally whenever has_pending_background_tasks
       returned false. That helper is fail-closed, so it also
       returned false when transcript_path was missing or
       unreadable (e.g. rlcr-stop-gate.sh without
       --transcript-path). In that case the cleanup still deleted
       the marker and rewrote the stored session_id, breaking
       cross-session recovery exactly where transcript inspection
       is unavailable.

Fixes, confined to the files already touched by this feature:

  * loop-common.sh: union the completion set from both SDK
    (type:system subtype:task_notification -> .task_id) and legacy
    (queue-operation XML) sources. Wrap the legacy branch's grep
    in `{ grep -oE ... || true; }` so its "no match -> exit 1"
    cannot combine with set -o pipefail to invalidate the SDK
    side of the union through the `|| completed=""` fallback.

  * loop-codex-stop-hook.sh: gate the non-short-circuit marker
    cleanup on `HOOK_TRANSCRIPT_PATH` being a readable regular
    file. When transcript inspection is unavailable, leave the
    marker AND the stored session_id untouched so cross-session
    recovery stays reachable.

Regressions added in tests/test-stop-hook-bg-allow.sh:

  AC-15   helper treats an SDK task_notification as terminal.
  AC-16   helper unions SDK + legacy completion formats.
  AC-17   missing transcript_path key -> marker preserved.
  AC-17b  same scenario -> stored session_id preserved.
  AC-17c  transcript_path pointing at a non-existent file -> same
          guarantees.

A small emit_sdk_task_notification helper was added to keep AC-15
and AC-16 declarative.

Validation:

  - bash tests/test-stop-hook-bg-allow.sh -> 28 passed, 0 failed
  - bash tests/run-all-tests.sh -> 1707 passed, 0 failed
  - HOME=/nonexistent/readonly bash tests/test-stop-hook-bg-allow.sh
    -> 28 passed, 0 failed

systemMessage wording unchanged. Version stays at 1.16.0.
Three blocking findings from the previous review:

  [P1] The Round 5 non-short-circuit cleanup hijacked foreign
       parked loops: session B could rewrite a parked loop's
       stored session_id to B and delete the marker using B's own
       transcript, even though A's background task was still
       running. A was locked out of its own loop.

  [P2] Every caller of find_active_loop inherited the marker
       fallback, including loop-read-validator, loop-write-
       validator, loop-bash-validator, and loop-plan-file-
       validator. An unrelated session's validators started
       enforcing foreign parked-loop gates (notably the
       methodology-analysis phase) on ordinary writes and bash
       commands, breaking existing session isolation guarantees.

  [P2] The Round 4 sed -i.bak session_id rewrite injected
       HOOK_SESSION_ID into the replacement text unescaped. Session
       IDs containing `&` are valid per this repo's
       test-session-id.sh; adopting such an id corrupted state.md.

Fixes, confined to files already touched by this feature:

  * loop-common.sh: find_active_loop gains a third positional
    parameter allow_bg_marker_fallback (default false). Both the
    inner marker_candidate record and the post-loop fallback
    return are gated on it. Only the stop hook opts in; validators
    keep the pre-Round-3 strict isolation.

  * loop-codex-stop-hook.sh:
    - Call find_active_loop with `true` to opt in.
    - Add a new "Cross-Session Parked-Loop Guard" block before the
      short-circuit: when bg-pending.marker is present AND the
      stored session_id differs from HOOK_SESSION_ID, emit a
      dedicated "parked by another Claude session" systemMessage
      and exit 0 without touching marker, state.md, or session_id.
      B's hook can never advance A's parked loop on B's transcript.
    - Simplify the non-short-circuit cleanup: the guard ensures
      only same-session cases reach here, so the cleanup is now
      just `rm -f bg-pending.marker` (still gated on transcript
      readability). The sed session_id rewrite is removed entirely,
      which also removes the unescaped-metacharacter issue.

Regressions in tests/test-stop-hook-bg-allow.sh:

  AC-11   (rewritten) cross-session + marker -> "parked" systemMessage
           + marker preserved + state.md byte-identical.
  AC-14   (rewritten) anti-hijack: cross-session stop preserves
           bg-pending.marker.
  AC-14b  (rewritten) cross-session stop leaves stored session_id
           intact.
  AC-18   NEW. find_active_loop default (no opt-in) ignores a
           foreign marker dir -> validators stay isolated.
  AC-18b  NEW. find_active_loop with opt-in does return the marker
           dir (confirms the flag is wired).

All other existing regressions continue to pass.

Validation:

  - bash tests/test-stop-hook-bg-allow.sh -> 30 passed, 0 failed
  - bash tests/run-all-tests.sh -> 1709 passed, 0 failed
  - HOME=/nonexistent/readonly bash tests/test-stop-hook-bg-allow.sh
    -> 30 passed, 0 failed

Original short-circuit systemMessage wording unchanged. Version
stays at 1.16.0.
Two blocking findings from the previous review:

  [P1] The cross-session parked-loop guard required both a
       non-empty stored session_id AND a non-empty
       HOOK_SESSION_ID. Callers that reach the hook without a
       session_id field -- including scripts/rlcr-stop-gate.sh
       invoked without --session-id -- bypassed the guard, and a
       foreign parked loop fell through into the normal path.
       With any readable transcript, the later cleanup deleted
       bg-pending.marker and the hook started reviewing a loop it
       did not own.

  [P2] Non-short-circuit cleanup only verified that the
       transcript file existed. list_pending_background_task_ids
       is fail-closed on malformed or truncated transcripts, yet
       the cleanup still deleted the marker in that case. The
       parked-state signal was lost even though background
       completion was never verified.

Fixes, limited to loop-codex-stop-hook.sh:

  * Cross-session guard: drop the `-n "$HOOK_SESSION_ID"` clause.
    A non-empty stored session_id that differs from the
    (possibly empty) hook session_id now triggers the "parked by
    another Claude session" exit path. Backward-compat semantics
    are preserved: an empty stored session_id still matches any
    caller, consistent with find_active_loop's existing rule.

  * Non-short-circuit cleanup: call
    list_pending_background_task_ids inline and check its exit
    code along with its output. The marker is removed only when
    the helper returned exit 0 AND produced an empty id list.
    Every fail-closed path (missing file, empty path, jq parse
    failure, truncation) now leaves the marker intact.

No changes to hooks/lib/loop-common.sh; the helper already has
the exit-code semantics we rely on.

Regressions in tests/test-stop-hook-bg-allow.sh:

  AC-19  Hook input with NO session_id key + state.md session_id
         populated + bg-pending.marker -> exit 0 with "parked"
         systemMessage, marker preserved, state.md byte-identical.

  AC-20  Hook input pointing transcript_path at a deliberately
         malformed JSONL file + bg-pending.marker -> marker
         preserved.

Validation:

  - bash tests/test-stop-hook-bg-allow.sh -> 32 passed, 0 failed
  - bash tests/run-all-tests.sh -> 1711 passed, 0 failed
  - HOME=/nonexistent/readonly bash tests/test-stop-hook-bg-allow.sh
    -> 32 passed, 0 failed

systemMessage wording unchanged. Version stays at 1.16.0.
Two blocking findings from the previous review:

  [P1] Round 7 dropped the `-n "$HOOK_SESSION_ID"` check from the
       cross-session guard to close its own P1. That inverted the
       failure mode: any wrapper call (e.g. rlcr-stop-gate.sh
       without --session-id) matched the guard forever once the
       marker was written, and the RLCR loop never resumed
       through the wrapper path.

  [P2] list_pending_background_task_ids scanned the entire
       session-wide Claude transcript. A long-running background
       Agent/Bash that started earlier in the same session --
       before the current RLCR loop was created -- counted as
       "pending" for this loop. The short-circuit fired forever
       for a loop that had no in-scope pending work of its own.

Fixes:

  * loop-codex-stop-hook.sh:
    - Add an "Ambiguous-Caller Marker Guard" before the cross-
      session guard. When bg-pending.marker is present AND
      HOOK_SESSION_ID is empty, exit 0 silently (no systemMessage,
      no on-disk mutation). The real Claude stop hook always has
      session_id populated and remains the only authoritative
      driver for parking and cleanup.
    - Restore `[[ -n "$HOOK_SESSION_ID" ]]` inside the cross-
      session guard. That branch now fires only when both session
      ids are non-empty and different.
    - Compute LOOP_START_TS via derive_loop_start_iso_ts once and
      pass it through every pending-tasks helper call.

  * loop-common.sh:
    - New derive_loop_start_iso_ts helper: parses the loop dir
      basename YYYY-MM-DD_HH-MM-SS and emits
      YYYY-MM-DDTHH:MM:SS.000Z for lexical comparison against
      transcript timestamps.
    - list_pending_background_task_ids gains an optional
      since_ts argument. Launch events are filtered by
      ($since_ts == "" or (.timestamp // "") == "" or
       (.timestamp // "") >= $since_ts). Empty since_ts
      preserves old scan-everything behavior; events without
      .timestamp remain included for fixture / older record
      compatibility.
    - has_pending_background_tasks and
      count_pending_background_tasks pass since_ts through
      unchanged.

Regressions in tests/test-stop-hook-bg-allow.sh:

  AC-10c refixtured to avoid the AC-10 marker leaking into the
         wrapper ambiguous-caller branch.
  AC-19  rewritten: empty HOOK_SESSION_ID + marker -> silent
         ALLOW, marker and state preserved (inverts Round 7's
         "parked" expectation).
  AC-21  helper filters a pre-loop launch and keeps an in-loop
         launch.
  AC-21b derive_loop_start_iso_ts produces the expected ISO-8601
         form.
  AC-21c end-to-end: pre-loop launch in transcript does not
         trigger the short-circuit; Codex runs.
  AC-22  wrapper without --session-id + no prior marker +
         pending bg -> writes marker, surfaces systemMessage.
  AC-22b wrapper without --session-id + prior marker -> silent
         ALLOW, marker and state preserved.

Validation:

  - bash tests/test-stop-hook-bg-allow.sh -> 37 passed, 0 failed
  - bash tests/run-all-tests.sh -> 1716 passed, 0 failed
  - HOME=/nonexistent/readonly bash tests/test-stop-hook-bg-allow.sh
    -> 37 passed, 0 failed

systemMessage wording unchanged on existing paths. Version stays
at 1.16.0.
setup-rlcr-loop.sh creates loop dirs named YYYY-MM-DD_HH-MM-SS
with `date +%Y-%m-%d_%H-%M-%S` (no -u), so the basename is local
wall clock. The previous derive_loop_start_iso_ts pasted `.000Z`
on top of that basename, treating the local wall clock as if it
were already UTC. Claude transcript events carry real UTC
timestamps, so on any non-UTC machine the filter boundary was
shifted by the local offset: west-of-UTC users inherited pre-loop
background launches and stayed parked forever, east-of-UTC users
missed in-loop background work and ran Codex review too early.

Fix in hooks/lib/loop-common.sh:

  * derive_loop_start_iso_ts now does a two-step local -> epoch
    -> UTC conversion. Local wall clock is parsed into epoch
    seconds with `date -d` (GNU) or `date -j -f` (BSD/macOS),
    then the epoch is formatted in UTC as
    YYYY-MM-DDTHH:MM:SS.000Z with `date -u -d "@<epoch>"` (GNU)
    or `date -u -r <epoch>` (BSD/macOS). Any failure yields an
    empty string, which disables the filter in callers -- same
    backward-compat behaviour as before.

Regressions in tests/test-stop-hook-bg-allow.sh:

  AC-21b  pinned to `export TZ=UTC` inside its subshell so the
          expected 2026-03-01T00:00:00.000Z is TZ-deterministic.
  AC-21d  NEW. TZ=Asia/Tokyo + basename 2026-03-01_09-00-00 ->
          2026-03-01T00:00:00.000Z (9am JST = 0am UTC).
  AC-21e  NEW. TZ=America/Los_Angeles + basename
          2026-03-01_00-00-00 -> 2026-03-01T08:00:00.000Z
          (0am PST = 8am UTC; March 1 is before DST starts on
          March 8, 2026).

Validation:

  - bash tests/test-stop-hook-bg-allow.sh -> 39 passed, 0 failed
  - bash tests/run-all-tests.sh -> 1718 passed, 0 failed
  - TZ=America/Los_Angeles bash tests/test-stop-hook-bg-allow.sh
    -> 39 passed, 0 failed
  - TZ=Asia/Tokyo bash tests/test-stop-hook-bg-allow.sh
    -> 39 passed, 0 failed
  - HOME=/nonexistent/readonly bash tests/test-stop-hook-bg-allow.sh
    -> 39 passed, 0 failed

systemMessage wording unchanged. Version stays at 1.16.0.
tests/test-stop-hook-bg-allow.sh invokes rlcr-stop-gate.sh at four
spots (AC-9, AC-10c, AC-22, AC-22b). The wrapper resolves its
project root as `${CLAUDE_PROJECT_DIR:-$(pwd)}`, giving the env
var precedence over `cd`. When the outer runner exports
CLAUDE_PROJECT_DIR (the normal case in hosted environments),
those four tests were inspecting the outer repo instead of their
per-test fixtures, falling through with "ALLOW: stop gate
passed." and causing the suite to go red inside
tests/run-all-tests.sh.

Reproduced directly:

  CLAUDE_PROJECT_DIR=/tmp/outer-unrelated \
    bash tests/test-stop-hook-bg-allow.sh
  ...
  FAIL: AC-10c
  FAIL: AC-22

Fix: pass `--project-root "$FIXTURE_REPO"` at every wrapper call
site. The wrapper priority order is explicit flag >
CLAUDE_PROJECT_DIR env > cwd, so the gate now pins deterministically
to each fixture regardless of inherited environment.

No product-code change.

Validation:

  - bash tests/test-stop-hook-bg-allow.sh -> 39 passed, 0 failed
  - bash tests/run-all-tests.sh -> 1718 passed, 0 failed
  - CLAUDE_PROJECT_DIR=/tmp/outer-unrelated bash
    tests/test-stop-hook-bg-allow.sh -> 39 passed, 0 failed
  - HOME=/nonexistent/readonly bash tests/test-stop-hook-bg-allow.sh
    -> 39 passed, 0 failed
  - TZ=America/Los_Angeles bash tests/test-stop-hook-bg-allow.sh
    -> 39 passed, 0 failed
  - TZ=Asia/Tokyo bash tests/test-stop-hook-bg-allow.sh
    -> 39 passed, 0 failed

systemMessage wording unchanged. Version stays at 1.16.0.
hooks/loop-codex-stop-hook.sh previously called
has_pending_background_tasks AND count_pending_background_tasks
back-to-back on the short-circuit path. Both wrap
list_pending_background_task_ids, so every pending-bg stop was
running jq over the transcript twice.

Collapse into a single list_pending_background_task_ids call:
capture the id list once, use its non-emptiness for the
short-circuit decision, and derive the count with `wc -l`. The
public helpers has_pending_background_tasks and
count_pending_background_tasks stay for other callers.

No behaviour change. systemMessage wording and exit codes
identical.

Validation (all six suites green):
  - bash tests/run-all-tests.sh -> 1718 passed, 0 failed
  - bash tests/test-stop-hook-bg-allow.sh -> 39 passed, 0 failed
  - HOME=/nonexistent/readonly ... -> 39 passed, 0 failed
  - TZ=America/Los_Angeles ... -> 39 passed, 0 failed
  - TZ=Asia/Tokyo ... -> 39 passed, 0 failed
  - CLAUDE_PROJECT_DIR=/tmp/outer-unrelated ... -> 39 passed, 0 failed

Version stays at 1.16.0.
…survival

Move the six background-task helpers (expand_leading_tilde,
extract_transcript_path, derive_loop_start_iso_ts,
list/has/count_pending_background_task[_ids]) and the four hook guard
blocks (ambiguous caller, cross-session parked, pending-bg short-circuit,
stale-marker cleanup) out of hooks/lib/loop-common.sh and
hooks/loop-codex-stop-hook.sh into a new hooks/lib/loop-bg-tasks.sh. The
stop hook now delegates to a single handle_bg_task_short_circuit entry
point; loop-common.sh sources the new lib so every existing consumer
continues to see the helpers transparently.

Add a regression test to tests/test-stop-gate.sh that asserts
transcript_path is still forwarded to the hook when session_id is empty,
freezing the jq object-collapse fix that replaced plain select(length > 0)
field values with explicit if/then/else nulls.
@SihaoLiu
Copy link
Copy Markdown
Contributor Author

@codex review this PR

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Keep it up!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@SihaoLiu SihaoLiu merged commit b955a69 into dev Apr 17, 2026
9 checks passed
@SihaoLiu SihaoLiu mentioned this pull request Apr 21, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant