rule: scripts must not use bare sleep delays — require polling loop with guaranteed exit path #184

@tbitcs

Summary

Specsmith-governed projects and the specsmith CLI itself use bare sleep/Start-Sleep delays in shell scripts and agent-generated scripts without a guaranteed exit path. This creates an unbounded blocking risk: if the awaited condition never arrives, the script hangs forever.

Proposed Rule

Add a governance rule to docs/governance/RULES.md (and optionally enforce via specsmith validate) that applies to all scripts (shell, bash, PowerShell, zsh, batch — not source code):

RULE: No bare sleep delays in scripts.
Scripts MUST NOT use sleep, Start-Sleep, time.sleep, asyncio.sleep, or equivalent blocking waits as a standalone timing mechanism.
Every wait that depends on an external condition MUST use a polling loop with:

  1. A maximum iteration count or wall-clock timeout
  2. A non-zero exit code on timeout
  3. A sleep interval that is explicit and bounded

Preferred pattern (bash)

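max_retries=30   # maximum number of polls
interval=5       # seconds between polls
for i in $(seq 1 "$max_retries"); do
  status=$(check_condition)   # placeholder: prints "ready" once the condition holds
  [ "$status" = "ready" ] && exit 0
  echo "Waiting... ($i/$max_retries)"
  sleep "$interval"
done
# Guaranteed exit path: report the timeout on stderr and fail with a non-zero code
echo "ERROR: timed out after $((max_retries * interval))s" >&2
exit 1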
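
Requirement 1 also allows a wall-clock timeout instead of an iteration count. A minimal sketch of that variant follows, written so that it should run under bash and other POSIX-style shells that support `local`. The helper name `wait_for` and the readiness check in the usage comment are hypothetical and are not part of specsmith; the timeout and interval are assumed to be whole seconds.

```shell
# wait_for TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Hypothetical helper: polls COMMAND until it succeeds or the wall-clock
# deadline would be overrun. Returns 0 on success, 1 on timeout.
wait_for() {
  local timeout="$1" interval="$2" deadline
  shift 2
  deadline=$(( $(date +%s) + timeout ))
  while :; do
    "$@" && return 0    # condition met
    # Stop if another sleep would carry us past the deadline.
    [ $(( $(date +%s) + interval )) -gt "$deadline" ] && break
    sleep "$interval"
  done
  echo "ERROR: timed out after ${timeout}s waiting for: $*" >&2
  return 1
}

# Usage (hypothetical readiness check):
#   wait_for 150 5 test -f /tmp/ready.flag || exit 1
```

The condition is always checked at least once, even with a timeout of 0, and the function returns rather than exits so the calling script decides how to fail.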

Preferred pattern (PowerShell)

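$maxRetries = 30   # maximum number of polls
$interval = 5      # seconds between polls
$i = 0
while ($i -lt $maxRetries) {
  if (Check-Condition) { exit 0 }   # placeholder: returns $true once the condition holds
  $i++
  Start-Sleep -Seconds $interval
}
# Guaranteed exit path: report the timeout and fail with a non-zero code
Write-Error "Timed out after $($maxRetries * $interval)s"
exit 1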

Why

  • Bare sleep N; command is a recurring cause of hung CI pipelines, stuck terminal sessions, and zombie processes
  • The oz environment list hang (warpdotdev/warp#11419, "oz environment list hangs indefinitely on Windows and cannot be killed with Ctrl+C") was caused by this exact pattern
  • During ctt-neural development, Start-Sleep -Seconds N; gh run list ... blocked terminal sessions when the sleep ran but subsequent commands failed to return
  • specsmith's own specsmith watch command had no timeout and could hang indefinitely, which is the same class of problem

Scope

  • Applies to: .sh, .bash, .zsh, .ps1, .bat, .cmd, CI workflow YAML run: blocks, Makefile recipes
  • Does NOT apply to: production source code (Python time.sleep, Rust thread::sleep, etc.) — only scripts that are directly executed by humans or CI runners
  • Detection: specsmith validate could flag ^\s*sleep \d+ patterns in script files without adjacent loop/retry constructs
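
The detection idea can be sketched as a standalone shell function, shown here only as a rough heuristic and not as how specsmith validate works or would work. The function name `find_bare_sleeps` and the 5-line look-back window are assumptions. It treats a line starting with sleep or Start-Sleep as "bare" when no for/while/until keyword appears in the preceding lines of the same file.

```shell
# find_bare_sleeps FILE...
# Hypothetical checker: prints "file:line: text" for each sleep line that has
# no loop keyword within the preceding `window` lines.
# Returns 1 if anything was flagged, 0 otherwise.
find_bare_sleeps() {
  awk -v window=5 '
    FNR == 1 { last_loop = 0 }    # reset the loop tracker for each file
    {
      code = $0
      sub(/#.*/, "", code)        # drop comments (rough: also cuts "#" inside strings)
    }
    code ~ /^[ \t]*(sleep|Start-Sleep)[ \t]/ {
      if (last_loop == 0 || FNR - last_loop > window) {
        printf "%s:%d: %s\n", FILENAME, FNR, $0
        found = 1
      }
      next
    }
    code ~ /(^|[^A-Za-z0-9_])(for|while|until)([^A-Za-z0-9_]|$)/ { last_loop = FNR }
    END { exit (found ? 1 : 0) }
  ' "$@"
}
```

Because the function only reports and returns a status, a caller could print the findings as warnings during gradual adoption and switch to failing the run later. The heuristic will miss cases, such as a sleep that follows shortly after a loop has already ended, so it is a starting point rather than a complete check.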

Implementation suggestions

  1. Add to the default RULES.md scaffold template
  2. Add a specsmith validate check for bare-sleep in script files (similar to the existing unbounded-loop check)
  3. Emit a warning (not error) on first occurrence to allow gradual adoption
