Add single-mode dispatch to spec_compression_stress.sh #33

Merged: YWHyuk merged 1 commit into main from claude/stress-script-single-mode on May 18, 2026
YWHyuk commented May 18, 2026

Summary

Lets serving/spec_compression_stress.sh run a single mode per invocation, so each mode can be fanned out into its own filesystem-isolated container in parallel.

Why

Previously, the script ran baseline → self_verify → cpu_verify sequentially in one process. The simulator regenerates astra-sim/inputs/{network.yml, system.json, memory_expansion.json} from the cluster config on every run, so simply backgrounding the three calls would race on those shared files and corrupt the in-flight sim's view of its own network / system / memory config. The cleanest fix leaves the per-mode logic of the script untouched and lets the caller run each mode in its own sim docker container: the per-container overlay layer isolates the writes, so no simulator patch is needed.

Changes

| Invocation | Behaviour |
| --- | --- |
| ./script.sh (no arg) | Sequential, all three modes (legacy behaviour) |
| ./script.sh baseline / self_verify / cpu_verify | Runs only the named mode |
| ./script.sh prepare | Workload synthesis + scaled-config generation only, no sim runs |
| ./script.sh <anything else> | Errors out with usage hint |
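The dispatch described above could be sketched roughly as follows. This is an illustration only, not the script's actual code: `run_mode`, `prepare_inputs`, and `dispatch` are hypothetical names, the exit code 2 is an arbitrary choice, and whether the real script re-runs workload synthesis in single-mode invocations is not shown in this PR.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the positional MODE dispatch.
# run_mode and prepare_inputs are placeholders, not the real functions.

run_mode()       { echo "run:$1"; }   # stand-in for one simulator run
prepare_inputs() { echo "prepare"; }  # stand-in for workload synthesis + scaled configs

dispatch() {
    local mode="${1:-all}" m
    case "$mode" in
        all)
            # Legacy behaviour: prepare, then every mode in sequence.
            prepare_inputs
            for m in baseline self_verify cpu_verify; do
                run_mode "$m"
            done
            ;;
        baseline|self_verify|cpu_verify)
            run_mode "$mode"
            ;;
        prepare)
            prepare_inputs
            ;;
        *)
            echo "usage: $0 [baseline|self_verify|cpu_verify|prepare]" >&2
            return 2
            ;;
    esac
}

dispatch baseline
```

A `case` statement keeps the accepted mode names in one place, so an unknown argument falls through to the usage branch instead of silently running nothing.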

Parallel dispatch pattern (documented in header)

IMAGE="ghcr.io/psal-postech/llmservingsimspec/sim:latest"
export RUN_ID="$(date +%Y%m%d-%H%M%S)"
OUTPUT_DIR="outputs/spec_compression_stress_$RUN_ID"

# 1. Prepare workload + scaled configs once on the host so the parallel
#    runs share the same inputs and don't race on workload synthesis.
./serving/spec_compression_stress.sh prepare

# 2. Launch each mode in its own container.
for mode in baseline self_verify cpu_verify; do
    docker run --rm --name "stress_${mode}_$$" \
        -v "$(pwd)/configs":/workspace/configs:ro \
        -v "$(pwd)/$OUTPUT_DIR":"/workspace/$OUTPUT_DIR" \
        -e RUN_ID="$RUN_ID" \
        "$IMAGE" \
        bash /workspace/serving/spec_compression_stress.sh "$mode" \
        > "$OUTPUT_DIR/${mode}.stdout" 2>&1 &
done
wait

Each container gets its own overlay layer so the simulator's astra-sim/inputs/* writes are local to that mode. No code change in the simulator was needed.
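One caveat when adapting the pattern: a bare `wait` with no arguments returns 0 regardless of how the background jobs exited, so a failed container would go unnoticed. A minimal sketch of per-job status collection follows, assuming a hypothetical `launch_mode` function in place of the `docker run ...` call (here it fails for `cpu_verify` purely to exercise the error path).

```shell
#!/usr/bin/env bash
# Sketch: wait on each background job by PID so a failed mode is noticed.
# launch_mode is a hypothetical stand-in for the `docker run ...` command.

launch_mode() {
    # Stand-in: "cpu_verify" fails here only to demonstrate the error path.
    [ "$1" != "cpu_verify" ]
}

run_all() {
    local mode i failed=0
    local -a modes=(baseline self_verify cpu_verify) pids=()

    for mode in "${modes[@]}"; do
        launch_mode "$mode" &
        pids+=("$!")          # remember the PID of each background job
    done

    for i in "${!modes[@]}"; do
        # `wait PID` returns that job's exit status.
        if wait "${pids[$i]}"; then
            echo "${modes[$i]}: ok"
        else
            echo "${modes[$i]}: FAILED"
            failed=1
        fi
    done
    return "$failed"
}

run_all || echo "at least one mode failed"
```

In the real dispatcher, the body of `launch_mode` would be the `docker run` invocation with its stdout redirect, and the per-mode `.stdout` file is where to look when a mode is reported as failed.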

Test plan

  • ./script.sh (no arg) still runs all three sequentially with identical outputs to before.
  • ./script.sh baseline produces just baseline.csv and baseline.log under the output dir.
  • ./script.sh prepare produces workload.jsonl (and the scaled configs if H100_SCALE != 1) and exits.
  • ./script.sh nonsense errors out with the usage hint and a non-zero exit.
  • Parallel-dispatch pattern from the docstring runs three sim containers in parallel against a shared OUTPUT_DIR, producing all three CSVs without astra-sim/inputs/* corruption.
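The last check could be partly automated with a small helper such as the one below. It is a sketch under one assumption: that each mode writes `<mode>.csv` into the output dir, by analogy with the `baseline.csv` named above. `check_outputs` is a hypothetical name and is not part of the script; it only confirms the files exist and are non-empty, not that their contents are uncorrupted.

```shell
#!/usr/bin/env bash
# check_outputs DIR: succeed only if every mode left a non-empty CSV in DIR.
# Assumes files are named <mode>.csv (an assumption based on baseline.csv).
check_outputs() {
    local dir="$1" mode missing=0
    for mode in baseline self_verify cpu_verify; do
        if [ ! -s "$dir/$mode.csv" ]; then
            echo "missing or empty: $dir/$mode.csv" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Demo against a scratch directory that only has a baseline result.
demo_dir="$(mktemp -d)"
printf 'x\n' > "$demo_dir/baseline.csv"
check_outputs "$demo_dir" || echo "incomplete run"
```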

Generated by Claude Code

The script ran all three modes (baseline / self_verify / cpu_verify)
sequentially in one process. Each mode regenerates
astra-sim/inputs/{network.yml,system.json,memory_expansion.json} from
the cluster config, so naively backgrounding the three calls races on
those shared input files and corrupts the in-flight sim's view.

Two changes to enable parallel dispatch via container-per-mode (which
is filesystem-isolated by default, so the inputs/* race goes away):

* Accept a positional MODE argument: ``./script.sh <mode>`` runs only
  the named mode. ``./script.sh`` keeps the legacy "all three
  sequentially" behaviour. Unknown modes error out.

* Add a ``prepare`` mode that does just the workload synthesis +
  scaled-config generation and exits. Lets a parallel dispatcher
  pre-seed the shared OUTPUT_DIR once before fanning out the three
  simulator jobs, instead of every job racing on those steps.

The header docstring documents the docker-based parallel pattern:
prepare on the host, then launch one sim image container per mode
with the same OUTPUT_DIR mounted in. Each container's astra-sim/
lives in its own overlay layer so the input-file race is impossible.