Add single-mode dispatch to spec_compression_stress.sh by YWHyuk · Pull Request #33 · PSAL-POSTECH/LLMServingSimSpec

YWHyuk · 2026-05-18T13:34:56Z

Summary

Lets serving/spec_compression_stress.sh run a single mode per invocation, so each mode can be fanned out into its own filesystem-isolated container in parallel.

Why

The script ran baseline → self_verify → cpu_verify sequentially in one process. The simulator regenerates astra-sim/inputs/{network.yml, system.json, memory_expansion.json} from the cluster config on every run, so simply backgrounding the three calls would race on those shared files and corrupt the in-flight sim's view of its own network / system / memory config. The cleanest fix is to keep the per-mode logic and the simulator unchanged, and let the caller run each mode in its own sim docker container: the per-container overlay layer isolates the writes, so no simulator patch is needed.

Changes

| Invocation | Behaviour |
| --- | --- |
| `./script.sh` (no arg) | Sequential, all three modes (legacy behaviour) |
| `./script.sh baseline` / `self_verify` / `cpu_verify` | Runs only the named mode |
| `./script.sh prepare` | Workload synthesis + scaled-config generation only, no sim runs |
| `./script.sh <anything else>` | Errors out with usage hint |
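The dispatch rules in the table can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: `run_mode`, `prepare_inputs`, and `dispatch_mode` are hypothetical names, and the two helper bodies are placeholders for the real simulator run and workload synthesis.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of positional-mode dispatch. Function names are
# placeholders and do not come from spec_compression_stress.sh.

run_mode() {            # stand-in for one simulator run
    echo "run: $1"
}

prepare_inputs() {      # stand-in for workload synthesis + scaled configs
    echo "prepare"
}

dispatch_mode() {
    local m
    # No argument: legacy behaviour, all three modes in sequence.
    if [ "$#" -eq 0 ]; then
        for m in baseline self_verify cpu_verify; do
            run_mode "$m"
        done
        return 0
    fi

    case "$1" in
        baseline|self_verify|cpu_verify)
            run_mode "$1"
            ;;
        prepare)
            prepare_inputs
            ;;
        *)
            # Unknown mode: usage hint on stderr, non-zero status.
            echo "usage: $0 [baseline|self_verify|cpu_verify|prepare]" >&2
            return 2
            ;;
    esac
}
```

A script built this way would end with `dispatch_mode "$@"`, so the caller's arguments reach the dispatcher unchanged. The exit code 2 for unknown modes is an assumption; the PR only states that the exit is non-zero.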

Parallel dispatch pattern (documented in header)

IMAGE="ghcr.io/psal-postech/llmservingsimspec/sim:latest"
export RUN_ID="$(date +%Y%m%d-%H%M%S)"
OUTPUT_DIR="outputs/spec_compression_stress_$RUN_ID"

# 1. Prepare workload + scaled configs once on the host so the parallel
#    runs share the same inputs and don't race on workload synthesis.
./serving/spec_compression_stress.sh prepare

# 2. Launch each mode in its own container.
for mode in baseline self_verify cpu_verify; do
    docker run --rm --name "stress_${mode}_$$" \
        -v "$(pwd)/configs":/workspace/configs:ro \
        -v "$(pwd)/$OUTPUT_DIR":/workspace/$OUTPUT_DIR \
        -e RUN_ID="$RUN_ID" \
        "$IMAGE" \
        bash /workspace/serving/spec_compression_stress.sh "$mode" \
        > "$OUTPUT_DIR/${mode}.stdout" 2>&1 &
done
wait

Each container gets its own overlay layer so the simulator's astra-sim/inputs/* writes are local to that mode. No code change in the simulator was needed.
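One caveat with the pattern above: a bare `wait` returns zero once all background jobs finish, whatever their individual exit codes were. A caller that needs to know which mode failed could wait on each PID separately. The sketch below assumes a hypothetical `launch_mode` helper standing in for the `docker run ...` command; here it is rigged to fail for `cpu_verify` purely so the failure path is visible.

```shell
#!/usr/bin/env bash
# Sketch: wait on each background job separately so failures are reported.
# launch_mode is a hypothetical stand-in for the real `docker run ...` call.

launch_mode() {
    # Placeholder behaviour: succeed for every mode except cpu_verify.
    [ "$1" != "cpu_verify" ]
}

run_all_modes() {
    pids=""
    for mode in baseline self_verify cpu_verify; do
        launch_mode "$mode" &
        pids="$pids $mode:$!"       # remember which PID belongs to which mode
    done

    failed=0
    for entry in $pids; do
        mode="${entry%%:*}"
        pid="${entry##*:}"
        # wait <pid> returns that job's own exit status.
        if wait "$pid"; then
            echo "$mode: ok"
        else
            echo "$mode: failed"
            failed=1
        fi
    done
    return "$failed"
}
```

In real use, `launch_mode` would wrap the `docker run` invocation and its stdout redirection, and the dispatcher would exit with the status of `run_all_modes`.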

Test plan

./script.sh (no arg) still runs all three sequentially with identical outputs to before.
./script.sh baseline produces just baseline.csv and baseline.log under the output dir.
./script.sh prepare produces workload.jsonl (and the scaled configs if H100_SCALE != 1) and exits.
./script.sh nonsense errors out with the usage hint and a non-zero exit.
Parallel-dispatch pattern from the docstring runs three sim containers in parallel against a shared OUTPUT_DIR, producing all three CSVs without astra-sim/inputs/* corruption.
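
The last check could be partly automated with a small helper that confirms each mode left a non-empty CSV in the shared output directory. This is a sketch under assumptions: `check_outputs` is a hypothetical name, and the `self_verify.csv` / `cpu_verify.csv` file names are extrapolated from the `baseline.csv` named above. It does not detect input-file corruption, only missing or empty results.

```shell
#!/usr/bin/env bash
# Sketch: verify that each mode produced a non-empty CSV.
# Assumes <mode>.csv naming for all three modes (only baseline.csv is
# confirmed by the test plan).

check_outputs() {
    local out_dir="$1"
    local missing=0
    local mode
    for mode in baseline self_verify cpu_verify; do
        if [ -s "$out_dir/$mode.csv" ]; then
            echo "ok: $mode.csv"
        else
            echo "missing or empty: $mode.csv" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

It would be called as `check_outputs "$OUTPUT_DIR"` after the final `wait` of the parallel dispatch.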

Generated by Claude Code

The script ran all three modes (baseline / self_verify / cpu_verify) sequentially in one process. Each mode regenerates astra-sim/inputs/{network.yml,system.json,memory_expansion.json} from the cluster config, so naively backgrounding the three calls races on those shared input files and corrupts the in-flight sim's view.

Two changes to enable parallel dispatch via container-per-mode (which is filesystem-isolated by default, so the inputs/* race goes away):

* Accept a positional MODE argument: ``./script.sh <mode>`` runs only the named mode. ``./script.sh`` keeps the legacy "all three sequentially" behaviour. Unknown modes error out.
* Add a ``prepare`` mode that does just the workload synthesis + scaled-config generation and exits. Lets a parallel dispatcher pre-seed the shared OUTPUT_DIR once before fanning out the three simulator jobs, instead of every job racing on those steps.

The header docstring documents the docker-based parallel pattern: prepare on the host, then launch one sim image container per mode with the same OUTPUT_DIR mounted in. Each container's astra-sim/ lives in its own overlay layer so the input-file race is impossible.

YWHyuk merged commit 80bcf1d into main May 18, 2026

YWHyuk mentioned this pull request May 18, 2026

Track image-content directories in build-sim-image trigger #34

Merged

3 tasks

YWHyuk merged 1 commit into main from claude/stress-script-single-mode
