fix(#664): move CLOSED --inter-option-delay from 20 s to 90 s #665

Merged
FileSystemGuy merged 2 commits into main from fix/664-inter-option-delay-90
Jul 3, 2026

@FileSystemGuy

Summary

  • Closes #664 (kvcache CLOSED --inter-option-delay: move enforced value from 20 s to 90 s). Aligns the CLOSED --inter-option-delay enforcement, defaults, help text, and manpage with PR #602 (KV cache Rules for closed and open submission) §6.3.2.1; author @hazemawadalla confirmed 90 s on 2026-07-03. Runtime and tests were still on the temporary 20 s reconciliation that predated the confirmation.
  • Adds a class-level autouse fixture on TestClosedEnforcement that patches _execute_command, _probe_results_dir_shared, and _interruptible_sleep. The *_returns_1 tests rely on the CLOSED guard short-circuiting before any subprocess launches; when the runtime and tests drift (as they did during this rename), the guard misses and _execute_run falls into the real mpirun / ssh-probe / aggregation path — on localhost with kvcache_bin_path=None that expands into a fork storm heavy enough to OOM a 32 GB host. The fixture ensures a future guard miss can only produce a clean assertion failure, never a runaway fork.
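A minimal sketch of the safety-net idea described above, using only the standard library. `ToyKVCacheBenchmark`, `subprocess_safety_net`, and `run_with_net` are hypothetical stand-ins written for illustration; only the three patched method names come from this PR, and the real class and fixture may look quite different.

```python
from contextlib import contextmanager
from unittest import mock


class ToyKVCacheBenchmark:
    """Hypothetical stand-in for the real benchmark class (not project code)."""

    CLOSED_INTER_OPTION_DELAY = 90  # seconds, the value this PR enforces

    def __init__(self, mode, inter_option_delay):
        self.mode = mode
        self.inter_option_delay = inter_option_delay

    # The real collaborators launch processes; here they just raise so that
    # an unpatched call is obvious.
    def _execute_command(self, cmd):
        raise RuntimeError("would launch a real subprocess")

    def _probe_results_dir_shared(self):
        raise RuntimeError("would run a real ssh probe")

    def _interruptible_sleep(self, seconds):
        raise RuntimeError("would really sleep")

    def _execute_run(self):
        # CLOSED guard: return 1 before anything is launched.
        if (self.mode == "closed"
                and self.inter_option_delay != self.CLOSED_INTER_OPTION_DELAY):
            return 1
        self._probe_results_dir_shared()
        self._execute_command(["mpirun", "..."])
        self._interruptible_sleep(self.inter_option_delay)
        return 0


@contextmanager
def subprocess_safety_net(bench_cls):
    """Swap the three collaborators for mocks while the block is active."""
    with mock.patch.multiple(
        bench_cls,
        _execute_command=mock.DEFAULT,
        _probe_results_dir_shared=mock.DEFAULT,
        _interruptible_sleep=mock.DEFAULT,
    ) as mocks:
        yield mocks  # dict of created mocks, keyed by attribute name


def run_with_net(mode, delay):
    """Run once under the net; report the return code and whether a launch was attempted."""
    with subprocess_safety_net(ToyKVCacheBenchmark) as mocks:
        rc = ToyKVCacheBenchmark(mode, delay)._execute_run()
        return rc, mocks["_execute_command"].called
```

When the guard does not fire, `run_with_net` returns 0 and only the mock is called, so a test asserting a return code of 1 fails on the assertion instead of spawning anything. In a pytest suite the same context manager could presumably be wrapped in a class-level `@pytest.fixture(autouse=True)` that yields inside the `with` block.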

Behavior change

CLOSED submissions that previously would have accepted --inter-option-delay 20 will now hard-fail; they must use 90. This is consistent with the merged §6.3.2.1 text on PR #602.
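The behavior change can be pictured with a small sketch. The function name, the error wording, and the assumption that only CLOSED mode pins the value are all illustrative guesses, not the code in mlpstorage_py/benchmarks/kvcache.py.

```python
import sys

CLOSED_INTER_OPTION_DELAY_S = 90  # seconds, per this PR


def check_closed_inter_option_delay(mode, inter_option_delay, err=sys.stderr):
    """Return 0 if the run may proceed, 1 if a CLOSED run must hard-fail."""
    if mode != "closed":
        # Assumption: modes other than CLOSED do not pin this value.
        return 0
    if inter_option_delay == CLOSED_INTER_OPTION_DELAY_S:
        return 0
    # Hypothetical error text; the real message may differ.
    print(
        f"CLOSED submissions require --inter-option-delay "
        f"{CLOSED_INTER_OPTION_DELAY_S} (got {inter_option_delay})",
        file=err,
    )
    return 1
```

Under these assumptions, a CLOSED invocation with 20 now returns 1, while the same invocation with 90 returns 0.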

Sites changed

Runtime code:

  • mlpstorage_py/benchmarks/kvcache.py — CLOSED hard-fail, error text, and effective-value fallback: 20 → 90
  • mlpstorage_py/cli/kvcache_args.py — closed set_defaults: 20 → 90
  • mlpstorage_py/run_summary.py — effective-value fallback: 20 → 90
  • mlpstorage_py/cli/help_formatter.py — CLOSED pinned-defaults help text: 20s → 90s
  • ManPage.md — closed pins list: 20 → 90
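As a rough illustration of the two kinds of default touched above, the sketch below pins a closed-mode default with argparse's `set_defaults` and applies a fallback when no value is present. The parser layout, the `open` subcommand, and `effective_inter_option_delay` are assumptions made for the example; the project's actual CLI wiring and fallback conditions are not shown in this PR.

```python
import argparse

CLOSED_INTER_OPTION_DELAY_S = 90  # seconds, per this PR


def build_parser():
    """Hypothetical two-mode parser; not the project's real CLI."""
    parser = argparse.ArgumentParser(prog="kvcache-sketch")
    sub = parser.add_subparsers(dest="mode", required=True)
    closed = sub.add_parser("closed")
    open_ = sub.add_parser("open")
    for p in (closed, open_):
        p.add_argument("--inter-option-delay", type=int, default=None)
    # Parser-level default for closed mode only (was 20, now 90).
    closed.set_defaults(inter_option_delay=CLOSED_INTER_OPTION_DELAY_S)
    return parser


def effective_inter_option_delay(args):
    """Assumed fallback: use 90 when the namespace carries no value."""
    value = getattr(args, "inter_option_delay", None)
    return CLOSED_INTER_OPTION_DELAY_S if value is None else value
```

Note that an explicit `--inter-option-delay 20` still parses in this sketch; rejecting it is the job of the CLOSED hard-fail, not of the default.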

Tests:

  • tests/unit/test_cli_kvcache.py — closed set_defaults assertion: 20 → 90
  • tests/unit/test_benchmarks_kvcache.py — CLOSED fixture: 20 → 90; renamed test_closed_inter_option_delay_non_20_returns_1 → test_closed_inter_option_delay_non_90_returns_1 (deliberately uses 20 as the non-90 value to catch a regression back to 20); added the autouse fork-storm safety net on TestClosedEnforcement.

Rules.md §6.3.2.1 is updated on PR #602's branch as a companion change (not touched here — this PR ships the runtime alignment independently so #602 can rebase and pass CI cleanly).

Test plan

  • uv run pytest tests/unit — 2516 passed, 1 skipped
  • uv run pytest mlpstorage_py/tests — 822 passed
  • uv run pytest vdb_benchmark/tests — 155 passed
  • uv run pytest kv_cache_benchmark/tests — 238 passed
  • RED-first verified: with runtime stashed, test_closed_inter_option_delay_non_90_returns_1 and 5 sibling tests fail cleanly (assertion errors, no subprocess spawn) — safety-net fixture works as designed

Per storage#664 the CLOSED --inter-option-delay pin moves from 20 s to
90 s (author confirmed 2026-07-03 for PR #602 §6.3.2.1). Update
tests/unit/test_cli_kvcache.py + tests/unit/test_benchmarks_kvcache.py
to assert the new value; runtime alignment follows in the next commit.

Also add an autouse fixture on TestClosedEnforcement that patches
_execute_command, _probe_results_dir_shared, and _interruptible_sleep
for every test in the class. The *_returns_1 tests rely on the CLOSED
guard short-circuiting before any subprocess launches, so they do not
mock these collaborators. When the runtime and the tests drift (as
they do during this rename), the guard misses and _execute_run falls
into the real mpirun / ssh-probe / aggregation path — on localhost
with kvcache_bin_path=None that expands into a fork storm heavy enough
to OOM a 32 GB host. The fixture ensures a guard miss can only cause a
clean assertion failure, never runaway forking.
…90 s

PR #602 §6.3.2.1 fixes the CLOSED --inter-option-delay at 90 s (author
hazemawadalla confirmed 2026-07-03). The runtime CLI, effective-value
fallback, help text, and ManPage still enforced/documented 20 s — an
artifact of a temporary reconciliation while the author was
unresponsive. Bring them all in line with the spec.

Sites changed:
- mlpstorage_py/benchmarks/kvcache.py — CLOSED hard-fail, error text,
  and effective-value fallback all move 20 -> 90
- mlpstorage_py/cli/kvcache_args.py — closed set_defaults 20 -> 90
- mlpstorage_py/run_summary.py — effective-value fallback 20 -> 90
- mlpstorage_py/cli/help_formatter.py — CLOSED pinned-defaults help
  text inter-option-delay 20s -> 90s
- ManPage.md — closed pins list inter-option-delay 20 -> 90

Rules.md §6.3.2.1 is updated on PR #602's branch as a companion change.

Behavior change: CLOSED submissions that previously would have accepted
--inter-option-delay 20 will now hard-fail; they must use 90.
FileSystemGuy requested a review from a team on July 3, 2026 04:34
@github-actions

github-actions Bot commented Jul 3, 2026

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

FileSystemGuy merged commit 94802d3 into main on Jul 3, 2026 (4 checks passed)
FileSystemGuy deleted the fix/664-inter-option-delay-90 branch on July 3, 2026 04:39
github-actions Bot locked and limited conversation to collaborators on Jul 3, 2026