fix(#664): move CLOSED --inter-option-delay from 20 s to 90 s#665
Merged
Conversation
Per storage#664 the CLOSED --inter-option-delay pin moves from 20 s to 90 s (author confirmed 2026-07-03 for PR #602 §6.3.2.1). Update tests/unit/test_cli_kvcache.py + tests/unit/test_benchmarks_kvcache.py to assert the new value; runtime alignment follows in the next commit. Also add an autouse fixture on TestClosedEnforcement that patches _execute_command, _probe_results_dir_shared, and _interruptible_sleep for every test in the class. The *_returns_1 tests rely on the CLOSED guard short-circuiting before any subprocess launches, so they do not mock these collaborators. When the runtime and the tests drift (as they do during this rename), the guard misses and _execute_run falls into the real mpirun / ssh-probe / aggregation path — on localhost with kvcache_bin_path=None that expands into a fork storm heavy enough to OOM a 32 GB host. The fixture ensures a guard miss can only cause a clean assertion failure, never runaway forking.
…90 s PR #602 §6.3.2.1 fixes the CLOSED --inter-option-delay at 90 s (author hazemawadalla confirmed 2026-07-03). The runtime CLI, effective-value fallback, help text, and ManPage still enforced/documented 20 s — an artifact of a temporary reconciliation while the author was unresponsive. Bring them all in line with the spec. Sites changed: - mlpstorage_py/benchmarks/kvcache.py — CLOSED hard-fail, error text, and effective-value fallback all move 20 -> 90 - mlpstorage_py/cli/kvcache_args.py — closed set_defaults 20 -> 90 - mlpstorage_py/run_summary.py — effective-value fallback 20 -> 90 - mlpstorage_py/cli/help_formatter.py — CLOSED pinned-defaults help text inter-option-delay 20s -> 90s - ManPage.md — closed pins list inter-option-delay 20 -> 90 Rules.md §6.3.2.1 is updated on PR #602's branch as a companion change. Behavior change: CLOSED submissions that previously would have accepted --inter-option-delay 20 will now hard-fail; they must use 90.
|
MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅ |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
--inter-option-delayenforcement / defaults / help / manpage with PR KV cache Rules for closed and open submission #602 §6.3.2.1 (author@hazemawadallaconfirmed 90 s on 2026-07-03). Runtime and tests were still on the temporary 20 s reconciliation from before the confirmation.TestClosedEnforcementthat patches_execute_command,_probe_results_dir_shared, and_interruptible_sleep. The*_returns_1tests rely on the CLOSED guard short-circuiting before any subprocess launches; when the runtime and tests drift (as they did during this rename), the guard misses and_execute_runfalls into the realmpirun/ ssh-probe / aggregation path — on localhost withkvcache_bin_path=Nonethat expands into a fork storm heavy enough to OOM a 32 GB host. The fixture ensures a future guard miss can only produce a clean assertion failure, never a runaway fork.Behavior change
CLOSED submissions that previously would have accepted
--inter-option-delay 20will now hard-fail; they must use 90. Consistent with the merged §6.3.2.1 text on PR #602.Sites changed
Runtime code:
mlpstorage_py/benchmarks/kvcache.py— CLOSED hard-fail, error text, and effective-value fallback: 20 → 90mlpstorage_py/cli/kvcache_args.py— closedset_defaults: 20 → 90mlpstorage_py/run_summary.py— effective-value fallback: 20 → 90mlpstorage_py/cli/help_formatter.py— CLOSED pinned-defaults help text: 20s → 90sManPage.md— closed pins list: 20 → 90Tests:
tests/unit/test_cli_kvcache.py— closedset_defaultsassertion: 20 → 90tests/unit/test_benchmarks_kvcache.py— CLOSED fixture: 20 → 90; renamedtest_closed_inter_option_delay_non_20_returns_1→_non_90_returns_1(deliberately uses 20 as the non-90 value to catch a regression back to 20); autouse fork-storm safety net onTestClosedEnforcement.Rules.md§6.3.2.1 is updated on PR #602's branch as a companion change (not touched here — this PR ships the runtime alignment independently so #602 can rebase and pass CI cleanly).Test plan
uv run pytest tests/unit— 2516 passed, 1 skippeduv run pytest mlpstorage_py/tests— 822 passeduv run pytest vdb_benchmark/tests— 155 passeduv run pytest kv_cache_benchmark/tests— 238 passedtest_closed_inter_option_delay_non_90_returns_1and 5 sibling tests fail cleanly (assertion errors, no subprocess spawn) — safety-net fixture works as designed