fix(#568): refine CAP-01 object-storage skip — key on storage_type, add helper + dedicated tests#581
Merged
Merged
Conversation
|
MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅ |
Adds 11 new tests covering the object-storage escape hatch:
TestDLIOIsObjectStorage (new class): the shared helper that
reads storage.storage_type from params_dict / combined_params.
Locks the lookup precedence, the {'s3','s3_torch'} membership,
the direct_fs (--o-direct) NEGATIVE case, the
None-combined_params guard, and the
`--params storage.storage_type=s3 without --object` edge case
that a data_access_protocol-keyed signal would miss.
TestTrainingBenchmarkRequiredBytes /
TestCheckpointingBenchmarkRequiredBytes (extended):
test_destination_is_none_when_object_storage on both — gates
must return None so _pre_execution_gate fires the A8 hatch.
Also isolates the three existing destination tests
(test_destination_is_args_data_dir,
test_destination_is_checkpoint_folder_joined_with_model,
test_destination_is_none_when_checkpoint_folder_empty) from the
new helper by stubbing bm._is_object_storage = False; the prior
form passed by accident because the helper did not exist.
Drive-by: fixes pre-existing collection rot in the dep-stub block
(find_spec('pyarrow.ipc') raised after pyarrow itself was stubbed
with MagicMock).
2 destination + 9 helper tests fail as expected; fix lands next
commit.
Training and Checkpointing _capacity_gate_destination() returned the
raw destination unconditionally. In object mode that's an s3:// URI;
check_capacity_4field walks the parent chain to the filesystem root
and aborts with `[E401] CAP-01: no valid parent for s3://…` before any
work starts. --skip-validation does not help — CAP-01 lives in
_pre_execution_gate(), not validate_benchmark_environment().
Adds a shared DLIOBenchmark._is_object_storage() helper that reads
storage.storage_type from params_dict / combined_params (same lookup
as _check_storage_scheme_consistency) and returns True for {'s3',
's3_torch'}. Keying on storage_type (post-_apply_object_storage_params
state) rather than data_access_protocol catches the
`--params storage.storage_type=s3` path where a user wires up object
storage without passing `--object`. direct_fs (--o-direct) still
resolves to a local path and is unaffected.
Both _capacity_gate_destination overrides short-circuit on the helper,
returning None so Benchmark._pre_execution_gate fires the existing A8
remote-backend escape hatch (logged as
`CAP-01 skipped: destination not local`).
Also updates the integration A7-lock test to stub _is_object_storage
on the MagicMock fixture — the prior form passed by accident because
the helper did not exist.
4dc4d6d to
90a761d
Compare
data_access_protocol is registered as a positional with choices= ['file','object'] in cli/common_args.py, not as a --object flag. Updates the helper docstring, the TestDLIOIsObjectStorage class docstring, the edge-case test name (was test_storage_type_set_via_params_without_object_flag → now test_storage_type_set_via_params_under_file_positional), and that test's docstring to use the correct grammar. Behavior unchanged — _is_object_storage keys on storage.storage_type, which is downstream of however the user selected their backend.
This was referenced Jun 29, 2026
This was referenced Jun 29, 2026
russfellows
approved these changes
Jun 29, 2026
russfellows
left a comment
Contributor
There was a problem hiding this comment.
We'll get this working eventually.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Refines the already-merged #579 fix for #568.
#579 (now in main) shipped a working but narrow fix that keyed on
data_access_protocol == 'object'. After reviewing PR #574 (closed, no test coverage, broke 2 existing fixtures), this PR combines the best of both:storage.storage_typeso the gate also skips for--params storage.storage_type=s3without--object.DLIOBenchmark._is_object_storage()consumed by both_capacity_gate_destinationoverrides.AttributeError, GREEN commit passes; dedicatedTestDLIOIsObjectStorageclass for the helper.What changes vs. #579
mlpstorage_py/benchmarks/dlio.py— replaces the twodata_access_protocol-based checks with a single shared_is_object_storage()helper that readsstorage.storage_typefromparams_dict/combined_params(same lookup as_check_storage_scheme_consistency) and returns True for{'s3', 's3_torch'}.direct_fs(--o-direct) is correctly NOT skipped — local statvfs still applies.tests/unit/test_capacity_gate.py— addsTestDLIOIsObjectStorage(9 tests covering precedence, fallback,direct_fsnegative case,Nonecombined_params guard, and the no---objectedge case). Restructures the twotest_destination_is_none_when_object_storagetests to use the helper-stub pattern.tests/integration/test_systemname_yaml_end_to_end.py— stubs_is_object_storage=Falseon the A7-lock fixture (would otherwise silently fail when the helper exists).Test plan
RED commit fails with
AttributeError: type object 'DLIOBenchmark' has no attribute '_is_object_storage'on 9 helper tests + 2 destination tests. GREEN commit passes with no regressions:tests/: 2392 passed, 13 deselectedmlpstorage_py/tests: 780 passed, 1 xfailedvdb_benchmark/tests: 144 passedkv_cache_benchmark/tests: 238 passed--params storage.storage_type=s3repro path also skips the gate cleanly under this refinement.