
Update python version, DLIO, and uv.lock#298

Merged
FileSystemGuy merged 3 commits into main from fsg-python-version
Apr 4, 2026

Conversation

@FileSystemGuy
Contributor

Constrain Python to be in the 3.12 family only.
Change the DLIO dependencies in pyproject.toml to refer to DLIO_local_changes.
Update the uv.lock file to reflect those changes.

@FileSystemGuy FileSystemGuy requested a review from a team March 31, 2026 20:59
@github-actions

github-actions bot commented Mar 31, 2026

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@FileSystemGuy
Contributor Author

This is intended to resolve #293
Hopefully it will actually do that!

idevasena previously approved these changes Mar 31, 2026
@FileSystemGuy
Contributor Author

This one needs real testing before we merge it in, so please don't merge it until we're sure it hasn't broken anything.

@idevasena
Contributor

idevasena commented Apr 2, 2026

@FileSystemGuy Performed the following tests to validate PR:

  1. Verified pyproject.toml changes and uv.lock consistency
  2. Verified uv.lock reproducibility:
    This confirms the lock file was generated correctly and isn't stale.
smrc@dskbd029:~/Storage_Repo_Tests/storage$ git checkout pr-298
Switched to branch 'pr-298'
smrc@dskbd029:~/Storage_Repo_Tests/storage$ cat .python-version
3.12
smrc@dskbd029:~/Storage_Repo_Tests/storage$ grep "requires-python" pyproject.toml
requires-python = ">=3.12, <3.13"
smrc@dskbd029:~/Storage_Repo_Tests/storage$ grep -n "dlio-benchmark" pyproject.toml
24:    "dlio-benchmark @ git+https://github.com/mlcommons/DLIO_local_changes.git",
35:    "dlio-benchmark @ git+https://github.com/mlcommons/DLIO_local_changes.git",
smrc@dskbd029:~/Storage_Repo_Tests/storage$ grep -i "wvaske" uv.lock
smrc@dskbd029:~/Storage_Repo_Tests/storage$ grep -i "argonne-lcf" uv.lock
smrc@dskbd029:~/Storage_Repo_Tests/storage$ grep -i "DLIO_local_changes" uv.lock
source = { git = "https://github.com/mlcommons/DLIO_local_changes.git#4be40e6b9077674513c0defd5283faf3cbae8445" }
    { name = "dlio-benchmark", marker = "extra == 'dlio'", git = "https://github.com/mlcommons/DLIO_local_changes.git" },
    { name = "dlio-benchmark", marker = "extra == 'full'", git = "https://github.com/mlcommons/DLIO_local_changes.git" },
smrc@dskbd029:~/Storage_Repo_Tests/storage$ cp uv.lock uv.lock.from_pr
smrc@dskbd029:~/Storage_Repo_Tests/storage$ uv lock
Using CPython 3.12.3 interpreter at: /usr/bin/python3.12
Resolved 98 packages in 1ms
smrc@dskbd029:~/Storage_Repo_Tests/storage$ diff uv.lock uv.lock.from_pr
smrc@dskbd029:~/Storage_Repo_Tests/storage$ uv venv --python 3.12
Using CPython 3.12.3 interpreter at: /usr/bin/python3.12
Creating virtual environment at: .venv
Activate with: source .venv/bin/activate
  3. After installation and dependency resolution:
    Verified installed DLIO source:
(storage) smrc@dskbd029:~/Storage_Repo_Tests/storage$ cat .venv/lib/python3.12/site-packages/dlio_benchmark-*.dist-info/direct_url.json | python3 -m json.tool
{
    "url": "https://github.com/mlcommons/DLIO_local_changes.git",
    "vcs_info": {
        "vcs": "git",
        "commit_id": "4be40e6b9077674513c0defd5283faf3cbae8445"
    }
}
(storage) smrc@dskbd029:~/Storage_Repo_Tests/storage$ python3 -c "import mlpstorage; print('import OK')"
import OK
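
The same origin check can be done without globbing site-packages, by reading the installer-recorded direct_url.json (PEP 610) through importlib.metadata. A sketch, assuming the distribution is named dlio_benchmark as in the log above:

```python
# Sketch: read the installer-recorded origin (direct_url.json) via
# importlib.metadata instead of poking into site-packages. The distribution
# name "dlio_benchmark" is taken from the log above; adjust if it differs.
import json
from importlib import metadata

def dist_origin(dist_name="dlio_benchmark"):
    """Return the parsed direct_url.json for a distribution, or None if the
    installer did not record one (e.g. a plain PyPI install)."""
    dist = metadata.distribution(dist_name)  # raises PackageNotFoundError if absent
    raw = dist.read_text("direct_url.json")
    return json.loads(raw) if raw else None
```

For a git install like the one above, the returned dict carries the same url and vcs_info.commit_id fields shown in the JSON output.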
  4. Functional tests

DLIO data generation test:

(storage) smrc@dskbd029:~/Storage_Repo_Tests/storage$ mlpstorage training datagen --num-processes 2     --data-dir /tmp/mlperf_storage_test  -m unet3d
Hosts is: ['127.0.0.1']
Hosts is: ['127.0.0.1']
⠋ Validating environment... 0:00:002026-04-01 22:54:19|INFO: Environment validation passed
2026-04-01 22:54:19|STATUS: Benchmark results directory: /tmp/mlperf_storage_results/training/unet3d/datagen/20260401_225419
2026-04-01 22:54:19|INFO: Creating data directory: /tmp/mlperf_storage_test/unet3d...
2026-04-01 22:54:19|INFO: Creating directory: /tmp/mlperf_storage_test/unet3d/train...
2026-04-01 22:54:19|INFO: Creating directory: /tmp/mlperf_storage_test/unet3d/valid...
2026-04-01 22:54:19|INFO: Creating directory: /tmp/mlperf_storage_test/unet3d/test...
⠋ Validating environment... ━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1/4 0:00:002026-04-01 22:54:19|STATUS: Running benchmark command:: mpirun -n 2 -host 127.0.0.1:2 --bind-to none --map-by socket /home/smrc/Storage_Repo_Tests/storage/.venv/bin/dlio_benchmark workload=unet3d_datagen ++hydra.run.dir=/tmp/mlperf_storage_results/training/unet3d/datagen/20260401_225419 ++hydra.output_subdir=dlio_config ++workload.dataset.data_folder=/tmp/mlperf_storage_test/unet3d --config-dir=/home/smrc/Storage_Repo_Tests/storage/configs/dlio
[DEBUG DLIOBenchmark.__init__] After LoadConfig:
  storage_type   = <StorageType.LOCAL_FS: 'local_fs'>
  storage_root   = './'
  storage_options= None
  data_folder    = '/tmp/mlperf_storage_test/unet3d'
  framework      = <FrameworkType.PYTORCH: 'pytorch'>
  num_files_train= 168
[OUTPUT] 2026-04-01T22:54:23.876022 Running DLIO [Generating data] with 2 process(es)
  record_length  = 146600628
  generate_data  = True
  do_train       = False
  do_checkpoint  = False
  epochs         = 1
  batch_size     = 1
[DEBUG DLIOBenchmark.__init__] After LoadConfig:
[OUTPUT] ================================================================================
[OUTPUT] Data Generation Method: DGEN (default)
[OUTPUT]   dgen-py zero-copy BytesView — 155x faster than NumPy, 0 MB overhead
[OUTPUT] ================================================================================
[OUTPUT] ================================================================================
[OUTPUT] Data Generation Method: DGEN (default)
[OUTPUT]   dgen-py zero-copy BytesView — 155x faster than NumPy, 0 MB overhead
[OUTPUT] ================================================================================
[OUTPUT] 2026-04-01T22:54:23.911240 Starting data generation
[OUTPUT] 2026-04-01T22:54:35.168146 Generation done
[OUTPUT] ================================================================================
[OUTPUT] Data Generation Method: DGEN (default)
  storage_type   = <StorageType.LOCAL_FS: 'local_fs'>
[OUTPUT]   dgen-py zero-copy BytesView — 155x faster than NumPy, 0 MB overhead
  storage_root   = './'
  storage_options= None
  data_folder    = '/tmp/mlperf_storage_test/unet3d'
  framework      = <FrameworkType.PYTORCH: 'pytorch'>
  num_files_train= 168
  record_length  = 146600628
  generate_data  = True
  do_train       = False
  do_checkpoint  = False
  epochs         = 1
  batch_size     = 1
[OUTPUT] ================================================================================
[OUTPUT] ================================================================================
[OUTPUT] Data Generation Method: DGEN (default)
[OUTPUT]   dgen-py zero-copy BytesView — 155x faster than NumPy, 0 MB overhead
[OUTPUT] ================================================================================

DLIO benchmark run test:

(storage) smrc@dskbd029:~/Storage_Repo_Tests/storage$ mlpstorage training run     --model unet3d     --num-accelerators 2     --accelerator-type h100     --data-dir /tmp/mlperf_storage_test     --results-dir /tmp/mlperf_storage_results --client-host-memory-in-gb 1000
Setting attr from num_accelerators to 2
Hosts is: ['127.0.0.1']
Hosts is: ['127.0.0.1']
⠋ Validating environment... 0:00:002026-04-01 22:56:10|INFO: Environment validation passed
2026-04-01 22:56:10|STATUS: Benchmark results directory: /tmp/mlperf_storage_results/training/unet3d/run/20260401_225610
2026-04-01 22:56:11|INFO: Created benchmark run: training_run_unet3d_20260401_225610
2026-04-01 22:56:11|STATUS: Verifying benchmark run for training_run_unet3d_20260401_225610
2026-04-01 22:56:11|RESULT: Minimum file count dictated by dataset size to memory size ratio.
2026-04-01 22:56:11|ERROR: INVALID: [INVALID] Insufficient number of training files (Parameter: dataset.num_files_train, Expected: >= 36889, Actual: 168)
2026-04-01 22:56:11|STATUS: Benchmark run is INVALID due to 1 issues ([RunID(program='training', command='run', model='unet3d', run_datetime='20260401_225610')])
2026-04-01 22:56:11|WARNING: Running the benchmark without verification for open or closed configurations. These results are not valid for submission. Use --open or --closed to specify a configuration.
⠦ Collecting cluster info... ━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━ 2/4 0:00:002026-04-01 22:56:11|STATUS: Running benchmark command:: mpirun -n 2 -host 127.0.0.1:2 --bind-to none --map-by socket /home/smrc/Storage_Repo_Tests/storage/.venv/bin/dlio_benchmark workload=unet3d_h100 ++hydra.run.dir=/tmp/mlperf_storage_results/training/unet3d/run/20260401_225610 ++hydra.output_subdir=dlio_config ++workload.dataset.data_folder=/tmp/mlperf_storage_test/unet3d --config-dir=/home/smrc/Storage_Repo_Tests/storage/configs/dlio
[DEBUG DLIOBenchmark.__init__] After LoadConfig:
  storage_type   = <StorageType.LOCAL_FS: 'local_fs'>
  storage_root   = './'
  storage_options= None
  data_folder    = '/tmp/mlperf_storage_test/unet3d'
  framework      = <FrameworkType.PYTORCH: 'pytorch'>
  num_files_train= 168
  record_length  = 146600628
  generate_data  = False
[OUTPUT] 2026-04-01T22:56:14.512806 Running DLIO [Training & Checkpointing] with 2 process(es)
  do_train       = True
  do_checkpoint  = True
[WARNING] The amount of dataset is smaller than the host memory; data might be cached after the first epoch. Increase the size of dataset to eliminate the 
caching effect!!!
  epochs         = 5
  batch_size     = 7
[DEBUG DLIOBenchmark.__init__] After LoadConfig:
[OUTPUT] ================================================================================
[OUTPUT] Data Generation Method: DGEN (default)
[OUTPUT] ================================================================================
[OUTPUT]   dgen-py zero-copy BytesView — 155x faster than NumPy, 0 MB overhead
[OUTPUT] Data Generation Method: DGEN (default)
[OUTPUT] ================================================================================
[OUTPUT]   dgen-py zero-copy BytesView — 155x faster than NumPy, 0 MB overhead
[OUTPUT] ================================================================================
[OUTPUT] 2026-04-01T22:56:14.553670 Model size: 0.000010 GB 
[OUTPUT] 2026-04-01T22:56:14.553742 Total checkpoint size: 0.000010 GB 
[OUTPUT] 2026-04-01T22:56:14.553813 Max steps per epoch: 12 = 1 * 168 / 7 / 2 (samples per file * num files / batch size / comm size)
[OUTPUT] 2026-04-01T22:56:14.632422 Starting epoch 1: 12 steps expected
[OUTPUT] 2026-04-01T22:56:14.632621 Starting block 1
[OUTPUT] 2026-04-01T22:56:19.313152 Ending block 1 - 12 steps completed in 4.68 s
[OUTPUT] 2026-04-01T22:56:19.326037 Epoch 1 - Block 1 [Training] Accelerator Utilization [AU] (%): 97.1905
[OUTPUT] 2026-04-01T22:56:19.326221 Epoch 1 - Block 1 [Training] Throughput (samples/second): 38.2855
[OUTPUT] 2026-04-01T22:56:19.326338 Epoch 1 - Block 1 [Training] Computation time per step (second): 0.3231+/-0.0000 (set value: {'mean': 0.323})
[OUTPUT] 2026-04-01T22:56:19.331486 Ending epoch 1 - 12 steps completed in 4.70 s
[OUTPUT] 2026-04-01T22:56:19.336531 Starting epoch 2: 12 steps expected
[OUTPUT] 2026-04-01T22:56:19.337212 Starting block 1
[OUTPUT] 2026-04-01T22:56:23.962226 Ending block 1 - 12 steps completed in 4.63 s
[OUTPUT] 2026-04-01T22:56:23.965841 Epoch 2 - Block 1 [Training] Accelerator Utilization [AU] (%): 97.1230
[OUTPUT] 2026-04-01T22:56:23.965986 Epoch 2 - Block 1 [Training] Throughput (samples/second): 38.2590
[OUTPUT] 2026-04-01T22:56:23.966093 Epoch 2 - Block 1 [Training] Computation time per step (second): 0.3231+/-0.0000 (set value: {'mean': 0.323})
[OUTPUT] 2026-04-01T22:56:23.970881 Ending epoch 2 - 12 steps completed in 4.63 s
[OUTPUT] 2026-04-01T22:56:23.975115 Starting epoch 3: 12 steps expected
[OUTPUT] 2026-04-01T22:56:23.975700 Starting block 1
[OUTPUT] 2026-04-01T22:56:28.606212 Ending block 1 - 12 steps completed in 4.63 s
[OUTPUT] 2026-04-01T22:56:28.610569 Epoch 3 - Block 1 [Training] Accelerator Utilization [AU] (%): 97.0761
[OUTPUT] 2026-04-01T22:56:28.610729 Epoch 3 - Block 1 [Training] Throughput (samples/second): 38.2404
[OUTPUT] 2026-04-01T22:56:28.610866 Epoch 3 - Block 1 [Training] Computation time per step (second): 0.3231+/-0.0000 (set value: {'mean': 0.323})
[OUTPUT] 2026-04-01T22:56:28.615659 Ending epoch 3 - 12 steps completed in 4.64 s
[OUTPUT] 2026-04-01T22:56:28.620051 Starting epoch 4: 12 steps expected
[OUTPUT] 2026-04-01T22:56:28.620683 Starting block 1
[OUTPUT] 2026-04-01T22:56:33.245046 Ending block 1 - 12 steps completed in 4.62 s
[OUTPUT] 2026-04-01T22:56:33.248665 Epoch 4 - Block 1 [Training] Accelerator Utilization [AU] (%): 97.0818
[OUTPUT] 2026-04-01T22:56:33.248835 Epoch 4 - Block 1 [Training] Throughput (samples/second): 38.2435
[OUTPUT] 2026-04-01T22:56:33.248946 Epoch 4 - Block 1 [Training] Computation time per step (second): 0.3231+/-0.0000 (set value: {'mean': 0.323})
[OUTPUT] 2026-04-01T22:56:33.253622 Ending epoch 4 - 12 steps completed in 4.63 s
[OUTPUT] 2026-04-01T22:56:33.257922 Starting epoch 5: 12 steps expected
[OUTPUT] 2026-04-01T22:56:33.258482 Starting block 1
[OUTPUT] 2026-04-01T22:56:37.884269 Ending block 1 - 12 steps completed in 4.63 s
[OUTPUT] 2026-04-01T22:56:37.887884 Epoch 5 - Block 1 [Training] Accelerator Utilization [AU] (%): 96.9763
[OUTPUT] 2026-04-01T22:56:37.888032 Epoch 5 - Block 1 [Training] Throughput (samples/second): 38.2007
[OUTPUT] 2026-04-01T22:56:37.888164 Epoch 5 - Block 1 [Training] Computation time per step (second): 0.3231+/-0.0000 (set value: {'mean': 0.323})
[OUTPUT] 2026-04-01T22:56:37.888695 Starting saving checkpoint 1 after total step 12 for epoch 5
[OUTPUT] 2026-04-01T22:56:37.939609 Finished saving checkpoint 1 for epoch 5 in 0.0509 s; Throughput: 0.0002 GB/s
[OUTPUT] 2026-04-01T22:56:37.949224 Ending epoch 5 - 12 steps completed in 4.69 s
[OUTPUT] 2026-04-01T22:56:37.954150 Saved outputs in /tmp/mlperf_storage_results/training/unet3d/run/20260401_225610
[OUTPUT] Averaged metric over all steps/epochs
[METRIC] ==========================================================
[METRIC] Number of Simulated Accelerators: 2 
[METRIC] Training Accelerator Utilization [AU] (%): 97.6480 (0.0570)
[METRIC] Training Throughput (samples/second): 38.4658 (0.0227)
[METRIC] Training I/O Throughput (MB/second): 5377.8813 (3.1798)
[METRIC] train_au_meet_expectation: success
[METRIC] ==========================================================

[OUTPUT] 2026-04-01T22:56:37.957874 outputs saved in RANKID_output.json
  storage_type   = <StorageType.LOCAL_FS: 'local_fs'>
  storage_root   = './'
  storage_options= None
  data_folder    = '/tmp/mlperf_storage_test/unet3d'
  framework      = <FrameworkType.PYTORCH: 'pytorch'>
  num_files_train= 168
  record_length  = 146600628
  generate_data  = False
  do_train       = True
  do_checkpoint  = True
  epochs         = 5
  batch_size     = 7

@idevasena
Copy link
Copy Markdown
Contributor

I ran the existing unit/integration tests too, but saw 16 failed tests and 13 errors. However, those appear to be pre-existing bugs in the main branch test suite.

(storage) smrc@dskbd029:~/Storage_Repo_Tests/storage$ uv pip install -e ".[test]"
Resolved 15 packages in 428ms
      Built mlpstorage @ file:///home/smrc/Storage_Repo_Tests/storage
Prepared 7 packages in 631ms
Uninstalled 1 package in 0.92ms
Installed 7 packages in 4ms
 + coverage==7.13.5
 + iniconfig==2.3.0
 ~ mlpstorage==2.0.0b1 (from file:///home/smrc/Storage_Repo_Tests/storage)
 + pluggy==1.6.0
 + pytest==9.0.2
 + pytest-cov==7.1.0
 + pytest-mock==3.15.1
(storage) smrc@dskbd029:~/Storage_Repo_Tests/storage$ pytest
================================================================= short test summary info =================================================================
FAILED tests/integration/test_benchmark_flow.py::TestTrainingBenchmarkFlow::test_training_benchmark_what_if_mode - PermissionError: [Errno 13] Permission denied: '/data'
FAILED tests/integration/test_benchmark_flow.py::TestTrainingBenchmarkFlow::test_training_benchmark_generates_correct_command - PermissionError: [Errno 13] Permission denied: '/data'
FAILED tests/integration/test_benchmark_flow.py::TestBenchmarkWithMockExecutor::test_benchmark_records_executed_commands - PermissionError: [Errno 13] Permission denied: '/data'
FAILED tests/integration/test_benchmark_flow.py::TestBenchmarkWithMockCollector::test_benchmark_uses_mock_cluster_info - PermissionError: [Errno 13] Permission denied: '/data'
FAILED tests/integration/test_benchmark_flow.py::TestMetadataGeneration::test_metadata_file_created_on_run - PermissionError: [Errno 13] Permission denied: '/data'
FAILED tests/unit/test_benchmarks_base.py::TestBenchmarkProgress::test_cluster_collection_shows_spinner - AssertionError: Expected 'progress_context' to have been called once. Called 0 times.
FAILED tests/unit/test_benchmarks_base.py::TestBenchmarkProgress::test_cluster_collection_updates_description_ssh - AssertionError: expected call not found.
FAILED tests/unit/test_benchmarks_base.py::TestBenchmarkProgress::test_cluster_collection_updates_description_mpi - AssertionError: expected call not found.
FAILED tests/unit/test_benchmarks_base.py::TestBenchmarkProgress::test_end_cluster_collection_shows_spinner - AssertionError: Expected 'progress_context' to have been called once. Called 0 times.
FAILED tests/unit/test_reporting.py::TestReportGeneratorInit::test_accepts_custom_logger - SystemExit: FILE_NOT_FOUND (3)
FAILED tests/unit/test_reporting.py::TestReportGeneratorInit::test_uses_debug_from_args - SystemExit: FILE_NOT_FOUND (3)
FAILED tests/unit/test_reporting.py::TestReportGeneratorAccumulateResults::test_accumulates_from_benchmark_runs - SystemExit: FILE_NOT_FOUND (3)
FAILED tests/unit/test_reporting.py::TestReportGeneratorAccumulateResults::test_groups_by_workload - SystemExit: FILE_NOT_FOUND (3)
FAILED tests/unit/test_reporting.py::TestReportGeneratorIntegration::test_full_workflow_with_fixture_data - SystemExit: FILE_NOT_FOUND (3)
FAILED tests/unit/test_rules_calculations.py::TestGetRunsFiles::test_returns_empty_for_nonexistent_dir - AssertionError: Expected 'warning' to have been called.
FAILED tests/unit/test_rules_calculations.py::TestGetRunsFiles::test_skips_dirs_with_multiple_metadata - assert 1 == 0
ERROR tests/unit/test_reporting.py::TestReportGeneratorWriteJson::test_writes_json_file - SystemExit: FILE_NOT_FOUND (3)
ERROR tests/unit/test_reporting.py::TestReportGeneratorWriteJson::test_json_has_proper_formatting - SystemExit: FILE_NOT_FOUND (3)
ERROR tests/unit/test_reporting.py::TestReportGeneratorWriteCsv::test_writes_csv_file - SystemExit: FILE_NOT_FOUND (3)
ERROR tests/unit/test_reporting.py::TestReportGeneratorWriteCsv::test_flattens_nested_dicts - SystemExit: FILE_NOT_FOUND (3)
ERROR tests/unit/test_reporting.py::TestReportGeneratorWriteCsv::test_handles_nan_values - SystemExit: FILE_NOT_FOUND (3)
ERROR tests/unit/test_reporting.py::TestReportGeneratorGenerateReports::test_returns_success - SystemExit: FILE_NOT_FOUND (3)
ERROR tests/unit/test_reporting.py::TestReportGeneratorGenerateReports::test_creates_json_file - SystemExit: FILE_NOT_FOUND (3)
ERROR tests/unit/test_reporting.py::TestReportGeneratorGenerateReports::test_creates_csv_file - SystemExit: FILE_NOT_FOUND (3)
ERROR tests/unit/test_reporting.py::TestReportGeneratorPrintResults::test_prints_closed_results - SystemExit: FILE_NOT_FOUND (3)
ERROR tests/unit/test_reporting.py::TestReportGeneratorPrintResults::test_prints_issues - SystemExit: FILE_NOT_FOUND (3)
ERROR tests/unit/test_reporting.py::TestReportGeneratorPrintResults::test_prints_metrics - SystemExit: FILE_NOT_FOUND (3)
ERROR tests/unit/test_reporting.py::TestReportGeneratorPrintResults::test_prints_metric_lists - SystemExit: FILE_NOT_FOUND (3)
ERROR tests/unit/test_reporting.py::TestReportGeneratorPrintResults::test_prints_workload_results - SystemExit: FILE_NOT_FOUND (3)
================================================== 16 failed, 831 passed, 12 skipped, 13 errors in 6.61s ==================================================

Reasoning for the failures:

Category 1: test_reporting.py failures (13 errors + 5 failures). The test fixtures create a bare results/ directory, but ReportGenerator.__init__ now validates the directory structure and calls sys.exit(EXIT_CODE.FILE_NOT_FOUND) when it doesn't find training/, checkpointing/, vector_database/, or kv_cache/ subdirectories. The tests weren't updated to match the new validation logic. This is a test-vs-code mismatch on main, not a PR #298 regression.
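
A sketch of the kind of fixture fix Category 1 needs: build the subdirectory layout the new validation expects. The subdirectory names are the ones listed above; the helper name is illustrative, not the real fixture.

```python
# Sketch: create a results/ tree with the subdirectories the new
# ReportGenerator validation looks for, so fixtures stop tripping
# sys.exit(EXIT_CODE.FILE_NOT_FOUND). make_results_dir is a stand-in name.
import pathlib
import tempfile

def make_results_dir():
    root = pathlib.Path(tempfile.mkdtemp()) / "results"
    for sub in ("training", "checkpointing", "vector_database", "kv_cache"):
        (root / sub).mkdir(parents=True, exist_ok=True)
    return root
```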

Category 2: test_benchmark_flow.py failures (5) — PermissionError: [Errno 13] Permission denied: '/data'. The TrainingBenchmark tries to os.makedirs('/data') which requires root. These tests need to mock the filesystem or use tmp_path. Again, pre-existing on main.
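
For Category 2, the usual pytest fix is to route the hard-coded '/data' through the tmp_path fixture. A sketch, with make_args standing in for however the real tests build benchmark arguments:

```python
# Sketch: use pytest's tmp_path fixture instead of the hard-coded '/data',
# so os.makedirs never needs root. make_args is an illustrative stand-in,
# not the real test helper.
import os

def make_args(data_dir):
    return {"data_dir": str(data_dir)}

def test_training_benchmark_uses_tmp_path(tmp_path):
    args = make_args(tmp_path / "data")
    # The real test would hand args to the benchmark under test; the key
    # point is that the directory lives under tmp_path, not under /.
    os.makedirs(args["data_dir"], exist_ok=True)
    assert os.path.isdir(args["data_dir"])
```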

Category 3: test_benchmarks_base.py failures (4) — Progress spinner mock assertions not matching. The mocked progress_context is never called, suggesting the code path changed but tests weren't updated. Pre-existing.

Category 4: test_rules_calculations.py failures (2) — Logic changes in how nonexistent dirs and multiple metadata files are handled. Pre-existing.

@idevasena
Contributor

Verifying the Submodule Issue (#293) is actually addressed:

(storage) smrc@dskbd029:~/Storage_Repo_Tests/storage$ cat .gitmodules 2>/dev/null
[submodule "dlio_benchmark"]
path = dlio_benchmark
url = https://github.com/mlcommons/DLIO_local_changes
branch = main
(storage) smrc@dskbd029:~/Storage_Repo_Tests/storage$ git submodule status
-4be40e6b9077674513c0defd5283faf3cbae8445 dlio_benchmark

The .gitmodules file shows the submodule has been updated to point to mlcommons/DLIO_local_changes (matching the pyproject.toml change), but the - prefix in git submodule status means the submodule hasn't been initialized/checked out.

To summarize:

This PR partially addresses issue #293: it aligns pyproject.toml, uv.lock, and .gitmodules to the same repo (mlcommons/DLIO_local_changes).
But the submodule is still declared; it's just not actively used, since the dependency is pulled via pip/uv from the git URL in pyproject.toml.

A follow-up cleanup to remove the submodule entirely (i.e. delete .gitmodules, git rm dlio_benchmark) would fully close issue #293, since the submodule is redundant with the pyproject.toml dependency declaration.
But that's a separate concern from the correctness of this PR.

The PR looks safe to approve from a functional standpoint overall.

@FileSystemGuy
Contributor Author

Devasena, thank you for digging into this!! Both the testing of the pyproject.toml and uv.lock files, and the suggestion that the submodule linkage is not required.

Since using dlio_benchmark as a "submodule" causes trouble for the end-user (they have to add the --recurse-submodules option when cloning the storage repo), if we can depend upon the pyproject.toml file to pull our DLIO into the user's environment, then it sounds like we should go that direction?

IIUC, that means two things:

  1. the "download and run" instructions for the benchmark must include "pip install uv; uv sync" so that the correct DLIO actually gets pulled down onto the local machine
  2. I can delete the dlio_benchmark "submodule" linkage from the storage repo

Does that sound correct, and is it a good idea?

@idevasena
Contributor

Thank you, Curtis! Yes, that's correct and a good idea.

  1. Agree with removing the submodule as a right move:
    The submodule was the original mechanism (v0.5/v1.0), when users ran git clone --recurse-submodules and then pip install -r dlio_benchmark/requirements.txt. The v2.0 installer already moved to pyproject.toml, with pip install -e . pulling DLIO from git. So the submodule has been unused since v2.0; it just creates confusion (as issue #293, "dlio_benchmark submodule is unused?", highlighted) and a clone-time issue for users who forget --recurse-submodules and then wonder why the dlio_benchmark/ directory is empty.

  2. To your point on installation instructions:
    Yes, but with a nuance. The current README already says pip install -e . which pulls DLIO via the pyproject.toml git dependency. With this PR merged, that dependency correctly points to mlcommons/DLIO_local_changes. So the instructions would be either:

pip path: pip install -e ".[dlio]" (or .[full]) — works today, no uv required
uv path: uv sync --extra dlio — faster, deterministic via the lock file

Can we support both in the README, since we are not sure all users will have uv? Something like below?

# Option A: pip (works everywhere)
pip install -e ".[dlio]"

# Option B: uv (faster, reproducible)
pip install uv
uv sync --extra dlio

The key thing is that DLIO gets pulled transitively either way — the user never needs to think about submodules, separate clones, or branch names.

  3. Yes, deleting the submodule is the right move. The cleanup steps can be:
# Remove the submodule from the index and delete its working tree
git rm --cached dlio_benchmark
rm -rf dlio_benchmark
# Delete the .gitmodules file entirely (if dlio_benchmark is the only submodule)
git rm .gitmodules
# Clean up the cached submodule metadata
rm -rf .git/modules/dlio_benchmark

One last thing to check is that no scripts in the repo reference dlio_benchmark/ as a local path. A quick check:

grep -r "dlio_benchmark/" --include="*.py" --include="*.sh" --include="*.yaml" --include="*.md" .

@FileSystemGuy
Contributor Author

To dig into one sub-topic: I'm attracted to the idea that uv will ensure all the correct versions of all the Python libraries we need are installed. My impression is that pip alone will not do that (i.e., uv.lock is not referenced by pip?).

I'm building PRs for the other topics you pointed out.

@idevasena
Contributor

@FileSystemGuy You're right. pip does not read uv.lock. The lock file is purely a uv artifact. So the two paths aren't equivalent in terms of reproducibility.

Shall we then just make uv sync the official install path in the README and submission guidelines, ensuring benchmark reproducibility? We can mention pip install -e ".[dlio]" as "works, but not guaranteed identical" if needed.

So the README would then be updated with the following:

pip install uv
uv sync --extra dlio

@russfellows what are your thoughts on using just uv as opposed to pip fallback or pip with constraints as a middle ground?

@russfellows
Contributor

russfellows commented Apr 3, 2026 via email

Without this change, the user has to add "--extra dlio" to the uv sync command, which they likely will forget to do, so better to download it even if they don't use it than to not get it when they need it.
@FileSystemGuy
Contributor Author

FileSystemGuy commented Apr 3, 2026

Based upon Russ' research I made one change in the PR: I moved the dlio dependency to required rather than optional. Without this change, the user has to add "--extra dlio" to the "uv sync" command, which they likely will forget to do, so better to download it even if they don't use it than to not get it when they need it.

I will make the suggested changes to the README: note that "uv" is the required package manager (with instructions on how to install it), that pip is explicitly not supported for use with the benchmark, and that "uv run" should be prepended to every use of mlpstorage to ensure that the correct versions of all libraries are used. I'll also see if there's a way to force "uv run" into mlpstorage itself, to guarantee it is used on every benchmark execution. Is this too much micromanaging, or is this better ease-of-use and repeatability?

@FileSystemGuy
Contributor Author

I took my own advice and decided to fix this once and for all.

I created PR#308 to prepend "uv run" to every invocation of mlpstorage. Whether they used pip or not, uv will force the creation and use of the correct virtual environment. If they haven't installed uv yet, then the command will fail and they will have to install uv.

How's that for a big hammer? :-)
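
A minimal sketch of that kind of big hammer, assuming an entry-point guard (the function name and the MLPS_UV_WRAPPED variable are illustrative, not the actual PR #308 code):

```python
# Sketch: re-exec the CLI under "uv run" unless we are already inside the
# uv-managed environment. All names here are illustrative, not from PR #308.
import os
import shutil
import sys

def ensure_uv_run():
    if os.environ.get("MLPS_UV_WRAPPED"):
        return  # already re-executed under uv; continue normal startup
    if shutil.which("uv") is None:
        sys.exit("error: uv is required; install it with 'pip install uv'")
    os.environ["MLPS_UV_WRAPPED"] = "1"  # prevent an exec loop
    # Replace this process with "uv run mlpstorage <original args>".
    os.execvp("uv", ["uv", "run", "mlpstorage", *sys.argv[1:]])
```

Calling ensure_uv_run() at the top of the CLI's main() means a user who runs mlpstorage directly still ends up inside the locked uv environment, or gets a clear error telling them to install uv.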

@FileSystemGuy FileSystemGuy added the "DLIO or mlpstorage" label (related to code in mlpstorage or dlio) Apr 3, 2026
@FileSystemGuy FileSystemGuy merged commit 83abf2e into main Apr 4, 2026
2 checks passed
@FileSystemGuy FileSystemGuy deleted the fsg-python-version branch April 4, 2026 19:53
@github-actions github-actions bot locked and limited conversation to collaborators Apr 4, 2026