Skip to content

fix(dlio): strip URI scheme from storage.storage_root (#392)#459

Merged
russfellows merged 2 commits into
mainfrom
FileSystemGuy-issue392
Jun 17, 2026
Merged

fix(dlio): strip URI scheme from storage.storage_root (#392)#459
russfellows merged 2 commits into
mainfrom
FileSystemGuy-issue392

Conversation

@FileSystemGuy

Copy link
Copy Markdown
Contributor

Resolves #392.

Summary

When a user passes --params storage.storage_root=s3://bucket/path, every S3 upload fails because DLIO's obj_store_lib builds URLs with a doubled scheme — s3://s3://bucket/path/.... The reporter saw "nothing in S3 / no bucket created" as a result.

The doubled prefix originates in DLIO's obj_store_lib.ObjStoreLibStorage:

  • storage_factory.py:49 passes args.storage_root straight in as namespace.
  • obj_store_lib.py:159 stores it verbatim.
  • obj_store_lib.py:399-401 unconditionally adds uri_scheme:// when constructing object URIs:
    if '://' in str(id):
        return id
    return f"{self.uri_scheme}://{self.namespace.name}/{id.lstrip('/')}"

So a storage_root like s3://mlperf-data/unet3d plus a key like data/unet3d/train/img_000_of_168.npz produces s3://s3://mlperf-data/unet3d/data/unet3d/train/img_000_of_168.npz. MinIO rejects every put, and Keith's command exits without writing anything.

--object mode happens to avoid this because _apply_object_storage_params sets storage_root = $BUCKET (bare bucket, no scheme). Users who hand-roll --params storage.storage_root=... hit it.

What this PR does

  • Adds DLIOBenchmark._strip_uri_scheme and normalizes storage.storage_root in process_dlio_params so any <scheme>:// prefix is stripped before the override is handed off to DLIO. The bare-bucket form is untouched.
  • Adds parametrized tests covering s3://, s3://bucket/, az://, gs://, and bare-bucket cases, plus the reporter's exact command shape.

The proper fix still belongs in DLIO's obj_store_lib — once the pinned fork is patched and pyproject.toml:127 is bumped, this normalization becomes a harmless no-op, since the normalized input is also valid input.

Test plan

  • uv run pytest tests/unit -q → 1372 passed, 4 skipped.
  • Manual: rerun the reporter's mlpstorage training datagen --model unet3d --num-processes 1 --params storage.storage_library=s3dlio --params storage.storage_type=s3 --params storage.storage_root=s3://mlperf-data/unet3d against MinIO and confirm objects land at s3://mlperf-data/unet3d/data/unet3d/train/img_*.npz (single s3://).

DLIO's obj_store_lib treats storage_root as a bare bucket/prefix and
unconditionally prepends uri_scheme when building object URIs, so a
user-supplied --params storage.storage_root=s3://bucket/path produced
malformed URLs like s3://s3://bucket/path/... and every upload failed.

Normalize storage.storage_root in process_dlio_params: if the value
carries a <scheme>://, strip it before handing off to DLIO. The bare
bucket form is unchanged. The proper fix still belongs in DLIO; this
mitigation makes the reporter's command succeed against the current pin.
@FileSystemGuy FileSystemGuy requested a review from a team June 16, 2026 21:01
@github-actions

github-actions Bot commented Jun 16, 2026

Copy link
Copy Markdown

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@russfellows russfellows left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving per Curtis.

@russfellows russfellows merged commit 3961bc3 into main Jun 17, 2026
3 checks passed
@russfellows russfellows deleted the FileSystemGuy-issue392 branch June 17, 2026 16:27
@github-actions github-actions Bot locked and limited conversation to collaborators Jun 17, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Datagen to S3 fails using s3dlio

2 participants