
Add support for S3-compatible storage with HTTP redirects (e.g., NVIDIA AIStore) #271

@gaikwadabhishek


The MLPerf Storage benchmark (DLIO Upstream) currently uses s3torchconnector for S3-compatible storage backends. However, this library does not support HTTP 307 redirects, which prevents S3-compatible distributed storage systems like NVIDIA AIStore from working out of the box.

Problem

AIStore is fully S3-compatible, but its architecture uses HTTP 307 redirects for load balancing — the proxy node redirects requests to the target node where the data resides. The s3torchconnector used in DLIO does not handle these redirects, causing request failures.
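To make the failure mode concrete, here is a minimal, self-contained sketch of the redirect flow described above. It is a toy model built only on the Python standard library, not AIStore's or s3torchconnector's actual code: `ProxyHandler`, `TargetHandler`, `get`, and `run_demo` are hypothetical names, and the "proxy" simply answers every GET with a 307 pointing at the "target".

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit


class TargetHandler(BaseHTTPRequestHandler):
    """Stands in for the node that actually holds the object."""

    def do_GET(self):
        body = b"object-bytes"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass


def make_proxy_handler(target_port):
    class ProxyHandler(BaseHTTPRequestHandler):
        """Stands in for the proxy: never serves data, only redirects."""

        def do_GET(self):
            self.send_response(307)
            self.send_header("Location", f"http://127.0.0.1:{target_port}{self.path}")
            self.send_header("Content-Length", "0")
            self.end_headers()

        def log_message(self, *args):
            pass

    return ProxyHandler


def start(handler):
    # Port 0 lets the OS pick a free port.
    server = HTTPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def get(url, follow_redirects, max_hops=5):
    """GET a URL; http.client never follows redirects by itself."""
    for _ in range(max_hops + 1):
        parts = urlsplit(url)
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn = http.client.HTTPConnection(parts.hostname, parts.port)
        try:
            conn.request("GET", path)
            resp = conn.getresponse()
            status, body = resp.status, resp.read()
            location = resp.getheader("Location")
        finally:
            conn.close()
        if status == 307 and follow_redirects and location:
            url = location  # retry the same method against the new location
            continue
        return status, body
    raise RuntimeError("too many redirects")


def run_demo():
    target = start(TargetHandler)
    proxy = start(make_proxy_handler(target.server_address[1]))
    try:
        url = f"http://127.0.0.1:{proxy.server_address[1]}/bucket/object"
        return get(url, follow_redirects=False), get(url, follow_redirects=True)
    finally:
        for server in (proxy, target):
            server.shutdown()
            server.server_close()


if __name__ == "__main__":
    print(run_demo())
```

A client that stops at the first response only ever sees the empty 307 from the proxy; one that re-issues the request to the `Location` header reaches the target and gets the object bytes.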

This was also reported upstream in DLIO: argonne-lcf/dlio_benchmark#320

As a workaround, I added native AIStore Python SDK support directly in DLIO and was able to successfully benchmark my cluster. However, this workaround doesn't carry over to the MLPerf Storage benchmark.

Proposal

Two possible approaches:

  1. Support S3 redirects: Update the S3 data loading path to handle HTTP 307 redirects. This would benefit any S3-compatible storage system that uses redirect-based load balancing, not just AIStore.

  2. Allow custom storage plugins: Provide a plugin mechanism so that storage systems like AIStore can register their own data reader (e.g., using the aistore Python SDK) and participate in the benchmark natively. This would make the benchmark more extensible and allow more storage vendors to compete on a level playing field.
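As a rough illustration of the second approach, the sketch below shows one way a reader registry could look. Everything in it is hypothetical (`StorageReader`, `register_reader`, `create_reader`, and the `in_memory` backend are made-up names, not existing DLIO or MLPerf Storage APIs), and an in-memory reader stands in for a vendor backend so the example runs without a cluster.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Mapping, Type


class StorageReader(ABC):
    """Hypothetical interface a storage plugin would implement."""

    @abstractmethod
    def read(self, key: str) -> bytes:
        """Return the full contents of the object stored under `key`."""


# Maps a backend name (e.g. taken from the benchmark config) to a reader class.
_READERS: Dict[str, Type[StorageReader]] = {}


def register_reader(name: str) -> Callable[[Type[StorageReader]], Type[StorageReader]]:
    """Class decorator that registers a reader under `name`."""

    def decorator(cls: Type[StorageReader]) -> Type[StorageReader]:
        if name in _READERS:
            raise ValueError(f"storage backend {name!r} is already registered")
        _READERS[name] = cls
        return cls

    return decorator


def create_reader(name: str, **options) -> StorageReader:
    """Instantiate the reader registered under `name` with backend options."""
    try:
        cls = _READERS[name]
    except KeyError:
        known = sorted(_READERS)
        raise ValueError(f"unknown storage backend {name!r}; known: {known}") from None
    return cls(**options)


@register_reader("in_memory")
class InMemoryReader(StorageReader):
    """Stand-in backend; a vendor plugin would wrap its own SDK here instead."""

    def __init__(self, objects: Mapping[str, bytes]):
        self._objects = dict(objects)

    def read(self, key: str) -> bytes:
        return self._objects[key]


if __name__ == "__main__":
    reader = create_reader("in_memory", objects={"train/0.npz": b"payload"})
    print(reader.read("train/0.npz"))
```

With a registry like this, the data loading path would depend only on `StorageReader`, and a vendor could ship a class that registers itself under its own backend name without changes to the benchmark core.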

Context
