Update README with config for MinIO (#174)
* Update README with config for MinIO

* Update README.md with Support S3-Compatible Object Storage section
bhimrazy committed Jun 17, 2024
1 parent 5ff7ae4 commit d5eff39
Showing 1 changed file with 31 additions and 1 deletion.
32 changes: 31 additions & 1 deletion README.md
@@ -121,7 +121,7 @@ dataloader = StreamingDataLoader(dataset)
# Key Features

- [Multi-GPU / Multi-Node Support](#multi-gpu--multi-node-support)
- [Subsample and split your datasets](#access-any-item)
- [Subsample and split your datasets](#subsample-and-split-your-datasets)
- [Access any item](#access-any-item)
- [Use any data transforms](#use-any-data-transforms)
- [The Map Operator](#the-map-operator)
@@ -131,6 +131,7 @@ dataloader = StreamingDataLoader(dataset)
- [Reduce your memory footprint](#reduce-your-memory-footprint)
- [Configure Cache Size Limit](#configure-cache-size-limit)
- [On-Prem Optimizations](#on-prem-optimizations)
- [Support S3-Compatible Object Storage](#support-s3-compatible-object-storage)


## Multi-GPU / Multi-Node Support
@@ -367,6 +368,35 @@ from litdata import StreamingDataset
dataset = StreamingDataset(input_dir="local:/data/shared-drive/some-data")
```

## Support S3-Compatible Object Storage

LitData works with S3-compatible object storage servers such as [MinIO](https://min.io/), which makes it a good fit for on-premises infrastructure. Configure the endpoint and credentials using either environment variables or configuration files.

Set up the environment variables to connect to MinIO:

```bash
export AWS_ACCESS_KEY_ID=access_key
export AWS_SECRET_ACCESS_KEY=secret_key
export AWS_ENDPOINT_URL=http://localhost:9000 # MinIO endpoint
```
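
Before opening a dataset, it can help to confirm that these variables are actually visible to the Python process. The following is a minimal sketch using only the standard library; `check_minio_env` and `REQUIRED_VARS` are hypothetical names introduced for illustration and are not part of LitData.

```python
import os
from urllib.parse import urlparse

# The three variables exported above.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_ENDPOINT_URL")


def check_minio_env(env=None):
    """Return a list of problems with the S3/MinIO settings (empty if none)."""
    env = os.environ if env is None else env

    # Report every required variable that is missing or empty.
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]

    # If an endpoint is present, check that it looks like an http(s) URL.
    endpoint = env.get("AWS_ENDPOINT_URL", "")
    if endpoint:
        parsed = urlparse(endpoint)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"AWS_ENDPOINT_URL is not an http(s) URL: {endpoint!r}")
    return problems


if __name__ == "__main__":
    for problem in check_minio_env():
        print("warning:", problem)
```

An empty list means all three variables are set and the endpoint is shaped like an HTTP(S) URL. It does not verify that the server is reachable or that the credentials are valid.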

Alternatively, configure credentials and endpoint in `~/.aws/{credentials,config}`:

```bash
mkdir -p ~/.aws && \
cat <<EOL >> ~/.aws/credentials
[default]
aws_access_key_id = access_key
aws_secret_access_key = secret_key
EOL

cat <<EOL >> ~/.aws/config
[default]
# MinIO endpoint
endpoint_url = http://localhost:9000
EOL
```

For a complete working example, see the [LitData with MinIO](https://github.com/bhimrazy/litdata-with-minio) repository.
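
Because the heredocs above append with `>>`, running them twice leaves two `[default]` sections in each file. A minimal Python sketch of an idempotent alternative, using the standard-library `configparser`, might look like the following. `write_aws_profile` is a hypothetical helper, not part of LitData or the AWS tooling, and it rewrites both files, so any comments already in them are lost.

```python
import configparser
from pathlib import Path


def write_aws_profile(aws_dir, access_key, secret_key, endpoint_url):
    """Create or update the [default] section in <aws_dir>/credentials and <aws_dir>/config."""
    aws_dir = Path(aws_dir).expanduser()
    aws_dir.mkdir(parents=True, exist_ok=True)

    # Which keys belong in which file, mirroring the heredocs above.
    updates = {
        "credentials": {
            "aws_access_key_id": access_key,
            "aws_secret_access_key": secret_key,
        },
        "config": {"endpoint_url": endpoint_url},
    }

    for filename, values in updates.items():
        path = aws_dir / filename
        parser = configparser.RawConfigParser()
        parser.read(path)  # a missing file is silently skipped
        if not parser.has_section("default"):
            parser.add_section("default")
        for key, value in values.items():
            parser.set("default", key, value)
        # Rewrite the whole file; existing sections are kept, comments are not.
        with path.open("w") as handle:
            parser.write(handle)
```

Calling `write_aws_profile("~/.aws", "access_key", "secret_key", "http://localhost:9000")` repeatedly leaves a single `[default]` section in each file.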

# Benchmarks

To measure the effectiveness of LitData, we used a common benchmark dataset, [Imagenet-1.2M](https://www.image-net.org/), whose training set contains `1,281,167 images`.