diff --git a/README.md b/README.md
index 78e91a5ef..63b7c477e 100644
--- a/README.md
+++ b/README.md
@@ -506,6 +506,24 @@ dataset = StreamingDataset(..., max_cache_size="10GB")
+
+ ✅ Specify cache directory
+
+
+Specify the directory where cached files should be stored, ensuring efficient data retrieval and management. This is particularly useful for organizing your data storage and improving access times.
+
+```python
+from litdata import StreamingDataset
+from litdata.streaming.cache import Dir
+
+cache_dir = "/path/to/your/cache"
+data_dir = "s3://my-bucket/my_optimized_dataset"
+
+dataset = StreamingDataset(input_dir=Dir(path=cache_dir, url=data_dir))
+```
+
+
+
✅ Optimize loading on networked drives