+✅ **Instant access:** Start streaming immediately without preprocessing.
+✅ **Zero setup time:** No data conversion or optimization required.
+✅ **Native format:** Work with original file formats (images, text, etc.).
+✅ **Flexible processing:** Apply transformations on-the-fly during streaming.
+✅ **Cloud-native:** Stream directly from S3, GCS, or Azure storage.
+
+## Option 2: Optimize for maximum performance ⚡⚡⚡
 Accelerate model training (20x faster) by optimizing datasets for streaming directly from cloud storage. Work with remote data without local downloads with features like loading data subsets, accessing individual samples, and resumable streaming.

-**Step 1: Optimize the data**
-This step will format the dataset for fast loading. The data will be written in a chunked binary format.
+**Step 1: Optimize your data (one-time setup)**
+
+Transform raw data into optimized chunks for maximum streaming speed.
+This step formats the dataset for fast loading by writing data in an efficient chunked binary format.

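To make "chunked binary format" concrete, here is a minimal, standard-library sketch of the general idea: samples are packed into size-bounded chunks, and each chunk starts with an offset table so a single sample can be read without decoding its neighbors. This is an illustration only and is not litdata's actual on-disk layout; `write_chunks`, `encode_chunk`, and `read_sample` are hypothetical names chosen for this example.

```python
import struct


def encode_chunk(items):
    """Encode byte strings as: item count, then count + 1 offsets, then payload."""
    # Header holds 1 count + (n + 1) offsets, each a little-endian uint32.
    header_size = 4 * (len(items) + 2)
    offsets = [header_size]
    for item in items:
        offsets.append(offsets[-1] + len(item))
    header = struct.pack(f"<{len(items) + 2}I", len(items), *offsets)
    return header + b"".join(items)


def write_chunks(samples, chunk_bytes):
    """Group samples into chunks whose payload stays within chunk_bytes.

    A single sample larger than chunk_bytes still gets a chunk of its own.
    """
    chunks, current, size = [], [], 0
    for sample in samples:
        if current and size + len(sample) > chunk_bytes:
            chunks.append(encode_chunk(current))
            current, size = [], 0
        current.append(sample)
        size += len(sample)
    if current:
        chunks.append(encode_chunk(current))
    return chunks


def read_sample(chunk, i):
    """Return the i-th sample of a chunk using only the offset table."""
    (count,) = struct.unpack_from("<I", chunk, 0)
    if not 0 <= i < count:
        raise IndexError(i)
    start, end = struct.unpack_from("<2I", chunk, 4 + 4 * i)
    return chunk[start:end]


samples = [b"aaaa", b"bb", b"cccccc", b"d"]
chunks = write_chunks(samples, chunk_bytes=6)
second = read_sample(chunks[0], 1)
```

The offset table is what makes features such as accessing individual samples cheap in a layout like this: a reader fetches one chunk and slices out the bytes it needs.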
 ```python
 import numpy as np
@@ -91,24 +120,24 @@ import litdata as ld

 def random_images(index):
     # Replace with your actual image loading here (e.g., .jpg, .png, etc.)
-    # (recommended to pass as compressed formats like JPEG for better storage and optimized streaming speed)
-    # You can also apply resizing or reduce image quality to further increase streaming speed and save space.
+    # Recommended: use compressed formats like JPEG for better storage and optimized streaming speed
+    # You can also apply resizing or reduce image quality to further increase streaming speed and save space