# Downloading the dataset from S3

The `list_objects_v2` method of the `boto3` S3 client returns **at most 1,000 objects** per call. If the bucket holds more than 1,000 objects, you need pagination to list and download all of them.

### **📌 Explanation of the pagination solution in `download_all_files_parallel`**
1. **Pagination with `ContinuationToken`:** If the bucket has more than 1,000 objects, the response is truncated and includes a continuation token (`NextContinuationToken`). Passing that token as `ContinuationToken` on the next call resumes the listing where the previous call stopped.
2. **Repeated calls to `list_objects_v2`:** As long as `IsTruncated` is `True`, the code keeps calling `list_objects_v2` with the latest token to fetch the next page of objects.
3. **Batch download:** Files are still downloaded in batches, but the batches now cover **all** listed files rather than only the first 1,000.
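
The listing loop described above can be sketched as follows. This is a minimal illustration, not the actual code in `scripts/s3_downloader.py`, which is not shown here; the helper name `iter_object_keys` and its parameters are hypothetical. It accepts any object that exposes `list_objects_v2`, such as a client created with `boto3.client("s3")`.

```python
from typing import Any, Iterator


def iter_object_keys(client: Any, bucket: str, prefix: str = "") -> Iterator[str]:
    """Yield every object key in `bucket`, following continuation tokens.

    `client` is assumed to behave like a boto3 S3 client.
    """
    kwargs = {"Bucket": bucket}
    if prefix:
        kwargs["Prefix"] = prefix

    while True:
        response = client.list_objects_v2(**kwargs)

        # "Contents" is missing when the page (or the whole bucket) is empty.
        for obj in response.get("Contents", []):
            yield obj["Key"]

        # Stop once S3 reports that nothing was left out of this response.
        if not response.get("IsTruncated"):
            break

        # Resume the listing where this page ended.
        kwargs["ContinuationToken"] = response["NextContinuationToken"]
```

With a real client this might be called as `keys = list(iter_object_keys(boto3.client("s3"), "my-bucket"))`, where `my-bucket` is a placeholder. `boto3` also provides `client.get_paginator("list_objects_v2")`, which follows the continuation token for you.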

With this fix, every file in the bucket should download, however many there are. 🚀
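
For the download step, one possible shape is a thread pool that fetches each listed key and records the ones that fail. The sketch below is an assumption about how such a step could look, not the implementation of `download_all_files_parallel`; the name `download_keys_parallel`, the `dest_dir` parameter, and the default worker count are hypothetical. It relies only on the S3 client's `download_file(Bucket, Key, Filename)` method.

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Any, Iterable, List


def download_keys_parallel(
    client: Any,
    bucket: str,
    keys: Iterable[str],
    dest_dir: str,
    max_workers: int = 8,
) -> List[str]:
    """Download each key into `dest_dir` in parallel; return the keys that failed.

    `client` is assumed to behave like a boto3 S3 client.
    """

    def fetch(key: str) -> None:
        target = os.path.join(dest_dir, key)
        # Keys may contain "/", so create intermediate folders first.
        os.makedirs(os.path.dirname(target), exist_ok=True)
        client.download_file(bucket, key, target)

    failed: List[str] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its key so failures can be reported.
        futures = {pool.submit(fetch, key): key for key in keys}
        for future in as_completed(futures):
            try:
                future.result()
            except Exception:
                failed.append(futures[future])
    return failed
```

Returning the failed keys instead of raising on the first error lets the caller retry only what is missing, which matters when thousands of files are in flight.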

import sys
import os

# Add the parent directory to the PYTHONPATH
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..')))

from scripts.s3_downloader import download_all_files_parallel

download_all_files_parallel(max_workers=64)

🔹 Total files to download: 11749
✅ Downloaded: test_1002.jpg
✅ Downloaded: test_105.jpg
✅ Downloaded: test_1050.jpg
✅ Downloaded: readme.txt
✅ Downloaded: LICENSE.txt
✅ Downloaded: test_1034.jpg
✅ Downloaded: test_1.jpg
✅ Downloaded: test_1038.jpg
✅ Downloaded: .DS_Store
✅ Downloaded: test_1020.jpg
✅ Downloaded: test_1051.jpg
✅ Downloaded: test_1010.jpg
✅ Downloaded: test_1004.jpg
✅ Downloaded: test_1058.jpg
✅ Downloaded: test_1015.jpg
✅ Downloaded: test_1017.jpg
✅ Downloaded: test_103.jpg
✅ Downloaded: test_100.jpg
✅ Downloaded: test_1054.jpg
✅ Downloaded: test_1001.jpg
✅ Downloaded: test_1012.jpg
✅ Downloaded: test_1062.jpg
✅ Downloaded: test_1040.jpg
✅ Downloaded: test_1061.jpg
✅ Downloaded: test_1047.jpg
✅ Downloaded: test_1028.jpg
✅ Downloaded: test_1069.jpg
✅ Downloaded: test_1065.jpg
✅ Downloaded: test_104.jpg
✅ Downloaded: test_1003.jpg
✅ Downloaded: test_1067.jpg
✅ Downloaded: test_1044.jpg
✅ Downloaded: test_1045.jpg
✅ Downloaded: test_106.jpg
✅ Downloaded: test_1070.jp