reference #530
When given a bare path only (/mnt/e/retinanet_checkpoints):
ConnectionError: ObjStoreLibStorage preflight failed: cannot reach bucket '/mnt/e/retinanet_checkpoints' via s3dlio at endpoint 'http://:9020'. Underlying error: RuntimeError: list_objects_v2 failed: failed to construct S3 request — check AWS_REGION, AWS_ENDPOINT_URL, and credential environment variables (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY)
- Check AWS_ENDPOINT_URL (current: 'http://:9020').
- Check credentials are valid for that endpoint.
- Check that bucket '/mnt/ecs/retinanet_checkpoints' exists at the endpoint.
File "/root/storage/.venv/lib/python3.12/site-packages/dlio_benchmark/checkpointing/pytorch_obj_store_checkpointing.py", line 90, in get_instance
When given a file:// URI (file://mnt/e/retinanet_checkpoints):
Error Details
What Happened:
You specified the checkpoint folder with the file:// protocol (NFS storage).
DLIO applied storage.storage_type=s3 globally, to all storage.
s3dlio tried to parse file:///mnt/e/... as an S3 bucket.
The URI parser failed because it cannot extract a bucket name from a file:// URI.
All 32 MPI ranks crashed during initialization.
ConnectionError: ObjStoreLibStorage preflight failed: cannot reach bucket 'file:///mnt/ecs/retinanet_checkpoints' via s3dlio at endpoint 'http://[REDACTED_IP]:9020'. Underlying error: RuntimeError: Bucket name cannot be empty in URI: s3://file:///mnt/e/retinanet_checkpoints/
- Check AWS_ENDPOINT_URL (current: 'http://[REDACTED_IP]:9020').
- Check credentials are valid for that endpoint.
- Check that bucket 'file:///mnt/e/retinanet_checkpoints' exists at the endpoint.
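Both failures come from the checkpoint folder being treated as an S3 bucket regardless of its form. As a rough sketch of the behavior I would expect instead, the folder string could be classified by its URI scheme before a backend is chosen. The helper name `classify_checkpoint_target` and its return values are hypothetical and only for illustration; this is not DLIO or s3dlio code, and it assumes only the Python standard library.

```python
from urllib.parse import urlsplit


def classify_checkpoint_target(folder: str) -> tuple[str, str]:
    """Return (backend, location) for a checkpoint folder string.

    Hypothetical helper for illustration; not part of DLIO or s3dlio.
    """
    parts = urlsplit(folder)

    if parts.scheme == "":
        # Bare path such as /mnt/e/retinanet_checkpoints: local/NFS storage.
        return ("local", folder)

    if parts.scheme == "file":
        # file:///mnt/... has an empty authority; the path is in parts.path.
        # file://mnt/... (two slashes) parses "mnt" as a host name instead.
        if parts.netloc not in ("", "localhost"):
            raise ValueError(
                f"file:// URI names host {parts.netloc!r}; "
                f"use file:/// for a local path: {folder!r}"
            )
        return ("local", parts.path)

    if parts.scheme == "s3":
        # s3://bucket/prefix: the bucket name is the authority component.
        if not parts.netloc:
            raise ValueError(f"s3:// URI has no bucket name: {folder!r}")
        return ("s3", folder)

    raise ValueError(f"unsupported scheme {parts.scheme!r} in {folder!r}")


if __name__ == "__main__":
    for target in (
        "/mnt/e/retinanet_checkpoints",
        "file:///mnt/e/retinanet_checkpoints",
        "s3://my-bucket/retinanet_checkpoints",  # placeholder bucket name
    ):
        print(target, "->", classify_checkpoint_target(target))
```

With a check like this, a bare path or a file:/// URI would be routed to local storage rather than handed to the S3 preflight, and a two-slash file:// URI would be rejected with a message that points at the missing slash.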