[train][Docs] Add Backblaze B2 example + alias B2_APPLICATION_KEY env vars onto AWS_* #63103
Open
goanpeca wants to merge 1 commit into
Pull request overview
This PR improves Ray Train’s S3-compatible storage experience (notably Backblaze B2) by ensuring credentials provided via Backblaze’s CLI env var names are made visible to pyarrow’s S3 resolver, and by expanding the docs with a B2-focused example.
Changes:
- Add `_alias_s3_compatible_credentials_to_aws_env_vars()` and call it from `get_fs_and_path()` to map B2 env vars onto `AWS_ACCESS_KEY_ID`/`AWS_SECRET_ACCESS_KEY` when appropriate.
- Add unit tests covering aliasing behavior, no-op behavior, and warnings.
- Update Train persistent-storage docs with a Backblaze B2 example, endpoint override guidance, and a link to an end-to-end notebook.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| python/ray/train/_internal/storage.py | Adds credential env var aliasing logic and invokes it during filesystem resolution. |
| python/ray/train/tests/test_storage.py | Adds tests validating the new env var aliasing behavior and logging. |
| doc/source/train/user-guides/persistent-storage.rst | Updates S3-compatible storage docs to include Backblaze B2 guidance and a runnable example link. |
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 74a69f2.
Signed-off-by: Gonzalo Peña-Castellanos <goanpeca@gmail.com>

Description
Aliases `B2_APPLICATION_KEY_ID`/`B2_APPLICATION_KEY` (the env vars the Backblaze `b2` CLI uses) onto `AWS_ACCESS_KEY_ID`/`AWS_SECRET_ACCESS_KEY` so pyarrow's `S3FileSystem` picks them up when resolving `s3://` URIs. Without this, users following Backblaze's own docs see credentials silently ignored.

Implementation: `_alias_s3_compatible_credentials_to_aws_env_vars()` in `python/ray/train/_internal/storage.py`, called from `get_fs_and_path()`. No-op if any AWS-named var is already set; warns on a partial pair.

Also adds a Backblaze B2 example to the S3-compatible storage docs section, linking to a runnable notebook at https://github.com/backblaze-b2-samples/notebooks/tree/main/ray-train-tune-checkpoints.
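For readers skimming the description, the aliasing rule can be sketched roughly as follows. This is not the code in the PR: the function name `alias_b2_credentials`, the `environ` parameter, the boolean return value, and the warning text are all hypothetical, and it assumes that "set" means non-empty and that a partial B2 pair is skipped after warning.

```python
import logging
import os

logger = logging.getLogger(__name__)

# (B2 name, AWS-style name) pairs, as listed in the PR description.
_B2_TO_AWS = (
    ("B2_APPLICATION_KEY_ID", "AWS_ACCESS_KEY_ID"),
    ("B2_APPLICATION_KEY", "AWS_SECRET_ACCESS_KEY"),
)


def alias_b2_credentials(environ=None):
    """Hypothetical sketch: copy B2 credential env vars onto AWS-named ones.

    Returns True if both variables were aliased, False otherwise.
    """
    env = os.environ if environ is None else environ

    # No-op if any AWS-named variable is already set: explicit AWS config wins.
    if any(env.get(aws_name) for _, aws_name in _B2_TO_AWS):
        return False

    present = [b2_name for b2_name, _ in _B2_TO_AWS if env.get(b2_name)]
    if not present:
        return False  # no B2 credentials to alias

    # Assumption: a partial pair triggers a warning and is not aliased.
    if len(present) < len(_B2_TO_AWS):
        logger.warning(
            "Found %s but not its counterpart; skipping B2 -> AWS aliasing.",
            present[0],
        )
        return False

    for b2_name, aws_name in _B2_TO_AWS:
        env[aws_name] = env[b2_name]
    return True
```

Passing a plain dict as `environ` keeps the sketch easy to exercise without touching the real process environment; the actual helper in `storage.py` may be structured differently.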
Related issues
#63104
Additional information
`get_fs_and_path()` already honors `AWS_ENDPOINT_URL_S3` via pyarrow's `FileSystem.from_uri` (see https://arrow.apache.org/docs/cpp/env_vars.html); the alias just closes the credential-naming gap. 4 new unit tests in `python/ray/train/tests/test_storage.py`. No new dependencies.
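As a rough illustration of the environment a user would end up with, the sketch below builds the endpoint override plus the B2-named credentials. The helper `b2_s3_env` is hypothetical, the region and key values are placeholders, and the `https://s3.<region>.backblazeb2.com` endpoint shape is an assumption based on Backblaze's S3-compatible API conventions, so check it against your bucket's actual endpoint.

```python
def b2_s3_env(region, key_id, application_key):
    """Hypothetical helper: env vars for pointing pyarrow's S3 resolver at B2."""
    return {
        # Endpoint override that get_fs_and_path() already honors via pyarrow.
        "AWS_ENDPOINT_URL_S3": f"https://s3.{region}.backblazeb2.com",
        # B2-named credentials, which this PR aliases onto the AWS_* names.
        "B2_APPLICATION_KEY_ID": key_id,
        "B2_APPLICATION_KEY": application_key,
    }


# Placeholder values only; substitute your own region and keys.
env = b2_s3_env("us-west-004", "<key-id>", "<application-key>")
for name in sorted(env):
    print(f"export {name}={env[name]}")
```

With these exported before the job starts, an `s3://` storage path should resolve against B2 rather than AWS, assuming the aliasing described in this PR is in place.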