[train][Docs] Add Backblaze B2 example + alias B2_APPLICATION_KEY env vars onto AWS_*#63103

Open
goanpeca wants to merge 1 commit into ray-project:master from goanpeca:add-b2-integration

@goanpeca goanpeca commented May 4, 2026

Description

Aliases B2_APPLICATION_KEY_ID / B2_APPLICATION_KEY (the env vars the Backblaze b2 CLI uses) onto AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY so pyarrow's S3FileSystem picks them up when resolving s3:// URIs. Without this, users following Backblaze's own docs see credentials silently ignored.

Implementation: _alias_s3_compatible_credentials_to_aws_env_vars() in python/ray/train/_internal/storage.py, called from get_fs_and_path(). No-op if any AWS-named var is already set; warns on a partial pair.
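
The aliasing rules above can be sketched as follows. This is a minimal illustration, not the code in this PR: the function name `alias_b2_credentials`, the `env` parameter, and the boolean return value are hypothetical. It also assumes that an empty value counts as unset and that a partial B2 pair is warned about and left un-aliased, neither of which the description spells out.

```python
import logging
import os
from typing import MutableMapping, Optional

logger = logging.getLogger(__name__)

# B2 CLI variable name -> AWS variable name read by the S3 client.
_B2_TO_AWS = {
    "B2_APPLICATION_KEY_ID": "AWS_ACCESS_KEY_ID",
    "B2_APPLICATION_KEY": "AWS_SECRET_ACCESS_KEY",
}


def alias_b2_credentials(env: Optional[MutableMapping[str, str]] = None) -> bool:
    """Copy B2-named credentials onto the AWS names (hypothetical helper).

    Returns True only if both AWS-named variables were written.
    """
    env = os.environ if env is None else env

    # No-op if any AWS-named credential is already set (assumption: non-empty).
    if any(env.get(aws_name) for aws_name in _B2_TO_AWS.values()):
        return False

    present = [name for name in _B2_TO_AWS if env.get(name)]
    if not present:
        return False

    # Partial pair: warn and leave the environment untouched (assumed behavior).
    if len(present) < len(_B2_TO_AWS):
        missing = sorted(set(_B2_TO_AWS) - set(present))
        logger.warning(
            "Found %s but not %s; skipping credential aliasing.",
            ", ".join(present),
            ", ".join(missing),
        )
        return False

    for b2_name, aws_name in _B2_TO_AWS.items():
        env[aws_name] = env[b2_name]
    return True
```

Passing a plain `dict` as `env` keeps the sketch easy to test without touching the real process environment; calling it with no argument operates on `os.environ`.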

Also adds a Backblaze B2 example to the S3-compatible storage docs section, linking to a runnable notebook at https://github.com/backblaze-b2-samples/notebooks/tree/main/ray-train-tune-checkpoints.

Related issues

#63104

Additional information

get_fs_and_path() already honors AWS_ENDPOINT_URL_S3 via pyarrow's FileSystem.from_uri (see https://arrow.apache.org/docs/cpp/env_vars.html); the alias just closes the credential-naming gap. 4 new unit tests in python/ray/train/tests/test_storage.py. No new dependencies.
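
For context, a rough sketch of the environment a B2 user would otherwise have to set by hand so that pyarrow resolves an `s3://` URI against B2. The helper names `b2_s3_environment` and `resolve` are hypothetical, the region string in the usage note is only a placeholder, and the endpoint pattern `https://s3.<region>.backblazeb2.com` is stated here as an assumption about B2's S3-compatible endpoints rather than something taken from this PR.

```python
from typing import Dict


def b2_s3_environment(key_id: str, application_key: str, region: str) -> Dict[str, str]:
    """Build the AWS-named variables for a B2 bucket (hypothetical helper)."""
    return {
        "AWS_ACCESS_KEY_ID": key_id,
        "AWS_SECRET_ACCESS_KEY": application_key,
        # Assumed endpoint pattern for B2's S3-compatible API.
        "AWS_ENDPOINT_URL_S3": f"https://s3.{region}.backblazeb2.com",
    }


def resolve(uri: str):
    """Resolve a URI to a (filesystem, path) pair with pyarrow.

    pyarrow is imported lazily so that the helper above has no dependency on it.
    """
    import pyarrow.fs

    return pyarrow.fs.FileSystem.from_uri(uri)
```

With these variables exported (for example via `os.environ.update(b2_s3_environment(key_id, key, "us-west-004"))`), a call such as `resolve("s3://my-bucket/ray-checkpoints")` would go through the same `FileSystem.from_uri` path the description refers to. The alias in this PR is meant to make the first two entries unnecessary for users who already have the B2-named variables set.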

@gemini-code-assist

Warning

Gemini encountered an error creating the review. You can try again by commenting /gemini review.

@goanpeca force-pushed the add-b2-integration branch 2 times, most recently from 1546e25 to c117eb5, on May 4, 2026 at 15:48

Copilot AI left a comment

Pull request overview

This PR improves Ray Train’s S3-compatible storage experience (notably Backblaze B2) by ensuring credentials provided via Backblaze’s CLI env var names are made visible to pyarrow’s S3 resolver, and by expanding the docs with a B2-focused example.

Changes:

  • Add _alias_s3_compatible_credentials_to_aws_env_vars() and call it from get_fs_and_path() to map B2 env vars onto AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY when appropriate.
  • Add unit tests covering aliasing behavior, no-op behavior, and warnings.
  • Update Train persistent-storage docs with a Backblaze B2 example, endpoint override guidance, and a link to an end-to-end notebook.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| python/ray/train/_internal/storage.py | Adds credential env var aliasing logic and invokes it during filesystem resolution. |
| python/ray/train/tests/test_storage.py | Adds tests validating the new env var aliasing behavior and logging. |
| doc/source/train/user-guides/persistent-storage.rst | Updates S3-compatible storage docs to include Backblaze B2 guidance and a runnable example link. |

Comment thread python/ray/train/_internal/storage.py Outdated
Comment thread python/ray/train/_internal/storage.py
Comment thread doc/source/train/user-guides/persistent-storage.rst Outdated
Comment thread python/ray/train/_internal/storage.py Outdated
Comment thread python/ray/train/_internal/storage.py Outdated
@goanpeca force-pushed the add-b2-integration branch from c117eb5 to 74a69f2 on May 4, 2026 at 16:36

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 74a69f2.

Comment thread python/ray/train/_internal/storage.py
@goanpeca force-pushed the add-b2-integration branch 2 times, most recently from 96cf884 to 8e00f5b, on May 4, 2026 at 16:55
@ray-gardener (bot) added the labels docs (An issue or change related to documentation), train (Ray Train Related Issue), and community-contribution (Contributed by the community) on May 4, 2026
Signed-off-by: Gonzalo Peña-Castellanos <goanpeca@gmail.com>
@goanpeca force-pushed the add-b2-integration branch from 8e00f5b to 5c29b68 on May 14, 2026 at 23:41

2 participants