Introduce a pod-config command #802

Merged
muellerzr merged 13 commits into main from pod-config
Nov 1, 2022

@muellerzr
Contributor

Introduce a pod-config command to set up TPU pods

What does this add?

This PR introduces a new command, accelerate pod-config, which first asks the user for the commands they wish to run across all TPU nodes in a GCP TPU cluster, then runs all of them, installing accelerate as well if specified.

Who is it for?

Part of #501

Why is it needed?

When dealing with TPU pods, every process needs the same initial configuration, with the main data and initial setup available on each of them. This command provides an easy way to do that without requiring users to write their own startup script, and it can be run at any point in time.

What parts of the API does this impact?

User-facing:

Adds a new accelerate pod-config command which accepts either a bash script of commands to run at once or any number of individual commands passed in

Internal structure:

Configs now store both types of commands, and users are prompted for them during accelerate config
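
To make the user-facing surface described above concrete, here is a minimal argparse sketch. It is not the parser from this PR: the flag names come from the description and the review snippet, while the repeatable `--command` behaviour, the defaults, and the function name `build_pod_config_parser` are assumptions made for illustration.

```python
import argparse


def build_pod_config_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags named in this PR."""
    parser = argparse.ArgumentParser(prog="accelerate pod-config")
    # A bash script whose commands are run together on every node.
    parser.add_argument("--command_file", default=None)
    # Assumed to be repeatable: each occurrence adds one more command.
    parser.add_argument("--command", action="append", default=None)
    parser.add_argument("--tpu_name", default=None)
    parser.add_argument("--tpu_zone", default=None)
    parser.add_argument("--install_accelerate", action="store_true")
    # The default value here is an assumption, not taken from the PR.
    parser.add_argument("--accelerate_version", default="latest")
    return parser


args = build_pod_config_parser().parse_args(
    ["--command", "echo 'Hello World'", "--command", "ls", "--tpu_name", "test-tpu"]
)
print(args.command)
```

With `action="append"`, every `--command` occurrence is collected into one list, which is one plausible way to support an open-ended number of commands.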

Basic Usage Example(s):

With an old accelerate config file:

accelerate pod-config --command "echo 'Hello World'" --tpu_zone "us-central1-a" --tpu_name "test-tpu"

With a new one fully configured:

accelerate pod-config

To install Accelerate in the new environment on all processes:

accelerate pod-config --install_accelerate
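
As a rough illustration of what the usage examples above could translate to, the sketch below assembles a single `gcloud compute tpus tpu-vm ssh` invocation that targets every worker. The helper `build_ssh_command`, the choice to install Accelerate before the user's commands, and the `; ` joining of commands are all assumptions; the PR's actual implementation may differ.

```python
from typing import List, Optional


def build_ssh_command(
    tpu_name: str,
    tpu_zone: str,
    commands: List[str],
    install_accelerate: bool = False,
    accelerate_version: Optional[str] = None,
) -> List[str]:
    """Assemble one gcloud invocation that runs `commands` on every worker."""
    to_run = list(commands)
    if install_accelerate:
        package = "accelerate"
        if accelerate_version is not None:
            package = f"accelerate=={accelerate_version}"
        # Assumption: install first so later commands can rely on the package.
        to_run.insert(0, f"pip install {package}")
    return [
        "gcloud", "compute", "tpus", "tpu-vm", "ssh", tpu_name,
        "--zone", tpu_zone,
        "--command", "; ".join(to_run),
        "--worker", "all",
    ]


cmd = build_ssh_command("test-tpu", "us-central1-a", ["echo 'Hello World'"])
print(" ".join(cmd))
```

The list form can be handed to `subprocess.run(cmd)` without shell quoting issues; the sketch only builds the command and does not execute it.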

@muellerzr muellerzr added the enhancement New feature or request label Oct 31, 2022
@muellerzr muellerzr added the TPU Bug or feature on TPU platforms label Oct 31, 2022
@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Oct 31, 2022

The documentation is not available anymore as the PR was closed or merged.

Contributor

@pacman100 pacman100 left a comment

Thank you for making it easier to configure TPU pods 🤗. Just a nit and a comment, LGTM!

args.accelerate_version = f"accelerate=={args.accelerate_version}"

if not args.command_file and not args.command:
    raise ValueError("You must specify either a command file or a command to run on the pod.")
Contributor

Could we allow mixing and matching by appending the commands from args.command_file to args.command? Would that be better than raising an error?

Contributor Author

Allowing only one or the other keeps the API simple; otherwise we have to worry about the order in which commands from the CLI and commands from the bash script should run, which can be confusing to users. However, in this particular case the check is just to ensure that you've passed in some command to run :)

Contributor Author

We can keep this idea open though, and if users want it we can enable it.
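
For readers following this thread, the trade-off can be sketched as below. This is an illustration only: `resolve_commands`, its `allow_mix` switch, and the "CLI commands first, then script lines" ordering are hypothetical, and the error raised when both sources are given is an assumption about how "one or the other" might be enforced rather than behaviour confirmed by the PR.

```python
from typing import List, Optional


def resolve_commands(
    command: Optional[List[str]],
    file_lines: Optional[List[str]],
    allow_mix: bool = False,
) -> List[str]:
    """Return the commands to run, given CLI commands and script lines."""
    # Same guard as in the reviewed snippet: at least one source is required.
    if not command and not file_lines:
        raise ValueError("You must specify either a command file or a command to run on the pod.")
    if command and file_lines:
        if not allow_mix:
            # Hypothetical strict mode: refuse to guess an ordering.
            raise ValueError("Pass either a command file or commands, not both.")
        # Hypothetical mix-and-match: script lines appended after CLI commands.
        return list(command) + list(file_lines)
    return list(command or file_lines)


print(resolve_commands(["echo one"], ["echo two"], allow_mix=True))
```

The ordering line is exactly the decision the author wants to avoid forcing on users: reversing it would be equally defensible, which is the ambiguity raised in the reply above.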

Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com>
Collaborator

@sgugger sgugger left a comment

Nice work, thanks a lot for spearheading this!

@muellerzr muellerzr merged commit b816e25 into main Nov 1, 2022
@muellerzr muellerzr deleted the pod-config branch November 1, 2022 14:00