Unsetting GCS Bucket paths by default #61

Merged Jun 13, 2023 (7 commits)

Collaborator

@priyanka-ganesha priyanka-ganesha commented Jun 13, 2023

  • Unset the GCS bucket paths in base.yml and added checks to make sure they are not empty before training
  • Removed references to setting the dcn parallelism value in the README
  • Changed the hardcoded reference to dataset_path to use the config value
  • Edited multihost_dataloading_test.py to take GCS bucket params

@@ -87,6 +87,8 @@ def user_init(raw_keys):
run_name = raw_keys["run_name"]
assert run_name, "Erroring out, need a real run_name"
base_output_directory = raw_keys["base_output_directory"]
assert base_output_directory, "Erroring out, need a real base_output_directory"
assert raw_keys['dataset_path'], "Erroring out, need a real dataset_path"
Collaborator

I think it would be nice to also include a link to the README section on downloading the dataset here, something like "Erroring out, need a real dataset_path. See instructions for downloading the c4 dataset here: https://github.com/google/maxtext/blob/main/README.md#getting-started-download-dataset-and-configure".
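
A minimal sketch of what that suggestion could look like, assuming the checks stay as plain assertions on `raw_keys`. The helper name `validate_required_keys`, the constant `DATASET_README_URL`, and the `gs://example-bucket/...` paths are hypothetical and only for illustration; they are not taken from the PR.

```python
# URL copied from the review comment; the constant name is hypothetical.
DATASET_README_URL = (
    "https://github.com/google/maxtext/blob/main/README.md"
    "#getting-started-download-dataset-and-configure"
)


def validate_required_keys(raw_keys):
    """Hypothetical helper: fail fast when a required config value is unset."""
    assert raw_keys["run_name"], "Erroring out, need a real run_name"
    assert raw_keys["base_output_directory"], (
        "Erroring out, need a real base_output_directory"
    )
    # The message points the user at the README section for getting the data.
    assert raw_keys["dataset_path"], (
        "Erroring out, need a real dataset_path. See instructions for "
        f"downloading the c4 dataset here: {DATASET_README_URL}"
    )


if __name__ == "__main__":
    # Example config with an unset dataset_path (bucket name is made up).
    config = {
        "run_name": "demo",
        "base_output_directory": "gs://example-bucket/output",
        "dataset_path": "",
    }
    try:
        validate_required_keys(config)
    except AssertionError as err:
        print(err)
```

Keeping the URL in one constant means the message stays easy to update if the README anchor changes. Note that `assert` statements are skipped when Python runs with `-O`, so an explicit `raise ValueError(...)` may be preferable if the check must always run.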

Collaborator

@gobbleturk gobbleturk left a comment

LGTM, left a comment suggesting you add a helpful link in the error message

Collaborator

@khatwanimohit khatwanimohit left a comment

LGTM

@priyanka-ganesha priyanka-ganesha merged commit b852191 into main Jun 13, 2023
@priyanka-ganesha priyanka-ganesha deleted the prii-unset-buckets branch June 13, 2023 22:13
A9isha pushed a commit that referenced this pull request Apr 11, 2024
* added remaining nightly pax tests

* fix cron scheduling

* added test dependencies
3 participants