Unsetting GCS Bucket paths by default #61
priyanka-ganesha commented Jun 13, 2023 (edited)
- Unset the GCS bucket paths in base.yml and added checks to make sure they are not empty before training
- Removed references to setting the DCN parallelism value from the README
- Replaced the hardcoded reference to dataset_path with the config value
- Edited multihost_dataloading_test.py to include GCS bucket params
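For readers skimming the description, the "not empty before training" check can be sketched roughly as follows. This is an illustration only, not the code in this PR: `find_unset_paths` and `REQUIRED_PATH_KEYS` are hypothetical names, and the only keys assumed are the two that appear in the diff (`base_output_directory` and `dataset_path`).

```python
# Hypothetical sketch: report which required path settings were left unset.
# The key names come from the diff in this PR; the helper itself is made up.
REQUIRED_PATH_KEYS = ("base_output_directory", "dataset_path")


def find_unset_paths(raw_keys, required=REQUIRED_PATH_KEYS):
    """Return the required keys whose values are missing or empty."""
    return [key for key in required if not raw_keys.get(key)]


# With the bucket paths unset by default, both keys are reported.
default_keys = {"run_name": "demo", "base_output_directory": "", "dataset_path": ""}
unset = find_unset_paths(default_keys)
if unset:
    print("Need real values for: " + ", ".join(unset))
```

Collecting every unset key before failing lets a user fix all of them in one pass instead of hitting one assertion at a time.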
MaxText/pyconfig.py
@@ -87,6 +87,8 @@ def user_init(raw_keys):
  run_name = raw_keys["run_name"]
  assert run_name, "Erroring out, need a real run_name"
  base_output_directory = raw_keys["base_output_directory"]
  assert base_output_directory, "Erroring out, need a real base_output_directory"
  assert raw_keys['dataset_path'], "Erroring out, need a real dataset_path"
I think it would be nice to also include a link to the README section on downloading the dataset here, something like "Erroring out, need a real dataset_path. See instructions for downloading the c4 dataset here: https://github.com/google/maxtext/blob/main/README.md#getting-started-download-dataset-and-configure."
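A minimal sketch of what that could look like, assuming the check stays an `assert` as in the diff. `DATASET_README_URL` and `check_dataset_path` are hypothetical names used only for illustration; the URL is the one quoted in this comment.

```python
# Hypothetical sketch of the suggested error message; names are illustrative.
DATASET_README_URL = (
    "https://github.com/google/maxtext/blob/main/README.md"
    "#getting-started-download-dataset-and-configure"
)


def check_dataset_path(raw_keys):
    """Fail early with a pointer to the README when dataset_path is empty."""
    assert raw_keys["dataset_path"], (
        "Erroring out, need a real dataset_path. "
        f"See instructions for downloading the c4 dataset here: {DATASET_README_URL}"
    )


try:
    check_dataset_path({"dataset_path": ""})
    message = ""
except AssertionError as err:
    message = str(err)
    print(message)
```

Note that `assert` statements are skipped when Python runs with `-O`, so raising `ValueError` would be the sturdier choice if that matters here.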
LGTM, left a comment suggesting you add a helpful link in the error message
LGTM
* added remaining nightly pax tests * fix cron scheduling * added test dependencies