Small improvements to naming / structure in input_pipeline_interface.py #1845
Maybe name this module synthetic_data_processing.py
assert config.packing, "c4_mlperf dataloader only works with packing. For padded version, use tfds dataloader"
train_iterator, eval_iterator = dataset_type_to_train_eval_iterator[config.dataset_type]
else:
max_logging.log(f"WARNING: '{config.dataset_type}' is not a supported dataset type." \
When the user specifies dataset_type=synthetic, this WARNING message will be confusing. We can exclude that case.
When dataset_type=synthetic, it should not trigger this warning message, since it already exits at line 64.
I see. Sorry I missed that. Thanks!
Thanks Nuojin!
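For readers following the thread, here is a minimal sketch of the control flow being discussed: the synthetic case returns early, so it never reaches the "not a supported dataset type" warning. This is not the code from the PR. `make_iterators`, `builders`, the placeholder return values, and the use of the standard `logging` module in place of `max_logging` are all assumptions made for illustration.

```python
import logging

logger = logging.getLogger("input_pipeline_sketch")


def make_iterators(dataset_type, builders):
  """Hypothetical dispatcher returning (train_iterator, eval_iterator).

  `builders` maps a dataset type name to a zero-argument callable that
  returns the iterator pair. All names here are illustrative only.
  """
  # Synthetic data is handled first and returns early, so it can never
  # fall through to the warning branch at the bottom.
  if dataset_type == "synthetic":
    return "synthetic_train_iterator", None  # placeholder values

  # Supported dataset types are looked up in the dispatch table.
  if dataset_type in builders:
    return builders[dataset_type]()

  # Only genuinely unknown dataset types reach this warning.
  logger.warning("'%s' is not a supported dataset type.", dataset_type)
  return None, None  # assumption: the real fallback behavior may differ
```

With this shape, `make_iterators("synthetic", {})` returns without logging anything, while an unknown name logs the warning.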
Description
TL;DR: Reorganize input_pipeline_interface.py for better readability
The following changes are made:
FIXES: b/421596013
Tests
Checklist
Before submitting this PR, please make sure (put X in square brackets):