-
Notifications
You must be signed in to change notification settings - Fork 45.4k
Unified arg parser #3574
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unified arg parser #3574
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great-- really looking forward to having fewer instances of --multi_gpu defined across the code. Some comments toward simplification, but looking good.
| @@ -0,0 +1,14 @@ | |||
| # Copyright 2017 The TensorFlow Authors. All Rights Reserved. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it matters one way or the other, but a quick check of TF proper implies that they only include the license in init files if there is actual code in them, otherwise empty. Fine to leave empty in this case, probably.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
| # limitations under the License. | ||
| # ============================================================================== | ||
|
|
||
| from .parsers import LocationParser, DeviceParser, SupervisedParser, \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's stick with absolute imports. Both for pep8 and for the sake of rules that convert this code internally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Fixed.
| """ | ||
|
|
||
| import argparse | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can collapse some of these categories-- all will need location, device, supervised-- those can be lumped into ModelParser or something, with all args True by default. Here we have tiny groups, and also the ability to turn things on and off one by one, which seems redundant. If we collapse those three, then remove the need to turn each option on individually, you substantially reduce the overall code required in resnet above, for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I left the default as False for the secondary classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's go ahead and make default=True for those as well. I would prefer to reduce the total amount of arg_parser code that users have to read through when looking at Resnet.
|
|
||
| if data_dir: | ||
| self.add_argument( | ||
| "--data_dir", "-dd", default="/tmp", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe /tmp/model-data or something like that? Seems like just /tmp will never be a good guess. Not that model-data is though. So maybe just an empty string? Ditto for model dir below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm expecting defaults to be overridden. I am inclined to either use "/tmp" or an explicitly invalid string like "data_directory_path"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See below.
| if data_dir: | ||
| self.add_argument( | ||
| "--data_dir", "-dd", default="/tmp", | ||
| help="[default: %(default)s] The location of the input data.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe comment on some of the arg parser magic here? What is the default templating, and metavar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added to the docstring.
|
|
||
| if inter_op: | ||
| self.add_argument( | ||
| "--inter_op_parallelism_threads", "-inrt", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
inrt is really hard for me to parse as an abbreviation. Maybe just inter/intra for these.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
|
|
||
|
|
||
| class DummyParser(argparse.ArgumentParser): | ||
| """Default parser for specification of dummy/mocked behavior. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be combined with performance above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
|
|
||
| if use_synthetic_data: | ||
| self.add_argument( | ||
| "--use_synthetic_data", "-usd", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-synth?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
| ) | ||
|
|
||
|
|
||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: too many \ns at the end here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
official/resnet/resnet.py
Outdated
| 'for generally homogeneous data sets, should be approximately the ' | ||
| 'number of available CPU cores.') | ||
| super(ResnetArgParser, self).__init__(parents=[ | ||
| official.utils.arg_parsers.LocationParser(data_dir=True, model_dir=True), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The import logic seems off here-- shouldn't this be from official.utils import arg_parsers above, then just arg_parsers.... here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the import logic of official module is like @karmel mentioned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, it looks like from foo.bar import baz is allowed by the style guide. I will certainly change this.
official/resnet/resnet.py
Outdated
| 'for generally homogeneous data sets, should be approximately the ' | ||
| 'number of available CPU cores.') | ||
| super(ResnetArgParser, self).__init__(parents=[ | ||
| official.utils.arg_parsers.LocationParser(data_dir=True, model_dir=True), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since each of line is kind of long, can we just import official.utils.arg_parsers as parsers, and use it has parsers.LocationParser?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto on above.
|
|
||
| """Collection of parsers which are shared among the official models. | ||
| The parsers in this module are intended to be used as parents to all arg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The indent in the beginning of this line is not necessary.
| import argparse | ||
|
|
||
|
|
||
| class LocationParser(argparse.ArgumentParser): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The naming of this class is bit confusing. This flag is trying to specify the file path, I think we should be more explicit here, eg DataPathParser or something else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moot point now, but I'll keep that in mind for the future.
| model_dir: Create a flag for specifying the model file directory. | ||
| """ | ||
|
|
||
| def __init__(self, add_help=False, data_dir=False, model_dir=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
param data_dir and model_dir give the impression that they should should be a path or string type, instead of boolean. Probably rename them into add_data_dir and add_model_dir.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if we should add "output_dir" here to specify the flag of the output directory for hooks logging?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered suffixing them with "_flag" (i.e. data_dir_flag=True, ...) but it made things look awkward and long. I am open to suggestions though.
| ) | ||
|
|
||
|
|
||
| class DeviceParser(argparse.ArgumentParser): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class only contains one flag, which is not necessary to wrap. Are you intend to add more flags into this parser?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
| ) | ||
|
|
||
|
|
||
| class SupervisedParser(argparse.ArgumentParser): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also a bit more explicit naming is better here, eg SupervisedLearningParser or CommonSupervisedLearningParser
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto on above.
| ) | ||
|
|
||
|
|
||
| class DummyParser(argparse.ArgumentParser): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DummyParse gives the impression that it does nothing, whereas it is actually doing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe DefaultParser?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto on above.
| def __init__(self, add_help=False, data_dir=False, model_dir=False): | ||
| super(LocationParser, self).__init__(add_help=add_help) | ||
|
|
||
| if data_dir: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since those are not random string and should be a file path, should we do some validation and make sure they exists?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the arg parser is the place for that, since a model may choose some behavior (i.e. prompt for a specific script to be run or download automatically) on a case by case basis.
|
|
||
| assert namespace.multi_gpu | ||
| assert namespace.use_synthetic_data | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do u also want to test the case that the flag value is specified while the arg parser is not turned on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This turns out to not be practical. When parse_args fails, argparse fails hard by calling sys.exit(2) rather than raising an exception. While I certainly could invoke subprocess and assert an exit code of 2, this seems like too much machinery for a test that is just intended to show off parse_args().
c7eaae0 to
792b89f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
| """ | ||
|
|
||
| import argparse | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's go ahead and make default=True for those as well. I would prefer to reduce the total amount of arg_parser code that users have to read through when looking at Resnet.
the new arg parsers. add parser unittests condense classes and make some style cleanups.
792b89f to
4fe8648
Compare
This PR breaks various groups of common arguments into their own argparse class to enable some degree of standardization among the official models. The resnet argparser is replaced as an example.