This repository has been archived by the owner on Mar 19, 2021. It is now read-only.

Argparsing #30

Closed
twuebi opened this issue May 13, 2019 · 5 comments

@twuebi
Collaborator

twuebi commented May 13, 2019

With the addition of the transformer architecture, we will have three model types, each with their own set of hyperparameters.

Currently, arguments are disambiguated by their names (--rnn_layers) or by their help strings (e.g. "number of dilated convolution levels"). With the new model, this is becoming increasingly confusing, and it also raises the question of which model type the defaults apply to.

I see a few options to make things clearer:

1. subcommands to determine the model type:

+ all model-specific arguments that don't belong to the selected model are inactive
+ model-specific --help
+ easy to access via a shell script to run some experiments

- only the combination of write-graph --help and write-graph <MODEL_TYPE> --help gives the full information
- arguments are not persistent
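Option 1 maps naturally onto argparse subparsers. A minimal sketch of what that could look like; the model names and flags below are illustrative, not the project's actual arguments:

```python
import argparse

# Hypothetical sketch of option 1: one subcommand per model type, so each
# model only sees (and documents) its own hyperparameters.
parser = argparse.ArgumentParser(prog="write-graph")
subparsers = parser.add_subparsers(dest="model_type", required=True)

rnn = subparsers.add_parser("rnn", help="RNN model")
rnn.add_argument("--rnn_layers", type=int, default=2)

conv = subparsers.add_parser("conv", help="dilated convolution model")
conv.add_argument("--conv_levels", type=int, default=3)

transformer = subparsers.add_parser("transformer", help="transformer model")
transformer.add_argument("--heads", type=int, default=8)

# e.g. write-graph rnn --rnn_layers 4
args = parser.parse_args(["rnn", "--rnn_layers", "4"])
print(args.model_type, args.rnn_layers)
```

This also gives the model-specific --help for free: `write-graph rnn --help` only lists the RNN options.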

2. read the hyperparameters from a toml file:

+ structure unambiguous
+ parsing handled by a library
+ adds persistent metadata to pre-trained models
+ easy manipulation for experiments, e.g. through a Python script

- slightly more difficult to automate experiments
- need to handle many config files

3. read all hyperparameters from config.py

+ no parsing
+ persistent storage

- config.py is loaded via import, no simple switching between configs
- more difficult to automate experiments
- need to handle many config files

4. specific write-graph scripts for each model type

+ clear separation
+ viable for both reading configs from a file or using argparse
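The code sharing that option 4 depends on could be as simple as a common helper module that every write script calls before adding its own options. A sketch with invented argument names:

```python
import argparse

# Hypothetical sketch of option 4: shared arguments live in one helper,
# so each per-model write script stays small.
def add_common_args(parser):
    """Arguments every model type shares (names are illustrative)."""
    parser.add_argument("--lr", type=float, default=0.01)
    parser.add_argument("--dropout", type=float, default=0.5)

# A write-rnn-graph script would then only add its model-specific options:
parser = argparse.ArgumentParser(prog="write-rnn-graph")
add_common_args(parser)
parser.add_argument("--rnn_layers", type=int, default=2)

args = parser.parse_args(["--rnn_layers", "3", "--lr", "0.05"])
```

Each script stays a plain argparse program, so this remains easy to drive from a shell script.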


If we stick with a single write-graph, I'd prefer a config.toml (2.). If we switch to multiple write scripts, sticking to command line arguments should be fine, although saving the training parameters would be a nice feature. What do you think?

@danieldk
Member

I dislike configuration files; they make it harder to do e.g. a grid search from the shell. I was actually planning to move some things from the configuration file to command-line arguments, such as the learning rate and dropouts.

I think multiple write scripts are the best solution, iff as much code as possible is shared between them.

@twuebi
Collaborator Author

twuebi commented May 15, 2019

> I think multiple write scripts are the best solution, iff as much code as possible is shared between them.

Agreed.

> I dislike configuration files; they make it harder to do e.g. a grid search from the shell. I was actually planning to move some things from the configuration file to command-line arguments, such as the learning rate and dropouts.

Maybe we could still dump the arguments as a metadata file alongside the graph.
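Dumping the parsed arguments is cheap, since argparse's namespace converts directly to a dict. A sketch of the idea; json is used here only to keep the sketch dependency-free (the thread discusses TOML for the real format), and the argument names are illustrative:

```python
import argparse
import json

# Hypothetical sketch: serialize the parsed arguments so they can be written
# as a metadata file next to the graph.
parser = argparse.ArgumentParser(prog="write-graph")
parser.add_argument("--rnn_layers", type=int, default=2)
args = parser.parse_args(["--rnn_layers", "4"])

metadata_str = json.dumps(vars(args), indent=2)
# In a real script one would write this next to the graph, e.g.:
# Path("model.metadata.json").write_text(metadata_str)
print(metadata_str)
```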

@danieldk
Member

I haven't checked whether TensorFlow offers an option to store metadata in graphs; if it does, the graph itself could carry the metadata. I guess that in the worst case, metadata could simply be stored in a const string tensor.

@twuebi
Collaborator Author

twuebi commented May 15, 2019

I'd prefer a simple TOML file, since it's accessible from the command line and can be read without loading the graph into memory, although it does mean an additional file to carry around.

@danieldk
Member

The actual graph is very small. The benefit of embedding the metadata in the graph is that you always have it. With a separate TOML file, you either have to specify an extra option to write the metadata, or you won't have the metadata.
