Argparsing #30
I dislike the configuration files; they make it harder to do e.g. a grid search from the shell. I was actually planning to move some things from the configuration file to command-line arguments, such as the learning rate and dropouts. I think multiple write scripts are the best solution, provided as much code as possible is shared between them.
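The grid-search point above can be sketched as follows. This is a hypothetical illustration: the `write-graph` flag names (`--learning-rate`, `--dropout`) are made up, but it shows why plain command-line arguments make it trivial to generate experiment runs from a script.

```python
import itertools

# Hypothetical sketch: with command-line arguments, a grid search is just a
# product over parameter lists. The flag names and command are illustrative.
learning_rates = [0.1, 0.01]
dropouts = [0.3, 0.5]

commands = [
    f"write-graph --learning-rate {lr} --dropout {d}"
    for lr, d in itertools.product(learning_rates, dropouts)
]
for cmd in commands:
    print(cmd)
```

With a config file, each of these runs would instead require generating and tracking a separate file.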
Agreed.
Maybe we could still dump the arguments as a metadata file alongside the graph.
I haven't checked whether TensorFlow offers an option to store metadata in graphs; then the graph itself could carry the metadata. I guess that in the worst case, metadata could just be stored in a const string tensor.
I'd prefer a simple TOML file, as it's accessible from the command line and can be read without loading the graph into memory. Although that means an additional file to carry around.
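As a rough sketch of the "dump the arguments alongside the graph" idea: the argument names below are illustrative, and since the standard library has no TOML writer, this uses a tiny hand-rolled emitter that only handles a flat dict of numbers and strings.

```python
import argparse

# Hypothetical sketch: dump the parsed hyperparameters as a TOML metadata
# file next to the graph. Argument names are made up for illustration.
parser = argparse.ArgumentParser()
parser.add_argument("--learning-rate", type=float, default=0.01)
parser.add_argument("--dropout", type=float, default=0.5)
args = parser.parse_args([])  # pass the real argv in practice

def to_toml(mapping):
    """Minimal TOML emitter for a flat dict of numbers and strings."""
    lines = []
    for key, value in sorted(mapping.items()):
        if isinstance(value, str):
            lines.append(f'{key} = "{value}"')
        else:
            lines.append(f"{key} = {value}")
    return "\n".join(lines) + "\n"

metadata = to_toml(vars(args))
```

The resulting string would be written next to the graph file; reading it back from the command line is then a plain `cat`.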
The actual graph is very small. The benefit of embedding the metadata in the graph is that you always have it. With a separate TOML file, you either have to specify an extra option to write the metadata, or you won't have the metadata.
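A minimal sketch of the const-string-tensor idea mentioned above: serialize the metadata to JSON, embed it as a named constant, and recover it straight from the serialized `GraphDef` without running a session. The tensor name `"metadata"` and the metadata keys are assumptions for illustration.

```python
import json
import tensorflow as tf

# Hypothetical sketch: embed metadata in the graph as a const string tensor.
meta = {"model_type": "rnn", "rnn_layers": 2}

graph = tf.Graph()
with graph.as_default():
    tf.constant(json.dumps(meta), name="metadata")

# Recover the string directly from the GraphDef protobuf, so the metadata
# is readable without executing the graph.
node = next(n for n in graph.as_graph_def().node if n.name == "metadata")
recovered = json.loads(node.attr["value"].tensor.string_val[0].decode("utf-8"))
```

Since the constant travels inside the graph file, the metadata can never be forgotten or separated from the model.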
With the addition of the transformer architecture, we will have three model types, each with their own set of hyperparameters.
Currently, arguments are disambiguated by their names (`--rnn_layers`) or by their help strings ("number of dilated convolution levels"). With the new model, this is becoming increasingly confusing; it also raises the question of which model type the defaults apply to. I see a few options to make things clearer:
1. subcommands to determine the model type:
   + all model-specific arguments that don't belong to the selected model are inactive
   + model-specific `--help`
   + easy to access via a shell script to run some experiments
   - only `write-graph <MODEL_TYPE> --help` will give the full information
   - arguments are not persistent

2. read the hyperparameters from a TOML file:
   + structure is unambiguous
   + parsing is handled by a library
   + adds persistent metadata to pre-trained models
   + manipulation for experiments through e.g. a Python script
   - slightly more difficult to automate experiments
   - need to handle many config files

3. read all hyperparameters from `config.py`:
   + no parsing
   + persistent storage
   - `config.py` is loaded via `import`, so there is no simple switching between configs
   - more difficult to automate experiments
   - need to handle many config files

4. specific `write-graph` scripts for each model type:
   + clear separation
   + viable for both reading configs from a file and using argparse

If we stick with a single `write-graph`, I'd prefer a `config.toml` (option 2). If we switch to multiple write scripts, sticking to command-line arguments should be fine, although saving the training parameters would be a nice feature. What do you think?
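Option 1 can be sketched with argparse subparsers. This is a hypothetical illustration: except for `--rnn_layers` and the "number of dilated convolution levels" help string, which appear in the discussion above, the flag names and model labels are made up.

```python
import argparse

# Hypothetical sketch of option 1: one subparser per model type, so
# `write-graph <MODEL_TYPE> --help` shows only that model's arguments.
parser = argparse.ArgumentParser(prog="write-graph")
subparsers = parser.add_subparsers(dest="model_type", required=True)

rnn = subparsers.add_parser("rnn", help="recurrent model")
rnn.add_argument("--rnn_layers", type=int, default=2)

conv = subparsers.add_parser("conv", help="dilated convolution model")
conv.add_argument("--levels", type=int, default=6,
                  help="number of dilated convolution levels")

transformer = subparsers.add_parser("transformer", help="transformer model")
transformer.add_argument("--heads", type=int, default=8)

args = parser.parse_args(["rnn", "--rnn_layers", "3"])
```

Note that arguments of the unselected models are simply absent from the parsed namespace, which is exactly the "inactive arguments" benefit listed under option 1; the downside is that `write-graph --help` alone only lists the subcommand names.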