Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

User config selection should have priority #65

Open
sthartman opened this issue Jun 6, 2022 · 1 comment
Open

User config selection should have priority #65

sthartman opened this issue Jun 6, 2022 · 1 comment
Labels
bug Something isn't working

Comments

@sthartman
Copy link

Hello,
There are currently two ways to set certain configuration variables such as keep_data_order: 1) as a command-line argument, or 2) in the config.json file. However, the way that the train_folder script is set up, changing these variables in the config.json file has no effect, because they will be overridden by the default value of the command-line argument.

When that happens, the program actually over-writes the original config.json to make it look like the variable was never changed there in the first place. Forgetful users (me) will think that was an oversight on their part, edit the config.json file again as desired, and re-run with the same result several times before figuring out what's going on.

There are a couple of ways to avoid this problem. I assume the simplest would be to remove these variables from config.json, leaving the command-line argument as the only way to change them. Or if it's desirable to have these variables be changed from either direction, I think the config.json should have priority over the default value for the command line argument, and if the user explicitly sets a command line argument which conflicts with config.json they should get an error instead of a silent over-write.

@bdecost bdecost added the bug Something isn't working label Jun 6, 2022
@bdecost
Copy link
Collaborator

bdecost commented Jun 6, 2022

Thanks for the feedback! I'm labeling this as a bug, because default commandline arguments should not overwrite explicitly set configuration. Sorry if this has caused some grief.

There are a few aspects to this:

We do want to keep writing the full configuration back to the model directory, so that we can try to maintain reproducibility by recording the full settings. We are using pydantic BaseSettings to manage default settings and validation. There's actually a third way to set configuration variables, you can set them through environment variables and they should take precedence over settings from files.

  • we should not overwrite user config here, but store it in a separate file like here so we can keep track of the configuration that was actually used to generate a given model artifact at a later time

  • we also should make sure any command line interfaces respect user configuration, and are consistent with the default settings in alignn/config.py

I think the preferred priority should be something like

  • explicit command line arguments take priority over configuration read from a file
  • explicit configuration in a json file should take priority over default commandline settings
  • default settings should match between the command line script and the settings model in alignn/config.py
  • I think command line arguments should take priority over environment variables, probably, since arguments are more explicit. (I could set an environment variable in login scripts and override it when invoking a training script)
  • configuration files take precedence over environment variables, I think, since that is the [default priority in pydantic](https://pydantic-docs.helpmanual.io/usage/settings/#field-value-priority BaseSettings)

Happy to hear any feedback on this

  • we should probably print some logging message when a configuration variable is overridden

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants