-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use a file for ESPEI inputs. #28
Conversation
- Make output have a default so the dict is filled - Add file regexes - Make the model types required in generate_parameters required, as well as mcmc steps in mcmc. This ensures that users have something to enter if they use these keys.
…settings Still need to update main to properly use get_run_settings rather than argparse
I should mention that the minimal required yaml input files are as follows. The goals, evidenced in the schema, are sensible defaults, but everything customizable. One last thing I really like about this schema approach is that we can keep deprecated keys in schemas and use the library to rename them and provide compatibility and nice error messages so the schema can evolve organically. Parameter selection onlysystem:
phase_models: my-phases.json
datasets: my-input-data
generate_parameters:
excess_model: linear
ref_state: SGTE91 MCMC onlysystem:
phase_models: my-phases.json
datasets: my-input-data
mcmc:
mcmc_steps: 1000
input_db: my_tdb.tdb MCMC from a previous runsystem:
phase_models: my-phases.json
datasets: my-input-data
mcmc:
mcmc_steps: 1000
input_db: my_tdb.tdb
restart_chain: my_previous_chain.npy Full run (from scratch)system:
phase_models: my-phases.json
datasets: my-input-data
generate_parameters:
excess_model: linear
ref_state: SGTE91
mcmc:
mcmc_steps: 1000 |
system_settings = run_settings['system'] | ||
output_settings = run_settings['output'] | ||
generate_parameters_settings = run_settings.get('generate_parameters') | ||
mcmc_settings = run_settings.get('mcmc') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking everything roughly after here should be broken out into a separate public function. It should be possible to construct run_settings
programmatically (without writing an input file to disk), and then call a function which will handle the error checking and constructing the call to fit()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
run_settings
comes from the public get_run_settings
function that takes a dict and validates it against a schema. In main
we determine whether the input file is a YAML or JSON, then deserialize to dict and call get_run_settings
.
Is that what you're thinking?
As far as constructing the call to fit, the next step is to factor out the call to fit
into two functions, one for generating parameters and one for fitting them with mcmc. Then, ideally, we just pass in the mcmc_settings
dict as a **kwargs
and everything is handled internally
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to be able to pass my own run_settings
to the fitting routine without it needing to know anything about an input file. I'm thinking about programmatic generation of runs with, e.g., different excess models, different point densities (and different output filenames). It would be convenient to just be able to write a template run settings file, load it, and then modify a dict inside a for
loop to kick off all the runs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think we are in agreement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now you could reasonably enumerate all the keyword arguments you want to pass to fit
as a dict. The refactor I proposed would just make it more simple to do that. Ultimately, run_espei.py
should just be a way to run ESPEI from the command line with an input file, but you should in principle be able to do without it if you want to work from a custom script, interactively, with a notebook, or whatever.
For the purposes of this PR, we are just switching from a command line argument interface to a file interface to the fit command. What you've brought up is worth opening an issue for, but since this doesn't introduce any regressions on that front, I think it's out of scope here.
@bocklund By the way, the examples in #28 (comment) would make an excellent addition to the docs! |
Major features:
Validation schema
Almost all of the constraints are handled by the following YAML file (really, anything that can be converted to dict)
This allows us to do to do almost all of the validation, including
Remaining
Still TODO (open new issues):