New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Write out YAML file after parsing #835
Conversation
@jchodera @ijpulidos thoughts on the format for the parsed yaml? For example, time = datetime.datetime.now().strftime("%Y-%m-%d-%H:%M:%S")
yaml_parse_name = f"parsed-{time}-{yaml_name}" Also this will be a bit more work than I thought, I tried reading in a parsed yaml and it barfs: Traceback (most recent call last):
File "/home/mmh/miniconda3/envs/perses/bin/perses-relative", line 33, in <module>
sys.exit(load_entry_point('perses', 'console_scripts', 'perses-relative')())
File "/home/mmh/Projects/perses/perses/app/setup_relative_calculation.py", line 687, in run
setup_options = getSetupOptions(yaml_filename)
File "/home/mmh/Projects/perses/perses/app/setup_relative_calculation.py", line 110, in getSetupOptions
setup_options['complex_box_dimensions'] = tuple([float(x) for x in setup_options['complex_box_dimensions']])
TypeError: 'NoneType' object is not iterable since orignal: atom_selection: not water
checkpoint_interval: 5
fe_type: repex
forcefield_files:
- amber/ff14SB.xml
- amber/tip3p_standard.xml
- amber/tip3p_HFE_multivalent.xml
- amber/phosaa10.xml
n_cycles: 50
n_equilibration_iterations: 10
n_states: 3
n_steps_per_move_application: 50
new_ligand_index: 15
old_ligand_index: 14
phases:
- vacuum
pressure: 1.0
save_setup_pickle_as: fesetup_hbonds.pkl
small_molecule_forcefield: openff-1.0.0
solvent_padding: 9.0
temperature: 300.0
timestep: 4.0
trajectory_directory: cdk2_repex_hbonds
trajectory_prefix: cdk2
ligand_file: /home/mmh/Projects/perses/examples/cdk2-example/CDK2_ligands_shifted.sdf
protein_pdb: /home/mmh/Projects/perses/examples/cdk2-example/CDK2_protein.pdb parsed: anneal_1,4s: false
atom_expr: null
atom_selection: not water
bond_expr: null
checkpoint_interval: 5
complex_box_dimensions: null
fe_type: repex
forcefield_files:
- amber/ff14SB.xml
- amber/tip3p_standard.xml
- amber/tip3p_HFE_multivalent.xml
- amber/phosaa10.xml
ionic_strength: 0.15
ligand_file: /home/mmh/Projects/perses/examples/cdk2-example/CDK2_ligands_shifted.sdf
map_strength: null
n_cycles: 50
n_equilibration_iterations: 10
n_states: 3
n_steps_per_move_application: 50
neglect_angles: false
new_ligand_index: 15
nonbonded_method: PME
offline-freq: 10
old_ligand_index: 14
phases:
- vacuum
pressure: 1.0
protein_pdb: /home/mmh/Projects/perses/examples/cdk2-example/CDK2_protein.pdb
protocol-type: default
remove_constraints: false
render_atom_map: true
rmsd_restraint: false
run_type: null
save_setup_pickle_as: fesetup_hbonds.pkl
small_molecule_forcefield: openff-1.0.0
small_molecule_parameters_cache: null
softcore_v2: false
solvent_box_dimensions: null
solvent_padding: 9.0
spectators: null
temperature: 300.0
timestep: 4.0
trajectory_directory: cdk2_repex_hbonds
trajectory_prefix: cdk2 So even if I drop the |
It isn't actually because it is an object, the parsing code doesn't handle the case where if 'complex_box_dimensions' not in setup_options:
setup_options['complex_box_dimensions'] = None
else:
setup_options['complex_box_dimensions'] = tuple([float(x) for x in setup_options['complex_box_dimensions']]) |
So I'll just need to make the parsing logic a bit more robust. |
@mikemhenry I actually think having all the values in the parsed YAML is a good idea for replicability (even the ones inferred by the software). If we don't have all of these in the parsed YAML and if we end up changing the defaults in the future we then would get different results for the same YAML file, which I think is not something we want. Another solution would be to just not try to put these if 'complex_box_dimensions' in setup_options:
setup_options['complex_box_dimensions'] = tuple([float(x) for x in setup_options['complex_box_dimensions']]) or something like that. Of course this means we have to be sure the classes have actual sane and working defaults and this may be a hard task. |
Hmm, actually that code snippet wouldn't work as well, maybe you need to check also that they are not |
As long as we can re-parse the YAML file, that works for now! We'll be replacing all of this functionality when we bring the new API online. |
Yes, I want to do the least amount of work possible on this since it is due for an overhaul. Ready for review now @ijpulidos and @jchodera |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! It works, even though I think this way of dealing with setup options from CLI can be done in a better way. Hopefully something for future releases.
I think we do require a test for this, though. Something that starts a sim from a parsed YAML.
Is it crazy to ask also for a test that verifies that the parsed YAML has all the possible parameters/options? Thinking again about reproducibility. |
I think all we need right now is "good enough" because all this functionality will be replaced in an upcoming milestone. What about creating a new issue to discuss what the new YAML functionality should look like with the new API, and making sure we spec out exactly what behavior we want for that? If we design it right, it will be simple to implement and achieve the desired behavior while not compromising flexibility. Tests could, for example, use introspection to regress over all available options. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good enough for now!
Description
Write out YAML file after parsing.
Motivation and context
In order to improve reproducibility and quickly see which options were used in the simulation, it would be nice to be able to inspect the YAML that has all the options set as parsed.
Resolves #817
How has this been tested?
Tested on GHA and locally manually.
Change log