[WIP] Revision to Thermo target #61

leeping · 2014-04-04T08:49:28Z

Progress and notes:

Parser for new data format (100% done)
- Multiple files will be read into a single DataFrame.
- The "system index" specifies an experimental data set and corresponding simulations (topology, initial conditions, simulation settings and thermodynamic ensemble).
Create Observable and Simulation objects from user input (100% done)
- A single system index may correspond to multiple simulations to be executed.
- For example, simulating the heat of vaporization would require gas and liquid simulations.
- Simulating the density would also require running the liquid simulation.
- Parallelize across system indices and independent initial conditions.
- Can also parallelize across multiple simulations within a system index if desired (currently performed as a chain).
- Some observables are not uniquely mapped to simulations (e.g. density can come from a liquid or a solid).
- Furthermore, the required simulations are not determined automatically from the user-specified observables, because the method for calculating the observable depends on the type of simulation (e.g. compressibility may be calculated for liquids, solids, and bilayers).
- Thus, the input file must specify both the observables to be calculated and the simulations to be run.
- Restriction: An error will be thrown if more than one simulation name is provided that can calculate a specified observable. Thus, if the density is specified as an observable, either the liquid or solid simulation must be specified but not both.
- How to make this more flexible in the future? Perhaps the column heading can contain the system name, such as solid_density or liquid_density.
- In order to calculate some timeseries (e.g. deuterium order parameter), the Observable class needs to pass some information to the Simulation. Need to figure out how to do this right.
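A minimal sketch of the restriction described above, where each observable must map to exactly one of the user-specified simulations. The `CAPABLE_SIMULATIONS` table, the observable keys, and `resolve_simulations` are hypothetical names for illustration only, not the actual code in this branch:

```python
# Hypothetical table of which simulation types can calculate each observable.
# The keys and values are illustrative and not taken from the ForceBalance source.
CAPABLE_SIMULATIONS = {
    "density": {"liquid", "solid"},
    "kappa": {"liquid", "solid", "bilayer"},
}


def resolve_simulations(observables, simulations):
    """Map each observable to the single user-specified simulation that can calculate it.

    Raises ValueError if no specified simulation can calculate an observable,
    or if more than one can (the restriction described in the notes above).
    """
    resolved = {}
    for obs in observables:
        capable = CAPABLE_SIMULATIONS.get(obs)
        if capable is None:
            raise ValueError(f"Unknown observable: {obs}")
        candidates = sorted(capable & set(simulations))
        if not candidates:
            raise ValueError(f"No specified simulation can calculate {obs}")
        if len(candidates) > 1:
            raise ValueError(
                f"{obs} is ambiguous; specify only one of: {', '.join(candidates)}"
            )
        resolved[obs] = candidates[0]
    return resolved
```

With this sketch, `resolve_simulations(["density"], ["liquid", "gas"])` picks the liquid simulation, while passing both `"liquid"` and `"solid"` raises an error.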
Specify all simulation options in input file parser (50% done)
- Default settings may apply to all simulations (e.g. eq_steps, md_steps, timestep).
- If initial conditions are specified in the input file, they should override the default search path for initial coordinate files.
Time series class; Split get_timeseries() from molecular_dynamics() (60% done)
- Represents a time series of instantaneous observables; possibly subclass DataFrame.
- OpenMM saves observables to memory as the simulation is run, so the names of needed timeseries must be saved as Engine attributes.
- On the other hand, GROMACS generates all observables in a post-processing step, so the names of needed timeseries don't need to be stored.
- New observables may require new timeseries to be implemented here.
- Certain timeseries may only be available for some engines (e.g. quantum kinetic energy estimator from OpenMM).
Run simulations and save time series to disk, using either md_chain.py (a chain of simulations for a particular index) or md_one.py (independent simulations) (50% done)
- Replacement for npt.py and npt_lipid.py
- Should md_chain.py and md_one.py use the same file and directory structure? Need to make sure output from md_one.py is properly named - or put results from md_one.py into different folders.
- Energy / dipole derivatives are calculated here, also as a time series.
Apply MBAR estimator for grouped system indices
- Applying MBAR estimator across system indices with different molecules makes no sense.
Calculate observables from time series. (25% done)
- Store a dictionary of time series, keyed by the system index and the simulation name.
- Formulas for calculating observables and their derivatives from time series are implemented here.
- Observables may require time series from multiple simulations (e.g. heat of vaporization).
- Observables will still be calculated if experimental data is missing (because it's nice to have a full table of predicted values), but they won't go into the objective function.
- If experimental data is very sparse, then we shouldn't put it in the same Target anyway.
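As a rough illustration of the points above, the sketch below stores time series in a dictionary keyed by (system index, simulation name), combines gas and liquid simulations into a heat of vaporization, and skips observables without experimental data when summing the objective. It assumes the common approximation ΔH_vap ≈ ⟨E_gas⟩ − ⟨E_liq⟩/N + RT (ideal gas, liquid pV term neglected) and a plain unweighted sum of squared residuals; the function names, the `"potential"` key, and the data layout are hypothetical:

```python
from statistics import fmean

R_KJ_PER_MOL_K = 0.0083144626  # molar gas constant in kJ/(mol K)


def heat_of_vaporization(timeseries, index, n_molecules, temperature):
    """Estimate the heat of vaporization (kJ/mol) from gas and liquid potential energies.

    `timeseries` is keyed by (system index, simulation name); each value is a
    dict of named time series (a hypothetical layout for this sketch).
    """
    e_gas = fmean(timeseries[(index, "gas")]["potential"])
    e_liq = fmean(timeseries[(index, "liquid")]["potential"])
    return e_gas - e_liq / n_molecules + R_KJ_PER_MOL_K * temperature


def objective(calculated, experiment):
    """Sum squared residuals over observables that have experimental data.

    Observables whose reference value is missing (None or absent) are still
    present in `calculated` for reporting, but do not contribute here.
    """
    total = 0.0
    for key, value in calculated.items():
        reference = experiment.get(key)
        if reference is None:
            continue
        total += (value - reference) ** 2
    return total
```

The real objective function would also need weights and scaling factors for each observable; those are left out here to keep the missing-data handling visible.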
Multiple independent initial conditions (50% done)
- How to organize? I propose targets/target_name/system_index/simulation_name_#.[gro|pdb|xyz] numbered from 1. Multiple files are best because PDB format often doesn't update the periodic box across different structures.
- If there is only one initial condition, then the _# suffix is not needed.
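The proposed layout could be generated along these lines. This is only a sketch of the naming scheme above; `initial_condition_paths` and its arguments are hypothetical names:

```python
from pathlib import PurePosixPath

ALLOWED_EXTENSIONS = ("gro", "pdb", "xyz")


def initial_condition_paths(target_name, system_index, simulation_name,
                            n_conditions, ext="gro"):
    """Build paths like targets/target_name/system_index/simulation_name_#.ext.

    Files are numbered from 1; the _# suffix is dropped when there is only
    one initial condition.
    """
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Unsupported coordinate format: {ext}")
    if n_conditions < 1:
        raise ValueError("At least one initial condition is required")
    base = PurePosixPath("targets") / target_name / str(system_index)
    if n_conditions == 1:
        return [base / f"{simulation_name}.{ext}"]
    return [base / f"{simulation_name}_{i}.{ext}"
            for i in range(1, n_conditions + 1)]
```

For example, two liquid initial conditions for a system index `298K` in a target named `Lipid` would give `targets/Lipid/298K/liquid_1.gro` and `targets/Lipid/298K/liquid_2.gro`.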
The remote scripts md_one.py and md_chain.py should have ways to calculate all observables that they are able to calculate (as an additional way to check consistency)
Map abbreviated units to full units
XML format parser
Added unit tests
- Read multiple ways of specifying lipid data and check that the data tables are the same.

ebran · 2014-04-05T11:47:50Z

Hi Lee-Ping,

This looks very promising! I will be travelling for a week, but I will
take a close look when I am back at work.

Best,
Erik

leeping added 3 commits March 31, 2014 21:38

Begin new data table parsing

2d0e90d

Merge branch 'master' of github.com:leeping/forcebalance into thermo2

f9ef833

Implemented tri-format parser (broke Thermo)

9d939ac

leeping and others added 10 commits April 6, 2014 07:29

Merge branch 'master' of github.com:leeping/forcebalance into thermo2

0a6b6f5

Fix up exception handling

902264c

Added file referencing in parser and build DataFrame

0baea9b

Added unit test for data file parsing (lipid)

e0c3785

Clean up

3d1f598

Create list of Ensembles and table of Observable objects

66f82df

Map observable names to required simulations

e3676f2

Start modifying framework to require user input simulations.

2b6df36

Observable and simulation setup should be working correctly now

afd8485

Clean up

efdb842

leeping mentioned this pull request Apr 8, 2014

Discuss Options for Organizing Experimental Data #56

Closed

leeping and others added 15 commits April 16, 2014 14:35

Target now knows which simulations to launch. Next task: Pass in auxiliary files (.top, .mdp) and run the simulation.

132e70a

Merge branch 'thermo2' of github.com:leeping/forcebalance into thermo2

0f24e9a

Conflicts: src/observable.py src/parser.py src/thermo.py studies/004_thermo/single.in

A few changes for energy/force matching and Q-Chem output parsing (cherry pick over to main).

df46c9a

Improvements for energy/force and frequency matching

adc1ff4

Merge branch 'master' of github.com:leeping/forcebalance into thermo2

ef425c7

Conflicts: src/abinitio.py src/observable.py

Work in progress

7726de9

Merge branch 'thermo2' of github.com:leeping/forcebalance into thermo2

1287bb6

Reduce the amount of printout

42eb6a2

Work in progress

f7ab585

Added simulation.py which contains simulation class (container for simulation settings)

88b1f15

md_one.py creates Engine object.

0cb3e61

md_one.py runs molecular dynamics!

c256ce0

Work on extracting timeseries

a8ff3bb

Clean up

537ed48

Clean up

3445751

Lee-Ping Wang and others added 3 commits April 23, 2014 23:17

Density observable can calculate the density and gradient

63cae2f

Fix occasional failure in topology building

2af905f

Merge branch 'thermo2' of github.com:leeping/forcebalance into thermo2

d305330
