# Functional goals for 2019 Q2

This notebook allows interactive exploration of the functionality targeted for the GROMACS master branch in 2019 Q2. Jupyter notebooks do not work well as the front end to an MPI job, so for parallel runs the notebook can be converted to a script and run non-interactively:

    jupyter nbconvert RequiredFunctionality.ipynb --to python
    python RequiredFunctionality.py
    # or
    # mpiexec -n 2 python -m mpi4py RequiredFunctionality.py

Before committing changes to this notebook, clear the output and/or run `python strip_notebook.py RequiredFunctionality.ipynb`.
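The contents of `strip_notebook.py` are not shown in this notebook. As a rough sketch of what such a script might do, assuming the standard nbformat v4 JSON layout (code cells carrying `outputs` and `execution_count` fields), the hypothetical helpers `strip_outputs` and `strip_file` below clear stored results:

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Clear outputs and execution counters of all code cells (in place)."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def strip_file(path: str) -> None:
    """Rewrite the notebook at `path` without stored outputs."""
    with open(path, encoding="utf-8") as fh:
        notebook = json.load(fh)
    strip_outputs(notebook)
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(notebook, fh, indent=1)
        fh.write("\n")
```

Markdown cells are left untouched, so only execution results disappear from the diff.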

In [None]:
# Prepare notebook environment.
import gmxapi as gmx

## Tests

gmx.make_operation wraps importable Python code.
<!-- 24 January -->

In [None]:
# `noop` is an operation created with make_operation in the importable
# `test_support` module. It copies its `data` input to its output.
from test_support import noop
op = noop(input={'data': True})
assert op.output.data.extract() == True
op = noop(input={'data': False})
assert op.output.data.extract() == False

gmx.make_operation produces output proxy that establishes execution dependency
<!-- 1 February -->

In [None]:
# The `count` operation copies its `data` input to its `data` output
# and writes its `count` input, incremented by one, to its `count` output.
from test_support import count
op1 = count(input={'data': True})
op2 = count(input=op1.output)
op2.run()
# TBD: how to introspect execution dependency or not-yet-executed status?
# To allow introspection in testing, we might use module global data for operations executed in the local process.
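
One way to make the dependency and the not-yet-executed status observable in a test is a module-level record of executed operations, as suggested in the TBD above. The sketch below is a toy model only: `EXECUTED`, `Future`, and `CountOperation` are hypothetical stand-ins and are not part of gmxapi.

```python
# Module-level record of operations executed in this process (test aid only).
EXECUTED = []


class Future:
    """Output proxy: refers to the producing operation rather than to a value."""

    def __init__(self, operation, name):
        self.operation = operation
        self.name = name

    def extract(self):
        # Resolving the proxy forces the producer (and its dependencies) to run.
        self.operation.run()
        return self.operation.results[self.name]


class CountOperation:
    """Toy stand-in for `count`: copies `data` and increments `count`."""

    def __init__(self, label, data, count=0):
        self.label = label
        self.inputs = {"data": data, "count": count}
        self.results = None  # None means "not yet executed"
        self.output = {"data": Future(self, "data"),
                       "count": Future(self, "count")}

    def dependencies(self):
        """Operations whose output proxies appear among the inputs."""
        deps = []
        for value in self.inputs.values():
            if isinstance(value, Future) and value.operation not in deps:
                deps.append(value.operation)
        return deps

    def run(self):
        if self.results is not None:
            return  # already executed; never run twice
        resolved = {key: value.extract() if isinstance(value, Future) else value
                    for key, value in self.inputs.items()}
        self.results = {"data": resolved["data"],
                        "count": resolved["count"] + 1}
        EXECUTED.append(self.label)


op1 = CountOperation("op1", data=True)
op2 = CountOperation("op2", **op1.output)

before = list(EXECUTED)                # nothing has run at this point
value = op2.output["count"].extract()  # forces op1 first, then op2
after = list(EXECUTED)
```

A test can then compare `before` and `after`, or inspect `op2.dependencies()`, without running any real work.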

gmx.make_operation produces output proxy that can be used as input
<!-- 5 February -->

In [None]:
op1 = count(input={'data': False, 'count': 0})
op2 = count(input=op1.output)
assert op2.output.count.extract() == 2

gmx.make_operation uses dimensionality and typing of named data to generate correct work topologies
<!-- 8 February -->
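
As an illustration of the idea, the sketch below treats list-valued inputs as one value per ensemble member and broadcasts scalar inputs to every member. It is a toy model under that assumption: `ensemble_width`, `run_ensemble`, and `count_op` are hypothetical names, not gmxapi functions.

```python
def ensemble_width(inputs: dict) -> int:
    """Width implied by list-valued inputs; scalars broadcast to any width."""
    widths = {len(value) for value in inputs.values() if isinstance(value, list)}
    if len(widths) > 1:
        raise ValueError(f"inconsistent ensemble widths: {sorted(widths)}")
    return widths.pop() if widths else 1


def run_ensemble(func, **inputs):
    """Call `func` once per ensemble member, broadcasting scalar inputs."""
    width = ensemble_width(inputs)
    results = []
    for member in range(width):
        kwargs = {key: (value[member] if isinstance(value, list) else value)
                  for key, value in inputs.items()}
        results.append(func(**kwargs))
    return results


def count_op(data, count=0):
    """Toy operation: copy `data`, increment `count`."""
    return {"data": data, "count": count + 1}


# Three `data` values and one scalar `count` imply an ensemble of width 3.
outputs = run_ensemble(count_op, data=[True, False, True], count=0)
```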

gmx.gather allows explicit many-to-one or many-to-many data flow
<!-- 15 February -->

gmx.reduce helper simplifies expression of operations dependent on gather
<!-- 15 February -->
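
The intended data flow can be sketched with plain Python lists, where each list element stands for the output of one ensemble member. The helper names `gather_outputs`, `allgather_outputs`, and `reduce_outputs` are hypothetical and only illustrate the semantics. They are not the gmxapi interface.

```python
import operator
from functools import reduce


def gather_outputs(member_outputs):
    """Many-to-one: collect one value per ensemble member into a single list."""
    return list(member_outputs)


def allgather_outputs(member_outputs):
    """Many-to-many: every ensemble member receives the full gathered list."""
    gathered = gather_outputs(member_outputs)
    return [list(gathered) for _ in gathered]


def reduce_outputs(function, member_outputs):
    """Gather, then fold the gathered values with a binary function."""
    return reduce(function, gather_outputs(member_outputs))


counts = [1, 2, 3]  # one value per ensemble member
total = reduce_outputs(operator.add, counts)
```

Expressing the reduction as "gather, then fold" is what lets a helper hide the explicit gather step from the caller.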

gmx.commandline_operation provides utility for wrapping command line tools
<!-- 15 February -->

gmx.commandline_operation produces operations that can be executed in a dependency graph.
<!-- 15 February -->
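
A minimal sketch of the wrapping idea, using only the standard library: flag-to-file mappings are flattened into an argument list, and the tool runs through `subprocess`. `build_argv` and `run_command` are hypothetical helpers, and the Python interpreter stands in for a real tool such as `gmx` so that the example runs anywhere.

```python
import subprocess
import sys


def build_argv(executable, arguments=(), input_files=None, output_files=None):
    """Flatten flag-to-path mappings into a command-line argument list."""
    argv = [executable, *arguments]
    for files in (input_files or {}, output_files or {}):
        for flag, path in files.items():
            argv.extend([flag, path])
    return argv


def run_command(argv):
    """Run the command and return its exit code and captured streams."""
    completed = subprocess.run(argv, capture_output=True, text=True, check=False)
    return {"returncode": completed.returncode,
            "stdout": completed.stdout,
            "stderr": completed.stderr}


# Argument list that a grompp call might use (built here, not executed).
grompp_argv = build_argv("gmx", ["grompp"],
                         input_files={"-f": "params.mdp"},
                         output_files={"-o": "topol.tpr"})

# Stand-in command, executed for real.
result = run_command([sys.executable, "-c", "print('hello')"])
```

Returning the exit code and streams as named results is what would allow a later operation to depend on them in a graph.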

gmx.mdrun uses bindings to C++ API to launch simulations
<!-- 22 February -->

gmx.mdrun understands ensemble work
<!-- 22 February -->

*gmx.mdrun supports interface for binding MD plugins*
(requires interaction with library development)
<!-- 1 March -->

In [None]:
import sample_restraint

starting_structure = 'input_conf.gro'
topology_file = 'input.top'
run_parameters = 'params.mdp'

initial_tpr = gmx.commandline_operation(
    'gmx',
    'grompp',
    input={
        '-f': run_parameters,
        '-c': starting_structure,
        '-p': topology_file
    },
    output={'-o': gmx.OutputFile('.tpr')})

simulation_input = gmx.read_tpr(initial_tpr.output.file['-o'])

# Prepare a simple harmonic restraint between atoms 1 and 4
restraint_params = {'sites': [1, 4],
                    'R0': 2.0,
                    'k': 10000.0}

restraint = sample_restraint.harmonic_restraint(input=restraint_params)

# Bind the restraint operation (not the module) as the MD potential.
md = gmx.mdrun(input=simulation_input, potential=restraint)

md.run()

gmx.subgraph fuses operations
<!-- 1 March -->

gmx.while creates an operation wrapping a dynamic number of iterations of a subgraph
<!-- 1 March -->

gmx.logical_* operations allow optimizable manipulation of boolean values
<!-- 8 March -->

gmx.read_tpr utility provides access to TPR file contents
<!-- 22 February -->

gmx.read_tpr operation produces output consumable by gmx.mdrun
<!-- 22 February -->

gmx.mdrun produces gromacs.read_tpr node for tpr filename kwargs
<!-- 22 February -->

gmx.mdrun is properly restartable
<!-- 22 February -->

gmx.run finds and runs operations to produce expected output files
<!-- 8 March -->

gmx.run handles ensemble work topologies
<!-- 8 March -->

gmx.run handles multi-process execution
<!-- 8 March -->

gmx.run performs safety checks to avoid data loss or corruption
<!-- 8 March -->
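
One simple check of this kind is refusing to overwrite an existing output file. The sketch below uses Python's exclusive-creation file mode. `write_new_file` is a hypothetical helper, and the actual checks in gmx.run are not specified here.

```python
import os
import tempfile


def write_new_file(path, data: bytes) -> None:
    """Write `data` to `path`, failing instead of overwriting an existing file."""
    # Mode "x" raises FileExistsError if the path already exists.
    with open(path, "xb") as fh:
        fh.write(data)


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "traj.trr")
    write_new_file(target, b"frame 0")
    try:
        write_new_file(target, b"frame 1")
        overwritten = True
    except FileExistsError:
        overwritten = False  # the first file is preserved
```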

*gmx.run conveys run-time parameters to execution context*
(requires interaction with library development)
<!-- 15 March -->

In [None]:
# `work` stands for previously defined work, such as the `md` operation above.
gmx.run(work, tmpi=20, grid=gmx.NDArray([3, 3, 2]), ntomp_pme=1, npme=2, ntomp=1)

*gmx.modify_input produces new (tpr) simulation input in data flow operation*
(requires interaction with library development)
<!-- 1 March -->

In [None]:
initial_input = gmx.read_tpr([tpr_filename for _ in range(10)])
tau_t = [i / 10. for i in range(10)]
param_sweep = gmx.modify_input(input=initial_input,
                               parameters={'tau_t': tau_t})
md = gmx.mdrun(param_sweep)
for tau_expected, tau_actual in zip(tau_t, md.output.params['tau_t'].extract()):
    assert tau_expected == tau_actual

gmx.make_input dispatches appropriate preprocessing for file or in-memory simulation input.
<!-- 15 March -->

*gmx.make_input handles state from checkpoints*
(requires interaction with library development)
<!-- 22 March -->

In [None]:
initial_input = gmx.read_tpr(tpr_filename)
md = gmx.mdrun(initial_input)
stage2_input = gmx.make_input(topology=initial_input,
                              conformation=md.output,
                              parameters=stage2_params,
                              simulation_state=md.output)
md = gmx.mdrun(stage2_input)
md.run()

gmx.write_tpr (a facility used to implement higher-level functionality) merges tpr data (e.g. inputrec, structure, topology) into new file(s)
<!-- 1 March -->

In [None]:
gmx.fileio.write_tpr(filename=managed_filename, input=stage2_input)

gmx.tool provides wrapping of unmigrated gmx CLI tools
<!-- 1 March -->

gmx.tool uses Python bindings on C++ API for CLI modules
<!-- 15 March -->

*gmx.tool operations are migrated to updated Options infrastructure*
(requires interaction with library development)
<!-- 5 April -->

In [None]:
analysis = gmx.rmsf(trajectory=md.output.trajectory,
                    topology=initial_input)
file_list = analysis.output.rmsf.extract(filetype='xvg')

gmx.context manages data placement according to where operations run
<!-- 8 March -->

*gmx.context negotiates allocation of 1 node per operation with shared comm*
(requires interaction with library development)
<!-- 8 March -->

In [None]:
from mpi4py import MPI
comm_world = MPI.COMM_WORLD

group2 = comm_world.Get_group().Incl([0,1])
ensemble_comm = comm_world.Create_group(group2)

md = gmx.mdrun([tpr_filename for _ in range(2)])

with gmx.get_context(md, communicator=ensemble_comm) as session:
    session.run()

ensemble_comm.Free()

# Ref: https://bitbucket.org/mpi4py/mpi4py/src/master/demo/wrap-ctypes/helloworld.py
# Ref: https://bitbucket.org/mpi4py/mpi4py/src/master/demo/wrap-c
# TODO: check whether there is a reasonable way we could inspect an argument provided as a comm
# to see if it is a Python wrapper for an MPI communicator or whether we need to have an explicit
# adapter using the mpi4py.h cython-generated header installed with mpi4py.
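
On the Python side, one possible check for the TODO above is an `isinstance` test against `mpi4py.MPI.Comm`. This is a sketch only: `is_mpi4py_comm` is a hypothetical helper, and it does not settle whether the C++ layer needs an adapter built on `mpi4py.h`.

```python
def is_mpi4py_comm(obj) -> bool:
    """Return True if `obj` is an mpi4py communicator.

    Returns False for any other object, and also when mpi4py cannot be imported.
    """
    try:
        from mpi4py import MPI
    except ImportError:
        return False
    return isinstance(obj, MPI.Comm)


# A plain Python object is not a communicator.
plain_object_is_comm = is_mpi4py_comm(object())
```

With mpi4py installed, `is_mpi4py_comm(MPI.COMM_WORLD)` is expected to be true, because `COMM_WORLD` is an `MPI.Intracomm`, a subclass of `MPI.Comm`.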

gmx.context negotiates an integer number of nodes per operation
<!-- 22 March -->

*gmx.context negotiates allocation of resources for scheduled work*
(requires interaction with library development)
<!-- 19 April -->

In [None]:
from mpi4py import MPI
comm_world = MPI.COMM_WORLD

md = gmx.mdrun([tpr_filename for _ in range(2)])

with gmx.get_context(md, communicator=comm_world) as session:
    session.run()

md = gmx.mdrun([tpr_filename for _ in range(4)])

with gmx.get_context(md, communicator=comm_world) as session:
    session.run()