### Imports and settings for this notebook

In [1]:
# general notebook formatting for markdown and plots
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
import warnings

import matplotlib.pyplot as plt

# silence FutureWarnings so the cell outputs stay readable
warnings.filterwarnings("ignore", category=FutureWarning)
from IPython.core.display import HTML

In [2]:
# imports for code examples
import os

from pygromos.files.gromos_system.gromos_system import Gromos_System
from pygromos.files.gromos_system.ff.forcefield_system import forcefield_system

from pygromos.simulations.modules.preset_simulation_modules import sd
from pygromos.simulations.modules.preset_simulation_modules import emin
from pygromos.data.simulation_parameters_templates import template_sd, template_emin_vac



# PyGromosToolsDay

## Introduction

This notebook was created for the PyGromosToolsDay on 17.02.2022  
Author: Marc Thierry Lehner  

This notebook is part of PyGromosTools and can be found at:
https://github.com/rinikerlab/PyGromosTools/blob/pygromosDay/examples/PyGromosToolsDay/pyGromosToolsDay.ipynb

### Schedule for the day

1. [Introduction](#Introduction)
    1. [Schedule](#Schedule)
    2. [Tutorial](#Tutorial)
2. [PyGromosTools](#PyGromosTools)
    1. [PyGromosTools](#PyGromosTools)
    2. [PyGromosTools](#PyGromosTools)

## PyGromosTools - a Python package for GROMOS users

PyGromosTools is a Python package for GROMOS users. It aims to provide an easy-to-use, unified Python interface to GROMOS, which leads to more readable and reproducible code and makes it easier to build pipelines of GROMOS simulations.

PyGromosTools is an organically grown package that tries to address the needs of all GROMOS users.

In its current state, PyGromosTools provides:
1. GROMOSxx wrappers
2. GROMOS++ wrappers
3. File handling of all GROMOS file types for automated creation/modification/analysis
4. Automation and file management system gromos_system
5. Simulation Submission and Execution
6. etc.

In the following sections we will see how to use PyGromosTools to automate the creation of GROMOS simulations and how these tools are intended to be used. We will show simple use cases, explain how to adjust existing classes to fit more complex simulations or new blocks, and finally show how you can contribute to PyGromosTools.

### General File Structure

Since the second major release (TODO: date???), PyGromosTools has been built around a class called `gromos_system`. This is the main class of PyGromosTools and is used to create, manage and analyse GROMOS simulations.  

`gromos_system` stores information about all files of a simulation and provides methods to create, modify and analyse these files.


<p style="text-align:center;">
    <img src="./figures/gromos_system_overview.png" width=900 alt="gromos_system_overview"/>
    <div style="text-align:center;">Based on B. Ries Thesis</div>
</p> 

A `gromos_system` can be created in many different ways. The simplest way is from existing files. However, it can also be created from only a SMILES string and a force field.  

Note, however, that for all GROMOS-type force fields the correct MTB name is required.  

In the following example we first create a `forcefield_system` that holds all information about the force field we want to use, and then use this `forcefield_system` to create a `gromos_system` from a SMILES string.  

If we set the options `adapt_imd_automatically` and `auto_convert` in `gromos_system`, not only the topology but also the coordinate file and an adjusted input file are created automatically.

In [3]:
work_dir = os.path.abspath("./example_sys/")

In [4]:
ff = forcefield_system(name="2016H66")
ff.mol_name = "BZN"

In [5]:
groSys = Gromos_System(work_folder=work_dir,
                system_name="test_system",
                in_smiles="c1ccccc1", 
                Forcefield=ff, 
                adapt_imd_automatically=True,
                in_imd_path=template_emin_vac, 
                auto_convert=True)

This `gromos_system`, called `groSys`, now contains a topology, a coordinate file and an input file for benzene, all generated automatically.  

We can access all these files as Python objects through the `groSys.top`, `groSys.cnf` and `groSys.imd` attributes.  
All of these objects are instances of the `gromos_file` class and have attributes of their own, which are the GROMOS blocks (and sometimes additional content).  

For example, we can check whether the new topology has a proper TITLE block by accessing the attribute `groSys.top.TITLE`. As expected, the title block was generated automatically.

In [6]:
#help(groSys.top.TITLE)

In [7]:
groSys.top.TITLE

TITLE
MAKE_TOP topology, using:
/home/mlehner/PyGromosTools/pygromos/data/ff/Gromos2016H66/2016H66.mtb
/home/mlehner/PyGromosTools/pygromos/data/ff/Gromos2016H66/2016H66_orga.mtb
/home/mlehner/PyGromosTools/pygromos/data/ff/Gromos2016H66/2016H66.ifp

Force-field code: 2016H66

	 >>> Generated with PyGromosTools (riniker group) <<< 
END
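
The block layout visible in the output above (a block name, the content lines, and a closing `END`) can be illustrated with a minimal, standard-library sketch. This is not how PyGromosTools parses files; `parse_blocks` and the shortened example text are assumptions made purely for illustration.

```python
def parse_blocks(text):
    """Split GROMOS-style text into {block_name: [content lines]}.

    Illustrative only: a block starts with its name on its own line
    and ends with a line containing only END.
    """
    blocks = {}
    name = None  # name of the block we are currently inside, if any
    for line in text.splitlines():
        stripped = line.strip()
        if name is None:
            # outside a block: the next non-empty, non-comment line is a block name
            if stripped and not stripped.startswith("#"):
                name = stripped
                blocks[name] = []
        elif stripped == "END":
            name = None
        else:
            blocks[name].append(line)
    return blocks


# shortened, made-up example text in the same layout as the output above
example = """TITLE
MAKE_TOP topology
Force-field code: 2016H66
END
POSITION
    1 C6H6  C          1    0.080649780   -0.114309219    0.001491900
    1 C6H6  C          2    0.139328044    0.012683506   -0.000213298
END
"""

blocks = parse_blocks(example)
print(list(blocks))
print(blocks["TITLE"])
```

Keeping each block under its own key mirrors the idea of accessing a block as an attribute, such as `groSys.top.TITLE`.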

The same is true for the other files. It is especially worth mentioning that the coordinate file is generated automatically using the RDKit conformer generator.

In [8]:
groSys.cnf.POSITION

POSITION
# 	 
    1 C6H6  C          1    0.080649780   -0.114309219    0.001491900
    1 C6H6  C          2    0.139328044    0.012683506   -0.000213298
    1 C6H6  C          3    0.058678222    0.126992825   -0.001704768
    1 C6H6  C          4   -0.080649753    0.114309267   -0.001491014
    1 C6H6  C          5   -0.139327999   -0.012683547    0.000214086
    1 C6H6  C          6   -0.058678224   -0.126992791    0.001705734
    1 C6H6  H          7    0.143042861   -0.202741963    0.002644300
    1 C6H6  H          8    0.247116116    0.022495575   -0.000379138
    1 C6H6  H          9    0.104073182    0.225238122   -0.003024799
    1 C6H6  H         10   -0.143042588    0.202742062   -0.002645445
    1 C6H6  H         11   -0.247116080   -0.022495931    0.000378523
    1 C6H6  H         12   -0.104073562   -0.225237908    0.003023921
END

All these blocks contain classes for their specific fields. In the case of the POSITION block we have a list of atoms, and each atom stores its name, its type and its position. This makes it easy to make adjustments, search for specific atoms or apply scripted modifications to the positions.
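
To make the "list of atoms" idea concrete, here is a small sketch that turns a few POSITION lines from the output above into records, searches them and shifts them. The `Atom` dataclass, its field names and the whitespace-based parsing are assumptions for illustration only; they are not the PyGromosTools classes, and real GROMOS coordinate files use fixed-width columns.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Atom:
    """Hypothetical atom record (not the PyGromosTools class)."""
    res_id: int
    res_name: str
    atom_type: str
    atom_id: int
    x: float  # coordinates in nm
    y: float
    z: float


def parse_position_line(line):
    # assumption: the columns are separated by whitespace
    res_id, res_name, atom_type, atom_id, x, y, z = line.split()
    return Atom(int(res_id), res_name, atom_type, int(atom_id),
                float(x), float(y), float(z))


# three lines copied from the POSITION block shown above
lines = [
    "    1 C6H6  C          1    0.080649780   -0.114309219    0.001491900",
    "    1 C6H6  C          2    0.139328044    0.012683506   -0.000213298",
    "    1 C6H6  H          7    0.143042861   -0.202741963    0.002644300",
]
atoms = [parse_position_line(line) for line in lines]

# search for specific atoms
hydrogens = [atom for atom in atoms if atom.atom_type == "H"]

# scripted modification: shift every atom by 0.1 nm along x
shifted = [replace(atom, x=atom.x + 0.1) for atom in atoms]

print(hydrogens)
print(shifted[0])
```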

<p style="text-align:center;">
    <img src="./figures/gromos_file.png" height=300 alt="gromos file structure"/>
    <div style="text-align:center;">Based on B. Ries Thesis</div>
</p> 

All the fields have their own attributes and sometimes functions to modify them. For example, each entry of the `POSITION` block is an `atomP` object that stores all the information about one atom.

In [9]:
atomP = groSys.cnf.POSITION[1]
print(atomP.atomID, atomP.atomType, atomP.resName, atomP.xp)

2 C C6H6 0.13932804378041846


All cnf files can easily be converted to different file types (e.g. pdb) and inspected with the handy `visualize` function. The `visualize` function is intended for notebooks and prototyping, but can be used for any other purpose. It provides a fully interactive 3D view.

In [10]:
groSys.cnf.visualize()

<py3Dmol.view at 0x7fa91204e8b0>

### Simple Simulation

PyGromosTools provides a simple way to run a GROMOS simulation: create a `gromos_system`, add the necessary files to it, and select one of the preset modules.  

PyGromosTools offers a wide range of preset simulations, but also general simulation templates, which can be easily modified.  

These simulation modules take care of the creation of the necessary files, the execution/scheduling of the simulation, cleaning up the files and the analysis of the results.

<p style="text-align:center;">
    <img src="./figures/simulation_structure.png" width=600 alt="gromos simulation structure"/>
    <div style="text-align:center;">Based on B. Ries Thesis</div>
</p> 

The core of all simulations is a `general_simulation` class. This class is the base class for all simulations. It provides the basic functionality: creating the necessary files, executing or scheduling the simulation, cleaning up the files and analysing the results. Simple extensions of this class are classes like `MD`, `SD` and `EMIN`, which mainly provide preset changes to the IMD file. However, PyGromosTools also provides a wide range of more complex simulation approaches like `Hvap`, `TI`, etc., which often require multiple lower-level simulations to be run.

In all simulation modules, the data is stored and returned in a `gromos_system` object. This makes it easy for users to keep their files organised and to analyse the results. In addition, all simulation modules use a so-called `submission_system`. These classes store the information needed to submit the simulation on a specific platform.
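
The "system in, system out" pattern described above can be sketched in plain Python. Everything in this block (`SystemState`, `LocalSubmission`, `run_step`, the template file names) is hypothetical and only illustrates the design idea; it is not the PyGromosTools API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SystemState:
    """Hypothetical stand-in for a system object that travels through the steps."""
    name: str
    imd: str
    history: tuple = ()


class LocalSubmission:
    """Hypothetical stand-in for a submission system: it only records the jobs."""

    def __init__(self):
        self.jobs = []

    def submit(self, job_name):
        self.jobs.append(job_name)
        return len(self.jobs) - 1  # job id


def run_step(system, step_name, imd_template, submission):
    """Submit one step and return a NEW system that describes the result."""
    job_id = submission.submit(f"{step_name}:{system.name}")
    return replace(system, imd=imd_template,
                   history=system.history + ((step_name, job_id),))


submission = LocalSubmission()
start = SystemState("test_system", "emin_template.imd")
after_emin = run_step(start, "emin", "emin_template.imd", submission)
after_sd = run_step(after_emin, "sd", "sd_template.imd", submission)
print(after_sd.history)
```

Because every step returns a system object, the output of one step can be passed directly into the next, while the submission object decides where the job actually runs.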


In [11]:
eminGroSys = emin(groSys, in_imd_path=template_emin_vac)

################################################################################

emin
################################################################################

Script:  /home/mlehner/PyGromosTools/pygromos/simulations/hpc_queuing/job_scheduling/schedulers/simulation_scheduler.py

################################################################################
Simulation Setup:
################################################################################

steps_per_run:  3000
equis:  0
simulation runs:  1
################################################################################

 submit final analysis part 

/home/mlehner/PyGromosTools/examples/PyGromosToolsDay/example_sys/emin/ana_out.log
/home/mlehner/PyGromosTools/examples/PyGromosToolsDay/example_sys/emin/job_analysis.py
ANA jobID: 0


In [None]:
sdGroSys = sd(eminGroSys, in_imd_path=template_sd)

After running some simulations we still have a `gromos_system` object. This object contains all the files of the simulation. Even trajectories are added to the `gromos_system` object automatically.  
If files are not attached automatically (for example because the jobs were scheduled on a cluster), they can be attached afterwards by calling the function `_check_promises()`. This function goes through all files and updates them. Some files might only be attached as `future_files` and are read in by this function call. `future_files` are placeholders for files that are not yet available; they allow for more flexibility on clusters, where jobs might be submitted before the files are available.
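
The idea of a placeholder that is swapped for the real file once it exists can be sketched as follows. `FutureFile` and `resolve_promises` are hypothetical names chosen for this illustration; the sketch makes no claim about how `_check_promises()` is implemented.

```python
import os
import tempfile


class FutureFile:
    """Hypothetical placeholder for a file that a queued job will write later."""

    def __init__(self, path):
        self.path = path

    def is_ready(self):
        return os.path.isfile(self.path)


def resolve_promises(files):
    """Replace every placeholder whose file exists by the file's content."""
    for key, value in list(files.items()):
        if isinstance(value, FutureFile) and value.is_ready():
            with open(value.path) as handle:
                files[key] = handle.read()
    return files


with tempfile.TemporaryDirectory() as tmp:
    trc_path = os.path.join(tmp, "run.trc")
    files = {"trc": FutureFile(trc_path)}

    # the job has not written its output yet: the placeholder stays in place
    resolve_promises(files)
    still_pending = isinstance(files["trc"], FutureFile)

    # simulate the job finishing and writing its output
    with open(trc_path, "w") as handle:
        handle.write("TITLE\nexample trajectory\nEND\n")

    resolve_promises(files)
    resolved = files["trc"]

print(still_pending)
print(resolved)
```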

In [None]:
sdGroSys._check_promises()

### Simple analysis

At this stage we have successfully run a GROMOS simulation with PyGromosTools, but we still have to analyse the results. Most analyses are based on the trajectories, and the trajectory classes offer many helper functions.  
In PyGromosTools, trajectories are stored as Pandas DataFrames, which makes analysing the results easy and fast. Every type of trajectory has an attribute called `database`, which is a Pandas DataFrame containing all the information of the trajectory. Users can access it like any other DataFrame. However, for the most common analysis tasks, functions are already provided by the respective trajectory class.  
For example, to follow the distance between two atoms during the simulation, we could use the following lines of code:

In [None]:
sdGroSys.trc.database

In [None]:
sdGroSys.trc.get_atom_distance(1, 2)
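
For readers who want to see what such a per-frame distance amounts to, here is a minimal standard-library sketch. It is not the implementation of `get_atom_distance`; the frame layout and the helper `atom_distance` are assumptions, periodic boundary conditions are ignored, and the second frame is made up. The first frame reuses the coordinates of atoms 1 and 2 from the POSITION block printed earlier.

```python
import math

# one dictionary per frame: {atom_id: (x, y, z)} with coordinates in nm
frames = [
    # atoms 1 and 2 from the POSITION block printed earlier
    {1: (0.080649780, -0.114309219, 0.001491900),
     2: (0.139328044, 0.012683506, -0.000213298)},
    # made-up second frame, for illustration only
    {1: (0.0, 0.0, 0.0),
     2: (0.3, 0.4, 0.0)},
]


def atom_distance(frames, atom_i, atom_j):
    """Return the Euclidean distance between two atoms for every frame."""
    return [math.dist(frame[atom_i], frame[atom_j]) for frame in frames]


distances = atom_distance(frames, 1, 2)
print(distances)
```

The first value is close to 0.14 nm, the typical carbon-carbon distance in an aromatic ring.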

Similar to cnf files, trc files also provide a `visualize` function, which shows a 3D animation of the trajectory.

In [None]:
sdGroSys.trc.visualize()

Like the trc files, tre files also provide a wide selection of analysis tools out of the box.

In [None]:
sdGroSys.tre.get_total_energy().plot()

### Non-standard blocks and modifications

### New files for new GROMOS functionality

## Release 3

## Hackathon

## Final Discussion