# Inputs setter

This notebook is a tool to manually set the workflow inputs, with descriptions and examples for each one. The values set here are then exported to a JSON file, which is required to run the MoDEL analyses workflow.

Inputs:
1. [Chainnames](#chainnames)
2. [Ligands](#ligands)
3. [Interactions](#interactions)
4. [Membranes](#membranes)
5. [Metadata](#metadata)
6. [Simulation parameters](#simulation)

In [1]:
inputs = {}

### Chainnames <a name="chainnames"></a>

Set the chain names<br />
These names are used to label chains in the web client

#### Example:

```python
{
    'A':'Protein',
    'B':'Protein',
    'G':'Glycans'
}
```

In [2]:
inputs['chainnames'] = {
    'A':'NSP16',
}

### Ligands <a name="ligands"></a>

Set all ligands in the simulation. Each ligand has the following attributes:
    - name: this is used as the ligand NGL representation label
    - ngl: NGL selection used to represent the ligand on the NGL viewer in the client
    - drugbank (Optional): the drugbank accession which is used in the overview to make a link
    - chembl (Optional): the chembl accession which is used in the overview to make a link
    
At the moment, ligands are not used anywhere in the workflow
    
NGL viewer selection:
http://nglviewer.org/ngl/api/manual/usage/selection-language.html
    
#### Example:

```python
{
    'name': 'Some compound',
    'ngl': ':L',
    'drugbank': 'DB00945',
    'chembl': 'CHEMBL25'
},
{
    'name': 'Some nucleic acid',
    'ngl': ':A or :B'
}
```
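
The attribute list above could be enforced with a small helper before appending to `inputs['ligands']`. The following is a minimal sketch and not part of the workflow: `make_ligand` is a hypothetical name, and it only assumes that `name` and `ngl` are required while `drugbank` and `chembl` are optional, as described above.

```python
def make_ligand(name, ngl, drugbank=None, chembl=None):
    """Build a ligand dict (hypothetical helper, not part of the workflow)."""
    if not name or not ngl:
        raise ValueError("both 'name' and 'ngl' are required")
    ligand = {'name': name, 'ngl': ngl}
    # Optional accessions are only added when provided
    if drugbank is not None:
        ligand['drugbank'] = drugbank
    if chembl is not None:
        ligand['chembl'] = chembl
    return ligand

# Same two ligands as in the example above
example_ligands = [
    make_ligand('Some compound', ':L', drugbank='DB00945', chembl='CHEMBL25'),
    make_ligand('Some nucleic acid', ':A or :B'),
]
```

The resulting list has the same shape as the example and could be assigned directly to `inputs['ligands']`.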


In [3]:
inputs['ligands'] = []

### Interactions <a name="interactions"></a>

Set the interactions of interest to be analyzed<br />
A set of interaction-specific analyses is run for each interaction and displayed in the web client

Interactions are defined by the 'agents' which are meant to interact pairwise. An 'agent' may be anything, even a group of unrelated molecules. The only condition is that agents must be defined by residues. The workflow finds which residues of each agent are close enough to be considered interface residues. These are the residues considered in interface analyses<br />

Interactions are uploaded to the database as part of the project metadata. They include the interaction name, the agent names, the residue selections of both whole agents and the residue selections of both agent interfaces<br />

Each interaction has the following attributes:
    - name: a string tag used to relate interaction analyses data with their corresponding residues.
    In addition, the name is used to label the corresponding analyses in the web client
    - agent_1: the name of the first agent in the interaction, which is used to label in the client
    - selection_1: the prody selection of the first agent in the interaction
    - agent_2: the name of the second agent in the interaction, which is used to label in the client
    - selection_2: the prody selection of the second agent in the interaction
    
Prody selection:
http://prody.csb.pitt.edu/manual/reference/atomic/select.html
    
#### Example:

```python
{
    'name': 'protein-ligand interaction',
    'agent_1': 'protein',
    'selection_1': 'not resname lig',
    'agent_2': 'ligand',
    'selection_2': 'resname lig',
},
{
    'name': 'domain-domain interaction',
    'agent_1': 'domain 1',
    'selection_1': 'resnum 2:291',
    'agent_2': 'domain 2',
    'selection_2': 'resnum 306:529',
},
```
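
To illustrate what "close enough" can mean for interface residues, here is a minimal distance-cutoff sketch in pure Python. It is not the workflow's implementation: the function name `interface_residues`, the toy coordinates and the 5.0 Å cutoff are all assumptions chosen for illustration.

```python
from math import dist

def interface_residues(agent_1, agent_2, cutoff=5.0):
    """Return the residues of each agent that have any atom within
    `cutoff` of any atom of the other agent.

    Each agent maps a residue label to a list of (x, y, z) coordinates.
    The cutoff value is an assumption, not the workflow's actual setting.
    """
    interface_1, interface_2 = set(), set()
    for res_1, atoms_1 in agent_1.items():
        for res_2, atoms_2 in agent_2.items():
            if any(dist(a, b) <= cutoff for a in atoms_1 for b in atoms_2):
                interface_1.add(res_1)
                interface_2.add(res_2)
    return sorted(interface_1), sorted(interface_2)

# Toy coordinates in angstroms (made up for illustration)
protein = {
    'ALA 10': [(0.0, 0.0, 0.0)],
    'GLY 11': [(20.0, 0.0, 0.0)],
}
ligand = {
    'LIG 1': [(3.0, 0.0, 0.0)],
}
protein_interface, ligand_interface = interface_residues(protein, ligand)
```

In this toy case only `'ALA 10'` lies within the cutoff of the ligand, so it is the only protein residue reported as interface.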

In [4]:
inputs['interactions'] = []

### Membranes <a name="membranes"></a>

[EXPERIMENTAL INPUT]

Set those elements which must be considered membrane<br/>
These elements are excluded in some analyses such as RMSD per residue<br/>
These elements will be represented in the web client with a specific pattern:
- Licorice
- Purple color
- Low opacity


Defining a membrane requires a name and a selection in the Prody selection language


Prody selection:
http://prody.csb.pitt.edu/manual/reference/atomic/select.html

#### Example:

```python
{
    'name': 'Cell membrane',
    'selection': 'chain M',
}
```

In [5]:
inputs['membranes'] = []

The web client sets some default representations (ngl configurations) to highlight important features in the structure according to the topology reference or interactions. 

In addition, you may set extra custom representations for features of interest that need their own independent highlighting. This has no effect on the workflow, but it is visually prominent in the client.<br />
<span style="color:red">WARNING: Make sure whatever you want to highlight is not already highlighted by default or it would be duplicated</span>

#### Example:

```python
[
    {
        "name" : "Remdesivir",
        "representations" : [
            {
                "name" : "Remdesivir",
                "selection" : "REM",
                "type" : "licorice"
            }
        ]
    }
]
```

In [6]:
inputs['customs'] = []

## Project metadata <a name="metadata"></a>

The following metadata has no effect on the workflow itself, but it will be written to the output metadata file. These values will be uploaded to the database and then exposed in the project overview. They may also be useful to search for this simulation in the browser.

Set the family this trajectory belongs to.

Supported units:
- RBD-ACE2
- RBD
- ACE2
- Spike
- 3CLpro
- PLpro
- Polymerase
- Exoribonuclease
- Other

In [7]:
inputs['unit'] = 'NSP16/10'

Set the source pdb ids of the trajectory structure<br />
Additional data from the pdb is harvested by the loader while uploading to the database<br />
This data is displayed in the overview page

#### Example:

```python
['2AJF', '6M17']
```

In [8]:
inputs['pdbIds'] = ['6W4H']

Write a brief description or title for this trajectory for the overview page<br />
This name may be used by the client to search the trajectory in the database<br />
The name is displayed in the overview page

In [9]:
inputs['name'] = 'Nsp16 monomer from SARS-CoV-2'

Write additional comments<br />
The description is displayed in the overview page

In [10]:
inputs['description'] = 'Markov state model built from 770 microseconds of aggregate simulation time collected on Folding@home. Folding@home simulations were launched from the structures generated using an adaptive sampling method (FAST-Pocket).'

Write author names<br />
Authors are displayed in the overview page

In [11]:
inputs['authors'] = 'Neha Vithani, Maxwell I. Zimmerman, Gregory R. Bowman'

Write author group name/s<br />
The group is displayed in the overview page

In [12]:
inputs['groups'] = 'Gregory R. Bowman (Bowman lab)'

How to contact the authors. The contact is displayed in the overview page

In [13]:
inputs['contact'] = None

Name and version of the program (software) that ran the trajectory<br />
Program and version are both displayed in the overview page<br/>
<span style="color:red">WARNING: Check the database search page in order to see current values</span>

In [14]:
inputs['program'] = 'GROMACS'
inputs['version'] = '5.0.4'

Type of molecular dynamics.<br/>
At this moment there are only two options in this field: 'trajectory' and 'ensemble'.<br/>
Note that this field has an effect on the client. Some time-dependent analyses change the labels of their axes to match, e.g. the RMSD X axis will be 'frames' instead of 'time'.

In [None]:
inputs['type'] = 'trajectory'

MD method<br />
e.g. 'Classical MD', 'Targeted MD', 'Biased MD (Accelerated Weighted Ensemble)', 'Enhanced sampling (Hamiltonian Replica Exchange)' ...<br />
The MD method is displayed in the overview

In [15]:
inputs['method'] = 'Classical MD combined with Adaptive sampling method (FAST-pocket)'

License and link to the license web page<br />
The license is displayed in the overview page. Under the license there is a 'More information' button. The link is used to redirect the user when the button is clicked

In [16]:
inputs['license'] = ("This trajectory dataset is released under a Creative Commons "
           "Attribution 4.0 International Public License")
inputs['linkcense'] = "https://creativecommons.org/licenses/by/4.0/"

Citation for referring to this simulation. The citation is displayed in the overview page<br />
To set a citation, use the following instructions:<br />
To add a line break, type '(br)' inside the citation string<br />
To add superscript text, type '^' before each character
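
As an illustration of these two markers, the sketch below converts a citation string to HTML. This is only an assumption about how a client could interpret them, not the web client's actual code; `render_citation` is a hypothetical helper and the sample string is made up.

```python
import re

def render_citation(citation):
    """Convert '(br)' and '^' markers to HTML (hypothetical helper)."""
    # '(br)' marks a line break
    html = citation.replace('(br)', '<br />')
    # '^' marks the single character after it as superscript
    html = re.sub(r'\^(.)', r'<sup>\1</sup>', html)
    return html

rendered = render_citation('First line(br)Second line with x^2 and 10^-^3')
```

Note that every superscript character needs its own `^`, which is why `10^-^3` is written with two markers.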

In [17]:
inputs['citation'] = 'Vithani N, Ward MD, Zimmerman MI, Novak B, Borowsky JH, Singh S, Bowman GR (2021) SARS-CoV-2 Nsp16 activation mechanism and a cryptic pocket with pan-coronavirus antiviral potential. Biophys. J. 120: 2880-2889'

Acknowledgements to be shown in the overview page

In [18]:
inputs['thanks'] = 'We are grateful to the citizen scientists who contribute to Folding@home by running simulations on their personal computers. This work was funded by NSF CAREER Award MCB-1552471, NSF RAPID 58628, NIH R01 GM124007 and RF1AG067194'

Links to somewhere related to the simulation


WARNING: This field has no effect anywhere in our work stream, BUT others may rely on it. MolSSI uses this field to find simulations in our database and then place the embedded viewer on their website. You must follow the standard when adding a new MolSSI simulation.

#### Example:

```python
[
    {
        'name': 'Data source',
        'url': 'https://data.source.org/'
    },
    {...}
]
```

In [19]:
inputs['links'] = []

## Simulation parameters <a name="simulation"></a>

These inputs may be automatically mined from the topology and trajectory files<br />
However, they may also be forced here<br />
<span style="color:red">DANI: None of this is true. Metadata mining for these values almost never works</span><br />
<span style="color:red">DANI: I stopped maintaining it a while ago, so all values must be set by hand</span>

Length is an important value since it is used in many graph axes in the web client

In [20]:
inputs['length'] = 1000 # In nanoseconds (ns)
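
If the length is not known directly, it can be derived from the number of saved frames and the time between them. The sketch below is only an illustration: `trajectory_length_ns` is a hypothetical helper, and the frame count and interval are made-up values, not taken from this simulation.

```python
def trajectory_length_ns(n_frames, frame_interval_ps):
    """Total simulated time in nanoseconds (hypothetical helper).

    Assumes frames are evenly spaced and the first frame is at time zero,
    so n_frames frames span (n_frames - 1) intervals.
    """
    return (n_frames - 1) * frame_interval_ps / 1000.0

# Made-up example: 10001 frames saved every 100 ps
example_length = trajectory_length_ns(10001, 100.0)
```

Whether the first frame counts as time zero depends on how the trajectory was written, so the `n_frames - 1` convention should be checked against the actual files.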

The rest of the values are displayed in the web client as trajectory metadata<br />
These values do not affect other outcomes<br/>
<span style="color:red">WARNING: Check the database search page in order to see current values</span><br/>
<span style="color:red">WARNING: In particular, check already existing force fields and how they are named</span>

In [21]:
inputs['temp'] = None # In Kelvin (K)
inputs['ensemble'] = None # e.g. NVT, NPT, etc.
inputs['timestep'] = None # In fs (fs/step)
inputs['ff'] = 'Amber ff03' # Force fields (e.g. ['CHARMM36'])
inputs['wat'] = 'TIP3P' # Water force field (e.g. TIP3P)
inputs['boxtype'] = 'Dodecahedron' # e.g. Triclinic, Cubic, Dodecahedron

## Export

Finally, export everything to JSON format

In [22]:
import json

# Export it to json
inputs_filename = 'inputs.json'
with open(inputs_filename, 'w') as file:
    json.dump(inputs, file, indent=4)
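
To confirm that an export like this worked, the file can be read back and compared with the original dictionary. A minimal sketch, using a stand-in dictionary (`example_inputs`) and a hypothetical file name in a temporary directory so that it does not overwrite the real `inputs.json`:

```python
import json
import os
import tempfile

# Stand-in for the real `inputs` dictionary built above
example_inputs = {'name': 'Example', 'pdbIds': ['6W4H'], 'contact': None}

# Write to a temporary directory to avoid touching the real file
example_path = os.path.join(tempfile.mkdtemp(), 'inputs_check.json')
with open(example_path, 'w') as file:
    json.dump(example_inputs, file, indent=4)

# Read the file back; Python None is stored as JSON null and restored as None
with open(example_path) as file:
    reloaded = json.load(file)

round_trip_ok = reloaded == example_inputs
```

Values left as `None` in the notebook are written as `null` in the JSON file, so they remain present as keys rather than being dropped.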