# Quickstart guide

## Get ready

Install MolSysMT in your conda env:

```bash
conda install -c uibcdf molsysmt
```

Open a Jupyter notebook and import the library:

In [2]:
import molsysmt as msm



## A molecular system

Let's define a first molecular system with a PDB id.

In [3]:
molecular_system = '181L'

In MolSysMT's language, a molecular system can have different forms. Not every form has the same attributes, but they are all different representations of the same system. For example, a PDB id is a form that stands for the atom names, groups (residues), chains, and so on, together with the spatial coordinates of one or more structures.

In [4]:
form = msm.get_form(molecular_system)
print(f'The molecular system has the "{form}" form')

The molecular system has the "string:pdb_id" form
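
As a rough illustration of how a bare string might be recognized as a PDB id, the sketch below checks the classic four-character pattern (a digit followed by three alphanumeric characters). The helper `looks_like_pdb_id` is hypothetical and is not MolSysMT's actual detection logic.

```python
import re

# Classic PDB ids: one digit (1-9) followed by three alphanumeric characters.
_PDB_ID_PATTERN = re.compile(r'[1-9][A-Za-z0-9]{3}')

def looks_like_pdb_id(item):
    """Return True if `item` is a string shaped like a classic PDB id (hypothetical helper)."""
    return isinstance(item, str) and _PDB_ID_PATTERN.fullmatch(item) is not None

print(looks_like_pdb_id('181L'))      # True
print(looks_like_pdb_id('181L.pdb'))  # False
```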


We can get a description of any molecular system with the help of the `molsysmt.info` function:

In [5]:
msm.info(molecular_system)

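| form | n_atoms | n_groups | n_components | n_chains | n_molecules | n_entities | n_waters | n_ions | n_small_molecules | n_proteins | n_structures |
|---|---|---|---|---|---|---|---|---|---|---|---|
| string:pdb_id | 1441 | 302 | 141 | 6 | 141 | 5 | 136 | 2 | 2 | 1 | 1 |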


A PDB id, the corresponding pdb or mmtf file, a trajectory object of mdtraj or pytraj, an NGLWidget of nglview, or a native MolSys object of MolSysMT can all be different forms of the same molecular system.

To continue with this quickstart guide, we need a suitable form of our system. Let's work with the native form `molsysmt.MolSys`, since working with the PDB id all the time is not very efficient. Form conversions are done with one of the most important basic functions of MolSysMT: `molsysmt.convert()`.

In [6]:
molecular_system = msm.convert(molecular_system, to_form='molsysmt.MolSys')

old_form = form
form = msm.get_form(molecular_system)
print(f'The molecular system was converted from "{old_form}" to "{form}"')

The molecular system was converted from "string:pdb_id" to "molsysmt.MolSys"


```{admonition} Note
:class: note
MolSysMT includes some native forms such as 'molsysmt.MolSys', 'molsysmt.Topology' or 'molsysmt.Structures'.
```

## Elements selection

One of the most common steps in any workflow with a molecular system is the selection of elements.

A molecular system is composed of elements. The basic and fundamental element is the "atom". Other elements, such as "group", "component", "molecule", "chain" or "entity", define hierarchical levels of groups of atoms.

### Some selection examples

Let's get a general information summary regarding "entity" elements:

In [7]:
msm.info(molecular_system, element='entity')

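| index | name | type | n atoms | n groups | n components | n chains | n molecules |
|---|---|---|---|---|---|---|---|
| 0 | T4 lysozyme | protein | 1289 | 162 | 1 | 1 | 1 |
| 1 | Chloride ion | ion | 2 | 2 | 2 | 2 | 2 |
| 2 | 2-hydroxyethyl disulfide | small molecule | 8 | 1 | 1 | 1 | 1 |
| 3 | Benzene | small molecule | 6 | 1 | 1 | 1 | 1 |
| 4 | water | water | 136 | 136 | 136 | 1 | 136 |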


To quickly introduce how the selection tool works in MolSysMT, let's get the atom indices of the entities of type "ion":

In [8]:
ions = msm.select(molecular_system, selection='entity_type=="ion"')

In [9]:
print(ions)

[1289 1290]


To verify that these atom indices indeed correspond to ions, let's use the function `molsysmt.info` again, this time with the input argument `element='atom'`:

In [10]:
msm.info(molecular_system, element='atom', selection=ions)

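| index | id | name | type | group index | group id | group name | group type | component index | chain index | molecule index | molecule type | entity index | entity name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1289 | 1290 | CL | Cl | 162 | 173 | CL | ion | 1 | 1 | 1 | ion | 1 | Chloride ion |
| 1290 | 1291 | CL | Cl | 163 | 178 | CL | ion | 2 | 2 | 2 | ion | 1 | Chloride ion |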


The selection tool works as expected, but the selection condition was very simple. The following lines show some slightly more complicated cases. For example, can spatial restrictions be included in our selections with MolSysMT? The answer is yes. Check this out:

In [11]:
CAs_in_contact = msm.select(molecular_system, selection='atom_name=="CA" within 5.0 angstroms of @ions')

In [12]:
print(CAs_in_contact)

[ 385 1115 1122 1129 1137]
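
The spatial condition amounts to a distance filter. The following minimal sketch, in plain Python, keeps the points that lie within a cutoff of any reference point. The ion coordinates are those reported for 181L in the "Getting attributes" section of this guide (in nanometers, rounded); the candidate coordinates and the helper `within` are made up for illustration and do not reflect how MolSysMT implements the selection.

```python
import math

def within(candidates, references, cutoff):
    """Indices of candidate points no farther than `cutoff` from any reference point."""
    return [
        index
        for index, point in enumerate(candidates)
        if any(math.dist(point, ref) <= cutoff for ref in references)
    ]

# Ion coordinates in nanometers (rounded from the 181L output in this guide).
ions_xyz = [(4.3141, 1.6447, 0.1769), (3.1832, 1.5670, 2.3875)]

# Hypothetical candidate coordinates in nanometers.
candidates_xyz = [(4.30, 1.60, 0.50), (0.00, 0.00, 0.00), (3.20, 1.90, 2.40)]

cutoff = 5.0 / 10.0  # 5.0 angstroms expressed in nanometers
print(within(candidates_xyz, ions_xyz, cutoff))  # [0, 2]
```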


OK, but what if we want the groups (residues in this case) rather than the atoms fulfilling the former condition?

In [13]:
residues_in_contact = msm.select(molecular_system, element='group', selection='atom_name=="CA" within 5.0 angstroms of @ions')

In [14]:
residues_in_contact

array([ 48, 141, 142, 143, 144])
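
Conceptually, changing `element` can be thought of as mapping the selected atoms onto the elements that contain them and dropping repetitions. A minimal sketch, assuming a simple atom-to-group lookup table: the dictionary reuses the two chloride atoms listed earlier and adds an invented pair of atoms sharing one group, and `atoms_to_groups` is a hypothetical helper, not a MolSysMT function.

```python
def atoms_to_groups(atom_indices, group_of_atom):
    """Sorted, unique group indices for the given atom indices."""
    return sorted({group_of_atom[atom] for atom in atom_indices})

group_of_atom = {
    1289: 162,  # chloride ion (from the atom table above)
    1290: 163,  # chloride ion (from the atom table above)
    10: 1,      # invented: two atoms assigned to the same group
    11: 1,
}

print(atoms_to_groups([10, 11, 1289, 1290], group_of_atom))  # [1, 162, 163]
```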

The group indices in `residues_in_contact` must contain the group indices of the atoms in `CAs_in_contact`. This can be checked as follows:

In [15]:
residues_from_CAs_in_contact = msm.get(molecular_system, element='group', selection='atom_index in @CAs_in_contact', group_index=True)

In [16]:
residues_from_CAs_in_contact in residues_in_contact

True
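
Keep in mind that NumPy evaluates `a in b` between two arrays roughly as `(b == a).any()`, so the expression is already satisfied when a single elementwise comparison matches. A stricter test is that every index of one collection appears in the other. The sketch below does this with plain Python sets; the first list is the output shown above, while the second one is assumed to hold the same indices for the sake of the example.

```python
residues_in_contact = [48, 141, 142, 143, 144]           # output shown above
residues_from_CAs_in_contact = [48, 141, 142, 143, 144]  # assumed for illustration

# True only if every index in the second list is also present in the first one.
is_contained = set(residues_from_CAs_in_contact) <= set(residues_in_contact)
print(is_contained)  # True
```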

### Use your favorite selection syntax

You don't need to learn a new selection syntax if you don't want to. You are probably already used to the syntax of other popular and useful tools such as MDTraj. If that's your case, use it:

In [17]:
msm.select(molecular_system, selection='name =~ "C[1-4]"', syntax='MDTraj')

array([1291, 1293, 1299, 1300, 1301, 1302])
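
To see what the pattern in this selection picks up, here is a minimal sketch with Python's `re` module. It assumes that the pattern is applied to each atom name as a regular expression anchored at the start of the name; the list of atom names is invented for illustration.

```python
import re

# Hypothetical atom names, for illustration only.
atom_names = ['C1', 'O1', 'C2', 'S3', 'C5', 'CA', 'C4']

pattern = re.compile(r'C[1-4]')
matches = [name for name in atom_names if pattern.match(name)]
print(matches)  # ['C1', 'C2', 'C4']
```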

### Translate a selection into other syntax

Maybe you need to use a selection in another popular tool such as NGLView, but you don't remember its syntax rules. MolSysMT can also help in this case:

In [18]:
msm.select(molecular_system, element='group', selection='molecule_type=="ion"', to_syntax='NGLView')

'173:B 178:C'
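
The returned string follows NGL's `residue_number:chain` notation. A minimal sketch of how such a string could be assembled from group ids and chain names is shown below. The helper `to_ngl_selection` is hypothetical; the ids and chain names are read off the outputs above.

```python
def to_ngl_selection(groups):
    """Build an NGL-style selection from (group id, chain name) pairs."""
    return ' '.join(f'{group_id}:{chain}' for group_id, chain in groups)

print(to_ngl_selection([(173, 'B'), (178, 'C')]))  # 173:B 178:C
```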

## Getting attributes

Molecular systems, and their elements, have attributes. Those attributes can be the number of atoms, the number of structures, the name of an atom or the id of a chain. MolSysMT includes a function to obtain the values of the attributes of a molecular system, or of a specific set of its elements: `molsysmt.get()`.

Let's show how this function operates with some simple examples. First, some general attributes of the molecular system:

In [19]:
msm.get(molecular_system, n_atoms=True)

1441

In [20]:
msm.get(molecular_system, n_structures=True)

1

In [21]:
msm.get(molecular_system, box_volume=True)

| Magnitude | Units |
|---|---|
| [311.5566139349998] | nanometer³ |


Now let's get some attributes of specific elements:

In [22]:
msm.get(molecular_system, element='atom', selection=[10, 11, 12], atom_name=True, group_name=True)

[array(['C', 'O', 'CB'], dtype=object),
 array(['ASN', 'ASN', 'ASN'], dtype=object)]

In [23]:
msm.get(molecular_system, element='atom', selection='molecule_type=="ion"', coordinates=True)

| Magnitude | Units |
|---|---|
| [[[4.3141 1.6446999999999998 0.17689999999999997] [3.1832 1.567 2.3874999999999997]]] | nanometer |


In [24]:
msm.get(molecular_system, element='chain', selection='molecule_type=="water"', id=True)

array(['F'], dtype=object)

## Tools

MolSysMT has different categories of tools to work with molecular systems. They can be found in the modules `molsysmt.basic`, `molsysmt.build`, `molsysmt.topology`, `molsysmt.structure`, `molsysmt.pbc`, etc. Let's illustrate here how some of these tools work.

```{admonition} Note
:class: note
*MolSysMT is form agnostic*. All tools work no matter the form of the input molecular system.
```
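
One common way to obtain this kind of behavior is to detect the form of the input and dispatch to a form-specific handler. The toy sketch below illustrates that general idea with two made-up forms. It is not MolSysMT's implementation, and every name in it is hypothetical.

```python
# Hypothetical per-form handlers; real forms would wrap files, ids or third-party objects.
def _n_atoms_from_names(item):
    return len(item)

def _n_atoms_from_dict(item):
    return len(item['atom_name'])

# Registry: a predicate that recognizes the form, and the handler for that form.
_HANDLERS = [
    (lambda item: isinstance(item, list), _n_atoms_from_names),
    (lambda item: isinstance(item, dict) and 'atom_name' in item, _n_atoms_from_dict),
]

def n_atoms(item):
    """Return the number of atoms regardless of the (toy) form of `item`."""
    for recognizes, handler in _HANDLERS:
        if recognizes(item):
            return handler(item)
    raise TypeError(f'Unsupported form: {type(item).__name__}')

print(n_atoms(['N', 'CA', 'C', 'O']))        # 4
print(n_atoms({'atom_name': ['CL', 'CL']}))  # 2
```

The caller only sees `n_atoms`; supporting a new form means adding one entry to the registry.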

### Basic

"Basic" tools such as `select`, `get`, `convert`, `add`, or `remove`, can be found in the module `molsysmt.basic`. Let's see some examples:

In [25]:
molecular_system = msm.basic.convert('181L', to_form='pdbfixer.PDBFixer')

In [26]:
msm.basic.get_form(molecular_system)

'pdbfixer.PDBFixer'

In [27]:
msm.basic.contains(molecular_system, waters=True, ions=True, small_molecules=True)

True

In [28]:
molecular_system = msm.basic.remove(molecular_system, selection='molecule_type==["water", "ion", "small molecule"]')

In [29]:
msm.basic.get(molecular_system, n_waters=True, n_ions=True, n_small_molecules=True)

[0, 0, 0]

In [32]:
msm.basic.view(molecular_system, viewer='NGLView')

NGLWidget()

```{admonition} Note
:class: note
All methods defined in the `molsysmt.basic` module can also be invoked from the main level of the library. As such, `molsysmt.convert` is the same method as `molsysmt.basic.convert`.
```

### Build

The module `molsysmt.build` offers tools such as `solvate`, `add_missing_hydrogens`, `build_peptide`, `get_atoms_with_alternate_locations`, or `make_bioassembly`. Let's see some examples:

In [34]:
molecular_system = msm.build.build_peptide('AceAlaAlaAlaNme')

In [35]:
molecular_system = msm.structure.center(molecular_system)

In [36]:
msm.get(molecular_system, n_aminoacids=True, n_groups=True)

[3, 5]
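
These two counts can be read directly off the sequence string. The sketch below assumes that the string is a concatenation of three-letter codes, and treats `Ace` (acetyl) and `Nme` (N-methylamide) as terminal caps rather than amino acids. It only illustrates the bookkeeping, not how `build_peptide` parses its input.

```python
import re

sequence = 'AceAlaAlaAlaNme'

# Split the string into three-letter codes: one capital followed by two lowercase letters.
codes = re.findall(r'[A-Z][a-z]{2}', sequence)

# Terminal caps are groups, but they are not amino acids.
caps = {'Ace', 'Nme'}
n_groups = len(codes)
n_aminoacids = sum(code not in caps for code in codes)

print(codes)                     # ['Ace', 'Ala', 'Ala', 'Ala', 'Nme']
print([n_aminoacids, n_groups])  # [3, 5]
```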

In [37]:
molecular_system = msm.build.solvate(molecular_system, box_shape='truncated octahedral', clearance='14.0 angstroms')

In [38]:
msm.build.is_solvated(molecular_system)

True

In [39]:
molecular_system = msm.pbc.wrap_to_mic(molecular_system)

In [41]:
msm.view(molecular_system, standard=True, with_water_as='surface', viewer='NGLView')

NGLWidget()

### Structure

The module `molsysmt.structure` offers tools such as `get_distances`, `get_center`, `get_contacts`, `translate`, or `fit`. Let's see some examples:

In [43]:
molecular_system = msm.basic.convert('181L', selection='molecule_type=="protein"')

In [44]:
msm.structure.get_distances(molecular_system, selection='atom_index==10', selection_2='atom_index==100')

| Magnitude | Units |
|---|---|
| [[[1.5211279959293365]]] | nanometer |


In [45]:
phi_angles, psi_angles = msm.structure.get_dihedral_angles(molecular_system, selection='group_index==[3,4]', phi=True, psi=True)

In [46]:
phi_angles

| Magnitude | Units |
|---|---|
| [[-65.97266126462469]] | degree |
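
For reference, phi is the dihedral angle defined by the backbone atoms C(i-1), N(i), CA(i) and C(i). The sketch below computes a signed dihedral from four points with the usual projection and `atan2` recipe, in plain Python. The coordinates are made up, and the helper names are hypothetical rather than part of MolSysMT.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle, in degrees, defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))

# Made-up coordinates: the outer bonds are perpendicular when viewed along the central one.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```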


In [47]:
msm.structure.get_contacts(molecular_system, selection='atom_name=="CA"', threshold='9 angstroms')

array([[[ True,  True,  True, ..., False,  True,  True],
        [ True,  True,  True, ..., False, False, False],
        [ True,  True,  True, ..., False, False, False],
        ...,
        [False, False, False, ...,  True,  True,  True],
        [ True, False, False, ...,  True,  True,  True],
        [ True, False, False, ...,  True,  True,  True]]])
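
The output is a boolean matrix per structure: entry (i, j) is `True` when the two selected atoms are closer than the threshold. A minimal sketch of that idea for a single structure, with made-up coordinates and a hypothetical helper `contact_map`:

```python
import math

def contact_map(points, threshold):
    """Square boolean matrix: True where two points are within `threshold`."""
    return [[math.dist(p, q) <= threshold for q in points] for p in points]

# Made-up coordinates in nanometers.
points = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (2.0, 0.0, 0.0)]

threshold = 9.0 / 10.0  # 9 angstroms expressed in nanometers
for row in contact_map(points, threshold):
    print(row)
```

The matrix is symmetric and its diagonal is always `True`, as in the output above.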

### Topology, PBC, Molecular Mechanics...

There are other sets of tools to work with the topology of a molecular system, its periodic boundary conditions, the molecular mechanics of the system, etc. Some of them live in modules already mentioned above, such as `molsysmt.topology` and `molsysmt.pbc`.

## Do you want more?

We hope you found MolSysMT useful. If that is the case, you can either keep looking at the ["Showcase"](index.md) section or visit the ["User Guide"](../user/index.ipynb). Enjoy the trip!