In [1]:
%load_ext autoreload
%autoreload 2

# Molecular Systems, Items and Forms

In [2]:
from molsysmt.tools import molecular_systems as tools_molecular_systems
from molsysmt.tools import items as tools_items
from molsysmt.molecular_system import MolecularSystem
from molsysmt import demo_systems



In [None]:
help(tools_molecular_systems)

In [None]:
help(tools_items)

- Items may carry a topology, coordinates or a trajectory, or a box.
A molecular system can be described by a single item or by several. For example, a system may come with its topology, coordinates and box in separate items, or in a single item, or as a topology plus a trajectory.

## Tests

In [3]:
item = demo_systems.files['1tcd.mmtf']

In [4]:
tools_molecular_systems.is_a_single_molecular_system(item)

True

In [5]:
tools_molecular_systems.where_bonds_in_molecular_system(item)

('/home/diego/Proyectos/MolSysMT/molsysmt/demo_systems/1tcd.mmtf', 'mmtf')

In [9]:
molecular_system = MolecularSystem(item)

In [10]:
molecular_system.__dict__

{'topology_item': '/home/diego/Proyectos/MolSysMT/molsysmt/demo_systems/1tcd.mmtf',
 'topology_form': 'mmtf',
 'bonds_item': '/home/diego/Proyectos/MolSysMT/molsysmt/demo_systems/1tcd.mmtf',
 'bonds_form': 'mmtf',
 'parameters_item': None,
 'parameters_form': None,
 'trajectory_item': '/home/diego/Proyectos/MolSysMT/molsysmt/demo_systems/1tcd.mmtf',
 'trajectory_form': 'mmtf',
 'coordinates_item': '/home/diego/Proyectos/MolSysMT/molsysmt/demo_systems/1tcd.mmtf',
 'coordinates_form': 'mmtf',
 'box_item': '/home/diego/Proyectos/MolSysMT/molsysmt/demo_systems/1tcd.mmtf',
 'box_form': 'mmtf'}

In [None]:
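import molsysmt as msm  # `msm` is used from here on but was never imported above

# A molecular system described by two items: topology (prmtop) and coordinates (inpcrd)
prmtop_file = msm.demo_systems.files['pentalanine.prmtop']
inpcrd_file = msm.demo_systems.files['pentalanine.inpcrd']
ms = msm.molecular_system.MolecularSystem([prmtop_file, inpcrd_file])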

## Cases I want to handle

### A single molecular system

- The molecular system has only one topology.
- The molecular system has only one sequence of frames or coordinates.
- With a single item, the case is clear.
- For the box, I take the one that comes with the coordinates, or the last one in the list.
- Only if topologies <= 2 and coordinates <= 2.
- There could be more than 2 items if, for example, a third one carries only a box.

    - 0 topologies, 0 coordinates -> -
    - 0 topologies, 1 coordinates -> Yes
    - 0 topologies, 2 coordinates -> No

    - 1 topology, 0 coordinates -> Yes
    - 1 topology, 1 coordinates -> Yes
    - 1 topology, 2 coordinates -> Yes if the topology comes with coordinates; otherwise no.

    - 2 topologies, 0 coordinates -> No
    - 2 topologies, 1 coordinates -> Yes if the coordinates come with a topology; otherwise no.
    - 2 topologies, 2 coordinates -> No

- If there is an item with a box but neither coordinates nor topology, that box is the one taken.
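
The table above can be written down as a small decision function. This is only a sketch of the rules as listed, not the code behind `is_a_single_molecular_system`. `ItemInfo` and the function name are hypothetical. The sketch assumes "the topology comes with coordinates" means that exactly one item carries both a topology and coordinates.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ItemInfo:
    """Hypothetical summary of what a single item carries."""
    has_topology: bool = False
    has_coordinates: bool = False
    has_box: bool = False


def is_single_molecular_system(items: List[ItemInfo]) -> Optional[bool]:
    """Apply the topology/coordinates table to a list of items.

    Returns None for the undefined case (0 topologies, 0 coordinates).
    Items that carry only a box do not affect the answer.
    """
    n_top = sum(i.has_topology for i in items)
    n_coords = sum(i.has_coordinates for i in items)
    # Items carrying both a topology and coordinates (assumed meaning of
    # "the topology comes with coordinates").
    n_both = sum(i.has_topology and i.has_coordinates for i in items)

    if n_top > 2 or n_coords > 2:
        return False
    if n_top == 0 and n_coords == 0:
        return None
    if n_top <= 1 and n_coords <= 1:
        return True
    if (n_top, n_coords) in [(1, 2), (2, 1)]:
        return n_both == 1
    # Remaining cases: (0, 2), (2, 0) and (2, 2).
    return False
```

For instance, an item with topology and coordinates plus a second item with coordinates only (such as a structure file plus a trajectory file) gives `True`, while one topology-only item plus two coordinates-only items gives `False`.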

Multiple molecular systems in a single list are only possible if every topology comes with its own coordinates. The box must be the same for all of them; otherwise the last one is taken.

### More than one molecular system

- A list of items not defined as a single molecular system.
- A list of lists, where each sublist is a molecular system.
- A list of items with some list inside, when a molecular system is defined by more than one item.
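
A minimal sketch of how the three input shapes above could be brought to one common shape, a list of lists with one sublist per molecular system. The function name is hypothetical. It assumes the input was already found not to be a single molecular system, so each bare item counts as a system on its own.

```python
def split_into_molecular_systems(items):
    """Return a list of lists, one sublist per molecular system.

    Bare items become one-item systems; nested lists or tuples are kept
    together as the items of one system.
    """
    return [list(entry) if isinstance(entry, (list, tuple)) else [entry]
            for entry in items]


# Mixed case: the first system is defined by two items, the second by one.
systems = split_into_molecular_systems(
    [["pentalanine.prmtop", "pentalanine.inpcrd"], "1tcd.mmtf"]
)
```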

I have now exposed has_topology and has_coordinates in msm.tools.items

Is there a method to complete an item? Or to add information to one item from another... for example, a molsysmt.Topology without bonds and an equivalent openmm.Topology with bonds... pass the bonds to the other one. Or, for example, a trajectory without a box, to which I add a box. No.... this should be doable with 'set'

## Forms

Molecular systems can take different forms. The same system can be encoded, for instance, as a pdb file, as a Python object of the mdtraj.Trajectory class, as a UniProt id code or as an amino acid sequence. Not all forms carry the same level of detail: some hold more information and some less, but all are forms of the same molecular system. MolSysMT takes 'form' as the central concept of the multitool. Sometimes we have the system in form A, an mmtf file for example, and a specific analysis with a given tool requires converting form A to form B (an mdtraj.Topology). We may then need a third library to modify the system, this time with the system encoded in form C (a parmed.Structure). And so on. Usually, the documentation of those libraries explains how to convert between these forms and how to invoke those analyses. To save the time spent connecting those pieces, MolSysMT provides a framework where different tools, native or coming from other libraries, can be easily plugged together to build the pipeline that makes up the workflow you need.
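
The idea of chaining forms A -> B -> C can be illustrated with a toy converter registry. This is not how MolSysMT is implemented. `CONVERTERS`, `find_path` and `convert` are hypothetical names, and the converters only wrap a string instead of calling MDTraj or ParmEd.

```python
from collections import deque

# Hypothetical converters keyed by (from_form, to_form). Real ones would call
# the corresponding library; these only tag a string so the chain is visible.
CONVERTERS = {
    ("mmtf", "mdtraj.Topology"): lambda item: f"mdtraj.Topology<{item}>",
    ("mdtraj.Topology", "parmed.Structure"): lambda item: f"parmed.Structure<{item}>",
}


def find_path(from_form, to_form):
    """Breadth-first search for the shortest chain of forms, or None."""
    queue = deque([[from_form]])
    seen = {from_form}
    while queue:
        path = queue.popleft()
        if path[-1] == to_form:
            return path
        for src, dst in CONVERTERS:
            if src == path[-1] and dst not in seen:
                seen.add(dst)
                queue.append(path + [dst])
    return None


def convert(item, from_form, to_form):
    """Apply the registered converters along the shortest chain of forms."""
    path = find_path(from_form, to_form)
    if path is None:
        raise ValueError(f"no conversion from {from_form!r} to {to_form!r}")
    for src, dst in zip(path, path[1:]):
        item = CONVERTERS[(src, dst)](item)
    return item


result = convert("1tcd.mmtf", "mmtf", "parmed.Structure")
# result == "parmed.Structure<mdtraj.Topology<1tcd.mmtf>>"
```

Keeping the converters in one table means a new form can be plugged in by registering a single pair, and any longer conversion chain is found from the existing entries.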

At this moment these are the forms MolSysMT can handle.

## Files

The up-to-date list of file-type forms can be printed with the method `MolSysMT.info_forms()`.

In [None]:
msm.info_forms(form_type='file')

## Classes

MolSysMT works with Python classes coming from many other libraries, such as MDTraj, PyTraj, MDAnalysis, OpenMM and ParmEd among others, as well as with some native classes.

In [None]:
msm.info_forms(form_type='class')

## Ids

There are several databases or encoding systems where molecular systems take the form of a string of characters. This is the case of the Protein Data Bank, the ChEMBL database or the UniProt codes. The following table summarizes the list of Ids recognized by MolSysMT.

In [None]:
msm.info_forms(form_type='id')

Notice that the form names here end with ':id'. This suffix distinguishes them from other form types: 'pdb' is the form name of a file, whereas 'pdb:id' is the id form.

## Sequences

Molecular systems can be determined by a sequence of elements. For instance, a peptide such as Met-enkephalin can be defined by its amino acid sequence. These are the sequence-type forms MolSysMT can handle:

In [None]:
msm.info_forms(form_type='seq')

## Viewers

The last form of a molecular system we usually need is its graphical representation: the viewer. MolSysMT treats viewers as just another form type. These are the viewers MolSysMT can work with:

In [None]:
msm.info_forms(form_type='viewer')