In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import molsysmt as msm



# How to convert a form into another form

The meaning of a molecular system 'form', in the context of MolSysMT, was described previously in the section XXX. MolSysMT has a method to convert one form into another: `molsysmt.convert()`. This method is the keystone of the library, the hinge on which all other methods and tools in MolSysMT turn. It is also the joining piece that connects the pipes of your workflow when you use different Python libraries.

The method `molsysmt.convert()` requires at least two input arguments: the original, pre-existing item in any form accepted by MolSysMT (see XXX), and the name of the output form:

In [3]:
molecular_system = msm.convert('mmtf:1tcd', 'molsysmt.MolSys')

The ID code `1tcd`, fetched from the Protein Data Bank as an MMTF entry, is converted into a native `molsysmt.MolSys` Python object. At this point you probably think that this operation can also be done with the method `molsysmt.load()`, and you are right: `molsysmt.load()` is nothing but an alias of `molsysmt.convert()`. Although redundant, a loading method was included in MolSysMT for the sake of intuitive usability. It could be removed from the library, since `molsysmt.convert()` provides the same functionality.
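To make the idea of a single conversion hub more concrete, here is a minimal toy sketch of a registry that maps `(from_form, to_form)` pairs to converter functions, with `load` defined as a plain alias of `convert`. It is only an illustration under stated assumptions: the names `_CONVERTERS`, `register`, and the toy forms `'pair'` and `'mapping'` are hypothetical, and nothing here describes how MolSysMT is actually implemented.

```python
# Toy model of a form-conversion hub. NOT MolSysMT's internals:
# every name below is hypothetical and exists only for illustration.
from typing import Callable, Dict, Tuple

# (from_form, to_form) -> converter function
_CONVERTERS: Dict[Tuple[str, str], Callable] = {}


def register(from_form: str, to_form: str):
    """Decorator that stores a converter under its (from_form, to_form) key."""
    def decorator(func):
        _CONVERTERS[(from_form, to_form)] = func
        return func
    return decorator


def get_form(item) -> str:
    """Very rough form detection: a tuple is a 'pair', a dict is a 'mapping'."""
    if isinstance(item, tuple):
        return "pair"
    if isinstance(item, dict):
        return "mapping"
    raise TypeError(f"Unknown form for item of type {type(item).__name__}")


@register("pair", "mapping")
def _pair_to_mapping(item):
    key, value = item
    return {key: value}


@register("mapping", "pair")
def _mapping_to_pair(item):
    (key, value), = item.items()  # expects exactly one entry
    return (key, value)


def convert(item, to_form: str):
    """Detect the input form and dispatch to the registered converter."""
    from_form = get_form(item)
    if from_form == to_form:
        return item
    try:
        converter = _CONVERTERS[(from_form, to_form)]
    except KeyError:
        raise NotImplementedError(
            f"No conversion from {from_form!r} to {to_form!r}"
        ) from None
    return converter(item)


# A loading function can simply be another name for the same callable.
load = convert

print(convert(("a", 1), "mapping"))
print(load({"a": 1}, "pair"))
```

With this layout, supporting a new form only requires registering converters for it, and the alias costs nothing to maintain.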

The following cells illustrate some conversions you can do with `molsysmt.convert()`:

In [4]:
msm.convert('pdb:1sux', '1sux.pdb') # fetching a pdb file to save it locally

'1sux.pdb'

In [5]:
msm.convert('mmtf:1sux', '1sux.mmtf') # fetching an mmtf to save it locally

'1sux.mmtf'

In [6]:
pdb_file = msm.demo_systems.files['1tcd.pdb']
molecular_system = msm.convert(pdb_file, 'mdtraj.Trajectory') # loading a pdb file as an mdtraj.Trajectory object

In [7]:
seq_aa1 = msm.convert(molecular_system, 'aminoacids1:seq') # converting an mdtraj.Trajectory into a sequence form

In [8]:
print(seq_aa1)

aminoacids1:KPQPIAAANWKCNGSESLLVPLIETLNAATFDHDVQCVVAPTFLHIPMTKARLTNPKFQIAAQNAITRSGAFTGEVSLQILKDYGISWVVLGHSERRLYYGETNEIVAEKVAQACAAGFHVIVCVGETNEEREAGRTAAVVLTQLAAVAQKLSKEAWSRVVIAYEPVWAIGTGKVATPQQAQEVHELLRRWVRSKLGTDIAAQLRILYGGSVTAKNARTLYQMRDINGFLVGGASLKPEFVEIIEATKSKPQPIAAANWKCNGSESLLVPLIETLNAATFDHDVQCVVAPTFLHIPMTKARLTNPKFQIAAQNAITRSGAFTGEVSLQILKDYGISWVVLGHSERRLYYGETNEIVAEKVAQACAAGFHVIVCVGETNEEREAGRTAAVVLTQLAAVAQKLSKEAWSRVVIAYEPVWAIGTGKVATPQQAQEVHELLRRWVRSKLGTDIAAQLRILYGGSVTAKNARTLYQMRDINGFLVGGASLKPEFVEIIEATKXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
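The sequence above ends in a long run of `X` characters, presumably one per group that is not a standard amino acid (the same PDB file contains water molecules, as the `msm.info()` output in the next section shows). The following sketch shows one common way such a one-letter sequence can be built from three-letter residue names. It is an assumption-laden illustration, not MolSysMT's code: `THREE_TO_ONE` and `to_one_letter` are hypothetical names.

```python
# Illustrative only: a plain-Python three-letter to one-letter translation.
# THREE_TO_ONE and to_one_letter are hypothetical names, not MolSysMT API.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def to_one_letter(group_names, unknown="X"):
    """Map each residue name to its one-letter code; anything else becomes `unknown`."""
    return "".join(THREE_TO_ONE.get(name.upper(), unknown) for name in group_names)


# Three amino acids followed by a water molecule.
print(to_one_letter(["LYS", "PRO", "GLN", "HOH"]))  # prints KPQX
```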


## How to convert just a selection

The conversion can be applied to the entire system or to just a part of it. The input argument `selection` works with most MolSysMT methods, including `molsysmt.convert()`. To learn more about how to perform selections, see the section of this documentation entitled "XXX". For now, let's look at a simple selection to see how it operates:

In [9]:
pdb_file = msm.demo_systems.files['1tcd.pdb']
whole_molecular_system = msm.convert(pdb_file, to_form='openmm.Topology')

In [10]:
msm.info(whole_molecular_system)

form,n_atoms,n_groups,n_components,n_chains,n_molecules,n_entities,n_waters,n_proteins,n_frames
openmm.Topology,3983,662,167,4,167,3,165,2,


In [11]:
molecular_system = msm.convert(pdb_file, to_form='openmm.Topology',
                               selection='molecule_type=="protein"')

In [12]:
msm.info(molecular_system)

form,n_atoms,n_groups,n_components,n_chains,n_molecules,n_entities,n_proteins,n_frames
openmm.Topology,3818,497,2,2,2,2,2,
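Comparing the two `msm.info()` outputs, the selection removed the water molecules and kept only the atoms that belong to proteins. Conceptually, a selection is a filter over per-atom attributes that yields the indices of the atoms to keep. The toy sketch below expresses that idea with plain Python; the atom table and the `select` helper are hypothetical and do not reproduce MolSysMT's selection syntax or parser.

```python
# Conceptual sketch of a selection as a filter over per-atom attributes.
# The table and `select` are hypothetical; this is not MolSysMT's parser.
atoms = [
    {"atom_index": 0, "atom_name": "N",  "group_name": "LYS", "molecule_type": "protein"},
    {"atom_index": 1, "atom_name": "CA", "group_name": "LYS", "molecule_type": "protein"},
    {"atom_index": 2, "atom_name": "C",  "group_name": "LYS", "molecule_type": "protein"},
    {"atom_index": 3, "atom_name": "O",  "group_name": "HOH", "molecule_type": "water"},
]


def select(atom_table, **criteria):
    """Return the indices of atoms whose attributes match every criterion."""
    return [
        atom["atom_index"]
        for atom in atom_table
        if all(atom[key] == value for key, value in criteria.items())
    ]


protein_atoms = select(atoms, molecule_type="protein")
print(protein_atoms)  # prints [0, 1, 2]
```

A converter that accepts a selection can then build the output form from the selected atom indices only.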


## How to combine multiple forms into one

Sometimes a molecular system comes from the combination of more than one form. For example, we can have two files, one with the topology and one with the coordinates, to be converted into a single molecular form:

In [13]:
prmtop_file = msm.demo_systems.files['pentalanine.prmtop']
inpcrd_file = msm.demo_systems.files['pentalanine.inpcrd']
molecular_system = msm.convert([prmtop_file, inpcrd_file], to_form='molsysmt.MolSys')

In [14]:
msm.info(molecular_system)

form,n_atoms,n_groups,n_components,n_chains,n_molecules,n_entities,n_waters,n_peptides,n_frames
molsysmt.MolSys,5207,1722,1716,1,1716,2,1715,1,1
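Combining a topology with coordinates only makes sense when both describe the same number of atoms. The sketch below illustrates that consistency check with toy data classes; `Topology`, `Coordinates`, `System`, and `combine` are hypothetical names chosen for this example and are unrelated to the MolSysMT classes of similar name.

```python
# Toy illustration of merging topology and coordinates into one object.
# All classes here are hypothetical stand-ins, not MolSysMT classes.
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass
class Topology:
    atom_names: List[str]


@dataclass
class Coordinates:
    frames: List[List[Point]]  # one list of atom positions per frame


@dataclass
class System:
    topology: Topology
    coordinates: Coordinates

    @property
    def n_atoms(self) -> int:
        return len(self.topology.atom_names)

    @property
    def n_frames(self) -> int:
        return len(self.coordinates.frames)


def combine(topology: Topology, coordinates: Coordinates) -> System:
    """Merge both pieces after checking that every frame has one position per atom."""
    n_atoms = len(topology.atom_names)
    for index, frame in enumerate(coordinates.frames):
        if len(frame) != n_atoms:
            raise ValueError(
                f"Frame {index} has {len(frame)} positions but the topology has {n_atoms} atoms"
            )
    return System(topology, coordinates)


topology = Topology(atom_names=["N", "CA", "C"])
coordinates = Coordinates(frames=[[(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.2, 0.0)]])
system = combine(topology, coordinates)
print(system.n_atoms, system.n_frames)  # prints 3 1
```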


## How to convert a form into multiple ones at once

The previous section illustrated how to convert multiple forms into one. Let's now see how to produce more than one output form in a single line:

In [15]:
h5_file = msm.demo_systems.files['pentalanine.h5']
topology, trajectory = msm.convert(h5_file, to_form=['molsysmt.Topology','molsysmt.Trajectory'])



In [16]:
msm.info(topology)

form,n_atoms,n_groups,n_components,n_chains,n_molecules,n_entities,n_peptides,n_frames
molsysmt.Topology,62,7,1,1,1,1,1,


In [17]:
msm.info(trajectory)

form,n_atoms,n_groups,n_components,n_chains,n_molecules,n_entities,n_frames
molsysmt.Trajectory,62,,,,,,5000


In [18]:
msm.info([topology, trajectory])

form,n_atoms,n_groups,n_components,n_chains,n_molecules,n_entities,n_peptides,n_frames
"['molsysmt.Topology', 'molsysmt.Trajectory']",62,7,1,1,1,1,1,5000


Let's now combine both forms into one to check that they were properly converted:

In [19]:
pdb_string = msm.convert([topology, trajectory], to_form='.pdb', frame_indices=0)
print(pdb_string)

NotImplementedError: It has not been implemeted yet. Write a new issue in https://github.com/uibcdf/MolSysMT/issues asking for it.

## Some examples with files

In [20]:
PDB_file = msm.demo_systems.files['1brs.pdb']
system_pdbfixer = msm.convert(PDB_file, to_form='pdbfixer.PDBFixer')
system_parmed = msm.convert(PDB_file, to_form='parmed.Structure')

NotImplementedFormError: Either the python library this form belongs to was not found, either this form has not been implemeted yet. In this last case, Write a new issue in https://github.com/uibcdf/MolSysMT/issues asking for it.

In [21]:
MOL2_file = msm.demo_systems.files['caffeine.mol2']
system_openmm = msm.convert(MOL2_file, to_form='openmm.Modeller')
system_mdtraj = msm.convert(MOL2_file, to_form='mdtraj.Trajectory')

AttributeError: 'NoneType' object has no attribute 'shape'

In [22]:
MMTF_file = msm.demo_systems.files['1tcd.mmtf']
system_aminoacids1_seq = msm.convert(MMTF_file, to_form='aminoacids1:seq')
system_molsys = msm.convert(MMTF_file)

In [23]:
print('Form of object system_pdbfixer: ', msm.get_form(system_pdbfixer))
print('Form of object system_parmed: ', msm.get_form(system_parmed))
print('Form of object system_openmm: ', msm.get_form(system_openmm))
print('Form of object system_mdtraj: ', msm.get_form(system_mdtraj))
print('Form of object system_aminoacids1_seq: ', msm.get_form(system_aminoacids1_seq))
print('Form of object system_molsys: ', msm.get_form(system_molsys))

Form of object system_pdbfixer:  pdbfixer.PDBFixer


NameError: name 'system_parmed' is not defined
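The `NameError` above appears to follow from cell `In [20]`: its second conversion raised before `system_parmed` was assigned, so the later `print` fails. One way to keep a batch of conversions from cascading like this is to run each one inside its own `try`/`except` and collect successes and failures separately. The sketch below shows the pattern with a stand-in converter; `run_conversions` and `fake_convert` are hypothetical names, and in a notebook like this one the `converter` argument could be `msm.convert`.

```python
# Pattern sketch: isolate each conversion so one failure does not hide the rest.
# `run_conversions` and `fake_convert` are hypothetical helpers for illustration.
def run_conversions(source, targets, converter):
    """Try every target form; return (results, errors) keyed by variable name."""
    results, errors = {}, {}
    for name, to_form in targets.items():
        try:
            results[name] = converter(source, to_form=to_form)
        except Exception as exc:  # keep going and report the failure later
            errors[name] = exc
    return results, errors


def fake_convert(source, to_form):
    """Stand-in converter: pretends that one form is not available."""
    if to_form == "unsupported.Form":
        raise NotImplementedError(f"{to_form} is not implemented")
    return f"{source} as {to_form}"


targets = {
    "system_a": "supported.Form",
    "system_b": "unsupported.Form",
    "system_c": "another.Form",
}
results, errors = run_conversions("demo.pdb", targets, fake_convert)

for name in targets:
    if name in results:
        print(f"Form of object {name}: converted")
    else:
        print(f"Form of object {name}: failed ({type(errors[name]).__name__})")
```

Because every result lives in a dictionary, the reporting loop never references a variable that was not created.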

A single file can be converted into more than one form in a single line:

In [24]:
h5_file = msm.demo_systems.files['pentalanine.h5']
topology, trajectory = msm.convert(h5_file, to_form=['molsysmt.Topology','molsysmt.Trajectory'])



When the output file path is just a dot followed by the file extension, the output is a string instead of a written file. Let's see how this works when two forms are combined into a PDB string:

In [25]:
pdb_string = msm.convert([topology,trajectory], to_form='.pdb', frame_indices=0)

NotImplementedError: It has not been implemeted yet. Write a new issue in https://github.com/uibcdf/MolSysMT/issues asking for it.
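The convention described above gives the target argument three possible meanings: a bare extension such as `'.pdb'` asks for a string, a path such as `'1sux.pdb'` asks for a written file, and a name such as `'molsysmt.MolSys'` asks for a Python object. The sketch below shows one way such a string could be classified. It is a guess at the logic, not MolSysMT's implementation: `classify_target` is a hypothetical helper, and the extension set is copied from the file types listed by `msm.info_convert()` later in this page.

```python
# Hypothetical classifier for the target argument; not MolSysMT's own logic.
import os

# File extensions as listed by msm.info_convert(..., to_form_type='file') below.
KNOWN_EXTENSIONS = {
    "crd", "dcd", "gro", "h5", "inpcrd", "mdcrd", "mmtf", "mol2", "pdb", "prmtop",
}


def classify_target(to_form: str) -> str:
    """Return 'string', 'file', or 'object' for a target form name."""
    # A bare '.ext' means: produce the file content as a string.
    if to_form.startswith(".") and to_form[1:] in KNOWN_EXTENSIONS:
        return "string"
    # A path ending in a known extension means: write a file.
    _, extension = os.path.splitext(to_form)
    if extension[1:] in KNOWN_EXTENSIONS:
        return "file"
    # Anything else is treated as the name of an in-memory form.
    return "object"


for target in [".pdb", "1sux.pdb", "molsysmt.MolSys", "aminoacids1:seq"]:
    print(target, "->", classify_target(target))
```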

## Some examples with IDs

In [26]:
molecular_system = msm.convert('pdb:1SUX', to_form='mdtraj.Trajectory')
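Items such as `'pdb:1SUX'` or `'mmtf:1tcd'` pair a source prefix with an entry code. As a rough illustration of how such a string can be taken apart, the sketch below splits it at the first colon. The helper `split_id` is hypothetical and is not part of MolSysMT; lowercasing the code relies on PDB IDs being case-insensitive, which is why `'pdb:1SUX'` and `'pdb:1sux'` can refer to the same entry.

```python
# Hypothetical helper: split a 'prefix:code' item into its two parts.
def split_id(item: str):
    """Return (prefix, code) in lowercase, or raise ValueError if malformed."""
    prefix, separator, code = item.partition(":")
    if not separator or not prefix or not code:
        raise ValueError(f"Expected 'prefix:code', got {item!r}")
    return prefix.lower(), code.lower()


print(split_id("pdb:1SUX"))   # prints ('pdb', '1sux')
print(split_id("mmtf:1tcd"))  # prints ('mmtf', '1tcd')
```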

## Conversions implemented in MolSysMT

In [27]:
msm.info_convert(from_form='mdtraj.Trajectory', to_form_type='seq')

Unnamed: 0,aminoacids1:seq,aminoacids3:seq
mdtraj.Trajectory,True,True


In [28]:
msm.info_convert(from_form='mdtraj.Trajectory', to_form_type='file', as_rows='to')

Unnamed: 0,mdtraj.Trajectory
crd,False
dcd,False
gro,False
h5,False
inpcrd,False
mdcrd,False
mmtf,False
mol2,False
pdb,True
prmtop,False


In [29]:
from_list=['pytraj.Trajectory','mdanalysis.Universe']
to_list=['mdtraj.Trajectory', 'openmm.Topology']
msm.info_convert(from_form=from_list, to_form=to_list)

KeyError: 'mdanalysis.Universe'
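The `KeyError` above suggests that `'mdanalysis.Universe'` is not a form name known to the version of MolSysMT used here. When querying a table of available conversions, it can be friendlier to report unknown forms as "unknown" rather than raise. The sketch below illustrates that with a small nested dictionary whose entries are copied from the `msm.info_convert()` outputs shown above; `CONVERSIONS`, `can_convert`, and `report` are hypothetical names, not MolSysMT API.

```python
# Toy conversion table; the True/False values are copied from the outputs above.
# CONVERSIONS, can_convert and report are hypothetical names for illustration.
CONVERSIONS = {
    "mdtraj.Trajectory": {
        "aminoacids1:seq": True,
        "aminoacids3:seq": True,
        "pdb": True,
        "h5": False,
    },
}


def can_convert(from_form, to_form, table=CONVERSIONS):
    """Return True/False when the pair is listed, or None when a form is unknown."""
    row = table.get(from_form)
    if row is None:
        return None
    return row.get(to_form)


def report(from_forms, to_forms, table=CONVERSIONS):
    """Build a {from_form: {to_form: True/False/None}} summary without raising."""
    return {
        from_form: {to_form: can_convert(from_form, to_form, table) for to_form in to_forms}
        for from_form in from_forms
    }


summary = report(["mdtraj.Trajectory", "mdanalysis.Universe"], ["pdb", "h5"])
for from_form, row in summary.items():
    print(from_form, row)
```

Here `None` marks a form that the table does not know about, which keeps it distinct from a known conversion that is simply not implemented (`False`).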