In [1]:
%load_ext autoreload
%autoreload 2

In [3]:
import molsysmt as msm

# How to get attributes from a molecular system

Any attribute of the elements composing a molecular system, such as name, index, id or type, as well as some simple observables, can be obtained with the method `molsysmt.get()`. Let's load a molecular system to play a bit with it:

In [4]:
molecular_system = msm.load('1tcd.mmtf', to_form='molsysmt.MolSys')

As a first example, let's obtain the names of the atoms with indices 32, 33 and 34 (0-based). The method `molsysmt.get()` has an input argument named `target` that sets the kind of element the query works over: 'atom', 'group', 'component', 'chain', 'molecule', 'entity' or 'system'. By default `target` is 'atom'. The input argument `indices` then specifies, by their indices, which elements of that kind the query applies to:

In [5]:
# name of atoms with index 32, 33 or 34
names = msm.get(molecular_system, indices=[32,33,34], name=True)
print('Atom names:',names)

Atom names: ['N' 'CA' 'C']


The number of attributes we can request for these atoms is not limited to one. We can ask `molsysmt.get()` to extract as many attributes as we want:

In [6]:
# name, group index and group name of atoms with index 32, 33 or 34
names, group_indices, group_names = msm.get(molecular_system, target='atom', indices=[32,33,34],
                                            name=True, group_index=True, group_name=True)
print('Atom names:', names)
print('Group indices:', group_indices)
print('Group names:', group_names)

Atom names: ['N' 'CA' 'C']
Group indices: [4 4 4]
Group names: ['ILE' 'ILE' 'ILE']


Notice that if no `indices` list is provided, the method applies to all elements of the chosen `target`. See for example:

In [7]:
# number of atoms in the molecular system
n_atoms = msm.get(molecular_system, target='atom', n_atoms=True)
print(n_atoms)

3983


In [8]:
# number of chains in the molecular system
n_chains = msm.get(molecular_system, target='atom', n_chains=True)
print(n_chains)

4
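
The calling convention used so far (attribute keywords set to `True`, an optional `indices` list, and one result per requested attribute) can be mimicked with a few lines of plain Python. This is only a toy sketch to make the pattern explicit: `toy_get` and the `atoms` records are hypothetical names and values invented for illustration, and nothing here reflects how MolSysMT is implemented internally.

```python
# Toy data: one record per atom (hypothetical values, not read from 1tcd.mmtf).
atoms = [
    {"name": "N", "group_index": 0, "group_name": "LYS"},
    {"name": "CA", "group_index": 0, "group_name": "LYS"},
    {"name": "N", "group_index": 1, "group_name": "PRO"},
    {"name": "CA", "group_index": 1, "group_name": "PRO"},
]


def toy_get(records, indices=None, **attributes):
    """Return the requested attributes of the chosen records.

    One list is returned per keyword set to True, in keyword order.
    A single requested attribute is returned bare, not wrapped in a list.
    """
    if indices is None:
        # No indices given: work over every record.
        indices = range(len(records))
    requested = [key for key, wanted in attributes.items() if wanted]
    results = [[records[i][key] for i in indices] for key in requested]
    return results[0] if len(results) == 1 else results


names = toy_get(atoms, indices=[2, 3], name=True)
names_again, group_names = toy_get(atoms, indices=[2, 3], name=True, group_name=True)
all_group_indices = toy_get(atoms, group_index=True)

print(names)              # ['N', 'CA']
print(group_names)        # ['PRO', 'PRO']
print(all_group_indices)  # [0, 0, 1, 1]
```

As in the `molsysmt.get()` calls above, the results are unpacked in the same order in which the attribute keywords were written.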


The method `msm.get()` can also take the input argument `selection`. The use of `selection` is explained in section XXX. Let's see how `msm.get()` works with `selection` through some simple examples:

In [9]:
# Indices of atoms in group with index 20 (with molsysmt.select()!!!)
msm.select(molecular_system, target='atom', selection='group.index==20')

array([148, 149, 150, 151, 152, 153, 154])

In [10]:
# Names and indices of atoms in group with index 20 (with molsysmt.get()!!!)
msm.get(molecular_system, target='atom', selection='group.index==20', name=True, index=True)

[array(['N', 'CA', 'C', 'O', 'CB', 'CG', 'CD'], dtype=object),
 array([148, 149, 150, 151, 152, 153, 154])]

In [11]:
# number of atoms in molecules of type protein
msm.get(molecular_system, target='atom', selection='molecule.type=="protein"', n_atoms=True)

3818

In [12]:
# number of molecules of type water
msm.get(molecular_system, target='atom', selection='molecule.type=="water"', n_molecules=True)

165
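
The first two cells above return the same atom indices, which suggests reading `selection` as a shortcut: first resolve a condition into indices, then fetch attributes for those indices. The toy sketch below illustrates that reading in plain Python. The names `toy_select`, `toy_get_names` and `in_group_1` are hypothetical, the condition is a Python function rather than a MolSysMT selection string, and the data is invented; it is not MolSysMT's implementation.

```python
# Toy data: one record per atom (hypothetical values).
atoms = [
    {"name": "N", "group_index": 0},
    {"name": "CA", "group_index": 0},
    {"name": "N", "group_index": 1},
    {"name": "CA", "group_index": 1},
    {"name": "C", "group_index": 1},
]


def toy_select(records, condition):
    """Return the indices of the records that satisfy the condition."""
    return [i for i, record in enumerate(records) if condition(record)]


def toy_get_names(records, indices=None, condition=None):
    """Return names for explicit indices, for a condition, or for all records."""
    if condition is not None:
        # Resolve the condition into indices first.
        indices = toy_select(records, condition)
    elif indices is None:
        indices = range(len(records))
    return [records[i]["name"] for i in indices]


def in_group_1(record):
    """Toy counterpart of a condition such as 'group.index==1'."""
    return record["group_index"] == 1


selected = toy_select(atoms, in_group_1)
two_steps = toy_get_names(atoms, indices=selected)
one_step = toy_get_names(atoms, condition=in_group_1)

print(selected)               # [2, 3, 4]
print(two_steps == one_step)  # True
```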

## Table with attributes you can get

The following table lists the attribute arguments you can use in `molsysmt.get()`, together with their meaning and the elements each of them can be used with.

| Property | Meaning | Elements the property applies to |
|:--------|:-------------|:-------------|
| 'index' | **index or indices** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'id' | **id or ids** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'name' | **name or names** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'type' | **type or types** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'atom_index' | **atom index or indices** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'atom_id' | **atom id or ids** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'atom_name' |  **atom name or names** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'atom_type' |  **atom type or types** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'group_index' | **group index or indices** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'group_id' | **group id or ids** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'group_name' | **group name or names** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'group_type' | **group type or types** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'component_index' | **component index or indices** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'component_id' | **component id or ids** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'component_name' | **component name or names** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'component_type' | **component type or types** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'molecule_index' | **molecule index or indices** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'molecule_id' | **molecule id or ids** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'molecule_name' | **molecule name or names** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'molecule_type' | **molecule type or types** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'chain_index' | **chain index or indices** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'chain_id' | **chain id or ids** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'chain_name' | **chain name or names** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'chain_type' | **chain type or types** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'entity_index' | **entity index or indices** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'entity_id' | **entity id or ids** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'entity_name' | **entity name or names** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'entity_type' | **entity type or types** of the list of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity'|
| 'bonded_atoms' | **atoms bonded** to the atoms defined by `indices` or `selection`| 'atom', 'system'|
| 'n_atoms' | **number of atoms** in the set of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity', 'system'|
| 'n_groups' | **number of groups** in the set of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity', 'system'|
| 'n_components' | **number of components** in the set of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity', 'system'|
| 'n_molecules' | **number of molecules** in the set of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity', 'system'|
| 'n_chains' | **number of chains** in the set of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity', 'system'|
| 'n_entities' | **number of entities** in the set of targeted elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity', 'system'|
| 'n_bonds' | **number of bonds** present in the set of elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity', 'system'|
| 'n_aminoacids' | **number of groups of type aminoacid** in the system | 'system'|
| 'n_nucleotides' | **number of groups of type nucleotide** in the system | 'system'|
| 'n_ions' | **number of molecules of type ion** in the system | 'system'|
| 'n_waters' | **number of molecules of type water** in the system | 'system'|
| 'n_cosolutes' | **number of molecules of type cosolute** in the system | 'system'|
| 'n_small_molecules' | **number of molecules of type small molecule** in the system | 'system'|
| 'n_peptides' | **number of molecules of type peptide** in the system | 'system'|
| 'n_proteins' | **number of molecules of type protein** in the system | 'system'|
| 'n_dnas' | **number of molecules of type dna** in the system | 'system'|
| 'n_rnas' | **number of molecules of type rna** in the system | 'system'|
| 'mass' | **mass** of every targeted element defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity', 'system'|
| 'charge' | **formal charge** of every targeted element defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity', 'system'|
| 'coordinates' | **coordinates of atoms** present in the set of elements defined by `indices` or `selection`| 'atom', 'group', 'component', 'chain', 'molecule', 'entity', 'system'|
| 'box' | **box vectors** defining the periodic box of the system (if any)| 'system'|
| 'box_shape' | **box shape** of the periodic box of the system (if any)| 'system'|
| 'box_lengths' | **edge lengths** of the periodic box of the system (if any)| 'system'|
| 'box_angles' | **angles** of the periodic box of the system (if any)| 'system'|
| 'time' | **times** stored in the trajectory| 'system'|
| 'n_frames' | **number of frames** stored in the trajectory| 'system'|
| 'form' | **form** of the molecular system| 'system'|
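
Since every attribute in the table is requested with a keyword set to `True`, a list of attribute names can be turned into keyword arguments with a small helper. The sketch below is a hypothetical convenience, not part of MolSysMT: `attribute_flags` and `SYSTEM_ONLY` are names made up here, and the check only encodes a partial transcription of the table rows that list 'system' as their sole element.

```python
# Attributes that the table above lists for 'system' only (partial transcription).
SYSTEM_ONLY = frozenset({
    "n_aminoacids", "n_nucleotides", "n_ions", "n_waters", "n_proteins",
    "box", "box_shape", "box_lengths", "box_angles",
    "time", "n_frames", "form",
})


def attribute_flags(target, *names):
    """Build the keyword=True flags for the given attribute names.

    Raises ValueError when the table lists an attribute for 'system' only
    and a different target is given.
    """
    for name in names:
        if name in SYSTEM_ONLY and target != "system":
            raise ValueError(
                f"'{name}' is listed for target 'system' only, not '{target}'"
            )
    return {name: True for name in names}


flags = attribute_flags("atom", "name", "group_index", "group_name")
print(flags)
# {'name': True, 'group_index': True, 'group_name': True}

# With MolSysMT and the system loaded earlier, the flags could be forwarded as:
# names, group_indices, group_names = msm.get(molecular_system, target='atom',
#                                             indices=[32, 33, 34], **flags)
```

Python dictionaries preserve insertion order, so unpacking `**flags` passes the keywords in the order the names were given.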

## Some examples over atoms, groups, components, chains, molecules or entities

Here you can find some additional examples where `msm.get()` works over atoms, groups, components, chains, molecules or entities:

In [13]:
# Name of atoms with index 0, 1 or 2
msm.get(molecular_system, target='atom', indices=[0,1,2], name=True)

array(['N', 'CA', 'C'], dtype=object)

In [14]:
# Name of atoms with index 0, 1 or 2 (same query, written with `selection`)
msm.get(molecular_system, target='atom', selection='atom.index in [0,1,2]', name=True)

array(['N', 'CA', 'C'], dtype=object)

In [15]:
# Group id of atoms with index 0, 1 or 2
msm.get(molecular_system, target='atom', selection='atom.index in [0,1,2]', group_id=True)

array([4, 4, 4])

In [16]:
# Number of groups that the atoms with index 0, 1 or 2 belong to
msm.get(molecular_system, target='atom', selection='atom.index in [0,1,2]', n_groups=True)

1

In [17]:
# Id of groups with index 0, 1 or 2
msm.get(molecular_system, target='group', indices=[0,1,2], id=True)

array([4, 5, 6])

In [18]:
# Id of groups with index 0, 1 or 2 (same query, written with `selection`)
msm.get(molecular_system, target='group', selection='group.index in [0,1,2]', id=True)

array([4, 5, 6])

In [19]:
# Index of atoms in groups with index 0 or 1
msm.get(molecular_system, target='group', indices=[0,1], atom_index=True)

array([ 0,  1,  2, ..., 13, 14, 15])

In [20]:
# Index of groups in molecule with index 0
msm.get(molecular_system, target='molecule', indices=0, group_index=True)

array([  0,   1,   2, ..., 494, 495, 496])

In [21]:
# Names of groups in molecule of index 0
msm.get(molecular_system, target='molecule', indices=0, group_name=True)

array(['LYS', 'PRO', 'GLN', ..., 'ALA', 'THR', 'LYS'], dtype=object)

In [22]:
# Number of molecules of type protein
msm.get(molecular_system, target='molecule', selection='molecule.type=="protein"', n_molecules=True)

1

In [23]:
# Number of groups in molecules of type protein
msm.get(molecular_system, target='molecule', selection='molecule.type=="protein"', n_groups=True)

497

In [24]:
# Number of groups in molecules of type water
msm.get(molecular_system, target='group', selection='molecule.type=="water"', n_groups=True)

165

In [25]:
# Name of entity with index 0
msm.get(molecular_system, target='entity', indices=0, name=True)

array(['TRIOSEPHOSPHATE ISOMERASE'], dtype=object)

In [26]:
# Type of entity with index 1
msm.get(molecular_system, target='entity', indices=1, type=True)

array(['water'], dtype=object)

In [27]:
# Number of molecules in entity with index 1
msm.get(molecular_system, target='entity', indices=1, n_molecules=True)

165

In [28]:
# Index of molecules in entity with index 1
msm.get(molecular_system, target='entity', indices=1, molecule_index=True)

array([  1,   2,   3, ..., 163, 164, 165])

In [29]:
# Molecule type of groups with index 10, 11 or 12
msm.get(molecular_system, target='group', indices=[10,11,12], molecule_type=True)

array(['protein', 'protein', 'protein'], dtype=object)

In [30]:
# Molecule type of molecules with index 1 to 9
msm.get(molecular_system, target='molecule', indices=range(1,10), molecule_type=True)

array(['water', 'water', 'water', 'water', 'water', 'water', 'water',
       'water', 'water'], dtype=object)

In [31]:
# Number of groups of type aminoacid
msm.get(molecular_system, target='group', selection='group.type=="aminoacid"', n_groups=True)

497

## Some examples over system

The target 'system' has some specific observables that count the elements of a given type, such as the number of aminoacid groups, ion molecules or protein molecules:

In [32]:
# Number of aminoacid groups in the system
msm.get(molecular_system, target='system', n_aminoacids=True)

497

In [33]:
# Number of ion molecules in the system
msm.get(molecular_system, target='system', n_ions=True)

0

In [34]:
# Number of protein molecules in the system
msm.get(molecular_system, target='system', n_proteins=True)

1

In [35]:
# Number of rna molecules in the system
msm.get(molecular_system, target='system', n_rnas=True)

0

In [36]:
# Number of atoms in the system
msm.get(molecular_system, target='system', n_atoms=True)

3983

In [37]:
# Number of entities in the system
msm.get(molecular_system, target='system', n_entities=True)

2
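
The per-type counters above amount to tallying elements by their type. As a rough illustration in plain Python (not MolSysMT code), the sketch below uses a made-up list of molecule types with the same composition as the outputs shown in this notebook, one protein and 165 waters; `molecule_types` is a hypothetical variable.

```python
from collections import Counter

# Toy list of molecule types with the same counts as the outputs above:
# one protein and 165 waters.
molecule_types = ["protein"] + ["water"] * 165

counts = Counter(molecule_types)

n_proteins = counts["protein"]
n_waters = counts["water"]
n_ions = counts["ion"]  # a Counter reports 0 for a type that never appears

print(n_proteins, n_waters, n_ions)  # 1 165 0
```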

## Getting trajectory times and coordinates, and pbc box attributes