In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import molsysmt as msm



# Covalent chains and blocks

##  How to get covalent chains
Let's first load a molecular system to work with in this section:

In [3]:
molecular_system = msm.demo_systems.files['1tcd.mmtf']
molecular_system = msm.convert(molecular_system)

In [4]:
msm.info(molecular_system)

form,n_atoms,n_groups,n_components,n_chains,n_molecules,n_entities,n_waters,n_proteins,n_frames
molsysmt.MolSys,3983,662,167,4,166,2,165,1,1


MolSysMT includes a method to get all covalent chains in a molecular system that match a given sequence of atom names. To illustrate how `molsysmt.covalent_chains` works, let's extract all segments of atoms C, N, CA and C covalently bound in this order (C-N-CA-C):

In [5]:
covalent_chains = msm.covalent_chains(molecular_system, chain=["C", "N", "CA", "C"],
                                      selection="component_index==0")

In [6]:
covalent_chains.shape

(247, 4)

The output is a two-dimensional NumPy array. The first axis has one entry per chain found in the system, and the second axis has length 4 because the chain was defined with 4 atoms. Each element is an atom index:

In [7]:
covalent_chains

array([[   2,    9,   10,   11],
       [  11,   16,   17,   18],
       [  18,   25,   26,   27],
       ...,
       [1877, 1884, 1885, 1886],
       [1886, 1889, 1890, 1891],
       [1891, 1896, 1897, 1898]])

Let's check that the atom names in one of the obtained chains are correct:

In [8]:
msm.get(molecular_system, selection=covalent_chains[0], name=True)

array(['C', 'N', 'CA', 'C'], dtype=object)

The atom name at each position does not need to be unique: any position of the chain can be given as a list of accepted names. For instance, let's get all 4-atom covalent chains where the first three atoms are C-N-CA, in this order, and the fourth atom is either C or CB:

In [9]:
covalent_chains = msm.covalent_chains(molecular_system, chain=["C", "N", "CA", ["C", "CB"]],
                                      selection="component_index==0")
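
To make the matching rule concrete, here is a minimal pure-Python sketch of a chain search over a bond graph, where each position of the pattern is either one name or a list of accepted names. The helper `find_covalent_chains` and the toy fragment are hypothetical and written only for illustration; this is not how MolSysMT implements `covalent_chains`.

```python
from collections import defaultdict


def find_covalent_chains(atom_names, bonds, chain):
    """Return every path of bonded atoms whose names match `chain`.

    atom_names: list of atom names, indexed by atom index.
    bonds: iterable of (i, j) pairs of bonded atom indices.
    chain: list where each position is a name or a list of accepted names.
    (Hypothetical helper, not part of MolSysMT.)
    """
    # Normalize every position of the pattern to a set of accepted names.
    pattern = [{p} if isinstance(p, str) else set(p) for p in chain]

    # Adjacency list of the bond graph.
    neighbors = defaultdict(set)
    for i, j in bonds:
        neighbors[i].add(j)
        neighbors[j].add(i)

    found = []

    def extend(path):
        # A path as long as the pattern is a complete match.
        if len(path) == len(pattern):
            found.append(list(path))
            return
        accepted = pattern[len(path)]
        # Try every bonded atom not already in the path.
        for nxt in sorted(neighbors[path[-1]]):
            if nxt not in path and atom_names[nxt] in accepted:
                extend(path + [nxt])

    for start, name in enumerate(atom_names):
        if name in pattern[0]:
            extend([start])
    return found


# Toy fragment: C(0)-N(1)-CA(2)-C(3), with CB(4) also bound to CA(2).
names = ["C", "N", "CA", "C", "CB"]
bonds = [(0, 1), (1, 2), (2, 3), (2, 4)]

strict = find_covalent_chains(names, bonds, ["C", "N", "CA", "C"])
relaxed = find_covalent_chains(names, bonds, ["C", "N", "CA", ["C", "CB"]])
print(strict)   # [[0, 1, 2, 3]]
print(relaxed)  # [[0, 1, 2, 3], [0, 1, 2, 4]]
```

Allowing `["C", "CB"]` at the last position adds the chain ending in CB without losing the one ending in C.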

The covalent chains defining the $\phi$, $\psi$, $\omega$ and $\chi_1$ dihedral angles are obtained as follows:

In [10]:
# Covalent chains defining all phi dihedral angles in the molecular system
phi_chains = msm.covalent_chains(molecular_system, chain=["C", "N", "CA", "C"])

In [11]:
# Covalent chains defining all psi dihedral angles in the molecular system
psi_chains = msm.covalent_chains(molecular_system, chain=["N", "CA", "C", "N"])

In [12]:
# Covalent chains defining all omega dihedral angles in the molecular system
omega_chains = msm.covalent_chains(molecular_system, chain=[["CA", "CH3"], "C", "N", ["CA", "CH3"]])

In [13]:
# Covalent chains defining all chi1 dihedral angles in the molecular system
chi1_chains = msm.covalent_chains(molecular_system, chain=["N", "CA", "CB", "CG"])
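
Each row of these arrays holds the four atom indices that define one dihedral angle. As an illustration of what can be done with such a quartet, the sketch below computes the dihedral angle from four Cartesian positions with the usual `atan2` formula. The function `dihedral_angle` and the toy coordinates are assumptions made for this example; they are not MolSysMT calls.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_angle(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four points (hypothetical helper)."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)  # central bond
    b3 = _sub(p3, p2)
    # Normals of the planes (p0, p1, p2) and (p1, p2, p3).
    n1 = _cross(b1, b2)
    n2 = _cross(b2, b3)
    y = _dot(_cross(n1, n2), b2) / math.sqrt(_dot(b2, b2))
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Toy coordinates (arbitrary units) and two quartets of atom indices.
coordinates = [
    (1.0, 0.0, 0.0),  # atom 0
    (0.0, 0.0, 0.0),  # atom 1
    (0.0, 1.0, 0.0),  # atom 2
    (1.0, 1.0, 0.0),  # atom 3: cis to atom 0
    (0.0, 1.0, 1.0),  # atom 4: a quarter turn away from atom 0
]
quartets = [[0, 1, 2, 3], [0, 1, 2, 4]]

angles = [dihedral_angle(*(coordinates[i] for i in q)) for q in quartets]
print(angles)
```

The first quartet is in the cis arrangement, which is the zero of the angle; the second is rotated by a quarter turn around the central bond.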

##  How to get covalent blocks

Let's first load a molecular system to work with in this section:

In [14]:
molecular_system = msm.demo_systems.metenkephalin()

In [15]:
msm.info(molecular_system)

form,n_atoms,n_groups,n_components,n_chains,n_molecules,n_entities,n_peptides,n_frames
molsysmt.MolSys,72,5,1,1,1,1,1,1


In [16]:
blocks = msm.covalent_blocks(molecular_system)

In [17]:
print(len(blocks))

1


In [18]:
print(blocks)

[{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71}]


In [19]:
msm.get(molecular_system, target='atom', selection='atom_name==["C", "N"]', inner_bonded_atoms=True)

array([[19, 21],
       [26, 28],
       [33, 35],
       [53, 55]])

In [20]:
blocks = msm.covalent_blocks(molecular_system, remove_bonds=[[19,21],[33,35]])

In [21]:
print(len(blocks))

3


In [22]:
print(blocks)

[{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20}, {32, 33, 34, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31}, {35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71}]


In [23]:
blocks = msm.covalent_blocks(molecular_system, remove_bonds=[[19,21],[33,35]], output_form='array')

In [24]:
blocks

array([0, 0, 0, ..., 2, 2, 2])
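
The outputs above are consistent with reading a covalent block as a set of atoms that remain connected through covalent bonds once the bonds listed in `remove_bonds` are ignored. Under that assumption, the idea can be sketched as a connected-components search. The functions `covalent_blocks_sketch` and `blocks_as_labels` are hypothetical helpers for illustration, not MolSysMT's implementation.

```python
def covalent_blocks_sketch(n_atoms, bonds, remove_bonds=()):
    """Connected components of the bond graph after dropping some bonds."""
    removed = {frozenset(bond) for bond in remove_bonds}

    # Adjacency list that skips the removed bonds.
    neighbors = {i: set() for i in range(n_atoms)}
    for i, j in bonds:
        if frozenset((i, j)) not in removed:
            neighbors[i].add(j)
            neighbors[j].add(i)

    blocks = []
    visited = set()
    for start in range(n_atoms):
        if start in visited:
            continue
        # Depth-first traversal collecting every atom reachable from `start`.
        block = set()
        stack = [start]
        while stack:
            atom = stack.pop()
            if atom in block:
                continue
            block.add(atom)
            stack.extend(neighbors[atom] - block)
        visited |= block
        blocks.append(block)
    return blocks


def blocks_as_labels(blocks, n_atoms):
    """Per-atom block index, analogous in spirit to the array output above."""
    labels = [None] * n_atoms
    for block_index, block in enumerate(blocks):
        for atom in block:
            labels[atom] = block_index
    return labels


# Toy linear molecule 0-1-2-3-4-5 with two bonds removed.
toy_bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
toy_blocks = covalent_blocks_sketch(6, toy_bonds, remove_bonds=[[1, 2], [3, 4]])
toy_labels = blocks_as_labels(toy_blocks, 6)
print(toy_blocks)  # [{0, 1}, {2, 3}, {4, 5}]
print(toy_labels)  # [0, 0, 1, 1, 2, 2]
```

Removing two bonds from a single connected molecule yields three blocks, as in the met-enkephalin example above.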

## How to get the atom quartets defining the dihedral angles

In [25]:
molecular_system = msm.demo_systems.files['1tcd.mmtf']
molecular_system = msm.convert(molecular_system)

In [51]:
phi_chains = msm.covalent_dihedral_quartets(molecular_system, dihedral_angles='phi')

In [52]:
chi5_chains = msm.covalent_dihedral_quartets(molecular_system, dihedral_angles='chi5')

In [54]:
print(chi5_chains.shape[0])

26


In [57]:
msm.get(molecular_system, target='group', selection='group_name=="ARG"', n_groups=True)

26

#### Phi

| Residue | Atoms | Zero value | Range (degrees)|
| :---: | :---: | :---: | :---: |
| all but PRO | C-N-CA-C | C cis to C | [-180, 180) |
| PRO | C-N-CA-C | C cis to C | ~-90 |

#### Psi

| Residue | Atoms | Zero value | Range (degrees)|
| :---: | :---: | :---: | :---: |
| all | N-CA-C-N | N cis to N | [-180, 180) |

#### Omega

| Residue | Atoms | Zero value | Range (degrees)|
| :---: | :---: | :---: | :---: |
| all | CA-C-N-CA | CA cis to CA | ~180 |
| all | CH3-C-N-CA | CA cis to CA | ~180 |
| all | CA-C-N-CH3 | CA cis to CA | ~180 |

#### Chi1

| Residue | Atoms | Zero value | Range (degrees)|
| :---: | :---: | :---: | :---: |
| ARG | N-CA-CB-CG | CG cis to N | [-180, 180) |
| ASN | N-CA-CB-CG | CG cis to N | [-180, 180) |
| ASP | N-CA-CB-CG | CG cis to N | [-180, 180) |
| CYS | N-CA-CB-SG | SG cis to N | [-180, 180) |
| GLN | N-CA-CB-CG | CG cis to N | [-180, 180) |
| GLU | N-CA-CB-CG | CG cis to N | [-180, 180) |
| HIS | N-CA-CB-CG | CG cis to N | [-180, 180) |
| ILE | N-CA-CB-CG1 | CG1 cis to N | [-180, 180) |
| LEU | N-CA-CB-CG | CG cis to N | [-180, 180) |
| LYS | N-CA-CB-CG | CG cis to N | [-180, 180) |
| MET | N-CA-CB-CG | CG cis to N | [-180, 180) |
| PHE | N-CA-CB-CG | CG cis to N | [-180, 180) |
| PRO | N-CA-CB-CG | CG cis to N | CA-CB is part of ring |
| SER | N-CA-CB-OG | OG cis to N | [-180, 180) |
| THR | N-CA-CB-OG1 | OG1 cis to N | [-180, 180) |
| TRP | N-CA-CB-CG | CG cis to N | [-180, 180) |
| TYR | N-CA-CB-CG | CG cis to N | [-180, 180) |
| VAL | N-CA-CB-CG1 | CG1 cis to N | [-180, 180) |


#### Chi2

| Residue | Atoms | Zero value | Range (degrees)|
| :---: | :---: | :---: | :---: |
| ARG | CA-CB-CG-CD  | CD cis to CA     | [-180, 180) |
| ASN | CA-CB-CG-OD1 | OD1 cis to CA    | [-180, 180) |
| ASP | CA-CB-CG-OD1 | OD1 cis to CA    | [-180, 180) |
| GLN | CA-CB-CG-CD  | CD cis to CA     | [-180, 180) |
| GLU | CA-CB-CG-CD  | CD cis to CA     | [-180, 180) |
| HIS | CA-CB-CG-ND1 | ND1 cis to CA    | [-180, 180) |
| ILE | CA-CB-CG1-CD | CD cis to CA     | [-180, 180) |
| LEU | CA-CB-CG-CD1 | CD1 cis to CA    | [-180, 180) |
| LYS | CA-CB-CG-CD  | CD cis to CA     | [-180, 180) |
| MET | CA-CB-CG-SD  | SD cis to CA     | [-180, 180) |
| PHE | CA-CB-CG-CD1 | CD1 cis to CA    | [-180, 180) |
| PRO | CA-CB-CG-CD  | CD cis to CA     | CB-CG is part of ring |
| TRP | CA-CB-CG-CD1 | CD1 cis to CA    | [-180, 180) |
| TYR | CA-CB-CG-CD1 | CD1 cis to CA    | [-180, 180) |

#### Chi3

| Residue | Atoms | Zero value | Range (degrees)|
| :---: | :---: | :---: | :---: |
| ARG | CB-CG-CD-NE  | NE cis to CB     | [-180, 180) |
| GLN | CB-CG-CD-OE1 | OE1 cis to CB    | [-180, 180) |
| GLU | CB-CG-CD-OE1 | OE1 cis to CB    | [-180, 180) |
| LYS | CB-CG-CD-CE  | CE cis to CB     | [-180, 180) |
| MET | CB-CG-SD-CE  | CE cis to CB     | [-180, 180) |

#### Chi4

| Residue | Atoms | Zero value | Range (degrees)|
| :---: | :---: | :---: | :---: |
| ARG | CG-CD-NE-CZ | CZ cis to CG      | [-180, 180) |
| LYS | CG-CD-CE-NZ | NZ cis to CG      | [-180, 180) |

#### Chi5

| Residue | Atoms | Zero value | Range (degrees)|
| :---: | :---: | :---: | :---: |
| ARG | CD-NE-CZ-NH1 | NH1 cis to CD    | [-180, 180) |
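
The per-residue tables above can be folded into a single chain pattern with name variants, in the same form as the `chain` argument used earlier with `msm.covalent_chains`. The sketch below does this for the Chi1 table. The dictionary `CHI1_ATOMS` transcribes that table, and `merge_into_pattern` is a hypothetical helper; whether MolSysMT builds its patterns this way is not stated here.

```python
# Atom names of the chi1 quartet per residue, transcribed from the Chi1 table.
CHI1_ATOMS = {
    "ARG": ("N", "CA", "CB", "CG"),
    "ASN": ("N", "CA", "CB", "CG"),
    "ASP": ("N", "CA", "CB", "CG"),
    "CYS": ("N", "CA", "CB", "SG"),
    "GLN": ("N", "CA", "CB", "CG"),
    "GLU": ("N", "CA", "CB", "CG"),
    "HIS": ("N", "CA", "CB", "CG"),
    "ILE": ("N", "CA", "CB", "CG1"),
    "LEU": ("N", "CA", "CB", "CG"),
    "LYS": ("N", "CA", "CB", "CG"),
    "MET": ("N", "CA", "CB", "CG"),
    "PHE": ("N", "CA", "CB", "CG"),
    "PRO": ("N", "CA", "CB", "CG"),
    "SER": ("N", "CA", "CB", "OG"),
    "THR": ("N", "CA", "CB", "OG1"),
    "TRP": ("N", "CA", "CB", "CG"),
    "TYR": ("N", "CA", "CB", "CG"),
    "VAL": ("N", "CA", "CB", "CG1"),
}


def merge_into_pattern(quartets_by_residue):
    """Collapse per-residue quartets into one pattern with name variants.

    A position shared by all residues stays a plain string; a position
    with several names becomes a sorted list of accepted names.
    (Hypothetical helper, not part of MolSysMT.)
    """
    pattern = []
    for position in range(4):
        names = sorted({atoms[position] for atoms in quartets_by_residue.values()})
        pattern.append(names[0] if len(names) == 1 else names)
    return pattern


chi1_pattern = merge_into_pattern(CHI1_ATOMS)
print(chi1_pattern)
# ['N', 'CA', 'CB', ['CG', 'CG1', 'OG', 'OG1', 'SG']]
```

Merging discards which residue each variant came from. That is harmless for chi1, where only the last position varies, but for the other angles a merged pattern could in principle match more chains than the tables list.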

In [68]:
psi_chains, psi_blocks = msm.covalent_dihedral_quartets(molecular_system, dihedral_angles='psi', with_blocks=True)