In [None]:
import MDAnalysis as mda
import numpy as np
import yaml

In [None]:
def filter_atom_names(ag1, ag2):
    """Return the sorted atom names that occur in ag1 but not in ag2."""
    return sorted(set(ag1.names) - set(ag2.names))

# What and why?

This notebook introduces the pycomplexes functionality for converting structure files that use non-canonical Amber residue names. Because these residue names encode the protonation state, the same functionality lets us prepare coarse-grained protein structures at different pH values. As an example, we prepared hen egg-white lysozyme (6LYZ) at different pH values with an external tool and demonstrate how to convert the resulting structures to the cplx-structure format.

# Topologies at different pH 

We compare Amber topologies that were prepared at different pH values:

In [None]:
# all-atom coordinate file at pH 5.0
pqr5 = mda.Universe("0.15_80_10_pH5.0_6LYZ.result.pqr")
# all-atom coordinate file at pH 7.0
pqr7 = mda.Universe("0.15_80_10_pH7.0_6LYZ.result.pqr")

In [None]:
print("Total charge at pH 5.0: {:.1f}".format(pqr5.atoms.charges.sum()))
print("Total charge at pH 7.0: {:.1f}".format(pqr7.atoms.charges.sum()))

We observe a change in the total charge.

## Which residue charges changed?

We loop over all residues and check whether any residue names differ between pH 5 and pH 7:

In [None]:
for rid, r5, r7 in zip(pqr5.residues.resids, pqr5.residues.resnames, pqr7.residues.resnames):
    if r5 != r7:
        print("resid {}: {} -> {}".format(rid, r5, r7))

We find that the names of residues 15 and 35 have changed. Since the residue name reflects the protonation state, this indicates that the protonation state of these two residues differs between the two pH values.

We follow the Amber residue naming convention. It is documented in the section "Residue naming conventions" of the Amber Manual:

http://ambermd.org/Manuals.php
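To make the convention concrete, the following sketch maps common Amber protonation-variant residue names back to their canonical three-letter names. The dictionary `AMBER_TO_CANONICAL` and the helper `canonical_resname` are illustrative names introduced here. They are not part of pycomplexes, and the table is only a subset of the names listed in the Amber Manual.

```python
# Illustrative subset of Amber protonation-variant residue names.
# This table is an assumption for demonstration, not the pycomplexes lookup.
AMBER_TO_CANONICAL = {
    "HID": "HIS",  # histidine, protonated at N-delta
    "HIE": "HIS",  # histidine, protonated at N-epsilon
    "HIP": "HIS",  # histidine, protonated at both nitrogens
    "ASH": "ASP",  # protonated (neutral) aspartic acid
    "GLH": "GLU",  # protonated (neutral) glutamic acid
    "LYN": "LYS",  # deprotonated (neutral) lysine
    "CYM": "CYS",  # deprotonated cysteine
    "CYX": "CYS",  # cysteine in a disulfide bridge
}


def canonical_resname(resname):
    """Return the canonical name; names not in the table are passed through."""
    name = resname.upper()
    return AMBER_TO_CANONICAL.get(name, name)


for name in ("HIP", "GLH", "ALA"):
    print("{} -> {}".format(name, canonical_resname(name)))
```

Applied to the residue names printed above, such a lookup shows which canonical residue each variant name belongs to.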

# Comparing residues in different protonation states

In [None]:
res15ph5 = pqr5.atoms.select_atoms("resid 15")
res15ph7 = pqr7.atoms.select_atoms("resid 15")

res35ph5 = pqr5.atoms.select_atoms("resid 35")
res35ph7 = pqr7.atoms.select_atoms("resid 35")

We compare the number of atoms in the residues whose names changed between the two pH values.

In [None]:
print("Number of atoms in residue 15 at pH=5:\t{}\n"
      "Number of atoms in residue 15 at pH=7:\t{}".format(res15ph5.n_atoms, res15ph7.n_atoms))
print("Number of atoms in residue 35 at pH=5:\t{}\n"
      "Number of atoms in residue 35 at pH=7:\t{}".format(res35ph5.n_atoms, res35ph7.n_atoms))

Due to deprotonation, the number of atoms decreases when going from pH 5 to pH 7.

### Which atoms have been added in the course of protonation?

In [None]:
# Print every atom name present at pH 5 but absent at pH 7
print(", ".join(filter_atom_names(res15ph5, res15ph7)))
print(", ".join(filter_atom_names(res35ph5, res35ph7)))

Two hydrogen atoms have been added, one in each residue.

In [None]:
for rid, c5, c7 in zip(pqr5.residues.resids, pqr5.residues.charges, pqr7.residues.charges):
    if abs(c5-c7) > 1e-2:
        print("Residue number {}".format(rid))
        print("  Charge at pH 5.0: {:.2f}".format(c5))
        print("  Charge at pH 7.0: {:.2f}".format(c7))

We can see that the charge of these two residues changed due to deprotonation.
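Why a residue is protonated at pH 5 but not at pH 7 can be illustrated with the Henderson-Hasselbalch relation for a single, non-interacting titratable site. This is only a rough sketch: the pKa value below is a hypothetical placeholder, and the external tool used to prepare the structures may assign protonation states in a different way.

```python
def protonated_fraction(ph, pka):
    """Fraction of a single titratable site that carries its proton.

    Henderson-Hasselbalch for an isolated site: 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Hypothetical pKa chosen for illustration only; not taken from the PQR files.
PKA_EXAMPLE = 6.0

for ph in (5.0, 7.0):
    frac = protonated_fraction(ph, PKA_EXAMPLE)
    print("pH {:.1f}: protonated fraction {:.2f}".format(ph, frac))
```

A site whose pKa lies between the two pH values is mostly protonated at the lower pH and mostly deprotonated at the higher one, which matches the kind of change seen for residues 15 and 35.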

# Using pycomplexes to convert Amber topologies

## Preparing input for convert

We use the prepared pqr files as input for the convert tool of the pycomplexes toolbox.

In [None]:
%%writefile 6LYZ_pH5.0.top
box: [100, 100, 100]
topology:
    A:
        coordinate-file: 0.15_80_10_pH5.0_6LYZ.result.pqr
        domains:
            lysozyme:
                type: rigid
                selection: 'protein and name CA'

In [None]:
%%writefile 6LYZ_pH7.0.top
box: [100, 100, 100]
topology:
    A:
        coordinate-file: 0.15_80_10_pH7.0_6LYZ.result.pqr
        domains:
            lysozyme:
                type: rigid
                selection: 'protein and name CA'
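The two topology files differ only in the pH value that appears in the coordinate file name. When more pH values are needed, the files could be generated from a template instead of being written by hand. The following is a minimal sketch; `TOP_TEMPLATE`, `render_top`, `top_filename` and `write_top` are hypothetical helpers introduced here and are not part of pycomplexes.

```python
import os

# Template mirroring the topology files above; only the pH value varies.
TOP_TEMPLATE = """\
box: [100, 100, 100]
topology:
    A:
        coordinate-file: 0.15_80_10_pH{ph:.1f}_6LYZ.result.pqr
        domains:
            lysozyme:
                type: rigid
                selection: 'protein and name CA'
"""


def render_top(ph):
    """Return the topology text for the given pH value."""
    return TOP_TEMPLATE.format(ph=ph)


def top_filename(ph):
    """Return the topology file name used in this notebook for a pH value."""
    return "6LYZ_pH{:.1f}.top".format(ph)


def write_top(ph, directory="."):
    """Write the topology file for one pH value and return its path."""
    path = os.path.join(directory, top_filename(ph))
    with open(path, "w") as f:
        f.write(render_top(ph))
    return path
```

Calling `write_top(5.0)` and `write_top(7.0)` would produce files with the same content as the two cells above, provided the coordinate files follow the same naming pattern.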

## Applying convert to these files

In [None]:
from pycomplexes import convert

In [None]:
top_fname = "6LYZ_pH5.0.top"
with open(top_fname) as f:
    top = yaml.safe_load(f)

cplx5 = convert.convert(top)

In [None]:
top_fname = "6LYZ_pH7.0.top"
with open(top_fname) as f:
    top = yaml.safe_load(f)

cplx7 = convert.convert(top)

# Comparing complexes topologies at different pH

In [None]:
for (charge5, charge7, resname5, resname7) in zip(cplx5["topologies"][0]["domains"][0]["charges"],
                                                  cplx7["topologies"][0]["domains"][0]["charges"],
                                                  cplx5["topologies"][0]["domains"][0]["beads"],
                                                  cplx7["topologies"][0]["domains"][0]["beads"]):
    if charge5 != charge7:
        print("residue names: {}, {} charges: {:>4}, {:>4}".format(resname5, resname7, charge5, charge7))

We see that the charges of these residues differ between the two pH values.
The Lennard-Jones interaction parameters do not change, since the residues have been renamed to agree with the canonical naming convention.
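Both statements can be checked explicitly by comparing the bead names and the charges of two domains. The sketch below assumes only that a domain is a mapping with the keys `"beads"` and `"charges"`, as used in the loop above. The helper `compare_domains` and the toy values are made up for illustration and are not output of the convert tool.

```python
def compare_domains(dom_a, dom_b):
    """Compare two domain dictionaries bead by bead.

    Returns a tuple (same_beads, changed), where same_beads is True if the
    bead names are identical and changed lists (index, bead, charge_a,
    charge_b) for every bead whose charge differs.
    """
    same_beads = list(dom_a["beads"]) == list(dom_b["beads"])
    changed = [
        (i, bead, qa, qb)
        for i, (bead, qa, qb) in enumerate(
            zip(dom_a["beads"], dom_a["charges"], dom_b["charges"])
        )
        if qa != qb
    ]
    return same_beads, changed


# Toy data with invented charges, for demonstration only.
toy_ph5 = {"beads": ["LYS", "HIS", "GLU"], "charges": [1.0, 1.0, 0.0]}
toy_ph7 = {"beads": ["LYS", "HIS", "GLU"], "charges": [1.0, 0.0, -1.0]}

same_beads, changed = compare_domains(toy_ph5, toy_ph7)
print("identical bead names:", same_beads)
for index, bead, q5, q7 in changed:
    print("bead {} ({}): {} -> {}".format(index, bead, q5, q7))
```

With the converted structures, the same function could be called on `cplx5["topologies"][0]["domains"][0]` and `cplx7["topologies"][0]["domains"][0]`. Identical bead names mean that the same bead types, and therefore the same Lennard-Jones parameters, are used at both pH values.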

# Preparing cplx-structure files

We can also use the convert tool from the command line to produce a cplx-structure file:

In [None]:
! pycomplexes convert 6LYZ_pH5.0.top 6LYZ_pH5.0.cplx

In [None]:
! pycomplexes convert 6LYZ_pH7.0.top 6LYZ_pH7.0.cplx

In [None]:
!ls *.cplx

The cplx files prepared above can be used with complexes-pp. Their charges were assigned according to the pH at which the structures were prepared.