# Protein preparation
by Mauro Álvarez, Albert Meseguer, Adrià Pérez and Cristina Prat

The `prepareProtein` function of HTMD currently uses PROPKA and PDB2PQR to guess protonation states and place hydrogens. The aim of this project is to write a different algorithm that loosely follows the same strategy.

To decide whether a residue has to be protonated, we need to know which atoms of each amino acid can donate or accept hydrogen bonds.

|Amino acids           |Hydrogen donor atoms|Hydrogen acceptor atoms|
|----------------------|--------------------|-----------------------|
|Arginine (Arg, R)     |NE, NH1 (2), NH2 (2)|                       |
|Asparagine (Asn, N)   |ND2 (2)             |OD1 (2)                |
|Aspartic acid (Asp, D)|                    |OD1 (2), OD2 (2)       |
|Glutamine (Gln, Q)    |NE2 (2)             |OE1 (2)                |
|Glutamic acid (Glu, E)|                    |OE1 (2), OE2 (2)       |
|Histidine (His, H)    |ND1, NE2            |ND1, NE2               |
|Lysine (Lys, K)       |NZ (3)              |                       |
|Serine (Ser, S)       |OG                  |OG (2)                 |
|Threonine (Thr, T)    |OG1                 |OG1 (2)                |
|Tryptophan (Trp, W)   |NE1                 |                       |
|Tyrosine (Tyr, Y)     |OH                  |OH                     |
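
For use in code, the table above can be written as a plain dictionary. The block below is only one possible encoding and is not part of HTMD; the names `HBOND_ATOMS` and `ambivalent_atoms` are hypothetical. The numbers are the counts given in parentheses in the table, with 1 used where the table gives none.

```python
# Hypothetical encoding of the donor/acceptor table: atom name -> count from the table
# (1 when the table gives no count in parentheses).
HBOND_ATOMS = {
    "ARG": {"donors": {"NE": 1, "NH1": 2, "NH2": 2}, "acceptors": {}},
    "ASN": {"donors": {"ND2": 2}, "acceptors": {"OD1": 2}},
    "ASP": {"donors": {}, "acceptors": {"OD1": 2, "OD2": 2}},
    "GLN": {"donors": {"NE2": 2}, "acceptors": {"OE1": 2}},
    "GLU": {"donors": {}, "acceptors": {"OE1": 2, "OE2": 2}},
    "HIS": {"donors": {"ND1": 1, "NE2": 1}, "acceptors": {"ND1": 1, "NE2": 1}},
    "LYS": {"donors": {"NZ": 3}, "acceptors": {}},
    "SER": {"donors": {"OG": 1}, "acceptors": {"OG": 2}},
    "THR": {"donors": {"OG1": 1}, "acceptors": {"OG1": 2}},
    "TRP": {"donors": {"NE1": 1}, "acceptors": {}},
    "TYR": {"donors": {"OH": 1}, "acceptors": {"OH": 1}},
}


def ambivalent_atoms(resname):
    """Return the atoms of a residue that the table lists as both donor and acceptor."""
    entry = HBOND_ATOMS[resname]
    return sorted(set(entry["donors"]) & set(entry["acceptors"]))


print(ambivalent_atoms("HIS"))
```

Atoms that appear in both columns are the ones whose role depends on the environment, which is why histidine gets its own analysis later on.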


The molecule produced by the preparation step has residue names modified according to their protonation. Later system-building functions assume these residue names. Note that support for alternative charge states varies between the forcefields.

Charge +1    |  Neutral   | Charge -1
-------------|------------|----------
 -           |  ASH       | ASP
 -           |  CYS       | CYM
 -           |  GLH       | GLU
HIP          |  HID/HIE   |  -
LYS          |  LYN       |  -
 -           |  TYR       | TYM
ARG          |  AR0       |  -
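
The same table can be kept as a lookup from the standard residue name to its variants. This is a sketch for illustration; `PROTONATION_VARIANTS` and `charge_of` are hypothetical names, and the `HIS` key is the usual PDB name rather than an entry of the table.

```python
# Residue-name variants from the table above, keyed by the standard residue name.
# The inner keys are the formal charges of the table columns.
PROTONATION_VARIANTS = {
    "ASP": {0: ["ASH"], -1: ["ASP"]},
    "CYS": {0: ["CYS"], -1: ["CYM"]},
    "GLU": {0: ["GLH"], -1: ["GLU"]},
    "HIS": {+1: ["HIP"], 0: ["HID", "HIE"]},
    "LYS": {+1: ["LYS"], 0: ["LYN"]},
    "TYR": {0: ["TYR"], -1: ["TYM"]},
    "ARG": {+1: ["ARG"], 0: ["AR0"]},
}


def charge_of(variant):
    """Look up the formal charge that the table assigns to a residue-name variant."""
    for states in PROTONATION_VARIANTS.values():
        for charge, names in states.items():
            if variant in names:
                return charge
    raise KeyError(variant)


print(charge_of("GLH"), charge_of("HIP"), charge_of("TYM"))
```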


In our approach, we do not consider the protonation of the TYR and CYS amino acids:
- The state of CYS is tied to disulphide bonds, which are recorded in the PDB file as SSBOND records. Therefore, we do not need to take it into account.
- TYM is derived from TYR. This amino acid variant has no COOH group in its side chain, and we did not find a TYR patch with an O- atom, so we do not evaluate its protonation.
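
As an illustration of the SSBOND point, the sketch below reads the residues involved in disulphide bonds from the fixed columns of SSBOND records. It is a stand-alone example and not what HTMD does internally; `parse_ssbond` and `bonded_cysteines` are hypothetical helpers, and the example records use made-up residue numbers.

```python
def parse_ssbond(line):
    """Parse one SSBOND record using the fixed columns of the PDB format."""
    return (
        (line[15], int(line[17:21])),  # chain ID and residue number of the first CYS
        (line[29], int(line[31:35])),  # chain ID and residue number of the second CYS
    )


def bonded_cysteines(pdb_lines):
    """Collect every (chain, resid) that takes part in a disulphide bond."""
    bonded = set()
    for line in pdb_lines:
        if line.startswith("SSBOND"):
            bonded.update(parse_ssbond(line))
    return bonded


# Illustrative records only; the residue numbers are made up for the example.
example = [
    "SSBOND {:>3} CYS {} {:>4}    CYS {} {:>4}".format(1, "A", 10, "A", 50),
    "ATOM      1  N   ILE A  16",
]
print(sorted(bonded_cysteines(example)))
```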

First we import the modules needed for the project.

In [1]:
from htmd import *
import numpy as np  # used below as np; imported explicitly instead of relying on the star import

Please cite -- HTMD: High-Throughput Molecular Dynamics for Molecular Discovery
J. Chem. Theory Comput., 2016, 12 (4), pp 1845-1852. 
http://pubs.acs.org/doi/abs/10.1021/acs.jctc.6b00049


You are on the latest HTMD version (1.3.1).


This report uses trypsin (PDB entry 3PTB) as an example.

In [2]:
mol = Molecule('3PTB')

2016-06-30 20:40:14,332 - htmd.molecule.molecule - INFO - Attempting PDB query for 3PTB


We remove the water and the other ligands from the molecule so that only the protein remains.

In [3]:
mol.filter("protein")

2016-06-30 20:40:15,259 - htmd.molecule.molecule - INFO - Removed 72 atoms. 1629 atoms remaining in the molecule.


After that, we make a copy of the molecule, because we only want to consider seven amino acids: ARG, LYS, ASP, GLU, HIS, CYS and TYR. They are of interest because they can act as hydrogen donors or acceptors.

In [4]:
copied = mol.copy()
copied.filter('resname ARG LYS ASP GLU HIS CYS TYR')

2016-06-30 20:40:15,329 - htmd.molecule.molecule - INFO - Removed 1175 atoms. 454 atoms remaining in the molecule.


We create two new variables (*rid* and *rn*) that hold the residue IDs and residue names, so that we can easily iterate over them afterwards.

In [5]:
rid = np.unique(copied.get('resid'))
rn = copied.get('resname', "name CA")
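
Iterating later with `zip(rn, rid)` assumes that both arrays have one entry per residue, in the same order. `np.unique` returns sorted values without duplicates, so the two arrays can drift apart when a residue number repeats (for example through insertion codes) or when the numbers are not ascending. The pure-Python sketch below shows that failure mode on made-up data, with `sorted(set(...))` standing in for `np.unique`.

```python
# Made-up per-CA records as (resid, resname); resid 184 appears twice, as it would
# with an insertion code such as 184 and 184A.
ca_records = [(183, "ASP"), (184, "TYR"), (184, "GLY"), (185, "LYS")]

resnames = [name for _, name in ca_records]
unique_resids = sorted({resid for resid, _ in ca_records})  # stand-in for np.unique

# zip() stops at the shorter list and pairs resid 185 with the wrong residue name
misaligned = list(zip(resnames, unique_resids))

# Taking both fields from the same records keeps every pair aligned
aligned = [(name, resid) for resid, name in ca_records]

print(misaligned)
print(aligned)
```

With HTMD, one way to apply the same idea would be to read both fields through the same selection, for example `copied.get('resid', 'name CA')` next to `copied.get('resname', 'name CA')`, assuming one CA atom per residue.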

## GLU, ASP, ARG and LYS analysis

In [6]:
# First, two lists are created: pos_list (interactions between basic amino acids) and
# neg_list (interactions between acidic amino acids)
pos_list = []
neg_list = []
for resname, resid in zip(rn, rid):
    # Let's start with GLU
    if resname == "GLU":
        # Select the resids whose polar atoms are close enough to the GLU carboxylate
        # oxygens (OE1, OE2) to interact with them through h-bonds
        a = mol.get("resid", sel="name O N OH OG1 OG OE1 NE2 ND2 OE2 SG NE NE1 ND1 NH1 OD1 OD2 NZ NH2 and within 4 of name OE1 OE2 and resid "+str(resid))
        # Remove duplicate resids
        un = np.unique(a)
        for residueID in un:
            try:
                l = len(np.unique(mol.get("resname", sel="resid "+str(residueID))))
                # Some positions in the sequence hold two residues instead of one.
                # To detect this we count the distinct residue names at the position:
                # if there is more than one, we check each of them and, whenever it is
                # acidic, we add the interaction to neg_list
                if l > 1:
                    for i in range(0, l):
                        if np.unique(mol.get("resname", sel="resid "+str(residueID)))[i] in ["GLU", "ASP"] and residueID != resid:
                            res = np.unique(mol.get("resname", sel="resid "+str(residueID)))
                            # Both lists have the same format: resname and resid of our
                            # residue, followed by resname and resid of the residue
                            # that interacts with it
                            tup = (resname, resid, res, residueID)
                            neg_list.append(tup)
                            print(np.unique(mol.get("resname", sel="resid "+str(residueID))), residueID)
                else:
                    # If the position holds a single amino acid, we add the residue
                    # to the list directly
                    if np.unique(mol.get("resname", sel="resid "+str(residueID))) in ["GLU", "ASP"] and residueID != resid:
                        res = np.unique(mol.get("resname", sel="resid "+str(residueID)))
                        tup = (resname, resid, res, residueID)
                        neg_list.append(tup)
                        print(np.unique(mol.get("resname", sel="resid "+str(residueID))), residueID)
            except:
                # If something goes wrong, print a message that indicates the position
                print("Error at " + np.unique(mol.get("resname", sel="resid "+str(residueID))), residueID)
                continue
                
    elif resname == "ASP":
        a = mol.get("resid", sel="name O N OH OG1 OG OE1 NE2 ND2 OE2 SG NE NE1 ND1 NH1 OD1 OD2 NZ NH2 and within 4 of name OD1 OD2 and resid "+str(resid))
        #print(np.unique(a), "resid", resid)
        un = np.unique(a)
        for residueID in un:
            try:
                l = len(np.unique(mol.get("resname", sel="resid "+str(residueID))))
                if l > 1:
                    for i in range(0, l):
                        if np.unique(mol.get("resname", sel="resid "+str(residueID)))[i] in ["GLU", "ASP"] and residueID != resid:
                            res = np.unique(mol.get("resname", sel="resid "+str(residueID)))
                            tup = (resname, resid, res, residueID)
                            neg_list.append(tup)
                            print(np.unique(mol.get("resname", sel="resid "+str(residueID))), residueID)
                else:
                    
                    if np.unique(mol.get("resname", sel="resid "+str(residueID))) in ["GLU", "ASP"] and residueID != resid:
                        res = np.unique(mol.get("resname", sel="resid "+str(residueID)))
                        tup = (resname, resid, res, residueID)
                        neg_list.append(tup)
                        print(np.unique(mol.get("resname", sel="resid "+str(residueID))), residueID)
            except:
                print("Error at " + np.unique(mol.get("resname", sel="resid "+str(residueID))), residueID)
                continue
                
    elif resname == "ARG":
        a = mol.get("resid", sel="name O N OH OG1 OG OE1 NE2 ND2 OE2 SG NE NE1 ND1 NH1 OD1 OD2 NZ NH2 and within 4 of name NH1 NH2 NE and resid "+str(resid))
        #print(np.unique(a), "resid", resid)
        un = np.unique(a)
        for residueID in un:
            try:
                l = len(np.unique(mol.get("resname", sel="resid "+str(residueID))))
                if l > 1:
                    for i in range(0, l):
                        if np.unique(mol.get("resname", sel="resid "+str(residueID)))[i] in ["LYS", "ARG"] and residueID != resid:
                            res = np.unique(mol.get("resname", sel="resid "+str(residueID)))
                            tup = (resname, resid, res, residueID)
                            pos_list.append(tup)
                            print(np.unique(mol.get("resname", sel="resid "+str(residueID))), residueID)
                else:
                    
                    if np.unique(mol.get("resname", sel="resid "+str(residueID))) in ["LYS", "ARG"] and residueID != resid:
                        res = np.unique(mol.get("resname", sel="resid "+str(residueID)))
                        tup = (resname, resid, res, residueID)
                        pos_list.append(tup)
                        print(np.unique(mol.get("resname", sel="resid "+str(residueID))), residueID)
            except:
                print("Error at " + np.unique(mol.get("resname", sel="resid "+str(residueID))), residueID)
                continue
                
    elif resname == "LYS":
        a = mol.get("resid", sel="name O N OH OG1 OG OE1 NE2 ND2 OE2 SG NE NE1 ND1 NH1 OD1 OD2 NZ NH2 and within 4 of name NZ and resid "+str(resid))
        #print(np.unique(a), "resid", resid)
        un = np.unique(a)
        for residueID in un:
            try:
                l = len(np.unique(mol.get("resname", sel="resid "+str(residueID))))
                if l > 1:
                    for i in range(0, l):
                        if np.unique(mol.get("resname", sel="resid "+str(residueID)))[i] in ["LYS", "ARG"] and residueID != resid:
                            res = np.unique(mol.get("resname", sel="resid "+str(residueID)))
                            tup = (resname, resid, res, residueID)
                            pos_list.append(tup)
                            print(np.unique(mol.get("resname", sel="resid "+str(residueID))), residueID)
                else:
                    
                    if np.unique(mol.get("resname", sel="resid "+str(residueID))) in ["LYS", "ARG"] and residueID != resid:
                        res = np.unique(mol.get("resname", sel="resid "+str(residueID)))
                        tup = (resname, resid, res, residueID)
                        pos_list.append(tup)
                        print(np.unique(mol.get("resname", sel="resid "+str(residueID))), residueID)
            except:
                print("Error at " + np.unique(mol.get("resname", sel="resid "+str(residueID))), residueID)
                continue

['ASP'] 71
['GLU'] 80
['GLU'] 70
['GLU'] 77
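
The `within 4 of ...` selections in the cell above are meant to pick atoms lying within 4 Å of the charged side-chain atoms of a residue. The sketch below reproduces that idea on plain tuples, without HTMD. The coordinates are made up, `neighbours_within` is a hypothetical helper, and unlike the cell above it does not restrict the candidates to polar atom names.

```python
import math

# Made-up atoms: (resname, resid, atom name, (x, y, z) in angstroms)
atoms = [
    ("GLU", 70, "OE1", (0.0, 0.0, 0.0)),
    ("GLU", 70, "OE2", (2.2, 0.0, 0.0)),
    ("GLU", 80, "OE1", (0.0, 3.5, 0.0)),
    ("LYS", 60, "NZ", (9.0, 9.0, 9.0)),
]


def neighbours_within(atoms, resid, ref_names, cutoff=4.0):
    """Return the sorted resids of other residues with an atom within `cutoff`
    of any reference atom (atoms of `resid` whose name is in `ref_names`)."""
    reference = [xyz for _, rid, name, xyz in atoms if rid == resid and name in ref_names]
    found = set()
    for _, rid, _, xyz in atoms:
        if rid != resid and any(math.dist(xyz, ref) <= cutoff for ref in reference):
            found.add(rid)
    return sorted(found)


print(neighbours_within(atoms, 70, {"OE1", "OE2"}))
```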


The function *redundancy_eliminator* removes redundant entries from a list. An entry is treated as redundant when its residue already appears as an interaction partner and its partner already appears as a starting residue. It uses three lists:
- *departures* holds the resids of the residues we start from
- *arrivals* holds the resids of the residues they interact with
- *delete* holds the indices of the list elements that are redundant

In [7]:
def redundancy_eliminator(lista):
    departures = []
    arrivals = []
    delete = []
    for i in range(len(lista)):
        if lista[i][1] in arrivals and lista[i][3] in departures:
            # Redundant entry: keep its index so that it can be removed afterwards
            delete.append(i)
        else:
            # Not redundant: record its resids so that later entries can be
            # compared against them
            departures.append(lista[i][1])
            arrivals.append(lista[i][3])
    # Delete from the end so that the remaining indices stay valid
    for index in reversed(delete):
        del lista[index]
    return lista

In [8]:
neg_list = redundancy_eliminator(neg_list)
pos_list = redundancy_eliminator(pos_list)
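
An alternative way to think about the redundancy check is to treat each interaction as an unordered pair of resids. The sketch below does this with a `frozenset`; it is not the authors' function, `drop_mirrored_pairs` is a hypothetical name, and the pairs are made up. It is stricter than *redundancy_eliminator* because it compares whole pairs rather than looking up the two resids independently.

```python
def drop_mirrored_pairs(pairs):
    """Keep the first occurrence of each unordered residue pair."""
    seen = set()
    kept = []
    for resname, resid, partner_name, partner_id in pairs:
        key = frozenset((resid, partner_id))
        if key not in seen:
            seen.add(key)
            kept.append((resname, resid, partner_name, partner_id))
    return kept


# Made-up interactions in the (resname, resid, partner resname, partner resid) format
pairs = [("GLU", 10, "GLU", 30), ("GLU", 20, "ASP", 15), ("GLU", 30, "GLU", 10)]
print(drop_mirrored_pairs(pairs))
```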

In the function *atom_density*, we compute a score for the atoms found within a 6 Å radius of the chosen atoms: each neighbouring atom name contributes a fixed weight, and the weights are summed.

In [9]:
def atom_density(resname, resid):
    # Weights used to score the two residues of a like-charged pair: one table for
    # acidic residues and one for basic residues
    aa_score_acid = {"OH":1, "OG1":2, "OG":2, "NE2": 1, "ND1":1, "O": 2, "OD1": 2, "OE1": 2, "OE2": 2, "OD2": 2,"SG": -1, "N": -1, "NE": -1, "NE1": -1, "ND2": -2, "NH1": -2, "NH2": -2, "NZ": -3 }
    aa_score_base = {"OH":-1, "OG1":-1, "OG":-1, "NE2":-1, "ND1":-1, "O": 2, "OD1": 2, "OE1": 2, "OE2": 2, "OD2": 2,"SG": -1, "N": -1, "NE": -1, "NE1": -1, "ND2": -2, "NH1": -2, "NH2": -2, "NZ": -3 }
    score = 0
    a = []
    if resname == "ASP":
        a = mol.get("name", sel="name O N OH OG1 OG OE1 NE2 ND2 OE2 SG NE NE1 ND1 NH1 OD1 OD2 NZ NH2 and within 6 of name OD1 OD2 and resid "+str(resid))
        for element in a:
            try:
                score += aa_score_acid[element]
            except:
                continue

    elif resname == "GLU":
        a = mol.get("name", sel="name O N OH OG1 OG OE1 NE2 ND2 OE2 SG NE NE1 ND1 NH1 OD1 OD2 NZ NH2 and within 6 of name OE1 OE2 and resid "+str(resid))
        for element in a:
            try:
                score += aa_score_acid[element]
            except:
                continue

    elif resname == "LYS":
        a = mol.get("name", sel="name O N OH OG1 OG OE1 NE2 ND2 OE2 SG NE NE1 ND1 NH1 OD1 OD2 NZ NH2 and within 6 of name NZ and resid "+str(resid))
        for element in a:
            try:
                score += aa_score_base[element]
            except:
                continue

    elif resname == "ARG":
        a = mol.get("name", sel="name O N OH OG1 OG OE1 NE2 ND2 OE2 SG NE NE1 ND1 NH1 OD1 OD2 NZ NH2 and within 6 of name NH1 NH2 NE and resid "+str(resid))
        for element in a:
            try:
                score += aa_score_base[element]
            except:
                continue

    return score
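
The scoring step itself is a weighted sum over atom names, which can be shown without HTMD. In the sketch below the weights are copied from `aa_score_acid` above; `environment_score` is a hypothetical helper and the list of neighbouring atoms is made up.

```python
# Weights copied from aa_score_acid in atom_density above
AA_SCORE_ACID = {
    "OH": 1, "OG1": 2, "OG": 2, "NE2": 1, "ND1": 1, "O": 2, "OD1": 2, "OE1": 2,
    "OE2": 2, "OD2": 2, "SG": -1, "N": -1, "NE": -1, "NE1": -1, "ND2": -2,
    "NH1": -2, "NH2": -2, "NZ": -3,
}


def environment_score(atom_names, weights):
    """Sum the weights of the neighbouring atom names; unknown names count as 0."""
    return sum(weights.get(name, 0) for name in atom_names)


# Made-up neighbourhood: two carboxylate oxygens, one backbone N and one lysine NZ
print(environment_score(["OE1", "OE2", "N", "NZ"], AA_SCORE_ACID))
```

Using `dict.get` with a default of 0 plays the role of the `try`/`except` blocks in *atom_density*: atom names without a weight simply do not contribute.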

Now we update the residue names in the original structure (mol). When two residues of the same charge interact, we use the scores returned by *atom_density* to decide which of them changes its protonation state.

In [10]:
# First, the negatively charged amino acids
for element in neg_list:
    if element[0] == "GLU" and element[2] == "ASP":
        mol.set("resname", "GLH", "resid "+str(element[1]))
    elif element[0] == "ASP" and element[2] == "GLU":
        mol.set("resname", "GLH", "resid "+str(element[3]))
    elif element[0] == element[2]:
        if atom_density(element[0], element[1]) > atom_density(element[2], element[3]): 
            # Amino acid with higher electrodensity is protonated
            if element[0] == "GLU":
                name = "GLH"
            else:
                name = "ASH"
            mol.set("resname", name, "resid " + str(element[1]))
            
        elif atom_density(element[0], element[1]) < atom_density(element[2], element[3]):
            if element[2] == "GLU":
                name = "GLH"
            else:
                name = "ASH"
            mol.set("resname", name, "resid " + str(element[3]))
            
        else:
            if element[2] == "GLU":
                name = "GLH"
            else:
                name = "ASH"
            mol.set("resname", name, "resid " + str(element[1]))

In [11]:
# Second, the positively charged amino acids
for element in pos_list:
    if element[0] == "ARG" and element[2] == "LYS":
        mol.set("resname", "LYN", "resid "+str(element[3]))
    elif element[0] == "LYS" and element[2] == "ARG":
        mol.set("resname", "LYN", "resid "+str(element[1]))
    elif element[0] == element[2]:
        if atom_density(element[0], element[1]) < atom_density(element[2], element[3]): 
            # Amino acid with lower electrodensity is deprotonated
            if element[0] == "LYS":
                name = "LYN"
            else:
                name = "AR0"
            mol.set("resname", name, "resid " + str(element[1]))
            
        elif atom_density(element[0], element[1]) > atom_density(element[2], element[3]):
            if element[2] == "LYS":
                name = "LYN"
            else:
                name = "AR0"
            mol.set("resname", name, "resid " + str(element[3]))
            
        else:
            if element[2] == "LYS":
                name = "LYN"
            else:
                name = "AR0"
            mol.set("resname", name, "resid " + str(element[1]))
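
The decision rules of the two loops above can be condensed into two small pure functions. This is a sketch that mirrors the branches as written, with each residue passed as `(resname, resid, score)`; `resolve_acid_pair` and `resolve_base_pair` are hypothetical names and the scores in the example are made up.

```python
NEUTRAL_ACID = {"GLU": "GLH", "ASP": "ASH"}
NEUTRAL_BASE = {"LYS": "LYN", "ARG": "AR0"}


def resolve_acid_pair(first, second):
    """Return (resid, new resname) for the residue of a GLU/ASP pair that is protonated."""
    (name1, id1, score1), (name2, id2, score2) = first, second
    if name1 != name2:
        # Mixed GLU/ASP pair: the glutamate is the one that is protonated
        return (id1 if name1 == "GLU" else id2), "GLH"
    # Same residue type: the higher score is protonated, ties go to the first residue
    return (id2 if score2 > score1 else id1), NEUTRAL_ACID[name1]


def resolve_base_pair(first, second):
    """Return (resid, new resname) for the residue of a LYS/ARG pair that is deprotonated."""
    (name1, id1, score1), (name2, id2, score2) = first, second
    if name1 != name2:
        # Mixed LYS/ARG pair: the lysine is the one that is deprotonated
        return (id1 if name1 == "LYS" else id2), "LYN"
    # Same residue type: the lower score is deprotonated, ties go to the first residue
    return (id2 if score2 < score1 else id1), NEUTRAL_BASE[name1]


print(resolve_acid_pair(("GLU", 10, 3), ("ASP", 20, 5)))
print(resolve_base_pair(("LYS", 10, -2), ("LYS", 20, -5)))
```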

## HIS analysis

In [12]:
rid = np.unique(copied.get('resid'))
rn = copied.get('resname', "name CA")

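for resname, resid in zip(rn, rid):
    if resname == "HIS":
        # Reset the neighbour lists for every histidine, so that contacts found for a
        # previous histidine do not leak into the score of this one
        nd1_list = []
        ne2_list = []
        print(resname, resid)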
        a = mol.get("resid", sel="name O N OH OG1 OG OE1 NE2 ND2 OE2 SG NE NE1 ND1 NH1 OD1 OD2 NZ NH2 and within 4 of name ND1 and resid "+str(resid))
        un = np.unique(a)
        print ("ND1")
        for residueID in un:
            try:
                l = len(np.unique(mol.get("resname", sel="resid "+str(residueID))))
                if l > 1:
                    for i in range(0, l):
                        if residueID != resid:
                            res = np.unique(mol.get("resname", sel="resid "+str(residueID)))
                            if res == "PRO":
                                continue
                            atoms = mol.get("name" , sel="name O N OH OG1 OG OE1 NE2 ND2 OE2 SG NE NE1 ND1 NH1 OD1 OD2 NZ NH2 and resid "+str(residueID)+" and within 4 of name ND1 and resid "+str(resid))
                            tup = (res, residueID, atoms)
                            nd1_list.append(tup)
                           
                            print(np.unique(mol.get("resname", sel="resid "+str(residueID))), residueID, atoms)
                else:
                    if residueID != resid:
                        res = np.unique(mol.get("resname", sel="resid "+str(residueID)))
                        if res == "PRO":
                                continue
                        atoms = mol.get("name" , sel="name O N OH OG1 OG OE1 NE2 ND2 OE2 SG NE NE1 ND1 NH1 OD1 OD2 NZ NH2 and resid "+str(residueID)+" and within 4 of name ND1 and resid "+str(resid))
                        tup = (res, residueID, atoms)
                        nd1_list.append(tup)
                        print(np.unique(mol.get("resname", sel="resid "+str(residueID))), residueID, atoms)
            except:
                print("Error at " + np.unique(mol.get("resname", sel="resid "+str(residueID))), residueID)
                continue
                
        a = mol.get("resid", sel="name O N OH OG1 OG OE1 NE2 ND2 OE2 SG NE NE1 ND1 NH1 OD1 OD2 NZ NH2 and within 4 of name NE2 and resid "+str(resid))
        un = np.unique(a)
        print ("NE2")
        for residueID in un:
            try:
                l = len(np.unique(mol.get("resname", sel="resid "+str(residueID))))
                if l > 1:
                    for i in range(0, l):
                        if residueID != resid:
                            res = np.unique(mol.get("resname", sel="resid "+str(residueID)))
                            if res == "PRO":
                                continue
                            atoms = mol.get("name" , sel="name O N OH OG1 OG OE1 NE2 ND2 OE2 SG NE NE1 ND1 NH1 OD1 OD2 NZ NH2 and resid "+str(residueID)+" and within 4 of name NE2 and resid "+str(resid))
                            tup = (res, residueID, atoms)
                            ne2_list.append(tup)
                            print(np.unique(mol.get("resname", sel="resid "+str(residueID))), residueID, atoms)
                else:
                    
                    if residueID != resid:
                        res = np.unique(mol.get("resname", sel="resid "+str(residueID)))
                        if res == "PRO":
                                continue
                        atoms = mol.get("name" , sel="name O N OH OG1 OG OE1 NE2 ND2 OE2 SG NE NE1 ND1 NH1 OD1 OD2 NZ NH2 and resid "+str(residueID)+" and within 4 of name NE2 and resid "+str(resid))
                        tup = (res, residueID, atoms)
                        ne2_list.append(tup)
                        
                        print(np.unique(mol.get("resname", sel="resid "+str(residueID))), residueID, atoms)
            except:
                print("Error at " + np.unique(mol.get("resname", sel="resid "+str(residueID))), residueID)
                continue
                
        # To decide which nitrogen has to be protonated we use a simple scoring method
        # based on h-bond capacity
        # (source: http://www.imgt.org/IMGTeducation/Aide-memoire/_UK/aminoacids/charge/)
        # If a neighbouring atom can be both donor and acceptor, the environment is
        # evaluated first and the atom is scored afterwards. Also, if both nitrogens
        # have an acidic amino acid nearby, the histidine is doubly protonated (HIP).
        nd1_negativech = False
        ne2_negativech = False
        nd1_lvl = 0
        ne2_lvl = 0
        aa_score = {"O": 2, "OD1": 2, "OE1": 2, "OE2": 2, "OD2": 2,"SG": -1, "N": -1, "NE": -1, "NE1": -1, "ND2": -2, "NH1": -2, "NH2": -2, "NZ": -3 }
        # Over-score the negative charges (positive ones can be identified)
        # First, we start by iterating for every aa found around HIS
        for aa in nd1_list:
            # Check if a negative aminoacid is near
            if aa[0] in ["ASP", "GLU"]:
                nd1_negativech = True
                nd1_lvl += 2
            # Then, we iterate for every atom found for an specific aminoacid 
            # (usually only 1 atom)
            for atom in aa[2]:
                if atom in aa_score:
                    nd1_lvl += aa_score[atom]
                # There are atoms that can be donors or acceptors, so we check if the index is
                # positive or negative to apply a mitigation effect
                elif atom == "OG": #Serine
                    if nd1_lvl <0:
                        nd1_lvl += 2
                    elif nd1_lvl >= 0:
                        nd1_lvl -=1
                elif atom == "OG1": #Threonine
                    if nd1_lvl <0:
                        nd1_lvl += 2
                    elif nd1_lvl >= 0:
                        nd1_lvl -=1
                elif atom == "OH": #Tyrosine
                    if nd1_lvl <0:
                        nd1_lvl += 1
                    elif nd1_lvl >= 0:
                        nd1_lvl -=1
                elif atom == "NE2": # Other HIS or GLN
                    if aa[0] == "GLN":
                        nd1_lvl -= 2
                    elif aa[0] == "HID":
                        nd1_lvl += 1
                    elif aa[0] == "HIE":
                        nd1_lvl -= 1
                    elif aa[0] == "HIP":
                        nd1_lvl -= 1
                    elif aa[0] == "HIS":
                        if nd1_lvl <0:
                            nd1_lvl += 1
                        elif nd1_lvl >= 0:
                            nd1_lvl -=1    
                elif atom == "ND1": # Other HIS
                    if aa[0] == "HID":
                        nd1_lvl -= 1
                    elif aa[0] == "HIE":
                        nd1_lvl += 1
                    elif aa[0] == "HIP":
                        nd1_lvl -= 1
                    elif aa[0] == "HIS":
                        if nd1_lvl <0:
                            nd1_lvl += 1
                        elif nd1_lvl >= 0:
                            nd1_lvl -=1 
        for aa in ne2_list:
            # Check if a negative aminoacid is near
            if aa[0] in ["ASP", "GLU", "TYM"]:
                ne2_negativech = True
                ne2_lvl += 2
            # Then, we iterate for every atom found for an specific aminoacid 
            # (usually only 1 atom)
            for atom in aa[2]:
                if atom in aa_score:
                    ne2_lvl += aa_score[atom]
                # There are atoms that can be donors or acceptors, so we check if the index is
                # positive or negative to apply a mitigation effect
                elif atom == "OG": #Serine
                    if ne2_lvl <0:
                        ne2_lvl += 2
                    elif ne2_lvl >= 0:
                        ne2_lvl -=1
                elif atom == "OG1": #Threonine
                    if ne2_lvl <0:
                        ne2_lvl += 2
                    elif ne2_lvl >= 0:
                        ne2_lvl -=1
                elif atom == "OH": #Tyrosine
                    if ne2_lvl <0:
                        ne2_lvl += 1
                    elif ne2_lvl >= 0:
                        ne2_lvl -=1
                elif atom == "NE2": # Other HIS or GLN
                    if aa[0] == "GLN":
                        ne2_lvl -= 2
                    elif aa[0] == "HID":
                        ne2_lvl += 1
                    elif aa[0] == "HIE":
                        ne2_lvl -= 1
                    elif aa[0] == "HIP":
                        ne2_lvl -= 1
                    elif aa[0] == "HIS":
                        if ne2_lvl <0:
                            ne2_lvl += 1
                        elif ne2_lvl >= 0:
                            ne2_lvl -=1
                elif atom == "ND1": # Other HIS
                    if aa[0] == "HID":
                        ne2_lvl -= 1
                    elif aa[0] == "HIE":
                        ne2_lvl += 1
                    elif aa[0] == "HIP":
                        ne2_lvl -= 1
                    elif aa[0] == "HIS":
                        if ne2_lvl <0:
                            ne2_lvl += 1
                        elif ne2_lvl >= 0:
                            ne2_lvl -=1
        # Scoring done: decide the protonation state. HIP takes precedence when both
        # nitrogens have an acidic residue nearby.
        if nd1_negativech and ne2_negativech:
            mol.set("resname", "HIP", "resname "+str(resname)+" and resid "+str(resid))
        elif nd1_lvl >= ne2_lvl:
            mol.set("resname", "HID", "resname "+str(resname)+" and resid "+str(resid))
        else:
            mol.set("resname", "HIE", "resname "+str(resname)+" and resid "+str(resid))

HIS 40
ND1
['SER'] 32 ['OG']
['PHE'] 41 ['N']
['CYS'] 42 ['O']
NE2
['CYS'] 42 ['N' 'O']
['GLY'] 193 ['O']
HIS 57
ND1
['ASP'] 102 ['OD1' 'OD2']
['SER'] 214 ['O']
NE2
['SER'] 195 ['OG']
['SER'] 214 ['O']
HIS 91
ND1
['SER'] 93 ['N']
['TYR'] 94 ['N']
NE2
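
To make the histidine scoring easier to follow, the sketch below applies a simplified version of it to the contacts printed above for HIS 40. It keeps the `aa_score` weights, the +2 bonus for an ASP or GLU neighbour and the hydroxyl rule, but leaves out the special cases for neighbouring HIS and GLN nitrogens. `nitrogen_score` and `histidine_state` are hypothetical helpers, not functions of the notebook.

```python
# Weights copied from aa_score in the cell above
AA_SCORE = {
    "O": 2, "OD1": 2, "OE1": 2, "OE2": 2, "OD2": 2, "SG": -1, "N": -1, "NE": -1,
    "NE1": -1, "ND2": -2, "NH1": -2, "NH2": -2, "NZ": -3,
}
# Amount added for a hydroxyl oxygen when the running score is negative
HYDROXYL_BONUS = {"OG": 2, "OG1": 2, "OH": 1}


def nitrogen_score(contacts):
    """contacts: list of (resname, [atom names]) found near one ring nitrogen.
    Returns (score, True if an ASP or GLU is among the contacts)."""
    level, acidic = 0, False
    for resname, atom_names in contacts:
        if resname in ("ASP", "GLU"):
            acidic = True
            level += 2
        for atom in atom_names:
            if atom in AA_SCORE:
                level += AA_SCORE[atom]
            elif atom in HYDROXYL_BONUS:
                # Hydroxyls can donate or accept: add the bonus if the running
                # score is negative, otherwise subtract 1
                level += HYDROXYL_BONUS[atom] if level < 0 else -1
    return level, acidic


def histidine_state(nd1_contacts, ne2_contacts):
    """Choose HIP, HID or HIE from the contacts of the two ring nitrogens."""
    nd1, nd1_acidic = nitrogen_score(nd1_contacts)
    ne2, ne2_acidic = nitrogen_score(ne2_contacts)
    if nd1_acidic and ne2_acidic:
        return "HIP"
    return "HID" if nd1 >= ne2 else "HIE"


# Contacts as printed above for HIS 40
nd1_contacts = [("SER", ["OG"]), ("PHE", ["N"]), ("CYS", ["O"])]
ne2_contacts = [("CYS", ["N", "O"]), ("GLY", ["O"])]
print(nitrogen_score(nd1_contacts)[0], nitrogen_score(ne2_contacts)[0])
print(histidine_state(nd1_contacts, ne2_contacts))
```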


Finally, we list the residue names now present in the molecule. GLH, HID and HIE are the names assigned by the steps above:

In [13]:
np.unique(mol.get('resname', sel='protein'))

array(['ALA', 'ARG', 'ASN', 'ASP', 'CYS', 'GLH', 'GLN', 'GLU', 'GLY',
       'HID', 'HIE', 'ILE', 'LEU', 'LYS', 'MET', 'PHE', 'PRO', 'SER',
       'THR', 'TRP', 'TYR', 'VAL'], dtype=object)
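
A quick way to see what changed is to compare this list against the 20 standard residue names. The sketch below does so with plain sets; `STANDARD` is a hypothetical constant and `final_names` is copied from the output above.

```python
STANDARD = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}

# Residue names copied from the output above
final_names = [
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLH", "GLN", "GLU", "GLY",
    "HID", "HIE", "ILE", "LEU", "LYS", "MET", "PHE", "PRO", "SER",
    "THR", "TRP", "TYR", "VAL",
]

renamed = sorted(set(final_names) - STANDARD)  # names introduced by the preparation
missing = sorted(STANDARD - set(final_names))  # standard names that no longer occur

print(renamed)
print(missing)
```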