# Binding-Pocket Interactions of Four EGFR Inhibitors

For this notebook, we use mdciao to visualize the binding-pocket interactions of four **Epidermal Growth Factor Receptor (EGFR) inhibitors**. EGFR is an important drug target with implications in cancer and inflammation ([Wikipedia](https://en.wikipedia.org/wiki/Epidermal_growth_factor_receptor)). It is a transmembrane protein with an extracellular receptor domain and an intracellular kinase domain.

The molecular dynamics (MD) data used here was generated by slightly modifying the notebook 

* [T019 · Molecular dynamics simulation](https://projects.volkamerlab.org/teachopencadd/talktorials/T019_md_simulation.html)

which is part of the impressive [TeachOpenCADD](https://projects.volkamerlab.org/teachopencadd/index.html) collection, made available as a teaching platform for computer-aided drug design by the [Volkamer Lab at Saarland University, Saarbrücken](https://volkamerlab.org/index.html).


The four inhibitors and structures are chosen from the following RCSB entries:

* [*The crystal structure of EGFR T790M/C797S with the inhibitor HCD2892 (PDB ID 7VRE)*](https://www.rcsb.org/structure/7VRE)

* [*EGFR kinase domain complexed with compound 20a (PDB ID 3W32)*](https://www.rcsb.org/structure/3W32)

* [*EGFR Kinase domain complexed with tak-285 (PDB ID 3POZ)*](https://www.rcsb.org/structure/3POZ)

* [*Crystal Structure of EGFR(L858R/T790M/C797S) in complex with CH7233163 (PDB ID 6LUB)*](https://www.rcsb.org/structure/6LUB)  

Please see the references at the bottom of the notebook for more information.

In [None]:
import mdciao
import os
import matplotlib
import nglview
from glob import glob

# Consensus labeler object for KLIFS nomenclature
Since the labeler is used more than once, we instantiate it once and reuse it. The only thing we need is the [UniProt Accession Code](https://www.uniprot.org/uniprot/P00533) of EGFR, `P00533`.

In [None]:
KLIFS = mdciao.nomenclature.LabelerKLIFS("UniProtAC:P00533")

# Download example data

In [None]:
if not os.path.exists("example_kinases"):
    mdciao.examples.fetch_example_data("EGFR");

# Guess molecular fragments 

In [None]:
for pdb in sorted(glob("example_kinases/*.pdb")):
    print(pdb)
    mdciao.fragments.get_fragments(pdb)
    print()

All four setups share an equivalent topology of kinase (fragment 0) and ligand (fragment 1):
 
 * from PDB ID `3POZ` ligand `03P1`  
 
 * from PDB ID `3W32` ligand `W321` 
 
 * from PDB ID `6LUB` ligand `EUX1` 
 
 * from PDB ID `7VRE` ligand `7VH1` 


For labelling purposes, create a mapping between PDB IDs and ligand names:

In [None]:
pdb2lig = {"3POZ" : "03P1",
           "3W32" : "W321",
           "6LUB" : "EUX1", 
           "7VRE" : "7VH1"
          }

# Compute the ligand-kinase interactions for the four inhibitors  

In [None]:
binding_pocket = {}
for pdb in sorted(glob("example_kinases/*.pdb")):
    key = os.path.basename(pdb).split(".")[1]
    key="%s@%s"%(pdb2lig[key], key)
    xtc = pdb.replace(".pdb",".xtc").replace("topology","trajectory")
    binding_pocket[key]=mdciao.cli.interface(xtc, 
                                             pdb, 
                                             fragment_names=["EGFR", "ligand"],
                                             KLIFS_string=KLIFS, 
                                             ctc_control=1.0, 
                                             interface_selection_1=[0],
                                             interface_selection_2=[1],
                                             accept_guess=True, figures=False, no_disk=True)
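
The loop above derives each dictionary key and trajectory path from the topology file name. The sketch below isolates that string handling so it can be checked without any MD data. The helper name `label_and_trajectory` and the example path `example_kinases/topology.3POZ.pdb` are assumptions for illustration; the file-name pattern is inferred from the `split` and `replace` calls in the loop, not taken from the downloaded data.

```python
import os


def label_and_trajectory(pdb_path, pdb2lig):
    """Return ("<ligand>@<PDB ID>", trajectory path) for one topology file.

    Hypothetical helper: assumes file names of the form
    "topology.<PDB ID>.pdb" with a matching "trajectory.<PDB ID>.xtc".
    """
    # The PDB ID is the second dot-separated field of the file name
    pdb_id = os.path.basename(pdb_path).split(".")[1]
    label = "%s@%s" % (pdb2lig[pdb_id], pdb_id)
    # Swap extension and prefix to get the trajectory file
    xtc_path = pdb_path.replace(".pdb", ".xtc").replace("topology", "trajectory")
    return label, xtc_path


example_label, example_xtc = label_and_trajectory(
    "example_kinases/topology.3POZ.pdb", {"3POZ": "03P1"}
)
print(example_label, example_xtc)
```

Keeping the ligand name in the key means that every later plot and legend entry identifies both the compound and the structure it came from.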
    

# Compare interactions across the four compounds in a violinplot
Additionally, we display *representative* geometries directly on the violinplots, marked by their residue-residue distance values. We then view these geometries in 3D.

In [None]:
colors = mdciao.plots.color_dict_guesser("tab10", binding_pocket.keys())
myfig, myax, keys, representatives = mdciao.plots.compare_violins(binding_pocket,
                                                                  colors=colors,           
                                                                  anchor="ligand", 
                                                                  ctc_cutoff_Ang=4.5,
                                                                  mutations_dict={
                                                                      "EUX1": "ligand",
                                                                      "7VH1": "ligand",
                                                                      "W321": "ligand",
                                                                      "03P1": "ligand"
                                                                  },
                                                                  defrag=None,
                                                                  sort_by="residue",
                                                                  legend_rows=2,   
                                                                  representatives=True,   
                                                                  figsize=(20,5)                                                                  
                                                         )
myax.set_title("binding pocket interactions"
               "\nfor 4 different EGFR inhibitors")
myfig.tight_layout()
#myfig.savefig("EGFR.png", bbox_inches="tight")

# Show the representative geometries
The object `representatives` is a dictionary holding the geometries that correspond to the small dots inside the violins of the previous figure. They were selected with the [repframes](https://proteinformatics.uni-leipzig.de/mdciao/api/generated/generated/mdciao.contacts.ContactGroup.html#mdciao.contacts.ContactGroup.repframes) method. In the next cells, we first superpose them using the KLIFS nomenclature and then view them overlaid in 3D.
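
To give an intuition for what a "representative" frame can mean, here is a minimal sketch that picks the frame whose contact distances lie closest to the per-contact median. This is an assumption made for illustration only: mdciao's `repframes` has its own selection scheme, which is not reproduced here, and the function name and toy distances below are made up.

```python
from statistics import median


def pick_representative_frame(distances):
    """Return the index of the frame closest to the per-contact medians.

    distances[f][c] is the distance of contact c in frame f.
    Illustrative stand-in only, not mdciao's repframes.
    """
    n_contacts = len(distances[0])
    # One "typical" value per contact, taken over all frames
    centers = [median(frame[c] for frame in distances) for c in range(n_contacts)]

    def deviation(frame):
        # Summed absolute deviation of this frame from the typical values
        return sum(abs(d - m) for d, m in zip(frame, centers))

    return min(range(len(distances)), key=lambda f: deviation(distances[f]))


# Toy data: 4 frames, 2 contacts (distances in Angstrom, invented)
toy = [
    [3.0, 8.0],
    [3.5, 6.0],
    [4.0, 6.5],
    [9.0, 6.2],
]
rep = pick_representative_frame(toy)
print(rep)
```

A frame chosen this way sits near the bulk of every distance distribution at once, which is why its distance values are a sensible thing to mark inside each violin.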

# Superpose structures using the KLIFS alignment labels
This way, the alignment will be particularly good in the binding pocket.

In [None]:
KLIFS_alignment = mdciao.nomenclature.AlignerConsensus({key : KLIFS for key in binding_pocket.keys()},
                                                       tops={key : bp.top for key, bp in binding_pocket.items()})
                                                       
KLIFS_alignment.AAresSeq

In [None]:
# We can directly get CA indices to map atoms
KLIFS_alignment.CAidxs

In [None]:
ref_key = "W321@3W32" # Reference structure; any of the four keys would work
ref_geom = representatives[ref_key]
for key, geom in representatives.items():
    if key!=ref_key:
        ref_CAs, key_CAs = KLIFS_alignment.CAidxs[[ref_key, key]].values.T.astype(int)
        geom.superpose(ref_geom, atom_indices=key_CAs, ref_atom_indices=ref_CAs)
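
The superposition above relies on pairing CA atoms that carry the same KLIFS label in both structures. The sketch below shows that pairing step in plain Python, skipping labels that are missing in either structure. The helper `paired_ca_indices`, the list-of-dicts table layout, and the atom indices are hypothetical; the real `CAidxs` object is a table provided by mdciao and is not reproduced here.

```python
def paired_ca_indices(table, ref_key, key):
    """Return two equally long lists of CA atom indices, row by row.

    table is a list of dicts, one per consensus label, mapping each
    structure key to a CA atom index or None when the residue is absent.
    Hypothetical layout for illustration.
    """
    ref_idxs, key_idxs = [], []
    for row in table:
        ref_idx, key_idx = row.get(ref_key), row.get(key)
        # Only residues present in both structures can be paired
        if ref_idx is None or key_idx is None:
            continue
        ref_idxs.append(int(ref_idx))
        key_idxs.append(int(key_idx))
    return ref_idxs, key_idxs


# Toy table with invented atom indices
toy_table = [
    {"label": "b.l.36", "W321@3W32": 10, "03P1@3POZ": 14},
    {"label": "c.l.74", "W321@3W32": 520, "03P1@3POZ": None},
    {"label": "xDFG.81", "W321@3W32": 640, "03P1@3POZ": 652},
]
ref_CAs, key_CAs = paired_ca_indices(toy_table, "W321@3W32", "03P1@3POZ")
print(ref_CAs, key_CAs)
```

Because both lists are built row by row, position `i` in one list always refers to the same consensus label as position `i` in the other, which is what a pairwise superposition needs.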
        

# Visualize residues with different behaviors in each compound
For example, the following residues:

* `775@b.l.36` 

* `841@c.l.74` 

* `855@xDFG.81`

* `997@EGFR` (doesn't have a KLIFS label)
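
The residue numbers before the `@` in these labels are what the selection string in the next cell uses. As a minimal sketch, a selection string of that form can be assembled from the labels instead of being typed by hand. The helper name `ngl_selection` is hypothetical, and it assumes every label has the form `<residue number>@<label>`.

```python
def ngl_selection(residue_labels, extra="not Hydrogen"):
    """Build a selection string like "(775 841) and not Hydrogen".

    Hypothetical helper: keeps only the residue number in front of "@".
    """
    numbers = [label.split("@")[0] for label in residue_labels]
    return "(%s) and %s" % (" ".join(numbers), extra)


selection = ngl_selection(["775@b.l.36", "841@c.l.74", "855@xDFG.81", "997@EGFR"])
print(selection)
```

Generating the string from the labels keeps the list of highlighted residues and the 3D view in sync when residues are added or removed.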

In [None]:
colors = {key: matplotlib.colors.to_hex(col) for key, col in colors.items()}
iwd = nglview.NGLWidget()
for ii, (key, rep) in enumerate(representatives.items()):
    iwd.add_trajectory(rep)
    iwd.clear_representations(component=ii)
    iwd.add_cartoon(color="white", component=ii)
    iwd.add_licorice(color=colors[key], component=ii, selection="(775 841 855 997) and not Hydrogen", radius=.1)
    iwd.add_ball_and_stick(color=colors[key], component=ii, 
                          selection="not protein and not Hydrogen",
                           radius=.1,
                          )
iwd

# References

* [The crystal structure of EGFR T790M/C797S with the inhibitor HCD2892 (PDB ID 7VRE)](https://www.rcsb.org/structure/7VRE)
  * Chen, H., Lai, M., Zhang, T., Chen, Y., Tong, L., Zhu, S., … Ding, K. (2022).   
    Conformational Constrained 4-(1-Sulfonyl-3-indol)yl-2-phenylaminopyrimidine Derivatives as New Fourth-Generation Epidermal Growth Factor Receptor Inhibitors Targeting T790M/C797S Mutations.   
    Journal of Medicinal Chemistry, 65(9), 6840–6858.   
    https://doi.org/10.1021/acs.jmedchem.2c00168
* [EGFR kinase domain complexed with compound 20a (PDB ID 3W32)](https://www.rcsb.org/structure/3W32)
  * Kawakita, Y., Seto, M., Ohashi, T., Tamura, T., Yusa, T., Miki, H., … Ishikawa, T. (2013).   
    Design and synthesis of novel pyrimido[4,5-b]azepine derivatives as HER2/EGFR dual inhibitors.   
    Bioorganic & Medicinal Chemistry, 21(8), 2250–2261.   
    https://doi.org/10.1016/j.bmc.2013.02.014
* [EGFR Kinase domain complexed with tak-285 (PDB ID 3POZ)](https://www.rcsb.org/structure/3POZ)
  * Aertgeerts, K., Skene, R., Yano, J., Sang, B. C., Zou, H., Snell, G., … Sogabe, S. (2011).   
    Structural analysis of the mechanism of inhibition and allosteric activation of the kinase domain of HER2 protein.  
    Journal of Biological Chemistry, 286(21), 18756–18765.  
    https://doi.org/10.1074/jbc.M110.206193
* [Crystal Structure of EGFR(L858R/T790M/C797S) in complex with CH7233163 (PDB ID 6LUB)](https://www.rcsb.org/structure/6LUB)  
  * Kashima, K., Kawauchi, H., Tanimura, H., Tachibana, Y., Chiba, T., Torizawa, T., & Sakamoto, H. (2020).  
    CH7233163 Overcomes Osimertinib-Resistant EGFR-Del19/T790M/C797S Mutation.   
    Molecular Cancer Therapeutics, 19(11), 2288–2297.  
    https://doi.org/10.1158/1535-7163.MCT-20-0229