# Kinase modeling

This notebook explains how KinoML can be used to prepare a structure of interest. The long-term goal is to automatically model different Dunbrack conformations.

## Content/Progress

1. Prepare a structure from PDB
  - [x] Retrieve structure  
  - [x] Preparation with OESpruce  
    - [x] loop modeling  
    - [x] protonation  
    - [x] no capping (capping could interfere with MODELLER)  
    - [x] workaround for missing OXT atoms, which are not added in OESpruce 1.1.0
    - [x] select design unit based on Iridium  
  - [ ] Preparation with MODELLER and MDAnalysis  
    - [ ] retrieve kinase domain sequence from UniProt with information about termini  
    - [ ] cut unnecessary residues  
    - [ ] build missing residues  
    - [ ] renumbering residues  
  - [x] cap termini with OESpruce if not real termini  
  - [x] protonation with OESpruce
  - [x] write prepared protein and ligand  
2. Generate all Dunbrack conformations  
  - [ ] Generate/Update dataframe of available templates with associated Dunbrack conformation  
  - [ ] Pick template  
  - [ ] Model conformation with MODELLER  

## 1. Prepare a structure from PDB or file

### Fix as much as possible with OESpruce

In [1]:
from appdirs import user_cache_dir
from kinoml.modeling.OpenEyePreparation import has_ligand, read_molecules, read_electron_density, prepare_complex, prepare_protein, write_molecules
from kinoml.utils import download_file

In [2]:
# specify the PDB ID and download the structure into the user cache directory
pdb_id = "4yne"
download_file(f"https://files.rcsb.org/download/{pdb_id}.pdb", f"{user_cache_dir()}/{pdb_id}.pdb")
structure = read_molecules(f"{user_cache_dir()}/{pdb_id}.pdb")[0]

If a ligand is present, download the electron density map and pass it to the complex preparation function. Model as many missing residues as possible. Do not cap the termini yet, since MODELLER might need to model additional residues at the chain ends.

In [3]:
if has_ligand(structure):
    download_file(f"https://edmaps.rcsb.org/coefficients/{pdb_id}.mtz", f"{user_cache_dir()}/{pdb_id}.mtz")
    electron_density = read_electron_density(f"{user_cache_dir()}/{pdb_id}.mtz")
    protein, ligand = prepare_complex(structure,
                                      electron_density=electron_density,
                                      loop_db="~/.OpenEye/rcsb_spruce.loop_db",
                                      cap_termini=False)
else:
    protein = prepare_protein(structure, 
                              loop_db="~/.OpenEye/rcsb_spruce.loop_db",
                              cap_termini=False)
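
The two downloads above repeat the same URL and cache-path construction. A minimal sketch of how that bookkeeping could be factored out is shown here. `rcsb_files` is a hypothetical helper, not part of KinoML, and the cache directory in the usage lines is only a placeholder for whatever `user_cache_dir()` returns. The URLs are the ones used in the cells above.

```python
from pathlib import Path


def rcsb_files(pdb_id, cache_dir):
    """Return {"pdb": (url, path), "mtz": (url, path)} for one PDB entry.

    Hypothetical helper for illustration; it only builds strings and paths
    and does not download anything.
    """
    cache_dir = Path(cache_dir)
    return {
        "pdb": (
            f"https://files.rcsb.org/download/{pdb_id}.pdb",
            cache_dir / f"{pdb_id}.pdb",
        ),
        "mtz": (
            f"https://edmaps.rcsb.org/coefficients/{pdb_id}.mtz",
            cache_dir / f"{pdb_id}.mtz",
        ),
    }


# Placeholder cache directory; in the notebook this would be user_cache_dir()
files = rcsb_files("4yne", "/tmp/kinoml-cache")
pdb_url, pdb_path = files["pdb"]
mtz_url, mtz_path = files["mtz"]
```

Each `(url, path)` pair could then be passed on to `download_file` as in the cells above.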

In [4]:
# write the prepared protein for the subsequent modeling steps with MODELLER
write_molecules([protein], f"{user_cache_dir()}/{pdb_id}_prep.pdb")

### Preparation with MODELLER and MDAnalysis

Build the residues that OESpruce could not add, renumber the residues, and cut residues that are not part of the kinase domain.

TBD
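
Until this step is implemented, the cutting and renumbering parts can be illustrated on plain PDB records. The following is a minimal sketch, not the planned MODELLER/MDAnalysis workflow. It assumes fixed-column PDB format (residue sequence number in columns 23-26), ignores insertion codes, and does not build missing residues. `cut_and_renumber`, `ca_record`, the residue range and the offset are all hypothetical and chosen only for illustration.

```python
def cut_and_renumber(pdb_lines, first, last, offset=0):
    """Keep ATOM/HETATM records with first <= resSeq <= last, shifting resSeq by offset.

    Other record types are passed through unchanged.
    """
    kept = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            kept.append(line)
            continue
        resseq = int(line[22:26])  # columns 23-26 hold the residue sequence number
        if first <= resseq <= last:
            kept.append(f"{line[:22]}{resseq + offset:>4}{line[26:]}")
    return kept


def ca_record(serial, resname, resseq):
    """Build a truncated toy ATOM record (no coordinates) for chain A."""
    return f"ATOM  {serial:>5}  CA  {resname} A{resseq:>4}"


# Toy input: four residues, of which only 5-6 belong to the (made-up) domain
records = [
    ca_record(1, "MET", 4),
    ca_record(2, "ALA", 5),
    ca_record(3, "GLY", 6),
    ca_record(4, "SER", 7),
]
trimmed = cut_and_renumber(records, first=5, last=6, offset=100)
new_numbers = [int(line[22:26]) for line in trimmed]
```

Here `new_numbers` holds the shifted residue numbers of the two records that survive the cut.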

### Protonate and cap modeled structure

The structure prepared so far has no caps, and MODELLER does not handle protonation. A second round of preparation with OESpruce therefore follows, this time capping every terminus that is not a biologically real terminus.

***It would be good to pass the ligand here as well, since it can affect protonation!***

In [5]:
# use the prepared structure from the above OESpruce preparation step
capped_protein = prepare_protein(protein, real_termini=[1, 793])
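
The capping rule described above can be sketched in a few lines. This is an illustration of the decision only, under the assumption that `real_termini` lists the residue numbers of the true N- and C-terminus. It is not KinoML's implementation; `termini_to_cap` and the example residue range are hypothetical.

```python
def termini_to_cap(first_resid, last_resid, real_termini=()):
    """Return which chain ends need a cap.

    An end is capped when its residue number is not listed as a real terminus.
    """
    real = set(real_termini)
    return {
        "N": first_resid not in real,
        "C": last_resid not in real,
    }


# Hypothetical construct starting at residue 27 and ending at the true C-terminus
caps = termini_to_cap(27, 793, real_termini=[1, 793])
```

In this example only the N-terminal end would be capped, because the chain starts inside the sequence but ends at a listed real terminus.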

In [6]:
# write the capped and protonated protein
write_molecules([capped_protein], f"{user_cache_dir()}/{pdb_id}_prep_capped.pdb")

## 2. Generate all Dunbrack conformations

TBD