# Introduction to Docking

## What is docking?

    1. Predict the orientation and conformation of a small molecule (ligand) in the binding site of the target protein 
    
    2. Estimate its binding affinity. 

## Application scenarios of different docking methods

<img src='figure/model_comp.png' width=80%></img>

    1. In small-scale docking experiments (i.e., dozens of runs), it is generally a good idea to use sophisticated docking methods, such as Flexible CDOCKER.
    
    2. However, in large-scale docking experiments, such as virtual screening for lead compound identification, direct application of such methods might not be appropriate or realistic because of their relatively large computational cost. 
    
        1. This led to the development of a rigid receptor -- rigid ligand docking method (FFT docking).
        
        2. This also motivates us to develop hierarchical docking methods. 

<img src="figure/hierarchical_model.png" width="65%"/>

# Introduction to CDOCKER

## CDOCKER is a CHARMM-based docking algorithm

    1. Grid-based docking method
    
    2. Uses a physics-based scoring function

<img src="figure/cdocker.png" width="80%"/>

## Motivation for pyCHARMM CDOCKER

    1. Docking is a powerful computational tool and has been applied in various structure-based structure-function explorations. However, the scripting languages of these docking methods are relatively complicated for users with little knowledge of docking. 
    
    2. On the other hand, to perform a successful docking experiment, one typically needs various cheminformatics tools for preparation or post-analysis.

# General Set Up for pyCHARMM CDOCKER

Step 1. import essential python libraries

In [1]:
import pycharmm
import pycharmm.lib as lib
import pycharmm.read as read
import pycharmm.lingo as lingo
import pycharmm.settings as settings

Step 2. read in topology and parameter files

    1. In the following example, only the protein and CGenFF topology and parameter files are read.
    2. If you want to dock against a non-protein target, you should read in the corresponding topology and parameter files.
    3. Alternatively, you could read in all topology and parameter files. Then you don't need to set the bomb level (bomblev). 

In [2]:
topdir = "~/Desktop/inp_test/ocl-test/toppar/"

settings.set_bomb_level(-1)
read.rtf(topdir + "top_all36_prot.rtf")
read.rtf(topdir + "top_all36_cgenff.rtf", append = True)
read.prm(topdir + "par_all36m_prot.prm", flex = True)
read.prm(topdir + "par_all36_cgenff.prm", append = True, flex = True)
settings.set_bomb_level(0)
lingo.charmm_script('stream "../rigid/ligandrtf"')

  
 CHARMM>     read rtf card -
 CHARMM>     name ~/Desktop/inp_test/ocl-test/toppar/top_all36_prot.rtf
 VOPEN> Attempting to open::/Users/yujin/DESKTOP/INP_TEST/OCL-TEST/TOPPAR/TOP_ALL36_PROT.RTF::
 MAINIO> Residue topology file being read from unit  91.
 TITLE> *>>>>>>>>CHARMM36 ALL-HYDROGEN TOPOLOGY FILE FOR PROTEINS <<<<<<
 TITLE> *>>>>> INCLUDES PHI, PSI CROSS TERM MAP (CMAP) CORRECTION <<<<<<<
 TITLE> *>>>>>>>>>>>>>>>>>>>>>>>>>> MAY 2011 <<<<<<<<<<<<<<<<<<<<<<<<<<<<
 TITLE> * ALL COMMENTS TO THE CHARMM WEB SITE: WWW.CHARMM.ORG
 TITLE> *             PARAMETER SET DISCUSSION FORUM
 TITLE> *
 VCLOSE: Closing unit   91 with status "KEEP"
  
 CHARMM>     
  
  
 CHARMM>     read rtf card -
 CHARMM>     name ~/Desktop/inp_test/ocl-test/toppar/top_all36_cgenff.rtf -
 CHARMM>     append
 VOPEN> Attempting to open::/Users/yujin/DESKTOP/INP_TEST/OCL-TEST/TOPPAR/TOP_ALL36_CGENFF.RTF::
 MAINIO> Residue topology file being read from unit  91.
 TITLE> *  ---------------------------------------


Step 3. define grid box information

    1. grid box center (xyz-coordinates): xcen, ycen, zcen
    2. size of grid box: maxlen

Step 4. general parameters for the pyCHARMM CDOCKER

# pyCHARMM Rigid CDOCKER

## Helper function

In [3]:
from pycharmm.cdocker import Rigid_CDOCKER
help(Rigid_CDOCKER)

Help on function Rigid_CDOCKER in module pycharmm.cdocker:

Rigid_CDOCKER(xcen=0, ycen=0, zcen=0, maxlen=10, dielec=3, rcta=0, rctb=0, hmax=0, flag_grid=False, flag_rdie=True, flag_form=False, flag_delete_grid=True, probeFile='"../Toppar/fftdock_c36prot_cgenff_probes.txt"', softGridFile='grid-emax-0.6-mine--0.4-maxe-0.4.bin', hardGridFile='grid-emax-3-mine--30-maxe-30.bin', nativeGridFile='grid-emax-100-mine--100-maxe-100.bin', receptorPDB='./protein.pdb', receptorPSF='./protein.psf', ligPDB='./ligand.pdb', ligSeg='LIGA', confDir='./conformer/', placementDir='./placement/', exhaustiveness='high', numPlace=100, numCopy=1000, flag_delete_conformer=True, flag_delete_placement=True, flag_save_all=True, flag_save_cluster=True, flag_save_top=True, flag_suppress_print=True, flag_center_ligand=True, flag_fast_grid=False, flag_use_hbond=False, flag_fast_placement=True, threshold=2500, sort_energy='total_energy', saveDir='./dockresult/')
    Rigid CDOCKER standard docking method 
    
    Parame

## pyCHARMM Rigid CDOCKER Docking and Analysis

### Running Rigid CDOCKER with Default Input

1. ligand file: <font color='red'>ligand.pdb</font>
2. ligand segment ID: <font color='red'>LIGA</font>
3. receptor file : <font color='red'>protein.pdb</font> and <font color='red'>protein.psf</font>
4. For docking with Rigid CDOCKER, the user needs to provide a pre-computed ligand conformer library. Default folder name: <font color='red'>conformer/</font>
5. Default docking trials: <font color='red'>100</font> docking trials per conformer
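
Since the number of docking trials is set per conformer, the total number of docked poses grows with the size of the conformer library. The arithmetic can be sketched as follows. `count_docking_trials` is a hypothetical helper, not a pyCHARMM function, and the library size of 11 conformers is taken from the example later in this notebook.

```python
def count_docking_trials(num_conformers, num_place=100):
    """Total docked poses = conformers x placements per conformer."""
    if num_conformers < 0 or num_place < 0:
        raise ValueError("counts must be non-negative")
    return num_conformers * num_place


# 11 conformers with the default 100 placements, and with 20 placements
default_total = count_docking_trials(11)
reduced_total = count_docking_trials(11, num_place=20)
print(default_total, reduced_total)
```

Lowering `numPlace` is therefore a direct way to reduce the computational cost of a test run.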

<img src="figure/SB2012.png" width="65%"/>

With default input, the user only needs to provide <font color='red'>grid box information</font> (see example below).

In [None]:
Rigid_CDOCKER(xcen = 1, ycen = 1, zcen = 1, maxlen = 10)

### pyCHARMM Rigid CDOCKER Example

We have shown some basic requirements for Rigid CDOCKER. Below is a practical example of performing a pyCHARMM Rigid CDOCKER docking experiment.
    
***Note***
1. In this example, we specified the <font color='red'>file location</font> of the ligand and receptor (i.e., they are not at their default location).
2. We also change the number of docking trials per conformer to <font color='red'>20</font> to reduce the computational cost. 

In [1]:
## Import module
import pycharmm
import pycharmm.lib as lib
import pycharmm.read as read
import pycharmm.lingo as lingo
import pycharmm.settings as settings
from pycharmm.cdocker import Rigid_CDOCKER

## File names and paths
ligPDB = "../rigid/ligand.pdb"
ligandrtf = "../rigid/ligandrtf"
confDir = "../rigid/conformer/"
receptorPDB = "../rigid/protein.pdb"
receptorPSF = "../rigid/protein.psf"
topdir = "~/Desktop/inp_test/ocl-test/toppar/"

## Topology and parameter files
settings.set_bomb_level(-1)
read.rtf(topdir + "top_all36_prot.rtf")
read.rtf(topdir + "top_all36_cgenff.rtf", append = True)
read.prm(topdir + "par_all36m_prot.prm", flex = True)
read.prm(topdir + "par_all36_cgenff.prm", append = True, flex = True)
settings.set_bomb_level(0)
lingo.charmm_script('stream ' + ligandrtf)

## Rigid CDOCKER standard docking protocol
clusterResult, dockResult = Rigid_CDOCKER(xcen = 12.33, ycen = 33.48, zcen = 19.70,
                                        maxlen = 25.762, ligPDB = ligPDB, receptorPDB = receptorPDB,
                                        receptorPSF = receptorPSF, confDir = confDir, 
                                        flag_delete_conformer = False, numPlace = 20)

print(clusterResult)
print(dockResult)
exit()


  

/Applications/MMTSB/bin/kclust -pdb -centroid -cdist -radius 0.5 -iterate -maxerr 0.01 -mode rmsd -heavy
/Applications/MMTSB/bin/kclust -pdb -centroid -cdist -radius 0.6 -iterate -maxerr 0.01 -mode rmsd -heavy
/Applications/MMTSB/bin/kclust -pdb -centroid -cdist -radius 0.7 -iterate -maxerr 0.01 -mode rmsd -heavy
/Applications/MMTSB/bin/kclust -pdb -centroid -cdist -radius 0.8 -iterate -maxerr 0.01 -mode rmsd -heavy
/Applications/MMTSB/bin/kclust -pdb -centroid -cdist -radius 0.9 -iterate -maxerr 0.01 -mode rmsd -heavy
/Applications/MMTSB/bin/kclust -pdb -centroid -cdist -radius 1.0 -iterate -maxerr 0.01 -mode rmsd -heavy
/Applications/MMTSB/bin/kclust -pdb -centroid -cdist -radius 1.1 -iterate -maxerr 0.01 -mode rmsd -heavy
/Applications/MMTSB/bin/kclust -pdb -centroid -cdist -radius 1.2 -iterate -maxerr 0.01 -mode rmsd -heavy
/Applications/MMTSB/bin/kclust -pdb -centroid -cdist -radius 1.3 -iterate -maxerr 0.01 -mode rmsd -heavy
/Applications/MMTSB/bin/kclust -pdb -centroid -cdist -r


 Message from MAPIC: Atom numbers are changed.

 Message from MAPIC:          1 residues deleted.

 Message from MAPIC:          1 segments deleted.
 DELTIC:        48 bonds deleted
 DELTIC:        85 angles deleted
 DELTIC:       120 dihedrals deleted
 DELTIC:         3 improper dihedrals deleted
 DELTIC:         3 acceptors deleted
  
 CHARMM>     read psf card -
 CHARMM>     name ../rigid/protein.psf -
 CHARMM>     append
 VOPEN> Attempting to open::../RIGID/PROTEIN.PSF::
 MAINIO> Protein structure file being appended from unit  91.
 psf_read_formatted: Reading PSF in the expanded format.
 TITLE>  * BUILD PROTEIN PSF AND PDB FILE
 TITLE>  *  DATE:    11/15/21     13:21: 4      CREATED BY USER: yujin
 TITLE>  *
 PSFSUM> PSF modified: NONBOND lists and IMAGE atoms cleared.
 PSFSUM> Summary of the structure file counters :
         Number of segments      =        1   Number of residues   =      166
         Number of atoms         =     2619   Number of groups     =      788
        

### pyCHARMM Rigid CDOCKER Default Output

By default, pyCHARMM Rigid CDOCKER will create a folder (<font color='red'>dockresult</font>) and save all the outputs to this folder.

TSV files:
1. <font color='red'>clusterResult.tsv</font> contains the docking results for the clustered poses.
2. <font color='red'>explicitCluster.tsv</font> contains the docking results for the clustered poses after explicit all-atom minimization. 
3. <font color='red'>dockResult.tsv</font> contains the docking results for all docked poses. 
4. <font color='red'>explicitTop10.tsv</font> contains the docking results for the top 10 poses after explicit all-atom minimization. 

Folders:
1. Folder <font color='red'>allPose/</font> contains all docked poses.
2. Folder <font color='red'>cluster/</font> contains all clustered poses.
3. Folder <font color='red'>top_ener/</font> contains the top 10 poses.


In [1]:
!ls dockresult

allPose             clusterResult.tsv   explicitCluster.tsv top_ener
cluster             dockResult.tsv      explicitTop10.tsv


In [2]:
!ls dockresult/allPose dockresult/cluster dockresult/top_ener

dockresult/allPose:
10_1.pdb  11_17.pdb 1_6.pdb   3_13.pdb  4_20.pdb  6_1.pdb   7_17.pdb  8_6.pdb
10_10.pdb 11_18.pdb 1_7.pdb   3_14.pdb  4_3.pdb   6_10.pdb  7_18.pdb  8_7.pdb
10_11.pdb 11_19.pdb 1_8.pdb   3_15.pdb  4_4.pdb   6_11.pdb  7_19.pdb  8_8.pdb
10_12.pdb 11_2.pdb  1_9.pdb   3_16.pdb  4_5.pdb   6_12.pdb  7_2.pdb   8_9.pdb
10_13.pdb 11_20.pdb 2_1.pdb   3_17.pdb  4_6.pdb   6_13.pdb  7_20.pdb  9_1.pdb
10_14.pdb 11_3.pdb  2_10.pdb  3_18.pdb  4_7.pdb   6_14.pdb  7_3.pdb   9_10.pdb
10_15.pdb 11_4.pdb  2_11.pdb  3_19.pdb  4_8.pdb   6_15.pdb  7_4.pdb   9_11.pdb
10_16.pdb 11_5.pdb  2_12.pdb  3_2.pdb   4_9.pdb   6_16.pdb  7_5.pdb   9_12.pdb
10_17.pdb 11_6.pdb  2_13.pdb  3_20.pdb  5_1.pdb   6_17.pdb  7_6.pdb   9_13.pdb
10_18.pdb 11_7.pdb  2_14.pdb  3_3.pdb   5_10.pdb  6_18.pdb  7_7.pdb   9_14.pdb
10_19.pdb 11_8.pdb  2_15.pdb  3_4.pdb   5_11.pdb  6_19.pdb  7_8.pdb   9_15.pdb
10_2.pdb  11_9.pdb  2_16.pdb  3_5.pdb   5_12.pdb  6_2.pdb   7_9.pdb   9_16.pdb
10_20.pdb 1_1.pdb   2_17.pdb  3_6.pdb

1. In the folder <font color='red'>allPose/</font>, the PDB files are named \{conformer_id\}_\{placement_id\}.pdb.
2. In the folders <font color='red'>cluster/</font> and <font color='red'>top_ener/</font>, the docked poses are sorted (i.e., top_1.pdb to top_n.pdb). The sorting method will be discussed later.
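
The naming scheme in <font color='red'>allPose/</font> can be parsed to recover the conformer and placement IDs, for example to list the poses in numerical rather than alphabetical order. The snippet below is a minimal sketch. `parse_pose_name` is a hypothetical helper (not part of pyCHARMM), and the file names are a few of those shown in the listing above.

```python
import re

# '{conformer_id}_{placement_id}.pdb', both IDs being integers
POSE_NAME = re.compile(r"^(\d+)_(\d+)\.pdb$")


def parse_pose_name(filename):
    """Split '{conformer_id}_{placement_id}.pdb' into two integers."""
    match = POSE_NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected pose file name: {filename}")
    return int(match.group(1)), int(match.group(2))


names = ["10_1.pdb", "11_17.pdb", "1_6.pdb", "3_13.pdb"]
# Sort numerically by (conformer_id, placement_id) rather than alphabetically
ordered = sorted(names, key=parse_pose_name)
print(ordered)
```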

#### Let's look at the docked poses first

In [9]:
# If you can run PyMOL in your setup, issue the ssh command shown after this cell
# in a terminal and set the usepymol logical to True here:
usepymol = False
if usepymol:
    import xmlrpc.client as xmlrpclib
    cmd = xmlrpclib.ServerProxy('http://localhost:9123')
    cmd.reinitialize()
# Otherwise leave usepymol set to False and proceed.

ssh -N -f -R 9123:localhost:9123 satyr

In [14]:
# The cmd proxy created earlier exposes the PyMOL command interpreter
if usepymol:
    # We can use this command interface to alter the view/representation and selection
    cmd.delete('all')
    cmd.load('../rigid/ligand.pdb')
    cmd.load('../rigid/protein.pdb')
    cmd.load('dockresult/cluster/top_1.pdb')
    cmd.load('dockresult/top_ener/top_1.pdb')
    cmd.orient('top_1')

#### Now let's look at these TSV files

In [7]:
import pandas as pd

1. dockResult.tsv will always be created. It contains the docking results for all docked poses and is sorted by <font color='red'>total energy</font>. 

In [8]:
print(pd.read_csv("dockresult/dockResult.tsv", sep ='\t').drop("Unnamed: 0", axis = 1))

     total_energy   grid_total     grid_vdw   grid_elec  grid_hbond  \
0     -115.863534  -101.433575   -53.143357  -48.290217         0.0   
1     -108.371461   -98.652752   -49.739433  -48.913319         0.0   
2     -106.754894  -100.066885   -43.354744  -56.712141         0.0   
3     -104.828069   -91.176979   -51.190354  -39.986624         0.0   
4     -103.968985   -93.650264    22.638548 -116.288812         0.0   
..            ...          ...          ...         ...         ...   
215   2703.741876  2455.799400  2903.058299 -447.258899         0.0   
216   2723.292325  2377.949196  2891.888755 -513.939559         0.0   
217   2830.202703  2634.554491  3140.883507 -506.329016         0.0   
218   2868.545346  2702.435078  2941.580570 -239.145492         0.0   
219   3013.887501  2708.072956  3173.696929 -465.623973         0.0   

     conformer_id  placement_id   PDB_name  
0              11            17  11_17.pdb  
1              11            14  11_14.pdb  
2           

2. clusterResult.tsv will always be created. It contains the docking results for the clustered poses and is sorted by <font color='red'>total energy</font>

In [9]:
print(pd.read_csv("dockresult/clusterResult.tsv", sep = '\t').drop("Unnamed: 0", axis = 1))

    total_energy   grid_total     grid_vdw   grid_elec  grid_hbond  \
0    -115.863534  -101.433575   -53.143357  -48.290217         0.0   
1     -94.670380   -77.905906   -47.624531  -30.281376         0.0   
2     -84.191419   -65.466209   -38.585129  -26.881080         0.0   
3     245.489912   173.792093   218.188307  -44.396214         0.0   
4     724.007166   593.822843   617.299013  -23.476170         0.0   
5     804.769287   647.782522   787.771624 -139.989102         0.0   
6     814.989792   744.183211   985.980824 -241.797614         0.0   
7     837.155884   683.019957   985.568896 -302.548940         0.0   
8    1364.798292  1222.649704  1422.439466 -199.789763         0.0   
9    1720.784960  1522.891737  1811.960166 -289.068429         0.0   
10   2106.749374  2023.269821  2406.503644 -383.233823         0.0   

    conformer_id  placement_id   PDB_name  cluster_size  
0             11            17  11_17.pdb             3  
1              8            15   8_15.pdb  

3. By default, cluster representatives and top 10 poses will be rescored with explicit receptor atoms.
The user can decide how they want to sort these two dataframes with the parameter <font color='red'>sort_energy</font>.
The default sorting is based on total energy. 
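
If a different ranking is wanted after the run, the saved tables can also be re-sorted by another column. The following is a minimal sketch using only the standard library (pandas works equally well). `sort_rows` is a hypothetical helper, and the three sample rows are copied from the explicitTop10 output of this notebook, with the leading index column and the remaining columns omitted for brevity.

```python
import csv
import io

# Three rows copied from the explicitTop10 output shown in this notebook
tsv_text = (
    "total_energy\tvdw\telec\tconformer_id\n"
    "-149.314768\t-54.481437\t-151.675355\t10\n"
    "-138.993401\t-47.610587\t-150.652257\t11\n"
    "-136.032576\t-44.722724\t-153.428512\t11\n"
)


def sort_rows(text, column):
    """Return the TSV rows sorted by `column`, most negative value first."""
    rows = list(csv.DictReader(io.StringIO(text), delimiter="\t"))
    return sorted(rows, key=lambda row: float(row[column]))


by_total = sort_rows(tsv_text, "total_energy")
by_elec = sort_rows(tsv_text, "elec")
print(by_total[0]["conformer_id"], by_elec[0]["total_energy"])
```

In this small sample the pose with the lowest total energy is not the one with the lowest electrostatic energy, which is why the choice of sorting column can change the ranking.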


In [10]:
explicitTop10 = pd.read_csv("dockresult/explicitTop10.tsv", sep = '\t').drop("Unnamed: 0", axis = 1)
explicitCluster = pd.read_csv("dockresult/explicitCluster.tsv", sep = '\t').drop("Unnamed: 0", axis = 1)
print("Top 10 lowest energy poses with explicit all atom minimization. \n \n", explicitTop10, "\n \n")
print("Cluster poses with explicit all atom minimization. \n \n", explicitCluster)

Top 10 lowest energy poses with explicit all atom minimization. 
 
     total_energy        vdw        elec  grid_hbond  conformer_id  \
0    -149.314768 -54.481437 -151.675355         0.0            10   
1    -138.993401 -47.610587 -150.652257         0.0            11   
2    -136.032576 -44.722724 -153.428512         0.0            11   
3    -134.490406 -44.695279 -151.321857         0.0             8   
4    -129.819058 -42.642853 -150.684926         0.0             7   
5    -125.519305 -34.307546 -146.545128         0.0             8   
6    -124.581184 -40.208548 -145.008423         0.0            11   
7    -122.737989 -41.465627 -139.130881         0.0             8   
8    -119.434421 -39.092169 -133.827068         0.0             6   
9    -116.793126 -36.892171 -131.192949         0.0             1   
10   -111.827987 -43.768770 -123.343606         0.0             8   

    placement_id   PDB_name  
0             10  10_10.pdb  
1             17  11_17.pdb  
2            

## pyCHARMM Rigid CDOCKER Parameters for Controlling Input Information

1. Parameters that require user input

| Parameters | Default value | Meaning|
| :-- | :-- | :-- |
| xcen | 0 | x coordinate of the center of the grid box | 
| ycen | 0 | y coordinate of the center of the grid box | 
| zcen | 0 | z coordinate of the center of the grid box | 
| maxlen | 10 | size of the grid box | 

2. Parameters that are miscellaneous. 

| Parameters | Default value | Meaning|
| :-- | :-- | :-- |
| ligPDB | 'ligand.pdb' | ligand PDB file | 
| ligSeg | 'LIGA' | Segment ID in the ligand PDB file | 
| receptorPDB | 'protein.pdb' | receptor PDB file | 
| receptorPSF | 'protein.psf' | receptor PSF file | 
| numPlace | 100 | number of placements per conformer | 
| sort_energy | 'total_energy' | sorting method for the explicit all-atom minimization results |
| saveDir | './dockresult/' | folder name of the final result directory |
| confDir | './conformer/' | folder name of the conformer library |
| flag_center_ligand | True | whether or not to center the ligand in the binding pocket | 
| flag_grid | False | whether or not the grid needs to be generated before docking |
| flag_delete_grid | True | whether or not to delete the grid after docking |
| flag_delete_conformer | True | whether or not to delete the conformers after docking |
| flag_delete_placement | True | whether or not to delete the initial ligand placements after docking | 
| flag_save_all | True | whether or not to save all docked poses after docking |
| flag_save_cluster | True | whether or not to save the clustered results after docking |
| flag_save_top | True | whether or not to save the top 10 ranking poses after docking |

3. Parameters that are not recommended to be changed and the default values are applicable for most use cases.

| Parameters | Default value | Meaning|
| :-- | :-- | :-- |
| dielec | 3.0 | dielectric constant | 
| flag_rdie | True | True for rdie, False for cdie | 
| flag_form | False | whether or not grid form is formatted | 
| probeFile | "../Toppar/fftdock_c36prot_cgenff_probes.txt" | probe file for gpu grid generation| 
| softGridFile | 'grid-emax-0.6-mine--0.4-maxe-0.4.bin' | soft grid file name |
| hardGridFile | 'grid-emax-3-mine--30-maxe-30.bin' | hard grid file name |
| nativeGridFile | 'grid-emax-100-mine--100-maxe-100.bin' | native grid file name |
| placementDir | './placement/' | ligand placement folder name |
| flag_fast_placement | True | True for using fast placement, False for using original Rigid CDOCKER placement method |
| exhaustiveness | 'high' | exhaustiveness for fast placement, high, medium, low |
| threshold | 2500 | cutoff threshold for original Rigid CDOCKER placement method |
| numCopy | 1000 | maximum number of copies for OpenMM parallel simulated annealing |
| flag_suppress_print | True | whether or not to suppress warning messages during docking |
| flag_fast_grid | False | True for using grid minimization, False for explicit atom minimization |

## pyCHARMM Rigid CDOCKER Covalent Docking

The major difference between covalent docking and non-covalent docking in pyCHARMM Rigid CDOCKER is that the user needs to <font color=red>specify the well-depth and cutoffs</font> of the covalent bond grid potential. The figure below shows the customizable covalent (hydrogen bond) grid potential.

***Note***
1. The user needs to specify the sets of ligand and receptor atoms of interest.
2. The corresponding parameter files need to be updated.
3. CHARMM selection script: 
       1. acceptor set select *** end
       2. donor set select *** end

<img src="figure/ener.png" width="80%"/>

The corresponding parameters in Rigid_CDOCKER are:

| Parameters | Default value | Meaning|
| :-- | :-- | :-- |
| rcta | 0 | customizable grid left cutoff | 
| rctb | 0 | customizable grid right cutoff |
| hmax | 0 | customizable grid well-depth |
| flag_use_hbond | False | whether or not use hydrogen/covalent bond grid potential in Rigid CDOCKER | 

# pyCHARMM Flexible CDOCKER

## Helper function

In [11]:
from pycharmm.cdocker import Flexible_CDOCKER
help(Flexible_CDOCKER)

Help on function Flexible_CDOCKER in module pycharmm.cdocker:

Flexible_CDOCKER(xcen=0, ycen=0, zcen=0, maxlen=10, num=20, copy=25, generation=2, threshold_init=2500, threshold_mutate=100, flag_grid=False, flag_form=False, flag_delete_grid=True, probeFile='"../Toppar/fftdock_c36prot_cgenff_probes.txt"', softGridFile='grid-emax-0.6-mine--0.4-maxe-0.4.bin', hardGridFile='grid-emax-3-mine--30-maxe-30.bin', nativeGridFile='grid-emax-100-mine--100-maxe-100.bin', dihedralFile='../Toppar/protein_dihedral.csv', receptorPDB='./protein.pdb', receptorPSF='./protein.psf', receptorCard='"./flexchain.crd"', saveLig='./ligand/', saveProt='./protein/', crossoverLig='./crossover_ligand/', crossoverProt='./crossover_protein/', saveLigFinal='./ligand_final/', saveProtFinal='./protein_final/', ligPDB='./ligand.pdb', ligSeg='LIGA', flexchain=None, placementDir='./placement/', flag_save_all=False, flag_save_cluster=True, flag_save_placement=False, flag_save_crossover=False, flag_suppress_print=True, flag_ce

## pyCHARMM Flexible CDOCKER Docking and Analysis

### Running Flexible CDOCKER with Default Input

1. ligand file: <font color='red'>ligand.pdb</font>
2. ligand segment ID: <font color='red'>LIGA</font>
3. receptor file : <font color='red'>protein.pdb</font> and <font color='red'>protein.psf</font>
4. Default docking trials: <font color='red'>20</font> conformers generated by Open Babel and <font color='red'>25</font> copies of each conformer (i.e., Open Babel is required for Flexible CDOCKER).

With default input, the user only needs to provide <font color='red'>grid box information</font> and a <font color='red'>flexible amino acid side chain selection</font> (see example below).

***Note***
1. Currently, pyCHARMM Flexible CDOCKER only works with protein receptors.
2. The flexible side chain selection needs to be a pandas dataframe. 

In [3]:
import pandas as pd
flexchain = pd.read_csv('../flex/flexchain.csv', sep = '\t', index_col = 0)
print(flexchain)

   res_id seg_id
0      84   PROT
1      87   PROT
2      99   PROT
3     111   PROT
4     118   PROT


In [None]:
Flexible_CDOCKER(xcen = 1, ycen = 1, zcen = 1, maxlen = 10, flexchain = flexchain)

### pyCHARMM Flexible CDOCKER Example

We have shown some basic requirements for Flexible CDOCKER. Below is a practical example of performing a pyCHARMM Flexible CDOCKER docking experiment.
    
***Note***
1. In this example, we specified the <font color='red'>file location</font> of the ligand, receptor and flexible side chain selection (i.e., they are not at their default location).
2. We also change the number of conformers to <font color='red'>5</font> and the number of docking trials per conformer to <font color='red'>5</font> to reduce the computational cost. 

In [4]:
## Import modules
import pandas as pd
import pycharmm
import pycharmm.lib as lib
import pycharmm.read as read
import pycharmm.lingo as lingo
import pycharmm.settings as settings
from pycharmm.cdocker import Flexible_CDOCKER

## File names and paths
ligPDB = "../flex/ligand.pdb"
ligandrtf = "../flex/ligandrtf"
receptorPDB = "../flex/protein.pdb"
receptorPSF = "../flex/protein.psf"
topdir = "~/Desktop/inp_test/ocl-test/toppar/"

## Topology and parameter files
settings.set_bomb_level(-1)
read.rtf(topdir + "top_all36_prot.rtf")
read.rtf(topdir + "top_all36_cgenff.rtf", append = True)
read.prm(topdir + "par_all36m_prot.prm", flex = True)
read.prm(topdir + "par_all36_cgenff.prm", append = True, flex = True)
settings.set_bomb_level(0)
lingo.charmm_script('stream ' + ligandrtf)

## Read in the receptor flexible side chain selection
flexchain = pd.read_csv('../flex/flexchain.csv', sep = '\t', index_col = 0)

## Flexible CDOCKER standard docking protocol
clusterResult, dockResult = Flexible_CDOCKER(xcen = 26.911, ycen = 6.126, zcen = 4.178,
                            maxlen = 14.492, ligPDB = ligPDB, receptorPDB = receptorPDB,
                            receptorPSF = receptorPSF, num = 5, copy = 5,
                            flexchain = flexchain)

print(clusterResult)
print(dockResult)

  

  Problems reading a HETATM or ATOM record.
  According to the PDB specification,
  columns 77-78 should contain the element symbol of an atom.
  but OpenBabel found '  ' (atom 1)
  Problems reading a HETATM or ATOM record.
  According to the PDB specification,
  columns 77-78 should contain the element symbol of an atom.
  but OpenBabel found '  ' (atom 2)
  Problems reading a HETATM or ATOM record.
  According to the PDB specification,
  columns 77-78 should contain the element symbol of an atom.
  but OpenBabel found '  ' (atom 3)
  Problems reading a HETATM or ATOM record.
  According to the PDB specification,
  columns 77-78 should contain the element symbol of an atom.
  but OpenBabel found '  ' (atom 4)
  Problems reading a HETATM or ATOM record.
  According to the PDB specification,
  columns 77-78 should contain the element symbol of an atom.
  but OpenBabel found '  ' (atom 5)
  Problems reading a HETATM or ATOM record.
  According to the PDB specification,
  columns 77-78 sh

A selection has been stored as WECMIGJJCBMWIAEHMIOO
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  1
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  2
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  3
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  4
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  5
A selection has been stored as JFBLFCFQIGWCRBCRFQXZ
Grid is applied to specified atoms



Grid is applied to specified atoms
Flexible receptor + ligand placement  6
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  7
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  8
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  9
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  10



A selection has been stored as KSXLOWFOHKTNGPBYNXUH
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  11
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  12
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  13
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  14
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  15
A selection has been stored as ZYLEXMXGWLMNXGFUAJVM
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  16
Grid is applied to specified atoms


  Problems reading a HETATM or ATOM record.
  According to the PDB specification,
  columns 77-78 should contain the element symbol of an atom.
  but OpenBabel found '  ' (atom 1)
  Problems reading a HETATM or ATOM record.
  According to the PDB specification,
  columns 77-78 should contain the element symbol of an atom.
  but OpenBabel found '  ' (atom 2)
  Problems reading a HETATM or ATOM record.
  According to the PDB specification,
  columns 77-78 should contain the element symbol of an atom.
  but OpenBabel found '  ' (atom 3)
  Problems reading a HETATM or ATOM record.
  According to the PDB specification,
  columns 77-78 should contain the element symbol of an atom.
  but OpenBabel found '  ' (atom 4)
  Problems reading a HETATM or ATOM record.
  According to the PDB specification,
  columns 77-78 should contain the element symbol of an atom.
  but OpenBabel found '  ' (atom 5)
  Problems reading a HETATM or ATOM record.
  According to the PDB specification,
  columns 77-78 sh

Grid is applied to specified atoms
Flexible receptor + ligand placement  17
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  18
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  19
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  20


  Problems reading a HETATM or ATOM record.
  According to the PDB specification,
  columns 77-78 should contain the element symbol of an atom.
  but OpenBabel found '  ' (atom 1)
  Problems reading a HETATM or ATOM record.
  According to the PDB specification,
  columns 77-78 should contain the element symbol of an atom.
  but OpenBabel found '  ' (atom 2)
  Problems reading a HETATM or ATOM record.
  According to the PDB specification,
  columns 77-78 should contain the element symbol of an atom.
  but OpenBabel found '  ' (atom 3)
  Problems reading a HETATM or ATOM record.
  According to the PDB specification,
  columns 77-78 should contain the element symbol of an atom.
  but OpenBabel found '  ' (atom 4)
  Problems reading a HETATM or ATOM record.
  According to the PDB specification,
  columns 77-78 should contain the element symbol of an atom.
  but OpenBabel found '  ' (atom 5)
  Problems reading a HETATM or ATOM record.
  According to the PDB specification,
  columns 77-78 sh

A selection has been stored as VFXLORKOQQIJJIBZGIXP
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  21
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  22
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  23
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  24
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  25
Set OMMD ligand copy 1
Set OMMD ligand copy 2
Set OMMD ligand copy 3
Set OMMD ligand copy 4
Set OMMD ligand copy 5
Set OMMD ligand copy 6
Set OMMD ligand copy 7
Set OMMD ligand copy 8
Set OMMD ligand copy 9
Set OMMD ligand copy 10
Set OMMD ligand copy 11
Set OMMD ligand copy 12
Set OMMD ligand copy 13
Set OMMD ligand copy 14
Set OMMD ligand copy 15
Set OMMD ligand copy 16
Set OMMD ligand co

/Applications/MMTSB/bin/kclust -pdb -centroid -cdist -radius 1 -iterate -maxerr 0.01 -mode rmsd -heavy


Flexible docking for generation: 2
Flexible receptor + ligand placement  1
Flexible receptor + ligand placement  2
Flexible receptor + ligand placement  3
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  4
Flexible receptor + ligand placement  5
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  6
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  7
Flexible receptor + ligand placement  8
Flexible receptor + ligand placement  9
Grid is applied to specified atoms
Grid is applied to specified atoms
Flexible receptor + ligand placement  10
Flexible receptor + ligand placement  11
Flexible receptor + ligand placement  12
Flexible receptor + ligand placement  13
Flexible receptor + ligand placement  14
Flexible receptor + ligand placement  15
Flexible receptor + ligand placement  16
Grid is applied to specified atoms
Gri

/Applications/MMTSB/bin/kclust -pdb -centroid -cdist -radius 1 -iterate -maxerr 0.01 -mode rmsd -heavy


Calculate conformational entropy contribution from amino acid  LEU
Calculate conformational entropy contribution from amino acid  VAL
Calculate conformational entropy contribution from amino acid  VAL
Calculate conformational entropy contribution from amino acid  LEU

 Message from MAPIC: Atom numbers are changed.

 Message from MAPIC:          6 residues deleted.

 Message from MAPIC:          2 segments deleted.
 DELTIC:        87 bonds deleted
 DELTIC:       148 angles deleted
 DELTIC:       189 dihedrals deleted
 DELTIC:         5 donors deleted
 DELTIC:         5 acceptors deleted
  
 CHARMM>     read psf card -
 CHARMM>     name ../flex/protein.psf -
 CHARMM>     append
 VOPEN> Attempting to open::../FLEX/PROTEIN.PSF::
 MAINIO> Protein structure file being appended from unit  91.
 TITLE>  * BUILD PROTEIN PSF AND PDB FILE
 TITLE>  *  DATE:    12/ 3/21     13:16: 4      CREATED BY USER: yujin
 TITLE>  *
 PSFSUM> PSF modified: NONBOND lists and IMAGE atoms cleared.
 PSFSUM> Summary 

### pyCHARMM Flexible CDOCKER Default Output

By default, pyCHARMM Flexible CDOCKER creates a folder (<font color='red'>dockresult</font>) and saves all outputs there.

TSV files:
1. <font color='red'>clusterResult.tsv</font> contains the docking results for the clustered poses.
2. <font color='red'>explicitResult.tsv</font> contains the docking results for all docked poses after explicit all-atom minimization.
3. <font color='red'>dockResult.tsv</font> contains the docking results for all docked poses.

Folders:
1. Folder <font color='red'>cluster/</font> contains all clustered poses.

In [5]:
!ls dockresult

cluster            clusterResult.tsv  explicitResult.tsv
cluster.log        dockResult.tsv


In [6]:
!ls dockresult/cluster/*

dockresult/cluster/ligand:
top_1.pdb top_2.pdb top_3.pdb top_4.pdb top_5.pdb top_6.pdb top_7.pdb

dockresult/cluster/protein:
top_1.pdb top_2.pdb top_3.pdb top_4.pdb top_5.pdb top_6.pdb top_7.pdb


In the folder <font color='red'>cluster/</font>, the ligand and the corresponding protein docked poses are sorted (i.e., top_1.pdb to top_n.pdb). The sorting method will be introduced later.
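
Because the rank is encoded in the file name, a plain alphabetical listing would place top_10.pdb before top_2.pdb once there are ten or more clusters. The following is a minimal sketch of pairing each ligand pose with its receptor pose in numeric rank order, assuming the `cluster/ligand` and `cluster/protein` layout shown above. `paired_poses` is a hypothetical helper written for illustration; it is not part of pyCHARMM.

```python
import re
from pathlib import Path


def paired_poses(cluster_dir):
    """Return [(rank, ligand_pdb, protein_pdb), ...] sorted by numeric rank.

    Assumes the layout shown above: <cluster_dir>/ligand/top_N.pdb and
    <cluster_dir>/protein/top_N.pdb. Ranks that lack a partner are skipped.
    """
    cluster_dir = Path(cluster_dir)
    pattern = re.compile(r"top_(\d+)\.pdb$")

    def by_rank(subdir):
        # Map the integer rank parsed from each file name to its path.
        found = {}
        for path in (cluster_dir / subdir).glob("top_*.pdb"):
            match = pattern.match(path.name)
            if match:
                found[int(match.group(1))] = path
        return found

    ligands = by_rank("ligand")
    proteins = by_rank("protein")
    shared = sorted(set(ligands) & set(proteins))
    return [(rank, ligands[rank], proteins[rank]) for rank in shared]
```

With a finished run, `for rank, lig, prot in paired_poses("dockresult/cluster"): print(rank, lig.name, prot.name)` walks the poses from best to worst rank.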

#### Let's look at the docked pose first

In [8]:
# To view poses in PyMOL, run the ssh command shown after this cell in a terminal,
# then set usepymol to True. Otherwise leave usepymol as False and proceed.
usepymol = False
if usepymol:
    import xmlrpc.client as xmlrpclib
    # Connect to the PyMOL XML-RPC server listening on port 9123
    cmd = xmlrpclib.ServerProxy('http://localhost:9123')
    cmd.reinitialize()

ssh -N -f -R 9123:localhost:9123 satyr

In [6]:
# cmd is the PyMOL command proxy created above; it only exists when usepymol is True
if usepymol:
    # Clear the session, then load the reference structures and the top-ranked docked pose
    cmd.delete('all')
    cmd.load('../flex/ligand.pdb')
    cmd.load('../flex/protein.pdb')
    cmd.load('dockresult/cluster/ligand/top_1.pdb')
    cmd.orient('top_1')

#### Now let's look at these TSV files.

In [7]:
import pandas as pd

1. dockResult.tsv will always be created. It contains the docking results for all docked poses and is sorted by <font color='red'>total energy</font>. 

In [10]:
print(pd.read_csv("dockresult/dockResult.tsv", sep ='\t').drop("Unnamed: 0", axis = 1))

     enthalpy  PDB_name
0  -26.557646        20
1  -26.517979        24
2  -26.490580        12
3  -26.479465         3
4  -26.461838        13
5  -26.446676        19
6  -26.300365         8
7  -26.145816        23
8  -26.138086         7
9  -25.987131        16
10 -24.780220         4
11 -24.347180        18
12 -24.301400         2
13 -24.140987        15
14 -24.117587        21
15 -24.020850        11
16 -23.928780         6
17 -23.743860        10
18 -23.604960        22
19 -21.558777        17
20 -20.198841         5
21 -20.003035        14
22 -19.966674         9
23 -18.609989         1
24 -18.508390        25
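
To pick poses programmatically rather than by eye, the table can be re-read and filtered. The sketch below uses only the standard library and four rows copied from the output above, so it runs without a docking result. The leading unnamed index column is an assumption inferred from the `Unnamed: 0` column dropped in the cell above, and `lowest_energy_poses` is a hypothetical helper, not a pyCHARMM function.

```python
import csv
import io

# Four rows copied from the output above, in the assumed on-disk layout
# (tab-separated, with an unnamed index column first).
SAMPLE_TSV = (
    "\tenthalpy\tPDB_name\n"
    "0\t-26.557646\t20\n"
    "1\t-26.517979\t24\n"
    "2\t-26.490580\t12\n"
    "23\t-18.609989\t1\n"
)


def lowest_energy_poses(handle, n=3, column="enthalpy"):
    """Return the PDB names of the n rows with the lowest value in `column`."""
    rows = list(csv.DictReader(handle, delimiter="\t"))
    rows.sort(key=lambda row: float(row[column]))
    return [int(row["PDB_name"]) for row in rows[:n]]


print(lowest_energy_poses(io.StringIO(SAMPLE_TSV), n=2))  # prints [20, 24]
```

For a real run, pass `open("dockresult/dockResult.tsv", newline="")` in place of the `io.StringIO` object.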


2. clusterResult.tsv will always be created. It contains the representative pose of each cluster. By default, these poses are rescored with explicit receptor atoms and sorted by <font color='red'>total energy</font>. To sort by a different column, set the parameter <font color='red'>sort_energy</font>.

In [11]:
print(pd.read_csv("dockresult/clusterResult.tsv", sep = '\t').drop("Unnamed: 0", axis = 1))

   total_energy   enthalpy        vdw       elec    entropy  cluster_size  \
0    -53.410567 -33.276644 -32.276249 -64.354162 -20.133924             4   
1    -53.063334 -32.970723 -32.058904 -64.630001 -20.092611             6   
2    -52.846125 -32.717663 -32.306627 -64.698845 -20.128462             2   
3    -52.835510 -32.699545 -32.288509 -64.691699 -20.135965             2   
4    -52.329302 -32.276985 -32.276985 -64.698259 -20.052317             3   
5    -52.260228 -32.157248 -31.779796 -64.003921 -20.102980             3   
6    -46.883818 -27.518812 -27.185348 -60.976100 -19.365006             4   

   PDB_name  cluster_id  
0        19           5  
1         7           7  
2        24           2  
3        13           1  
4        12           4  
5        14           3  
6        15           6  
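
The cluster table can also be re-ranked after the fact without rerunning the docking. Below is a minimal sketch using three rows copied from the output above. It does not call pyCHARMM, and `rank_clusters` is a hypothetical helper. With pandas, `DataFrame.sort_values` does the same job.

```python
# Three rows copied from the clusterResult.tsv output above.
CLUSTERS = [
    {"cluster_id": 5, "PDB_name": 19, "total_energy": -53.410567, "cluster_size": 4},
    {"cluster_id": 7, "PDB_name": 7, "total_energy": -53.063334, "cluster_size": 6},
    {"cluster_id": 6, "PDB_name": 15, "total_energy": -46.883818, "cluster_size": 4},
]


def rank_clusters(rows, column="total_energy", descending=False):
    """Return a new list of rows ordered by `column` (ascending by default)."""
    return sorted(rows, key=lambda row: row[column], reverse=descending)


# Lowest total energy first, as in the table above.
by_energy = [row["cluster_id"] for row in rank_clusters(CLUSTERS)]

# Most populated cluster first; rows with equal size keep their input order.
by_size = [
    row["cluster_id"]
    for row in rank_clusters(CLUSTERS, "cluster_size", descending=True)
]

print(by_energy, by_size)  # prints [5, 7, 6] [7, 5, 6]
```

Ranking by cluster size is shown only as an alternative view of the same data; which ordering is more useful depends on the system being docked.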


3. By default, all docked poses are rescored with explicit receptor atoms. The result is saved in the file explicitResult.tsv and sorted by <font color='red'>total energy</font>. To sort by a different column, set the parameter <font color='red'>sort_energy</font>.

In [12]:
print(pd.read_csv("dockresult/explicitResult.tsv", sep = '\t').drop("Unnamed: 0", axis = 1))

    total_energy   enthalpy        vdw       elec   entropy  PDB_name  \
0     -33.277836 -32.277441 -64.354852 -20.134152 -1.000394        19   
1     -33.159194 -32.247376 -64.607254 -20.169272 -0.911818         7   
2     -32.818621 -31.906802 -64.392613 -19.992668 -0.911818        16   
3     -32.778768 -31.866950 -64.929446 -20.054530 -0.911818         3   
4     -32.718019 -32.306983 -64.699056 -20.128485 -0.411036        24   
5     -32.700584 -32.289548 -64.692293 -20.136180 -0.411036        13   
6     -32.309350 -32.309350 -64.743637 -20.080690  0.000000        12   
7     -32.242946 -32.242946 -64.649027 -20.020632  0.000000         4   
8     -32.181829 -31.804376 -64.215382 -19.889800 -0.377453        14   
9     -32.132711 -31.755258 -63.775557 -20.334953 -0.377453         8   
10    -28.337421 -27.337027 -61.495547 -19.185280 -1.000394        11   
11    -28.202448 -27.290629 -61.216193 -19.194515 -0.911818         2   
12    -27.817730 -26.905911 -61.100967 -19.073484 -
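
In the rows printed above, `total_energy` matches `enthalpy + entropy` to within rounding of the printed digits. This is an observation about the printed output rather than a documented definition of the score. The short check below makes it explicit for three of the rows.

```python
import math

# (total_energy, enthalpy, entropy) copied from rows 0, 4 and 6 above.
ROWS = [
    (-33.277836, -32.277441, -1.000394),
    (-32.718019, -32.306983, -0.411036),
    (-32.309350, -32.309350, 0.000000),
]

for total, enthalpy, entropy in ROWS:
    # Printed values carry six decimals, so allow a few units of rounding.
    assert math.isclose(total, enthalpy + entropy, abs_tol=5e-6)
    print(f"{enthalpy:.6f} + {entropy:.6f} -> {enthalpy + entropy:.6f} (printed {total:.6f})")
```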

## pyCHARMM Flexible CDOCKER Parameters for Controlling Input Information

1. Parameters that require user input

| Parameters | Default value | Meaning|
| :-- | :-- | :-- |
| xcen | 0 | x coordinate of the center of the grid box | 
| ycen | 0 | y coordinate of the center of the grid box | 
| zcen | 0 | z coordinate of the center of the grid box | 
| maxlen | 10 | size of the grid box | 
| flexchain | None | flexible side chain selection (pandas dataframe)|
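
One common way to choose `xcen`, `ycen` and `zcen` is the geometric center of a ligand that already sits in the binding pocket. The sketch below assumes standard fixed-column PDB records (x, y, z in columns 31-54). `ligand_center`, `pdb_line`, the two made-up atoms, and the padding added to obtain `maxlen` are all illustrative choices, not pyCHARMM functions or defaults.

```python
def ligand_center(pdb_lines):
    """Return (center, extent) from the ATOM/HETATM records in pdb_lines.

    center is the unweighted mean of x, y, z; extent is the largest
    coordinate range along any single axis.
    """
    coords = [
        (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        for line in pdb_lines
        if line.startswith(("ATOM", "HETATM"))
    ]
    if not coords:
        raise ValueError("no ATOM/HETATM records found")
    axes = list(zip(*coords))
    center = tuple(sum(axis) / len(axis) for axis in axes)
    extent = max(max(axis) - min(axis) for axis in axes)
    return center, extent


def pdb_line(serial, name, x, y, z):
    """Format one HETATM record with coordinates in PDB columns 31-54."""
    return f"HETATM{serial:5d} {name:<4s} LIG A   1    {x:8.3f}{y:8.3f}{z:8.3f}"


# Two made-up atoms, only to show the call; a real run would read ligand.pdb.
example = [
    pdb_line(1, "C1", 10.0, 20.0, 30.0),
    pdb_line(2, "C2", 12.0, 24.0, 30.0),
]
(xcen, ycen, zcen), extent = ligand_center(example)
maxlen = extent + 8.0  # the 8 angstrom padding is an arbitrary illustration
print(xcen, ycen, zcen, maxlen)  # prints 11.0 22.0 30.0 12.0
```

For an actual ligand file, `ligand_center(open("ligand.pdb"))` works the same way, since the function only needs an iterable of lines.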

2. Parameters that are miscellaneous. 

| Parameters | Default value | Meaning|
| :-- | :-- | :-- |
| ligPDB | 'ligand.pdb' | ligand PDB file | 
| ligSeg | 'LIGA' | Segment ID in the ligand PDB file | 
| receptorPDB | 'protein.pdb' | receptor PDB file | 
| receptorPSF | 'protein.psf' | receptor PSF file | 
| num | 20 | number of conformers | 
| copy | 25 | number of copies per conformer |
| sort_energy | 'total_energy' | sorting method of the explicit all atom minimization result |
| saveDir | './dockresult/' | folder name of the final result directory |
| flag_center_ligand | True | whether to center the ligand in the binding pocket | 
| flag_grid | False | whether the grid needs to be generated before docking |
| flag_delete_grid | True | whether to delete the grid after docking |
| flag_save_placement | True | whether to save the initial ligand placement after docking | 
| flag_save_crossover | False | whether to save crossover results after docking | 
| flag_save_all | False | whether to save all docked poses after docking |
| flag_save_cluster | True | whether to save clustered results after docking |
| flag_save_top | True | whether to save the top 10 ranked poses after docking |
| top_N_result | 10 | number of top N largest clusters, final generation uses top N + 5 clusters | 
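
Since most of these parameters have workable defaults, a run usually overrides only a handful. Below is a minimal sketch of collecting overrides and catching misspelled parameter names before launching a long job. The `DEFAULTS` dict copies a subset of the tables above, the coordinate values are placeholders, and `docking_kwargs` is a hypothetical helper rather than a pyCHARMM function.

```python
# Subset of the defaults listed in the tables above.
DEFAULTS = {
    "xcen": 0, "ycen": 0, "zcen": 0, "maxlen": 10,
    "ligPDB": "ligand.pdb", "ligSeg": "LIGA",
    "receptorPDB": "protein.pdb", "receptorPSF": "protein.psf",
    "num": 20, "copy": 25,
    "sort_energy": "total_energy", "saveDir": "./dockresult/",
    "flag_grid": False, "flag_save_cluster": True, "top_N_result": 10,
}


def docking_kwargs(**overrides):
    """Merge overrides into DEFAULTS, rejecting unknown parameter names."""
    unknown = sorted(set(overrides) - set(DEFAULTS))
    if unknown:
        raise KeyError(f"unknown docking parameter(s): {', '.join(unknown)}")
    return {**DEFAULTS, **overrides}


# Placeholder grid-box values; replace them with the real binding-site center.
params = docking_kwargs(xcen=12.3, ycen=33.5, zcen=19.8, maxlen=26.0)
```

A misspelled name such as `docking_kwargs(max_len=26.0)` raises `KeyError` immediately instead of silently falling back to the default box size.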

3. Parameters that are not recommended to be changed and the default values are applicable for most use cases.

| Parameters | Default value | Meaning|
| :-- | :-- | :-- |
| flag_form | False | whether or not grid form is formatted | 
| probeFile | "../Toppar/fftdock_c36prot_cgenff_probes.txt" | probe file for gpu grid generation| 
| dihedralFile | '../Toppar/protein_dihedral.csv' | protein amino acid side chain dihedral angle look-up table |
| softGridFile | 'grid-emax-0.6-mine--0.4-maxe-0.4.bin' | soft grid file name |
| hardGridFile | 'grid-emax-3-mine--30-maxe-30.bin' | hard grid file name |
| nativeGridFile | 'grid-emax-100-mine--100-maxe-100.bin' | native grid file name |
| receptorCard | '"./flexchain.crd"' | receptor flexible side chain coordinate card |
| saveLig | './ligand/' | folder name for all ligand docked poses |
| saveProt | './protein/' | folder name for all protein docked poses |
| saveLigFinal | './ligand_final/' | folder name for ligand final cluster representatives |
| saveProtFinal | './protein_final/' | folder name for protein final cluster representatives |
| crossoverLig | './crossover_ligand/' | folder name for ligand crossover |
| crossoverProt | './crossover_protein/' | folder name for protein crossover |
| placementDir | './placement/' | ligand placement folder name |
| threshold_init | 2500 | energy threshold for initial placement |
| threshold_mutate | 100 | energy threshold for mutation |
| flag_fast_grid | False | True for using grid minimization, False for explicit atom minimization |
| flag_fast_placement | False | True for using fast placement, False for using original Flexible CDOCKER placement method |
| exhaustiveness | 'high' | exhaustiveness for fast placement, high, medium, low |
| flag_suppress_print | True | whether to suppress warning messages during docking | 

# PyCHARMM CDOCKER Docking Summary

## Docking Speed

1. Slightly improved Rigid CDOCKER docking speed.
2. Improved Flexible CDOCKER docking speed by at least 2-fold.

<img src='figure/fig_pycharmm_time.png' width=50%></img>

## Docking Accuracy

Because the search is now more efficient, we can afford a more exhaustive search, and as a result we observe improved pose prediction performance.

    1. We compared the Rigid CDOCKER docking performance with the benchmark dataset SB2012.
    2. We compared the Flexible CDOCKER docking performance with the cross-docking benchmark dataset. 

<img src='figure/fig_pycdocker.png' width=80%></img>

## Summary

    1. Reduce the complexity
    2. Accelerate the CDOCKER family
    3. Improve docking accuracy
    4. Easy to integrate with other python packages

## Future work

    1. pyCHARMM Flexible CDOCKER atom selection
    2. FFT docking in pyCHARMM
    3. One-line code for virtual screening