# Rescore docking poses with predicted binding affinity

This notebook analyzes docking pose rescoring. It contains the same analyses conducted by Chachulski & Windshügel in their [LEADS-FRAG paper](https://doi.org/10.1021/acs.jcim.0c00693), where they benchmark different molecular docking programs with and without rescoring:

**Analysis 1**: To evaluate the rescoring based on the entire set of dockings, we count the number of top poses with an RMSD below several thresholds. We compare these counts before and after rescoring.

**Analysis 2**: To directly compare rescoring with the scoring function used by the docking program, we count dockings where the best pose (RMSD-wise) is also the top ranked pose.

The input consists of three CSV files:
1. docking poses with rank and RMSD
2. binding affinities predicted for each pose and its receptor
3. binding affinities predicted for each reference ligand and its receptor

Some assumptions are made regarding format:
- columns are separated by ";"
- the column "Receptor" contains some sort of ID to link the values from different files
- the column "PoseRank" contains each pose's rank according to the docking program (top pose ranked first)
- the column "RMSD" contains RMSD between a pose and some reference
- the column "PredictedBindingAffinity" contains the predicted binding affinities
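
The format assumptions above can be checked before any data is loaded. The following is a minimal sketch using only the standard library; the helper name `missing_columns` and the inline sample are hypothetical and not part of the original notebook.

```python
import csv
import io

def missing_columns(csv_text, required, delimiter=';'):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    header = next(reader, [])
    return [name for name in required if name not in header]

# Hypothetical two-line sample in the expected format.
sample = "Receptor;PoseRank;RMSD\n1Q11;1;10.889195\n"

print(missing_columns(sample, ['Receptor', 'PoseRank', 'RMSD']))
print(missing_columns(sample, ['Receptor', 'PredictedBindingAffinity']))
```

A file that uses a different separator fails this check for every column, because the whole header line is then read as a single field.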

With the predicted affinity, we can re-rank the docking poses. To evaluate whether this improves a specific docking, we compare the RMSD of the pose ranked highest by the docking program with the RMSD of the pose with the highest predicted affinity. If the top pose based on affinity has a lower RMSD, the re-ranking improved the docking result.

The predicted binding affinity for reference ligands opens another way of evaluating the rescoring: if it works well, the reference ligands should be assigned an affinity higher than all their docking poses.

## Load and prepare RMSD and affinity data

In [1]:
# SELECT DATA TO LOAD

import pandas as pd

# Pose rank and RMSD
docking = pd.read_csv('../example_data/affinity_predictions/docking_poses.csv', sep=';')

# predicted affinity of each pose
predictions = pd.read_csv('../example_data/affinity_predictions/affinity_predictions.csv', sep=';')

# predicted affinity of the reference
refligs = pd.read_csv('../example_data/affinity_predictions/affinity_predictions_refligands.csv', sep=';')

The data frames look as follows:

In [2]:
docking

Unnamed: 0,Receptor,PoseRank,RMSD
0,1Q11,1,10.889195
1,1Q11,2,11.078528
2,1Q11,3,11.170565
3,1Q11,4,10.772341
4,1Q11,5,10.793099
5,1Q11,6,11.039625
6,1Q11,7,11.193941
7,5ILW,1,3.631839
8,5ILW,2,0.587167
9,5ILW,3,3.868636


In [3]:
predictions

Unnamed: 0,Receptor,Ligand,PredictedBindingAffinity
0,1Q11,1,4.336701
1,1Q11,2,4.221958
2,1Q11,3,4.097513
3,1Q11,4,4.077734
4,1Q11,5,4.081782
5,1Q11,6,4.238152
6,1Q11,7,4.107012
7,5ILW,1,4.862881
8,5ILW,2,5.091005
9,5ILW,3,4.811201


To make life easier, make column names match between both frames.

In [4]:
predictions.rename(columns={'Ligand': 'PoseRank'}, inplace=True)

Now the frames can be easily combined into a single frame.

In [5]:
combined = docking.merge(predictions, sort=True)
combined

Unnamed: 0,Receptor,PoseRank,RMSD,PredictedBindingAffinity
0,1Q11,1,10.889195,4.336701
1,1Q11,2,11.078528,4.221958
2,1Q11,3,11.170565,4.097513
3,1Q11,4,10.772341,4.077734
4,1Q11,5,10.793099,4.081782
5,1Q11,6,11.039625,4.238152
6,1Q11,7,11.193941,4.107012
7,5F25,1,2.527784,5.033033
8,5F25,2,2.251039,5.020662
9,5F25,3,0.694955,5.051973
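
The merge above joins on the columns shared by both frames (`Receptor` and `PoseRank`) and, being an inner join by default, silently drops poses that have no prediction. A hedged sketch of how these expectations could be made explicit with the `on`, `validate`, and `indicator` parameters of `DataFrame.merge`; the toy frames and their values are made up for illustration.

```python
import pandas as pd

# Made-up toy frames; the values are illustrative only.
docking_toy = pd.DataFrame({
    'Receptor': ['A', 'A', 'B'],
    'PoseRank': [1, 2, 1],
    'RMSD': [3.2, 0.8, 1.5],
})
predictions_toy = pd.DataFrame({
    'Receptor': ['A', 'A'],
    'PoseRank': [1, 2],
    'PredictedBindingAffinity': [4.1, 4.9],
})

# validate raises pandas.errors.MergeError if a (Receptor, PoseRank) pair is duplicated;
# indicator adds a '_merge' column that marks rows present in only one frame.
checked = docking_toy.merge(
    predictions_toy,
    on=['Receptor', 'PoseRank'],
    how='left',
    validate='one_to_one',
    indicator=True,
)
unmatched = checked[checked['_merge'] == 'left_only']
print(unmatched[['Receptor', 'PoseRank']])
```

In this toy case the single pose of receptor `B` is reported as unmatched instead of disappearing from the combined frame.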


## Rescoring

Rescoring means assigning a new ranking order to the poses, in this case based on the predicted binding affinity. The higher the affinity, the better. We use RMSD to break ties in the affinity rank (lower RMSD means better rank).

In [6]:
# Sort by RMSD first so that method='first' breaks affinity ties in favor of the lower RMSD
grouped = combined.sort_values(by='RMSD').groupby(by='Receptor')
combined['AffinityRank'] = grouped['PredictedBindingAffinity'].rank(ascending=False, method='first')
combined

Unnamed: 0,Receptor,PoseRank,RMSD,PredictedBindingAffinity,AffinityRank
0,1Q11,1,10.889195,4.336701,1.0
1,1Q11,2,11.078528,4.221958,3.0
2,1Q11,3,11.170565,4.097513,5.0
3,1Q11,4,10.772341,4.077734,7.0
4,1Q11,5,10.793099,4.081782,6.0
5,1Q11,6,11.039625,4.238152,2.0
6,1Q11,7,11.193941,4.107012,4.0
7,5F25,1,2.527784,5.033033,3.0
8,5F25,2,2.251039,5.020662,4.0
9,5F25,3,0.694955,5.051973,1.0
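
The ranking rule can also be written as an explicit sort key, which makes the tie-break visible. This is an illustrative sketch in plain Python with made-up values; `affinity_ranks` is a hypothetical helper, not the notebook's implementation.

```python
def affinity_ranks(poses):
    """Rank poses of one receptor: highest affinity first, lower RMSD wins ties.

    poses: list of (pose_rank, rmsd, affinity) tuples.
    Returns a dict mapping pose_rank to its affinity rank (1 = best).
    """
    ordered = sorted(poses, key=lambda p: (-p[2], p[1]))
    return {pose_rank: rank for rank, (pose_rank, _, _) in enumerate(ordered, start=1)}

# Made-up poses: poses 1 and 3 have the same predicted affinity.
toy_poses = [(1, 2.4, 5.0), (2, 1.1, 4.7), (3, 0.9, 5.0)]
print(affinity_ranks(toy_poses))   # {3: 1, 1: 2, 2: 3}
```

Pose 3 outranks pose 1 because both share the highest affinity and pose 3 has the lower RMSD.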


## Did rescoring improve individual dockings?
To compare the old and new rankings for individual dockings, look at the RMSD difference between the old and the new top pose. When subtracting the new top pose RMSD from the old one, a difference > 0 shows an improvement after rescoring. For this analysis, we only use dockings with potential for improvement, i.e. where the top pose RMSD is not the lowest RMSD.

In [7]:
# Rank poses according to RMSD
combined['RMSDRank'] = combined[['Receptor', 'RMSD']].groupby(by='Receptor').rank(ascending=True)

# Construct new data frame with minimum RMSD, top pose RMSD and top affinity RMSD for each receptor.
stats = combined.groupby(by='Receptor').min().rename(columns={'RMSD': 'MinRMSD'})
stats = stats.reset_index()[['Receptor', 'MinRMSD']]
stats = stats.merge(combined[combined.PoseRank == 1].rename(columns={'RMSD': 'TopPoseRMSD'})[['Receptor', 'TopPoseRMSD']])
stats = stats.merge(combined[combined.AffinityRank == 1][['Receptor', 'RMSD']].rename(columns={'RMSD': 'TopAffinityRMSD'}))

# All dockings with top pose RMSD > minimum RMSD can be improved through rescoring
improvable_dockings = stats[stats.TopPoseRMSD != stats.MinRMSD][['Receptor', 'TopPoseRMSD', 'TopAffinityRMSD', 'MinRMSD']]
print(f'There are {len(improvable_dockings)} dockings with potential for improvement:')
improvable_dockings[['Receptor', 'TopPoseRMSD', 'MinRMSD']]

There are 3 dockings with potential for improvement:


Unnamed: 0,Receptor,TopPoseRMSD,MinRMSD
0,1Q11,10.889195,10.772341
1,5F25,2.527784,0.694955
2,5ILW,3.631839,0.587167


In [8]:
# Now we can use the difference of top pose RMSD and top affinity RMSD to see the effect of rescoring.
improvable_dockings['DeltaRMSD'] = improvable_dockings.TopPoseRMSD - improvable_dockings.TopAffinityRMSD
improved_dockings = improvable_dockings[improvable_dockings.DeltaRMSD > 0]
worse_dockings = improvable_dockings[improvable_dockings.DeltaRMSD < 0]
print(round(len(improved_dockings)/len(improvable_dockings) * 100, 1), '% of which improved.')
print(round(len(worse_dockings)/len(improvable_dockings) * 100, 1), '% of which degraded.')
print('Improved dockings:')
improved_dockings.drop(columns='MinRMSD').sort_values(by='DeltaRMSD', ascending=False, ignore_index=True)

66.7 % of which improved.
0.0 % of which degraded.
Improved dockings:


Unnamed: 0,Receptor,TopPoseRMSD,TopAffinityRMSD,DeltaRMSD
0,5ILW,3.631839,0.587167,3.044672
1,5F25,2.527784,0.694955,1.832829
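
The two percentages need not add up to 100 %: a docking whose top pose stays the same after rescoring has a DeltaRMSD of exactly 0 and is counted in neither group. A small sketch of the three-way classification follows. The RMSD values are taken from the tables in this notebook (for 1Q11, the pose ranked first by the docking program also has AffinityRank 1); the function name `classify` is hypothetical.

```python
def classify(top_pose_rmsd, top_affinity_rmsd):
    """Label the effect of rescoring on a single docking."""
    delta = top_pose_rmsd - top_affinity_rmsd
    if delta > 0:
        return 'improved'
    if delta < 0:
        return 'degraded'
    return 'unchanged'

# (TopPoseRMSD, TopAffinityRMSD) per receptor, taken from the tables in this notebook.
dockings = {
    '1Q11': (10.889195, 10.889195),
    '5F25': (2.527784, 0.694955),
    '5ILW': (3.631839, 0.587167),
}
labels = {receptor: classify(*rmsds) for receptor, rmsds in dockings.items()}
print(labels)
```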


## The big picture: Did rescoring improve overall docking results (Analysis 1)?
To further compare the rankings, count top poses with an RMSD in several intervals. This is based on the premise that the docking program is able to produce good poses, but its scoring function fails to rank them first. If rescoring performs better than the original scoring function, the number of top poses with low RMSD increases after rescoring.

In [9]:
top_docking_poses = docking[docking.PoseRank == 1]
n_dockings = len(top_docking_poses)
top_rescoring_poses = combined[combined.AffinityRank == 1]
thresholds = [0.5, 1.0, 1.5, 2.0, 2.5]
counts_docking = []
counts_rescoring = []
for threshold in thresholds:
    # Count top poses with RMSD in the half-open interval [threshold - 0.5, threshold)
    counts_docking.append(((top_docking_poses.RMSD >= threshold - 0.5) & (top_docking_poses.RMSD < threshold)).sum())
    counts_rescoring.append(((top_rescoring_poses.RMSD >= threshold - 0.5) & (top_rescoring_poses.RMSD < threshold)).sum())

pd.DataFrame({'RMSDThreshold': thresholds, 'TopPoseCountDocking': counts_docking, 'TopPoseCountRescoring': counts_rescoring})

Unnamed: 0,RMSDThreshold,TopPoseCountDocking,TopPoseCountRescoring
0,0.5,0,0
1,1.0,0,2
2,1.5,0,0
3,2.0,0,0
4,2.5,0,0


In [10]:
print(f'Before rescoring, {round(sum(counts_docking)/n_dockings * 100, 1)} % of dockings have a top pose with RMSD<2.5 A.')
print(f'After rescoring, {round(sum(counts_rescoring)/n_dockings * 100, 1)} % of dockings have a top pose with RMSD<2.5 A.')

Before rescoring, 0.0 % of dockings have a top pose with RMSD<2.5 A.
After rescoring, 66.7 % of dockings have a top pose with RMSD<2.5 A.
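
Each row of the table counts top poses in a half-open interval [threshold - 0.5, threshold), so the sum over all rows equals the number of top poses with an RMSD below 2.5 A. A plain-Python sketch of this binning, using made-up RMSD values; `count_per_interval` is a hypothetical helper.

```python
def count_per_interval(rmsds, thresholds, width=0.5):
    """Count values falling into [threshold - width, threshold) for each threshold."""
    return [sum(1 for r in rmsds if t - width <= r < t) for t in thresholds]

thresholds = [0.5, 1.0, 1.5, 2.0, 2.5]
toy_rmsds = [0.59, 0.69, 10.89, 2.5]   # made-up top-pose RMSDs

counts = count_per_interval(toy_rmsds, thresholds)
print(counts)                          # [0, 2, 0, 0, 0]
print(sum(counts) / len(toy_rmsds))    # 0.5
```

A pose with an RMSD of exactly 2.5 falls outside the last interval, which matches the strict `<` comparison used in the loop.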


## A closer look: Ranking of pose with lowest RMSD (Analysis 2)
This is an analysis of the entire docking pipeline. Here we count what could be called correctly reproduced binding modes: dockings where the pose with the lowest RMSD is ranked first and has an RMSD<2.5 A.

In [11]:
top_docking_poses = combined[(combined.PoseRank == 1) & (combined.RMSDRank == 1)]
top_rescoring_poses = combined[(combined.AffinityRank == 1) & (combined.RMSDRank == 1)]

thresholds = [0.5, 1.0, 1.5, 2.0, 2.5]
counts_docking = []
counts_rescoring = []
for threshold in thresholds:
    # Count poses with RMSD in the half-open interval [threshold - 0.5, threshold)
    counts_docking.append(((top_docking_poses.RMSD >= threshold - 0.5) & (top_docking_poses.RMSD < threshold)).sum())
    counts_rescoring.append(((top_rescoring_poses.RMSD >= threshold - 0.5) & (top_rescoring_poses.RMSD < threshold)).sum())

pd.DataFrame({'RMSDThreshold': thresholds, 'LowestRMSDTopPoseCountDocking': counts_docking, 'LowestRMSDTopPoseCountRescoring': counts_rescoring})

Unnamed: 0,RMSDThreshold,LowestRMSDTopPoseCountDocking,LowestRMSDTopPoseCountRescoring
0,0.5,0,0
1,1.0,0,2
2,1.5,0,0
3,2.0,0,0
4,2.5,0,0


In [12]:
print(f'Before rescoring, {round(sum(counts_docking)/n_dockings * 100, 1)} % of dockings correctly reproduced the binding mode.')
print(f'After rescoring, {round(sum(counts_rescoring)/n_dockings * 100, 1)} % of dockings correctly reproduced the binding mode.')

Before rescoring, 0.0 % of dockings correctly reproduced the binding mode.
After rescoring, 66.7 % of dockings correctly reproduced the binding mode.
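
The criterion used in this analysis combines three conditions per docking: the pose is ranked first, it has the lowest RMSD among all poses of its receptor, and that RMSD is below 2.5 A. A sketch under the assumption that each pose is given as a (rank, RMSD) pair; the name `reproduces_binding_mode` and the toy values are illustrative. Unlike the `RMSDRank` column, this sketch treats poses tied for the lowest RMSD as lowest.

```python
def reproduces_binding_mode(poses, cutoff=2.5):
    """poses: list of (rank, rmsd) pairs for one receptor, rank 1 = top pose.

    True if the top-ranked pose is also the lowest-RMSD pose and its RMSD is below cutoff.
    """
    top_rmsd = next(rmsd for rank, rmsd in poses if rank == 1)
    lowest_rmsd = min(rmsd for _, rmsd in poses)
    return top_rmsd == lowest_rmsd and top_rmsd < cutoff

# Made-up rankings of the same three poses: by docking score and by predicted affinity.
by_docking = [(1, 3.6), (2, 0.6), (3, 3.9)]
by_affinity = [(2, 3.6), (1, 0.6), (3, 3.9)]

print(reproduces_binding_mode(by_docking))    # False
print(reproduces_binding_mode(by_affinity))   # True
```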


## Validate prediction with reference ligands
By including the reference ligand in the affinity ranking, we can validate the affinity prediction. If the prediction works well, the reference ligand should rank first. To make the validation a little less strict, we exclude dockings where the top pose (affinity-wise) already has an RMSD<1 A. All reference ligands with a higher predicted affinity than the top affinity pose are shown below.

In [13]:
refligs.rename(columns={'PredictedBindingAffinity': 'RefligBindingAffinity'}, inplace=True)
refligs = refligs.merge(stats[stats.TopAffinityRMSD >= 1])

# Compare affinity of top rescored pose with ref. ligand binding affinity
refligs = refligs.merge(combined[['Receptor', 'PredictedBindingAffinity']][combined.AffinityRank == 1].rename(columns={'PredictedBindingAffinity': 'MaxPoseAffinity'}))
top_ranked_refligs = refligs.loc[refligs.RefligBindingAffinity > refligs.MaxPoseAffinity].reset_index(drop=True)

n_refligs_should_rank_first = len(refligs)
print(n_refligs_should_rank_first, 'dockings have a top affinity pose with an RMSD>=1 A.')
print(f'{round(len(top_ranked_refligs)/n_refligs_should_rank_first * 100, 1)} % of their reference ligands have a higher predicted binding affinity than all corresponding docking poses:')
top_ranked_refligs[['Receptor', 'RefligBindingAffinity', 'MaxPoseAffinity']]

1 dockings have a top affinity pose with an RMSD>=1 A.
100.0 % of their reference ligands have a higher predicted binding affinity than all corresponding docking poses:


Unnamed: 0,Receptor,RefligBindingAffinity,MaxPoseAffinity
0,1Q11,4.582866,4.336701
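
The check behind this table reduces to comparing one number against a maximum. A sketch in plain Python follows; the 1Q11 affinities are taken from the tables above, and the helper name `reference_ranks_first` is hypothetical.

```python
def reference_ranks_first(reference_affinity, pose_affinities):
    """True if the reference ligand's predicted affinity exceeds that of every pose."""
    return reference_affinity > max(pose_affinities)

# Predicted affinities of the seven 1Q11 poses and of its reference ligand, from the tables above.
poses_1q11 = [4.336701, 4.221958, 4.097513, 4.077734, 4.081782, 4.238152, 4.107012]
reference_1q11 = 4.582866

print(reference_ranks_first(reference_1q11, poses_1q11))   # True
```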
