# Binder comparison

In [48]:
import os
import pandas as pd

pd.set_option('display.max_columns', 500)

## Initial experiment setup

I want to compare the original RFDiffusion model against LabDAO's fork. My initial setup follows the one used by Baker et al. (2023) to compare the Rosetta pipeline with the RFDiffusion pipeline. Its key elements are as follows:
- Each model will be used to design binders for 5 different target proteins
    - Targets selected from the Baker et al. (2023) paper: IL-7Ra, TrkA, PD-L1, InsR, IH
- 10 different binders designed for each target
    - Limited to 10 because each binder takes approx. 7 minutes to generate
- Same hotspot residues chosen as detailed in Baker et al. (2023)
- Target chain selected as the one containing the hotspot residue
    - Target start- and end-residues chosen as 1 and len(target chain)
- Comparison metrics are pLDDT, iPAE, and PRODIGY score
- Thresholds for filtering potential binders
    - iPAE < 10
    - RMSD < 1 Angstrom
    - pLDDT > 80
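
The filtering thresholds above can be sketched as a small pandas helper. This is only a minimal sketch under stated assumptions, not the pipeline's own filter: the function name `passes_thresholds` is hypothetical, the column names are borrowed from the `outcome_cols` list used later in this notebook, and I assume that pLDDT may be stored as a 0-1 fraction and needs rescaling to 0-100 before it is compared with 80. The toy values are made up.

```python
import pandas as pd

# Thresholds from the list above (pLDDT on a 0-100 scale, RMSD in Angstrom).
IPAE_MAX = 10.0
RMSD_MAX = 1.0
PLDDT_MIN = 80.0


def passes_thresholds(df: pd.DataFrame) -> pd.Series:
    """Return a boolean mask marking rows that meet all three thresholds.

    Assumes columns named "plddt", "i_pae" and "rmsd" (an assumption here).
    """
    plddt = df["plddt"]
    # Assumption: if every value is <= 1, pLDDT is stored as a fraction.
    if plddt.max() <= 1.0:
        plddt = plddt * 100.0
    return (df["i_pae"] < IPAE_MAX) & (df["rmsd"] < RMSD_MAX) & (plddt > PLDDT_MIN)


# Toy rows with made-up values, only to exercise the helper.
toy = pd.DataFrame(
    {
        "plddt": [0.85, 0.90, 0.70],
        "i_pae": [8.0, 27.5, 6.0],
        "rmsd": [0.8, 0.5, 0.9],
    }
)
mask = passes_thresholds(toy)
print(toy[mask])
```

Returning a mask rather than a filtered frame keeps the original rows available, so the same mask can also be used to count how many designs pass per hotspot.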
    
I will initially test this setup by designing binders against IL-7Ra. The paper details 3 different hotspots for this target, and I will design 3 binders for each.

In [30]:
outcome_cols = ["plddt", "i_pae", "rmsd", "affinity"]

In [31]:
# Define the folder path
folder_path = '../data/output-binders/il-7ra'

# Get a sorted list of the CSV files in the folder (skips hidden or non-CSV entries)
file_names = sorted(f for f in os.listdir(folder_path) if f.endswith(".csv"))

# List to save csvs to
binder_performance_dfs = []

# Loop through each file and save contents
for file_name in file_names:
    file_path = os.path.join(folder_path, file_name)
    df = pd.read_csv(file_path)
    df = df[outcome_cols + ["params.advanced_settings.hotspot"]]
    binder_performance_dfs.append(df)

In [41]:
combined_df = pd.concat(binder_performance_dfs)

combined_df.rename(columns={"params.advanced_settings.hotspot": "hotspot"}, inplace=True)

I will now compute summary statistics for each of these hotspots.

In [56]:
combined_df.groupby(by=["hotspot"]).describe()

| hotspot | metric | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|---|
| B139 | index | 24.0 | 11.5 | 7.071068 | 0.0 | 5.75 | 11.5 | 17.25 | 23.0 |
| B139 | plddt | 24.0 | 0.78005 | 0.089788 | 0.542272 | 0.733608 | 0.804045 | 0.829328 | 0.919441 |
| B139 | i_pae | 24.0 | 27.516729 | 0.413493 | 26.254953 | 27.263924 | 27.550189 | 27.901356 | 28.1395 |
| B139 | rmsd | 24.0 | 28.31398 | 8.895684 | 13.264915 | 21.858055 | 28.014628 | 36.518788 | 41.740845 |
| B139 | affinity | 14.0 | -4.6955 | 0.297464 | -5.393 | -4.8445 | -4.6615 | -4.59125 | -4.207 |
| B58 | index | 24.0 | 11.5 | 7.071068 | 0.0 | 5.75 | 11.5 | 17.25 | 23.0 |
| B58 | plddt | 24.0 | 0.850054 | 0.02941 | 0.7631 | 0.84025 | 0.853562 | 0.86353 | 0.903999 |
| B58 | i_pae | 24.0 | 27.70557 | 0.151555 | 27.517756 | 27.62445 | 27.681326 | 27.752899 | 28.258421 |
| B58 | rmsd | 24.0 | 37.107875 | 11.695424 | 21.173025 | 28.337973 | 32.586342 | 51.705568 | 54.745564 |
| B58 | affinity | 14.0 | -4.639214 | 0.144346 | -4.973 | -4.7295 | -4.6405 | -4.53075 | -4.42 |
| B80 | index | 24.0 | 11.5 | 7.071068 | 0.0 | 5.75 | 11.5 | 17.25 | 23.0 |
| B80 | plddt | 24.0 | 0.837174 | 0.036078 | 0.765431 | 0.809909 | 0.840699 | 0.868821 | 0.895849 |
| B80 | i_pae | 24.0 | 27.643934 | 0.22745 | 27.171406 | 27.460127 | 27.642069 | 27.831105 | 28.032725 |
| B80 | rmsd | 24.0 | 33.395739 | 10.324228 | 12.516999 | 27.252305 | 31.854891 | 38.989036 | 50.450775 |
| B80 | affinity | 17.0 | -4.704882 | 0.138408 | -4.957 | -4.801 | -4.702 | -4.563 | -4.529 |
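
The `affinity` count is lower than the other counts (14 or 17 versus 24). Since `describe()` counts only non-null values, some rows have no affinity value. A minimal sketch for making that explicit and for summarising only the metrics of interest is shown here; the toy frame stands in for `combined_df`, its values are made up, and I assume missing scores are stored as NaN.

```python
import numpy as np
import pandas as pd

# Toy stand-in for combined_df; values are made up for illustration.
toy = pd.DataFrame(
    {
        "hotspot": ["B58", "B58", "B80", "B80", "B80"],
        "plddt": [0.85, 0.86, 0.83, 0.84, 0.80],
        "affinity": [-4.6, np.nan, -4.7, np.nan, np.nan],
    }
)

# Number of rows per hotspot that have no affinity value.
missing_affinity = toy["affinity"].isna().groupby(toy["hotspot"]).sum()

# Compact summary restricted to the chosen metrics and statistics.
summary = toy.groupby("hotspot")[["plddt", "affinity"]].agg(["count", "mean"])

print(missing_affinity)
print(summary)
```

Selecting the metric columns before aggregating also keeps helper columns such as a leftover `index` out of the summary.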
