# Molecular Generator Evaluation using TUPOR, SESY and ASER Metrics

🔹 **Objective**  
This notebook evaluates molecular generators by computing three key metrics:  
   - **TUPOR**: scaffold recall  
   - **SESY**: scaffold hopping potential  
   - **ASER**: chemical space exploration

🔹 **Workflow**  
1️⃣ **Compute Metrics**: The script calculates TUPOR, SESY and ASER for different molecular generators.  
2️⃣ **Merge Data**: Results from multiple generators are combined into a single Pandas DataFrame.  
3️⃣ **Normalize Values**: The computed metrics are normalized using Min-Max scaling for comparison.  
4️⃣ **Save Outputs**: Processed data is stored in CSV files for further analysis.  
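The result tables printed later in this notebook carry an `Unnamed: 0` column. That is what pandas typically produces when a DataFrame is written to CSV together with its index and then read back without `index_col`. Whether `src.metrics` stores its results exactly this way is an assumption. The sketch below only demonstrates the pandas behaviour, using an in-memory buffer and two rows copied from the Molpher output.

```python
import io

import pandas as pd

# Two rows copied from the Molpher output shown later; the frame itself is illustrative.
results = pd.DataFrame(
    {
        "name": ["Molpher_0", "Molpher_1"],
        "TUPOR": [0.934783, 0.863636],
        "SESY": [0.138021, 0.130204],
        "ASER": [0.03143, 0.009258],
    }
)

# The default index=True writes the row index as an unnamed first column.
buffer = io.StringIO()
results.to_csv(buffer)

# Reading the CSV back without index_col names that column "Unnamed: 0".
buffer.seek(0)
with_index_column = pd.read_csv(buffer)

# Passing index_col=0 restores the original column layout instead.
buffer.seek(0)
restored = pd.read_csv(buffer, index_col=0)

print(list(with_index_column.columns))
print(list(restored.columns))
```

Writing with `index=False` avoids the extra column altogether.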

🔹 **Data Structure**  
- The calculations are performed for different **scaffold types** (`csk`, `murcko`) and **cluster types** (`dis`, `sim`).  
- Results are computed for multiple **generators** (`Molpher`, `REINVENT`, `DrugEx`, `GB_GA`, etc.).  
- The analysis is conducted for a specific **biological target receptor**, such as the **Glucocorticoid receptor**.

This notebook lets us compare molecular generators on scaffold recall (TUPOR), scaffold-hopping potential (SESY), and chemical space exploration (ASER).
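The combinations listed under Data Structure can be enumerated as a small parameter grid. This is a minimal sketch: the option strings are taken from the cells in this notebook, and the constant names are illustrative rather than part of `src.metrics`.

```python
from itertools import product

# Option values as they appear in this notebook; the full set accepted by
# src.metrics is not verified here.
RECEPTORS = ["Glucocorticoid_receptor", "Leukocyte_elastase"]
SCAFFOLD_TYPES = ["csk", "murcko"]
CLUSTER_TYPES = ["dis", "sim"]

# Every (receptor, scaffold type, cluster type) combination a generator is evaluated on.
grid = list(product(RECEPTORS, SCAFFOLD_TYPES, CLUSTER_TYPES))

for receptor, type_scaffold, type_cluster in grid:
    print(f"{receptor:<24} {type_scaffold:<7} {type_cluster}")
```

Each generator is therefore evaluated on eight settings, two per scaffold type for each receptor.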

# Loading required libraries

In [5]:
from src import metrics # Importing custom metric functions
import importlib as imp
imp.reload(metrics)

<module 'src.metrics' from '/home/filv/phd_projects/iga_2023/git_reccal/new/recall_metrics/src/metrics.py'>

# Function to calculate metrics

In [6]:
def calculate_metrics(type_cluster, type_scaffold, generator, receptor, ncpus=1):
    """
    Calculate molecular generation metrics for one generator.

    Parameters:
    - type_cluster: Cluster type ('dis' = dissimilarity split, 'sim' = similarity split)
    - type_scaffold: Type of scaffold (e.g., 'csk' or 'murcko')
    - generator: Name of the molecular generator
    - receptor: Target receptor for drug design
    - ncpus: Number of CPUs passed on to metrics.Metrics (default: 1)

    Returns:
    - Computed metrics
    """
    mt = metrics.Metrics(type_cluster, type_scaffold, generator, receptor, ncpus)     
    result = mt.calculate_metrics()
    display(result)
    return result

# Define parameters for metric calculations

In [3]:
type_cluster = 'sim' #options: 'dis'|'sim' 
type_scaffold = 'csk' #options: 'csk'|'murcko'
generator = 'Molpher' #options: 'Molpher'|'DrugEx'|'REINVENT'|'addcarbon'
receptor = 'Leukocyte_elastase' #options: 'Glucocorticoid_receptor'|'Leukocyte_elastase'

calculate_metrics(type_cluster,type_scaffold,generator,receptor, ncpus = 10)

NUMBER:  0


[10:03:34] Explicit valence for atom # 12 C, 5, is greater than permitted
[10:03:49] Explicit valence for atom # 28 C, 5, is greater than permitted
[10:03:49] Explicit valence for atom # 28 C, 5, is greater than permitted
[10:03:49] Explicit valence for atom # 27 C, 5, is greater than permitted
[10:04:05] Explicit valence for atom # 3 C, 5, is greater than permitted
[10:04:05] Explicit valence for atom # 3 C, 5, is greater than permitted
[10:04:05] Explicit valence for atom # 7 C, 5, is greater than permitted
[10:04:05] Explicit valence for atom # 3 C, 5, is greater than permitted
[10:04:05] Explicit valence for atom # 3 C, 5, is greater than permitted
[10:04:05] Explicit valence for atom # 3 C, 5, is greater than permitted
[10:04:05] Explicit valence for atom # 3 C, 5, is greater than permitted
[10:04:05] Explicit valence for atom # 3 C, 5, is greater than permitted
[10:04:05] Explicit valence for atom # 7 C, 5, is greater than permitted
[10:04:05] Explicit valence for atom # 7 C, 5, 

NUMBER:  1


[10:08:55] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:08:55] Explicit valence for atom # 23 C, 5, is greater than permitted
[10:08:55] Explicit valence for atom # 23 C, 5, is greater than permitted
[10:08:55] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:08:55] Explicit valence for atom # 23 C, 5, is greater than permitted
[10:09:48] Explicit valence for atom # 11 C, 5, is greater than permitted
[10:09:48] Explicit valence for atom # 11 C, 5, is greater than permitted
[10:09:48] Explicit valence for atom # 11 C, 5, is greater than permitted
[10:10:01] Explicit valence for atom # 2 C, 5, is greater than permitted
[10:10:01] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:10:01] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:10:07] Explicit valence for atom # 7 C, 5, is greater than permitted
[10:10:48] Explicit valence for atom # 15 C, 5, is greater than permitted
[10:12:09] Explicit valence for atom # 1

NUMBER:  2


[10:15:35] Explicit valence for atom # 10 C, 5, is greater than permitted
[10:16:46] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:16:46] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:17:34] Explicit valence for atom # 6 C, 5, is greater than permitted
[10:17:34] Explicit valence for atom # 6 C, 5, is greater than permitted
[10:17:34] Explicit valence for atom # 25 C, 5, is greater than permitted
[10:17:34] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:17:34] Explicit valence for atom # 6 C, 5, is greater than permitted
[10:17:34] Explicit valence for atom # 6 C, 5, is greater than permitted
[10:17:53] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:17:58] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:17:58] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:17:58] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:18:11] Explicit valence for atom # 7 C, 5

NUMBER:  3


[10:21:36] Explicit valence for atom # 10 C, 5, is greater than permitted
[10:21:54] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:21:56] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:21:56] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:23:58] Explicit valence for atom # 26 C, 5, is greater than permitted
[10:23:58] Explicit valence for atom # 25 C, 5, is greater than permitted
[10:23:58] Explicit valence for atom # 25 C, 5, is greater than permitted
[10:24:35] Explicit valence for atom # 31 C, 5, is greater than permitted
[10:24:35] Explicit valence for atom # 26 C, 5, is greater than permitted
[10:24:35] Explicit valence for atom # 27 C, 5, is greater than permitted
[10:24:42] Explicit valence for atom # 29 C, 5, is greater than permitted
[10:25:23] Explicit valence for atom # 15 C, 5, is greater than permitted


NUMBER:  4


[10:27:38] Explicit valence for atom # 25 C, 5, is greater than permitted
[10:27:38] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:27:38] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:28:17] Explicit valence for atom # 15 C, 5, is greater than permitted
[10:28:17] Explicit valence for atom # 16 C, 5, is greater than permitted
[10:28:17] Explicit valence for atom # 15 C, 5, is greater than permitted
[10:28:17] Explicit valence for atom # 15 C, 5, is greater than permitted
[10:28:17] Explicit valence for atom # 15 C, 5, is greater than permitted
[10:28:17] Explicit valence for atom # 14 C, 5, is greater than permitted
[10:28:17] Explicit valence for atom # 15 C, 5, is greater than permitted
[10:28:17] Explicit valence for atom # 15 C, 5, is greater than permitted
[10:28:21] Explicit valence for atom # 21 C, 5, is greater than permitted
[10:28:21] Explicit valence for atom # 21 C, 5, is greater than permitted
[10:28:21] Explicit valence for atom #

Unnamed: 0,name,type_cluster,scaffold,SSo,TUPOR_,TUPOR,SESY,ASER
0,Molpher_0,sim,csk,1624700.0,43/46,0.934783,0.138021,0.03143
1,Molpher_1,sim,csk,1662823.0,38/44,0.863636,0.130204,0.009258
2,Molpher_2,sim,csk,1546835.0,36/42,0.857143,0.12128,0.007056
3,Molpher_3,sim,csk,1749302.0,32/42,0.761905,0.123391,0.011643
4,Molpher_4,sim,csk,1722174.0,32/42,0.761905,0.123599,0.003529
5,Molpher_mean,sim,csk,1661166.8,-,0.835874,0.127299,0.012583
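The relationship between the `TUPOR_` and `TUPOR` columns, and the `Molpher_mean` row, can be checked directly from the table above. The sketch below assumes that `TUPOR_` is a `numerator/denominator` string and that the mean row averages the per-run ratios. Both assumptions agree with the printed values, but neither is confirmed against `src.metrics`.

```python
from fractions import Fraction
from statistics import mean

# TUPOR_ strings copied from the five Molpher runs in the table above.
tupor_fractions = ["43/46", "38/44", "36/42", "32/42", "32/42"]

# Convert each "numerator/denominator" string to a decimal ratio.
tupor_values = [float(Fraction(text)) for text in tupor_fractions]

# Average of the per-run ratios (not the ratio of the summed counts).
tupor_mean = mean(tupor_values)

print([round(value, 6) for value in tupor_values])
print(round(tupor_mean, 6))
```

The rounded values reproduce the `TUPOR` column (0.934783 for `Molpher_0`) and the 0.835874 reported for `Molpher_mean`. Pooling the counts instead (181/216) would give a different number, so the mean row is consistent with averaging the ratios.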



# Execute metric calculation function

In [None]:
for receptor in ['Leukocyte_elastase']:
    for type_scaffold in ['csk','murcko']:
        for type_cluster in ['dis','sim']:
            for subset in ['','_500k', '_250k', '_125k', '_62.5k']:
                ncpus = 10
                
                # Generator names for this subset (DrugEx epsilon and GB_GA mutation-rate variants included)
                generators_name_list = [
                    f"Molpher{subset}",
                    f"REINVENT{subset}",
                    f"DrugEx_GT_epsilon_0.1{subset}",
                    f"DrugEx_GT_epsilon_0.6{subset}",
                    f"DrugEx_RNN_epsilon_0.1{subset}",
                    f"DrugEx_RNN_epsilon_0.6{subset}",
                    f"GB_GA_mut_r_0.01{subset}",
                    f"GB_GA_mut_r_0.5{subset}",
                    f"addcarbon{subset}"
                ]
                for generator in generators_name_list:
                    calculate_metrics(type_cluster,type_scaffold,generator,receptor,ncpus = ncpus)

NUMBER:  0


[15:37:03] Explicit valence for atom # 24 C, 5, is greater than permitted
[15:37:03] Explicit valence for atom # 24 C, 5, is greater than permitted
[15:37:03] Explicit valence for atom # 25 C, 5, is greater than permitted
[15:38:08] Explicit valence for atom # 19 C, 5, is greater than permitted
[15:38:08] Explicit valence for atom # 20 C, 5, is greater than permitted
[15:38:08] Explicit valence for atom # 20 C, 5, is greater than permitted


## Combining and Normalizing Metrics

The following cells run two functions from `src.metrics_connection` that:

- merge the mean values of all metrics into a single `pandas.DataFrame` (using `connect_mean_value`)
- apply Min-Max normalization to scale the values (using `connect_mean_value_normalized`)
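Min-Max normalization rescales each metric column to the range [0, 1] via `(x - min) / (max - min)`. The sketch below is not the code of `connect_mean_value_normalized`. It is a minimal pandas version, assuming the scaling is applied per column and that missing values are skipped. The `USo` values are copied from the first (`dis`, `csk`) table below, with generator names shortened for readability.

```python
import pandas as pd

# USo means copied from the first (dis, csk) table below; addcarbon has no value there.
raw = pd.DataFrame(
    {
        "name": ["Molpher", "REINVENT", "DrugEx_GT_epsilon_0.6", "addcarbon"],
        "USo": [11998.6, 15357.6, 28496.8, float("nan")],
    }
)


def min_max(column: pd.Series) -> pd.Series:
    """Scale a numeric column to [0, 1]; NaN entries stay NaN."""
    low, high = column.min(), column.max()  # both skip NaN by default
    return (column - low) / (high - low)


normalized = raw.assign(USo=min_max(raw["USo"]))
print(normalized.round(6))
```

With these inputs Molpher maps to 0.0, DrugEx_GT_epsilon_0.6 to 1.0, and REINVENT to about 0.203598, which matches the normalized table. A column whose values are all identical would divide by zero here and would need separate handling.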


In [26]:
from src import metrics_connection # Importing functions that merge and normalize metric tables
imp.reload(metrics_connection)

<module 'src.metrics_connection' from '/home/filv/phd_projects/iga_2023/git_reccal/new/recall_metrics/src/metrics_connection.py'>

In [28]:
for receptor in ['Glucocorticoid_receptor', 'Leukocyte_elastase']:
    for type_scaffold in ['csk', 'murcko']:
        for type_cluster in ['dis', 'sim']:  # Different cluster types
            for subset in ['_62.5k']:
            
                # Generator names for this subset (DrugEx epsilon and GB_GA mutation-rate variants included)
                generators_name_list = [
                    f"Molpher{subset}",
                    f"REINVENT{subset}",
                    f"DrugEx_GT_epsilon_0.1{subset}",
                    f"DrugEx_GT_epsilon_0.6{subset}",
                    f"DrugEx_RNN_epsilon_0.1{subset}",
                    f"DrugEx_RNN_epsilon_0.6{subset}",
                    f"GB_GA_mut_r_0.01{subset}",
                    f"GB_GA_mut_r_0.5{subset}",
                    f"addcarbon{subset}"
                ]
    
                # Connect and process mean values
                df = metrics_connection.connect_mean_value(type_cluster, type_scaffold, generators_name_list, receptor, subset)
                df1 = metrics_connection.connect_mean_value_normalized(type_cluster, type_scaffold, generators_name_list, receptor, subset)
                
                display(df)
                display(df1)

Unnamed: 0,name,type_cluster,scaffold,USo,SSo,TUPOR_,TUPOR,SESY,ASER,CwASo
0,Molpher_62.5k_mean,dis,csk,11998.6,64267.4,-,0.275797,0.186985,0.005181,322.0
1,REINVENT_62.5k_mean,dis,csk,15357.6,61685.2,-,0.210779,0.249357,0.00444,268.2
2,DrugEx_GT_epsilon_0.1_62.5k_mean,dis,csk,19708.4,62449.4,-,0.345967,0.315596,0.004801,297.4
3,DrugEx_GT_epsilon_0.6_62.5k_mean,dis,csk,28496.8,62459.4,-,0.532874,0.456241,0.007952,490.6
4,DrugEx_RNN_epsilon_0.1_62.5k_mean,dis,csk,12740.4,62468.0,-,0.23048,0.203956,0.005124,315.2
5,DrugEx_RNN_epsilon_0.6_62.5k_mean,dis,csk,12328.4,61886.6,-,0.480158,0.199214,0.018424,1085.2
6,GB_GA_mut_r_0.01_62.5k_mean,dis,csk,23549.6,62496.0,-,0.326206,0.376818,0.003068,190.6
7,GB_GA_mut_r_0.5_62.5k_mean,dis,csk,17976.2,62498.0,-,0.315395,0.287629,0.004304,267.0
8,addcarbon_62.5k_mean,dis,csk,,62500.0,-,0.033403,0.013565,0.001968,


Unnamed: 0,name,type_cluster,scaffold,USo,SSo,TUPOR_,TUPOR,SESY,ASER,CwASo
0,Molpher_62.5k_mean,dis,csk,0.0,1.0,-,0.485302,0.391754,0.195269,0.146881
1,REINVENT_62.5k_mean,dis,csk,0.203598,0.0,-,0.355127,0.532651,0.150208,0.086743
2,DrugEx_GT_epsilon_0.1_62.5k_mean,dis,csk,0.467312,0.295949,-,0.62579,0.682286,0.172158,0.119383
3,DrugEx_GT_epsilon_0.6_62.5k_mean,dis,csk,1.0,0.299822,-,1.0,1.0,0.363616,0.335345
4,DrugEx_RNN_epsilon_0.1_62.5k_mean,dis,csk,0.044962,0.303152,-,0.394571,0.430092,0.191762,0.13928
5,DrugEx_RNN_epsilon_0.6_62.5k_mean,dis,csk,0.01999,0.077996,-,0.894458,0.419379,1.0,1.0
6,GB_GA_mut_r_0.01_62.5k_mean,dis,csk,0.700137,0.313996,-,0.586226,0.820584,0.066858,0.0
7,GB_GA_mut_r_0.5_62.5k_mean,dis,csk,0.362318,0.31477,-,0.564581,0.619107,0.141932,0.085401
8,addcarbon_62.5k_mean,dis,csk,,0.315545,-,0.0,0.0,0.0,


Unnamed: 0,name,type_cluster,scaffold,USo,SSo,TUPOR_,TUPOR,SESY,ASER,CwASo
0,Molpher_62.5k_mean,sim,csk,11481.0,62631.6,-,0.466779,0.183303,0.014613,900.0
1,REINVENT_62.5k_mean,sim,csk,17606.4,61914.6,-,0.303971,0.284446,0.008965,548.6
2,DrugEx_GT_epsilon_0.1_62.5k_mean,sim,csk,19859.4,62444.0,-,0.375883,0.318032,0.013917,842.6
3,DrugEx_GT_epsilon_0.6_62.5k_mean,sim,csk,28863.8,62467.2,-,0.720447,0.462054,0.017285,1058.0
4,DrugEx_RNN_epsilon_0.1_62.5k_mean,sim,csk,13718.4,62445.8,-,0.278013,0.219666,0.020624,1255.6
5,DrugEx_RNN_epsilon_0.6_62.5k_mean,sim,csk,12482.8,61978.6,-,0.654225,0.201373,0.026403,1583.2
6,GB_GA_mut_r_0.01_62.5k_mean,sim,csk,23878.4,62497.6,-,0.421264,0.382069,0.006345,393.8
7,GB_GA_mut_r_0.5_62.5k_mean,sim,csk,18052.4,62497.4,-,0.431339,0.28885,0.009972,616.0
8,addcarbon_62.5k_mean,sim,csk,855.2,62500.0,-,0.201548,0.013683,0.014452,884.4


Unnamed: 0,name,type_cluster,scaffold,USo,SSo,TUPOR_,TUPOR,SESY,ASER,CwASo
0,Molpher_62.5k_mean,sim,csk,0.379376,1.0,-,0.511141,0.378303,0.412193,0.425593
1,REINVENT_62.5k_mean,sim,csk,0.598073,0.0,-,0.197384,0.603883,0.130586,0.13015
2,DrugEx_GT_epsilon_0.1_62.5k_mean,sim,csk,0.678513,0.738354,-,0.33597,0.67879,0.377468,0.377333
3,DrugEx_GT_epsilon_0.6_62.5k_mean,sim,csk,1.0,0.770711,-,1.0,1.0,0.545401,0.558433
4,DrugEx_RNN_epsilon_0.1_62.5k_mean,sim,csk,0.459259,0.740865,-,0.147358,0.459403,0.711857,0.724567
5,DrugEx_RNN_epsilon_0.6_62.5k_mean,sim,csk,0.415144,0.089261,-,0.872379,0.418604,1.0,1.0
6,GB_GA_mut_r_0.01_62.5k_mean,sim,csk,0.822005,0.81311,-,0.423427,0.82161,0.0,0.0
7,GB_GA_mut_r_0.5_62.5k_mean,sim,csk,0.613997,0.812831,-,0.442843,0.613704,0.180795,0.186817
8,addcarbon_62.5k_mean,sim,csk,0.0,0.816457,-,0.0,0.0,0.404141,0.412477


Unnamed: 0,name,type_cluster,scaffold,USo,SSo,TUPOR_,TUPOR,SESY,ASER,CwASo
0,Molpher_62.5k_mean,dis,murcko,20039.0,64267.4,-,0.057369,0.31228,0.001007,63.0
1,REINVENT_62.5k_mean,dis,murcko,48606.4,61693.0,-,0.038883,0.787626,0.000608,36.8
2,DrugEx_GT_epsilon_0.1_62.5k_mean,dis,murcko,41494.2,62451.4,-,0.075826,0.664439,0.001079,67.2
3,DrugEx_GT_epsilon_0.6_62.5k_mean,dis,murcko,47137.4,62464.8,-,0.151764,0.754621,0.001773,110.4
4,DrugEx_RNN_epsilon_0.1_62.5k_mean,dis,murcko,37399.6,62488.6,-,0.025324,0.598504,0.000615,38.4
5,DrugEx_RNN_epsilon_0.6_62.5k_mean,dis,murcko,37169.6,61917.2,-,0.155704,0.600296,0.001597,98.6
6,GB_GA_mut_r_0.01_62.5k_mean,dis,murcko,40663.6,62496.0,-,0.038619,0.650659,0.000311,19.4
7,GB_GA_mut_r_0.5_62.5k_mean,dis,murcko,34164.4,62498.6,-,0.044642,0.546643,0.000336,21.0
8,addcarbon_62.5k_mean,dis,murcko,2873.6,62500.0,-,0.015227,0.045978,0.000805,50.2


Unnamed: 0,name,type_cluster,scaffold,USo,SSo,TUPOR_,TUPOR,SESY,ASER,CwASo
0,Molpher_62.5k_mean,dis,murcko,0.375341,1.0,-,0.299992,0.359068,0.476135,0.479121
1,REINVENT_62.5k_mean,dis,murcko,1.0,0.0,-,0.168396,1.0,0.203228,0.191209
2,DrugEx_GT_epsilon_0.1_62.5k_mean,dis,murcko,0.844484,0.294593,-,0.431379,0.8339,0.525711,0.525275
3,DrugEx_GT_epsilon_0.6_62.5k_mean,dis,murcko,0.967879,0.299798,-,0.971957,0.955498,1.0,1.0
4,DrugEx_RNN_epsilon_0.1_62.5k_mean,dis,murcko,0.75495,0.309043,-,0.071877,0.744998,0.208356,0.208791
5,DrugEx_RNN_epsilon_0.6_62.5k_mean,dis,murcko,0.749921,0.087088,-,1.0,0.747414,0.879718,0.87033
6,GB_GA_mut_r_0.01_62.5k_mean,dis,murcko,0.826322,0.311917,-,0.16652,0.815321,0.0,0.0
7,GB_GA_mut_r_0.5_62.5k_mean,dis,murcko,0.684209,0.312927,-,0.209394,0.675071,0.017574,0.017582
8,addcarbon_62.5k_mean,dis,murcko,0.0,0.313471,-,0.0,0.0,0.33828,0.338462


Unnamed: 0,name,type_cluster,scaffold,USo,SSo,TUPOR_,TUPOR,SESY,ASER,CwASo
0,Molpher_62.5k_mean,sim,murcko,19264.2,62631.6,-,0.174795,0.307573,0.005126,318.8
1,REINVENT_62.5k_mean,sim,murcko,51453.2,61917.6,-,0.048288,0.831004,0.00221,135.8
2,DrugEx_GT_epsilon_0.1_62.5k_mean,sim,murcko,43392.6,62445.2,-,0.085756,0.694892,0.008243,499.6
3,DrugEx_GT_epsilon_0.6_62.5k_mean,sim,murcko,47159.2,62472.8,-,0.299036,0.754865,0.004636,287.2
4,DrugEx_RNN_epsilon_0.1_62.5k_mean,sim,murcko,38699.2,62465.6,-,0.034065,0.619518,0.002389,148.2
5,DrugEx_RNN_epsilon_0.6_62.5k_mean,sim,murcko,37476.0,62008.2,-,0.279831,0.604306,0.003604,222.4
6,GB_GA_mut_r_0.01_62.5k_mean,sim,murcko,40918.6,62497.6,-,0.083339,0.654722,0.002072,129.2
7,GB_GA_mut_r_0.5_62.5k_mean,sim,murcko,34047.8,62497.6,-,0.084787,0.544786,0.003299,205.4
8,addcarbon_62.5k_mean,sim,murcko,2870.8,62500.0,-,0.046849,0.045933,0.003206,199.4


Unnamed: 0,name,type_cluster,scaffold,USo,SSo,TUPOR_,TUPOR,SESY,ASER,CwASo
0,Molpher_62.5k_mean,sim,murcko,0.337435,1.0,-,0.531115,0.333269,0.494871,0.511879
1,REINVENT_62.5k_mean,sim,murcko,1.0,0.0,-,0.053677,1.0,0.022267,0.017819
2,DrugEx_GT_epsilon_0.1_62.5k_mean,sim,murcko,0.834084,0.738936,-,0.19508,0.826625,1.0,1.0
3,DrugEx_GT_epsilon_0.6_62.5k_mean,sim,murcko,0.911614,0.777591,-,1.0,0.903016,0.415477,0.426566
4,DrugEx_RNN_epsilon_0.1_62.5k_mean,sim,murcko,0.737477,0.767507,-,0.0,0.730615,0.05126,0.051296
5,DrugEx_RNN_epsilon_0.6_62.5k_mean,sim,murcko,0.712299,0.126891,-,0.927519,0.711239,0.248165,0.25162
6,GB_GA_mut_r_0.01_62.5k_mean,sim,murcko,0.78316,0.812325,-,0.185958,0.775457,0.0,0.0
7,GB_GA_mut_r_0.5_62.5k_mean,sim,murcko,0.641734,0.812325,-,0.191424,0.635423,0.198801,0.205724
8,addcarbon_62.5k_mean,sim,murcko,0.0,0.815686,-,0.048244,0.0,0.183648,0.189525


Unnamed: 0,name,type_cluster,scaffold,SSo,TUPOR_,TUPOR,SESY,ASER
0,Molpher_62.5k_mean,dis,csk,62587.2,-,0.441405,0.200562,0.004505
1,REINVENT_62.5k_mean,dis,csk,61699.8,-,0.382392,0.263057,0.005054
2,DrugEx_GT_epsilon_0.1_62.5k_mean,dis,csk,62413.4,-,0.470729,0.349069,0.020162
3,DrugEx_GT_epsilon_0.6_62.5k_mean,dis,csk,62213.2,-,0.59051,0.410697,0.019505
4,DrugEx_RNN_epsilon_0.1_62.5k_mean,dis,csk,61608.4,-,0.432523,0.154487,0.049374
5,DrugEx_RNN_epsilon_0.6_62.5k_mean,dis,csk,61424.0,-,0.598948,0.198376,0.035476
6,GB_GA_mut_r_0.01_62.5k_mean,dis,csk,55988.2,-,0.222678,0.156785,0.014699
7,GB_GA_mut_r_0.5_62.5k_mean,dis,csk,50724.8,-,0.233564,0.133874,0.015586
8,addcarbon_62.5k_mean,dis,csk,62500.0,-,0.137921,0.014621,0.004058


Unnamed: 0,name,type_cluster,scaffold,SSo,TUPOR_,TUPOR,SESY,ASER
0,Molpher_62.5k_mean,dis,csk,1.0,-,0.65828,0.46946,0.009869
1,REINVENT_62.5k_mean,dis,csk,0.925192,-,0.530275,0.627244,0.021973
2,DrugEx_GT_epsilon_0.1_62.5k_mean,dis,csk,0.985349,-,0.721885,0.844404,0.355382
3,DrugEx_GT_epsilon_0.6_62.5k_mean,dis,csk,0.968472,-,0.981698,1.0,0.340866
4,DrugEx_RNN_epsilon_0.1_62.5k_mean,dis,csk,0.917487,-,0.639014,0.353131,1.0
5,DrugEx_RNN_epsilon_0.6_62.5k_mean,dis,csk,0.901942,-,1.0,0.463938,0.693307
6,GB_GA_mut_r_0.01_62.5k_mean,dis,csk,0.443704,-,0.183845,0.358931,0.234813
7,GB_GA_mut_r_0.5_62.5k_mean,dis,csk,0.0,-,0.207457,0.301087,0.254402
8,addcarbon_62.5k_mean,dis,csk,0.992649,-,0.0,0.0,0.0


Unnamed: 0,name,type_cluster,scaffold,SSo,TUPOR_,TUPOR,SESY,ASER
0,Molpher_62.5k_mean,sim,csk,62641.0,-,0.621673,0.203287,0.012317
1,REINVENT_62.5k_mean,sim,csk,61840.2,-,0.41039,0.244203,0.010805
2,DrugEx_GT_epsilon_0.1_62.5k_mean,sim,csk,62436.8,-,0.597647,0.345729,0.021248
3,DrugEx_GT_epsilon_0.6_62.5k_mean,sim,csk,62205.8,-,0.747723,0.420602,0.027119
4,DrugEx_RNN_epsilon_0.1_62.5k_mean,sim,csk,61029.8,-,0.450348,0.154924,0.065827
5,DrugEx_RNN_epsilon_0.6_62.5k_mean,sim,csk,61507.0,-,0.746678,0.217028,0.048005
6,GB_GA_mut_r_0.01_62.5k_mean,sim,csk,57020.8,-,0.270271,0.18673,0.017218
7,GB_GA_mut_r_0.5_62.5k_mean,sim,csk,51823.4,-,0.338566,0.151576,0.027658
8,addcarbon_62.5k_mean,sim,csk,62500.0,-,0.349746,0.014797,0.013027


Unnamed: 0,name,type_cluster,scaffold,SSo,TUPOR_,TUPOR,SESY,ASER
0,Molpher_62.5k_mean,sim,csk,1.0,-,0.735996,0.464484,0.027487
1,REINVENT_62.5k_mean,sim,csk,0.925972,-,0.293472,0.565312,0.0
2,DrugEx_GT_epsilon_0.1_62.5k_mean,sim,csk,0.981123,-,0.685674,0.815493,0.189792
3,DrugEx_GT_epsilon_0.6_62.5k_mean,sim,csk,0.959769,-,1.0,1.0,0.296499
4,DrugEx_RNN_epsilon_0.1_62.5k_mean,sim,csk,0.851058,-,0.377163,0.345306,1.0
5,DrugEx_RNN_epsilon_0.6_62.5k_mean,sim,csk,0.895171,-,0.997812,0.498345,0.676083
6,GB_GA_mut_r_0.01_62.5k_mean,sim,csk,0.480458,-,0.0,0.423683,0.116545
7,GB_GA_mut_r_0.5_62.5k_mean,sim,csk,0.0,-,0.14304,0.337055,0.306286
8,addcarbon_62.5k_mean,sim,csk,0.986966,-,0.166456,0.0,0.040378


Unnamed: 0,name,type_cluster,scaffold,SSo,TUPOR_,TUPOR,SESY,ASER
0,Molpher_62.5k_mean,dis,murcko,62588.2,-,0.096729,0.359111,0.00184
1,REINVENT_62.5k_mean,dis,murcko,61703.0,-,0.122112,0.786161,0.000612
2,DrugEx_GT_epsilon_0.1_62.5k_mean,dis,murcko,62423.8,-,0.122074,0.748953,0.007241
3,DrugEx_GT_epsilon_0.6_62.5k_mean,dis,murcko,62219.2,-,0.288677,0.731255,0.006152
4,DrugEx_RNN_epsilon_0.1_62.5k_mean,dis,murcko,61682.6,-,0.059928,0.577273,0.015026
5,DrugEx_RNN_epsilon_0.6_62.5k_mean,dis,murcko,61508.8,-,0.18187,0.635373,0.009665
6,GB_GA_mut_r_0.01_62.5k_mean,dis,murcko,55988.2,-,0.04518,0.373767,0.007213
7,GB_GA_mut_r_0.5_62.5k_mean,dis,murcko,50724.8,-,0.037591,0.364277,0.006057
8,addcarbon_62.5k_mean,dis,murcko,62500.0,-,0.029286,0.0756,0.000339


Unnamed: 0,name,type_cluster,scaffold,SSo,TUPOR_,TUPOR,SESY,ASER
0,Molpher_62.5k_mean,dis,murcko,1.0,-,0.260004,0.398996,0.102201
1,REINVENT_62.5k_mean,dis,murcko,0.925384,-,0.35786,1.0,0.018575
2,DrugEx_GT_epsilon_0.1_62.5k_mean,dis,murcko,0.986142,-,0.357715,0.947636,0.469949
3,DrugEx_GT_epsilon_0.6_62.5k_mean,dis,murcko,0.968896,-,1.0,0.922729,0.395793
4,DrugEx_RNN_epsilon_0.1_62.5k_mean,dis,murcko,0.923664,-,0.118133,0.706024,1.0
5,DrugEx_RNN_epsilon_0.6_62.5k_mean,dis,murcko,0.909014,-,0.58824,0.78779,0.634983
6,GB_GA_mut_r_0.01_62.5k_mean,dis,murcko,0.443667,-,0.061276,0.419621,0.468002
7,GB_GA_mut_r_0.5_62.5k_mean,dis,murcko,0.0,-,0.032019,0.406266,0.389277
8,addcarbon_62.5k_mean,dis,murcko,0.992565,-,0.0,0.0,0.0


Unnamed: 0,name,type_cluster,scaffold,SSo,TUPOR_,TUPOR,SESY,ASER
0,Molpher_62.5k_mean,sim,murcko,62641.2,-,0.196671,0.358739,0.004471
1,REINVENT_62.5k_mean,sim,murcko,61846.0,-,0.101651,0.814765,0.001225
2,DrugEx_GT_epsilon_0.1_62.5k_mean,sim,murcko,62440.2,-,0.23603,0.740815,0.005546
3,DrugEx_GT_epsilon_0.6_62.5k_mean,sim,murcko,62215.4,-,0.409863,0.755204,0.00847
4,DrugEx_RNN_epsilon_0.1_62.5k_mean,sim,murcko,61114.2,-,0.062564,0.582847,0.019449
5,DrugEx_RNN_epsilon_0.6_62.5k_mean,sim,murcko,61618.6,-,0.303905,0.66186,0.012136
6,GB_GA_mut_r_0.01_62.5k_mean,sim,murcko,57020.8,-,0.042138,0.40883,0.003466
7,GB_GA_mut_r_0.5_62.5k_mean,sim,murcko,51823.4,-,0.04328,0.388567,0.005038
8,addcarbon_62.5k_mean,sim,murcko,62500.0,-,0.114764,0.07552,0.00178


Unnamed: 0,name,type_cluster,scaffold,SSo,TUPOR_,TUPOR,SESY,ASER
0,Molpher_62.5k_mean,sim,murcko,1.0,-,0.420241,0.38312,0.178091
1,REINVENT_62.5k_mean,sim,murcko,0.926492,-,0.161841,1.0,0.0
2,DrugEx_GT_epsilon_0.1_62.5k_mean,sim,murcko,0.98142,-,0.527273,0.899966,0.2371
3,DrugEx_GT_epsilon_0.6_62.5k_mean,sim,murcko,0.960639,-,1.0,0.919431,0.397533
4,DrugEx_RNN_epsilon_0.1_62.5k_mean,sim,murcko,0.858844,-,0.055549,0.686277,1.0
5,DrugEx_RNN_epsilon_0.6_62.5k_mean,sim,murcko,0.905471,-,0.711855,0.79316,0.598719
6,GB_GA_mut_r_0.01_62.5k_mean,sim,murcko,0.480449,-,0.0,0.450879,0.122956
7,GB_GA_mut_r_0.5_62.5k_mean,sim,murcko,0.0,-,0.003107,0.423469,0.20923
8,addcarbon_62.5k_mean,sim,murcko,0.986947,-,0.1975,0.0,0.030459
