# Molecular Generator Evaluation using TUPOR, SESY and ASER Metrics

🔹 **Objective**  
This notebook evaluates molecular generators by computing three key metrics:  
   - **TUPOR**: scaffold recall  
   - **SESY**: scaffold hopping potential  
   - **ASER**: chemical space exploration

🔹 **Workflow**  
1️⃣ **Compute Metrics**: The script calculates TUPOR, SESY and ASER for different molecular generators.  
2️⃣ **Merge Data**: Results from multiple generators are combined into a single Pandas DataFrame.  
3️⃣ **Normalize Values**: The computed metrics are normalized using Min-Max scaling for comparison.  
4️⃣ **Save Outputs**: Processed data is stored in CSV files for further analysis.  

🔹 **Data Structure**  
- The calculations are performed for different **scaffold types** (`csk`, `murcko`) and **cluster types** (`dis`, `sim`).  
- Results are computed for multiple **generators** (`Molpher`, `REINVENT`, `DrugEx`, `GB_GA`, etc.).  
- The analysis is conducted for a specific **biological target receptor**, such as the **Glucocorticoid receptor**.

This notebook lets us compare molecular generators by how well they recover known scaffolds (TUPOR), how readily they hop between scaffolds (SESY), and how broadly they explore chemical space (ASER).

# Loading required libraries

In [30]:
from src import metrics # Importing custom metric functions
import importlib as imp
imp.reload(metrics)

<module 'src.metrics' from '/home/filv/phd_projects/iga_2023/git_reccal/new/recall_metrics/src/metrics.py'>

# Function to calculate metrics

In [27]:
def calculate_metrics(type_cluster, type_scaffold, generator, receptor, ncpus=1):
    """
    Calculate molecular generation metrics for one generator.

    Parameters:
    - type_cluster: Cluster type ('dis' = dissimilarity split, 'sim' = similarity split)
    - type_scaffold: Type of scaffold (e.g., 'csk' or 'murcko')
    - generator: Name of the molecular generator
    - receptor: Target receptor for drug design
    - ncpus: Number of CPUs passed to metrics.Metrics (default: 1)

    Returns:
    - Computed metrics, as returned by Metrics.calculate_metrics()
    """
    mt = metrics.Metrics(type_cluster, type_scaffold, generator, receptor, ncpus)
    result = mt.calculate_metrics()
    display(result)
    return result

# Define parameters for metric calculations

In [3]:
type_cluster = 'sim'  # options: 'dis' | 'sim'
type_scaffold = 'csk'  # options: 'csk' | 'murcko'
generator = 'Molpher'  # options: 'Molpher' | 'DrugEx' | 'REINVENT' | 'addcarbon'
receptor = 'Leukocyte_elastase'  # options: 'Glucocorticoid_receptor' | 'Leukocyte_elastase'

calculate_metrics(type_cluster, type_scaffold, generator, receptor, ncpus=10)

NUMBER:  0


[10:03:34] Explicit valence for atom # 12 C, 5, is greater than permitted
[10:03:49] Explicit valence for atom # 28 C, 5, is greater than permitted
[10:03:49] Explicit valence for atom # 28 C, 5, is greater than permitted
[10:03:49] Explicit valence for atom # 27 C, 5, is greater than permitted
[10:04:05] Explicit valence for atom # 3 C, 5, is greater than permitted
[10:04:05] Explicit valence for atom # 3 C, 5, is greater than permitted
[10:04:05] Explicit valence for atom # 7 C, 5, is greater than permitted
[10:04:05] Explicit valence for atom # 3 C, 5, is greater than permitted
[10:04:05] Explicit valence for atom # 3 C, 5, is greater than permitted
[10:04:05] Explicit valence for atom # 3 C, 5, is greater than permitted
[10:04:05] Explicit valence for atom # 3 C, 5, is greater than permitted
[10:04:05] Explicit valence for atom # 3 C, 5, is greater than permitted
[10:04:05] Explicit valence for atom # 7 C, 5, is greater than permitted
[10:04:05] Explicit valence for atom # 7 C, 5, 

NUMBER:  1


[10:08:55] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:08:55] Explicit valence for atom # 23 C, 5, is greater than permitted
[10:08:55] Explicit valence for atom # 23 C, 5, is greater than permitted
[10:08:55] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:08:55] Explicit valence for atom # 23 C, 5, is greater than permitted
[10:09:48] Explicit valence for atom # 11 C, 5, is greater than permitted
[10:09:48] Explicit valence for atom # 11 C, 5, is greater than permitted
[10:09:48] Explicit valence for atom # 11 C, 5, is greater than permitted
[10:10:01] Explicit valence for atom # 2 C, 5, is greater than permitted
[10:10:01] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:10:01] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:10:07] Explicit valence for atom # 7 C, 5, is greater than permitted
[10:10:48] Explicit valence for atom # 15 C, 5, is greater than permitted
[10:12:09] Explicit valence for atom # 1

NUMBER:  2


[10:15:35] Explicit valence for atom # 10 C, 5, is greater than permitted
[10:16:46] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:16:46] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:17:34] Explicit valence for atom # 6 C, 5, is greater than permitted
[10:17:34] Explicit valence for atom # 6 C, 5, is greater than permitted
[10:17:34] Explicit valence for atom # 25 C, 5, is greater than permitted
[10:17:34] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:17:34] Explicit valence for atom # 6 C, 5, is greater than permitted
[10:17:34] Explicit valence for atom # 6 C, 5, is greater than permitted
[10:17:53] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:17:58] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:17:58] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:17:58] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:18:11] Explicit valence for atom # 7 C, 5

NUMBER:  3


[10:21:36] Explicit valence for atom # 10 C, 5, is greater than permitted
[10:21:54] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:21:56] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:21:56] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:23:58] Explicit valence for atom # 26 C, 5, is greater than permitted
[10:23:58] Explicit valence for atom # 25 C, 5, is greater than permitted
[10:23:58] Explicit valence for atom # 25 C, 5, is greater than permitted
[10:24:35] Explicit valence for atom # 31 C, 5, is greater than permitted
[10:24:35] Explicit valence for atom # 26 C, 5, is greater than permitted
[10:24:35] Explicit valence for atom # 27 C, 5, is greater than permitted
[10:24:42] Explicit valence for atom # 29 C, 5, is greater than permitted
[10:25:23] Explicit valence for atom # 15 C, 5, is greater than permitted


NUMBER:  4


[10:27:38] Explicit valence for atom # 25 C, 5, is greater than permitted
[10:27:38] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:27:38] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:28:17] Explicit valence for atom # 15 C, 5, is greater than permitted
[10:28:17] Explicit valence for atom # 16 C, 5, is greater than permitted
[10:28:17] Explicit valence for atom # 15 C, 5, is greater than permitted
[10:28:17] Explicit valence for atom # 15 C, 5, is greater than permitted
[10:28:17] Explicit valence for atom # 15 C, 5, is greater than permitted
[10:28:17] Explicit valence for atom # 14 C, 5, is greater than permitted
[10:28:17] Explicit valence for atom # 15 C, 5, is greater than permitted
[10:28:17] Explicit valence for atom # 15 C, 5, is greater than permitted
[10:28:21] Explicit valence for atom # 21 C, 5, is greater than permitted
[10:28:21] Explicit valence for atom # 21 C, 5, is greater than permitted
[10:28:21] Explicit valence for atom #

Unnamed: 0,name,type_cluster,scaffold,SSo,TUPOR_,TUPOR,SESY,ASER
0,Molpher_0,sim,csk,1624700.0,43/46,0.934783,0.138021,0.03143
1,Molpher_1,sim,csk,1662823.0,38/44,0.863636,0.130204,0.009258
2,Molpher_2,sim,csk,1546835.0,36/42,0.857143,0.12128,0.007056
3,Molpher_3,sim,csk,1749302.0,32/42,0.761905,0.123391,0.011643
4,Molpher_4,sim,csk,1722174.0,32/42,0.761905,0.123599,0.003529
5,Molpher_mean,sim,csk,1661166.8,-,0.835874,0.127299,0.012583
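
Reading the table above, the `TUPOR` column appears to be the `TUPOR_` fraction expressed as a decimal, and the `_mean` row appears to be the plain average over the five runs. The sketch below checks that reading against the printed values. The helper name `tupor_from_fraction` is hypothetical and is not part of `src.metrics`.

```python
def tupor_from_fraction(text):
    """Convert a 'found/total' string such as '43/46' to a float ratio."""
    found, total = text.split("/")
    return int(found) / int(total)

# TUPOR_ values copied from the Molpher / sim / csk table above
fractions_per_run = ["43/46", "38/44", "36/42", "32/42", "32/42"]
tupor_per_run = [tupor_from_fraction(f) for f in fractions_per_run]

# Plain average over the runs (assumed to be how the _mean row is built)
tupor_mean = sum(tupor_per_run) / len(tupor_per_run)

print([round(v, 6) for v in tupor_per_run])
print(round(tupor_mean, 6))
```

Both the per-run ratios and the average reproduce the `TUPOR` column to six decimal places, which supports this interpretation without confirming how `Metrics.calculate_metrics()` computes them internally.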




# Execute metric calculation function

In [28]:
for receptor in ['Leukocyte_elastase']:
    for type_scaffold in ['csk','murcko']:
        for type_cluster in ['dis','sim']:
            #for subset in ['','_500k', '_250k', '_125k', '_62.5k']:
            for subset in ['']:
                ncpus = 10
                
                # Generators to evaluate (uncomment entries to include them)
                generators_name_list = [
                    #f"Molpher{subset}",
                    #f"REINVENT{subset}",
                    #f"DrugEx_GT_epsilon_0.1{subset}",
                    #f"DrugEx_GT_epsilon_0.6{subset}",
                    #f"DrugEx_RNN_epsilon_0.1{subset}",
                    #f"DrugEx_RNN_epsilon_0.6{subset}",
                    f"GB_GA_new_mut_r_0.01{subset}",
                    #f"GB_GA_mut_r_0.5{subset}",
                    #f"addcarbon{subset}"
                ]
                for generator in generators_name_list:
                    print(generator)
                    calculate_metrics(type_cluster,type_scaffold,generator,receptor,ncpus = ncpus)

GB_GA_new_mut_r_0.01
NUMBER:  0
NUMBER:  1
NUMBER:  2
NUMBER:  3
NUMBER:  4


Unnamed: 0,name,type_cluster,scaffold,SSo,TUPOR_,TUPOR,SESY,ASER
0,GB_GA_new_mut_r_0.01_0,dis,csk,965034.0,35/42,0.833333,0.051346,0.046633
1,GB_GA_new_mut_r_0.01_1,dis,csk,954314.0,27/41,0.658537,0.048422,0.01095
2,GB_GA_new_mut_r_0.01_2,dis,csk,950738.0,39/45,0.866667,0.04043,0.004819
3,GB_GA_new_mut_r_0.01_3,dis,csk,968998.0,37/47,0.787234,0.04947,0.148856
4,GB_GA_new_mut_r_0.01_4,dis,csk,950214.0,29/41,0.707317,0.04126,0.01648
5,GB_GA_new_mut_r_0.01_mean,dis,csk,957859.6,-,0.770617,0.046186,0.045548


GB_GA_new_mut_r_0.01
NUMBER:  0
NUMBER:  1
NUMBER:  2
NUMBER:  3
NUMBER:  4


Unnamed: 0,name,type_cluster,scaffold,SSo,TUPOR_,TUPOR,SESY,ASER
0,GB_GA_new_mut_r_0.01_0,sim,csk,970116.0,44/46,0.956522,0.082954,0.131986
1,GB_GA_new_mut_r_0.01_1,sim,csk,958878.0,38/44,0.863636,0.042822,0.019254
2,GB_GA_new_mut_r_0.01_2,sim,csk,954807.0,36/42,0.857143,0.042849,0.038149
3,GB_GA_new_mut_r_0.01_3,sim,csk,955239.0,32/42,0.761905,0.043637,0.060815
4,GB_GA_new_mut_r_0.01_4,sim,csk,956949.0,31/42,0.738095,0.045547,0.006907
5,GB_GA_new_mut_r_0.01_mean,sim,csk,959197.8,-,0.83546,0.051562,0.051422


GB_GA_new_mut_r_0.01
NUMBER:  0
NUMBER:  1
NUMBER:  2
NUMBER:  3
NUMBER:  4


Unnamed: 0,name,type_cluster,scaffold,SSo,TUPOR_,TUPOR,SESY,ASER
0,GB_GA_new_mut_r_0.01_0,dis,murcko,965034.0,45/106,0.424528,0.293862,0.024132
1,GB_GA_new_mut_r_0.01_1,dis,murcko,954314.0,15/96,0.15625,0.240445,0.001467
2,GB_GA_new_mut_r_0.01_2,dis,murcko,950738.0,38/102,0.372549,0.211128,0.001844
3,GB_GA_new_mut_r_0.01_3,dis,murcko,968998.0,18/62,0.290323,0.25297,0.05874
4,GB_GA_new_mut_r_0.01_4,dis,murcko,950214.0,29/71,0.408451,0.22621,0.004345
5,GB_GA_new_mut_r_0.01_mean,dis,murcko,957859.6,-,0.33042,0.244923,0.018106


GB_GA_new_mut_r_0.01
NUMBER:  0
NUMBER:  1
NUMBER:  2
NUMBER:  3
NUMBER:  4


Unnamed: 0,name,type_cluster,scaffold,SSo,TUPOR_,TUPOR,SESY,ASER
0,GB_GA_new_mut_r_0.01_0,sim,murcko,970116.0,91/172,0.52907,0.363387,0.06146
1,GB_GA_new_mut_r_0.01_1,sim,murcko,958878.0,55/93,0.591398,0.230296,0.007119
2,GB_GA_new_mut_r_0.01_2,sim,murcko,954807.0,34/65,0.523077,0.231124,0.01163
3,GB_GA_new_mut_r_0.01_3,sim,murcko,955239.0,25/57,0.438597,0.23468,0.015453
4,GB_GA_new_mut_r_0.01_4,sim,murcko,956949.0,15/50,0.3,0.238915,0.000497
5,GB_GA_new_mut_r_0.01_mean,sim,murcko,959197.8,-,0.476428,0.259681,0.019232
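
The four result tables above correspond to the four scaffold/cluster combinations visited by the nested loops. As a sketch, the same grid can be enumerated with `itertools.product`. The `run_grid` helper and its `evaluate` callback are hypothetical stand-ins, with `evaluate` playing the role of `calculate_metrics`.

```python
from itertools import product

def run_grid(receptors, scaffolds, clusters, generators, evaluate):
    """Call evaluate once per combination, in the same order as the nested loops."""
    results = []
    for receptor, type_scaffold, type_cluster, generator in product(
        receptors, scaffolds, clusters, generators
    ):
        results.append(evaluate(type_cluster, type_scaffold, generator, receptor))
    return results

# Stand-in for calculate_metrics that only records its arguments
def record_call(type_cluster, type_scaffold, generator, receptor):
    return (type_cluster, type_scaffold, generator, receptor)

calls = run_grid(
    receptors=["Leukocyte_elastase"],
    scaffolds=["csk", "murcko"],
    clusters=["dis", "sim"],
    generators=["GB_GA_new_mut_r_0.01"],
    evaluate=record_call,
)
for call in calls:
    print(call)
```

Because `product` varies its last argument fastest, the combinations come out as `dis`/`csk`, `sim`/`csk`, `dis`/`murcko`, `sim`/`murcko`, matching the order of the tables above.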


In [None]:
for receptor in ['Leukocyte_elastase']:
    for type_scaffold in ['csk','murcko']:
        for type_cluster in ['dis','sim']:
            #for subset in ['','_500k', '_250k', '_125k', '_62.5k']:
            for subset in ['']:
                ncpus = 10
                
                # Generators to evaluate (uncomment entries to include them)
                generators_name_list = [
                    #f"Molpher{subset}",
                    #f"REINVENT{subset}",
                    #f"DrugEx_GT_epsilon_0.1{subset}",
                    #f"DrugEx_GT_epsilon_0.6{subset}",
                    #f"DrugEx_RNN_epsilon_0.1{subset}",
                    #f"DrugEx_RNN_epsilon_0.6{subset}",
                    f"GB_GA_new_mut_r_0.5{subset}",
                    #f"GB_GA_mut_r_0.5{subset}",
                    #f"addcarbon{subset}"
                ]
                for generator in generators_name_list:
                    print(generator)
                    calculate_metrics(type_cluster,type_scaffold,generator,receptor,ncpus = ncpus)

GB_GA_new_mut_r_0.5
NUMBER:  0


[10:02:06] Explicit valence for atom # 16 C, 5, is greater than permitted
[10:02:12] Explicit valence for atom # 10 C, 5, is greater than permitted
[10:02:14] Explicit valence for atom # 18 C, 5, is greater than permitted
[10:02:16] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:02:16] Explicit valence for atom # 13 C, 5, is greater than permitted
[10:02:16] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:02:17] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:02:18] Explicit valence for atom # 11 C, 5, is greater than permitted
[10:02:22] Explicit valence for atom # 13 C, 5, is greater than permitted
[10:02:24] Explicit valence for atom # 2 C, 5, is greater than permitted
[10:02:25] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:02:26] Explicit valence for atom # 5 C, 5, is greater than permitted
[10:02:27] Explicit valence for atom # 5 C, 5, is greater than permitted
[10:02:28] Explicit valence for atom # 1 C, 5

NUMBER:  1


[10:03:58] Explicit valence for atom # 25 C, 5, is greater than permitted
[10:04:00] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:04:11] Explicit valence for atom # 7 C, 5, is greater than permitted
[10:04:20] Explicit valence for atom # 20 C, 5, is greater than permitted
[10:04:24] Explicit valence for atom # 4 C, 5, is greater than permitted
[10:04:25] Explicit valence for atom # 13 C, 5, is greater than permitted
[10:04:27] Explicit valence for atom # 7 C, 5, is greater than permitted
[10:04:28] Explicit valence for atom # 7 C, 5, is greater than permitted
[10:04:30] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:04:39] Explicit valence for atom # 8 C, 5, is greater than permitted
[10:04:40] Explicit valence for atom # 20 C, 5, is greater than permitted
[10:04:42] Explicit valence for atom # 2 C, 5, is greater than permitted
[10:05:04] Explicit valence for atom # 2 C, 5, is greater than permitted
[10:05:05] Explicit valence for atom # 11 C, 5,

NUMBER:  2


[10:06:47] Explicit valence for atom # 3 C, 5, is greater than permitted
[10:06:51] Explicit valence for atom # 5 C, 5, is greater than permitted
[10:06:56] Explicit valence for atom # 6 C, 5, is greater than permitted
[10:07:01] Explicit valence for atom # 18 C, 5, is greater than permitted
[10:07:12] Explicit valence for atom # 11 C, 5, is greater than permitted
[10:07:13] Explicit valence for atom # 8 C, 5, is greater than permitted
[10:07:13] Explicit valence for atom # 5 C, 5, is greater than permitted
[10:07:20] Explicit valence for atom # 8 C, 5, is greater than permitted
[10:07:20] Explicit valence for atom # 4 C, 5, is greater than permitted
[10:07:20] Explicit valence for atom # 2 C, 5, is greater than permitted
[10:07:21] Explicit valence for atom # 11 C, 5, is greater than permitted
[10:07:23] Explicit valence for atom # 2 C, 5, is greater than permitted
[10:07:26] Explicit valence for atom # 12 C, 5, is greater than permitted
[10:07:27] Explicit valence for atom # 2 C, 5, 

NUMBER:  3


[10:09:44] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:09:51] Explicit valence for atom # 8 C, 5, is greater than permitted
[10:09:52] Explicit valence for atom # 26 C, 5, is greater than permitted
[10:09:53] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:09:58] Explicit valence for atom # 14 C, 5, is greater than permitted
[10:10:00] Explicit valence for atom # 32 C, 5, is greater than permitted
[10:10:05] Explicit valence for atom # 18 C, 5, is greater than permitted
[10:10:10] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:10:14] Explicit valence for atom # 6 C, 5, is greater than permitted
[10:10:15] Explicit valence for atom # 3 C, 5, is greater than permitted
[10:10:23] Explicit valence for atom # 3 C, 5, is greater than permitted
[10:10:27] Explicit valence for atom # 3 C, 5, is greater than permitted
[10:10:30] Explicit valence for atom # 6 C, 5, is greater than permitted
[10:10:34] Explicit valence for atom # 1 C, 5, 

NUMBER:  4


[10:12:49] Explicit valence for atom # 11 C, 5, is greater than permitted
[10:12:50] Explicit valence for atom # 2 C, 5, is greater than permitted
[10:12:50] Explicit valence for atom # 14 C, 5, is greater than permitted
[10:13:01] Explicit valence for atom # 28 C, 5, is greater than permitted
[10:13:02] Explicit valence for atom # 23 C, 5, is greater than permitted
[10:13:03] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:13:05] Explicit valence for atom # 14 C, 5, is greater than permitted
[10:13:06] Explicit valence for atom # 2 C, 5, is greater than permitted
[10:13:07] Explicit valence for atom # 7 C, 5, is greater than permitted
[10:13:08] Explicit valence for atom # 3 C, 5, is greater than permitted
[10:13:13] Explicit valence for atom # 4 C, 5, is greater than permitted
[10:13:19] Explicit valence for atom # 3 C, 5, is greater than permitted
[10:13:26] Explicit valence for atom # 8 C, 5, is greater than permitted
[10:13:33] Explicit valence for atom # 13 C, 

Unnamed: 0,name,type_cluster,scaffold,SSo,TUPOR_,TUPOR,SESY,ASER
0,GB_GA_new_mut_r_0.5_0,dis,csk,928554.0,34/42,0.809524,0.035634,0.074778
1,GB_GA_new_mut_r_0.5_1,dis,csk,911965.0,26/41,0.634146,0.036109,0.011834
2,GB_GA_new_mut_r_0.5_2,dis,csk,913025.0,39/45,0.866667,0.033119,0.004017
3,GB_GA_new_mut_r_0.5_3,dis,csk,930248.0,33/47,0.702128,0.037688,0.210645
4,GB_GA_new_mut_r_0.5_4,dis,csk,905490.0,27/41,0.658537,0.031765,0.013035
5,GB_GA_new_mut_r_0.5_mean,dis,csk,917856.4,-,0.7342,0.034863,0.062862


GB_GA_new_mut_r_0.5
NUMBER:  0


[10:16:44] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:16:45] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:16:48] Explicit valence for atom # 20 C, 5, is greater than permitted
[10:16:57] Explicit valence for atom # 22 C, 5, is greater than permitted
[10:17:11] Explicit valence for atom # 12 C, 5, is greater than permitted
[10:17:16] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:17:18] Explicit valence for atom # 31 C, 5, is greater than permitted
[10:17:23] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:17:24] Explicit valence for atom # 5 C, 5, is greater than permitted
[10:17:27] Explicit valence for atom # 11 C, 5, is greater than permitted
[10:17:32] Explicit valence for atom # 2 C, 5, is greater than permitted
[10:17:38] Explicit valence for atom # 19 C, 5, is greater than permitted
[10:17:38] Explicit valence for atom # 7 C, 5, is greater than permitted
[10:17:43] Explicit valence for atom # 8 C, 5

NUMBER:  1


[10:19:47] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:19:48] Explicit valence for atom # 14 C, 5, is greater than permitted
[10:19:48] Explicit valence for atom # 6 C, 5, is greater than permitted
[10:19:49] Explicit valence for atom # 14 C, 5, is greater than permitted
[10:19:50] Explicit valence for atom # 8 C, 5, is greater than permitted
[10:19:52] Explicit valence for atom # 5 C, 5, is greater than permitted
[10:19:53] Explicit valence for atom # 32 C, 5, is greater than permitted
[10:19:54] Explicit valence for atom # 18 C, 5, is greater than permitted
[10:19:54] Explicit valence for atom # 12 C, 5, is greater than permitted
[10:19:55] Explicit valence for atom # 7 C, 5, is greater than permitted
[10:20:01] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:20:04] Explicit valence for atom # 15 C, 5, is greater than permitted
[10:20:08] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:20:09] Explicit valence for atom # 14 C,

NUMBER:  2


[10:22:45] Explicit valence for atom # 4 C, 5, is greater than permitted
[10:22:49] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:22:49] Explicit valence for atom # 10 C, 5, is greater than permitted
[10:22:49] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:22:51] Explicit valence for atom # 22 C, 5, is greater than permitted
[10:22:52] Explicit valence for atom # 9 C, 5, is greater than permitted
[10:22:53] Explicit valence for atom # 12 C, 5, is greater than permitted
[10:22:55] Explicit valence for atom # 2 C, 5, is greater than permitted
[10:22:56] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:22:56] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:22:57] Explicit valence for atom # 4 C, 5, is greater than permitted
[10:22:58] Explicit valence for atom # 23 C, 5, is greater than permitted
[10:22:58] Explicit valence for atom # 5 C, 5, is greater than permitted
[10:23:02] Explicit valence for atom # 14 C, 5,

NUMBER:  3


[10:25:52] Explicit valence for atom # 12 C, 5, is greater than permitted
[10:25:56] Explicit valence for atom # 23 C, 5, is greater than permitted
[10:25:58] Explicit valence for atom # 5 C, 5, is greater than permitted
[10:25:59] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:25:59] Explicit valence for atom # 4 C, 5, is greater than permitted
[10:25:59] Explicit valence for atom # 2 C, 5, is greater than permitted
[10:26:02] Explicit valence for atom # 25 C, 5, is greater than permitted
[10:26:09] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:26:09] Explicit valence for atom # 19 C, 5, is greater than permitted
[10:26:11] Explicit valence for atom # 24 C, 5, is greater than permitted
[10:26:12] Explicit valence for atom # 8 C, 5, is greater than permitted
[10:26:12] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:26:13] Explicit valence for atom # 3 C, 5, is greater than permitted
[10:26:13] Explicit valence for atom # 2 C, 5,

## Combining and Normalizing Metrics

The following cells import `metrics_connection` and run functions that:

- merge the mean values of all metrics into a single `pandas.DataFrame` (using `connect_mean_value`)
- apply Min-Max normalization to scale the values (using `connect_mean_value_normalized`)
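
Min-Max scaling maps each metric column to the range [0, 1] via `(x - min) / (max - min)`. The sketch below is written in plain Python and is not the code inside `connect_mean_value_normalized`. The function name `min_max_scale` and the choice to return 0.0 for a constant column are assumptions. The input values are the `dis`/`csk` means that appear in this notebook's output.

```python
def min_max_scale(values):
    """Scale numbers to [0, 1]; a constant column maps to 0.0 (assumed convention)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Mean metrics for GB_GA_mut_r_0.5 and GB_GA_new_mut_r_0.5 (dis, csk)
means = {
    "TUPOR": [0.435405, 0.7342],
    "SESY": [0.05339, 0.034863],
    "ASER": [0.014556, 0.062862],
}

# Normalize each metric column independently
normalized = {metric: min_max_scale(column) for metric, column in means.items()}
print(normalized)
```

With only two generators in the comparison, every column collapses to exactly 0 and 1, so the normalized tables carry ranking information only. Intermediate values appear once three or more generators are included.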


In [17]:
from src import metrics_connection # Importing functions for merging and normalizing metrics
imp.reload(metrics_connection)

<module 'src.metrics_connection' from '/home/filv/phd_projects/iga_2023/git_reccal/new/recall_metrics/src/metrics_connection.py'>

In [37]:
for receptor in ['Leukocyte_elastase']:
    for type_scaffold in ['csk', 'murcko']:
        for type_cluster in ['dis', 'sim']:  # Different cluster types
            for subset in ['']:
            
                # Generators whose mean values are merged and compared
                generators_name_list = [
                    #f"Molpher{subset}",
                    #f"REINVENT{subset}",
                    #f"DrugEx_GT_epsilon_0.1{subset}",
                    #f"DrugEx_GT_epsilon_0.6{subset}",
                    #f"DrugEx_RNN_epsilon_0.1{subset}",
                    #f"DrugEx_RNN_epsilon_0.6{subset}",
                    f"GB_GA_mut_r_0.5{subset}",
                    f"GB_GA_new_mut_r_0.5{subset}",
                    #f"addcarbon{subset}"
                ]
    
                # Connect and process mean values
                df = metrics_connection.connect_mean_value(type_cluster, type_scaffold, generators_name_list, receptor, subset)
                df1 = metrics_connection.connect_mean_value_normalized(type_cluster, type_scaffold, generators_name_list, receptor, subset)
                print(receptor)
                display(df[['name','type_cluster','scaffold','TUPOR','SESY','ASER']])
                display(df1[['name','type_cluster','scaffold','TUPOR','SESY','ASER']])

Leukocyte_elastase


Unnamed: 0,name,type_cluster,scaffold,TUPOR,SESY,ASER
0,GB_GA_mut_r_0.5_mean,dis,csk,0.435405,0.05339,0.014556
1,GB_GA_new_mut_r_0.5_mean,dis,csk,0.7342,0.034863,0.062862


Unnamed: 0,name,type_cluster,scaffold,TUPOR,SESY,ASER
0,GB_GA_mut_r_0.5_mean,dis,csk,0.0,1.0,0.0
1,GB_GA_new_mut_r_0.5_mean,dis,csk,1.0,0.0,1.0


Leukocyte_elastase


Unnamed: 0,name,type_cluster,scaffold,TUPOR,SESY,ASER
0,GB_GA_mut_r_0.5_mean,sim,csk,0.514474,0.062103,0.027059
1,GB_GA_new_mut_r_0.5_mean,sim,csk,0.831112,0.034867,0.064306


Unnamed: 0,name,type_cluster,scaffold,TUPOR,SESY,ASER
0,GB_GA_mut_r_0.5_mean,sim,csk,0.0,1.0,0.0
1,GB_GA_new_mut_r_0.5_mean,sim,csk,1.0,0.0,1.0


Leukocyte_elastase


Unnamed: 0,name,type_cluster,scaffold,TUPOR,SESY,ASER
0,GB_GA_mut_r_0.5_mean,dis,murcko,0.057478,0.237449,0.005819
1,GB_GA_new_mut_r_0.5_mean,dis,murcko,0.328627,0.236072,0.022809


Unnamed: 0,name,type_cluster,scaffold,TUPOR,SESY,ASER
0,GB_GA_mut_r_0.5_mean,dis,murcko,0.0,1.0,0.0
1,GB_GA_new_mut_r_0.5_mean,dis,murcko,1.0,0.0,1.0


Leukocyte_elastase


Unnamed: 0,name,type_cluster,scaffold,TUPOR,SESY,ASER
0,GB_GA_mut_r_0.5_mean,sim,murcko,0.071259,0.257604,0.005037
1,GB_GA_new_mut_r_0.5_mean,sim,murcko,0.559051,0.228702,0.021894


Unnamed: 0,name,type_cluster,scaffold,TUPOR,SESY,ASER
0,GB_GA_mut_r_0.5_mean,sim,murcko,0.0,1.0,0.0
1,GB_GA_new_mut_r_0.5_mean,sim,murcko,1.0,0.0,1.0
