Step 1: Import Python Libraries into the Jupyter Notebook.

Note: Please install the Python packages imported below (pandas, Keras with a TensorFlow backend, and NumPy) before executing this notebook.

In [1]:
import pandas as pd
from keras.models import load_model
import numpy as np
import time

Using TensorFlow backend.


Step 2: Load the precalculated featurizations for each of the 718 drugs in the DILIrank database (SSP, c-ADMET and c-MolDes).

In [2]:
# Forward slashes avoid invalid "\G" escape sequences and also work outside Windows.
SSP=pd.read_csv("3_2_1_1_General_DILI_MLP_Features/General_DILI_MLP_SSP.csv")
c_ADMET=pd.read_csv("3_2_1_1_General_DILI_MLP_Features/General_DILI_MLP_Standardized_c-ADMET.csv")
c_MolDes=pd.read_csv("3_2_1_1_General_DILI_MLP_Features/General_DILI_MLP_Standardized_c-MolDes.csv")
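
Before querying, it can help to confirm that the three tables describe the same set of drugs. The following is a minimal sketch, assuming each table carries a `DILIrank_ID` column as used in Step 3. The helper name `check_shared_ids` and the toy tables are hypothetical and only stand in for the real CSV files.

```python
import pandas as pd

def check_shared_ids(tables, id_column="DILIrank_ID"):
    """Return the sorted IDs found in every table; raise if any table has extras."""
    id_sets = {name: set(df[id_column]) for name, df in tables.items()}
    shared = set.intersection(*id_sets.values())
    for name, ids in id_sets.items():
        extra = ids - shared
        if extra:
            raise ValueError(f"{name} has IDs missing elsewhere: {sorted(extra)}")
    return sorted(shared)

# Toy stand-ins for the three feature tables (hypothetical values).
toy_SSP = pd.DataFrame({"DILIrank_ID": ["drug_a", "drug_b"], "ssp_1": [0.1, 0.9]})
toy_c_ADMET = pd.DataFrame({"DILIrank_ID": ["drug_b", "drug_a"], "admet_1": [1.2, -0.3]})
toy_c_MolDes = pd.DataFrame({"DILIrank_ID": ["drug_a", "drug_b"], "moldes_1": [0.0, 0.5]})

shared_ids = check_shared_ids(
    {"SSP": toy_SSP, "c_ADMET": toy_c_ADMET, "c_MolDes": toy_c_MolDes}
)
print(shared_ids)
```

With the real data, the same call could be made on `SSP`, `c_ADMET` and `c_MolDes` directly; a `ValueError` would point to a drug that is missing from at least one table.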

Step 3: Define a function, "get_drug_query", that takes the generic name of a drug as stated in the DILIrank database (718 drugs) and creates the DILI-c MLP input by looking up that drug's featurization values in the tables loaded in Step 2.

Define a function, "execute_DILI_c_MLP", that takes the drug query and the k-th DILI-c MLP model (k=1 to 10) and returns that model's output vector for the expected DILI concern level (output index 0 means low DILI concern and output index 1 means high DILI concern).

Define another function, "get_ensemble_result", that takes the output vectors from the 10 runs of execute_DILI_c_MLP (one per version of DILI-c MLP, stratified K=10), averages them, and decides the final predicted DILI concern outcome from the mean output vector. As before, output index 0 means low DILI concern and output index 1 means high DILI concern.

In [3]:
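def get_drug_query(Drug):
    # Find the row whose DILIrank_ID matches the drug name in each table and keep
    # every value after the first column. An unknown name raises an IndexError.
    SSP_query=list(SSP.loc[SSP['DILIrank_ID'] == Drug].values[0])[1:]
    c_ADMET_query=list(c_ADMET.loc[c_ADMET['DILIrank_ID'] == Drug].values[0])[1:]
    c_MolDes_query=list(c_MolDes.loc[c_MolDes['DILIrank_ID'] == Drug].values[0])[1:]
    # Concatenation order: SSP, then c-MolDes, then c-ADMET.
    Drug_query=SSP_query+c_MolDes_query+c_ADMET_query
    return Drug_query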

def execute_DILI_c_MLP(Drug_query, current_model):
    input_to_DILI_c_MLP=pd.DataFrame(Drug_query).T
    #Get DILI-c MLP Prediction:
    output_vector = current_model.predict(input_to_DILI_c_MLP)
    return output_vector

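def get_ensemble_result(k_th_output_vector):
    # Rows are the per-model output vectors; columns are the two output indices.
    df=pd.DataFrame(k_th_output_vector)
    # Average each output index over the models, then pick the larger mean.
    mean_output_vector=np.asarray([np.mean(df[0].values), np.mean(df[1].values)])
    predicted_DILI_c = np.argmax(mean_output_vector)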

    status=""
    if predicted_DILI_c==0:
        status="Low-DILI-Concern"
    elif predicted_DILI_c==1:
        status="High-DILI-Concern"
    
    # Note: the printout below reads the global variable Drug set in Step 4.
    print("Model Name: DILI-c MLP, [1,128] network architecture, triple features")
    print("\n--- Query Details ---")
    print("Drug :", Drug)
    print("\n--- Start of Query Results ---")
    print("DILI-c MLP predicts that", Drug, "has", status, ".")
    print("--- End of Query Results ---")
    return predicted_DILI_c
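
The ensemble rule used above, averaging the per-model output vectors and taking the index of the larger mean, can also be sketched with NumPy alone. The function name `ensemble_vote` and the example numbers are illustrative assumptions; they are not outputs of the actual DILI-c MLP models.

```python
import numpy as np

# Meaning of the two output indices, as described in Step 3.
LABELS = {0: "Low-DILI-Concern", 1: "High-DILI-Concern"}

def ensemble_vote(output_vectors):
    """Average the per-model output vectors and return (argmax index, mean vector)."""
    outputs = np.asarray(output_vectors, dtype=float)  # shape: (n_models, 2)
    mean_vector = outputs.mean(axis=0)                 # mean over the models
    predicted_index = int(np.argmax(mean_vector))
    return predicted_index, mean_vector

# Hypothetical outputs from four models (made-up numbers for illustration).
example_outputs = [[0.25, 0.75], [0.5, 0.5], [0.75, 0.25], [0.0, 1.0]]
predicted_index, mean_vector = ensemble_vote(example_outputs)
print(mean_vector, LABELS[predicted_index])
```

Here the column means are 0.375 and 0.625, so index 1 is selected. Averaging before the argmax lets a confident model outweigh several marginal ones, which a simple majority vote would not.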

Step 4: Load the 10 DILI-c MLP models into the notebook and perform DILI concern predictions. The final result is the ensemble of all 10 models.


Note: You can safely ignore any TensorFlow warnings that appear when executing this cell.

In [4]:
Drug="amineptine"
k_th_output_vector=[]
print("Starting Calculation:\n")
start_time = time.time()
model_load_time=0.0  # accumulated inside the loop below
Drug_query=get_drug_query(Drug)
query_time=time.time() - start_time

for current_k_iteration in range(10):
    start_time = time.time()
    # Forward slash avoids the invalid "\D" escape sequence and also works outside Windows.
    saved_model = load_model("3_2_1_2_Misc Input Files for DILI-c MLP/DILI-c_MLP_K_"+str(current_k_iteration)+".h5")
    model_load_time=model_load_time+(time.time() - start_time)
    
    start_time = time.time()
    k_th_output_vector.append(list(execute_DILI_c_MLP(Drug_query, saved_model)[0]))
    query_time=query_time+(time.time() - start_time)
    print("Finished", current_k_iteration+1, "out of 10 Models.")

print("Finished Calculation.\n")

get_ensemble_result(k_th_output_vector)

print("\nTotal Model Load time: %.5f seconds" % (model_load_time))
print("Model Load time per cycle: %.5f seconds" % (model_load_time/10))
print("\nTotal Query Run time: %.5f seconds" % (query_time))
print("Query Run time per cycle: %.5f seconds" % (query_time/10))

Starting Calculation:

Instructions for updating:
Colocations handled automatically by placer.
Instructions for updating:
Use tf.cast instead.
Finished 1 out of 10 Models.
Finished 2 out of 10 Models.
Finished 3 out of 10 Models.
Finished 4 out of 10 Models.
Finished 5 out of 10 Models.
Finished 6 out of 10 Models.
Finished 7 out of 10 Models.
Finished 8 out of 10 Models.
Finished 9 out of 10 Models.
Finished 10 out of 10 Models.
Finished Calculation.

Model Name: DILI-c MLP, [1,128] network architecture, triple features

--- Query Details ---
Drug : amineptine

--- Start of Query Results ---
DILI-c MLP predicts that amineptine has High-DILI-Concern .
--- End of Query Results ---

Total Model Load time: 39.38807 seconds
Model Load time per cycle: 3.93881 seconds

Total Query Run time: 4.42268 seconds
Query Run time per cycle: 0.44227 seconds
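
The timings above show that loading the models takes far longer than querying them. When several drugs are to be queried, one option is to load the ten models once and reuse them for every query. The following is a minimal sketch of that pattern: `StubModel` is a hypothetical stand-in for a loaded Keras model (any object with a `predict` method), and `predict_with_ensemble` is an illustrative name that does not appear in the notebook.

```python
import numpy as np

class StubModel:
    """Hypothetical stand-in for a loaded Keras model; always returns a fixed output."""
    def __init__(self, output):
        self._output = np.asarray([output], dtype=float)  # shape: (1, 2)

    def predict(self, inputs):
        return self._output

def predict_with_ensemble(models, drug_query):
    """Run one query through already-loaded models and return the ensemble index."""
    inputs = np.asarray([drug_query], dtype=float)  # shape: (1, n_features)
    outputs = np.vstack([model.predict(inputs) for model in models])
    return int(np.argmax(outputs.mean(axis=0)))

# In the notebook, this list would instead be built once with load_model(...).
models = [StubModel([0.25, 0.75]), StubModel([0.5, 0.5])]

# Hypothetical feature vectors for two drugs (made-up numbers).
queries = {"drug_a": [0.0, 1.0, 2.0], "drug_b": [3.0, 4.0, 5.0]}
results = {name: predict_with_ensemble(models, query) for name, query in queries.items()}
print(results)
```

With this arrangement the model loading cost is paid once, and each additional drug adds only the query time.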
