# Inference times
This notebook compares the DENVIS and DeepDTA models with regard to their screening (inference) times.

For DENVIS we investigate the following parameters:
* Atom-level vs. surface-level
* Efficient vs. naive implementation. In the efficient implementation, protein pocket embeddings are pre-computed and stored in memory, whereas in the naive implementation the embeddings are computed afresh for each protein pocket-ligand pair.
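
The difference between the two implementations can be illustrated with a minimal sketch. This is not the DENVIS code: `embed_pocket`, `embed_ligand`, and `score` are hypothetical callables standing in for the real networks, and the toy versions below only count how often the pocket embedding is computed.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Pair = Tuple[str, str]  # (pocket id, ligand id); hypothetical identifiers


def screen_naive(pairs: Sequence[Pair], embed_pocket: Callable, embed_ligand: Callable,
                 score: Callable) -> List[float]:
    """Recompute the pocket embedding for every pocket-ligand pair."""
    return [score(embed_pocket(p), embed_ligand(l)) for p, l in pairs]


def screen_efficient(pairs: Sequence[Pair], embed_pocket: Callable, embed_ligand: Callable,
                     score: Callable) -> List[float]:
    """Embed each distinct pocket once, keep it in memory, and reuse it."""
    cache: Dict[str, object] = {}
    scores = []
    for pocket, ligand in pairs:
        if pocket not in cache:
            cache[pocket] = embed_pocket(pocket)
        scores.append(score(cache[pocket], embed_ligand(ligand)))
    return scores


# Toy stand-ins (assumptions for illustration only)
calls = {'pocket': 0}


def toy_embed_pocket(pocket: str) -> int:
    calls['pocket'] += 1  # Count how many times a pocket is embedded
    return len(pocket)


def toy_embed_ligand(ligand: str) -> int:
    return len(ligand)


def toy_score(pocket_emb: int, ligand_emb: int) -> float:
    return float(pocket_emb * ligand_emb)


pairs = [('pocketA', ligand) for ligand in ('CCO', 'CCN', 'c1ccccc1')]

naive_scores = screen_naive(pairs, toy_embed_pocket, toy_embed_ligand, toy_score)
naive_calls = calls['pocket']

calls['pocket'] = 0
efficient_scores = screen_efficient(pairs, toy_embed_pocket, toy_embed_ligand, toy_score)
efficient_calls = calls['pocket']
```

Both functions return the same scores; only the number of pocket-embedding computations differs (one per pair vs. one per distinct pocket).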

We also report the time for a single protein-ligand pair inference vs. the total time when using a model ensemble with 5 atom-level and 5 surface-level instances.

For all models, inference was run 20 times and the timings are averaged (using the mean).
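
As a rough illustration of this protocol, the sketch below times a function over repeated runs and summarises the result as mean and standard deviation. The timed function `toy_inference` is a hypothetical placeholder, not one of the models, and the actual measurements in this notebook are read from files rather than produced this way.

```python
import statistics
import time


def time_runs(fn, n_runs=20):
    """Call `fn` n_runs times and return the wall-clock duration of each call in seconds."""
    durations = []
    for _ in range(n_runs):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def toy_inference():
    # Placeholder workload standing in for one model inference (assumption)
    sum(i * i for i in range(10_000))


durations = time_runs(toy_inference, n_runs=20)
mean_s = statistics.mean(durations)
std_s = statistics.stdev(durations)  # Sample standard deviation, like pandas' default .std()
print(f"{1000 * mean_s:.3f} ± {1000 * std_s:.3f} ms (mean ± std over {len(durations)} runs)")
```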

In [3]:
import os 
import json

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns; sns.set()

import inference_times

# 1. Configuration

In [4]:
NUM_BASE_MODELS = 5  # Number of base models in the ensembles

TOT_RUNS = 20  # Number of times inference was timed (used to estimate error bars)

models = ['atom', 'surface']
screening_types = ['efficient', 'naive']

PATH_RESULTS = {
    'DENVIS': '../data/times/denvis/',
    'DeepDTA': '../data/times/deepdta/'}

# 2. Parse results

In [10]:
columns = ['Model', 'Type', 'Run', 'Time']
times_df = pd.DataFrame(columns=columns)
for model in models:
    for screening_type in screening_types:
        for run in range(TOT_RUNS):
            fpath = os.path.join(
                PATH_RESULTS['DENVIS'], model + '_level', screening_type, 'run_' + str(run), 'dude.json')
            time = inference_times.read_denvis_times(fpath)
            times_df = pd.concat(
                (times_df, pd.DataFrame(
                    {'Model': [model],
                     'Type': screening_type,
                     'Run': run,
                     'Time': time})), axis='index')
                
# Create extra entries for ensemble models (sum the respective times)
sum_df = times_df.groupby(by=['Type', 'Run'], group_keys=False).sum().reset_index(drop=False)
sum_df['Model'] = 'ensemble'
times_df = pd.concat((times_df, sum_df), axis='index')

# Total time when NUM_BASE_MODELS instances are used: multiply the single-instance time by the number of base models
times_df['Total time'] = times_df['Time'] * NUM_BASE_MODELS

# DeepDTA
columns = ['Model', 'Time', 'Type']
times_df_deepdta = pd.DataFrame(columns=columns)
for run in range(TOT_RUNS):
    fpath = os.path.join(PATH_RESULTS['DeepDTA'], 'run_' + str(run),  'dude_times.csv')
    time = inference_times.read_deepdta_times(fpath)
    times_df_deepdta = pd.concat(
        (times_df_deepdta, pd.DataFrame({
            "Model": ['deepDTA'],
            'Type': ['naive'],
            'Time': [time],
            'Total time': [time]})), axis='index')

# Combine all
times_df_all = pd.concat((times_df, times_df_deepdta), axis=0)
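
To make the ensemble step concrete, here is a small sketch on invented numbers (the timings below are made up for illustration and are not measurements). It sums the atom-level and surface-level times per screening type and run, then scales by an assumed 5 base models.

```python
import pandas as pd

# Invented toy timings in seconds, one run only (assumption for illustration)
toy = pd.DataFrame({
    'Model': ['atom', 'surface', 'atom', 'surface'],
    'Type': ['efficient', 'efficient', 'naive', 'naive'],
    'Run': [0, 0, 0, 0],
    'Time': [0.25, 0.5, 1.0, 2.0],
})

# One ensemble row per (Type, Run): atom time + surface time
ensemble = toy.groupby(['Type', 'Run'], as_index=False)['Time'].sum()
ensemble['Model'] = 'ensemble'

toy_all = pd.concat((toy, ensemble), axis='index', ignore_index=True)

# Scale by the number of base models per level (5 here, as in the configuration)
toy_all['Total time'] = toy_all['Time'] * 5
print(toy_all)
```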

# 3. Display average inference times (Table 5)

In [13]:
# Times are multiplied by 1000 so that they are displayed in ms
for model in ['ensemble', 'deepDTA']:
    if model == 'deepDTA':
        # DeepDTA timings are stored with Type 'naive' only
        implementation = 'naive'
        pred_mean = times_df_all[(times_df_all['Model']==model) & (times_df_all['Type']==implementation)]['Time'].mean()
        pred_std = times_df_all[(times_df_all['Model']==model) & (times_df_all['Type']==implementation)]['Time'].std()
        print(f"Average prediction time for {model} model with {implementation} implementation: {1000 * pred_mean:.3f} ± {1000 * pred_std:.3f} (mean ± std, ms).")
    else:
        for implementation in ['efficient', 'naive']:
            single_pred_mean = times_df_all[(times_df_all['Model']==model) & (times_df_all['Type']==implementation)]['Time'].mean()
            single_pred_std = times_df_all[(times_df_all['Model']==model) & (times_df_all['Type']==implementation)]['Time'].std()
            ensemble_pred_mean = times_df_all[(times_df_all['Model']==model) & (times_df_all['Type']==implementation)]['Total time'].mean()
            ensemble_pred_std = times_df_all[(times_df_all['Model']==model) & (times_df_all['Type']==implementation)]['Total time'].std()
            print(f"Average prediction time for {model} model with {implementation} implementation: single prediction {1000 * single_pred_mean:.3f} ± {1000 * single_pred_std:.3f}, ensemble prediction {1000 * ensemble_pred_mean:.3f} ± {1000 * ensemble_pred_std:.3f} (mean ± std, ms).")


Average prediction time for ensemble model with efficient implementation: single prediction 0.314 ± 0.008, ensemble prediction 1.570 ± 0.039 (mean ± std, ms).
Average prediction time for ensemble model with naive implementation: single prediction 1.908 ± 0.003, ensemble prediction 9.540 ± 0.014 (mean ± std, ms).
Average prediction time for deepDTA model with naive implementation: 0.582 ± 0.006 (mean ± std, ms).
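
For a quick comparison, the ratios between the means printed above can be computed directly. The sketch below is plain arithmetic on the reported values and involves no new measurement; the dictionary keys are labels chosen here for readability.

```python
# Mean times in ms, copied from the output above
reported_ms = {
    ('DENVIS ensemble', 'efficient'): 1.570,
    ('DENVIS ensemble', 'naive'): 9.540,
    ('DeepDTA', 'naive'): 0.582,
}

# How much faster the efficient DENVIS ensemble is than the naive one
speedup = reported_ms[('DENVIS ensemble', 'naive')] / reported_ms[('DENVIS ensemble', 'efficient')]

# How the efficient DENVIS ensemble compares with a single DeepDTA prediction
ratio_vs_deepdta = reported_ms[('DENVIS ensemble', 'efficient')] / reported_ms[('DeepDTA', 'naive')]

print(f"Naive / efficient (DENVIS ensemble): {speedup:.2f}x")
print(f"Efficient DENVIS ensemble / DeepDTA: {ratio_vs_deepdta:.2f}x")
```

With these means, the efficient ensemble is roughly 6 times faster than the naive one, and takes roughly 2.7 times as long as a single DeepDTA prediction.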
