# Calculate performance of signature

Gregory Way, 2021

I previously identified a series of morphology features that were significantly different between sensitive and resistant clones.
I also applied this signature to all profiles from training, testing, validation, and holdout sets.
Here, I evaluate the performance of this signature.

## Evaluation

* Accuracy
  - The resistant and sensitive clones were balanced, so accuracy is an appropriate measure
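* Average precision
  - Summarizes the precision-recall curve for the resistant class: the mean of the precision (number correctly identified as resistant / total predicted resistant) across score thresholds, weighted by the increase in recall at each threshold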
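
As a minimal sketch of how these two metrics could be computed with scikit-learn, assume a continuous signature score where values above a threshold of 0 are called resistant. The toy labels and scores below are invented for illustration and are not taken from the dataset; the actual evaluation runs through `get_metric_pipeline`, whose internals are not shown here.

```python
import numpy as np
from sklearn.metrics import accuracy_score, average_precision_score

# Hypothetical toy data: 1 = resistant, 0 = sensitive
y_true = np.array([1, 1, 1, 0, 0, 0])
# Hypothetical signature scores (higher = more resistant-like)
scores = np.array([0.30, 0.10, -0.05, 0.02, -0.20, -0.40])

# Accuracy needs hard calls, so binarize the score at the threshold
threshold = 0
y_pred = (scores > threshold).astype(int)
accuracy = accuracy_score(y_true, y_pred)

# Average precision uses the continuous score directly (no threshold)
avg_precision = average_precision_score(y_true, scores)

print(f"accuracy = {accuracy:.3f}, average precision = {avg_precision:.3f}")
```

Accuracy depends on the chosen threshold, while average precision only depends on how the scores rank the samples.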
  
## Shuffled results

I also randomly permute the signature score 100 times and perform the full evaluation.
I record performance in this shuffled set as a negative control.
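
The permutation control described above can be sketched as follows. This is an illustration under stated assumptions, not the implementation inside `get_metric_pipeline`: the toy data, the helper `accuracy_at_threshold`, and the use of `np.random.default_rng` are all hypothetical choices made for the example.

```python
import numpy as np
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Hypothetical balanced toy data: 50 resistant (1) and 50 sensitive (0) profiles
y_true = np.array([1] * 50 + [0] * 50)
# Hypothetical scores that separate the two classes reasonably well
scores = np.where(y_true == 1, 0.1, -0.1) + rng.normal(0, 0.05, size=100)


def accuracy_at_threshold(y_true, scores, threshold=0):
    """Binarize scores at the threshold and compute accuracy (hypothetical helper)."""
    return accuracy_score(y_true, (scores > threshold).astype(int))


real_accuracy = accuracy_at_threshold(y_true, scores)

# Negative control: break the score/label link by permuting the scores
num_permutations = 100
shuffled_accuracy = np.array([
    accuracy_at_threshold(y_true, rng.permutation(scores))
    for _ in range(num_permutations)
])

print(f"real: {real_accuracy:.3f}, shuffled mean: {shuffled_accuracy.mean():.3f}")
```

With balanced classes, the shuffled accuracy should hover around 0.5, which gives a baseline for judging the real performance.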

## Metadata stratification

Lastly, I calculate performance separately within the following metadata subsets:

1. Model split (training, test, validation, holdout)
2. Model split and plate (to identify plate-specific performance)
3. Model split and clone ID (to identify whether certain clones are consistently predicted differently)
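
A stratified evaluation of this kind can be sketched with a pandas `groupby`. The grouping columns mirror the metadata column names used in this notebook, but the toy table and the `y_true`, `y_pred`, and `correct` columns are hypothetical, and this is not necessarily how `get_metric_pipeline` computes its results.

```python
import pandas as pd

# Hypothetical toy table with true and predicted resistance status
df = pd.DataFrame({
    "Metadata_model_split": ["training"] * 4 + ["test"] * 4,
    "Metadata_Plate": ["P1", "P1", "P2", "P2", "P1", "P1", "P2", "P2"],
    "y_true": [1, 0, 1, 0, 1, 0, 1, 0],
    "y_pred": [1, 0, 1, 1, 1, 0, 0, 1],
})

# Each comparison names the columns that define one stratum
metric_comparisons = {
    "total": ["Metadata_model_split"],
    "plate": ["Metadata_model_split", "Metadata_Plate"],
}

# Accuracy within a group is the fraction of correct calls in that group
df["correct"] = df["y_true"] == df["y_pred"]
results = {
    name: df.groupby(group_cols)["correct"].mean().rename("accuracy").reset_index()
    for name, group_cols in metric_comparisons.items()
}

print(results["total"])
print(results["plate"])
```

Small strata deserve caution: a group containing only one class makes metrics such as average precision undefined or uninformative.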

In [1]:
import sys
import pathlib
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score, average_precision_score

import plotnine as gg

from utils.metrics import get_metrics, get_metric_pipeline

In [2]:
np.random.seed(5678)

In [3]:
# Set constants
dataset = "bortezomib"

sig_dir = pathlib.Path("results", "singscore")
results_file = pathlib.Path(sig_dir, f"singscore_results{dataset}.tsv.gz")

output_dir = pathlib.Path("results", "performance")

num_permutations = 100
threshold = 0

metric_comparisons = {
    "total": ["Metadata_model_split"],
    "plate": ["Metadata_model_split", "Metadata_Plate"],
    "sample": ["Metadata_model_split", "Metadata_clone_number"]
}

In [4]:
# Load data
results_df = pd.read_csv(results_file, sep="\t")

print(results_df.shape)
results_df.head()

(525, 28)


| Unnamed: 0 | Metadata_Plate | Metadata_Well | Metadata_batch | Metadata_cell_count | Metadata_cell_density | Metadata_celltype_shorthand_from_plate_graph | Metadata_clone_number | Metadata_date | Metadata_plate_map_name | Metadata_time_to_adhere | ... | TotalScore | TotalDispersion | UpScore | UpDispersion | DownScore | DownDispersion | Metadata_permuted_p_value | dataset | min_permuted_value | max_permuted_value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 219907 | B02 | 2021_03_03_Batch12 | 6139 | 2.5x10^3 cells/well | 1.0 | WT_parental | 20210205.0 | 219814 | 48 hr | ... | -0.098036 | 406.9737 | -0.121904 | 189.7728 | 0.023868 | 217.2009 | 0.849 | bortezomib | -0.149556 | 0.151539 |
| 1 | 219907 | B03 | 2021_03_03_Batch12 | 4567 | 2.5x10^3 cells/well | 2.0 | CloneA | 20210205.0 | 219814 | 48 hr | ... | 0.020157 | 573.7662 | -0.056714 | 383.9934 | 0.076872 | 189.7728 | 0.387 | bortezomib | -0.149556 | 0.151539 |
| 2 | 219907 | B04 | 2021_03_03_Batch12 | 5624 | 2.5x10^3 cells/well | 3.0 | CloneE | 20210205.0 | 219814 | 48 hr | ... | 0.069194 | 531.5121 | -0.065754 | 352.8588 | 0.134948 | 178.6533 | 0.218 | bortezomib | -0.149556 | 0.151539 |
| 3 | 219907 | B05 | 2021_03_03_Batch12 | 5894 | 2.5x10^3 cells/well | 4.0 | WT clone 01 | 20210205.0 | 219814 | 48 hr | ... | -0.010165 | 283.9179 | -0.053064 | 120.0906 | 0.042899 | 163.8273 | 0.535 | bortezomib | -0.149556 | 0.151539 |
| 4 | 219907 | B06 | 2021_03_03_Batch12 | 1277 | 2.5x10^3 cells/well | 5.0 | WT clone 02 | 20210205.0 | 219814 | 48 hr | ... | -0.212776 | 389.1825 | -0.033159 | 170.499 | -0.179616 | 218.6835 | 0.994 | bortezomib | -0.149556 | 0.151539 |


In [5]:
# Using real predictions
real_metric_results = get_metric_pipeline(
    results_df,
    metric_comparisons,
    datasets=[dataset],
    shuffle=False,
    signature=False,
    threshold=threshold
)

  recall = tps / tps[-1]


In [6]:
# Using shuffled predictions (negative control)
all_shuffle_results = {compare: [] for compare in metric_comparisons}
for i in range(num_permutations):
    # Re-seed NumPy with the permutation index before each shuffle
    np.random.seed(i)
    shuffle_metric_results = get_metric_pipeline(
        results_df,
        metric_comparisons,
        datasets=[dataset],
        shuffle=True,
        signature=False,
        threshold=threshold
    )
    for compare in metric_comparisons:
        metric_df = shuffle_metric_results[compare].assign(permutation=i)
        all_shuffle_results[compare].append(metric_df)

In [7]:
# Output performance results
output_dir.mkdir(parents=True, exist_ok=True)  # Ensure the output directory exists

for compare in metric_comparisons:
    full_results_df = real_metric_results[compare]
    shuffle_results_df = pd.concat(all_shuffle_results[compare]).reset_index(drop=True)

    # Performance of the real signature scores
    output_file = output_dir / f"{compare}_{dataset}_metric_performance.tsv"
    full_results_df.to_csv(output_file, sep="\t", index=False)

    # Performance of the shuffled scores (negative control)
    output_file = output_dir / f"{compare}_{dataset}_shuffle_metric_performance.tsv"
    shuffle_results_df.to_csv(output_file, sep="\t", index=False)