# Analyze Accuracy of the Autocuration Pipeline

This notebook analyzes the performance of the Influenza Autocuration pipeline on sequences that were already flagged in IRD. Sequences from species A-D were downloaded from legacy IRD, and the pipeline was applied to each species' set of sequences. The results for each set were saved as a table with the columns 'Accession', 'Actual Flag', 'My Flag', and 'Profile'. Comparing the 'Actual Flag' and 'My Flag' columns gives a measure of performance. Each sequence falls into one of five possible outcomes:

(1) actual = Pass | mine = Pass

(2) actual = Pass | mine = Flag_A

(3) actual = Flag_A | mine = Pass

(4) actual = Flag_A | mine = Flag_A

(5) actual = Flag_A | mine = Flag_B

To account for these five possibilities, this performance analysis measures (1) the precision of determining a 'pass' sequence (pass precision), (2) the precision of determining a 'flag' sequence (flag precision), (3) the rate at which actual pass sequences are labeled as pass (pass recall), (4) the rate at which actual flag sequences are labeled with a flag (flag recall), (5) the accuracy of assigning the correct annotation(s) to a flagged sequence (flag accuracy), and finally (6) the overall accuracy in determining the correct annotation.
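
As a minimal worked example of these six measures, the sketch below uses five made-up sequences, one per outcome (1)-(5). The flag labels and variable names are illustrative assumptions, not real IRD records, and the arithmetic is meant to mirror the definitions above rather than replace the functions defined later in this notebook.

```python
# Toy data: one hypothetical sequence for each of the outcomes (1)-(5).
actual = ['Pass', 'Pass', 'Flag_A', 'Flag_A', 'Flag_A']
mine = ['Pass', 'Flag_A', 'Pass', 'Flag_A', 'Flag_B']

pairs = list(zip(actual, mine))
true_pass = sum(a == 'Pass' and m == 'Pass' for a, m in pairs)   # outcome (1)
false_flag = sum(a == 'Pass' and m != 'Pass' for a, m in pairs)  # outcome (2)
false_pass = sum(a != 'Pass' and m == 'Pass' for a, m in pairs)  # outcome (3)
true_flag = sum(a != 'Pass' and m != 'Pass' for a, m in pairs)   # outcomes (4) and (5)
exact_flag = sum(a != 'Pass' and a == m for a, m in pairs)       # outcome (4) only

metrics = {
    'pass_precision': true_pass / (true_pass + false_pass),
    'pass_recall': true_pass / (true_pass + false_flag),
    'flag_precision': true_flag / (true_flag + false_flag),
    'flag_recall': true_flag / (true_flag + false_pass),
    # Denominator is every sequence that IRD flagged, including those labeled Pass here.
    'flag_accuracy': exact_flag / (true_flag + false_pass),
    'overall_accuracy': (true_pass + exact_flag) / len(pairs),
}

for name, value in metrics.items():
    print(name, round(value, 3))
```

Note that outcome (5) counts toward flag precision and flag recall, because the sequence was flagged, but counts against flag accuracy and overall accuracy, because the flag assigned was not the one recorded in IRD.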

Performance is analyzed in two stages: one in which the MAFFT alignment gap penalty is used and one in which it is not. The analysis is therefore split into two sections.

In [1]:
import pandas as pd
import numpy as np
import warnings
warnings.filterwarnings("ignore")

In [2]:
def pass_flag_confusion(theirs, mine):
    """Count Pass/Flag agreement between the IRD labels (theirs) and the pipeline labels (mine)."""
    mat = pd.DataFrame([[0, 0], [0, 0]], index=['Pass_IRD', 'Flag_IRD'], columns=['Pass_Mine', 'Flag_Mine'])

    for actual_flag, my_flag in zip(theirs, mine):
        # Any value other than 'Pass' is treated as a flag.
        row = 'Pass_IRD' if actual_flag == 'Pass' else 'Flag_IRD'
        col = 'Pass_Mine' if my_flag == 'Pass' else 'Flag_Mine'
        mat.at[row, col] += 1

    return mat


def corr_incorr_confusion(theirs, mine):
    """Count exact annotation matches between the IRD labels (theirs) and the pipeline labels (mine)."""
    mat = pd.DataFrame([[0, 0], [0, 0]], index=['Pass_IRD', 'Flag_IRD'], columns=['Match', 'No Match'])

    for actual_flag, my_flag in zip(theirs, mine):
        row = 'Pass_IRD' if actual_flag == 'Pass' else 'Flag_IRD'
        # A match requires the identical annotation, not just agreement on Pass vs. Flag.
        col = 'Match' if actual_flag == my_flag else 'No Match'
        mat.at[row, col] += 1

    return mat
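
As an aside, the same Pass/Flag table can be built without an explicit loop. The sketch below is an alternative, not the code used to produce the results in this notebook. The function name `pass_flag_crosstab` is hypothetical, and it assumes both inputs are equal-length sequences of flag strings in which anything other than 'Pass' counts as a flag.

```python
import pandas as pd

def pass_flag_crosstab(theirs, mine):
    """Hypothetical vectorized variant of the Pass/Flag confusion matrix."""
    theirs = pd.Series(list(theirs))
    mine = pd.Series(list(mine))
    # Collapse every non-'Pass' annotation into a single 'Flag' category.
    rows = theirs.eq('Pass').map({True: 'Pass_IRD', False: 'Flag_IRD'})
    cols = mine.eq('Pass').map({True: 'Pass_Mine', False: 'Flag_Mine'})
    table = pd.crosstab(rows, cols)
    # Reindex so that categories with no observations still appear with a count of 0.
    return table.reindex(index=['Pass_IRD', 'Flag_IRD'],
                         columns=['Pass_Mine', 'Flag_Mine'],
                         fill_value=0)

# Toy input: one made-up sequence per outcome (1)-(5).
toy_actual = ['Pass', 'Pass', 'Flag_A', 'Flag_A', 'Flag_A']
toy_mine = ['Pass', 'Flag_A', 'Pass', 'Flag_A', 'Flag_B']
table = pass_flag_crosstab(toy_actual, toy_mine)
print(table)
```

The `reindex` step matters for sets such as Influenza C, where a whole row or column can be empty and would otherwise be missing from the crosstab output.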

In [3]:
# True Pass / (True Pass + False Pass)
def pass_precision(mat):
    true_pass = mat.loc['Pass_IRD', 'Pass_Mine']
    false_pass = mat.loc['Flag_IRD', 'Pass_Mine']
    return true_pass / (true_pass + false_pass)

# True Pass / (True Pass + False Flag) (Also 'True Pass Rate' and 'Pass Accuracy')
def pass_recall(mat):
    true_pass = mat.loc['Pass_IRD', 'Pass_Mine']
    false_flag = mat.loc['Pass_IRD', 'Flag_Mine']
    return true_pass / (true_pass + false_flag)

# True Flag / (True Flag + False Flag)
def flag_precision(mat):
    true_flag = mat.loc['Flag_IRD', 'Flag_Mine']
    false_flag = mat.loc['Pass_IRD', 'Flag_Mine']
    return true_flag / (true_flag + false_flag)

# True Flag / (True Flag + False Pass) (Also 'True Flag Rate')
def flag_recall(mat):
    true_flag = mat.loc['Flag_IRD', 'Flag_Mine']
    false_pass = mat.loc['Flag_IRD', 'Pass_Mine']
    return true_flag / (true_flag + false_pass)

# How well flagged sequences actually got designated with the correct flag
def flag_accuracy(mat):
    matched = mat.loc['Flag_IRD', 'Match']
    return matched / (matched + mat.loc['Flag_IRD', 'No Match'])

# How well 'My Flag' matches 'Actual Flag'
def overall_accuracy(mat):
    matched = mat.loc['Pass_IRD', 'Match'] + mat.loc['Flag_IRD', 'Match']
    unmatched = mat.loc['Pass_IRD', 'No Match'] + mat.loc['Flag_IRD', 'No Match']
    return matched / (matched + unmatched)

# (1) Performance With Gap Penalty 

## Influenza A Results

In [4]:
fluA_results = pd.read_csv("InfluenzaA_result.csv")
fluA_results = fluA_results[(fluA_results['Profile'] != 'Unknown')]
actual = list(fluA_results['Actual Flag'])
pred = list(fluA_results['My Flag'])

In [5]:
confusion1 = pass_flag_confusion(actual, pred)
confusion1

| | Pass_Mine | Flag_Mine |
|---|---|---|
| Pass_IRD | 989 | 21 |
| Flag_IRD | 63 | 714 |


In [6]:
confusion2 = corr_incorr_confusion(actual, pred)
confusion2

| | Match | No Match |
|---|---|---|
| Pass_IRD | 989 | 21 |
| Flag_IRD | 655 | 122 |


In [7]:
print("Pass precision:", round(pass_precision(confusion1), 3))
print("Pass recall (Pass accuracy):", round(pass_recall(confusion1), 3))
print('\n')
print("Flag precision:", round(flag_precision(confusion1), 3))
print("Flag recall:", round(flag_recall(confusion1), 3))
print('\n')
print("Flag accuracy:", round(flag_accuracy(confusion2), 3))
print("Overall accuracy:", round(overall_accuracy(confusion2), 3))

Pass precision: 0.94
Pass recall (Pass accuracy): 0.979


Flag precision: 0.971
Flag recall: 0.919


Flag accuracy: 0.843
Overall accuracy: 0.92


## Influenza B Results

In [8]:
fluB_results = pd.read_csv("InfluenzaB_result.csv")
fluB_results = fluB_results[(fluB_results['Profile'] != 'Unknown')]
actual = list(fluB_results['Actual Flag'])
pred = list(fluB_results['My Flag'])

In [9]:
confusion1 = pass_flag_confusion(actual, pred)
confusion1

| | Pass_Mine | Flag_Mine |
|---|---|---|
| Pass_IRD | 401 | 99 |
| Flag_IRD | 19 | 197 |


In [10]:
confusion2 = corr_incorr_confusion(actual, pred)
confusion2

| | Match | No Match |
|---|---|---|
| Pass_IRD | 401 | 99 |
| Flag_IRD | 189 | 27 |


In [11]:
print("Pass precision:", round(pass_precision(confusion1), 3))
print("Pass recall (Pass accuracy):", round(pass_recall(confusion1), 3))
print('\n')
print("Flag precision:", round(flag_precision(confusion1), 3))
print("Flag recall:", round(flag_recall(confusion1), 3))
print('\n')
print("Flag accuracy:", round(flag_accuracy(confusion2), 3))
print("Overall accuracy:", round(overall_accuracy(confusion2), 3))

Pass precision: 0.955
Pass recall (Pass accuracy): 0.802


Flag precision: 0.666
Flag recall: 0.912


Flag accuracy: 0.875
Overall accuracy: 0.824


## Influenza C Results

In [14]:
fluC_results = pd.read_csv("InfluenzaC_result.csv")
fluC_results = fluC_results[(fluC_results['Profile'] != 'Unknown')]
actual = list(fluC_results['Actual Flag'])
pred = list(fluC_results['My Flag'])

In [15]:
confusion1 = pass_flag_confusion(actual, pred)
confusion1

| | Pass_Mine | Flag_Mine |
|---|---|---|
| Pass_IRD | 69 | 1 |
| Flag_IRD | 0 | 0 |


In [16]:
confusion2 = corr_incorr_confusion(actual, pred)
confusion2

| | Match | No Match |
|---|---|---|
| Pass_IRD | 69 | 1 |
| Flag_IRD | 0 | 0 |


In [17]:
print("Pass precision:", round(pass_precision(confusion1), 3))
print("Pass recall (Pass accuracy):", round(pass_recall(confusion1), 3))
print('\n')
print("Flag precision:", round(flag_precision(confusion1), 3))
print("Flag recall:", round(flag_recall(confusion1), 3))
print('\n')
print("Flag accuracy:", round(flag_accuracy(confusion2), 3))
print("Overall accuracy:", round(overall_accuracy(confusion2), 3))

Pass precision: 1.0
Pass recall (Pass accuracy): 0.986


Flag precision: 0.0
Flag recall: nan


Flag accuracy: nan
Overall accuracy: 0.986
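
The `nan` values for flag recall and flag accuracy arise because this set contains no IRD-flagged sequences (the `Flag_IRD` row is all zeros), so both ratios divide zero by zero. A minimal sketch of making that case explicit follows; the helper name `safe_ratio` is an assumption for illustration and is not part of the notebook's code.

```python
import math

def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    if denominator == 0:
        return math.nan
    return numerator / denominator

# Influenza C counts from the Pass/Flag table above.
true_flag, false_flag, false_pass = 0, 1, 0

flag_precision_c = safe_ratio(true_flag, true_flag + false_flag)  # defined: 0 of 1 flags was correct
flag_recall_c = safe_ratio(true_flag, true_flag + false_pass)     # undefined: no actual flags exist
print(flag_precision_c, flag_recall_c)
```

An undefined metric is best read as "not measurable on this set" rather than as a score of zero.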


## Influenza D Results

In [18]:
fluD_results = pd.read_csv("InfluenzaD_result.csv")
fluD_results = fluD_results[(fluD_results['Profile'] != 'Unknown')]
actual = list(fluD_results['Actual Flag'])
pred = list(fluD_results['My Flag'])

In [19]:
confusion1 = pass_flag_confusion(actual, pred)
confusion1

| | Pass_Mine | Flag_Mine |
|---|---|---|
| Pass_IRD | 253 | 0 |
| Flag_IRD | 0 | 23 |


In [20]:
confusion2 = corr_incorr_confusion(actual, pred)
confusion2

| | Match | No Match |
|---|---|---|
| Pass_IRD | 253 | 0 |
| Flag_IRD | 13 | 10 |


In [21]:
print("Pass precision:", round(pass_precision(confusion1), 3))
print("Pass recall (Pass accuracy):", round(pass_recall(confusion1), 3))
print('\n')
print("Flag precision:", round(flag_precision(confusion1), 3))
print("Flag recall:", round(flag_recall(confusion1), 3))
print('\n')
print("Flag accuracy:", round(flag_accuracy(confusion2), 3))
print("Overall accuracy:", round(overall_accuracy(confusion2), 3))

Pass precision: 1.0
Pass recall (Pass accuracy): 1.0


Flag precision: 1.0
Flag recall: 1.0


Flag accuracy: 0.565
Overall accuracy: 0.964


## Overall Results

In [22]:
all_flu_results = pd.concat([fluA_results, fluB_results, fluC_results, fluD_results], axis = 0)
actual = list(all_flu_results['Actual Flag'])
pred = list(all_flu_results['My Flag'])

In [23]:
confusion1 = pass_flag_confusion(actual, pred)
confusion1

| | Pass_Mine | Flag_Mine |
|---|---|---|
| Pass_IRD | 1712 | 121 |
| Flag_IRD | 82 | 934 |


In [24]:
confusion2 = corr_incorr_confusion(actual, pred)
confusion2

| | Match | No Match |
|---|---|---|
| Pass_IRD | 1712 | 121 |
| Flag_IRD | 857 | 159 |


In [25]:
print("Pass precision:", round(pass_precision(confusion1), 3))
print("Pass recall (Pass accuracy):", round(pass_recall(confusion1), 3))
print('\n')
print("Flag precision:", round(flag_precision(confusion1), 3))
print("Flag recall:", round(flag_recall(confusion1), 3))
print('\n')
print("Flag accuracy:", round(flag_accuracy(confusion2), 3))
print("Overall accuracy:", round(overall_accuracy(confusion2), 3))

Pass precision: 0.954
Pass recall (Pass accuracy): 0.934


Flag precision: 0.885
Flag recall: 0.919


Flag accuracy: 0.844
Overall accuracy: 0.902


# (2) Performance Without Gap Penalty

## Influenza A Results

In [5]:
fluA_results = pd.read_csv("InfluenzaA_result_2.csv")
actual = list(fluA_results['Actual Flag'])
pred = list(fluA_results['My Flag'])

In [6]:
confusion1 = pass_flag_confusion(actual, pred)
confusion1

| | Pass_Mine | Flag_Mine |
|---|---|---|
| Pass_IRD | 945 | 65 |
| Flag_IRD | 24 | 777 |


In [7]:
confusion2 = corr_incorr_confusion(actual, pred)
confusion2

| | Match | No Match |
|---|---|---|
| Pass_IRD | 945 | 65 |
| Flag_IRD | 693 | 108 |


In [8]:
print("Pass precision:", round(pass_precision(confusion1), 3))
print("Pass recall (Pass accuracy):", round(pass_recall(confusion1), 3))
print('\n')
print("Flag precision:", round(flag_precision(confusion1), 3))
print("Flag recall:", round(flag_recall(confusion1), 3))
print('\n')
print("Flag accuracy:", round(flag_accuracy(confusion2), 3))
print("Overall accuracy:", round(overall_accuracy(confusion2), 3))

Pass precision: 0.975
Pass recall (Pass accuracy): 0.936


Flag precision: 0.923
Flag recall: 0.97


Flag accuracy: 0.865
Overall accuracy: 0.904


## Influenza B Results

In [27]:
fluB_results = pd.read_csv("InfluenzaB_result_2.csv")
actual = list(fluB_results['Actual Flag'])
pred = list(fluB_results['My Flag'])

In [28]:
confusion1 = pass_flag_confusion(actual, pred)
confusion1

| | Pass_Mine | Flag_Mine |
|---|---|---|
| Pass_IRD | 385 | 115 |
| Flag_IRD | 1 | 215 |


In [29]:
confusion2 = corr_incorr_confusion(actual, pred)
confusion2

| | Match | No Match |
|---|---|---|
| Pass_IRD | 385 | 115 |
| Flag_IRD | 202 | 14 |


In [30]:
print("Pass precision:", round(pass_precision(confusion1), 3))
print("Pass recall (Pass accuracy):", round(pass_recall(confusion1), 3))
print('\n')
print("Flag precision:", round(flag_precision(confusion1), 3))
print("Flag recall:", round(flag_recall(confusion1), 3))
print('\n')
print("Flag accuracy:", round(flag_accuracy(confusion2), 3))
print("Overall accuracy:", round(overall_accuracy(confusion2), 3))

Pass precision: 0.997
Pass recall (Pass accuracy): 0.77


Flag precision: 0.652
Flag recall: 0.995


Flag accuracy: 0.935
Overall accuracy: 0.82


## Influenza C Results

In [45]:
fluC_results = pd.read_csv("InfluenzaC_result_2.csv")
actual = list(fluC_results['Actual Flag'])
pred = list(fluC_results['My Flag'])

In [46]:
confusion1 = pass_flag_confusion(actual, pred)
confusion1

| | Pass_Mine | Flag_Mine |
|---|---|---|
| Pass_IRD | 68 | 2 |
| Flag_IRD | 0 | 0 |


In [47]:
confusion2 = corr_incorr_confusion(actual, pred)
confusion2

| | Match | No Match |
|---|---|---|
| Pass_IRD | 68 | 2 |
| Flag_IRD | 0 | 0 |


In [48]:
print("Pass precision:", round(pass_precision(confusion1), 3))
print("Pass recall (Pass accuracy):", round(pass_recall(confusion1), 3))
print('\n')
print("Flag precision:", round(flag_precision(confusion1), 3))
print("Flag recall:", round(flag_recall(confusion1), 3))
print('\n')
print("Flag accuracy:", round(flag_accuracy(confusion2), 3))
print("Overall accuracy:", round(overall_accuracy(confusion2), 3))

Pass precision: 1.0
Pass recall (Pass accuracy): 0.971


Flag precision: 0.0
Flag recall: nan


Flag accuracy: nan
Overall accuracy: 0.971


## Influenza D Results

In [49]:
fluD_results = pd.read_csv("InfluenzaD_result_2.csv")
actual = list(fluD_results['Actual Flag'])
pred = list(fluD_results['My Flag'])

In [50]:
confusion1 = pass_flag_confusion(actual, pred)
confusion1

| | Pass_Mine | Flag_Mine |
|---|---|---|
| Pass_IRD | 252 | 1 |
| Flag_IRD | 0 | 23 |


In [51]:
confusion2 = corr_incorr_confusion(actual, pred)
confusion2

| | Match | No Match |
|---|---|---|
| Pass_IRD | 252 | 1 |
| Flag_IRD | 13 | 10 |


In [52]:
print("Pass precision:", round(pass_precision(confusion1), 3))
print("Pass recall (Pass accuracy):", round(pass_recall(confusion1), 3))
print('\n')
print("Flag precision:", round(flag_precision(confusion1), 3))
print("Flag recall:", round(flag_recall(confusion1), 3))
print('\n')
print("Flag accuracy:", round(flag_accuracy(confusion2), 3))
print("Overall accuracy:", round(overall_accuracy(confusion2), 3))

Pass precision: 1.0
Pass recall (Pass accuracy): 0.996


Flag precision: 0.958
Flag recall: 1.0


Flag accuracy: 0.565
Overall accuracy: 0.96


## Overall Results

In [53]:
all_flu_results = pd.concat([fluA_results, fluB_results, fluC_results, fluD_results], axis = 0)
actual = list(all_flu_results['Actual Flag'])
pred = list(all_flu_results['My Flag'])

In [54]:
confusion1 = pass_flag_confusion(actual, pred)
confusion1

| | Pass_Mine | Flag_Mine |
|---|---|---|
| Pass_IRD | 1693 | 140 |
| Flag_IRD | 64 | 976 |


In [55]:
confusion2 = corr_incorr_confusion(actual, pred)
confusion2

| | Match | No Match |
|---|---|---|
| Pass_IRD | 1693 | 140 |
| Flag_IRD | 863 | 177 |


In [56]:
print("Pass precision:", round(pass_precision(confusion1), 3))
print("Pass recall (Pass accuracy):", round(pass_recall(confusion1), 3))
print('\n')
print("Flag precision:", round(flag_precision(confusion1), 3))
print("Flag recall:", round(flag_recall(confusion1), 3))
print('\n')
print("Flag accuracy:", round(flag_accuracy(confusion2), 3))
print("Overall accuracy:", round(overall_accuracy(confusion2), 3))

Pass precision: 0.964
Pass recall (Pass accuracy): 0.924


Flag precision: 0.875
Flag recall: 0.938


Flag accuracy: 0.83
Overall accuracy: 0.89
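
To compare the two stages directly, the overall metrics printed in sections (1) and (2) can be placed side by side. The sketch below simply copies those printed values into a table and takes the difference; the column labels and the `summary` name are illustrative choices, not part of the pipeline.

```python
import pandas as pd

# Overall metrics as printed in the "Overall Results" cells of sections (1) and (2).
summary = pd.DataFrame(
    {
        'With gap penalty': [0.954, 0.934, 0.885, 0.919, 0.844, 0.902],
        'Without gap penalty': [0.964, 0.924, 0.875, 0.938, 0.830, 0.890],
    },
    index=['Pass precision', 'Pass recall', 'Flag precision',
           'Flag recall', 'Flag accuracy', 'Overall accuracy'],
)

# Positive values mean the metric is higher without the gap penalty.
summary['Difference'] = (summary['Without gap penalty'] - summary['With gap penalty']).round(3)
print(summary)
```

On these figures, dropping the gap penalty raises pass precision and flag recall while lowering the other four measures.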


# Visualize All Performance Results