# Plasmids MILP experiment 1

## Description

This notebook describes the results of the experiment with the first version of the Plasmids assembly MILP. The experiment has two parts. In the first part, I tried to determine a suitable set of $\alpha$ coefficients for the objective function by running the MILP on a single sample with different coefficient combinations. In the second part, I applied the selected $\alpha$ values to different samples. In both parts, I recorded the average and maximum precision, recall and F1 score obtained.
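
For reference, the F1 score combines precision and recall as their harmonic mean. The sketch below assumes that standard definition. The helper name `f1_score` is illustrative and is not part of the evaluation scripts, which may handle edge cases (such as both inputs being 0) differently.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall.

    Returns 0.0 when both inputs are zero; this convention is an assumption.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative inputs only, not results from the experiment.
example = f1_score(0.5, 1.0)
```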

In [1]:
from IPython.display import HTML

HTML('''<script>
code_show=true; 
function code_toggle() {
 if (code_show){
 $('div.input').hide();
 } else {
 $('div.input').show();
 }
 code_show = !code_show
} 
$( document ).ready(code_toggle);
</script>
<form action="javascript:code_toggle()"><input type="submit" value="Click here to toggle on/off the raw code."></form>''')

In [8]:
import os
from collections import defaultdict
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
plt.switch_backend('agg')
%matplotlib inline

In [9]:
output_dir = '../../output'
ratio_tests = os.path.join(output_dir,'ratio_test')

In [10]:
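def update_dict(line, stat_dict, file, folder_loc):
    # The score is the last space-separated token on the line.
    stat = line.split(" ")[-1]
    # Key the score by the path component at index folder_loc.
    # This index depends on how deep the file sits in the directory tree.
    stat_dict[file.split('/')[folder_loc]].append(float(stat))
    return stat_dict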

In [11]:
def compute_mean(mean, k, precs, recs, f1s):
    mean[k] = {}
    mean[k]['precision'] = sum(precs[k])/len(precs[k])
    mean[k]['recall'] = sum(recs[k])/len(recs[k])
    mean[k]['f1_score'] = sum(f1s[k])/len(f1s[k])
    return mean

In [12]:
def compute_max(best, k, precs, recs, f1s):
    best[k] = {}
    best[k]['precision'] = max(precs[k])
    best[k]['recall'] = max(recs[k])
    best[k]['f1_score'] = max(f1s[k])
    return best
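
The two helpers above apply the same pattern (reduce each key's list of scores to one number) with different reducers. As a sketch only, the pattern can be written once and parameterised by the reducer. The names `aggregate` and `mean_of`, the key `key_a` and the score values are placeholders introduced here, not part of the notebook or its results.

```python
def aggregate(scores_by_metric, reducer):
    """Apply `reducer` to the score list of every key, for every metric."""
    result = {}
    for metric, scores in scores_by_metric.items():
        for key, values in scores.items():
            result.setdefault(key, {})[metric] = reducer(values)
    return result


def mean_of(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Placeholder scores; real values would be read from the eval files.
scores = {
    'precision': {'key_a': [0.0, 0.5, 1.0]},
    'recall': {'key_a': [0.25, 0.75]},
}
mean = aggregate(scores, mean_of)
best = aggregate(scores, max)
```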

## Part I: Statistics for various coefficient combinations ($\alpha_1, \alpha_2, \alpha_3$)

The aim of this part is to find a suitable set of $\alpha$ values for the MILP. For this part I used a single sample (id=103) and obtained results for various $\alpha$ values. The sample itself contains exactly 1 plasmid. However, I ran the experiments with the number of plasmids (nplasmids) $\in \{1,2,3,4\}$. The reason is that in many cases, the output for nplasmids $= 1$ yielded 0 precision and recall. In other words, the best plasmid according to the MILP did not match the reference plasmid for the sample.

In [13]:
files = [os.path.join(dp, f) for dp, dn, filenames in os.walk(ratio_tests) for f in filenames if "eval.csv" in f]
precs, recs, f1s = defaultdict(list), defaultdict(list), defaultdict(list)

for file in files:
    with open(file, 'r') as f:
        for line in f:
            if "precision" in line:
                precs = update_dict(line, precs, file, 11)
            if "recall" in line:
                recs = update_dict(line, recs, file, 11)
            if "f1" in line:
                f1s = update_dict(line, f1s, file, 11)

IndexError: list index out of range
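
The `IndexError` above is consistent with `update_dict` being called with the fixed index 11: any path returned by `os.walk` with fewer than 12 `/`-separated components makes `file.split('/')[11]` fail. A less fragile option, sketched below, is to take the folder name relative to the directory being walked rather than by absolute position. The helper `key_below_root` and the folder layout in the example (`1.1.0/nplasmids_1/eval.csv`) are assumptions made for illustration; the actual layout of the output directory may differ.

```python
import os


def key_below_root(path, root, depth=0):
    """Return the folder name `depth` levels below `root` on the way to `path`."""
    rel = os.path.relpath(path, root)
    parts = rel.split(os.sep)
    folders = parts[:-1]  # the last part is the file name
    if depth >= len(folders):
        raise ValueError(
            f"{path!r} has only {len(folders)} folder(s) below {root!r}"
        )
    return folders[depth]


# Hypothetical layout: <root>/<ratio>/<run folder>/eval.csv
root = os.path.join('..', '..', 'output', 'ratio_test')
example_path = os.path.join(root, '1.1.0', 'nplasmids_1', 'eval.csv')
key = key_below_root(example_path, root)  # '1.1.0'
```

Because the key is computed relative to `root`, it does not change when the notebook is run from a different working directory or with an absolute output path.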

In [None]:
mean, best = {}, {} 
mean_scores = []
best_scores = []

for ratio in precs:
    mean = compute_mean(mean, ratio, precs, recs, f1s)
    mean_scores.append([ratio, mean[ratio]['precision'], mean[ratio]['recall'], mean[ratio]['f1_score']])
    best = compute_max(best, ratio, precs, recs, f1s)
    best_scores.append([ratio, best[ratio]['precision'], best[ratio]['recall'], best[ratio]['f1_score']])
    
mean_scores = pd.DataFrame(mean_scores)
mean_scores.rename(columns = {0: 'Ratio', 1: 'Precision', 2: 'Recall', 3: 'F1 score'}, inplace = True)

best_scores = pd.DataFrame(best_scores)
best_scores.rename(columns = {0: 'Ratio', 1: 'Precision', 2: 'Recall', 3: 'F1 score'}, inplace = True)

with open(os.path.join(output_dir,'exp1_scores.csv'), 'w') as f:
    mean_scores.to_csv(f, sep = '\t', encoding='utf-8', index=False)
with open(os.path.join(output_dir,'exp1_scores.csv'), 'a') as f:
    best_scores.to_csv(f, sep = '\t', encoding='utf-8', index=False)

Table 1 lists the results. The first column lists the ratio ($\alpha_1: \alpha_2: \alpha_3$) used in the objective function. From the table, we can see that the choice of $\alpha_1$ has little impact on the results. For instance, the ratios 0.1.0, 1.1.0 and 2.1.0 yield very similar precision, recall and F1 scores. This suggests that the MILP assigns read depths `rd[p][c]` such that their deviation from the average read depth `mean_rd[p]` is negligible.
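
One way to check the sensitivity to $\alpha_1$ is to group the ratios by their ($\alpha_2$, $\alpha_3$) part and look at the spread of a score within each group. The sketch below assumes ratio labels of the form `a1.a2.a3` with dot-separated coefficients, as in the table. The function names and the score values are placeholders invented for illustration and are not results from this experiment.

```python
from collections import defaultdict


def group_by_alpha23(score_by_ratio):
    """Group scores of ratios 'a1.a2.a3' by their (a2, a3) part."""
    groups = defaultdict(dict)
    for ratio, score in score_by_ratio.items():
        a1, a2, a3 = ratio.split('.')
        groups[(a2, a3)][a1] = score
    return groups


def spread(group):
    """Difference between the largest and smallest score in a group."""
    return max(group.values()) - min(group.values())


# Placeholder F1 values, NOT measured results.
f1_by_ratio = {'0.1.0': 0.40, '1.1.0': 0.41, '2.1.0': 0.40, '1.1.1': 0.60}
groups = group_by_alpha23(f1_by_ratio)
spreads = {key: spread(group) for key, group in groups.items()}
```

A small spread within a group means that changing $\alpha_1$ while keeping the other two coefficients fixed barely moves the score.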

In [None]:
mean_scores

In [None]:
best_scores

## Part II: Statistics for different samples

In this part, I chose the $\alpha$ ratios that performed better than the others. If two ratios gave the same or similar output (such as 1.1.0 and 2.1.0 above), only one of the two was kept. Since varying $\alpha_1$ does not significantly affect the results, it is fixed to 1. The ratios chosen for this part are 1.1.0, 1.1.1, 1.2.1 and 1.5.1. For each ratio, the MILP was run with nplasmids $\in \{1,2,3\}$. In many cases, the MILP did not converge to the optimal solution for higher numbers of plasmids.

A set of 10 ids was chosen to test the general performance of the MILP. These were the same ids on which Robert had previously carried out his preliminary experiments with the HyAsP greedy algorithm.
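
The run configurations described above can be enumerated as the Cartesian product of the chosen ratios and plasmid counts. This is a minimal sketch: the variable names are illustrative, and the sample ids are not listed in this notebook, so only their count is used.

```python
from itertools import product

ratios = ['1.1.0', '1.1.1', '1.2.1', '1.5.1']
nplasmids_values = [1, 2, 3]
n_samples = 10  # number of sample ids used in this part

# Every (ratio, nplasmids) pair that is run for each sample.
configs = list(product(ratios, nplasmids_values))

runs_per_sample = len(configs)            # 4 ratios x 3 plasmid counts
total_runs = runs_per_sample * n_samples  # across all samples
```

Each sample therefore contributes up to 12 runs to its mean and maximum scores, fewer if some runs did not produce an evaluation file.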

In [None]:
files = [os.path.join(dp, f) for dp, dn, filenames in os.walk(output_dir) for f in filenames if "eval.csv" in f]
precs, recs, f1s = defaultdict(list), defaultdict(list), defaultdict(list)

for file in files:
    # Index 10 is the position of the sample folder in the path;
    # it depends on how deep the output directory sits.
    if 'sample' in file.split('/')[10]:
        with open(file, 'r') as f:
            for line in f:
                if "precision" in line:
                    precs = update_dict(line, precs, file, 10)
                if "recall" in line:
                    recs = update_dict(line, recs, file, 10)
                if "f1" in line:
                    f1s = update_dict(line, f1s, file, 10)

In [None]:
mean, best = {}, {}
mean_scores, best_scores = [], [] 
for sample_id in precs:
    # Reuse the helpers defined above instead of repeating their bodies.
    mean = compute_mean(mean, sample_id, precs, recs, f1s)
    mean_scores.append([sample_id, mean[sample_id]['precision'], mean[sample_id]['recall'], mean[sample_id]['f1_score']])

    best = compute_max(best, sample_id, precs, recs, f1s)
    best_scores.append([sample_id, best[sample_id]['precision'], best[sample_id]['recall'], best[sample_id]['f1_score']])

mean_scores = pd.DataFrame(mean_scores)
mean_scores.rename(columns = {0: 'Sample', 1: 'Precision', 2: 'Recall', 3: 'F1 score'}, inplace = True)

best_scores = pd.DataFrame(best_scores)
best_scores.rename(columns = {0: 'Sample', 1: 'Precision', 2: 'Recall', 3: 'F1 score'}, inplace = True)

In [None]:
mean_scores

In [None]:
best_scores

In [None]:
N = 10
ind = np.arange(N)
width = 0.27

As seen in the following figure, the average recall is lower than 0.6 for 6 of the 10 samples chosen. However, the precision is higher than the recall for most of the samples. One explanation is that the MILP predicts a large plasmid that covers a significant portion of the reference plasmid. Because one of the terms in the objective function favours a higher gene density, the MILP tends to choose a longer plasmid. This issue may be resolved in the next experiment, in which path constraints will be introduced.
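
The two counts quoted above (samples with low recall, and samples where precision exceeds recall) can be derived from the rows of the score table rather than read off the plot. The sketch below assumes rows of the form `[sample, precision, recall, f1]`, which is what `mean_scores.values.tolist()` would yield. The function name, sample names and values are placeholders, not results from this experiment.

```python
def summarise_rows(rows, recall_threshold=0.6):
    """Return samples with recall below the threshold and samples
    whose precision is strictly higher than their recall."""
    low_recall = [s for s, p, r, f in rows if r < recall_threshold]
    prec_above_rec = [s for s, p, r, f in rows if p > r]
    return low_recall, prec_above_rec


# Placeholder rows: [sample, precision, recall, f1]. NOT measured results.
rows = [
    ['sample_A', 0.9, 0.5, 0.64],
    ['sample_B', 0.7, 0.8, 0.75],
    ['sample_C', 0.6, 0.4, 0.48],
]
low_recall, prec_above_rec = summarise_rows(rows)
```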

In [None]:
fig = plt.figure()
ax = fig.add_subplot(111)

pvals = mean_scores['Precision'].values.tolist()
rects1 = ax.bar(ind, pvals, width, color='r')
rvals = mean_scores['Recall'].values.tolist()
rects2 = ax.bar(ind+width, rvals, width, color='g')
fvals = mean_scores['F1 score'].values.tolist()
rects3 = ax.bar(ind+width*2, fvals, width, color='b')
ids = mean_scores['Sample'].values.tolist()
ids = [k.split('_')[1] for k in ids]
ax.set_ylabel('Scores')
ax.set_xticks(ind+width)
ax.set_xticklabels(ids)
ax.legend( (rects1[0], rects2[0], rects3[0]), ('Prec', 'Rec', 'F1') )
# Save before showing: calling plt.show() first can leave an empty figure to be saved.
fig.savefig(os.path.join(output_dir,'mean_scores_MILP_exp1.pdf'), format = 'pdf', dpi = 1200, bbox_inches = 'tight')
plt.show()

In [None]:
fig = plt.figure()
ax = fig.add_subplot(111)

pvals = best_scores['Precision'].values.tolist()
rects1 = ax.bar(ind, pvals, width, color='r')
rvals = best_scores['Recall'].values.tolist()
rects2 = ax.bar(ind+width, rvals, width, color='g')
fvals = best_scores['F1 score'].values.tolist()
rects3 = ax.bar(ind+width*2, fvals, width, color='b')
ids = best_scores['Sample'].values.tolist()
ids = [k.split('_')[1] for k in ids]
ax.set_ylabel('Scores')
ax.set_xticks(ind+width)
ax.set_xticklabels(ids)
ax.legend( (rects1[0], rects2[0], rects3[0]), ('Prec', 'Rec', 'F1') )
# Save before showing: calling plt.show() first can leave an empty figure to be saved.
fig.savefig(os.path.join(output_dir,'best_scores_MILP_exp1.pdf'), format = 'pdf', dpi = 1200, bbox_inches = 'tight')
plt.show()