# Experiment: Varying N in top-N DDA fragmentation

We demonstrate that the simulator can be used for scan-level, closed-loop DDA experiments.
- Take an existing dataset and determine which MS1 peaks are linked to which MS2 peaks.
- Run all MS1 peaks through the simulator’s Top-N protocol.
- For the 100 most intense MS1 peaks, count how many are fragmented in the simulator as N changes.
- If N is greater than in the real data, check whether the same MS1 peaks as in the first step are fragmented again, plus additional fragment peaks.
- Verify the results on an actual machine.
- Talk to Stefan about machine time.
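
The Top-N idea behind the steps above can be sketched for a single MS1 scan as follows. This is a minimal illustration only, not the `VMSfunctions` controller: the function `select_top_n`, its `min_intensity` parameter and the toy scan values are assumptions made for this example, and dynamic exclusion (`rt_tol`) and isolation windows are ignored.

```python
import numpy as np

def select_top_n(mzs, intensities, n, min_intensity=0.0):
    """Return (mz, intensity) pairs for the n most intense peaks of one MS1 scan.

    Hypothetical helper for illustration; it ignores dynamic exclusion.
    """
    mzs = np.asarray(mzs, dtype=float)
    intensities = np.asarray(intensities, dtype=float)
    keep = intensities >= min_intensity            # drop peaks below the threshold
    order = np.argsort(intensities[keep])[::-1]    # indices by descending intensity
    top = order[:n]                                # keep at most n precursors
    return list(zip(mzs[keep][top].tolist(), intensities[keep][top].tolist()))

# Toy MS1 scan (made-up values)
scan_mzs = [100.0, 200.0, 300.0, 400.0]
scan_intensities = [5e4, 2e6, 1e3, 7e5]
precursors = select_top_n(scan_mzs, scan_intensities, n=2)
print(precursors)
```

Increasing `n` in this sketch can only add precursors to the selection, which is the behaviour the experiment checks for when N exceeds the value used in the real data.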

In [None]:
%matplotlib inline

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
import numpy as np
import pandas as pd
import sys
import scipy.stats
import pylab as plt
from IPython import display
from random import random, shuffle
from joblib import Parallel, delayed
import multiprocessing

In [None]:
sys.path.append('../codes')

In [None]:
from VMSfunctions.Chemicals import *
from VMSfunctions.Chromatograms import *
from VMSfunctions.MassSpec import *
from VMSfunctions.Controller import *
from VMSfunctions.Common import *
from VMSfunctions.DataGenerator import *
from VMSfunctions.Noise import *

### Experiment by varying N and rt_tol

Load the saved experiment results for analysis.

In [None]:
controllers = load_obj('../models/noisy_top_N_controllers.p')

#### Plot total number of fragmented MS2 peaks

In [None]:
def get_total_peaks(controller, ms_level):
    """Sum the number of peaks over all scans of the given MS level."""
    num_peaks = [scan.num_peaks for scan in controller.scans[ms_level]]
    return sum(num_peaks)
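
As a quick sanity check, the function can be exercised on stand-in objects. The sketch below assumes only what the function itself relies on: that `controller.scans` maps an MS level to a list of scans, each exposing `num_peaks`. The `SimpleNamespace` objects and their values are hypothetical placeholders, not real controller output.

```python
from types import SimpleNamespace

def get_total_peaks(controller, ms_level):
    """Sum the number of peaks over all scans of the given MS level."""
    num_peaks = [scan.num_peaks for scan in controller.scans[ms_level]]
    return sum(num_peaks)

# Hypothetical stand-in for a controller: two MS1 scans and three MS2 scans
mock_controller = SimpleNamespace(scans={
    1: [SimpleNamespace(num_peaks=120), SimpleNamespace(num_peaks=95)],
    2: [SimpleNamespace(num_peaks=10),
        SimpleNamespace(num_peaks=0),
        SimpleNamespace(num_peaks=7)],
})

total_ms1 = get_total_peaks(mock_controller, 1)
total_ms2 = get_total_peaks(mock_controller, 2)
print(total_ms1, total_ms2)
```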

In [None]:
total_peaks = [get_total_peaks(controller, 2) for controller in controllers]
plt.plot(total_peaks)
plt.xlabel('N')
plt.ylabel('Total MS2 Peaks')
plt.title('Top-N vs Total MS2 Peaks')
# NOTE: `inputs` must already be defined, with one tick label per controller
plt.xticks(range(len(inputs)), inputs, rotation='vertical')
plt.grid()

#### Plot how many large MS1 peaks got fragmented

In [None]:
sorted_dataset = sorted(dataset, key=lambda x: x.max_intensity, reverse=True)
most_intense = sorted_dataset[0:100]
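
One possible way to count how many of the most intense peaks were fragmented is sketched below. It is an assumption-laden illustration: the `peak_id` attribute, the `fragmented_ids` set and the helper `count_fragmented` are hypothetical names, and the toy `dataset` stands in for the real one. How fragmented precursors are actually matched back to MS1 peaks depends on the simulator's data structures.

```python
from types import SimpleNamespace

def count_fragmented(most_intense, fragmented_ids):
    """Count peaks in most_intense whose (hypothetical) peak_id was fragmented."""
    return sum(1 for peak in most_intense if peak.peak_id in fragmented_ids)

# Toy dataset (made-up intensities); each peak carries an assumed peak_id
dataset = [SimpleNamespace(peak_id=i, max_intensity=float(v))
           for i, v in enumerate([10, 500, 30, 900, 70])]

# Same selection pattern as in the cell above, with the top 3 instead of the top 100
sorted_dataset = sorted(dataset, key=lambda x: x.max_intensity, reverse=True)
most_intense = sorted_dataset[0:3]

# Assumed set of MS1 peak ids that received an MS2 scan in one simulated run
fragmented_ids = {1, 4}

n_fragmented = count_fragmented(most_intense, fragmented_ids)
print(n_fragmented, 'of', len(most_intense), 'most intense peaks fragmented')
```

Repeating this count for each controller would give one value per N, which could then be plotted in the same way as the total MS2 peak counts.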