# Varying N in Top-N Simulations

This notebook shows how to take an mzML file and rerun multiple Top-N analyses to find the best choice of N and dynamic exclusion window (DEW). The user provides an mzML file and the candidate values for N and DEW. Support for varying other settings is planned.

To use this notebook you will need:
- ViMMS - github.com/glasgowcompbio/vimms
- pymzm - github.com/glasgowcompbio/pymzm
- MZMine2 (this will need to be in the same parent directory as the experiment and data)

# 1. Load Packages

This loads the packages needed to run the example and adds the locations of the GitHub repositories to the Python path. Replace these paths with your own before running.

In [1]:
%matplotlib inline
%load_ext autoreload
%autoreload 2

In [2]:
import os  # used below for os.path; imported explicitly rather than relying on the star imports
import sys
sys.path.append('C:\\Users\\joewa\\Work\\git\\vimms')
sys.path.append('C:\\Users\\Vinny\\work\\vimms')

In [7]:
from pathlib import Path
from pyDOE import *
from vimms.Environment import *

In [9]:
from vimms.Chemicals import ChemicalCreator, GET_MS2_BY_PEAKS, GET_MS2_BY_SPECTRA
from vimms.MassSpec import IndependentMassSpectrometer
from vimms.Controller import *
from vimms.Common import *
from vimms.PlotsForPaper import *
from vimms.Roi import make_roi
from vimms.Chemicals import RoiToChemicalCreator
from vimms.FeatureExtraction import extract_roi
from vimms.SequenceManager import *

In [10]:
base_dir = os.path.abspath('../Trained Models')
data_dir = os.path.abspath('./QCsamples')

In [11]:
set_log_level_info()

# 2. Settings

This section specifies the settings that are initially fixed: the mass spectrometer, the Top-N controller and MZmine 2 peak picking. It also sets the folders where files are saved, which will need to be updated before you can run the example.

In [12]:
# this is needed to create the datasets
ps = load_obj('C:\\Users\\Vinny\\OneDrive - University of Glasgow\\CLDS Metabolomics Project\\Trained Models\\peak_sampler_mz_rt_int_beerqcb_fragmentation.p')

In [13]:
# This is where I save my results
experiment_dir = 'C:\\Users\\Vinny\\work\\mzmine_files'
output_dir = os.path.join(experiment_dir, 'SimpleExperiments\\GridSearch\\experiment_2_2_3')
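
The Windows-style paths above only work on the original author's machine. As a minimal sketch of a more portable alternative, the output folder could be built with pathlib and created if it is missing. The helper name make_output_dir is hypothetical and not part of ViMMS, and a temporary directory stands in for the real experiment folder.

```python
from pathlib import Path
import tempfile


def make_output_dir(experiment_dir, *parts):
    """Join the folder names onto experiment_dir and create the result if needed."""
    out = Path(experiment_dir).joinpath(*parts)
    out.mkdir(parents=True, exist_ok=True)
    return out


# A temporary directory stands in for the real experiment folder.
with tempfile.TemporaryDirectory() as tmp:
    output_dir = make_output_dir(tmp, 'SimpleExperiments', 'GridSearch', 'experiment_2_2_3')
    created = output_dir.is_dir()
    tail = output_dir.parts[-3:]
```

If a function expects a plain string rather than a Path object, str(output_dir) can be passed instead.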

In [14]:
# these are the parameters for the virtual mass spec
mass_spec_params = {'ionisation_mode': POSITIVE,
                    'peak_sampler': ps,
                    'add_noise': False,
                    'isolation_transition_window': 'rectangular',
                    'isolation_transition_window_params': None}

In [15]:
# these are the default values for the controller parameters. Some of them are replaced for individual runs in section 4
controller_params = {"ionisation_mode": POSITIVE,
                     "N": 10,
                     "mz_tol": 10,
                     "rt_tol": 30,
                     "min_ms1_intensity": 1.75E5,
                     "rt_range": [(200, 400)],
                     "isolation_width": 1}
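
To make the roles of N, rt_tol (the DEW), mz_tol and min_ms1_intensity more concrete, here is a toy sketch of Top-N selection with dynamic exclusion. It is not the ViMMS TopNController implementation. It assumes that mz_tol is a tolerance in ppm and rt_tol a window in seconds, and the function names and scan data are made up for illustration.

```python
def within_ppm(mz_a, mz_b, ppm):
    """True if mz_a lies within ppm parts-per-million of mz_b."""
    return abs(mz_a - mz_b) <= mz_b * ppm / 1e6


def select_top_n(peaks, current_rt, exclusions, n, mz_tol, rt_tol, min_intensity):
    """Pick up to n precursor m/z values from one MS1 scan.

    peaks is a list of (mz, intensity) pairs; exclusions is a list of
    (mz, rt_when_fragmented) pairs that is updated in place.
    """
    # Forget exclusions whose window has expired.
    exclusions[:] = [(mz, rt) for mz, rt in exclusions if current_rt - rt <= rt_tol]
    chosen = []
    for mz, intensity in sorted(peaks, key=lambda p: p[1], reverse=True):
        if len(chosen) == n or intensity < min_intensity:
            break  # peaks are sorted, so every remaining peak is weaker
        if any(within_ppm(mz, ex_mz, mz_tol) for ex_mz, _ in exclusions):
            continue  # fragmented recently, skip it
        chosen.append(mz)
    exclusions.extend((mz, current_rt) for mz in chosen)
    return chosen


# Made-up scan: the same four peaks are seen at three retention times.
scan = [(100.0, 5e5), (200.0, 3e5), (300.0, 2e5), (400.0, 1e5)]
exclusions = []
settings = dict(n=2, mz_tol=10, rt_tol=30, min_intensity=1.75e5)
first = select_top_n(scan, 10.0, exclusions, **settings)
second = select_top_n(scan, 20.0, exclusions, **settings)
third = select_top_n(scan, 45.0, exclusions, **settings)
```

In this toy run the two most intense peaks are picked first, the second scan falls back to the third peak because the first two are still excluded, and by the third scan their exclusion has expired so they are picked again. A larger N or a shorter DEW changes which peaks get fragmented, which is the trade-off the grid search explores.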

In [17]:
# location of the MZmine launcher, which is used to pick the peaks
mzmine_command = 'C:\\Users\\Vinny\\work\\MZmine-2.40.1\\MZmine-2.40.1\\startMZmine_Windows.bat'

# 3. User Inputs

These are the inputs that the user needs to supply:
- the mzML file that is going to be re-analysed
- which settings to test
- which method to use to evaluate the results (only one option is available currently, but it would be good to make the app generic enough to take other options in the future)

In [18]:
mzml_file = os.path.join(experiment_dir, 'QCB\\fullscan_mzmls\\QCB_22May19_1.mzML')

In [19]:
topn_variable_params_dict = {'N': [10,20], 'rt_tol': [15,30]}

In [20]:
evaluation_methods = ['mzmine_peak']

# 4. Run Methods

This takes all the settings given above and runs a grid search over every combination of the values in topn_variable_params_dict.
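
As a rough illustration of what such a grid search enumerates, the sketch below expands a dictionary of candidate values into one full parameter set per combination. It is not how GridSearchExperiment is implemented; expand_grid is a hypothetical helper and the defaults dictionary is a shortened stand-in for controller_params.

```python
from itertools import product


def expand_grid(variable_params, default_params):
    """Return one parameter dict per combination of the variable values."""
    names = list(variable_params)
    runs = []
    for values in product(*(variable_params[name] for name in names)):
        params = dict(default_params)      # start from the fixed defaults
        params.update(zip(names, values))  # override the varied settings
        runs.append(params)
    return runs


variable = {'N': [10, 20], 'rt_tol': [15, 30]}
defaults = {'N': 10, 'rt_tol': 30, 'mz_tol': 10}  # shortened stand-in
runs = expand_grid(variable, defaults)
combos = [(run['N'], run['rt_tol']) for run in runs]
```

With two values each for N and rt_tol this gives four runs, and every other setting keeps its default value.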

In [22]:
vsm = VimmsSequenceManager(None, evaluation_methods, output_dir, progress_bar=True, ms1_picked_peaks_file=None, mzmine_command=mzmine_command)
gs = GridSearchExperiment(vsm, 'TopNController', mass_spec_params, None, topn_variable_params_dict, controller_params, mzml_file, ps=ps, parallel=False)

2020-09-22 10:50:03.223 | INFO     | vimms.Common:save_obj:165 - Saving <class 'list'> to C:\Users\Vinny\work\mzmine_files\SimpleExperiments\GridSearch\experiment_2_2_3\QCB_22May19_1.p
2020-09-22 10:50:16.596 | INFO     | vimms.PythonMzmine:pick_peaks:23 - Creating xml batch file for QCB_22May19_1.mzML
2020-09-22 10:50:16.606 | INFO     | vimms.PythonMzmine:pick_peaks:53 - Running mzMine for QCB_22May19_1.mzML


KeyboardInterrupt: 

# 5. Analyse Results

In [None]:
gs.results

In [None]:
Heatmap_GridSearch(gs, 'mzmine_peak', 'rt_tol', 'N')