# 2. Generating a Sample Using the MS1 Controller

In this notebook, we demonstrate how ViMMS can be used to generate a full-scan mzML file from a single sample. This corresponds to Section 3.1 of the paper.

In [1]:
%matplotlib inline

In [2]:
%load_ext autoreload
%autoreload 2

In [3]:
import sys
sys.path.append('..')

In [4]:
import os
from pathlib import Path

In [5]:
from vimms.Chemicals import ChemicalCreator
from vimms.MassSpec import IndependentMassSpectrometer
from vimms.Controller import SimpleMs1Controller
from vimms.Common import *

Load the previously trained spectral feature database and the list of extracted metabolites, both created in **01. Download Data.ipynb**.

In [6]:
base_dir = os.path.abspath('example_data')
ps = load_obj(Path(base_dir, 'peak_sampler_mz_rt_int_19_beers_fullscan.p'))
hmdb = load_obj(Path(base_dir, 'hmdb_compounds.p'))

Set the ViMMS logging level.

In [7]:
set_log_level_debug()

## Create Chemicals

Define an output folder to hold our results.

In [8]:
out_dir = Path(base_dir, 'results', 'MS1_single')

Here we generate the chemical objects that will be used in the sample. The chemical objects are generated by sampling from metabolites in the HMDB database.

In [9]:
# the list of ROI sources created in the previous notebook '01. Download Data.ipynb'
ROI_Sources = [str(Path(base_dir, 'DsDA', 'DsDA_Beer', 'beer_t10_simulator_files'))]

# minimum MS1 intensity of chemicals
min_ms1_intensity = 1.75E5

# m/z and RT range of chemicals
rt_range = [(0, 1440)]
mz_range = [(0, 1050)]

# the number of chemicals in the sample
n_chems = 6500

# maximum MS level (we do not generate fragmentation peaks when this value is 1)
ms_level = 1
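Before sampling, it can help to confirm that the ranges are well formed. The sketch below is a plain-Python check and not part of ViMMS; `validate_ranges` is a hypothetical helper, and the rule it enforces (each entry is a non-negative `(low, high)` pair with `low < high`) is an assumption about sensible inputs rather than a documented requirement of `ChemicalCreator`.

```python
def validate_ranges(ranges, name):
    """Check that every entry is a (low, high) pair with 0 <= low < high.

    Hypothetical helper: the rule is an assumption, not a ViMMS requirement.
    """
    for pair in ranges:
        if len(pair) != 2:
            raise ValueError(f"{name}: expected (low, high), got {pair!r}")
        low, high = pair
        if low < 0 or low >= high:
            raise ValueError(f"{name}: invalid bounds ({low}, {high})")
    return True


# Same values as in the cell above
rt_range = [(0, 1440)]
mz_range = [(0, 1050)]

validate_ranges(rt_range, "rt_range")
validate_ranges(mz_range, "mz_range")
```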

In [10]:
chems = ChemicalCreator(ps, ROI_Sources, hmdb)
dataset = chems.sample(mz_range, rt_range, min_ms1_intensity, n_chems, ms_level)
save_obj(dataset, Path(out_dir, 'dataset.p'))

DEBUG  : ChemicalCreator                : Sorting database compounds by masses
DEBUG  : ChemicalCreator                : 6500 chemicals to be created.
DEBUG  : ChemicalCreator                : Sampling formula 0/6500
DEBUG  : ChemicalCreator                : Sampling formula 500/6500
DEBUG  : ChemicalCreator                : Sampling formula 1000/6500
DEBUG  : ChemicalCreator                : Sampling formula 1500/6500
DEBUG  : ChemicalCreator                : Sampling formula 2000/6500
DEBUG  : ChemicalCreator                : Sampling formula 2500/6500
DEBUG  : ChemicalCreator                : Sampling formula 3000/6500
DEBUG  : ChemicalCreator                : Sampling formula 3500/6500
DEBUG  : ChemicalCreator                : Sampling formula 4000/6500
DEBUG  : ChemicalCreator                : Sampling formula 4500/6500
DEBUG  : ChemicalCreator                : Sampling formula 5000/6500
DEBUG  : ChemicalCreator                : Sampling formula 5500/6500
DEBUG  : ChemicalCreator 

Saving <class 'list'> to C:\Users\joewa\Work\git\vimms\examples\example_data\results\MS1_single\dataset.p


In [11]:
for chem in dataset[0:10]:
    print(chem)

KnownChemical - 'C13H21NO10' rt=1186.77 max_intensity=367970.85
KnownChemical - 'C22H28ClFO4' rt=669.53 max_intensity=643271.48
KnownChemical - 'C21H21FO8' rt=284.30 max_intensity=651949.58
KnownChemical - 'C32H48O6' rt=631.90 max_intensity=7151240.05
KnownChemical - 'C29H46O' rt=443.85 max_intensity=245411.10
KnownChemical - 'H2O3S' rt=592.13 max_intensity=926532.07
KnownChemical - 'C16H24' rt=85.26 max_intensity=17106772.82
KnownChemical - 'C12H20N2O3' rt=441.17 max_intensity=1044845.15
KnownChemical - 'C11H21NO11S3' rt=658.04 max_intensity=2695579.25
KnownChemical - 'C17H24N6O3' rt=181.41 max_intensity=4455507.29
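As a quick sanity check on the chemicals printed above, the following sketch parses lines of that printed form and tests whether each one falls inside the RT range and at or above the minimum intensity. It works on the printed text only and calls no ViMMS API; `parse_chemical_line` and `within_limits` are hypothetical helper names, and it assumes the intensity threshold applies to the printed `max_intensity` value.

```python
import re

# Matches lines such as:
# KnownChemical - 'C16H24' rt=85.26 max_intensity=17106772.82
LINE_PATTERN = re.compile(
    r"KnownChemical - '(?P<formula>[^']+)' "
    r"rt=(?P<rt>[0-9.]+) max_intensity=(?P<intensity>[0-9.]+)"
)


def parse_chemical_line(line):
    """Return (formula, rt, max_intensity) parsed from one printed line."""
    match = LINE_PATTERN.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"Unrecognised line: {line!r}")
    return match["formula"], float(match["rt"]), float(match["intensity"])


def within_limits(line, rt_bounds, min_intensity):
    """True if the chemical's RT is in rt_bounds and its intensity is high enough."""
    _, rt, intensity = parse_chemical_line(line)
    low, high = rt_bounds
    return low <= rt <= high and intensity >= min_intensity


# Three of the lines printed above
sample_lines = [
    "KnownChemical - 'C13H21NO10' rt=1186.77 max_intensity=367970.85",
    "KnownChemical - 'C29H46O' rt=443.85 max_intensity=245411.10",
    "KnownChemical - 'C16H24' rt=85.26 max_intensity=17106772.82",
]

all_ok = all(within_limits(line, (0, 1440), 1.75e5) for line in sample_lines)
print(all_ok)
```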


## Run the MS1 controller on the sample and generate an .mzML file

In [12]:
set_log_level_warning()

In [13]:
min_rt = rt_range[0][0]
max_rt = rt_range[0][1]

In [14]:
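# simulate a mass spectrometer in positive ionisation mode over the sampled chemicals
mass_spec = IndependentMassSpectrometer(POSITIVE, dataset, ps)

# run the MS1-only controller from min_rt to max_rt
controller = SimpleMs1Controller(mass_spec)
controller.run(min_rt, max_rt)

# write the simulated scans to an mzML file in the output folder
mzml_filename = Path(out_dir, 'ms1_controller.mzML')
controller.write_mzML('my_analysis', mzml_filename)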

1440.8787000000023it [00:56, 25.62it/s]


The simulated results have been saved to the .mzML file below, which can be viewed in [TOPPView](https://pubs.acs.org/doi/abs/10.1021/pr900171m) or any other mzML viewer.

In [15]:
mzml_filename

WindowsPath('C:/Users/joewa/Work/git/vimms/examples/example_data/results/MS1_single/ms1_controller.mzML')
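The file can also be inspected without a viewer. The sketch below counts spectra per MS level using only the Python standard library; it is not part of ViMMS, and `count_spectra_by_ms_level` is a hypothetical helper name. It assumes the file follows the usual mzML layout, in which each `spectrum` element carries a direct `cvParam` child with accession `MS:1000511` ("ms level").

```python
import xml.etree.ElementTree as ET
from collections import Counter

# PSI-MS controlled vocabulary accession for the "ms level" term
MS_LEVEL_ACCESSION = "MS:1000511"


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://psi.hupo.org/ms/mzml}'."""
    return tag.rsplit("}", 1)[-1]


def count_spectra_by_ms_level(mzml_path):
    """Return a Counter mapping MS level to the number of spectra found."""
    counts = Counter()
    # iterparse streams the file, so large mzML files need not fit in memory
    for _, elem in ET.iterparse(str(mzml_path), events=("end",)):
        if local_name(elem.tag) != "spectrum":
            continue
        for child in elem:
            if (local_name(child.tag) == "cvParam"
                    and child.get("accession") == MS_LEVEL_ACCESSION):
                counts[int(child.get("value"))] += 1
                break
        elem.clear()  # release the children of the processed spectrum
    return counts
```

Calling `count_spectra_by_ms_level(mzml_filename)` on the file written above should, for a full-scan run like this one, be expected to report spectra at MS level 1 only.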