# 2. Generating a Sample Using the MS1 Controller

In this notebook, we demonstrate how ViMMS can be used to generate a full-scan mzML file from a single sample. This corresponds to Section 3.1 of the paper.

In [1]:
%matplotlib inline

In [2]:
%load_ext autoreload
%autoreload 2

In [3]:
import os  # used below for os.path.abspath
import sys
sys.path.append('..')

In [4]:
from pathlib import Path

In [5]:
from vimms.Chemicals import ChemicalCreator
from vimms.MassSpec import IndependentMassSpectrometer
from vimms.Controller import SimpleMs1Controller
from vimms.Common import *

Load the previously trained kernel density estimators (KDEs), stored in a `PeakSampler` object, and the list of extracted HMDB metabolites. Both files were created in **01. Download Data.ipynb**.

In [6]:
base_dir = os.path.abspath('example_data')
ps = load_obj(Path(base_dir, 'peak_sampler_mz_rt_int_19_beers_fullscan.p'))
hmdb = load_obj(Path(base_dir, 'hmdb_compounds.p'))
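
Both pickled files must already exist in `example_data` before the cell above can run. A minimal sketch of a pre-flight check, using only the standard library, could look as follows. The helper name `missing_files` is hypothetical and not part of ViMMS; the file names are the ones used in the loading cell.

```python
from pathlib import Path

# File names taken from the loading cell above.
REQUIRED_FILES = [
    'peak_sampler_mz_rt_int_19_beers_fullscan.p',
    'hmdb_compounds.p',
]

def missing_files(folder, names):
    """Return the entries of `names` that are not regular files in `folder`."""
    folder = Path(folder)
    return [name for name in names if not (folder / name).is_file()]

missing = missing_files('example_data', REQUIRED_FILES)
if missing:
    print('Missing input files, re-run the download notebook:', missing)
```

An empty list means both inputs are in place and loading can proceed.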

Set the ViMMS logging level.

In [7]:
set_log_level_warning()
# set_log_level_info()
# set_log_level_debug()

## Create Chemicals

Define the output folder where the results will be saved.

In [8]:
out_dir = Path(base_dir, 'results', 'MS1_single')
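
If the results folder does not exist yet, it can be created up front. The sketch below uses `pathlib` and rests on the assumption that the saving helper may not create missing folders itself, which the notebook does not confirm. `ensure_dir` is a hypothetical helper name.

```python
from pathlib import Path

def ensure_dir(path):
    """Create `path` and any missing parents; do nothing if it already exists."""
    path = Path(path)
    path.mkdir(parents=True, exist_ok=True)
    return path
```

Calling `ensure_dir(out_dir)` before the first save is harmless when the folder is already present.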

Here we generate the chemical objects that make up the sample. They are created by sampling metabolites from the HMDB database.

In [9]:
# the list of ROI sources created in the previous notebook '01. Download Data.ipynb'
ROI_Sources = [str(Path(base_dir, 'DsDA', 'DsDA_Beer', 'beer_t10_simulator_files'))]

# minimum MS1 intensity of chemicals
min_ms1_intensity = 1.75E5

# m/z and RT range of chemicals
rt_range = [(0, 1440)]
mz_range = [(0, 1050)]

# the number of chemicals in the sample
n_chems = 6500

# maximum MS level (we do not generate fragmentation peaks when this value is 1)
ms_level = 1
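
To make the role of these parameters concrete, the sketch below applies the same kind of constraints to a few made-up candidate peaks. It is an illustration only and not ViMMS's sampling code: the `Candidate` tuple and the `within_limits` function are hypothetical, the candidate values are invented, and treating the ranges as inclusive is an assumption.

```python
from typing import NamedTuple

class Candidate(NamedTuple):
    mz: float
    rt: float
    intensity: float  # maximum MS1 intensity

def in_any_range(value, ranges):
    """True if `value` lies inside at least one inclusive (low, high) range."""
    return any(low <= value <= high for low, high in ranges)

def within_limits(c, mz_range, rt_range, min_ms1_intensity):
    """True if the candidate satisfies the m/z, RT and intensity limits."""
    return (in_any_range(c.mz, mz_range)
            and in_any_range(c.rt, rt_range)
            and c.intensity >= min_ms1_intensity)

# Same limits as in the cell above.
rt_range = [(0, 1440)]
mz_range = [(0, 1050)]
min_ms1_intensity = 1.75E5

candidates = [
    Candidate(mz=180.06, rt=300.0, intensity=2.0E5),   # satisfies all limits
    Candidate(mz=1200.0, rt=300.0, intensity=2.0E5),   # m/z above the range
    Candidate(mz=180.06, rt=1500.0, intensity=2.0E5),  # RT above the range
    Candidate(mz=180.06, rt=300.0, intensity=1.0E5),   # below minimum intensity
]
kept = [c for c in candidates
        if within_limits(c, mz_range, rt_range, min_ms1_intensity)]
print(len(kept))
```

Only the first candidate passes all three checks.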

In [10]:
chems = ChemicalCreator(ps, ROI_Sources, hmdb)
dataset = chems.sample(mz_range, rt_range, min_ms1_intensity, n_chems, ms_level)
save_obj(dataset, Path(out_dir, 'dataset.p'))

Saving <class 'list'> to /home/joewandy/git/vimms/examples/example_data/results/MS1_single/dataset.p


In [11]:
for chem in dataset[0:10]:
    print(chem)

KnownChemical - 'C5H11N' rt=784.62 max_intensity=784864.69
KnownChemical - 'C11H21NO4' rt=293.06 max_intensity=12970362.95
KnownChemical - 'C23H32O5' rt=266.41 max_intensity=246157.36
KnownChemical - 'C21H21O12' rt=258.79 max_intensity=389302.70
KnownChemical - 'C4H4FN3O' rt=1311.29 max_intensity=597049.46
KnownChemical - 'C7H15N3O3' rt=791.35 max_intensity=347040.99
KnownChemical - 'C20H22ClN3O' rt=360.54 max_intensity=4698737.47
KnownChemical - 'C6H6O4S' rt=1366.00 max_intensity=5509996.27
KnownChemical - 'C17H26ClN' rt=264.42 max_intensity=263541.38
KnownChemical - 'C3H6O4' rt=756.64 max_intensity=678372.09
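
The printed summaries can be checked against the parameters chosen earlier. The sketch below parses a few of the lines shown above with a regular expression and confirms that each retention time falls inside the RT range and each intensity is at or above the minimum MS1 intensity. It works on the printed text only, so it assumes nothing about the attributes of the chemical objects. The pattern and the `parse_line` helper are written for this example.

```python
import re

LINE_PATTERN = re.compile(
    r"KnownChemical - '(?P<formula>[^']+)' "
    r"rt=(?P<rt>[\d.]+) max_intensity=(?P<intensity>[\d.]+)"
)

def parse_line(line):
    """Split one printed summary into (formula, rt, max_intensity)."""
    m = LINE_PATTERN.fullmatch(line.strip())
    if m is None:
        raise ValueError(f'unrecognised line: {line!r}')
    return m['formula'], float(m['rt']), float(m['intensity'])

# First three lines of the output above.
printed = """\
KnownChemical - 'C5H11N' rt=784.62 max_intensity=784864.69
KnownChemical - 'C11H21NO4' rt=293.06 max_intensity=12970362.95
KnownChemical - 'C23H32O5' rt=266.41 max_intensity=246157.36
"""

parsed = [parse_line(line) for line in printed.splitlines()]
for formula, rt, intensity in parsed:
    assert 0 <= rt <= 1440        # rt_range
    assert intensity >= 1.75E5    # min_ms1_intensity
```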


## Run the MS1 controller on the sample and generate an .mzML file

In [12]:
min_rt = rt_range[0][0]
max_rt = rt_range[0][1]

In [13]:
mass_spec = IndependentMassSpectrometer(POSITIVE, dataset, density=ps.density_estimator)
controller = SimpleMs1Controller(mass_spec)
controller.run(min_rt, max_rt)

mzml_filename = Path(out_dir, 'ms1_controller.mzML')
controller.write_mzML('my_analysis', mzml_filename)

1440.4559390000024it [00:59, 24.20it/s]                           


The simulated results have been saved to the following .mzML file, which can be opened in [TOPPView](https://pubs.acs.org/doi/abs/10.1021/pr900171m) or any other mzML viewer.

In [14]:
mzml_filename

PosixPath('/home/joewandy/git/vimms/examples/example_data/results/MS1_single/ms1_controller.mzML')
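
Because mzML is an XML format, a quick sanity check is possible without a dedicated viewer. The sketch below counts spectra per MS level using only the standard library. To stay self-contained it parses a tiny hand-written mzML fragment rather than the simulated file. The fragment and the `count_ms_levels` helper are assumptions made for illustration; for a real file, the root element would come from `ET.parse(path).getroot()` instead. For an MS1-only run, every spectrum would be expected to report level 1.

```python
import xml.etree.ElementTree as ET
from collections import Counter

MZML_NS = {'mz': 'http://psi.hupo.org/ms/mzml'}
MS_LEVEL_ACCESSION = 'MS:1000511'  # PSI-MS controlled vocabulary term "ms level"

def count_ms_levels(root):
    """Count spectra by the MS level recorded in their cvParam elements."""
    counts = Counter()
    for spectrum in root.iterfind('.//mz:spectrum', MZML_NS):
        for param in spectrum.iterfind('mz:cvParam', MZML_NS):
            if param.get('accession') == MS_LEVEL_ACCESSION:
                counts[int(param.get('value'))] += 1
    return counts

# Hand-written toy fragment, not output from ViMMS.
sample = """<mzML xmlns="http://psi.hupo.org/ms/mzml">
  <run id="toy_run">
    <spectrumList count="2">
      <spectrum index="0" id="scan=1" defaultArrayLength="0">
        <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="1"/>
      </spectrum>
      <spectrum index="1" id="scan=2" defaultArrayLength="0">
        <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="1"/>
      </spectrum>
    </spectrumList>
  </run>
</mzML>"""

levels = count_ms_levels(ET.fromstring(sample))
print(dict(levels))  # {1: 2}
```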