# 2. Generating a Sample using MS1 Controller

In this notebook, we demonstrate how ViMMS can be used to generate a full-scan mzML file from a single sample. This corresponds to Section 3.1 of the paper.

In [1]:
%matplotlib inline

In [2]:
%load_ext autoreload
%autoreload 2

In [3]:
import sys
sys.path.append('..')

In [4]:
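import os  # used below for path handling; imported explicitly rather than relying on the star import

from vimms.Chemicals import ChemicalCreator
from vimms.MassSpec import IndependentMassSpectrometer
from vimms.Controller import SimpleMs1Controller
from vimms.Common import *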

Load the previously trained KDEs (stored in a `PeakSampler` object) and the list of extracted metabolites, both created in **01. Download Data.ipynb**.

In [5]:
base_dir = os.path.abspath('example_data')
ps = load_obj(os.path.join(base_dir, 'peak_sampler_mz_rt_int_19_beers_fullscan.p'))
hmdb = load_obj(os.path.join(base_dir, 'hmdb_compounds.p'))
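
`load_obj` and `save_obj` come from `vimms.Common`, and their implementation is not shown in this notebook. As a rough mental model only, the sketch below assumes they behave like a gzip-compressed pickle round trip. The names `save_pickle` and `load_pickle` are hypothetical stand-ins, and the real helpers may differ in compression, logging or error handling.

```python
import gzip
import os
import pickle
import tempfile


def save_pickle(obj, path):
    """Hypothetical stand-in for save_obj: write a gzip-compressed pickle."""
    folder = os.path.dirname(path)
    if folder:
        os.makedirs(folder, exist_ok=True)  # create the target folder if needed
    with gzip.open(path, 'wb') as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_pickle(path):
    """Hypothetical stand-in for load_obj: return None if the file is missing."""
    if not os.path.exists(path):
        return None
    with gzip.open(path, 'rb') as f:
        return pickle.load(f)


# Round trip with made-up data in a temporary folder
demo_path = os.path.join(tempfile.mkdtemp(), 'demo', 'compounds.p')
save_pickle([('C9H12', 120.0939)], demo_path)
restored = load_pickle(demo_path)
```

Pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.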

Set the ViMMS logging level.

In [6]:
set_log_level_warning()
# set_log_level_info()
# set_log_level_debug()

## Create Chemicals

Define the output folder that will hold the results.

In [7]:
out_dir = os.path.join(base_dir, 'results', 'MS1_single')

Here we generate the chemical objects that make up the sample by sampling from the metabolites in the HMDB database.

In [8]:
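# the list of ROI sources created in the previous notebook '01. Download Data.ipynb'
# (built with os.path.join so the path also works outside Windows; the final '' keeps the trailing separator)
ROI_Sources = [os.path.join(base_dir, 'DsDA', 'DsDA_Beer', 'beer_t10_simulator_files', '')]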

# minimum MS1 intensity of chemicals
min_ms1_intensity = 1.75E5

# m/z and RT range of chemicals
rt_range = [(0, 1440)]
mz_range = [(0, 1050)]

# the number of chemicals in the sample
n_chems = 6500

# maximum MS level (we do not generate fragmentation peaks when this value is 1)
ms_level = 1

In [9]:
chems = ChemicalCreator(ps, ROI_Sources, hmdb)
dataset = chems.sample(mz_range, rt_range, min_ms1_intensity, n_chems, ms_level)
save_obj(dataset, os.path.join(out_dir, 'dataset.p'))

Saving <class 'list'> to C:\Users\joewa\Work\git\vimms\examples\example_data\results\MS1_single\dataset.p


In [10]:
for chem in dataset[0:10]:
    print(chem)

KnownChemical - 'C19H30O2' rt=686.36 max_intensity=5575606.15
KnownChemical - 'C9H12' rt=515.42 max_intensity=1048342.02
KnownChemical - 'C20H33NO' rt=643.63 max_intensity=1110634.49
KnownChemical - 'C5H13N' rt=632.68 max_intensity=421479.01
KnownChemical - 'C20H18O11S' rt=227.94 max_intensity=250322.90
KnownChemical - 'C20H15Cl3N2OS' rt=227.59 max_intensity=962067.36
KnownChemical - 'C12H19N4O7P2S' rt=274.19 max_intensity=1462289.41
KnownChemical - 'C28H30O9' rt=434.79 max_intensity=513845.73
KnownChemical - 'C4H7BrO2S' rt=584.89 max_intensity=356565.93
KnownChemical - 'C15H10O6' rt=592.57 max_intensity=446904.41
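
Before running the controller, it can be worth checking that the sampled chemicals respect the settings chosen above. The sketch below is one way to do that, under two assumptions: that each chemical exposes `rt` and `max_intensity` attributes (as the printout suggests), and that `min_ms1_intensity` acts as a lower bound on `max_intensity`. `DemoChemical` and `out_of_spec` are hypothetical names, not part of ViMMS.

```python
from dataclasses import dataclass


@dataclass
class DemoChemical:
    """Hypothetical stand-in exposing the fields shown in the printout."""
    formula: str
    rt: float
    max_intensity: float


def out_of_spec(chemicals, rt_range, min_intensity):
    """Return chemicals whose rt or max_intensity violates the settings."""
    rt_min, rt_max = rt_range[0]  # same [(min, max)] layout as the notebook
    return [
        c for c in chemicals
        if not (rt_min <= c.rt <= rt_max) or c.max_intensity < min_intensity
    ]


demo = [
    DemoChemical('C19H30O2', 686.36, 5575606.15),
    DemoChemical('C9H12', 515.42, 1048342.02),
    DemoChemical('FAKE', 1500.00, 90000.00),  # made-up entry that fails both checks
]
bad = out_of_spec(demo, rt_range=[(0, 1440)], min_intensity=1.75E5)
print([c.formula for c in bad])
```

If the assumptions hold for the real objects, calling `out_of_spec(dataset, rt_range, min_ms1_intensity)` should return an empty list.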


## Run the MS1 controller on the sample and generate an .mzML file

In [11]:
min_rt = rt_range[0][0]
max_rt = rt_range[0][1]

In [12]:
mass_spec = IndependentMassSpectrometer(POSITIVE, dataset, density=ps.density_estimator)
controller = SimpleMs1Controller(mass_spec)
controller.run(min_rt, max_rt)

mzml_filename = os.path.join(out_dir, 'ms1_controller.mzML')
controller.write_mzML('my_analysis', mzml_filename)

1440.4464600000001it [00:56, 25.55it/s]                                                                                           


The simulated results have been saved to the following .mzML file, which can be viewed in [TOPPView](https://pubs.acs.org/doi/abs/10.1021/pr900171m) or any other mzML viewer.

In [13]:
mzml_filename

'C:\\Users\\joewa\\Work\\git\\vimms\\examples\\example_data\\results\\MS1_single\\ms1_controller.mzML'
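
Besides opening the file in a viewer, a quick programmatic check is to count the spectra per MS level. Since mzML is XML, the standard library is enough for this. The sketch below relies only on the mzML namespace and the PSI-MS term `MS:1000511` ("ms level"); the helper name `count_ms_levels` and the tiny demonstration file are made up for illustration and are not ViMMS output.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from collections import Counter

MZML_NS = '{http://psi.hupo.org/ms/mzml}'
MS_LEVEL_ACCESSION = 'MS:1000511'  # PSI-MS term for "ms level"


def count_ms_levels(path):
    """Stream through an mzML file and count spectra per MS level."""
    counts = Counter()
    for _, elem in ET.iterparse(path, events=('end',)):
        if elem.tag == MZML_NS + 'spectrum':
            for cv in elem.iter(MZML_NS + 'cvParam'):
                if cv.get('accession') == MS_LEVEL_ACCESSION:
                    counts[int(cv.get('value'))] += 1
                    break
            elem.clear()  # drop the processed spectrum to limit memory use
    return counts


# Tiny hand-written file for demonstration only
demo_xml = """<?xml version="1.0" encoding="utf-8"?>
<mzML xmlns="http://psi.hupo.org/ms/mzml">
  <run id="demo">
    <spectrumList count="2">
      <spectrum index="0" id="scan=1" defaultArrayLength="0">
        <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="1"/>
      </spectrum>
      <spectrum index="1" id="scan=2" defaultArrayLength="0">
        <cvParam cvRef="MS" accession="MS:1000511" name="ms level" value="1"/>
      </spectrum>
    </spectrumList>
  </run>
</mzML>
"""
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.mzML')
with open(demo_path, 'w', encoding='utf-8') as f:
    f.write(demo_xml)

levels = count_ms_levels(demo_path)
```

For the file written in this notebook, one would call `count_ms_levels(mzml_filename)`; a full-scan run is expected to contain only level 1 spectra.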