# Load Mass Spectrum and calculate metrics

In [None]:
import nomhsms.draw as draw
from nomhsms.spectrum import Spectrum

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

## Load and save

Load a mass spectrum from a CSV file. If the headings of the m/z and intensity columns differ from the defaults ("mass", "intensity"), you must map them to the default names with mapper. Specify the separator with sep if it is not the default (","). To avoid loading unnecessary data, we also set the take_only_mz flag to True.

In [None]:
spec = Spectrum.read_csv(filename = "data/sample2.txt",
                            mapper = {'m/z':'mass', "I":'intensity'},
                            take_only_mz = True,
                            sep = ',',
                            )
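To make the role of mapper and take_only_mz more concrete, here is a minimal standard-library sketch of the same idea. It is not nomhsms code: read_spectrum_rows is a hypothetical helper, and the assumption that take_only_mz keeps only the mass and intensity columns is inferred from the description above.

```python
import csv
import io


def read_spectrum_rows(text, mapper=None, sep=",", take_only_mz=False):
    """Hypothetical helper: parse CSV text into rows keyed by 'mass' and 'intensity'."""
    mapper = mapper or {}
    rows = []
    for raw in csv.DictReader(io.StringIO(text), delimiter=sep):
        # Rename source headings to the default names; unmapped headings are kept.
        row = {mapper.get(name, name): value for name, value in raw.items()}
        if take_only_mz:
            # Assumption: the flag discards every column except the two defaults.
            row = {key: row[key] for key in ("mass", "intensity")}
        row["mass"] = float(row["mass"])
        row["intensity"] = float(row["intensity"])
        rows.append(row)
    return rows


# Made-up sample data with non-default headings and one extra column.
sample = "m/z,I,noise\n168.0423,1200,3\n182.0579,800,5\n"
rows = read_spectrum_rows(
    sample, mapper={"m/z": "mass", "I": "intensity"}, take_only_mz=True
)
```

With take_only_mz=False the extra "noise" column would stay in each row as an unparsed string.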

Now we can plot the spectrum.

In [None]:
draw.spectrum(spec)

The next step is the assignment of brutto (molecular) formulas to the masses. By default, the following element ranges are used: 'C' from 4 to 50, 'H' from 4 to 100, 'O' from 0 to 25, 'N' from 0 to 3, 'S' from 0 to 2. The following rules are also applied by default: 0.25 < H/C < 2.2, O/C < 1, nitrogen parity, DBE-O <= 10.
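As an illustration of the ratio rules listed above, the following sketch checks a single formula against them. It is a hypothetical helper, not the library's filter: DBE is computed with the usual expression for a neutral CHNOS formula, and the nitrogen parity rule is deliberately left out because its exact form depends on the ionization mode.

```python
def dbe(c, h, n=0):
    """Double bond equivalent of a neutral CHNOS formula."""
    return c - h / 2 + n / 2 + 1


def passes_ratio_rules(c, h, o, n=0):
    """Check the H/C, O/C and DBE-O limits quoted above (nitrogen parity omitted)."""
    if c == 0:
        return False
    return 0.25 < h / c < 2.2 and o / c < 1 and dbe(c, h, n) - o <= 10


# C8H8O4 satisfies all three limits; C6H12O6 fails because O/C equals 1.
ok = passes_ratio_rules(c=8, h=8, o=4)
rejected = passes_ratio_rules(c=6, h=12, o=6)
```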

We can specify the elements and their ranges with the brutto_dict parameter, as below. To use an isotope, append "_" and the mass number to the element symbol, for example "C_13".

rel_error is the allowable relative error of the assignment. The default is 0.5.

In [None]:
spec = spec.assign(brutto_dict={'C':(4,51), 'C_13':(0,3), 'H':(4,101), 'O':(0, 26), 'N':(0,3)}, rel_error=0.5)
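Conceptually, assignment means finding element counts whose calculated mass matches a measured mass within the allowed error. The sketch below does this by exhaustive enumeration for one mass. It rests on several assumptions that the notebook does not confirm: rel_error is treated as ppm, the upper bound of each range is treated as exclusive, masses are neutral monoisotopic masses, and none of the default rules are applied. assign_one and formula_mass are hypothetical names, and the library may use a different, faster search.

```python
from itertools import product

# Monoisotopic masses (Da) of the most abundant isotopes.
MONO_MASS = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461956, "N": 14.0030740048}


def formula_mass(counts):
    """Monoisotopic mass of a formula given as {element: count}."""
    return sum(MONO_MASS[element] * n for element, n in counts.items())


def assign_one(mass, brutto_dict, rel_error=0.5):
    """Return every (counts, error_ppm) whose mass lies within rel_error ppm of mass."""
    elements = list(brutto_dict)
    # Assumption: (low, high) behaves like range(low, high), so high is excluded.
    ranges = [range(low, high) for low, high in brutto_dict.values()]
    hits = []
    for combo in product(*ranges):
        counts = dict(zip(elements, combo))
        error_ppm = (formula_mass(counts) - mass) / mass * 1e6
        if abs(error_ppm) <= rel_error:
            hits.append((counts, error_ppm))
    return hits


# Use the exact mass of C8H8O4 as a stand-in for a measured peak.
target = formula_mass({"C": 8, "H": 8, "O": 4})
hits = assign_one(target, {"C": (4, 51), "H": (4, 101), "O": (0, 26)})
```

Enumeration grows quickly with the number of elements, which is one reason to keep the ranges in brutto_dict as narrow as the sample allows.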

Now we can view the masses with their brutto formulas. First, though, we drop the masses that were not assigned a formula.

In [None]:
spec = spec.drop_unassigned()
spec.table

After assignment we can calculate various metrics: H/C, O/C, CRAM, NOSC, AI, DBE, and others. We can calculate them one at a time with methods such as calc_ai and calc_dbe, or all at once with calc_all_metrics.

In [None]:
spec = spec.calc_all_metrics()
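For orientation, a few of these metrics can be computed by hand from the element counts. The sketch below uses the commonly cited textbook expressions for a neutral CHNOS formula; basic_metrics is a hypothetical helper, and the library's own definitions (for example of AI or CRAM, which are not reproduced here) may differ in detail.

```python
def basic_metrics(c, h, o, n=0, s=0):
    """Elemental ratios, DBE and NOSC for a neutral CHNOS formula."""
    return {
        "H/C": h / c,
        "O/C": o / c,
        # Double bond equivalent: rings plus double bonds.
        "DBE": c - h / 2 + n / 2 + 1,
        # Nominal oxidation state of carbon for an uncharged molecule.
        "NOSC": 4 - (4 * c + h - 3 * n - 2 * o - 2 * s) / c,
    }


# Example formula: C8H8O4.
metrics = basic_metrics(c=8, h=8, o=4)
```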

Now we can list all the calculated metrics, which are stored as columns of the table.

In [None]:
spec.table.columns

You can save the spectrum together with the calculated metrics to a CSV file at any time with the to_csv method.

In [None]:
spec.to_csv('temp.csv')

## Draw

The data can be visualized in several ways.

A simple van Krevelen diagram. By default, CHO formulas are blue, CHON orange, CHOS green, and CHONS red. Sulfur was not included in the assignment, so only two colors appear here.

In [None]:
draw.vk(spec)
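The color grouping described above depends only on whether a formula contains nitrogen or sulfur. A minimal sketch of that classification, with the default colors taken from the text, might look like this. element_class and DEFAULT_COLORS are hypothetical names, not part of the draw module.

```python
# Default colors as described in the text.
DEFAULT_COLORS = {"CHO": "blue", "CHON": "orange", "CHOS": "green", "CHONS": "red"}


def element_class(n=0, s=0):
    """Name the elemental class of a formula from its N and S counts."""
    if n > 0 and s > 0:
        return "CHONS"
    if n > 0:
        return "CHON"
    if s > 0:
        return "CHOS"
    return "CHO"


# Without sulfur in the assignment, only two classes (and colors) can occur.
classes = {element_class(n=n, s=0) for n in range(0, 3)}
colors = sorted(DEFAULT_COLORS[name] for name in classes)
```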

We can also plot it with density axes.

In [None]:
draw.vk(spec, draw.scatter_density)

Or we can plot any pair of metrics in the spectrum, for example NOSC vs DBE-OC.

In [None]:
draw.scatter(spec, x='NOSC', y='DBE-OC')
draw.scatter_density(spec, x='NOSC', y='DBE-OC')


We can plot the density of a single metric separately.

In [None]:
draw.density(spec, 'AI')

Or plot a 2D kernel density estimate.

In [None]:
draw.density_2D(spec, x='NOSC', y='DBE-OC')

We can plot a Kendrick diagram with the following command.

In [None]:
draw.scatter(spec, x='Ke', y='KMD')
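Assuming that 'Ke' and 'KMD' refer to the Kendrick mass and the Kendrick mass defect, the quantities behind this plot can be sketched as follows. The sign convention (nominal Kendrick mass minus Kendrick mass) and the use of round for the nominal mass are assumptions, since conventions vary and the notebook does not state which one the library uses. kendrick is a hypothetical helper.

```python
CH2_MASS = 14.01565  # IUPAC mass of CH2 (Da), as used in the usual Kendrick factor


def kendrick(mass):
    """Return (Kendrick mass, Kendrick mass defect) for an IUPAC mass."""
    km = mass * 14.0 / CH2_MASS
    # Assumed convention: defect = nominal Kendrick mass - Kendrick mass.
    kmd = round(km) - km
    return km, kmd


# Two members of a CH2 homologous series: C8H8O4 and C9H10O4.
m1 = 168.0422587348
m2 = m1 + 12.0 + 2 * 1.00782503207
km1, kmd1 = kendrick(m1)
km2, kmd2 = kendrick(m2)
```

Homologues that differ only by CH2 units share almost the same KMD, so they line up horizontally in the diagram.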

## Molecular class

We can get the average density of the molecular classes of the brutto formulas in the spectrum.

In [None]:
spec.get_mol_class()

## Metrics

We can get any metric averaged over the spectrum, weighted by intensity.

In [None]:
spec.get_mol_metrics()

In [None]:
spec.get_mol_metrics(metrics=['AI', 'DBE', 'NOSC', 'H/C', 'O/C'])

We can also aggregate the same metrics with the mean or another function (max, min, std, median).

In [None]:
spec.get_mol_metrics(metrics=['AI', 'DBE', 'NOSC', 'H/C', 'O/C'], func='mean')
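The difference between the intensity-weighted average and the plain mean can be seen on a toy example. This is a sketch of the arithmetic only, with made-up values. weighted_average is a hypothetical helper and not the library's implementation.

```python
from statistics import mean


def weighted_average(values, intensities):
    """Average of values where each value counts in proportion to its intensity."""
    total = sum(intensities)
    return sum(v * w for v, w in zip(values, intensities)) / total


# Made-up DBE values and peak intensities.
dbe_values = [4.0, 6.0, 10.0]
intensities = [100.0, 300.0, 600.0]

weighted = weighted_average(dbe_values, intensities)  # dominated by the intense peak
plain = mean(dbe_values)  # every peak counts equally
```

Here the weighted result is pulled toward the most intense peak, while the plain mean ignores intensity entirely.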

We can also split the van Krevelen diagram into squares and calculate the density in each square.

In [None]:
spec.get_squares_vk(draw=True)
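The idea of per-square density can be sketched as binning each formula by its O/C and H/C ratios and summing the share of intensity in each cell. The grid step sizes below are hypothetical, as is the choice to define density as a share of total intensity. The library's actual grid and normalization may differ.

```python
def vk_square_density(points, oc_step=0.25, hc_step=0.5):
    """points: iterable of (O/C, H/C, intensity). Returns {(column, row): intensity share}."""
    totals = {}
    for oc, hc, intensity in points:
        # Cell indices: column along O/C, row along H/C.
        cell = (int(oc // oc_step), int(hc // hc_step))
        totals[cell] = totals.get(cell, 0.0) + intensity
    grand_total = sum(totals.values())
    return {cell: value / grand_total for cell, value in totals.items()}


# Made-up points: two fall into the same square, one into another.
points = [(0.5, 1.0, 200.0), (0.55, 1.1, 200.0), (0.1, 1.8, 100.0)]
density = vk_square_density(points)
```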

It may also be useful to calculate the dependence of DBE on the number of oxygen atoms (nO). Fitting the slope of this dependence can help determine the state of the sample.

In [None]:
spec.get_dbe_vs_o(draw=True, olim=(5, 18))
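One way to picture this calculation is to average DBE for each oxygen count and fit a straight line through the averages inside the chosen oxygen range. The sketch below makes several assumptions not confirmed by the notebook: the average is unweighted, olim is treated as inclusive at the lower end and exclusive at the upper end, and the fit is ordinary least squares. dbe_vs_o_slope is a hypothetical helper.

```python
def dbe_vs_o_slope(formulas, olim=(5, 18)):
    """formulas: iterable of (n_oxygen, dbe). Returns (slope, intercept) of mean DBE vs nO."""
    by_o = {}
    for n_o, dbe in formulas:
        if olim[0] <= n_o < olim[1]:
            by_o.setdefault(n_o, []).append(dbe)
    xs = sorted(by_o)
    ys = [sum(by_o[x]) / len(by_o[x]) for x in xs]
    # Ordinary least squares for a straight line through (xs, ys).
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    return slope, y_mean - slope * x_mean


# Synthetic data scattered symmetrically around DBE = 0.5 * nO + 2.
formulas = [(n_o, 0.5 * n_o + 2 + delta) for n_o in range(3, 20) for delta in (-1.0, 1.0)]
slope, intercept = dbe_vs_o_slope(formulas)
```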

The obtained metrics can be used to classify samples by origin or properties and to train various models.