# Quantifying with chemtbd

After reading in GC-MS data, concentrations can be determined with the `GCQuant` class. `GCQuant` has the following hierarchy:

- no one
- no two
- no three

The inputs to calculate concentrations are listed below:

1. stfud
dfdss


## Import Data

The first thing to do is get your GC data into a dataframe. Below is an example of how this is done with the Agilent class. For a full description of the class look [here](https://github.com/blakeboswell/chemtbd/blob/master/example.ipynb).

In [1]:
from chemtbd.io import Agilent
agi = Agilent.from_root('data/test3')

Now import the GCQuant class:

In [2]:
from chemtbd.io import GCQuant

## Align Peaks on Retention Time

The next task is to match up the identified species (`.results_lib`) with the integrated areas (`.results_tic` and/or `.results_fid`).

In [3]:
agi.results_lib.head()

| key | header= | pk | rt | pct_area | library_id | ref | cas | qual |
|---|---|---|---|---|---|---|---|---|
| FA03.D | 1= | 1.0 | 5.7877 | 2.0335 | Methyl octanoate | 17.0 | 000000-00-0 | 96.0 |
| FA03.D | 2= | 2.0 | 7.3441 | 3.4015 | Methyl decanoate | 1.0 | 000000-00-0 | 98.0 |
| FA03.D | 3= | 3.0 | 8.0364 | 1.7448 | Methyl undecanoate | 2.0 | 000000-00-0 | 98.0 |
| FA03.D | 4= | 4.0 | 8.6715 | 3.9674 | Methyl dodecanoate | 3.0 | 000000-00-0 | 98.0 |
| FA03.D | 5= | 5.0 | 9.2781 | 1.9607 | Methyl tridecanoate | 4.0 | 000000-00-0 | 99.0 |


Instantiating `GCQuant` with the library and area dataframes matches the two on retention time. The matched dataframe is then available as `qnt.align`, as used in the cells below.

In [4]:
qnt = GCQuant(agi.results_lib, agi.results_tic)

In [5]:
aligned = qnt.align
aligned.head()

| key | header= | pk | rt | pct_area | library_id | ref | cas | qual | area |
|---|---|---|---|---|---|---|---|---|---|
| FA03.D | 1= | 1.0 | 5.7877 | 2.0335 | Methyl octanoate | 17.0 | 000000-00-0 | 96.0 | 1489466.0 |
| FA03.D | 2= | 2.0 | 7.3441 | 3.4015 | Methyl decanoate | 1.0 | 000000-00-0 | 98.0 | 2491449.0 |
| FA03.D | 3= | 3.0 | 8.0364 | 1.7448 | Methyl undecanoate | 2.0 | 000000-00-0 | 98.0 | 1277982.0 |
| FA03.D | 4= | 4.0 | 8.6715 | 3.9674 | Methyl dodecanoate | 3.0 | 000000-00-0 | 98.0 | 2905961.0 |
| FA03.D | 5= | 5.0 | 9.2781 | 1.9607 | Methyl tridecanoate | 4.0 | 000000-00-0 | 99.0 | 1436154.0 |
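
To make the idea of aligning on retention time concrete, here is a minimal sketch using `pandas.merge_asof`. It is not chemtbd's implementation: the `align_on_rt` helper, the `tolerance` value, and the retention times in the toy `tic` frame are all assumptions made for illustration.

```python
import pandas as pd

def align_on_rt(lib, tic, tolerance=0.05):
    """Attach the nearest integrated area to each identified peak.

    Hypothetical helper, not part of chemtbd. Rows are matched within each
    file (`key`) on the closest retention time (`rt`), and only if the two
    retention times differ by at most `tolerance` (same units as `rt`).
    """
    # merge_asof requires both frames to be sorted by the "on" column
    lib = lib.sort_values("rt")
    tic = tic.sort_values("rt")
    return pd.merge_asof(lib, tic, on="rt", by="key",
                         direction="nearest", tolerance=tolerance)

# Toy inputs: two identified species and three integrated peaks
lib = pd.DataFrame({
    "key": ["FA03.D", "FA03.D"],
    "rt": [5.7877, 7.3441],
    "library_id": ["Methyl octanoate", "Methyl decanoate"],
})
tic = pd.DataFrame({
    "key": ["FA03.D", "FA03.D", "FA03.D"],
    "rt": [5.7880, 7.3439, 12.0000],   # invented retention times
    "area": [1489466.0, 2491449.0, 500000.0],
})
aligned_demo = align_on_rt(lib, tic)
```

The peak at 12.0 has no identified species within the tolerance, so it does not appear in `aligned_demo`. A tolerance that is too wide risks pairing a species with a neighbouring peak.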


## Create Calibration Curves

Great! Now that each species has a corresponding area, the next step is to create calibration curves. Two pieces of information are needed, and both must be captured in a dataframe:

- the files within the subfolder that contain calibration curve data
- the known concentration of each species in each of those files

We recommend either putting the data into a CSV file and importing it with pandas, or creating a pandas dataframe directly. The CSV route is shown below.

#### IMPORTANT: 
1. The dataframe must have a `library_id` column; the remaining columns should be named after the file of each standard.
2. Be sure the file names are exactly the same as the files in the subfolder, including any extension in the name (e.g. ".D").
3. UNITS UNITS UNITS. The concentrations should be entered in molar units (mol/L).

In [6]:
import pandas as pd
standards = pd.read_csv('standards.csv')
standards.head()

|  | library_id | FA03.D | FA04.D | FA05.D |
|---|---|---|---|---|
| 0 | Methyl palmitate | 0.25 | 0.5 | 1 |
| 1 | Methyl heptadecanoate | 0.25 | 0.5 | 1 |
| 2 | Methyl docosanoate | 0.25 | 0.5 | 1 |
| 3 | Methyl undecanoate | 0.25 | 0.5 | 1 |
| 4 | Methyl cis-8,11,14-eicosatrienoate | 0.25 | 0.5 | 1 |
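
If you would rather skip the CSV, the same kind of table can be built directly in pandas. The sketch below reuses a few rows from the table above. The `known_files` list is a hypothetical stand-in for however you list the data files in your subfolder, and the file-name check is only a suggestion, not something chemtbd is known to do for you.

```python
import pandas as pd

# Known concentrations (mol/L) of each species in each standard file
standards = pd.DataFrame({
    "library_id": ["Methyl palmitate", "Methyl heptadecanoate", "Methyl docosanoate"],
    "FA03.D": [0.25, 0.25, 0.25],
    "FA04.D": [0.5, 0.5, 0.5],
    "FA05.D": [1.0, 1.0, 1.0],
})

# Hypothetical list of file names present in the data subfolder
known_files = ["FA03.D", "FA04.D", "FA05.D", "FA06.D"]

# Every column other than library_id must name a real file, extension included
standard_files = [c for c in standards.columns if c != "library_id"]
missing = [f for f in standard_files if f not in known_files]
if missing:
    raise ValueError(f"standards reference unknown files: {missing}")
```

Checking the column names up front catches a missing ".D" or a typo before it surfaces later as an empty calibration curve.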


Now you can match the known concentrations with the areas in your dataset. This allows you to calculate the response factors (i.e. the slopes of the calibration curves).

In [7]:
rfactors = qnt.stdcurves(aligned, standards)
rfactors.head()

|  | library_id | responsefactor | intercept | rvalue | pvalue | stderr | max | min |
|---|---|---|---|---|---|---|---|---|
| 0 | All cis-4,7,10,13,16,19-docosahexaenoate methy... | 1.367304e-07 | 0.103774 | 0.999932 | 0.007421 | 1.593827e-09 | 6566581.0 | 1094384.0 |
| 1 | Methyl arachidate | 7.146232e-08 | -0.063537 | 0.999076 | 0.027367 | 3.073928e-09 | 14785000.0 | 4222622.0 |
| 2 | Methyl arachidonate | 1.344684e-07 | 0.067813 | 0.999319 | 0.000681 | 3.51112e-09 | 7056294.0 | 1390417.0 |
| 3 | Methyl cis-11,14,17-eicosatrienoate | 1.306034e-07 | 0.102132 | 1.0 | 0.0 | 0.0 | 6874772.0 | 3046387.0 |
| 4 | Methyl cis-11-eicosenoate | 6.336257e-08 | 0.008515 | 1.0 | 0.00011 | 1.092154e-11 | 15647413.0 | 3810380.0 |
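
A response factor of this kind can be understood as the slope of a straight-line fit through the standards of one species. The sketch below is a plain least-squares fit, assuming concentration is regressed on peak area. It is not taken from chemtbd's source: `fit_response_factor` is a hypothetical helper and the peak areas are invented so that the result is easy to check by hand.

```python
def fit_response_factor(areas, concs):
    """Ordinary least-squares fit of conc = slope * area + intercept.

    Hypothetical helper for illustration; returns (slope, intercept).
    """
    n = len(areas)
    mean_a = sum(areas) / n
    mean_c = sum(concs) / n
    # Sum of squares of area deviations and cross-products with concentration
    sxx = sum((a - mean_a) ** 2 for a in areas)
    sxy = sum((a - mean_a) * (c - mean_c) for a, c in zip(areas, concs))
    slope = sxy / sxx
    intercept = mean_c - slope * mean_a
    return slope, intercept

# Toy data: peak areas of one species in three standards (invented values)
areas = [1.0e6, 2.0e6, 4.0e6]
concs = [0.25, 0.5, 1.0]   # known concentrations in mol/L
slope, intercept = fit_response_factor(areas, concs)
```

With only three standards per species, as in the example data, a high `rvalue` says little on its own, so it is worth looking at `stderr` and the intercept as well.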


## Quantify Concentrations
With both the response factors and the aligned areas in hand, the concentration of each species can now be calculated by calling `qnt.concentrations(aligned, rfactors)`.

In [9]:
concentrations = qnt.concentrations(aligned, rfactors)
concentrations.head()

| key | header= | pk | rt | pct_area | library_id | ref | cas | qual | area | responsefactor | intercept | max | min | conc | conc% | area% |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FA03.D | 1= | 1.0 | 5.7877 | 2.0335 | Methyl octanoate | 17.0 | 000000-00-0 | 96.0 | 1489466.0 | 1.11362e-07 | 0.075876 | 8256723.0 | 1489466.0 | 0.241746 | 0.03515 | 0.020335 |
| FA03.D | 2= | 2.0 | 7.3441 | 3.4015 | Methyl decanoate | 1.0 | 000000-00-0 | 98.0 | 2491449.0 | 1.026929e-07 | -0.003581 | 9783363.0 | 2491449.0 | 0.252273 | 0.03668 | 0.034015 |
| FA03.D | 3= | 3.0 | 8.0364 | 1.7448 | Methyl undecanoate | 2.0 | 000000-00-0 | 98.0 | 1277982.0 | 1.833527e-07 | 0.018361 | 5360876.0 | 1277982.0 | 0.252682 | 0.03674 | 0.017448 |
| FA03.D | 4= | 4.0 | 8.6715 | 3.9674 | Methyl dodecanoate | 3.0 | 000000-00-0 | 98.0 | 2905961.0 | 9.68541e-08 | -0.037806 | 10679280.0 | 2905961.0 | 0.243649 | 0.035426 | 0.039674 |
| FA03.D | 5= | 5.0 | 9.2781 | 1.9607 | Methyl tridecanoate | 4.0 | 000000-00-0 | 99.0 | 1436154.0 | 1.640559e-07 | 0.013704 | 6009837.0 | 1436154.0 | 0.249314 | 0.03625 | 0.019607 |
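
The numbers in the table are consistent with a linear calibration, `conc = responsefactor * area + intercept`. The sketch below reproduces the first row under that assumption. Both helper functions are hypothetical, and reading `min` and `max` as the smallest and largest peak areas among the standards is an interpretation of the column names, not something the source states.

```python
def concentration(area, responsefactor, intercept):
    """Apply a linear calibration curve to a peak area."""
    return responsefactor * area + intercept

def in_calibrated_range(area, min_area, max_area):
    """True if the area lies within the range spanned by the standards."""
    return min_area <= area <= max_area

# Values for Methyl octanoate in FA03.D, copied from the table above
area = 1489466.0
rf = 1.11362e-07
b = 0.075876

conc = concentration(area, rf, b)   # approximately 0.241746 mol/L
ok = in_calibrated_range(area, min_area=1489466.0, max_area=8256723.0)
```

A check like `in_calibrated_range` is useful because concentrations computed from areas outside the calibrated range are extrapolations and deserve less trust.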


## Report Results
### Pivot Table 
GCQuant provides a couple of out-of-the-box methods for presenting your results.

First, you can display the calculated areas or concentrations in a convenient pivot table. The data can be reported as absolute values or as percentages.
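
As an illustration of what such a pivot table looks like, here is a sketch built with plain pandas rather than GCQuant's own reporting method, which this section does not show. The column names follow the concentrations table, and the values are toy numbers.

```python
import pandas as pd

# Toy long-format results: one row per species per file (invented values)
results = pd.DataFrame({
    "key": ["FA03.D", "FA03.D", "FA04.D", "FA04.D"],
    "library_id": ["Methyl octanoate", "Methyl decanoate"] * 2,
    "conc": [0.25, 0.25, 0.5, 0.5],
})

# Absolute values: one row per species, one column per file
table = results.pivot_table(index="library_id", columns="key", values="conc")

# Percentages: each file's column divided by that file's total
table_pct = table / table.sum(axis=0) * 100
```

Swapping `values="conc"` for `values="area"` would give the equivalent table of peak areas.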

### Bar Plots