# Reading Agilent GCMS Directories with `chemtbd`

> __NOTE__: We need a name.  See [issue 3](https://github.com/blakeboswell/chemtbd/issues/3).

In [1]:
from chemtbd.io import Agilent

The `chemtbd.io` module provides an object named `Agilent` for reading Agilent directories.  Use it as follows:

In [2]:
agi = Agilent.from_root('data/test3')

The `Agilent` object loads data lazily, so nothing has actually happened yet.  When we ask it for data, it will read the data from disk, structure it as a pandas DataFrame, store it in a cache and finally return it.  The next time we ask for the same data, the DataFrame is loaded from the cache.
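
The load-once-then-cache behavior described above can be sketched roughly as follows.  This is a minimal illustration, not `chemtbd`'s implementation: the `LazyTables` class, its `table` method, the `disk_reads` counter, and `fake_loader` are all hypothetical names, and plain lists of dicts stand in for pandas DataFrames.

```python
from typing import Callable, Dict, List


class LazyTables:
    """Hypothetical stand-in for a lazy reader; not chemtbd's actual code."""

    def __init__(self, loader: Callable[[str], List[dict]]) -> None:
        self._loader = loader                      # called only on a cache miss
        self._cache: Dict[str, List[dict]] = {}    # table name -> loaded rows
        self.disk_reads = 0                        # how many times the loader ran

    def table(self, name: str) -> List[dict]:
        # Nothing is loaded until a table is requested for the first time.
        if name not in self._cache:
            self.disk_reads += 1
            self._cache[name] = self._loader(name)
        # Later requests return the cached object without reloading.
        return self._cache[name]


def fake_loader(name: str) -> List[dict]:
    # Stands in for reading and parsing a file from disk.
    return [{"table": name, "peak": 1}]


reader = LazyTables(fake_loader)
first = reader.table("tic")    # cache miss: the loader runs
second = reader.table("tic")   # cache hit: the same object comes back
print(reader.disk_reads, first is second)
```

One consequence of this pattern is that every caller shares the cached object, so code that modifies a returned table in place would also change what later callers see.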

Currently there are two accessors, each called with the name of the table or series to return:
- `results` - returns the `tic`, `fid`, and `lib` tables from `RESULTS.CSV`
- `data` - returns the `tic` and `tme` series from `DATA.MS`


## Accessing Agilent GCMS `RESULTS.CSV`

Let's read in the `results` data:

> __NOTE__: The code below will __not__ work on the test data containing dummy `fid` tables. See [issue 1](https://github.com/blakeboswell/chemtbd/issues/1) for details.

The `results` data can contain `tic`, `lib`, and `fid` tables, although one or more tables may be missing from the source data and therefore unavailable from the `Agilent` object.

When present in the source data, each table is accessed by passing its name to `results`, as in `agi.results('tic')`.  For example:

In [3]:
agi.keys()

dict_keys(['FA01.D', 'FA02.D', 'FA03.D', 'FA04.D', 'FA05.D', 'FA06.D', 'FA07.D', 'FA08.D', 'FA09.D', 'FA10.D', 'FA11.D', 'FA12.D', 'FA13.D', 'FA14.D', 'FA15.D'])

In [4]:
agi.results('tic').head()

|   | header= | peak | rt | first | max | last | pk_ty | height | area | pct_max | pct_total | key |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1= | 1.0 | 12.288 | 1600.0 | 1609.0 | 1647.0 | rBV3 | 71023.0 | 478771.0 | 39.71 | 6.909 | FA01.D |
| 1 | 2= | 2.0 | 13.598 | 1830.0 | 1838.0 | 1864.0 | rBV2 | 247725.0 | 825285.0 | 68.46 | 11.91 | FA01.D |
| 2 | 3= | 3.0 | 14.428 | 1977.0 | 1983.0 | 2004.0 | rBV | 481706.0 | 1098175.0 | 91.09 | 15.848 | FA01.D |
| 3 | 4= | 4.0 | 15.08 | 2091.0 | 2097.0 | 2109.0 | rBV | 806692.0 | 1205528.0 | 100.0 | 17.397 | FA01.D |
| 4 | 5= | 5.0 | 15.692 | 2198.0 | 2204.0 | 2215.0 | rBV | 731146.0 | 1085862.0 | 90.07 | 15.67 | FA01.D |
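
As a quick sanity check on the `tic` rows shown above, the `pct_max` values appear to equal each peak's `area` as a percentage of the largest `area`.  This is an observation drawn from these five rows only, not a documented definition of the column; the sketch below simply recomputes the percentages from the displayed values using the standard library.

```python
# (area, reported pct_max) copied from the five FA01.D rows shown above.
rows = [
    (478771.0, 39.71),
    (825285.0, 68.46),
    (1098175.0, 91.09),
    (1205528.0, 100.0),
    (1085862.0, 90.07),
]

# Assumption: the peak reported with pct_max == 100.0 has the largest area.
largest_area = max(area for area, _ in rows)

# Recompute each percentage as area relative to the largest area.
recomputed = [round(100 * area / largest_area, 2) for area, _ in rows]
reported = [pct for _, pct in rows]

for new, old in zip(recomputed, reported):
    print(f"recomputed {new:6.2f}   reported {old:6.2f}")
```

If the recomputed and reported columns disagree on other data, the assumed relationship does not hold there and the column should be checked against the instrument's own report.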


In [5]:
agi.results('tic').shape

(129, 12)

In [6]:
agi.results('lib').head()

|   | header= | pk | rt | pct_area | library_id | ref | cas | qual | key |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1= | 1.0 | 5.7877 | 2.0335 | Methyl octanoate | 17.0 | 000000-00-0 | 96.0 | FA03.D |
| 1 | 2= | 2.0 | 7.3441 | 3.4015 | Methyl decanoate | 1.0 | 000000-00-0 | 98.0 | FA03.D |
| 2 | 3= | 3.0 | 8.0364 | 1.7448 | Methyl undecanoate | 2.0 | 000000-00-0 | 98.0 | FA03.D |
| 3 | 4= | 4.0 | 8.6715 | 3.9674 | Methyl dodecanoate | 3.0 | 000000-00-0 | 98.0 | FA03.D |
| 4 | 5= | 5.0 | 9.2781 | 1.9607 | Methyl tridecanoate | 4.0 | 000000-00-0 | 99.0 | FA03.D |


In [7]:
agi.results('fid').head()

|   | header= | peak | rt | first | end | pk_ty | height | area | pct_max | pct_total | key |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1= | 1 | 6.250716 | 5.93818 | 6.563252 | M | 2578080 | 14894660 | 1 | 1.962 | FA03.D |
| 1 | 2= | 2 | 7.858187 | 7.465278 | 8.251096 | M | 9647430 | 24914490 | 1 | 3.282 | FA03.D |
| 2 | 3= | 3 | 8.357856 | 7.939963 | 8.775749 | M | 6084180 | 12779820 | 1 | 1.683 | FA03.D |
| 3 | 4= | 4 | 9.798795 | 9.308855 | 10.288735 | M | 19290490 | 29059610 | 1 | 3.828 | FA03.D |
| 4 | 5= | 5 | 10.669815 | 10.136324 | 11.203306 | M | 8825210 | 14361540 | 1 | 1.892 | FA03.D |


## Accessing Agilent GCMS `DATA.MS`

In [8]:
agi.data('tic').head()

|   | tic | key |
|---|---|---|
| 0 | 3576027.0 | FA01.D |
| 1 | 2654533.0 | FA01.D |
| 2 | 2052596.0 | FA01.D |
| 3 | 1665840.0 | FA01.D |
| 4 | 1409909.0 | FA01.D |


In [9]:
agi.data('tme').head()

|   | tme | key |
|---|---|---|
| 0 | 3.086817 | FA01.D |
| 1 | 3.092533 | FA01.D |
| 2 | 3.09825 | FA01.D |
| 3 | 3.103983 | FA01.D |
| 4 | 3.1097 | FA01.D |


In [10]:
agi.data('tme').shape

(44355, 2)

In [11]:
agi.data('tic').shape

(44355, 2)
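
Both series have 44355 rows, which suggests that row `i` of `tme` is the time stamp for row `i` of `tic`.  The source does not state this, so treat it as an assumption.  Under that assumption, the first five FA01.D values shown above can be paired into (time, intensity) points, and the spacing between consecutive scans can be checked, using only the standard library:

```python
# First five FA01.D values copied from the `tme` and `tic` outputs above.
tme = [3.086817, 3.092533, 3.09825, 3.103983, 3.1097]
tic = [3576027.0, 2654533.0, 2052596.0, 1665840.0, 1409909.0]

# Assumption: the two series are aligned row by row.
points = list(zip(tme, tic))

# Difference between consecutive time stamps, in the same units as `tme`.
intervals = [round(later - earlier, 6) for earlier, later in zip(tme, tme[1:])]

for time_stamp, intensity in points:
    print(f"{time_stamp:.6f}  {intensity:.1f}")
print("scan spacing:", intervals)
```

For these rows the spacing is nearly constant, which is consistent with evenly spaced scans, though five rows are too few to confirm that for the whole run.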