# Processing the IR dataset

In [1]:
from spectrochempy.api import *
options.log_level=ERROR


        SpectroChemPy's API
        Version   : 0.1a3.dev
        Copyright : 2014-2017 - LCS (Laboratory for Catalysis and Spectrochempy)
            


We read the `.scp` files saved previously.

In [2]:
import os  # needed for os.path.join below, in case the API import does not expose it

# raw strings (r'...') keep the LaTeX backslashes intact
samples = {'P350':{'label':r'$\mathrm{M_P}\,(623\,K)$'},
           'A350':{'label':r'$\mathrm{M_A}\,(623\,K)$'},
           'B350':{'label':r'$\mathrm{M_B}\,(623\,K)$'}}

for key, sample in samples.items():
    # our data are in our test `scpdata` directory.
    basename = os.path.join(scpdata,'agirdata/{}/FTIR/FTIR'.format(key))
    filename = basename + '.scp'
    sample['IR'] = NDDataset.read(filename)

We will restrict the data to the wavenumber region of interest. Let's first display the full spectra:

In [3]:
sources = [sample['IR'] for sample in samples.values()]
labels = ["sample "+sample['label'] for sample in samples.values()]

_ = multiplot_stack(sources=sources, labels=labels, nrow=1, ncol=3, figsize=(9,4), sharex=False,
                sharey=True, style='sans')

## Masking bad data

Clearly, some of the spectra displayed above suffer from noise or experimental artifacts. Let's mask them.

First, we will make a copy of the data, to be sure not to modify them before finishing all the processing.

Then, we will select the region of interest.

#####  Some notes about Slicing

Slicing can be done by index, coordinates or labels (when they are present).

* `P350[:, 10]` for column slicing (here we get the column at index 10, i.e. the 11th column, since indices start at 0!)
* or `P350[10]` for row slicing

As said above, we can also slice using the real coordinates. For example,

* `P350[:, 3000.0:3100.0]` will select all columns from wavenumbers 3000 to 3100. 

**IMPORTANT**: when doing such slicing, the wavenumbers must be expressed as floating-point numbers (with the decimal separator present), or it will fail!
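
To make the difference between position-based and coordinate-based slicing concrete, here is a minimal pure-Python sketch. It is not SpectroChemPy's implementation: the helper `resolve`, its dispatch rule (integers are positions, floats are coordinates) and the wavenumber axis are all assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right

def resolve(coords, start, stop):
    """Turn a (start, stop) pair into an index slice over ascending `coords`.

    Assumed rule for this sketch: two integers are positions, anything else
    is a pair of coordinate values with both bounds included.
    """
    if isinstance(start, int) and isinstance(stop, int):
        return slice(start, stop)
    low, high = sorted((float(start), float(stop)))
    return slice(bisect_left(coords, low), bisect_right(coords, high))

# Hypothetical wavenumber axis in cm^-1: 2900.0, 2950.0, ..., 3200.0
wavenumbers = [2900.0 + 50.0 * i for i in range(7)]

by_coord = wavenumbers[resolve(wavenumbers, 3000.0, 3100.0)]
by_position = wavenumbers[resolve(wavenumbers, 3000, 3100)]

print(by_coord)      # [3000.0, 3050.0, 3100.0]
print(by_position)   # []: positions 3000 to 3100 do not exist on a 7-point axis
```

In this sketch, forgetting the decimal point does not raise an error. It silently selects nothing, which is one more reason to always write coordinates as floats.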

Here we want to keep only a selection of wavenumbers (between 4000 and 1290 cm$^{-1}$). Let's try masking the rest. We just need to assign the value **`masked`** to the data we want to mask:



In [4]:
for key, sample in samples.items():
    sample['IR'][:,400.:1290.] = masked 
    

Here we have masked the undesired data using coordinate slicing. Note, however, that when using coordinates, both limits need to be set, as SpectroChemPy cannot infer which direction will be masked.
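
The same idea can be sketched with NumPy's masked arrays, which offer a comparable `masked` constant. This is an analogy under stated assumptions, not the SpectroChemPy code path: the `wavenumbers` axis and the `absorbance` values are made up, and the coordinate range is turned into a boolean column selection by hand.

```python
import numpy as np

# Hypothetical data: 2 spectra x 6 wavenumbers (cm^-1)
wavenumbers = np.array([400.0, 800.0, 1200.0, 1600.0, 2000.0, 2400.0])
absorbance = np.ma.masked_array(np.arange(12, dtype=float).reshape(2, 6))

# Select the columns whose coordinate lies in [400, 1290], then mask them
unwanted = (wavenumbers >= 400.0) & (wavenumbers <= 1290.0)
absorbance[:, unwanted] = np.ma.masked

print(absorbance)                 # masked entries are displayed as --
print(absorbance.count(axis=1))   # number of unmasked points per spectrum
print(absorbance.mean(axis=1))    # statistics ignore the masked points
```

Masking hides values from display and statistics without deleting them: the underlying numbers are still stored in `absorbance.data`.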

Let's display the results

In [5]:
sources = [sample['IR'] for sample in samples.values()]
labels = ["sample "+sample['label'] for sample in samples.values()]

_ = multiplot_stack(sources=sources, labels=labels, nrow=1, ncol=3, figsize=(9,4), sharex=False,
                sharey=True, style='sans')

To remove a mask, the only way is to remove **all** masks. It cannot be done selectively!

In [6]:
for key, sample in samples.items():
    sample['IR'].remove_masks()

_ = multiplot_stack(sources=sources, labels=labels, nrow=1, ncol=3, figsize=(9,4), sharex=False,
                sharey=True, style='sans')
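
For comparison, clearing every mask at once looks like this with a NumPy masked array. It is only an analogy for what `remove_masks()` is described as doing above. The `spectrum` values are hypothetical, and the sketch says nothing about how SpectroChemPy stores its masks.

```python
import numpy as np

# Hypothetical spectrum with two obviously bad points (9.9 and 7.7)
spectrum = np.ma.masked_array([0.2, 0.5, 9.9, 0.4, 7.7])
spectrum[2] = np.ma.masked          # first mask
spectrum[4] = np.ma.masked          # second mask
visible_before = spectrum.count()   # only unmasked points are counted

spectrum.mask = np.ma.nomask        # clear every mask at once
visible_after = spectrum.count()

print(visible_before, visible_after)
print(spectrum.max())               # the bad points are visible again
```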

Actually, because we want to select a region of interest, it may be simpler to just keep this region (no need to use masks for this). We will slice the data and store the result for further use.

In [7]:
for key, sample in samples.items():
    s = sample['IR'][:,1290.:4000.]  # such slicing is not done inplace. Original data are preserved.
    sample['IR'] = s                 # we thus need to force the change to keep this modification in the
                                     # original data.
        
    # save the data in a `scp` file
    basename = os.path.join(scpdata,'agirdata/{}/FTIR/FTIR'.format(key))
    sample['IR'].save(basename + '_corrected.scp')

### Masking bad data for sample P350

Besides just slicing as explained above, it may be useful to use an interactive window to mask some values. Also, because the data to remove mainly correspond to some rows, it may be interesting to work on transposed data (we use the operator `.T`):

In [8]:
P350T = samples['P350']['IR'].T
_ = P350T.interactive_masks(figsize=(9,5))

In [9]:
P350T.T.plot_stack()
# keep this 
P350 = P350T.T
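
The transpose, mask, transpose-back round trip can be illustrated without the interactive window, again using a NumPy masked array as a stand-in. The array `data` and the choice of which spectrum to mask are hypothetical.

```python
import numpy as np

# Hypothetical dataset: 3 spectra (rows) x 4 wavenumbers (columns)
data = np.ma.masked_array(np.arange(12, dtype=float).reshape(3, 4))

transposed = data.T               # shape (4, 3): one column per spectrum
transposed[:, 1] = np.ma.masked   # mask the whole second spectrum
cleaned = transposed.T            # back to spectra x wavenumbers

print(cleaned.shape)              # (3, 4)
print(cleaned)                    # the second row is displayed as --
print(cleaned.count())            # number of points left unmasked
```

Transposing twice restores the original layout, and the mask follows the data through both transpositions.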

Put the masked data back into the original:

In [10]:
samples['P350']['IR'] = P350 

### Masking bad data for sample A350

Again, we will work on transposed data.

In [13]:
A350T= samples['A350']['IR'].T
A350T.interactive_masks(figsize=(9,5))

In [12]:
samples

{'A350': {'IR': NDDataset: [[   1.600,    1.552, ...,    1.467,    1.466],
              [   1.462,    1.417, ...,    1.332,    1.333],
              ..., 
              [   1.536,    1.486, ...,    1.572,    1.570],
              [   1.539,    1.487, ...,    1.573,    1.572]] a.u.,
  'label': '$\\mathrm{M_A}\\,(623\\,K)$'},
 'B350': {'IR': NDDataset: [[   1.316,    1.274, ...,    0.744,    0.745],
              [   1.256,    1.215, ...,    0.813,    0.814],
              ..., 
              [   1.261,    1.218, ...,    1.252,    1.253],
              [   1.253,    1.211, ...,    1.251,    1.253]] a.u.,
  'label': '$\\mathrm{M_B}\\,(623\\,K)$'},
 'P350': {'IR': NDDataset: [[   1.424,    1.376, ...,    1.345,    1.345],
              [   1.319,    1.276, ...,    1.305,    1.305],
              ..., 
              [  --,   --, ...,   --,   --],
              [  --,   --, ...,   --,   --]] a.u.,
  'label': '$\\mathrm{M_P}\\,(623\\,K)$'}}

In [16]:
samples['P350']['IR']

Name/Id,dde43606
Author,christian@MacBook-Pro-de-Christian.local
Created,2017-11-20 23:26:34.900677
Last Modified,2017-11-20 23:26:34.900927
Description,"Stack of 130 datasets : ( LOS2221, LOS2222, LOS2223, LOS2224, LOS2225, LOS2226, LOS2227, LOS2228, LOS2229, LOS2230, LOS2231, LOS2232, LOS2233, LOS2234, LOS2235, LOS2236, LOS2237, LOS2238, LOS2239, LOS2240, LOS2241, LOS2242, LOS2243, LOS2244, LOS2245, LOS2246, LOS2247, LOS2248, LOS2249, LOS2250, LOS2251, LOS2252, LOS2253, LOS2254, LOS2255, LOS2256, LOS2257, LOS2258, LOS2259, LOS2260, LOS2261, LOS2262, LOS2263, LOS2264, LOS2265, LOS2266, LOS2267, LOS2268, LOS2269, LOS2270, LOS2271, LOS2272, LOS2273, LOS2274, LOS2275, LOS2276, LOS2277, LOS2278, LOS2279, LOS2280, LOS2281, LOS2282, LOS2283, LOS2284, LOS2285, LOS2286, LOS2287, LOS2288, LOS2289, LOS2290, LOS2291, LOS2292, LOS2293, LOS2294, LOS2295, LOS2296, LOS2297, LOS2298, LOS2299, LOS2300, LOS2301, LOS2302, LOS2303, LOS2304, LOS2305, LOS2306, LOS2307, LOS2308, LOS2309, LOS2310, LOS2311, LOS2312, LOS2313, LOS2314, LOS2315, LOS2316, LOS2317, LOS2318, LOS2319, LOS2320, LOS2321, LOS2322, LOS2323, LOS2324, LOS2325, LOS2326, LOS2327, LOS2328, LOS2329, LOS2330, LOS2331, LOS2332, LOS2333, LOS2334, LOS2335, LOS2336, LOS2337, LOS2338, LOS2339, LOS2340, LOS2341, LOS2342, LOS2343, LOS2344, LOS2345, LOS2346, LOS2347, LOS2348, LOS2349, LOS2350 )"

Title,Absorbance
Shape,130 x 2811
Values,"[[ 1.424 1.376 ..., 1.345 1.345]  [ 1.319 1.276 ..., 1.305 1.305]  ..., [ -- -- ..., -- --]  [ -- -- ..., -- --]] a.u."

Title,Tos
Data,"[ 0.000 0.167 ..., 21.402 21.569] hr"
Labels,"[[2013-08-30 09:35:07 2013-08-30 09:45:08 ..., 2013-08-31 06:59:14 2013-08-31 07:09:16]  [LOS2221 LOS2222 ..., LOS2349 LOS2350]]"

Title,Wavenumbers
Data,"[1290.165 1291.129 ..., 3998.740 3999.704] cm-1"

In [17]:
samples['P350']['IR'].plot()
