# Raman Spectroscopy Decomposition

## Introduction

Once the components in a mixture's Raman spectra have been identified and assigned, and pseudo-Voigt curve fitting has been completed, the next step is to compare the area under the peaks of the pure-component calibration (non-decomposing, non-reacting) spectra with that of the experimental data. From this comparison one can determine:
1. whether decomposition is occurring

and, if it is:
2. the molar amount of decomposition

The calculation compares the peak area from the experimental mixture Raman spectra with the corresponding peak area from the pure-component calibration Raman spectra.
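
As a rough illustration of what "area under a fitted peak" means, the sketch below evaluates a pseudo-Voigt profile and integrates it numerically. The parameterization (a Gaussian and a Lorentzian sharing the same full width, mixed by `fraction`) is one common convention; whether it matches the fitting code used upstream is an assumption, and all parameter values are made up for the example.

```python
import math


def pseudo_voigt(x, amplitude, center, sigma, fraction):
    """Pseudo-Voigt profile: weighted sum of a Gaussian and a Lorentzian.

    Both components have unit area and the same FWHM (2 * sigma), so
    `amplitude` is the total area under the curve over an infinite range.
    """
    sigma_g = sigma / math.sqrt(2.0 * math.log(2.0))  # Gaussian width giving FWHM = 2 * sigma
    gauss = math.exp(-((x - center) ** 2) / (2.0 * sigma_g ** 2)) / (
        sigma_g * math.sqrt(2.0 * math.pi)
    )
    lorentz = (sigma / math.pi) / ((x - center) ** 2 + sigma ** 2)
    return amplitude * ((1.0 - fraction) * gauss + fraction * lorentz)


def trapezoid_area(xs, ys):
    """Integrate sampled data with the trapezoidal rule."""
    return sum(
        0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)
    )


# Hypothetical peak parameters, chosen only for illustration.
xs = [i * 0.1 for i in range(40001)]  # 0 to 4000 cm^-1 in 0.1 cm^-1 steps
calibration = [pseudo_voigt(x, 100.0, 2000.0, 5.0, 0.5) for x in xs]
experiment = [pseudo_voigt(x, 60.0, 2000.0, 5.0, 0.5) for x in xs]

cal_area = trapezoid_area(xs, calibration)
exp_area = trapezoid_area(xs, experiment)
area_ratio = exp_area / cal_area
print(f"calibration area = {cal_area:.2f}, experimental area = {exp_area:.2f}")
print(f"experimental / calibration = {area_ratio:.3f}")
```

Because the Lorentzian tails extend beyond any finite window, the integrated area comes out slightly below `amplitude`; the ratio of the two areas is unaffected when both peaks share the same shape.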

## Pre-step: Import Modules

In [3]:
#initial imports
import os
import h5py
import matplotlib.pyplot as plt
from ramandecompy import dataprep

## Step 1: Import Experimental Data Sets
The first step is to load the experimental data into an HDF5 file, which is later used to identify peaks.

When a directory holds many data sets, it is easier to loop over all files in the directory than to add them one by one. The loop is adapted from a Stack Overflow answer: `https://stackoverflow.com/questions/10377998/how-can-i-iterate-over-files-in-a-given-directory`

Note: A good general resource for working with HDF5 files in Python is the h5py documentation: `http://docs.h5py.org/en/stable/`


In [2]:
# Run this cell once to build the experimental HDF5 file, then comment it out again
# so that no error is raised saying the file already exists.
#dataprep.new_hdf5('dataprep_experimental')
#directory = '/Users/elizabeth/Desktop/raman-spectra-decomp-analysis/ramandecompy/tests/test_files/' # directory holding the data
#
#for filename in os.listdir(directory):
#    if filename.startswith('FA_') and filename.endswith('.csv'):
#        locationandfile = os.path.join(directory, filename)
#        dataprep.add_experiment('dataprep_experimental.hdf5', locationandfile)
#
#dataprep.view_hdf5('dataprep_experimental.hdf5')

# TODO: bulk-add the calibration data in the same way
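
The file-selection part of the loop can be factored into a small helper, sketched below. `matching_files` is a hypothetical name, not part of `ramandecompy`; it only collects paths and leaves the call to `dataprep.add_experiment` to the caller.

```python
import os


def matching_files(directory, prefix, suffix='.csv'):
    """Return the sorted full paths of files in `directory` whose names
    start with `prefix` and end with `suffix`."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.startswith(prefix) and name.endswith(suffix)
    )


# Possible usage (left commented out because it needs the data directory):
# for path in matching_files(directory, 'FA_'):
#     dataprep.add_experiment('dataprep_experimental.hdf5', path)
```

Sorting the paths makes the order in which files are added reproducible, since `os.listdir` returns entries in arbitrary order.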


In [13]:
type(filename) # check that the file name is a string

str

In [35]:
dataprep.view_hdf5('dataprep_experimental.hdf5') # confirm that the loop imported all of the data correctly
# comment this line out to hide the long listing

**** dataprep_experimental.hdf5 ****
300C
|    25s
|    |    Peak_01
|    |    Peak_02
|    |    Peak_03
|    |    Peak_04
|    |    Peak_05
|    |    Peak_06
|    |    Peak_07
|    |    Peak_08
|    |    Peak_09
|    |    Peak_10
|    |    Peak_11
|    |    Peak_12
|    |    Peak_13
|    |    Peak_14
|    |    Peak_15
|    |    Peak_16
|    |    counts
|    |    wavenumber
|    35s
|    |    Peak_01
|    |    Peak_02
|    |    Peak_03
|    |    Peak_04
|    |    Peak_05
|    |    Peak_06
|    |    Peak_07
|    |    Peak_08
|    |    Peak_09
|    |    Peak_10
|    |    Peak_11
|    |    Peak_12
|    |    Peak_13
|    |    Peak_14
|    |    Peak_15
|    |    Peak_16
|    |    counts
|    |    wavenumber
|    45s
|    |    Peak_01
|    |    Peak_02
|    |    Peak_03
|    |    Peak_04
|    |    Peak_05
|    |    Peak_06
|    |    Peak_07
|    |    Peak_08
|    |    Peak_09
|    |    Peak_10
|    |    Peak_11
|    |    Peak_12
|    |    Peak_13
|    |    ...
|    |    Peak_04
|    |    Peak_05
|    |    Peak_06
|    |    Peak_07
|    |    Peak_08
|    |    Peak_09
|    |    Peak_10
|    |    Peak_11
|    |    Peak_12
|    |    Peak_13
|    |    Peak_14
|    |    counts
|    |    wavenumber
|    125s
|    |    Peak_01
|    |    Peak_02
|    |    Peak_03
|    |    Peak_04
|    |    Peak_05
|    |    Peak_06
|    |    Peak_07
|    |    Peak_08
|    |    Peak_09
|    |    Peak_10
|    |    Peak_11
|    |    Peak_12
|    |    Peak_13
|    |    Peak_14
|    |    counts
|    |    wavenumber
|    15s
|    |    Peak_01
|    |    Peak_02
|    |    Peak_03
|    |    Peak_04
|    |    Peak_05
|    |    Peak_06
|    |    Peak_07
|    |    Peak_08
|    |    Peak_09
|    |    Peak_10
|    |    Peak_11
|    |    Peak_12
|    |    Peak_13
|    |    Peak_14
|    |    counts
|    |    wavenumber
|    5s
|    |    Peak_01
|    |    Peak_02
|    |    Peak_03
|    |    Peak_04
|    |    Peak_05
|    |    Peak_06
|    |    Peak_07
|    |    ...

## Step 2: Define substance of interest
The second step is to determine whether the species of interest is present in the spectra and, if so, whether its peak area has decreased or otherwise changed relative to the calibration area.

For now, this relies on the user knowing the approximate location of the peak for the substance of interest.

Given the peak center (in cm^-1) supplied by the user, the code searches the calibration data for a peak centered at that location, within a tolerance of +/- 10 cm^-1, and stores the area under that curve as a variable.
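
The tolerance search could look something like the sketch below. It assumes the fitted peaks are already available as `(center, area)` pairs, which is a hypothetical layout rather than the way `ramandecompy` stores them, and the numbers are placeholders.

```python
def find_peak_area(peaks, target_center, tolerance=10.0):
    """Return the area of the peak closest to `target_center`, considering
    only peaks within `tolerance` (cm^-1). Return None if there is none.

    `peaks` is an iterable of (center, area) pairs (assumed layout).
    """
    in_window = [
        (center, area)
        for center, area in peaks
        if abs(center - target_center) <= tolerance
    ]
    if not in_window:
        return None
    # When several peaks fall inside the window, keep the closest one.
    _, area = min(in_window, key=lambda peak: abs(peak[0] - target_center))
    return area


# Placeholder calibration peaks: (center in cm^-1, area in arbitrary units).
calibration_peaks = [(1400.0, 250.0), (1716.0, 900.0), (2943.0, 410.0)]
calibration_area = find_peak_area(calibration_peaks, target_center=1710.0)
print(calibration_area)
```

Returning `None` rather than raising keeps the "peak not found" case explicit, so the caller decides how to handle a missing species.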

In [4]:
dataprep.view_hdf5('cal_example.hdf5') #viewing the hdf5 file to see if compound of interest is included

**** cal_example.hdf5 ****
Carbon Monoxide
|    Peak_1
|    x
|    y
Hydrogen
|    Peak_1
|    Peak_2
|    Peak_3
|    Peak_4
|    x
|    y
Methane
|    Peak_1
|    x
|    y


In [12]:
# directory holding the data files
directory = '/Users/elizabeth/Desktop/raman-spectra-decomp-analysis/ramandecompy/tests/test_files/'

# add every matching .csv file in the directory to the calibration file
for filename in os.listdir(directory):
    if filename.startswith('Formic') and filename.endswith('.csv'):
        locationandfile = os.path.join(directory, filename)
        dataprep.add_experiment('cal_example.hdf5', locationandfile)

AttributeError: 'list' object has no attribute 'split'

In [23]:
pwd

'/Users/elizabeth/Desktop/raman-spectra-decomp-analysis/examples'

In [24]:
dataprep.add_experiment('cal_example.hdf5', 'FormicAcid.csv')


AttributeError: 'list' object has no attribute 'split'

## Step 3: Define presence of substance in experimental data

For the same peak center supplied by the user, the code then checks whether the peak is present in the experimental data and, if it is, stores the area of that peak as a second variable.
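
One way to survey every experimental run for the peak is sketched below. The nested layout (a run label such as `'300C/25s'` mapped to `(center, area)` pairs) and the helper name `survey_runs` are assumptions for illustration, and the values are placeholders rather than measurements.

```python
def survey_runs(runs, target_center, tolerance=10.0):
    """Map each run label to the area of the peak nearest `target_center`
    within `tolerance` (cm^-1), or to None when no such peak was fitted."""
    result = {}
    for label, peaks in runs.items():
        in_window = [
            (center, area)
            for center, area in peaks
            if abs(center - target_center) <= tolerance
        ]
        if in_window:
            nearest = min(in_window, key=lambda peak: abs(peak[0] - target_center))
            result[label] = nearest[1]
        else:
            result[label] = None
    return result


# Placeholder runs keyed by 'temperature/time'.
runs = {
    '300C/25s': [(1712.0, 540.0), (2940.0, 300.0)],
    '300C/125s': [(2941.0, 310.0)],
}
experimental_areas = survey_runs(runs, target_center=1710.0)
print(experimental_areas)
```

A `None` entry marks a run in which the peak is absent, which can mean either complete decomposition or a fit that did not resolve the peak.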


In [19]:
dataprep.view_hdf5('cal_example.hdf5')

**** cal_example.hdf5 ****
Carbon Monoxide
|    Peak_1
|    x
|    y
Hydrogen
|    Peak_1
|    Peak_2
|    Peak_3
|    Peak_4
|    x
|    y
Methane
|    Peak_1
|    x
|    y


In [33]:
data = h5py.File('cal_example.hdf5', 'r+')
# then specify the peak
# note: '300C/25s/Peak_01' is a path in the experimental file; cal_example.hdf5 is grouped by compound (e.g. 'Hydrogen/Peak_1'), hence the KeyError below
peak_01 = list(data['300C/25s/Peak_01'])
# list() converts the h5py dataset into a plain Python list, which is more familiar to work with; peak_01 will then hold the 7 elements of the Peak_01 dataset
print(peak_01)




KeyError: 'Unable to open object (component not found)'

In [34]:
data1 = h5py.File('dataprep_experimental.hdf5', 'r') # read-only access is enough here
# then specify the peak, reading from data1 (the experimental file) rather than data
peak_01 = list(data1['300C/25s/Peak_01'])
# list() converts the h5py dataset into a plain Python list, which is more familiar to work with; peak_01 will then hold the 7 elements of the Peak_01 dataset
print(peak_01)
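
A lookup such as `data['300C/25s/Peak_01']` raises a `KeyError` whenever that path does not exist in the file that was opened. A defensive read might look like the sketch below; `read_dataset` is a hypothetical helper, and the throwaway file it is demonstrated on contains placeholder numbers only.

```python
import os
import tempfile

import h5py


def read_dataset(hdf5_path, key):
    """Return the dataset stored at `key` as a list, or None if the path
    is not present in the file."""
    with h5py.File(hdf5_path, 'r') as hdf5_file:
        if key not in hdf5_file:
            return None
        # [()] reads the whole dataset into memory before the file closes.
        return list(hdf5_file[key][()])


# Demonstration on a temporary file with made-up values.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'demo.hdf5')
    with h5py.File(demo_path, 'w') as hdf5_file:
        # Intermediate groups ('300C', '25s') are created automatically.
        hdf5_file.create_dataset('300C/25s/Peak_01', data=[1.0, 2.0, 3.0])

    present = read_dataset(demo_path, '300C/25s/Peak_01')
    absent = read_dataset(demo_path, 'Hydrogen/Peak_1')

print(present, absent)
```

Opening the file in a `with` block also closes it again, which avoids leaving several open handles to the same file across notebook cells.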

## Step 5: Calculate Molar Decomposition

To quantify the molar decomposition, the peak area from the experimental data is divided by the corresponding peak area from the calibration data. The resulting ratio estimates the molar fraction of the substance remaining at the given experimental temperature and residence time.
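
Written out, and assuming that peak area scales linearly with concentration and that the calibration and experimental spectra were acquired under comparable conditions (neither is stated here), the calculation takes the following form. The symbols $A_{\text{exp}}$, $A_{\text{cal}}$, $X_{\text{remaining}}$ and $X_{\text{decomposed}}$ are notation introduced for this sketch.

```latex
X_{\text{remaining}} = \frac{A_{\text{exp}}}{A_{\text{cal}}},
\qquad
X_{\text{decomposed}} = 1 - X_{\text{remaining}} = 1 - \frac{A_{\text{exp}}}{A_{\text{cal}}}
```

For example, with hypothetical areas $A_{\text{cal}} = 100$ and $A_{\text{exp}} = 60$, the remaining fraction is $0.6$ and the decomposed fraction is $0.4$.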

In [None]:
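import os

def molar_decomposition(fa_area, co2_area, fa_cal_area):
    # all three inputs are peak areas; how they are read from each file is not yet defined
    molar_area = co2_area - fa_area
    molar_percent = fa_area / fa_cal_area # fraction of the calibration area
    return molar_area, molar_percent

for filename in os.listdir(directory):
    if filename.endswith(".hdf5"):
        # TODO: read fa_area and co2_area from this file and fa_cal_area from the
        # calibration file, then call molar_decomposition(fa_area, co2_area, fa_cal_area)
        print(os.path.join(directory, filename))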

In [None]:
#tests



In [17]:
data = h5py.File('dataprep_experimental.hdf5', 'r')
# then specify the peak
peak_01 = list(data['300C/25s/Peak_01'])
# list() converts the h5py dataset into a plain Python list, which is more familiar to work with; peak_01 will then hold the 7 elements of the Peak_01 dataset

## Step 6: Plot Molar Decomposition

In [None]:
norm_mol = "area "
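
A possible shape for this step is sketched below: one helper normalizes the experimental areas by the calibration area, and a second plots the result against residence time. Both function names and the input layout (residence time in seconds mapped to peak area) are assumptions made for illustration.

```python
def normalized_series(areas_by_time, calibration_area):
    """Return residence times (s) and area ratios, sorted by time.

    `areas_by_time` maps residence time in seconds to the experimental
    peak area (assumed layout).
    """
    times = sorted(areas_by_time)
    ratios = [areas_by_time[t] / calibration_area for t in times]
    return times, ratios


def plot_decomposition(times, ratios, temperature_label):
    """Plot the normalized peak area against residence time."""
    # Imported here so the data helper above works without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(times, ratios, marker='o', label=temperature_label)
    ax.set_xlabel('Residence time (s)')
    ax.set_ylabel('Peak area / calibration area')
    ax.legend()
    return fig, ax


# Possible usage, with placeholder numbers rather than measurements:
# times, ratios = normalized_series({5: 95.0, 25: 80.0, 125: 40.0}, 100.0)
# fig, ax = plot_decomposition(times, ratios, '300C')
```

Sorting by time matters because the groups in the HDF5 listing do not appear in chronological order.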

## Step 7: Compare Molar Decomposition with Reported Literature Values

## Conclusion