# Normalizing and Transforming Data

Before performing model calculations on your data, you may want to normalize the input composition to a total of 100 wt%. VESIcal provides several normalization routines. Normalization can be done automatically when retrieving a single sample from an Excel file, as detailed above. Each of the normalization routines can also be called at any time to normalize either a single sample or all samples in a BatchFile object.

All three normalization functions accept either a single composition as a dictionary or multiple compositions as a BatchFile object or a pandas DataFrame object (e.g., `yourexcelfile` or `yourexcelfile.data`). The standard normalize function returns the composition normalized to 100%, including any volatiles. The FixedVolatiles function normalizes the composition to 100 wt% while holding volatile concentrations fixed: the other major element oxides are scaled proportionally so that the total is 100 wt%. The AdditionalVolatiles function normalizes oxides to 100% assuming the sample is volatile-free. If H$_2$O or CO$_2$ concentrations are passed to the function, their un-normalized values will be retained in addition to the normalized non-volatile oxides, summing to >100%.
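The arithmetic behind the three routines can be sketched in plain Python. This is only an illustration of the descriptions above, not VESIcal's own implementation: the helper functions, the `VOLATILES` tuple, and the example composition are all hypothetical, and the sketch assumes H2O and CO2 are the only volatiles.

```python
# Illustrative sketch only: these helpers are hypothetical, not VESIcal code.
# Assumption: H2O and CO2 are the only species treated as volatiles.
VOLATILES = ("H2O", "CO2")


def sketch_standard(comp):
    """Scale every oxide, volatiles included, so the total is 100 wt%."""
    total = sum(comp.values())
    return {ox: 100.0 * wt / total for ox, wt in comp.items()}


def sketch_fixed_volatiles(comp):
    """Hold volatiles fixed; scale the other oxides so the total is 100 wt%."""
    vol_total = sum(wt for ox, wt in comp.items() if ox in VOLATILES)
    nonvol_total = sum(wt for ox, wt in comp.items() if ox not in VOLATILES)
    scale = (100.0 - vol_total) / nonvol_total
    return {ox: wt if ox in VOLATILES else wt * scale for ox, wt in comp.items()}


def sketch_additional_volatiles(comp):
    """Scale non-volatile oxides to 100 wt%; keep volatiles as given (total > 100)."""
    nonvol_total = sum(wt for ox, wt in comp.items() if ox not in VOLATILES)
    scale = 100.0 / nonvol_total
    return {ox: wt if ox in VOLATILES else wt * scale for ox, wt in comp.items()}


# Made-up composition in wt% (not taken from the example dataset); it sums to 110.
demo_comp = {"SiO2": 60.0, "Al2O3": 30.0, "MgO": 15.0, "H2O": 4.0, "CO2": 1.0}

demo_standard = sketch_standard(demo_comp)
demo_fixed = sketch_fixed_volatiles(demo_comp)
demo_additional = sketch_additional_volatiles(demo_comp)

# Print the total and the H2O value for each routine.
for label, result in [("standard", demo_standard),
                      ("fixedvolatiles", demo_fixed),
                      ("additionalvolatiles", demo_additional)]:
    print(label, round(sum(result.values()), 6), round(result["H2O"], 6))
```

In this sketch only the standard routine changes the volatile values. The other two leave H2O and CO2 untouched and differ only in whether the non-volatile oxides are scaled to 100 wt% minus the volatiles, or to a full 100 wt%.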

In [1]:
import sys
sys.path.insert(0, '../')

import VESIcal as v



## Normalizing an entire dataset
### Import an Excel file

In [2]:
myfile = v.BatchFile('../../manuscript/example_data.xlsx')

### Standard Normalization
Returns the composition normalized to 100%, including any volatiles.

In [3]:
standard = myfile.get_data(normalization='standard')
standard

myfile.save_csv("../tables/NormStandard.csv", standard)

Saved ../tables/NormStandard.csv


### FixedVolatiles Normalization
Normalizes the composition to 100 wt% while holding volatile concentrations fixed: the other major element oxides are scaled proportionally so that the total is 100 wt%.

In [4]:
fixed_vols = myfile.get_data(normalization='fixedvolatiles')
fixed_vols

myfile.save_csv("../tables/NormFixedVolatiles.csv", fixed_vols)

Saved ../tables/NormFixedVolatiles.csv


### AdditionalVolatiles Normalization

Normalizes oxides to 100% assuming the sample is volatile-free. If H$_2$O or CO$_2$ concentrations are passed to the function, their un-normalized values will be retained in addition to the normalized non-volatile oxides, summing to >100%.

In [5]:
additional = myfile.get_data(normalization='additionalvolatiles')
additional

myfile.save_csv("../tables/NormAdditionalVolatiles.csv", additional)

Saved ../tables/NormAdditionalVolatiles.csv


## Normalize a single sample composition

### Extract a single sample from your dataset

In [6]:
SampleName = 'BT-ex'
extracted_bulk_comp = myfile.get_sample_composition(SampleName, asSampleClass=True)

The normalization type can be passed to get_sample_composition directly:

```python
extracted_bulk_comp = myfile.get_sample_composition(SampleName, normalization=<normalization-type>, asSampleClass=True)
```

Or, normalization can be done to any Sample object, as shown below.

### Standard Normalization

In [7]:
single_standard = extracted_bulk_comp.get_composition(normalization="standard")
single_standard

SiO2     73.369308
TiO2      0.075736
Al2O3    11.833759
Fe2O3     0.195967
Cr2O3     0.000000
FeO       0.447789
MnO       0.000000
MgO       0.028401
NiO       0.000000
CoO       0.000000
CaO       0.407081
Na2O      3.767869
K2O       4.619900
P2O5      0.000000
H2O       5.206854
CO2       0.047335
dtype: float64

### FixedVolatiles Normalization

In [8]:
single_fixed = extracted_bulk_comp.get_composition(normalization="fixedvolatiles")
single_fixed

SiO2     73.140238
TiO2      0.075500
Al2O3    11.796813
Fe2O3     0.195355
Cr2O3     0.000000
FeO       0.446391
MnO       0.000000
MgO       0.028312
NiO       0.000000
CoO       0.000000
CaO       0.405810
Na2O      3.756105
K2O       4.605476
P2O5      0.000000
CO2       0.050000
H2O       5.500000
dtype: float64

### AdditionalVolatiles Normalization

In [9]:
single_additional = extracted_bulk_comp.get_composition(normalization="additionalvolatiles")
single_additional

SiO2     77.438050
TiO2      0.079936
Al2O3    12.490008
Fe2O3     0.206835
Cr2O3     0.000000
FeO       0.472622
MnO       0.000000
MgO       0.029976
NiO       0.000000
CoO       0.000000
CaO       0.429656
Na2O      3.976819
K2O       4.876099
P2O5      0.000000
H2O       5.500000
CO2       0.050000
dtype: float64
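As a quick consistency check on the three single-sample outputs above, the printed values can be summed by hand. The sketch below is not part of the original notebook. It copies the printed numbers for `BT-ex` into plain dictionaries (the `printed_*` names are hypothetical) and leaves out oxides reported as 0.000000, since they do not affect the totals. Because the values are printed to six decimal places, the comparisons use a loose tolerance.

```python
import math

# Values copied from the printed outputs above (wt%); zero-valued oxides omitted.
printed_standard = {
    "SiO2": 73.369308, "TiO2": 0.075736, "Al2O3": 11.833759, "Fe2O3": 0.195967,
    "FeO": 0.447789, "MgO": 0.028401, "CaO": 0.407081, "Na2O": 3.767869,
    "K2O": 4.619900, "H2O": 5.206854, "CO2": 0.047335,
}
printed_fixed = {
    "SiO2": 73.140238, "TiO2": 0.075500, "Al2O3": 11.796813, "Fe2O3": 0.195355,
    "FeO": 0.446391, "MgO": 0.028312, "CaO": 0.405810, "Na2O": 3.756105,
    "K2O": 4.605476, "H2O": 5.500000, "CO2": 0.050000,
}
printed_additional = {
    "SiO2": 77.438050, "TiO2": 0.079936, "Al2O3": 12.490008, "Fe2O3": 0.206835,
    "FeO": 0.472622, "MgO": 0.029976, "CaO": 0.429656, "Na2O": 3.976819,
    "K2O": 4.876099, "H2O": 5.500000, "CO2": 0.050000,
}

volatiles = ("H2O", "CO2")
tol = 1e-4  # loose tolerance: inputs are rounded to six decimals

# Standard and FixedVolatiles outputs should each total 100 wt%.
total_standard = sum(printed_standard.values())
total_fixed = sum(printed_fixed.values())

# AdditionalVolatiles: non-volatile oxides total 100 wt%; volatiles come on top.
nonvol_additional = sum(wt for ox, wt in printed_additional.items()
                        if ox not in volatiles)
total_additional = sum(printed_additional.values())

print(math.isclose(total_standard, 100.0, abs_tol=tol))
print(math.isclose(total_fixed, 100.0, abs_tol=tol))
print(math.isclose(nonvol_additional, 100.0, abs_tol=tol))
print(math.isclose(total_additional, 105.55, abs_tol=tol))
```

The last check reflects the point made earlier: with AdditionalVolatiles, the H2O (5.5 wt%) and CO2 (0.05 wt%) values are carried over unchanged, so the full composition sums to more than 100 wt%.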

In [10]:
myfile.save_excel(filename='ex_normalize_tables.xlsx', calculations=[standard, fixed_vols, additional])

Saved ex_normalize_tables.xlsx
