# Normalizing and Transforming Data

Before running model calculations, you may want to normalize the input composition to a total of 100 wt%. VESIcal provides three normalization routines. Normalization can be applied automatically when retrieving a single sample from an Excel file, as detailed above. Each routine can also be called directly at any time to normalize either a single sample or all samples in an ExcelFile object.

All three normalization functions accept either a single composition as a dictionary or multiple compositions as an ExcelFile object or a pandas DataFrame (e.g., `yourexcelfile` or `yourexcelfile.data`). The standard normalize function (`normalize`) returns the composition normalized to 100 wt%, including any volatiles. The FixedVolatiles function (`normalize_FixedVolatiles`) holds the volatile concentrations fixed and scales the other major element oxides proportionally so that the total, volatiles included, is 100 wt%. The AdditionalVolatiles function (`normalize_AdditionalVolatiles`) normalizes the non-volatile oxides to 100 wt%, as if the sample were volatile-free. If H$_2$O or CO$_2$ concentrations are passed to the function, their un-normalized values are retained alongside the normalized non-volatile oxides, so the total exceeds 100 wt%.
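The arithmetic behind the three routines can be sketched in plain Python. This is only an illustration of the descriptions above, not VESIcal's own implementation: the helper names (`sketch_standard`, `sketch_fixed_volatiles`, `sketch_additional_volatiles`) and the four-component composition are hypothetical, and H$_2$O and CO$_2$ are assumed to be the only volatiles.

```python
# Illustrative sketch only; these helpers are hypothetical, not part of VESIcal.
VOLATILES = ("H2O", "CO2")  # assumption: the only volatile species considered


def _split_totals(comp):
    """Return (sum of non-volatile oxides, sum of volatiles) in wt%."""
    volatiles = sum(v for ox, v in comp.items() if ox in VOLATILES)
    oxides = sum(v for ox, v in comp.items() if ox not in VOLATILES)
    return oxides, volatiles


def sketch_standard(comp):
    """Scale every component, volatiles included, so the total is 100 wt%."""
    oxides, volatiles = _split_totals(comp)
    scale = 100.0 / (oxides + volatiles)
    return {ox: v * scale for ox, v in comp.items()}


def sketch_fixed_volatiles(comp):
    """Keep volatiles as given; scale the other oxides to fill the rest of 100 wt%."""
    oxides, volatiles = _split_totals(comp)
    scale = (100.0 - volatiles) / oxides
    return {ox: (v if ox in VOLATILES else v * scale) for ox, v in comp.items()}


def sketch_additional_volatiles(comp):
    """Scale non-volatile oxides to 100 wt%; keep volatiles as given on top."""
    oxides, _ = _split_totals(comp)
    scale = 100.0 / oxides
    return {ox: (v if ox in VOLATILES else v * scale) for ox, v in comp.items()}


# Made-up composition (wt%) that deliberately does not sum to 100.
example = {"SiO2": 60.0, "Al2O3": 20.0, "H2O": 4.0, "CO2": 1.0}

for routine in (sketch_standard, sketch_fixed_volatiles, sketch_additional_volatiles):
    result = routine(example)
    print(routine.__name__, round(sum(result.values()), 6))
```

For this made-up composition the first two routines give a total of 100 wt%, while the third gives 105 wt%: 100 wt% of non-volatile oxides plus the untouched 5 wt% of volatiles.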

In [1]:
import sys
sys.path.insert(0, '../')

import VESIcal as v

## Normalizing an entire dataset
### Import an Excel file

In [2]:
myfile = v.ExcelFile('../manuscript/example_data.xlsx')

### Standard Normalization
Returns the composition normalized to 100%, including any volatiles.

In [3]:
standard = v.normalize(myfile)
standard

Label,SiO2,TiO2,Al2O3,Fe2O3,Cr2O3,FeO,MnO,MgO,NiO,CoO,CaO,Na2O,K2O,P2O5,H2O,CO2,Press,Temp
BT-ex,73.369308,0.075736,11.833759,0.195967,0.0,0.447789,0.0,0.028401,0.0,0.0,0.407081,3.767869,4.6199,0.0,5.206854,0.047335,500,900
TVZMa-ex,75.315939,0.124934,11.474701,0.0,0.0,0.95142,0.038441,0.048052,0.0,0.0,0.509346,3.651915,3.978665,0.0,3.901783,0.004805,600,800
TVZOh-ex,74.513367,0.076522,11.62179,0.0,0.0,0.9087,0.047826,0.057392,0.0,0.0,0.526089,3.87393,3.940887,0.0,4.428715,0.004783,50,900
Oh48-FTIR1-MI1-a,75.097594,0.028592,11.53281,0.0,0.0,0.942966,0.032238,0.049413,0.0,0.0,0.457858,3.885847,3.924226,0.0,4.044075,0.004381,250,950
Oh48-FTIR1-MI1-b,75.248644,0.02865,11.556007,0.0,0.0,0.944862,0.032303,0.049512,0.0,0.0,0.458779,3.893663,3.932119,0.0,3.851184,0.004276,500,1025
Oh48-FTIR1-MI1-IRc,75.335528,0.028683,11.56935,0.0,0.0,0.945953,0.03234,0.049569,0.0,0.0,0.459309,3.898159,3.936659,0.0,3.73997,0.00448,5000,925
Oh50-4.1,74.449862,0.09403,11.533947,0.0,0.0,1.008719,0.024559,0.095463,0.0,0.0,0.498435,3.860576,3.994358,0.0,4.435689,0.004363,1000,862
Oh50-4.2,74.620875,0.094246,11.560441,0.0,0.0,1.011036,0.024615,0.095683,0.0,0.0,0.49958,3.869443,4.003533,0.0,4.216289,0.00426,100,770
Oh49-4.1,74.709866,0.009492,11.611094,0.0,0.0,0.96072,0.064432,0.052351,0.0,0.0,0.512576,3.844797,4.122849,0.0,4.107446,0.004378,1000,855
Oh49-4.2,74.748223,0.009497,11.617056,0.0,0.0,0.961213,0.064465,0.052377,0.0,0.0,0.512839,3.846771,4.124966,0.0,4.058326,0.004267,500,1000


### FixedVolatiles Normalization
Holds the volatile concentrations fixed and scales the other major element oxides proportionally so that the total, volatiles included, is 100 wt%.

In [4]:
fixed = v.normalize_FixedVolatiles(myfile)
fixed

Label,SiO2,TiO2,Al2O3,Fe2O3,Cr2O3,FeO,MnO,MgO,NiO,CoO,CaO,Na2O,K2O,P2O5,H2O,CO2,Press,Temp
BT-ex,73.140238,0.0755,11.796813,0.195355,0.0,0.446391,0.0,0.028312,0.0,0.0,0.40581,3.756105,4.605476,0.0,5.5,0.05,500,900
TVZMa-ex,75.191779,0.124728,11.455785,0.0,0.0,0.949851,0.038378,0.047972,0.0,0.0,0.508506,3.645895,3.972106,0.0,4.06,0.005,600,800
TVZOh-ex,74.356256,0.076361,11.597285,0.0,0.0,0.906784,0.047725,0.057271,0.0,0.0,0.52498,3.865762,3.932577,0.0,4.63,0.005,50,900
Oh48-FTIR1-MI1-a,74.963741,0.028541,11.512255,0.0,0.0,0.941285,0.032181,0.049325,0.0,0.0,0.457042,3.878921,3.917231,0.0,4.214912,0.004566,250,950
Oh48-FTIR1-MI1-b,75.127485,0.028604,11.537401,0.0,0.0,0.943341,0.032251,0.049432,0.0,0.0,0.458041,3.887394,3.925788,0.0,4.005816,0.004448,500,1025
Oh48-FTIR1-MI1-IRc,75.221373,0.028639,11.551819,0.0,0.0,0.94452,0.032291,0.049494,0.0,0.0,0.458613,3.892252,3.930694,0.0,3.885649,0.004654,5000,925
Oh50-4.1,74.289091,0.093827,11.50904,0.0,0.0,1.00654,0.024506,0.095257,0.0,0.0,0.497358,3.852239,3.985732,0.0,4.641843,0.004566,1000,862
Oh50-4.2,74.475939,0.094063,11.537987,0.0,0.0,1.009072,0.024567,0.095497,0.0,0.0,0.498609,3.861928,3.995757,0.0,4.402133,0.004448,100,770
Oh49-4.1,74.572211,0.009475,11.589701,0.0,0.0,0.95895,0.064313,0.052254,0.0,0.0,0.511631,3.837713,4.115253,0.0,4.283934,0.004566,1000,855
Oh49-4.2,74.61391,0.00948,11.596181,0.0,0.0,0.959486,0.064349,0.052283,0.0,0.0,0.511917,3.839859,4.117554,0.0,4.230533,0.004448,500,1000


### AdditionalVolatiles Normalization

Normalizes oxides to 100% assuming the sample is volatile-free. If H$_2$O or CO$_2$ concentrations are passed to the function, their un-normalized values will be retained in addition to the normalized non-volatile oxides, summing to >100%.

In [5]:
additional = v.normalize_AdditionalVolatiles(myfile)
additional

Label,SiO2,TiO2,Al2O3,Fe2O3,Cr2O3,FeO,MnO,MgO,NiO,CoO,CaO,Na2O,K2O,P2O5,H2O,CO2,Press,Temp
BT-ex,77.43805,0.079936,12.490008,0.206835,0.0,0.472622,0.0,0.029976,0.0,0.0,0.429656,3.976819,4.876099,0.0,5.5,0.05,500,900
TVZMa-ex,78.377838,0.130013,11.941194,0.0,0.0,0.990099,0.040004,0.050005,0.0,0.0,0.530053,3.80038,4.140414,0.0,4.06,0.005,600,800
TVZOh-ex,77.970173,0.080072,12.160945,0.0,0.0,0.950856,0.050045,0.060054,0.0,0.0,0.550495,4.053648,4.123711,0.0,4.63,0.005,50,900
Oh48-FTIR1-MI1-a,78.266165,0.029799,12.019411,0.0,0.0,0.982752,0.033598,0.051497,0.0,0.0,0.477177,4.049802,4.0898,0.0,4.214912,0.004566,250,950
Oh48-FTIR1-MI1-b,78.266165,0.029799,12.019411,0.0,0.0,0.982752,0.033598,0.051497,0.0,0.0,0.477177,4.049802,4.0898,0.0,4.005816,0.004448,500,1025
Oh48-FTIR1-MI1-IRc,78.266165,0.029799,12.019411,0.0,0.0,0.982752,0.033598,0.051497,0.0,0.0,0.477177,4.049802,4.0898,0.0,3.885649,0.004654,5000,925
Oh50-4.1,77.909065,0.098399,12.069855,0.0,0.0,1.055587,0.0257,0.099899,0.0,0.0,0.521594,4.039952,4.17995,0.0,4.641843,0.004566,1000,862
Oh50-4.2,77.909065,0.098399,12.069855,0.0,0.0,1.055587,0.0257,0.099899,0.0,0.0,0.521594,4.039952,4.17995,0.0,4.402133,0.004448,100,770
Oh49-4.1,77.913533,0.009899,12.108995,0.0,0.0,1.001917,0.067194,0.054595,0.0,0.0,0.534556,4.009667,4.299643,0.0,4.283934,0.004566,1000,855
Oh49-4.2,77.913533,0.009899,12.108995,0.0,0.0,1.001917,0.067194,0.054595,0.0,0.0,0.534556,4.009667,4.299643,0.0,4.230533,0.004448,500,1000


## Normalizing a single sample composition

### Extract a single sample from your dataset

In [6]:
SampleName = 'BT-ex'
extracted_bulk_comp = myfile.get_sample_oxide_comp(SampleName)

### Standard Normalization

In [7]:
single_standard = v.normalize(extracted_bulk_comp)
single_standard

{'SiO2': 73.3693079617533,
 'TiO2': 0.07573605983148728,
 'Al2O3': 11.833759348669886,
 'Fe2O3': 0.1959670548139733,
 'Cr2O3': 0.0,
 'FeO': 0.44778945375366846,
 'MnO': 0.0,
 'MgO': 0.028401022436807727,
 'NiO': 0.0,
 'CoO': 0.0,
 'CaO': 0.4070813215942441,
 'Na2O': 3.7678689766164917,
 'K2O': 4.619899649720724,
 'P2O5': 0.0,
 'H2O': 5.2068541134147495,
 'CO2': 0.04733503739467954}

### FixedVolatiles Normalization

In [8]:
single_fixed = v.normalize_FixedVolatiles(extracted_bulk_comp)
single_fixed

{'SiO2': 73.1402378097522,
 'TiO2': 0.07549960031974419,
 'Al2O3': 11.79681254996003,
 'Fe2O3': 0.19535521582733809,
 'Cr2O3': 0.0,
 'FeO': 0.4463913868904875,
 'MnO': 0.0,
 'MgO': 0.02831235011990407,
 'NiO': 0.0,
 'CoO': 0.0,
 'CaO': 0.405810351718625,
 'Na2O': 3.756105115907274,
 'K2O': 4.6054756195043955,
 'P2O5': 0.0,
 'CO2': 0.05,
 'H2O': 5.5}

### AdditionalVolatiles Normalization

In [9]:
single_additional = v.normalize_AdditionalVolatiles(extracted_bulk_comp)
single_additional


{'SiO2': 77.4380495603517,
 'TiO2': 0.07993605115907274,
 'Al2O3': 12.490007993605113,
 'Fe2O3': 0.20683453237410068,
 'Cr2O3': 0.0,
 'FeO': 0.4726219024780175,
 'MnO': 0.0,
 'MgO': 0.029976019184652272,
 'NiO': 0.0,
 'CoO': 0.0,
 'CaO': 0.4296562749800159,
 'Na2O': 3.9768185451638685,
 'K2O': 4.8760991207034365,
 'P2O5': 0.0,
 'H2O': 5.5,
 'CO2': 0.05}
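
As a quick sanity check on the result above, the non-volatile oxides should sum to 100 wt%, and the grand total should exceed 100 wt% by the volatile content. The sketch below copies the printed values into a plain dictionary so that it runs without VESIcal. The names `printed_additional`, `oxide_sum`, and `grand_total` are illustrative only.

```python
# Values copied from the AdditionalVolatiles output printed above (wt%).
printed_additional = {
    'SiO2': 77.4380495603517,
    'TiO2': 0.07993605115907274,
    'Al2O3': 12.490007993605113,
    'Fe2O3': 0.20683453237410068,
    'Cr2O3': 0.0,
    'FeO': 0.4726219024780175,
    'MnO': 0.0,
    'MgO': 0.029976019184652272,
    'NiO': 0.0,
    'CoO': 0.0,
    'CaO': 0.4296562749800159,
    'Na2O': 3.9768185451638685,
    'K2O': 4.8760991207034365,
    'P2O5': 0.0,
    'H2O': 5.5,
    'CO2': 0.05,
}

volatiles = ('H2O', 'CO2')  # assumption: the only volatile species present

# Sum of the normalized non-volatile oxides (expected to be ~100 wt%).
oxide_sum = sum(v for ox, v in printed_additional.items() if ox not in volatiles)

# Grand total, including the un-normalized volatiles (expected to be >100 wt%).
grand_total = sum(printed_additional.values())

print(round(oxide_sum, 6), round(grand_total, 6))
```

In the notebook itself, the same two sums could be taken over `single_additional` rather than a copied dictionary.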

In [10]:
myfile.save_excelfile(filename='ex_normalize_tables.xlsx', calculations=[standard, fixed, additional])

Saved ex_normalize_tables.xlsx
