| [**Overview**](./00_overview.ipynb) | [Getting Started](./01_jupyter_python.ipynb) | **Examples:** | [Access](./02_accessing_indexing.ipynb) | [Transform](./03_transform.ipynb) | [Plotting](./04_simple_vis.ipynb) | [Norm-Spiders](./05_norm_spiders.ipynb) | [Minerals](./06_minerals.ipynb) | [lambdas](./07_lambdas.ipynb) |
| ----------------------------------- | -------------------------------------------- | ------------- | --------------------------------------- | --------------------------------- | --------------------------------- | --------------------------------------- | ------------------------------- | ----------------------------- |

# Transforming Geochemical Data


In [1]:
import pyrolite.geochem
import pandas as pd
import numpy as np
from pyrolite.util.synthetic import normal_frame

pd.set_option("display.precision", 3)  # smaller graphical outputs
 
df = normal_frame(columns=['CaO', 'MgO', 'SiO2', 'FeO','Na2O', 'Ni', 'Ti', 'La', 'Lu', 'Te']) * 100
df[['Ni', 'Ti', 'La', 'Lu', 'Te']] *= 10
df['Sr87/Sr86'] = 0.0700  / 0.0986 + np.random.randn(df.index.size) * 0.0001
df

Unnamed: 0,CaO,MgO,SiO2,FeO,Na2O,Ni,Ti,La,Lu,Te,Sr87/Sr86
0,7.363,14.097,15.794,7.463,7.039,32.75,164.56,76.737,43.809,164.577,0.71
1,7.493,14.707,15.92,6.828,6.671,34.572,158.108,77.831,47.156,166.151,0.71
2,7.37,14.602,15.867,6.799,6.792,34.768,163.208,79.196,45.55,162.987,0.71
3,7.44,13.762,15.073,7.021,7.091,33.215,178.057,76.028,43.147,165.68,0.71
4,7.504,13.433,15.239,7.125,7.326,35.81,180.651,77.172,47.91,152.186,0.71
5,7.618,14.374,15.454,7.36,6.902,33.729,163.031,77.882,45.755,162.519,0.71
6,7.263,14.868,16.324,6.899,6.697,34.293,154.209,76.046,43.798,171.145,0.71
7,7.264,13.7,15.71,7.712,7.084,30.635,167.176,75.267,47.722,164.503,0.71
8,7.079,13.788,15.717,7.058,7.254,31.553,170.261,78.28,45.911,165.037,0.71
9,7.836,13.898,15.206,7.256,7.219,34.807,165.493,77.051,47.255,161.259,0.71


----
### Using Indexers, Scaling

You can also use these indexers for assignment, provided the dimensionality of the dataset doesn't change. While you can convert element and oxide abundance units easily if you remember the relative scales, `pyrolite` provides some functions so that you don't have to rely on your memory. Here we take a copy of the element columns, treat them as ppm, and convert them to wt% (a factor of $10^{-4}$), leaving the original dataframe untouched. This method provides an easy way to explicitly declare your intention when changing units - and makes sure the relative scales are correct!

In [2]:
from pyrolite.util.units import scale

# take a copy of just the element columns; we'll edit this copy rather than df
els = df.pyrochem.elements.copy()
els.pyrochem.elements *= scale('ppm', 'wt%')

In [3]:
df.pyrochem.elements, els.pyrochem.elements

(       Ni       Ti      La      Lu       Te
 0  32.750  164.560  76.737  43.809  164.577
 1  34.572  158.108  77.831  47.156  166.151
 2  34.768  163.208  79.196  45.550  162.987
 3  33.215  178.057  76.028  43.147  165.680
 4  35.810  180.651  77.172  47.910  152.186
 5  33.729  163.031  77.882  45.755  162.519
 6  34.293  154.209  76.046  43.798  171.145
 7  30.635  167.176  75.267  47.722  164.503
 8  31.553  170.261  78.280  45.911  165.037
 9  34.807  165.493  77.051  47.255  161.259,
       Ni     Ti     La     Lu     Te
 0  0.003  0.016  0.008  0.004  0.016
 1  0.003  0.016  0.008  0.005  0.017
 2  0.003  0.016  0.008  0.005  0.016
 3  0.003  0.018  0.008  0.004  0.017
 4  0.004  0.018  0.008  0.005  0.015
 5  0.003  0.016  0.008  0.005  0.016
 6  0.003  0.015  0.008  0.004  0.017
 7  0.003  0.017  0.008  0.005  0.016
 8  0.003  0.017  0.008  0.005  0.017
 9  0.003  0.017  0.008  0.005  0.016)
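
For reference, the factor applied above is just a ratio of unit sizes: 1 wt% is 10,000 ppm, so ppm values are multiplied by 1e-4, which matches the output (32.750 ppm Ni becomes roughly 0.003 wt%). A minimal plain-Python sketch of the same idea follows; the `UNITS_IN_PPM` table and `unit_factor` helper are hypothetical names for illustration, not part of `pyrolite`.

```python
# Hypothetical lookup: size of one unit of each scale, expressed in ppm.
UNITS_IN_PPM = {"wt%": 1e4, "ppm": 1.0, "ppb": 1e-3}


def unit_factor(in_unit, target_unit):
    """Factor by which to multiply values in `in_unit` to get `target_unit`."""
    return UNITS_IN_PPM[in_unit] / UNITS_IN_PPM[target_unit]


ni_ppm = 32.750  # first Ni value from the output above
ni_wt = ni_ppm * unit_factor("ppm", "wt%")
print(f"{ni_wt:.6f} wt%")
```

Naming both units at the call site is the point of the design: the direction of the conversion is stated explicitly rather than hidden in a bare `1e-4`.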

---
### Converting Chemical Components 

`pyrolite` provides some straightforward methods to calculate element-oxide conversions (e.g. to transform Ti abundance to TiO2 abundance), assuming that the system is open to oxygen (i.e. in this case the extra oxygen is added to the composition). This interface also allows you to quickly add ratios and specify redox pairs at the same time. For example, we can return a transformed copy of our dataframe which includes extra ratios and has some of the oxide components converted to elements:

In [4]:
df.pyrochem.convert_chemistry(
    to=["MgO", "SiO2", "FeO", "Ca", "Te", "Na", "Na/Te", "MgO/SiO2"]
)

Unnamed: 0,Sr87/Sr86,MgO,SiO2,FeO,Ca,Te,Na,Na/Te,MgO/SiO2
0,0.71,14.097,15.794,7.463,5.263,164.577,5.222,0.032,0.893
1,0.71,14.707,15.92,6.828,5.355,166.151,4.949,0.03,0.924
2,0.71,14.602,15.867,6.799,5.267,162.987,5.038,0.031,0.92
3,0.71,13.762,15.073,7.021,5.317,165.68,5.26,0.032,0.913
4,0.71,13.433,15.239,7.125,5.363,152.186,5.435,0.036,0.882
5,0.71,14.374,15.454,7.36,5.445,162.519,5.12,0.032,0.93
6,0.71,14.868,16.324,6.899,5.191,171.145,4.969,0.029,0.911
7,0.71,13.7,15.71,7.712,5.191,164.503,5.256,0.032,0.872
8,0.71,13.788,15.717,7.058,5.06,165.037,5.382,0.033,0.877
9,0.71,13.898,15.206,7.256,5.6,161.259,5.355,0.033,0.914
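
The element-oxide conversions above come down to molar mass ratios. As a hedged illustration (plain Python, not `pyrolite`'s implementation; the `MASS` table and `oxide_to_element` helper are hypothetical, and the atomic masses are rounded), converting the first row's CaO and Na2O to Ca and Na approximately reproduces the table. The ratio columns also appear consistent with simple column quotients.

```python
# Rounded standard atomic masses (g/mol); assumed values for illustration.
MASS = {"Ca": 40.078, "Na": 22.990, "O": 15.999}


def oxide_to_element(oxide_wt, element, n_cations, n_oxygen):
    """Convert an oxide abundance to the equivalent element abundance."""
    oxide_mass = n_cations * MASS[element] + n_oxygen * MASS["O"]
    return oxide_wt * n_cations * MASS[element] / oxide_mass


ca = oxide_to_element(7.363, "Ca", n_cations=1, n_oxygen=1)  # CaO  -> Ca
na = oxide_to_element(7.039, "Na", n_cations=2, n_oxygen=1)  # Na2O -> Na
na_te = na / 164.577  # quotient of the Na and Te columns as they stand
print(round(ca, 2), round(na, 2), round(na_te, 3))
```

Because the system is treated as open to oxygen, the element abundances are smaller than the oxide abundances they replace and the row totals change accordingly.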


In a similar way, we can also specify the molar speciation for redox species (so far just iron; others could be incorporated if they'll be useful). Here we adjust the total iron within our compositions (currently specified as FeO) to have a $Fe^{2+}/Fe^{3+}$ ratio of 9:1 (roughly what you might expect from a normal mantle-derived magma):

In [5]:
df.pyrochem.convert_chemistry(to=[{"FeO": 0.9, "Fe2O3": 0.1}])

Unnamed: 0,Sr87/Sr86,FeO,Fe2O3
0,0.71,6.716,0.829
1,0.71,6.145,0.759
2,0.71,6.119,0.756
3,0.71,6.319,0.78
4,0.71,6.412,0.792
5,0.71,6.624,0.818
6,0.71,6.209,0.767
7,0.71,6.94,0.857
8,0.71,6.352,0.784
9,0.71,6.53,0.806
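
The numbers above are consistent with splitting iron on a molar basis: 90% of the Fe atoms stay as FeO, and the remaining 10% are recast as Fe2O3, which weighs more per Fe atom. A rough sketch in plain Python follows (the `split_iron` helper is hypothetical and the atomic masses are rounded; this is not `pyrolite`'s own code):

```python
FE, O = 55.845, 15.999  # rounded standard atomic masses (g/mol)
M_FEO = FE + O
M_FE2O3 = 2 * FE + 3 * O


def split_iron(feo_total, frac_fe2=0.9):
    """Split total iron (given as FeO) into FeO and Fe2O3 by molar Fe fraction."""
    feo = feo_total * frac_fe2
    # each mole of Fe2O3 holds two moles of Fe, hence the factor of 2
    fe2o3 = feo_total * (1 - frac_fe2) * M_FE2O3 / (2 * M_FEO)
    return feo, fe2o3


feo, fe2o3 = split_iron(7.463)  # first FeO value from the original dataframe
print(round(feo, 2), round(fe2o3, 2))
```

Note that the two outputs sum to slightly more than the input FeO, since the ferric component carries extra oxygen.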


----
<div class='alert alert-warning'> <font size="+1" color="black"><b> Checkpoint & Time Check</b><br>How are things going?</font></div>

----


### Compositional Data - Logratio Transformation (Optional)

`pyrolite` includes a few functions for dealing with compositional data, at the heart of which are i) closure (i.e. everything sums to 100%) and ii) log-transforms to deal with the compositional space. We don't quite have room in this workshop to get into compositional data analysis, but there's a bit more information in [the pyrolite documentation](https://pyrolite.readthedocs.io/en/master/examples/index.html#compositional-data-examples).

The commonly used log-transformations include the Additive Log-Ratio (`ALR`), Centred Log-Ratio (`CLR`), and Isometric Log-Ratio (`ILR`). Let's have a look at one of these, which can be accessed directly from your dataframes (via the `df.pyrocomp` API). A key thing to note here is that all components should start in the same units and sum to one if you want to be able to back-transform them! Note that we're using `df.pyrochem.compositional` to extract the elements and oxides but leave other columns alone:

In [6]:
scaled_df = df.copy()
scaled_df.pyrochem.elements *= scale('ppm', 'wt%')
scaled_df.pyrochem.compositional = scaled_df.pyrochem.compositional.pyrocomp.renormalise(scale=1)

In [7]:
scaled_df.head()

Unnamed: 0,CaO,MgO,SiO2,FeO,Na2O,Ni,Ti,La,Lu,Te,Sr87/Sr86
0,0.142,0.272,0.305,0.144,0.136,6.322e-05,0.0003177,0.0001481,8.457e-05,0.0003177,0.71
1,0.145,0.285,0.308,0.132,0.129,6.691e-05,0.000306,0.0001506,9.127e-05,0.0003216,0.71
2,0.143,0.284,0.308,0.132,0.132,6.754e-05,0.000317,0.0001538,8.848e-05,0.0003166,0.71
3,0.148,0.273,0.299,0.139,0.141,6.586e-05,0.000353,0.0001507,8.555e-05,0.0003285,0.71
4,0.148,0.265,0.301,0.141,0.145,7.066e-05,0.0003565,0.0001523,9.454e-05,0.0003003,0.71
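
Closure here simply means dividing each row by its sum so that the components add up to the chosen total (1 in this case). A minimal plain-Python sketch follows, using the first row of the original dataframe with trace elements converted from ppm to wt% first (the `close` helper is a hypothetical name for illustration):

```python
def close(row, total=1.0):
    """Rescale a sequence of positive components so that they sum to `total`."""
    s = sum(row)
    return [total * x / s for x in row]


majors = [7.363, 14.097, 15.794, 7.463, 7.039]          # CaO, MgO, SiO2, FeO, Na2O (wt%)
traces_ppm = [32.75, 164.56, 76.737, 43.809, 164.577]   # Ni, Ti, La, Lu, Te (ppm)
row = majors + [x * 1e-4 for x in traces_ppm]           # everything in wt%

closed = close(row)
print(round(closed[0], 3), round(sum(closed), 6))
```

Converting to common units before closing matters: closing a row that mixes wt% and ppm values would give the trace elements far too much weight.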


In [8]:
lr_df = scaled_df.pyrochem.compositional.pyrocomp.CLR()
lr_df.head()

Unnamed: 0,CLR(CaO/G),CLR(MgO/G),CLR(SiO2/G),CLR(FeO/G),CLR(Na2O/G),CLR(Ni/G),CLR(Ti/G),CLR(La/G),CLR(Lu/G),CLR(Te/G)
0,3.284,3.933,4.047,3.297,3.239,-4.434,-2.82,-3.583,-4.143,-2.82
1,3.298,3.972,4.051,3.205,3.181,-4.384,-2.863,-3.572,-4.073,-2.814
2,3.282,3.966,4.049,3.202,3.201,-4.377,-2.83,-3.554,-4.107,-2.832
3,3.298,3.913,4.004,3.24,3.25,-4.416,-2.737,-3.588,-4.155,-2.809
4,3.29,3.872,3.998,3.238,3.266,-4.358,-2.739,-3.59,-4.067,-2.911


In [9]:
back_transformed = lr_df.pyrocomp.inverse_CLR() 
back_transformed.head()

Unnamed: 0,CaO,MgO,SiO2,FeO,Na2O,Ni,Ti,La,Lu,Te
0,0.142,0.272,0.305,0.144,0.136,6.322e-05,0.0003177,0.0001481,8.457e-05,0.0003177
1,0.145,0.285,0.308,0.132,0.129,6.691e-05,0.000306,0.0001506,9.127e-05,0.0003216
2,0.143,0.284,0.308,0.132,0.132,6.754e-05,0.000317,0.0001538,8.848e-05,0.0003166
3,0.148,0.273,0.299,0.139,0.141,6.586e-05,0.000353,0.0001507,8.555e-05,0.0003285
4,0.148,0.265,0.301,0.141,0.145,7.066e-05,0.0003565,0.0001523,9.454e-05,0.0003003
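
For a composition $x$, the centred log-ratio is $\mathrm{clr}(x)_i = \ln(x_i / g(x))$, where $g(x)$ is the geometric mean of the parts; the inverse exponentiates and re-closes to one. The sketch below uses only the standard library, with hypothetical helper names `clr` and `inverse_clr` (it is not `pyrolite`'s implementation), and shows the round trip on a small made-up composition:

```python
import math


def clr(x):
    """Centred log-ratio: log of each part relative to the geometric mean."""
    log_g = sum(math.log(v) for v in x) / len(x)  # log of the geometric mean
    return [math.log(v) - log_g for v in x]


def inverse_clr(y):
    """Back-transform CLR coordinates to a composition that sums to one."""
    e = [math.exp(v) for v in y]
    s = sum(e)
    return [v / s for v in e]


x = [0.5, 0.3, 0.2]  # toy composition, already closed to one
y = clr(x)
x_back = inverse_clr(y)
print([round(v, 3) for v in y])
print([round(v, 3) for v in x_back])
```

The CLR coordinates of each sample sum to zero, and the back-transform only recovers the input exactly when the input already summed to one, which is why the data were renormalised before transforming.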


One of the key areas where these logratio transforms might be useful is in deriving statistical properties from your geochemical data, for example calculating a mean. There's a specific function dedicated to this aspect:

In [10]:
scaled_df.pyrochem.compositional.pyrocomp.logratiomean()

CaO     1.445e-01
MgO     2.748e-01
SiO2    3.042e-01
FeO     1.391e-01
Na2O    1.364e-01
Ni      6.538e-05
Ti      3.238e-04
La      1.502e-04
Lu      8.912e-05
Te      3.184e-04
dtype: float64
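
Averaging in CLR space and back-transforming is mathematically equivalent to taking the geometric mean of each component and re-closing the result. The plain-Python sketch below illustrates this on two made-up compositions; `logratio_mean` is a hypothetical helper, so treat it as a conceptual illustration rather than a description of `pyrolite`'s internals:

```python
import math


def logratio_mean(rows):
    """Closed geometric mean of each component across compositions."""
    n = len(rows)
    # geometric mean per column, computed via the mean of logs
    gm = [
        math.exp(sum(math.log(r[j]) for r in rows) / n)
        for j in range(len(rows[0]))
    ]
    total = sum(gm)
    return [g / total for g in gm]


rows = [
    [0.70, 0.20, 0.10],
    [0.40, 0.50, 0.10],
]
center = logratio_mean(rows)
arithmetic = [sum(col) / len(rows) for col in zip(*rows)]
print([round(v, 3) for v in center])
print([round(v, 3) for v in arithmetic])
```

The two estimates differ, which is the practical reason for using a logratio-based mean with compositional data.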
