| [**Overview**](./00_overview.ipynb) | [Getting Started](./01_jupyter_python.ipynb) | **Examples:** | [Access](./02_accessing_indexing.ipynb) | [Transform](./03_transform.ipynb) | [Plotting](./04_simple_vis.ipynb) | [Norm-Spiders](./05_norm_spiders.ipynb) | [Minerals](./06_minerals.ipynb) | [lambdas](./07_lambdas.ipynb) |
| ----------------------------------- | -------------------------------------------- | ------------- | --------------------------------------- | --------------------------------- | --------------------------------- | --------------------------------------- | ------------------------------- | ----------------------------- |

# Accessing and Indexing Geochemical Data with pyrolite
##### *~6 min*

The `pyrolite.pyrochem` API provides access to indexing and transformation functions. This allows easy subsetting of geochemical datasets, which can otherwise be unwieldy (especially as the number of columns increases). To provide a simple illustration, we generate a synthetic dataset to work from, which contains a range of typical geochemical measures: oxide components, element components (here as ppm) and an isotope ratio. While a dataset of this size is manageable, the indexing tools pyrolite provides make it straightforward to pull out different parts of it.

A key thing to note here is that `pyrolite` provides a range of built-in functionality around known elements, oxides and isotope ratios. Typically, this means using column names which are the names of elements, oxide components or isotope ratios in the form $^{xx}A/^{xy}B$ (written in the data below as `Sr87/Sr86`). Because these column names carry no unit information, you'll need to keep track of units etc. yourself, which is where comments can be useful. This aspect may change in the future, with some ability to deal with units, non-standard column names and potentially even molecular ions (e.g. for hydrogeochemistry).
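
To make the naming convention concrete, here is a minimal sketch of how column names might be sorted into elements, oxides and isotope ratios. It is an illustration only and not pyrolite's implementation: `ELEMENTS`, `OXIDE_PATTERN`, `ISOTOPE_RATIO_PATTERN` and `classify_column` are hypothetical names, and the element set is deliberately tiny.

```python
import re

# Illustrative element set only (an assumption); pyrolite's own lists are far more complete.
ELEMENTS = {"Ca", "Mg", "Si", "Fe", "Na", "Ni", "Ti", "La", "Lu", "Te", "Sr"}

# An oxide here is "<element><optional count>O<optional count>", e.g. SiO2, Na2O.
OXIDE_PATTERN = re.compile(r"^([A-Z][a-z]?)\d*O\d*$")
# An isotope ratio here is "<element><mass>/<element><mass>", e.g. Sr87/Sr86.
ISOTOPE_RATIO_PATTERN = re.compile(r"^([A-Z][a-z]?)\d+/([A-Z][a-z]?)\d+$")


def classify_column(name):
    """Return 'element', 'oxide', 'isotope_ratio' or 'other' for a column name."""
    if name in ELEMENTS:
        return "element"
    oxide = OXIDE_PATTERN.match(name)
    if oxide and oxide.group(1) in ELEMENTS:
        return "oxide"
    ratio = ISOTOPE_RATIO_PATTERN.match(name)
    if ratio and ratio.group(1) in ELEMENTS and ratio.group(2) in ELEMENTS:
        return "isotope_ratio"
    return "other"


columns = ["CaO", "SiO2", "Na2O", "Ni", "La", "Sr87/Sr86", "Sample ID", "SiO2 (wt%)"]
kinds = {name: classify_column(name) for name in columns}
```

In this sketch a name such as `SiO2 (wt%)` falls through to `'other'`, which illustrates why units are easier to track outside the column names when recognition depends on exact names.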

First let's create some data to play with:

In [1]:
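import pyrolite.geochem  # registers the `pyrochem` accessor on pandas DataFrames
import pandas as pd
import numpy as np
from pyrolite.util.synthetic import normal_frame

pd.set_option("display.precision", 3)  # smaller graphical outputs

# synthetic compositions, scaled by 100
df = normal_frame(columns=['CaO', 'MgO', 'SiO2', 'FeO', 'Na2O', 'Ni', 'Ti', 'La', 'Lu', 'Te']) * 100
df[['Ni', 'Ti', 'La', 'Lu', 'Te']] *= 10  # scale up the element columns (treated here as ppm)
# an isotope ratio close to 0.710 with a small amount of random noise
df['Sr87/Sr86'] = 0.0700 / 0.0986 + np.random.randn(df.index.size) * 0.0001
df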

Unnamed: 0,CaO,MgO,SiO2,FeO,Na2O,Ni,Ti,La,Lu,Te,Sr87/Sr86
0,26.799,2.489,5.336,1.992,10.286,174.312,49.889,75.802,163.274,67.706,0.71
1,25.39,2.804,5.797,2.08,10.005,202.835,47.814,80.021,141.49,67.088,0.71
2,25.687,2.663,5.431,2.035,10.275,188.772,48.529,78.162,156.912,66.723,0.71
3,27.209,2.447,5.256,2.058,10.576,187.208,50.801,78.834,141.636,66.058,0.71
4,28.377,2.428,5.508,1.97,9.678,181.408,49.806,78.983,145.503,64.687,0.71
5,25.889,2.754,5.649,1.968,10.376,189.78,47.289,78.516,151.666,66.394,0.71
6,27.587,2.355,4.963,2.085,10.287,188.452,50.007,76.491,146.636,65.644,0.71
7,24.56,2.744,5.245,2.099,10.623,194.78,47.718,77.234,160.336,67.215,0.71
8,27.816,2.428,5.259,2.048,10.187,183.689,49.761,78.738,145.15,65.281,0.71
9,26.73,2.423,5.129,2.016,10.2,187.894,49.426,76.077,155.73,65.88,0.71


To save a bit of space, below we'll use `df.head(2)` to restrict the output to just the top two rows:

In [2]:
df.head(2)

Unnamed: 0,CaO,MgO,SiO2,FeO,Na2O,Ni,Ti,La,Lu,Te,Sr87/Sr86
0,26.799,2.489,5.336,1.992,10.286,174.312,49.889,75.802,163.274,67.706,0.71
1,25.39,2.804,5.797,2.08,10.005,202.835,47.814,80.021,141.49,67.088,0.71


### Selecting Data

To select a subset of columns from the dataframe, the `pyrolite.pyrochem` API has a few built-in selectors:

In [3]:
df.head(2).pyrochem.oxides

Unnamed: 0,CaO,MgO,SiO2,FeO,Na2O
0,26.799,2.489,5.336,1.992,10.286
1,25.39,2.804,5.797,2.08,10.005


In [4]:
df.head(2).pyrochem.elements

Unnamed: 0,Ni,Ti,La,Lu,Te
0,174.312,49.889,75.802,163.274,67.706
1,202.835,47.814,80.021,141.49,67.088


In [5]:
df.head(2).pyrochem.REE

Unnamed: 0,La,Lu
0,75.802,163.274
1,80.021,141.49


In [6]:
df.head(2).pyrochem.REY

Unnamed: 0,La,Lu
0,75.802,163.274
1,80.021,141.49


In [7]:
df.head(2).pyrochem.compositional

Unnamed: 0,CaO,MgO,SiO2,FeO,Na2O,Ni,Ti,La,Lu,Te
0,26.799,2.489,5.336,1.992,10.286,174.312,49.889,75.802,163.274,67.706
1,25.39,2.804,5.797,2.08,10.005,202.835,47.814,80.021,141.49,67.088


In [8]:
df.head(2).pyrochem.isotope_ratios

Unnamed: 0,Sr87/Sr86
0,0.71
1,0.71


---
These selectors can also be used to reassign modified subsets back to the dataframe, which is useful if you want to process, scale, transform or standardise only a subset of the data:

In [9]:
df.pyrochem.isotope_ratios /= 0.7016

In [10]:
df.head(2).pyrochem.isotope_ratios

Unnamed: 0,Sr87/Sr86
0,1.012
1,1.012
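
For comparison, a rough plain-pandas equivalent of this select, modify and reassign pattern is sketched below. It assumes a small stand-in frame (`demo`, with made-up values) and a hand-written column list (`ratio_columns`); neither comes from pyrolite, and the selector above saves you from maintaining such a list yourself.

```python
import pandas as pd

# A small stand-in for the synthetic table above (values are illustrative only).
demo = pd.DataFrame(
    {
        "CaO": [26.8, 25.4],
        "Ni": [174.3, 202.8],
        "Sr87/Sr86": [0.7100, 0.7099],
    }
)

# Name the subset explicitly, then write the modified values back to the same columns.
ratio_columns = ["Sr87/Sr86"]
demo[ratio_columns] = demo[ratio_columns] / 0.7016
```

Only the named columns change; everything else in the frame is left untouched.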


------
### Listing Columns
If you just want the column names, it's easy enough to get those too (these are actually used in the indexing above):

In [11]:
df.pyrochem.list_oxides

['CaO', 'MgO', 'SiO2', 'FeO', 'Na2O']

In [12]:
df.pyrochem.list_elements

['Ni', 'Ti', 'La', 'Lu', 'Te']

In [13]:
df.pyrochem.list_REE

['La', 'Lu']

In [14]:
df.pyrochem.list_compositional

['CaO', 'MgO', 'SiO2', 'FeO', 'Na2O', 'Ni', 'Ti', 'La', 'Lu', 'Te']

In [15]:
df.pyrochem.list_isotope_ratios

['Sr87/Sr86']
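
To show how a name list can drive both selection and assignment, here is a toy dataframe accessor built with pandas' `register_dataframe_accessor`. This is a sketch under stated assumptions rather than pyrolite's actual code: the accessor name `toychem`, the class `ToyChemAccessor` and the short `KNOWN_OXIDES` list are all hypothetical.

```python
import pandas as pd

# Hypothetical, deliberately tiny list of recognised oxide names.
KNOWN_OXIDES = ["SiO2", "CaO", "MgO", "FeO", "Na2O"]


@pd.api.extensions.register_dataframe_accessor("toychem")
class ToyChemAccessor:
    """Toy accessor: a column-name list drives both selection and assignment."""

    def __init__(self, pandas_obj):
        self._obj = pandas_obj

    @property
    def list_oxides(self):
        # Keep the dataframe's own column order.
        return [c for c in self._obj.columns if c in KNOWN_OXIDES]

    @property
    def oxides(self):
        # Selecting is just indexing with the list of names.
        return self._obj[self.list_oxides]

    @oxides.setter
    def oxides(self, values):
        # Assigning writes the new values back to those same columns.
        self._obj[self.list_oxides] = values


toy = pd.DataFrame({"CaO": [26.8, 25.4], "Ni": [174.3, 202.8], "SiO2": [5.3, 5.8]})
oxide_names = toy.toychem.list_oxides
toy.toychem.oxides = toy.toychem.oxides * 2
```

Here `oxide_names` contains only the recognised oxide columns, in the order they appear in `toy`, and the assignment doubles those columns while leaving `Ni` as it was.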

----
<div class='alert alert-warning'> <b> <font size="+1">Checkpoint & Time Check</font></b><br>How are things going?</div>

----
