| [**Overview**](./00_overview.ipynb) | [Getting Started](./01_jupyter_python.ipynb) | **Examples:** | [Access](./02_accessing_indexing.ipynb) | [Transform](./03_transform.ipynb) | [Plotting](./04_simple_vis.ipynb) | [Norm-Spiders](./05_norm_spiders.ipynb) | [Minerals](./06_minerals.ipynb) | **Workflows:** | [lambdas](./07_lambdas.ipynb) | [CIPW](./08_CIPW_Norm.ipynb)  | [ML](./11_geochem_ML.ipynb) | [Spatial Data](./12_spatial_geochem.ipynb) |
| -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |

# Accessing and Indexing Geochemical Data with pyrolite


The `pyrolite.pyrochem` API provides access to indexing and transformation functions. This allows easy subsetting of geochemical datasets, which can otherwise be unwieldy (especially as the number of columns increases). To provide a simple illustration we generate a synthetic dataset to work from, which contains an array of typical geochemical measures - oxide components, element components (here as ppm), element ratios and isotope ratios. While a dataset of this size is manageable, the indexing tools pyrolite provides make it straightforward to pull out different parts of it.

A key thing to note here is that `pyrolite` provides a range of built-in functionality around known elements, oxides and isotope ratios. Typically, this means having column names which are the names of elements, oxide components or isotope ratios in the form $^{xx}A/^{xy}B$. It also means you'll need to keep track of units etc. yourself - which is where comments can be useful. This aspect may change in the future, with some ability to deal with units, non-standard column names and potentially even molecular ions (e.g. for hydrogeochemistry).
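To make the idea of name-based column recognition concrete, here is a minimal standard-library sketch of how column names could be sorted into elements, oxides and isotope ratios. It is only an illustration and not pyrolite's implementation: the helper `classify_column`, the short `ELEMENTS` set and both regular expressions are assumptions made for this example.

```python
import re

# Illustrative vocabulary only (an assumption for this sketch);
# pyrolite ships much more complete lists of elements and oxides.
ELEMENTS = {"Ca", "Mg", "Si", "Fe", "Na", "Ni", "Ti", "La", "Lu", "Te", "Y", "Sr"}

# One element symbol, an optional count, then oxygen with an optional count.
OXIDE = re.compile(r"^([A-Z][a-z]?)(\d*)O(\d*)$")

# Two isotope labels separated by "/", written either as 87Sr or as Sr87.
ISOTOPE = r"(?:\d+[A-Z][a-z]?|[A-Z][a-z]?\d+)"
ISOTOPE_RATIO = re.compile(rf"^{ISOTOPE}/{ISOTOPE}$")


def classify_column(name):
    """Return a rough category for a column name (hypothetical helper)."""
    if name in ELEMENTS:
        return "element"
    match = OXIDE.match(name)
    if match and match.group(1) in ELEMENTS:
        return "oxide"
    if ISOTOPE_RATIO.match(name):
        return "isotope_ratio"
    return "other"


columns = ["CaO", "MgO", "SiO2", "FeO", "Na2O", "Ni", "Ti", "La", "Lu", "Te", "Y", "Sr87/Sr86"]

# Group the column names by category, preserving their original order.
groups = {}
for column in columns:
    groups.setdefault(classify_column(column), []).append(column)

print(groups)
```

A name that matches none of the patterns (for example a sample ID or a column with units in its name, such as `"Ni_ppm"`) falls into `"other"`, which is why sticking to plain element, oxide and isotope ratio names matters.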

First let's create some data to play with:

In [None]:
import numpy as np
import pandas as pd
import pyrolite.geochem
from synthdata import get_synthetic_data

pd.options.display.precision = 3  # smaller graphical outputs

# create a dataframe which contains some oxides and elements
df = get_synthetic_data(
    columns=["CaO", "MgO", "SiO2", "FeO", "Na2O", "Ni", "Ti", "La", "Lu", "Te", "Y"],
    isotopes=True,
)
df

Unnamed: 0,CaO,MgO,SiO2,FeO,Na2O,Ni,Ti,La,Lu,Te,Y,Sr87/Sr86
0,4.664,0.813,4.929,4.02,8.933,122.687,234.151,33.175,39.962,291.972,44.472,0.71
1,4.45,0.68,4.885,3.851,9.114,122.604,226.745,32.907,37.888,304.841,45.203,0.71
2,5.529,1.032,5.656,4.351,9.128,116.602,237.472,31.198,42.361,266.793,48.618,0.71
3,5.952,0.818,5.268,3.922,9.307,125.765,214.353,30.167,39.707,289.203,48.133,0.71
4,4.108,0.672,4.348,3.731,8.918,122.056,230.911,32.128,36.899,315.416,44.817,0.71
5,4.909,0.685,5.419,4.04,9.596,119.492,214.651,32.919,38.397,299.354,48.7,0.71
6,5.374,0.959,5.696,4.429,8.927,114.807,242.492,32.403,42.454,265.043,48.956,0.71
7,4.544,0.746,4.878,3.941,9.138,123.788,224.32,31.953,38.827,298.667,49.989,0.71
8,5.518,0.831,5.964,4.222,9.518,123.117,218.928,33.22,39.767,275.591,48.855,0.71
9,4.968,0.719,5.023,3.825,9.299,124.131,219.47,31.087,38.0,301.39,47.585,0.71


To save a bit of space, below we'll use `df.head(2)` to restrict the output to just the top two rows:

In [3]:
df.head(2)

Unnamed: 0,CaO,MgO,SiO2,FeO,Na2O,Ni,Ti,La,Lu,Te,Y,Sr87/Sr86
0,4.664,0.813,4.929,4.02,8.933,122.687,234.151,33.175,39.962,291.972,44.472,0.71
1,4.45,0.68,4.885,3.851,9.114,122.604,226.745,32.907,37.888,304.841,45.203,0.71


### Selecting Data

To select a subset of columns from the dataframe, the `pyrolite.pyrochem` API has a few built-in selectors - *note that these return DataFrames*, so the usual pandas methods can be chained after them:

In [None]:
df.head(2).pyrochem.oxides

In [None]:
df.head(2).pyrochem.elements

In [None]:
df.head(2).pyrochem.REE

In [None]:
df.head(2).pyrochem.REY

In [None]:
df.head(2).pyrochem.compositional

In [None]:
df.head(2).pyrochem.isotope_ratios

---
These selectors can also be used to re-assign modified subsets back to the dataframe - useful if you wanted to just process/scale/transform/standardise a subset of the data - e.g. normalising the strontium isotope ratios:

In [None]:
df.pyrochem.isotope_ratios /= 0.7016

In [None]:
df.head(2).pyrochem.isotope_ratios
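For comparison, the same kind of subset re-assignment can be sketched in plain pandas. This is an illustration of the idea rather than a description of how the selector works internally: `df_demo` is a small hand-built frame using two rows of the synthetic data, and `ratio_cols` is a hard-coded stand-in for the isotope ratio column list.

```python
import pandas as pd

# Small stand-in frame (an assumption for this sketch), so the snippet
# runs without pyrolite or the synthetic data helper.
df_demo = pd.DataFrame(
    {
        "CaO": [4.664, 4.450],
        "Ni": [122.687, 122.604],
        "Sr87/Sr86": [0.71, 0.71],
    }
)

# Hard-coded here; in the notebook this list would come from the dataframe itself.
ratio_cols = ["Sr87/Sr86"]

# Scale only the selected columns and write them back in place.
df_demo[ratio_cols] = df_demo[ratio_cols] / 0.7016

print(df_demo)
```

Only the selected columns change; the oxide and element columns are left untouched.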

------
### Listing Columns
If you *just want the column names* (to see what you have, or use them somewhere), it's easy enough to get those too (these are actually used in the indexing above):

In [None]:
df.pyrochem.list_oxides

In [None]:
df.pyrochem.list_elements

In [None]:
df.pyrochem.list_REE

In [None]:
df.pyrochem.list_compositional

In [None]:
df.pyrochem.list_isotope_ratios
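As a rough sketch of using these column lists "somewhere", the example below takes a list of oxide names and uses it with ordinary pandas indexing. The names `df_demo`, `oxide_cols`, `other_cols` and `oxide_total` are hypothetical, and `oxide_cols` is hard-coded as a stand-in for the list that `df.pyrochem.list_oxides` would provide.

```python
import pandas as pd

# Small stand-in frame (an assumption for this sketch).
df_demo = pd.DataFrame(
    {
        "CaO": [4.664, 4.450],
        "MgO": [0.813, 0.680],
        "Ni": [122.687, 122.604],
    }
)

# Stand-in for a list of oxide column names.
oxide_cols = ["CaO", "MgO"]

# Everything that is not an oxide, in original column order.
other_cols = [c for c in df_demo.columns if c not in oxide_cols]

# Row-wise sum over just the oxide columns.
oxide_total = df_demo[oxide_cols].sum(axis=1)

print(other_cols)
print(oxide_total)
```

Because the lists are plain Python lists of strings, they can be combined, filtered or passed to any pandas method that accepts column labels.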

----
<div class='alert alert-warning'> <font size="+1" color="black"><b> Checkpoint & Time Check</b><br>How are things going?</font></div>

----
