# <u>Recalculations & Fe Handling</u>

This notebook provides worked examples of the [`recalc()`](#recalculations), [`recalc_Fe()`](#recalc_fe), and [`iron_ratios()`](#iron_ratios) functions from the `lydwhitt-tools` package.

In [20]:
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import pandas as pd
import lydwhitt_tools as lwt
import numpy as np

In [21]:
df = pd.read_excel('example_data/Geoscore_filter_ExampleData.xlsx', sheet_name="Sheet1")
df2 = pd.read_excel('example_data/Geoscore_filter_ExampleData2.xlsx', sheet_name="Sheet1")

## Recalculations

#### <i>What are the recalculations and when would I use them?</i>

The `recalc()` function is a phase-specific recalculation tool that I use to standardise major element geochemical data before any interpretation or further processing. In practice, geochemical datasets are reported in many slightly different ways, with normalisation choices, iron reporting conventions, and phase-specific calculations applied inconsistently between studies. This function deals with that mess once, in a consistent way, so the data are directly comparable.

By recalculating compositions into phase-appropriate formats (for example cation fractions for liquids or atoms per formula unit for minerals), the function makes it straightforward to calculate commonly used petrological indices such as Mg# or An, and ensures those values are always derived in the same way. This avoids having to repeatedly redo the same manual recalculations when compiling new datasets or revisiting older ones, which is both time-consuming and a common source of error.

> <span style="color:#2c5aa0"><b>Research note:</b> I use this function as the starting point for almost all of my geochemical workflows. Having the data standardised at this stage makes later filtering, plotting, and interpretation much more robust, and reduces the risk of small but important mistakes creeping in when working with large literature compilations.</span>

#### <i>How the `recalc()` function works</i>

`recalc()` takes major element oxide data for a single phase and returns it in a common, analysis-ready format, so that data reported with different normalisation schemes, iron conventions, and phase-specific calculations can be compared directly between studies.

Rather than introducing new methodology, `recalc()` applies standard petrological recalculations that are widely used across the community, but packages them into a single, reproducible workflow. This avoids the need to repeatedly apply the same manual recalculations when compiling new datasets or revisiting older ones, and reduces the risk of small but important inconsistencies being introduced.

The function handles:
1) standardising iron reporting to a single total-iron column (<b>FeO<sub>T</sub></b>),  
2) applying phase-appropriate normalisation (cation fractions for liquids, atoms per formula unit for minerals),  
3) calculating commonly used petrological indices for each phase.

- **Input:** A <span style="color:#4a78a8"><b>DataFrame</b></span> containing major element oxide data for a single phase, and a phase label <span style="color:#2f7d32"><b>phase</b></span> (<b>"Liq"</b>, <b>"Plg"</b>, <b>"Cpx"</b>, or <b>"Ol"</b>). Optional inputs control whether liquid compositions are normalised anhydrous and whether intermediate molar values are retained.
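As an illustration of what anhydrous normalisation means here, the sketch below rescales a set of oxide columns so that each row sums to 100 wt%. It is not the package's own code: the helper name `normalise_anhydrous` and the choice of which columns to include are assumptions made for this example, and the input values are the first sample from the example data with an `_Liq` suffix applied throughout.

```python
import pandas as pd

def normalise_anhydrous(df, oxide_cols):
    """Rescale the listed oxide columns so each row sums to 100 wt%.

    Hypothetical helper for illustration; volatiles are excluded simply
    by leaving them out of `oxide_cols`.
    """
    out = df.copy()
    totals = out[oxide_cols].sum(axis=1)  # NaN entries are skipped by default
    out[oxide_cols] = out[oxide_cols].div(totals, axis=0) * 100
    return out

# One sample, major oxides only (wt%).
example = pd.DataFrame({
    "SiO2_Liq": [47.98], "TiO2_Liq": [1.95], "Al2O3_Liq": [17.51],
    "FeOt_Liq": [10.386454], "CaO_Liq": [11.59], "MgO_Liq": [6.21],
    "MnO_Liq": [0.16], "K2O_Liq": [0.32], "Na2O_Liq": [2.64],
})
oxides = list(example.columns)
normalised = normalise_anhydrous(example, oxides)
print(normalised.round(3))
```

Passing the oxide list explicitly keeps the normalisation choice visible, which matters because including or excluding minor oxides changes every normalised value slightly.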

#### Phase-specific behaviour

The behaviour of `recalc()` depends on the phase being analysed. Each phase is recalculated using conventions appropriate to how that phase is typically interpreted in petrology.

| Phase | Output format | Indices calculated |
|------|---------------|-------------------|
| Liquid (Liq) | Cation fractions | Mg# |
| Olivine (Ol) | Atoms per formula unit (apfu) | Fo, Fa, Mg# |
| Clinopyroxene (Cpx) | Atoms per formula unit (apfu) | Wo, En, Fs, Mg# |
| Plagioclase (Plg) | Atoms per formula unit (apfu) | An, Ab, Or |

These phase-specific recalculations are intentionally explicit rather than fully automated. The aim is to make the assumptions behind each recalculation clear, reproducible, and easy to audit, rather than hiding them inside opaque processing steps.
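To make the liquid row of the table concrete, here is a minimal sketch of a cation-fraction calculation followed by Mg#. It is written from the standard definitions rather than from the package source, so the function names, the oxide list, and the molar masses are assumptions. Mg# is computed with all iron treated as Fe<sup>2+</sup>, which is also an assumption of this sketch.

```python
# (molar mass in g/mol, cations per oxide formula unit)
OXIDES = {
    "SiO2": (60.084, 1),
    "TiO2": (79.866, 1),
    "Al2O3": (101.961, 2),
    "FeOt": (71.844, 1),
    "MnO": (70.937, 1),
    "MgO": (40.304, 1),
    "CaO": (56.077, 1),
    "Na2O": (61.979, 2),
    "K2O": (94.196, 2),
}

def cation_fractions(wt):
    """Convert oxide wt% (dict) to cation fractions that sum to 1."""
    # Moles of cations contributed by each oxide.
    moles = {ox: wt.get(ox, 0.0) / mass * n_cat
             for ox, (mass, n_cat) in OXIDES.items()}
    total = sum(moles.values())
    return {ox: m / total for ox, m in moles.items()}

def mg_number(cat):
    """Mg# = 100 * Mg / (Mg + Fe), with all iron taken as Fe2+."""
    return 100.0 * cat["MgO"] / (cat["MgO"] + cat["FeOt"])

# Normalised liquid composition of sample ID00376 (wt%).
liquid = {
    "SiO2": 48.589087, "TiO2": 1.974754, "Al2O3": 17.732282,
    "FeOt": 10.518306, "CaO": 11.737130, "MgO": 6.288833,
    "MnO": 0.162031, "K2O": 0.324062, "Na2O": 2.673514,
}
cat = cation_fractions(liquid)
print(round(cat["SiO2"], 4), round(mg_number(cat), 2))
```

The results can be compared against the `Si` and `Mg_num` columns that `recalc()` returns for the same sample.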

- **Output:** A <span style="color:#4a78a8"><b>DataFrame</b></span> containing the recalculated chemistry, including:
  - standardised oxide columns with phase suffixes preserved,
  - phase-appropriate compositional variables,
  - commonly used petrological indices for that phase.
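For a mineral phase, the compositional variables are atoms per formula unit and the indices are end-member proportions. The sketch below shows the idea for olivine on a 4-oxygen basis. It is an illustration under stated assumptions, not the package's implementation: the function names, the restricted oxide list, and the input composition (a made-up forsterite-rich olivine) are all hypothetical.

```python
# (molar mass in g/mol, cations per oxide, oxygens per oxide)
OXIDES = {
    "SiO2": (60.084, 1, 2),
    "FeOt": (71.844, 1, 1),
    "MnO": (70.937, 1, 1),
    "MgO": (40.304, 1, 1),
    "CaO": (56.077, 1, 1),
}
CATION = {"SiO2": "Si", "FeOt": "Fe", "MnO": "Mn", "MgO": "Mg", "CaO": "Ca"}

def olivine_apfu(wt, n_oxygen=4.0):
    """Atoms per formula unit, normalised to `n_oxygen` oxygens."""
    mol = {ox: wt.get(ox, 0.0) / OXIDES[ox][0] for ox in OXIDES}
    oxygens = sum(mol[ox] * OXIDES[ox][2] for ox in OXIDES)
    scale = n_oxygen / oxygens
    return {CATION[ox]: mol[ox] * OXIDES[ox][1] * scale for ox in OXIDES}

def forsterite(apfu):
    """Fo = 100 * Mg / (Mg + Fe), with all iron taken as Fe2+."""
    return 100.0 * apfu["Mg"] / (apfu["Mg"] + apfu["Fe"])

# Made-up olivine analysis (wt%), for illustration only.
olivine = {"SiO2": 40.8, "FeOt": 9.6, "MgO": 49.4}
apfu = olivine_apfu(olivine)
fo = forsterite(apfu)
print({k: round(v, 3) for k, v in apfu.items()}, round(fo, 1))
```

A cation sum close to 3 for 4 oxygens is a quick check that an analysis is stoichiometric olivine. In this two-component sketch, Fa is simply 100 minus Fo.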

> <span style="color:#2c5aa0"><b>Research note:</b> I use this function as the first step in almost all of my geochemical workflows. Standardising the data at this stage makes later filtering, plotting, and interpretation much more robust, and significantly reduces the potential for human error when working with large literature compilations.</span>

**Example usage:**
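- <span style="color:#4a78a8"><b>df</b></span> → the input DataFrame  
- <span style="color:#2f7d32"><b>"Liq"</b></span> → the phase label  
- <span style="color:#b03a2e"><b>anhydrous=True</b></span> → normalise the liquid compositions anhydrous  
- <span style="color:#b03a2e"><b>mol_values=True</b></span> → retain the intermediate molar values in the output  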

In [22]:
lwt.recalc(df, 'Liq', anhydrous=True, mol_values=True)

Unnamed: 0,Sample_ID,SiO2_Liq,TiO2_Liq,Al2O3_Liq,FeOt_Liq,CaO_Liq,MgO_Liq,MnO_Liq,K2O_Liq,Na2O_Liq,...,Si,Ti,Al,Fe,Mg,Ca,Na,K,Mn,Mg_num
0,ID00376,48.589087,1.974754,17.732282,10.518306,11.737130,6.288833,0.162031,0.324062,2.673514,...,0.452180,0.013826,0.194488,0.081863,0.087248,0.117033,0.048239,0.003847,0.001277,51.592082
1,ID00377,49.549051,3.650003,13.772816,14.974573,8.954399,4.787398,0.227479,0.651417,3.432864,...,0.469832,0.026038,0.153916,0.118749,0.067673,0.090974,0.063111,0.007880,0.001827,36.301077
2,ID00378,70.855184,0.412066,13.568014,4.521383,1.839220,0.261310,0.130655,2.904560,5.507609,...,0.657332,0.002876,0.148349,0.035079,0.003614,0.018282,0.099065,0.034376,0.001027,9.339918
3,ID00529,48.945380,3.263707,13.730079,12.698386,11.827103,6.353487,0.102311,0.388780,2.690768,...,0.461115,0.023132,0.152449,0.100049,0.089232,0.119385,0.049149,0.004673,0.000816,47.142499
4,ID00560,70.720272,0.698002,13.272610,4.532392,1.163336,0.634547,0.095182,3.257342,5.626317,...,0.654722,0.004862,0.144819,0.035092,0.008758,0.011540,0.100991,0.038471,0.000746,19.971968
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
114,ID05364,58.142040,1.993266,13.574654,12.030520,5.642476,2.034154,0.316878,1.451507,4.814504,...,0.547300,0.014116,0.150598,0.094708,0.028545,0.056909,0.087868,0.017431,0.002526,23.159611
115,ID00223,71.843266,0.297466,13.796287,4.390194,1.795056,0.307724,,2.882347,4.687660,...,0.670515,0.002089,0.151754,0.034267,0.004281,0.017950,0.084825,0.034319,,11.106787
116,ID02066,71.251429,0.413783,13.452996,4.244538,1.534025,0.282584,0.161476,2.876297,5.782871,...,0.659424,0.002881,0.146739,0.032853,0.003899,0.015212,0.103767,0.033960,0.001266,10.608523
117,ID02067,70.991305,0.412272,13.242996,4.815340,1.739589,0.180998,0.180998,2.674743,5.761759,...,0.658505,0.002877,0.144775,0.037355,0.002503,0.017289,0.103622,0.031652,0.001422,6.279471


## Recalc_Fe

#### <i>What are the iron recalculations and when would I use them?</i>

The `recalc_Fe()` function is a small but important standardisation step for major element datasets. Different studies report iron in different ways (for example as FeO, Fe<sub>2</sub>O<sub>3</sub>, Fe<sub>2</sub>O<sub>3t</sub>, or already as total iron), which makes it difficult to compare compositions directly or run consistent downstream calculations. This function deals with that inconsistency once, in a predictable way, so iron is always handled the same way across a compiled dataset.

In practice, `recalc_Fe()` ensures your dataframe contains a single total-iron column (<b>FeO<sub>T</sub></b>, stored as <b>FeOt</b>) by converting any available iron reporting format into a common FeO-equivalent value. This is useful because many petrological indices and plots assume iron is represented consistently, and small differences in iron handling can propagate into later interpretation.

> <span style="color:#2c5aa0"><b>Research note:</b> I use this function any time I compile literature datasets, because iron reporting is one of the most common sources of inconsistency between papers. Standardising iron early means later recalculations, filtering, and trend analysis behave much more consistently, and it reduces the risk of subtle errors creeping in when combining data from multiple sources.</span>

The `recalc_Fe()` function is a standardisation utility designed to ensure iron is represented consistently within major element geochemical datasets. In practice, iron is reported in a variety of formats across the literature (for example FeO, Fe<sub>2</sub>O<sub>3</sub>, Fe<sub>2</sub>O<sub>3t</sub>, or already as total iron), which makes direct comparison between datasets difficult and can introduce subtle inconsistencies into downstream calculations.

Rather than introducing new assumptions, this function applies standard stoichiometric conversions to express all iron as a single total-iron value (<b>FeO<sub>T</sub></b>, stored as <b>FeOt</b>). This provides a consistent basis for recalculation, plotting, filtering, and the calculation of petrological indices that depend on iron content.

- **Input:** A <span style="color:#4a78a8"><b>DataFrame</b></span> containing major element oxide data, with iron reported as any combination of  
  <span style="color:#2f7d32"><b>FeO</b></span>,  
  <span style="color:#2f7d32"><b>Fe<sub>2</sub>O<sub>3</sub></b></span>, or  
  <span style="color:#2f7d32"><b>Fe<sub>2</sub>O<sub>3t</sub></b></span>.

- **Output:** A <span style="color:#4a78a8"><b>DataFrame</b></span> with a single total-iron column  
  <span style="color:#2f7d32"><b>FeOt</b></span> (FeO-equivalent), suitable for use in recalculation, filtering, and plotting workflows. Raw iron oxide columns are removed to avoid double counting.
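The stoichiometric conversion behind this step can be sketched as follows. One mole of Fe<sub>2</sub>O<sub>3</sub> carries the same iron as two moles of FeO, so the weight conversion factor is 2 × 71.844 / 159.687 ≈ 0.8998. The code is a sketch under assumptions, not the package's source: the helper name `total_iron_as_feo`, the unsuffixed column names, and the rule that a total-Fe<sub>2</sub>O<sub>3</sub> column takes precedence over separate FeO and Fe<sub>2</sub>O<sub>3</sub> columns are all choices made for this example.

```python
import pandas as pd

M_FEO = 71.844     # g/mol
M_FE2O3 = 159.687  # g/mol
FE2O3_TO_FEO = 2 * M_FEO / M_FE2O3  # about 0.8998

def total_iron_as_feo(df):
    """Return a copy with a single FeOt column (FeO-equivalent wt%)."""
    out = df.copy()
    if "Fe2O3t" in out.columns:
        # All iron reported as Fe2O3: convert directly.
        feot = out["Fe2O3t"] * FE2O3_TO_FEO
    else:
        # Separate ferrous and ferric columns: convert the ferric part and add.
        feo = out["FeO"] if "FeO" in out.columns else 0.0
        fe2o3 = out["Fe2O3"] if "Fe2O3" in out.columns else 0.0
        feot = feo + fe2o3 * FE2O3_TO_FEO
    out["FeOt"] = feot
    # Drop the raw iron columns so iron is not counted twice later on.
    raw = [c for c in ("FeO", "Fe2O3", "Fe2O3t") if c in out.columns]
    return out.drop(columns=raw)

mixed = pd.DataFrame({"SiO2": [50.0], "FeO": [8.0], "Fe2O3": [2.0]})
print(total_iron_as_feo(mixed))
```

Samples with a missing FeO or Fe<sub>2</sub>O<sub>3</sub> value propagate as NaN in this sketch, which is worth checking for before relying on the combined column.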


**Example usage:**
- <span style="color:#4a78a8"><b>df2</b></span> → the input DataFrame  

In [23]:
lwt.recalc_Fe(df2)

Unnamed: 0,Sample_ID,SiO2,TiO2,Al2O3,FeOt,CaO,MgO,MnO,K2O,Na2O,P2O5,Total
0,ID00376,47.98,1.95,17.51,9.150966,11.59,6.21,0.16,0.32,2.64,0.22,99.02
1,ID00377,47.92,3.53,13.32,6.604532,8.66,4.63,0.22,0.63,3.32,0.47,97.09
2,ID00378,70.50,0.41,13.50,6.937458,1.83,0.26,0.13,2.89,5.48,0.08,99.64
3,ID00529,47.84,3.19,13.42,5.686736,11.56,6.21,0.10,0.38,2.63,0.21,98.44
4,ID00560,66.87,0.66,12.55,4.786936,1.10,0.60,0.09,3.08,5.32,0.06,94.74
...,...,...,...,...,...,...,...,...,...,...,...,...
114,ID05364,56.88,1.95,13.28,11.769384,5.52,1.99,0.31,1.42,4.71,0.86,100.00
115,ID00223,70.04,0.29,13.45,4.280000,1.75,0.30,,2.81,4.57,0.09,93.21
116,ID02066,70.60,0.41,13.33,4.205732,1.52,0.28,0.16,2.85,5.73,0.04,99.22
117,ID02067,70.60,0.41,13.17,4.788798,1.73,0.18,0.18,2.66,5.73,0.03,99.55


## Iron_ratios

#### <i>What are the iron ratio recalculations and when would I use them?</i>

The `iron_ratios()` function is a conversion tool for when you need to split total iron (<b>FeO<sub>T</sub></b>) into separate <b>FeO</b> and <b>Fe<sub>2</sub>O<sub>3</sub></b> components. Many datasets only report total iron, but some workflows (for example redox-sensitive calculations or tools that explicitly require FeO and Fe<sub>2</sub>O<sub>3</sub>) need iron to be partitioned into ferrous and ferric components. This function exists to make that split explicit and reproducible, rather than doing it manually or inconsistently between scripts.

In practice, you provide an Fe<sup>3+</sup>/Fe<sub>T</sub> ratio and the function uses that ratio to calculate the corresponding FeO and Fe<sub>2</sub>O<sub>3</sub> weight percent values in the liquid data. This allows you to keep your dataset internally consistent while still being able to run workflows that require a specific iron speciation assumption.

> <span style="color:#2c5aa0"><b>Research note:</b> I use this function when I need to run tools that explicitly depend on iron speciation, or when comparing datasets where some studies report FeO and Fe2O3 separately and others only report total iron. The important point is that the ratio you choose is an assumption, so I always record the value used and treat the resulting split as a controlled approximation rather than a measured quantity.</span>

The `iron_ratios()` function is a conversion tool for splitting total iron (<b>FeO<sub>T</sub></b>) into separate ferrous and ferric components. Many datasets report only total iron, but some workflows require explicit <b>FeO</b> and <b>Fe<sub>2</sub>O<sub>3</sub></b> values, for example when applying redox-sensitive models or tools that explicitly distinguish between Fe<sup>2+</sup> and Fe<sup>3+</sup>.

This function takes a user-defined Fe<sup>3+</sup>/Fe<sub>T</sub> ratio and applies it consistently to the liquid iron content, producing a reproducible split between FeO and Fe<sub>2</sub>O<sub>3</sub>. The key aim is to make the redox assumption explicit and traceable, rather than embedding it implicitly within later calculations.

- **Input:**  
  - A <span style="color:#4a78a8"><b>DataFrame</b></span> containing  
    <span style="color:#2f7d32"><b>FeOt_Liq</b></span>  
  - A scalar <span style="color:#b03a2e"><b>ratio</b></span> representing Fe<sup>3+</sup>/Fe<sub>T</sub>

- **Output:** A <span style="color:#4a78a8"><b>DataFrame</b></span> with liquid iron split into:
  - <span style="color:#2f7d32"><b>FeO_Liq</b></span>
  - <span style="color:#2f7d32"><b>Fe2O3_Liq</b></span>  
  along with intermediate diagnostic columns used in the conversion.
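The split can be sketched as a mass balance on elemental iron: convert FeO<sub>T</sub> to wt% Fe, partition it using the ratio, then convert each part back to its oxide. The code below is an illustrative reconstruction, not the package's source. The column names `Fe_wt`, `Fe3_wt`, and `Fe2_wt` follow the diagnostic columns in the example output, while the function name `split_total_iron` and the molar masses are assumptions.

```python
import pandas as pd

M_FE = 55.845      # g/mol
M_FEO = 71.844     # g/mol
M_FE2O3 = 159.687  # g/mol

def split_total_iron(df, ratio, col="FeOt_Liq"):
    """Split total iron (as FeO, wt%) into FeO and Fe2O3 for a given Fe3+/FeT."""
    out = df.copy()
    out["Fe_wt"] = out[col] * M_FE / M_FEO        # elemental Fe, wt%
    out["Fe3_wt"] = out["Fe_wt"] * ratio          # ferric share
    out["Fe2_wt"] = out["Fe_wt"] - out["Fe3_wt"]  # ferrous share
    out["Fe2O3_Liq"] = out["Fe3_wt"] * M_FE2O3 / (2 * M_FE)
    out["FeO_Liq"] = out["Fe2_wt"] * M_FEO / M_FE
    return out

sample = pd.DataFrame({"Sample_ID": ["ID00376"], "FeOt_Liq": [10.386454]})
split = split_total_iron(sample, 0.15)
print(split.round(4))
```

Because Fe<sub>2</sub>O<sub>3</sub> carries more oxygen per iron atom than FeO, the two new columns sum to slightly more than the original FeO<sub>T</sub> value. That is expected and is not a mass-balance error.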

> <span style="color:#2c5aa0"><b>Research note:</b> I use this function when a workflow explicitly requires iron speciation rather than total iron. The Fe<sup>3+</sup>/Fe<sub>T</sub> ratio is always an assumption, so I treat the resulting split as a controlled approximation and record the ratio used alongside any interpretation.</span>

**Example usage:**
- <span style="color:#4a78a8"><b>df</b></span> → the input DataFrame  
- <span style="color:#b03a2e"><b>0.15</b></span> → example Fe<sup>3+</sup>/Fe<sub>T</sub> ratio  

In [24]:
lwt.iron_ratios(df, 0.15)

Unnamed: 0,Sample_ID,SiO2_Liq,TiO2_Liq,Al2O3_Liq,FeOt_Liq,CaO_Liq,MgO_Liq,MnO,K2O_Liq,Na2O_Liq,P2O5,Total,Fe_wt,Fe3_wt,Fe2_wt,Fe2O3_Liq,FeO_Liq
0,ID00376,47.98,1.95,17.51,10.386454,11.59,6.21,0.16,0.32,2.64,0.22,99.02,8.073486,1.211023,6.862463,1.731441,8.828486
1,ID00377,47.92,3.53,13.32,14.482246,8.66,4.63,0.22,0.63,3.32,0.47,97.09,11.257183,1.688577,9.568605,2.414217,12.309909
2,ID00378,70.50,0.41,13.50,4.498718,1.83,0.26,0.13,2.89,5.48,0.08,99.64,3.496895,0.524534,2.972361,0.749944,3.823910
3,ID00529,47.84,3.19,13.42,12.411606,11.56,6.21,0.10,0.38,2.63,0.21,98.44,9.647655,1.447148,8.200507,2.069037,10.549865
4,ID00560,66.87,0.66,12.55,4.285632,1.10,0.60,0.09,3.08,5.32,0.06,94.74,3.331261,0.499689,2.831572,0.714423,3.642787
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
114,ID05364,56.88,1.95,13.28,11.769384,5.52,1.99,0.31,1.42,4.71,0.86,100.00,9.148450,1.372268,7.776183,1.961978,10.003976
115,ID00223,70.04,0.29,13.45,4.280000,1.75,0.30,,2.81,4.57,0.09,93.21,3.326883,0.499032,2.827851,0.713484,3.638000
116,ID02066,70.60,0.41,13.33,4.205732,1.52,0.28,0.16,2.85,5.73,0.04,99.22,3.269154,0.490373,2.778781,0.701103,3.574872
117,ID02067,70.60,0.41,13.17,4.788798,1.73,0.18,0.18,2.66,5.73,0.03,99.55,3.722377,0.558356,3.164020,0.798301,4.070478
