# grainanalyser
> A workflow for reading grain-size distributions from the lab. Grain sizes are treated as compositional data and are therefore processed with Aitchison's log-ratio approach.

## Prerequisites

* works on only one directory at a time
* works only with CSV files
* works only with files in the laser scanner format

In [None]:
%load_ext lab_black

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import composition_stats as comp
from grainalyzer import grainalyzer

In [None]:
pd.set_option("display.max_rows", 15)  # set to None to view all rows

### Data Wrangling

In [None]:
filepath = "Data/Test_GS-020.csv"
grainalyzer.extract_depth(filepath)
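
The call above presumably derives the sample depth from the file name. How `grainalyzer.extract_depth` does this is not shown here, so the following is only a minimal sketch under the assumption that the depth is the trailing number of the file stem (`Test_GS-020` gives 20). The helper `depth_from_filename` is hypothetical and not part of grainalyzer.

```python
import re
from pathlib import Path


def depth_from_filename(path):
    """Hypothetical helper: read the trailing digits of the file stem as depth."""
    stem = Path(path).stem  # "Data/Test_GS-020.csv" -> "Test_GS-020"
    match = re.search(r"(\d+)$", stem)
    if match is None:
        raise ValueError(f"no trailing number found in file name: {path}")
    return int(match.group(1))  # leading zeros are dropped by int()


depth = depth_from_filename("Data/Test_GS-020.csv")
```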

In [None]:
filepath = "Data/Test*.csv"
grainsizes = grainalyzer.read_gs_to_df(filepath)

grainsizes_prep = grainalyzer.cut_off_zeros(grainsizes)

### Convert Grainsize to Krumbein Phi Scale

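$$\phi = -\log_2\left(\frac{D}{D_0}\right)\text{,}$$

where $D$ is the grain diameter and $D_0$ is the reference diameter of 1 mm.

> The phi values are needed later for classification.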

In [None]:
grainsizes_prep["gs_phi"] = grainalyzer.diameter_2_krumbein_phi(
    channelwidth=grainsizes_prep["Kanaldurchmesser_unten_um"], unit="um"
)
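
The formula above can be checked by hand with plain NumPy. This is a sketch, not the implementation of `grainalyzer.diameter_2_krumbein_phi`. The function name `diameter_to_phi` and its `d0_um` parameter are assumptions made for illustration, and diameters are taken to be in micrometres.

```python
import numpy as np


def diameter_to_phi(diameter_um, d0_um=1000.0):
    """Krumbein phi from a diameter in micrometres; d0_um is the 1 mm reference."""
    diameter_um = np.asarray(diameter_um, dtype=float)
    return -np.log2(diameter_um / d0_um)


# 2 mm, 1 mm, 62.5 um and 3.90625 um
phi = diameter_to_phi([2000.0, 1000.0, 62.5, 3.90625])
```

Coarser grains get smaller (eventually negative) phi values, and each halving of the diameter raises phi by one.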

***

### clr on the Vol_% column

> The clr (centred log-ratio) transform is computed here with `composition_stats`, for all aliquot measurements.

> Zeros at the margins of the distribution have already been cut off, so only zeros inside the distribution still cause problems.

> These are replaced using `multiplicative_replacement`.

> Closure: all values sum to 1.



In [None]:
grainsizes_clr = grainalyzer.gs_simplex_2_rplus(dataframe=grainsizes_prep, depth_colum="depth")
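
The three steps named above (closure, zero replacement, clr) can be sketched in plain NumPy for a single measurement. This is an illustration of the general technique, not the code inside `grainalyzer.gs_simplex_2_rplus` or `composition_stats`. The function names and the replacement value `delta=1e-4` are assumptions chosen for the example.

```python
import numpy as np


def closure(x):
    """Rescale a vector of positive parts so that it sums to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum()


def replace_zeros(x, delta=1e-4):
    """Multiplicative zero replacement: zeros become delta, the other
    parts are shrunk so that the total stays at 1."""
    x = closure(x)
    zeros = x == 0
    return np.where(zeros, delta, x * (1 - delta * zeros.sum()))


def clr(x):
    """Centred log-ratio: log of each part minus the mean of the logs."""
    logx = np.log(x)
    return logx - logx.mean()


vol_percent = np.array([10.0, 0.0, 30.0, 60.0])  # one zero inside the curve
composition = replace_zeros(vol_percent)
clr_values = clr(composition)
```

After the transform the clr values sum to zero, which is a quick sanity check on any clr output.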

### Summarize the subsamples into one mean curve!

In [None]:
grainsizes_summarize = grainalyzer.mean_curves_clr(dataframe=grainsizes_clr, depth_colum="depth")
grainsizes_summarize
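
Averaging in clr space and transforming back is equivalent to taking the closed geometric mean of the aliquots. The sketch below shows this for three made-up aliquots with three size classes each. It is not the implementation of `grainalyzer.mean_curves_clr`, and the use of the sample standard deviation (`ddof=1`) is an assumption.

```python
import numpy as np


def clr(x):
    """Row-wise centred log-ratio transform."""
    logx = np.log(x)
    return logx - logx.mean(axis=-1, keepdims=True)


def clr_inv(z):
    """Inverse clr: exponentiate and close to a sum of 1."""
    expz = np.exp(z)
    return expz / expz.sum(axis=-1, keepdims=True)


# Illustrative data only: rows are aliquots, columns are size classes
aliquots = np.array(
    [
        [0.2, 0.3, 0.5],
        [0.1, 0.4, 0.5],
        [0.3, 0.3, 0.4],
    ]
)

clr_mean = clr(aliquots).mean(axis=0)
clr_std = clr(aliquots).std(axis=0, ddof=1)  # assumed: sample standard deviation
mean_curve = clr_inv(clr_mean)  # back on the simplex, sums to 1
```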

### Save to csv

In [None]:
filepath = "Data/grainsizes_summarize.csv"
grainsizes_summarize.to_csv(filepath, index=False)

**********
## Plotting

> Plot all mean curves in one figure, colored by depth with the viridis colormap.


In [None]:
from matplotlib import cm

In [None]:
n = len(pd.unique(grainsizes_summarize["depth"]))  # number of curves, one per depth
color = cm.viridis(np.linspace(0, 1, n))
fig, ax = plt.subplots(1, 1, figsize=(15, 10))

for depth, c in zip(
    pd.unique(grainsizes_summarize["depth"]), color
):  # iterate over all depths and colors --> same lengths!
    plot_curve = grainsizes_summarize.loc[grainsizes_summarize["depth"] == depth]
    # Back-transform the clr mean to volume percent and plot it
    plt.plot(
        plot_curve["gs_phi"],
        comp.clr_inv(plot_curve["Vol_clr_mean"]) * 100,
        label=f"{depth}cm",
        color=c,
    )
    # Add bands: mean ± one standard deviation in clr space, back-transformed
    plt.fill_between(
        plot_curve["gs_phi"],
        comp.clr_inv((plot_curve["Vol_clr_mean"] - plot_curve["Vol_clr_std"])) * 100,
        comp.clr_inv((plot_curve["Vol_clr_mean"] + plot_curve["Vol_clr_std"])) * 100,
        color=c,
        alpha=0.1,
    )

# Draw the legend once, after all curves have been added
plt.legend(loc="lower center", ncol=2, bbox_to_anchor=(1, 0.2))

plt.title("mean grain size and confidence bands")
plt.xlabel(r"Grain size $\phi$")  # raw string so that \p is not read as an escape
plt.ylabel("Volume [%]")
plt.xlim(0, 16)
plt.show()