# Developing Baseline Correction


2023-08-29 16:34:00

An investigation into the different baseline correction methods available in [pybaselines](https://pybaselines.readthedocs.io/en/latest/index.html).


In [None]:
# setup

%load_ext autoreload
%autoreload 2

import pandas as pd
from wine_analysis_hplc_uv import definitions
from wine_analysis_hplc_uv.old_signal_processing.signal_processor import (
    SignalProcessor,
)
from pybaselines import Baseline
import matplotlib.pyplot as plt

scipro = SignalProcessor()
df = pd.read_parquet(definitions.XPRO_YPRO_DOWNSAMPLED_PARQ_PATH)
df

## iasls


The current routine is to apply `.iasls` to calculate the baseline:


In [None]:
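def iaslsblinefunc(df: pd.DataFrame) -> pd.DataFrame:
    """Fit an iasls baseline to `value`; add `bline` and `blinesub` columns."""
    # x values in seconds, taken from the "mins" index level
    x = df.index.get_level_values("mins").total_seconds()
    # pybaselines methods return (baseline, params); keep only the baseline
    bline = Baseline(x_data=x, assume_sorted=True).iasls(df["value"])[0]
    # blinesub is the baseline-subtracted signal
    return df.assign(bline=bline).assign(blinesub=lambda d: d["value"] - d["bline"])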


(
    df.stack(["samplecode", "wine"])
    .groupby(["samplecode"], group_keys=False)
    .apply(iaslsblinefunc)
    .unstack(["samplecode", "wine"])
    .pipe(lambda df: df if df.pipe(scipro.vars_subplots) else df)
)
plt.suptitle("iasls")
plt.show()

However, the resulting fit is poor. Does another method do better?


## asls


In [None]:
(
    df.pipe(scipro.baseline_correction).pipe(
        lambda df: df if df.pipe(scipro.vars_subplots) else df
    )
)
plt.suptitle("asls")
plt.show()

Much better. Why? It appears that the default settings for `.asls` allow for a rougher fit.
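
To get a feel for what a "rougher" fit means, here is a minimal sketch of the asymmetric least squares (AsLS) idea, written with dense NumPy matrices. It is not the pybaselines implementation and does not reproduce the data above: `asls_sketch`, `roughness`, and the synthetic signal are hypothetical names and values chosen for illustration. The parameter names `lam` (smoothness penalty) and `p` (asymmetry weight) mirror those of pybaselines' `asls`. Lowering `lam` should let the baseline bend more.

```python
import numpy as np


def asls_sketch(y, lam=1e6, p=1e-2, n_iter=20):
    """Dense-matrix AsLS baseline. Illustrative only, not pybaselines' code."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # Second-order difference operator, shape (n - 2, n).
    d2 = np.diff(np.eye(n), n=2, axis=0)
    penalty = lam * (d2.T @ d2)
    w = np.ones(n)
    z = y.copy()
    for _ in range(n_iter):
        # Weighted least squares fit with a roughness penalty on z.
        z = np.linalg.solve(np.diag(w) + penalty, w * y)
        # Points above the baseline (peaks) get the small weight p,
        # points below it get 1 - p, so the fit hugs the signal from below.
        w = np.where(y > z, p, 1.0 - p)
    return z


def roughness(z):
    """Sum of squared second differences: larger means a more flexible curve."""
    return float(np.sum(np.diff(z, n=2) ** 2))


# Synthetic chromatogram-like signal: curved drift plus one Gaussian peak.
x = np.linspace(0.0, 1.0, 200)
drift = 2.0 * x**2 + 0.5 * np.sin(4 * x)
peak = 5.0 * np.exp(-0.5 * ((x - 0.5) / 0.02) ** 2)
y = drift + peak

stiff = asls_sketch(y, lam=1e6)  # heavy smoothing penalty
flexible = asls_sketch(y, lam=1e2)  # light smoothing penalty

print(f"roughness, lam=1e6: {roughness(stiff):.3e}")
print(f"roughness, lam=1e2: {roughness(flexible):.3e}")
```

If the difference between the two plots above really does come down to defaults, comparing the `lam` and `p` values that each routine ends up using would be the first thing to check.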