# SciPy

[Introduction — SciPy v1.10.1 Manual](https://docs.scipy.org/doc/scipy/tutorial/general.html)

Should focus on [Signal processing (scipy.signal) — SciPy v1.10.1 Manual](https://docs.scipy.org/doc/scipy/reference/signal.html#module-scipy.signal).

A lot of the modules are outside of my scope. [scipy.signal.find_peaks](https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.find_peaks.html#scipy.signal.find_peaks) may be useful. It requires a 1-D array and returns the indices of all local maxima, optionally filtered by conditions such as `height` and `distance`.
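
To get a feel for what `find_peaks` returns before pointing it at real data, here is a minimal sketch on a made-up array. The values in `signal` are invented for illustration and have nothing to do with the chromatogram loaded below.

```python
import numpy as np
from scipy.signal import find_peaks

# Made-up 1-D signal with local maxima at indices 1, 4 and 7
signal = np.array([0.0, 2.0, 0.0, 1.0, 5.0, 1.0, 0.0, 3.0, 0.0])

# With no conditions, the index of every local maximum is returned
all_peaks, _ = find_peaks(signal)

# `height` keeps only the peaks at or above the threshold;
# the second return value is a dict of properties for the kept peaks
tall_peaks, props = find_peaks(signal, height=2.5)
tall_heights = props["peak_heights"]

print(all_peaks, tall_peaks, tall_heights)
```

Note that the function returns a tuple: the peak indices first, then a properties dict whose keys depend on which conditions were passed.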

Load some sample data:

In [None]:
%load_ext autoreload
%autoreload 2

import os
import sys

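# __file__ is not defined inside a notebook, so resolve the project root
# from the working directory (assumes the notebook runs from its own folder)
sys.path.insert(0, os.path.abspath(os.path.join(os.getcwd(), "..")))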

from pathlib import Path

import pandas as pd

from scripts.data_interface import retrieve_uv_data

import rainbow as rb

In [None]:
p = Path("/Users/jonathan/0_jono_data/2023-02-22_2021-DEBORTOLI-CABERNET-MERLOT_HALO.D")

print(p)

uv_data = retrieve_uv_data(rb.read(str(p)))

In [None]:
# plot_3d_line(uv_data, plot_title="whateve")

In [None]:
from scripts.data_manipulators import df_windower

uv_data_2 = df_windower(uv_data, "nm", 220, 250)

uv_data_2.head()

In [None]:
# plot_3d_line(uv_data_2, plot_title = "whatevs")

In [None]:
uv_data_2["mins"] = uv_data_2["mins"].round(5)

uv_data_2 = uv_data_2.set_index("mins")

In [None]:
uv_data_2.head()

So we can see that 222 nm has the highest absorbance, at 345.40 mAU, as expected. The index (time) of that value is:

In [None]:
# pick the wavelength column with the largest maximum, then the time at which it peaks
time_max_222 = uv_data_2[uv_data_2.max().idxmax()].idxmax()

print(f"The time for the max value of 222nm is {time_max_222}")
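
The lookup above chains two `idxmax` calls, which is easy to misread. A small sketch of the same idea on a hypothetical frame: `df` stands in for `uv_data_2` (time index, one column per wavelength) and its values are invented.

```python
import pandas as pd

# Hypothetical stand-in for uv_data_2: rows are times (mins), columns are wavelengths (nm)
df = pd.DataFrame(
    {220: [1.0, 4.0, 2.0], 222: [2.0, 9.0, 3.0], 224: [1.5, 6.0, 2.5]},
    index=pd.Index([0.1, 0.2, 0.3], name="mins"),
)

# df.max() gives the maximum of each column; idxmax() on that picks the column label
nm_of_max = df.max().idxmax()

# idxmax() on the chosen column gives the index label (time) of its maximum
time_of_max = df[nm_of_max].idxmax()

print(nm_of_max, time_of_max)
```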

So let's try peak identification on 222 nm.

In [None]:
data_222 = uv_data_2[222]

data_222.head()

In [None]:
data_222.plot()

In [None]:
type(p)

In [None]:
import plotly.graph_objs as go

from scipy.signal import find_peaks


def peak_plot(data: pd.DataFrame, nm: int, plot_title: str):
    # find_peaks returns the peak indices and a dict of peak properties
    peak_idx, peak_props = find_peaks(data[nm], height=50, distance=50)

    # chromatogram trace: time (index) against absorbance
    cx = data[nm].index.values

    cy = data[nm].values

    # peak markers: the same series sampled at the peak indices
    px = data.index.values[peak_idx]

    py = data[nm].values[peak_idx]

    fig = go.Figure()

    # use the plot_title argument rather than the global p
    fig.update_layout(title=f"{plot_title}, {nm}")

    peak_trace = go.Scatter(x=px, y=py, mode="markers", name="peaks")

    chrom_trace = go.Scatter(x=cx, y=cy, mode="lines", name="chromatogram")

    fig.add_trace(chrom_trace)

    fig.add_trace(peak_trace)

    fig.show()


peak_plot(uv_data_2, 248, p.name)

A great start. However, it quickly became apparent that without a way to distinguish total peak height from peak height relative to the baseline, I had very little analytical functionality. A quick search turned up [scipy.signal.peak_prominences](https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.peak_prominences.html).
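
To make the height versus prominence distinction concrete, here is a minimal sketch on an invented signal in which a tall peak has a small shoulder beside it. The numbers are made up for illustration only.

```python
import numpy as np
from scipy.signal import find_peaks, peak_prominences

# Invented signal: a small isolated peak (index 1), a tall peak (index 4)
# and a shoulder on the tall peak's flank (index 6)
signal = np.array([0.0, 3.0, 0.0, 8.0, 10.0, 8.0, 9.0, 8.0, 0.0])

peaks, _ = find_peaks(signal)

# Total height: the raw signal value at each peak
heights = signal[peaks]

# Prominence: how far each peak rises above its lowest contour line
prominences, left_bases, right_bases = peak_prominences(signal, peaks)

print(peaks, heights, prominences)
```

The shoulder at index 6 is nearly as tall as the main peak, yet its prominence is smaller than that of the isolated peak at index 1, which is the behaviour a height threshold alone cannot capture.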

In [None]:
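from scipy.signal import find_peaks, peak_prominences

# nm and peak_idx were local to peak_plot, so recompute them here
nm = 248
peak_idx, _ = find_peaks(uv_data_2[nm], height=50, distance=50)

prominences = peak_prominences(uv_data_2[nm], peak_idx)
prominences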

It has returned a tuple of three arrays: the prominences, and each peak's left and right 'bases' (as indices), where the higher base of each pair is the peak's lowest contour line. With this measurement, we could determine the optimal signal wavelength by choosing the nm whose prominence array has the highest sum.
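
A rough sketch of that wavelength selection, assuming a frame shaped like `uv_data_2` (one column per wavelength). The frame `df` and the helper `total_prominence` are hypothetical names introduced here, the values are invented, and no `height` or `distance` filtering is applied.

```python
import pandas as pd
from scipy.signal import find_peaks, peak_prominences

# Hypothetical stand-in for uv_data_2: one column per wavelength (nm)
df = pd.DataFrame(
    {
        220: [0.0, 2.0, 0.0, 1.0, 0.0],
        222: [0.0, 6.0, 0.0, 4.0, 0.0],
        224: [0.0, 3.0, 0.0, 3.0, 0.0],
    }
)


def total_prominence(series: pd.Series) -> float:
    """Sum the prominences of every peak found in one wavelength's signal."""
    values = series.to_numpy()
    peaks, _ = find_peaks(values)
    prominences, _, _ = peak_prominences(values, peaks)
    return float(prominences.sum())


# apply() runs the helper once per column, giving one total per wavelength
totals = df.apply(total_prominence)

# The wavelength with the largest summed prominence
best_nm = totals.idxmax()

print(totals, best_nm)
```

On real data the sum would probably be dominated by noise peaks unless `find_peaks` is given sensible thresholds first, so treat this as a starting point rather than a finished method.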

That's enough for this notebook.