# Waveform Snippets

<div style="text-align: center;">
    <img src="./assets/2024-02-20_waveform.png" alt="waveform preview" width="450"/>
</div>

## Summary

Plotting hundreds or thousands of overlapping time-series snippets relative to a set of event times is important for pattern discovery in many domains, such as finance, sensor monitoring, and neuroscience. In neuroscience, this visualization approach is often used to reveal distinct patterns across action potential waveforms from different neurons. This workflow demonstrates how to display many overlapping action potential waveform snippets from multiple neurons in a single plot.

## Imports and config

<div class="admonition alert alert-info">
    <p class="admonition-title" style="font-weight:bold">Requirements</p>
    <p>This workflow notebook requires the <a href="./environment.yml">environment</a> specified in this workflow directory.</p>
</div>


In [None]:
import numpy as np
import pandas as pd

import hvplot.pandas
import holoviews as hv; hv.extension('bokeh')
from holoviews.operation.datashader import datashade, shade, dynspread, spread, rasterize
import datashader as ds
import panel as pn; pn.extension()


## Data Intake

Let's load a sample dataset of neural waveforms:

In [None]:
waves = pd.read_parquet("https://datasets.holoviz.org/waveform/v1/waveforms.parq")

In [None]:
waves.sample(10)

## Data Exploration

A sampling of this dataset suggests that it contains a lot of neural waveform snippets. To grasp its structure, we could use pandas utilities to examine the length of each waveform and the count of waveforms per neuron:

In [None]:
first_waveform = waves[(waves['Neuron'] == waves['Neuron'].unique()[0]) & (waves['Waveform'] == 0)]
print(f'Number of samples per waveform: {len(first_waveform)}')
waves.groupby('Neuron')['Waveform'].nunique().reset_index().rename(columns={'Waveform': '# Waveforms'})

Alternatively, we could just use hvPlot to visualize the data, applying `groupby` to `Neuron` and `Waveform` to create widgets that allow us to select values for those fields and see a single waveform at a time:

In [None]:
simple_waves_app = waves.hvplot.line(x='Time', y='Amplitude', line_width=2, grid=True, groupby=['Neuron', 'Waveform'])
simple_waves_app

However, what we really want is to overlay all the waveform snippets together in order to see if each individual neuron's waveforms form a consistent pattern, and whether the waveforms across the different neurons are distinguishable.

The first approach we can take is to use the `by` argument in hvPlot to color by Neuron and set the alpha (transparency) so we can see through the waveforms.

In [None]:
overlaid_waves = waves.hvplot.line(x='Time', y='Amplitude', line_width=2, grid=True, by=['Neuron'], alpha=0.5)
overlaid_waves

However, there are a couple of issues with this approach. First, sending **all** the data to the browser does not scale beyond a few thousand lines or a couple hundred thousand points. Second, this approach does not reveal the patterns of density. Had we instead tried `by=['Neuron', 'Waveform']`, the plot would have taken over a minute to display and produced the ugly and uninformative mess below, since it tries to assign a color to every Neuron and Waveform pairing (below is just a screenshot, to avoid making your browser angry).

<div style="text-align: center;">
    <img src="./assets/waveform_bad.png" alt="waveform bad" width="500"/>
</div>
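To judge whether a dataset is likely to run into the browser limits mentioned above, it can help to count the lines and points before plotting. The following is a minimal sketch on a synthetic long-form table (not the workflow's dataset); the `MAX_LINES` and `MAX_POINTS` thresholds are rough assumptions taken from the "few thousand lines or a couple hundred thousand points" rule of thumb, not fixed limits of Bokeh or hvPlot.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in table (an assumption): 3 neurons x 50 waveforms x 40 samples
rng = np.random.default_rng(0)
index = pd.MultiIndex.from_product(
    [range(3), range(50), range(40)], names=["Neuron", "Waveform", "Sample"]
)
toy = index.to_frame(index=False)
toy["Amplitude"] = rng.normal(size=len(toy))

# Each (Neuron, Waveform) pair would be drawn as one line in the browser
n_lines = toy.groupby(["Neuron", "Waveform"]).ngroups
n_points = len(toy)

# Hypothetical rule-of-thumb thresholds; tune for your own machine and browser
MAX_LINES = 5_000
MAX_POINTS = 200_000
too_big_for_browser = n_lines > MAX_LINES or n_points > MAX_POINTS

print(f"{n_lines} lines, {n_points} points, rasterize instead: {too_big_for_browser}")
```

If either count exceeds your threshold, server-side rasterization is probably the safer choice.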

With a substantial number of waveforms and multiple categories (neurons), clear visualization of data densities with a simple API becomes a challenge. Luckily, we can utilize Datashader, via the hvPlot interface, to reveal all the data.

But first, one caveat: Datashader needs each waveform to be separated from the next by a NaN row in the dataframe. The NaN row ensures each waveform is treated as an individual line, avoiding misleading connections between the end of one waveform and the start of the next, while still letting us color by neuron. Below, we can see one of these NaN rows at the end of the first waveform. These have been added to the sample dataset already, so we don't need to worry about that now.

In [None]:
first_waveform.tail(3)

<div class="alert alert-info">

<b>Note:</b> Work is already underway to avoid having to prepare your dataset with NaN-separators. Stay tuned!

</div>
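If your own data does not yet contain these separators, one way to add them is sketched below on a tiny made-up table. The helper name `add_nan_separators` is hypothetical, and the sketch assumes a long-form layout like the one above, where each separator row keeps its `Neuron` and `Waveform` labels and only the plotted columns are set to NaN.

```python
import numpy as np
import pandas as pd

# Made-up long-form table: 2 neurons x 2 waveforms x 3 samples each
rows = [
    {"Neuron": neuron, "Waveform": waveform, "Time": t, "Amplitude": float(i)}
    for neuron in ("A", "B")
    for waveform in range(2)
    for i, t in enumerate((0.0, 0.5, 1.0))
]
df = pd.DataFrame(rows)

def add_nan_separators(df, keys=("Neuron", "Waveform"), value_cols=("Time", "Amplitude")):
    """Append one NaN row after each group so consecutive lines are not joined."""
    pieces = []
    for _, group in df.groupby(list(keys), sort=False):
        separator = group.iloc[[-1]].copy()      # copy the last row to keep its labels
        separator[list(value_cols)] = np.nan     # blank out only the plotted columns
        pieces.append(pd.concat([group, separator]))
    return pd.concat(pieces, ignore_index=True)

separated = add_nan_separators(df)
print(separated.head(5))
```

Each group of three samples is now followed by a fourth row whose `Time` and `Amplitude` are NaN, which is the layout shown by `first_waveform.tail(3)` above.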

Now, all we need to do is specify that hvPlot should color by neuron and apply Datashader's `rasterize`. We'll also wrap this in a Panel `Row` so we can optionally display this figure outside of a Jupyter notebook by running `panel serve workflow_waveform-snippets.ipynb --show`.

In [None]:
pn.Row(waves.hvplot.line('Time', 'Amplitude', by='Neuron', rasterize=True,
                  xlabel='Time (ms)', ylabel='Amplitude (µV)', title="Waveform Snippets", 
                  cmap='viridis', colorbar=False, hover=True, line_width=1,
                  responsive=True, min_height=400, min_width=800)).servable()

Now, we can quickly visualize the densities of the individual neurons' waveforms and easily compare across neurons. Furthermore, if you are running this notebook locally with a live Python process (rather than on a website), you can zoom in and see how Datashader adjusts the resolution to the new plot ranges, as demonstrated below by setting `xlim` and `ylim` to mimic a zooming interaction:

In [None]:
waves.hvplot.line('Time', 'Amplitude', by='Neuron', rasterize=True,
                  xlabel='Time (ms)', ylabel='Amplitude (µV)', title="Waveform Snippets", 
                  cmap='viridis', colorbar=False, hover=True, line_width=1,
                  responsive=True, min_height=400, min_width=800, xlim=(0.6, 0.9), ylim=(-40, 40))