# Exploring the AIND-ephys outputs using SpikeInterface

### Notebook usage:
- This notebook lets you visualize and explore the spike sorting results.
- Without the curation GUI, exploring large datasets this way can be quite slow.
- This notebook assumes some comfort with spike sorting and electrophysiology data. In addition, a basic understanding of [SpikeInterface](https://spikeinterface.readthedocs.io/en/latest/index.html) is helpful.

#### Requirements:
- Processed AINDS Neuropixels data
- An installation of SpikeInterface. If it is not installed, install it with the following command:
```bash
pip install "spikeinterface[full, widgets]"
```

**Note**: This notebook is based on the latest version of SpikeInterface (`spikeinterface==0.101.0`), which is under development. The API may change in the future and *is* different from the version used in the AINDS pipeline (`spikeinterface==0.100.8`). We have adapted the notebook to work with the latest version because it brings significant improvements to the API and functionality.

In [None]:
#import packages
import os
import matplotlib.pyplot as plt
import spikeinterface as si
import spikeinterface.extractors as se
import spikeinterface.postprocessing as spost
import spikeinterface.widgets as sw
from spikeinterface.curation import apply_sortingview_curation
from spikeinterface.widgets import plot_sorting_summary

In [None]:
#Fetch data directories

raw_rec = 'path/to/raw/recording'
baseFolder = r"C:\Users\janet\Documents\Tom_AINDS_output" #edit this to the location of your data
experiment = 'block0_imec0.ap_recording1_group0' #edit this to the name of your experiment folder

preProcessed = os.path.join(baseFolder, 'preprocessed')
postProcessed = os.path.join(baseFolder, 'postprocessed')
spikes = os.path.join(baseFolder, 'spikesorted')
curated = os.path.join(baseFolder, 'curated')
preJSON = os.path.join(preProcessed, experiment + '.json')
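
The folder layout used above can also be checked before loading anything. The following is a minimal sketch, assuming the same subfolder names as in the cell above; the helper names `build_paths` and `missing_entries` are hypothetical and not part of SpikeInterface or the pipeline.

```python
from pathlib import Path


def build_paths(base_folder, experiment):
    """Return the expected output paths for one experiment (hypothetical helper)."""
    base = Path(base_folder)
    return {
        # The preprocessed recording is referenced through a JSON file
        "preprocessed_json": base / "preprocessed" / f"{experiment}.json",
        "postprocessed": base / "postprocessed" / experiment,
        "spikesorted": base / "spikesorted" / experiment,
        "curated": base / "curated" / experiment,
    }


def missing_entries(paths):
    """Return the names of the entries that do not exist on disk."""
    return [name for name, path in paths.items() if not path.exists()]


# Placeholder base folder; replace it with the location of your data
paths = build_paths("path/to/AINDS_output", "block0_imec0.ap_recording1_group0")
for name, path in paths.items():
    print(f"{name}: {path}")
print("Missing:", missing_entries(paths))
```

Running a check like this first makes a typo in the base folder or experiment name show up as a clear list of missing paths rather than as a loading error later on.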

In [None]:
#Select the data to explore

data_load = curated
print(f'Set path: {data_load}')

## First, let's load the waveform extractor. We'll explore the postprocessed units, which are stored in the `postprocessed` folder. These units have been processed to include the following:
* removal of duplicate units
* computed amplitudes
* spike/unit locations 
* PCA
* correlograms
* template similarity
* template metrics
* QC metrics

## The `curated` folder includes units that *have been* automatically curated by:
* ISI violation ratio
* presence ratio
* amplitude cutoff
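
To make the idea of threshold-based curation concrete, here is a minimal sketch of how such a filter could work. The threshold values and the toy metric values are hypothetical and chosen only for illustration; check the pipeline configuration for the values that were actually applied. The metric keys are assumed to match the column names of the quality metrics table.

```python
# Hypothetical thresholds, for illustration only
THRESHOLDS = {
    "isi_violations_ratio": ("<", 0.5),
    "presence_ratio": (">", 0.8),
    "amplitude_cutoff": ("<", 0.1),
}


def passes_qc(metrics, thresholds=THRESHOLDS):
    """Return True if every metric of one unit satisfies its threshold."""
    for name, (op, limit) in thresholds.items():
        value = metrics[name]
        ok = value < limit if op == "<" else value > limit
        if not ok:
            return False
    return True


# Toy metric values for three made-up units
units = {
    0: {"isi_violations_ratio": 0.1, "presence_ratio": 0.95, "amplitude_cutoff": 0.01},
    1: {"isi_violations_ratio": 0.9, "presence_ratio": 0.99, "amplitude_cutoff": 0.02},
    2: {"isi_violations_ratio": 0.2, "presence_ratio": 0.50, "amplitude_cutoff": 0.05},
}
kept = [unit_id for unit_id, m in units.items() if passes_qc(m)]
print("Units passing QC:", kept)
```

A unit is kept only if it passes all three criteria, so a single failing metric is enough to exclude it.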

### Load the waveforms and the sorting extractor
**Note**: We will use the backward-compatible version of the waveform extractor, the `MockWaveformExtractor`, which is used in the latest version of SpikeInterface.

In [None]:
we = si.load_waveforms(folder=os.path.join(postProcessed, experiment))
sorting_curated = si.load_extractor(os.path.join(data_load, experiment))
we, sorting_curated

### Each object has various extensions and attributes. You can list the available extensions with `.get_available_extension_names()` and inspect the attributes with `dir(object)`.

In [None]:
avail_extensions = we.get_available_extension_names()
avail_extensions
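
Before requesting an extension, it can help to confirm that it is in the list. This is a small sketch using a toy list in place of the real `avail_extensions`; the helper `missing_extensions` is hypothetical.

```python
def missing_extensions(available, required):
    """Return the required extension names that are not available, sorted."""
    return sorted(set(required) - set(available))


# Toy list standing in for the output of we.get_available_extension_names()
available = ["unit_locations", "quality_metrics", "correlograms"]
required = ["quality_metrics", "unit_locations", "template_similarity"]
print(missing_extensions(available, required))
```

With the real data, pass `avail_extensions` as the first argument and the extensions you plan to use as the second.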

### Create sorting analyzer and fetch quality metrics and unit information

In [None]:
sorting_analyzer = we.sorting_analyzer

#quality metrics
qm = sorting_analyzer.get_extension(extension_name='quality_metrics').get_data()
#fetch decoder labels (e.g. SUA, MUA, noise)
labels = sorting_curated.get_property('decoder_label')
#fetch unit ids and locations
unit_ids = sorting_curated.get_unit_ids()
unit_locations = sorting_analyzer.get_extension("unit_locations").get_data()
#keep only the second column (y coordinate) of each unit location
unit_locations = unit_locations[:, 1]
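
A quick way to summarize the decoder labels is to count how many units fall into each class. The sketch below uses a toy list in place of the real `labels`; the label strings are assumptions based on the classes named in the comment above.

```python
from collections import Counter

# Toy labels standing in for sorting_curated.get_property('decoder_label')
labels_example = ["sua", "mua", "noise", "sua", "sua", "mua"]

counts = Counter(labels_example)
for label, n in counts.most_common():
    print(f"{label}: {n} units ({n / len(labels_example):.0%})")
```

`Counter` also accepts the array returned by `get_property`, so the same lines should work on the real labels.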


In [None]:
#create dataframe of all the quality metrics
import pandas as pd
df = pd.DataFrame(qm)
df['unit_ids'] = unit_ids
df['labels'] = labels
df['unit_locations'] = unit_locations
df

In [None]:
print("Total units:", len(we.unit_ids))
we.unit_ids

### SpikeInterface objects store the data for all units. Often, a list of unit IDs is needed to explore the data. Below, we plot the waveform templates for a list of units.

In [None]:
unit_ids = [0, 3] #list of unit ids to plot

In [None]:
for unit_id in unit_ids:
    fig, ax = plt.subplots()
    template = we.get_template(unit_id=unit_id)
    ax.plot(template)
    ax.set_title(f'Unit {unit_id}')

plt.show()
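
Each template returned by `get_template` is a 2D array of shape (samples, channels), so the plot above draws one line per channel. If only the channel with the largest deflection is of interest, it can be picked out first. This is a minimal sketch on a toy template with made-up values; `peak_channel` is a hypothetical helper, and plain lists stand in for the NumPy array.

```python
def peak_channel(template):
    """Return the index of the channel with the largest absolute amplitude.

    `template` is indexed as template[sample][channel].
    """
    n_channels = len(template[0])
    return max(
        range(n_channels),
        key=lambda ch: max(abs(row[ch]) for row in template),
    )


# Toy template: 4 samples x 3 channels (made-up values)
toy_template = [
    [0.0, 0.0, 0.0],
    [-5.0, -40.0, -10.0],
    [2.0, 15.0, 4.0],
    [0.0, 1.0, 0.0],
]
ch = peak_channel(toy_template)
trace = [row[ch] for row in toy_template]
print(f"peak channel: {ch}, trace: {trace}")
```

Plotting `trace` instead of the full template gives a single, easier-to-read waveform per unit.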

### To fetch spike trains you can use the following logic:

In [None]:
unit_id = 1

spike_extractor = si.load_extractor(os.path.join(spikes, experiment))

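#this returns the spike train of a single unit, in seconds
spike_train = spike_extractor.get_unit_spike_train(unit_id, return_times=True)

#this collects the spike trains of all units, in seconds, keyed by unit id
all_spike_trains = {
    uid: spike_extractor.get_unit_spike_train(uid, return_times=True)
    for uid in spike_extractor.unit_ids
}
all_spike_trains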
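
With `return_times=True`, spike trains are expressed in seconds, which makes simple summary statistics easy to compute. The following sketch uses made-up spike times; the helper names and the recording duration are hypothetical.

```python
def firing_rate(spike_times, duration_s):
    """Mean firing rate in Hz for spike times given in seconds."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return len(spike_times) / duration_s


def inter_spike_intervals(spike_times):
    """Differences between consecutive (sorted) spike times, in seconds."""
    ordered = sorted(spike_times)
    return [b - a for a, b in zip(ordered, ordered[1:])]


# Toy spike times in seconds (made-up values)
toy_spikes = [0.5, 1.0, 2.0, 4.0]
print("rate (Hz):", firing_rate(toy_spikes, duration_s=8.0))
print("ISIs (s):", inter_spike_intervals(toy_spikes))
```

For real data, pass the spike train of one unit and the duration of the recording in seconds.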

## We hope this provides some intuition on how to explore the AINDS data using SpikeInterface. Please refer to the [SpikeInterface](https://spikeinterface.readthedocs.io/en/latest/index.html) documentation for details about the API and its usage.