# Exploring the AIND-ephys outputs using SpikeInterface

### Notebook usage:
- This notebook lets you visualize and explore the spike sorting results.
- Without the curation GUI, plotting units directly in the notebook is quite slow for large datasets.
- This notebook assumes some comfort with spike sorting and electrophysiology data. In addition, a basic understanding of [SpikeInterface](https://spikeinterface.readthedocs.io/en/latest/index.html) is helpful.

#### Requirements:
- processed AIND Neuropixels data
- an installation of SpikeInterface. If it is not installed, install it with the following command:
```bash
pip install "spikeinterface[full,widgets]"
```

**Note**: This notebook is based on the latest version of SpikeInterface (`spikeinterface==0.101.0`), which is under development. The API may change in the future.
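
Because the API may change between releases, it can help to confirm which version is installed before running the rest of the notebook. The following is a minimal sketch using only the standard library; the helper names `parse_version` and `installed_version` are illustrative and not part of SpikeInterface.

```python
from importlib import metadata

EXPECTED = "0.101.0"  # version this notebook was written against


def parse_version(text):
    """Turn '0.101.0' into (0, 101, 0); suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def installed_version(package="spikeinterface"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


found = installed_version()
if found is None:
    print("spikeinterface is not installed")
elif parse_version(found) != parse_version(EXPECTED):
    print(f"Found spikeinterface {found}; this notebook expects {EXPECTED}")
else:
    print(f"spikeinterface {found} matches the expected version")
```

If the versions differ, some of the calls below may need adjusting.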

In [None]:
#import packages
import os
import matplotlib.pyplot as plt
import spikeinterface as si
import spikeinterface.extractors as se
import spikeinterface.postprocessing as spost
import spikeinterface.widgets as sw
from spikeinterface.curation import apply_sortingview_curation
from spikeinterface.widgets import plot_sorting_summary

In [None]:
#Fetch data directories

raw_rec = 'path/to/raw/recording' #edit this to the location of the raw recording
baseFolder = r"C:\Users\janet\Documents\Tom_AINDS_output" #edit this to the location of your data
experiment = 'block0_imec0.ap_recording1_group0' #edit this to the name of your experiment folder

preProcessed = baseFolder + '/preprocessed'
postProcessed = baseFolder + '/postprocessed'
spikes = baseFolder + '/spikesorted'
curated = baseFolder + '/curated'
preJSON = os.path.join(preProcessed, experiment + '.json')
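
A quick check that the expected output folders exist can save time before loading anything. This is a minimal sketch: the subfolder names are the ones used in the cell above, while `check_output_folders` and the placeholder path are illustrative and not part of the pipeline.

```python
from pathlib import Path


def check_output_folders(base_folder, names=("preprocessed", "postprocessed", "spikesorted", "curated")):
    """Map each expected subfolder name to True/False depending on whether it exists."""
    base = Path(base_folder)
    return {name: (base / name).is_dir() for name in names}


# Placeholder path; pass your own baseFolder here.
status = check_output_folders("path/to/pipeline/output")
for name, exists in status.items():
    print(f"{name:>14}: {'found' if exists else 'MISSING'}")
```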

In [None]:
#Select the data to explore

data_load = curated
print(f'Set path: {data_load}')

## First, let's load the waveform extractor. We'll explore the postprocessed units, which are stored in the `postprocessed` folder. These units have been processed to include the following:
* removal of duplicate units
* computed amplitudes
* spike/unit locations
* PCA
* correlograms
* template similarity
* template metrics
* QC metrics

## The `curated` folder includes units that *have been* automatically curated by:
* ISI violation ratio
* presence ratio
* amplitude cutoff
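
As a rough illustration of threshold-based curation on these three metrics, the sketch below applies a pass/fail rule per unit in plain Python. The threshold values and the per-unit numbers are made up for the example and are not necessarily the ones the pipeline uses. The metric names are assumed to match the column names of the SpikeInterface quality-metrics table.

```python
# Illustrative thresholds only; check the pipeline parameters for the real values.
THRESHOLDS = {
    "isi_violations_ratio": ("<", 0.5),
    "presence_ratio": (">", 0.8),
    "amplitude_cutoff": ("<", 0.1),
}


def passes_qc(metrics, thresholds=THRESHOLDS):
    """Return True only if every metric is present, not NaN, and within its threshold."""
    for name, (op, limit) in thresholds.items():
        value = metrics.get(name)
        if value is None or value != value:  # missing or NaN counts as a failure
            return False
        if op == "<" and not value < limit:
            return False
        if op == ">" and not value > limit:
            return False
    return True


# Made-up metric values for two hypothetical units.
units = {
    0: {"isi_violations_ratio": 0.02, "presence_ratio": 0.99, "amplitude_cutoff": 0.01},
    3: {"isi_violations_ratio": 1.70, "presence_ratio": 0.95, "amplitude_cutoff": 0.03},
}
kept = [unit_id for unit_id, m in units.items() if passes_qc(m)]
print(kept)
```

Here unit 3 fails only because of its ISI violation ratio, which shows that a unit must satisfy all three criteria to be kept.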

### First, load the waveforms and the sorting extractor
*Note: we will use the backward-compatible version of the waveform extractor, the `MockWaveformExtractor`, which is used in the latest version of SpikeInterface.*

In [None]:
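# load the postprocessed waveforms (returned as a MockWaveformExtractor)
we = si.load_waveforms(folder=os.path.join(postProcessed, experiment))
# load the sorting from the folder selected above
sorting_curated = si.load_extractor(os.path.join(data_load, experiment))
we, sorting_curated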

In [None]:
avail_extensions = we.get_available_extension_names()
avail_extensions

In [None]:
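# the mock waveform extractor wraps a SortingAnalyzer that holds the extensions
sorting_analyzer = we.sorting_analyzer
qm = sorting_analyzer.get_extension(extension_name='quality_metrics').get_data()
labels = sorting_curated.get_property('decoder_label')
unit_ids = sorting_curated.get_unit_ids()
unit_locations = sorting_analyzer.get_extension("unit_locations").get_data()
# keep only the second coordinate (y, typically the position along the probe shank)
unit_locations = unit_locations[:, 1]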


In [None]:
#create dataframe of all the quality metrics
import pandas as pd
df = pd.DataFrame(qm)
df['unit_ids'] = unit_ids
df['labels'] = labels
df['unit_locations'] = unit_locations
df
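
Once the metrics are in a DataFrame, ordinary pandas filtering can narrow down the units worth plotting. The sketch below uses a small made-up table in place of `df`. The label values (`sua`, `mua`, `noise`) and the SNR cutoff are assumptions for illustration, so check the actual contents of your `labels` and metric columns first.

```python
import pandas as pd

# Made-up stand-in for the `df` built above.
demo = pd.DataFrame({
    "unit_ids": [0, 1, 2, 3],
    "labels": ["sua", "noise", "mua", "sua"],
    "unit_locations": [120.0, 860.0, 400.0, 40.0],
    "snr": [8.1, 1.2, 3.4, 6.7],
})

# Keep non-noise units above a hypothetical SNR cutoff, ordered by position along the probe.
selected = (
    demo[(demo["labels"] != "noise") & (demo["snr"] > 3.0)]
    .sort_values("unit_locations")
    .reset_index(drop=True)
)
print(selected[["unit_ids", "labels", "unit_locations"]])

# A plain list of ids, ready to pass to a plotting loop.
units_of_interest = selected["unit_ids"].tolist()
```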

### Subselect units to visualize. Within this for loop, a template waveform plot is generated for each unit; uncomment the `sw.plot_autocorrelograms` line to also generate an autocorrelogram.
*Please note, the more units you visualize, the longer this will take to run. This is memory intensive and should only be used for quick exploration of units-of-interest.*

In [None]:
print("Total units:", len(we.unit_ids))
we.unit_ids

In [None]:
unit_ids = [0, 3] #list of unit ids to plot

In [None]:
for unit_id in unit_ids:
    fig, ax = plt.subplots()
    template = we.get_template(unit_id=unit_id)
    ax.plot(template)
    ax.set_title(f'{unit_id}')
    #sw.plot_autocorrelograms(sorting_curated, window_ms=150.0, bin_ms=5.0, unit_ids=[unit_id])
    
plt.show()

## Launching the SpikeInterface GUI for manual curation

* We can use the Qt-based GUI [SpikeInterface-GUI](https://github.com/SpikeInterface/spikeinterface-gui) to visualize and curate the sorting output. You will need the raw recording as well as the sorting object. You will then need to create the `sorting_analyzer` object and run the GUI.

* In addition to the GUI, there are some automatic curation tools that can be leveraged, such as `get_potential_auto_merge()` and `remove_duplicated_spikes()`.

* If a GUI is not desired, you can also curate within the notebook by manually relabeling the labeling definitions. More details on the automatic curation tools and manual relabeling can be found [here](https://spikeinterface.readthedocs.io/en/latest/modules/curation.html#curation).
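
One way to relabel manually without a GUI is to write a small curation file by hand and apply it with `apply_sortingview_curation`, which is imported at the top of this notebook. The sketch below only builds and saves such a file. The `labelsByUnit` / `mergeGroups` layout is my understanding of the sortingview curation format, and the unit ids, labels, and file name are hypothetical. Verify all of this against the curation documentation linked above.

```python
import json
import tempfile
from pathlib import Path


def build_curation(labels_by_unit, merge_groups=()):
    """Assemble a curation dict; unit ids become string keys, as JSON requires."""
    return {
        "labelsByUnit": {str(u): list(lbls) for u, lbls in labels_by_unit.items()},
        "mergeGroups": [list(g) for g in merge_groups],
    }


# Hypothetical decisions: accept unit 0, reject unit 3, merge units 5 and 7.
curation = build_curation({0: ["accept"], 3: ["reject"]}, merge_groups=[[5, 7]])

out_path = Path(tempfile.gettempdir()) / "manual_curation.json"  # hypothetical file name
out_path.write_text(json.dumps(curation, indent=2))
print(out_path.read_text())

# With real data, the file could then be applied along these lines (not run here;
# check the argument names against your installed SpikeInterface version):
# sorting_manual = apply_sortingview_curation(
#     sorting_curated, uri_or_json=str(out_path), exclude_labels=["reject"]
# )
```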

In [None]:
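# create_sorting_analyzer needs a recording object, not a path string.
# Assumption: raw_rec points to a recording saved by SpikeInterface; for vendor
# formats, use the matching reader from spikeinterface.extractors (imported as `se`).
recording = si.load_extractor(raw_rec)
sorting_analyzer = si.create_sorting_analyzer(sorting=sorting_curated, recording=recording)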

# some extensions are required
sorting_analyzer.compute([
    "random_spikes",
    "noise_levels",
    "templates",
    "template_similarity",
    "unit_locations",
    "spike_amplitudes",
    "principal_components",
    "correlograms"
    ]
)
sorting_analyzer.compute("quality_metrics", metric_names=["snr"])

# this will open the GUI in a different window
plot_sorting_summary(sorting_analyzer=sorting_analyzer, curation=True, backend='spikeinterface_gui')