We now have a sorting and a recording. Great! Now we can get on to the fun stuff: what shape are our unit templates? Which units are correlated with each other? Where on the probe are the units? Answering these questions requires computing extra information. In `SpikeInterface` we do this by creating an object called a `SortingAnalyzer`.

A `SortingAnalyzer` combines a recording with a sorting in a unified way, no matter which sorter or recording format you used. Once you have an analyzer, you can compute postprocessing _extensions_ (like spike locations, waveforms, template metrics, ...) in exactly the same way for every sorter. Analyzers can also be used to keep track of curation, merging and splitting, and more.

This unified framework has several benefits. The main one is that the analyzer defines a sorter-agnostic format for post-sorting analysis. Hence:
- You can compare sorters on a level playing field (i.e. all the extensions are computed in the same way, for all sorters)
- Your postprocessing pipeline can be identical, whether you use MountainSort to sort tetrode data or Kilosort to sort Neuropixels data, creating a unified pipeline in your lab.
- External tools have a simple starting point to work from. This should make tooling in the community easier, and there are already several examples of this:
  - NeuroConv contains a `SortingAnalyzerToNWB` function
  - spikeinterface-gui, sortingview and UnitMatch can take an analyzer as their initial input.
  - UnitRefine (LINKS)

Hopefully you're now convinced that creating a `SortingAnalyzer` will make your life easier, and smooth the path to using new tools in your analysis pipeline. So, let's make one.

In [None]:
import spikeinterface.full as si
from pathlib import Path

si.set_global_job_kwargs(n_jobs=4)
base_folder = Path("/home/nolanlab/Work/Projects/Milan/")

recording, sorting = si.generate_ground_truth_recording()

When you make the analyzer, you can either make it _in memory_ or _in folder_.

In [None]:
analyzer_in_memory = si.create_sorting_analyzer(
    sorting=sorting,
    recording=recording, 
)

analyzer_in_folder = si.create_sorting_analyzer(
    sorting=sorting, 
    recording=recording, 
    folder="my_analyzer",
    format="binary_folder",
)

When in memory, the analyzer is stored in RAM. This makes computation faster, but uses more RAM. You can save your `analyzer_in_memory` to a folder at any point using `analyzer_in_memory.save_as` (more info: https://spikeinterface.readthedocs.io/en/stable/modules/postprocessing.html). For this demo, we'll use the folder analyzer. Go take a look in the folder. You'll see that it contains recording information, sorting information and more!

# Extensions

Each thing-you'd-like-to-compute is stored as an extension of the analyzer. Let's compute the templates: the averaged waveforms from all (or a large random sample of) individual spikes.

In [None]:
analyzer_in_folder.compute("templates")

Oh no - an error! This is because extensions depend on each other. For example, you can't compute template similarity (how similar unit templates are to one another) without first computing templates. The full dependency graph can be seen here:

![image](images/parent_child.svg)

So, when we compute extensions we need to know which _other_ extensions must be computed beforehand... Let's compute a few. You can either compute one at a time, or give the `analyzer` a big dictionary of extensions (recommended! It will re-sort them based on dependencies, and can apply a few time-saving tricks):

In [None]:
# just one
analyzer_in_folder.compute("random_spikes", max_spikes_per_unit=1000)

# or lots: here we also specify some kwargs
analyzer_in_folder.compute({
    "templates": {},
    # "noise_levels" appears once only: a repeated dict key silently overrides the earlier one
    "noise_levels": {'method': 'std'},
    "correlograms": {},
    "spike_amplitudes": {},
    "spike_locations": {},
    "template_metrics": {'include_multi_channel_metrics': True},
    "unit_locations": {},
    "template_similarity": {'method': 'l1'},
    "quality_metrics": {},
})
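
The re-sorting by dependencies can be pictured as a topological sort. The sketch below is not how `SpikeInterface` implements it; it uses Python's standard-library `graphlib` and a small, illustrative subset of the dependency graph (the `depends_on` mapping is an assumption made for this example) to show why the order in which you list extensions in the dictionary does not matter.

```python
from graphlib import TopologicalSorter

# Illustrative subset of the dependency graph (an assumption for this sketch,
# not the library's full graph): each extension maps to what it needs first.
depends_on = {
    "template_similarity": {"templates"},
    "template_metrics": {"templates"},
    "unit_locations": {"templates"},
    "templates": {"random_spikes"},
    "random_spikes": set(),
}

# static_order() yields each extension only after all of its dependencies.
compute_order = list(TopologicalSorter(depends_on).static_order())
print(compute_order)
```

Whatever order the keys are written in, `random_spikes` comes out before `templates`, and `templates` before everything that needs it.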


> **Note**: to see which extensions are available to compute, use `analyzer_in_folder.get_computable_extensions()`. To see which arguments an extension accepts, you can use e.g. `analyzer_in_folder.get_default_extension_params('template_metrics')`.

Now take another look in your analyzer folder. You'll find lots of new folders containing your extensions! You can load this data directly, but `SpikeInterface` contains a lot of handy loader functions. The notation is always `analyzer.get_extension("extension_name").get_data()`. Let's look at the quality metrics. These are measures of how _good_ a unit is (more details: https://spikeinterface.readthedocs.io/en/latest/modules/qualitymetrics.html). Note that which quality metrics are computed depends on which other extensions you've computed.

In [None]:
quality_metrics = analyzer_in_folder.get_extension("quality_metrics").get_data()
quality_metrics

This is a `pandas` dataframe with information about each unit. Nice.
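
Because the result is a dataframe, the usual `pandas` tools apply. Here is a minimal sketch of selecting units by thresholding metrics. It uses a toy table rather than the real output: the values, the unit ids and the thresholds are all made up for illustration, and only the column names ("snr", "isi_violations_ratio") are taken from this tutorial.

```python
import pandas as pd

# Toy stand-in for the quality metrics table: one row per unit, one column per
# metric. Values, unit ids and thresholds are invented for illustration.
toy_metrics = pd.DataFrame(
    {
        "snr": [12.1, 3.4, 8.7, 1.9],
        "isi_violations_ratio": [0.02, 0.80, 0.10, 1.50],
    },
    index=pd.Index(["0", "1", "2", "3"], name="unit_id"),
)

# A boolean mask keeps only the rows (units) that satisfy both conditions.
keep = (toy_metrics["snr"] > 5) & (toy_metrics["isi_violations_ratio"] < 0.5)
good_units = toy_metrics[keep]
good_unit_ids = list(good_units.index)
print(good_units)
```

The same pattern should work on the real `quality_metrics` table, provided the columns you filter on were actually computed.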

We can get the "raw" extension data from any other extension too:

In [None]:
template_similarity_data = analyzer_in_folder.get_extension("template_similarity").get_data()
template_similarity_data
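
Raw arrays like this can be explored with plain `numpy`. The sketch below finds the most similar pair of distinct units in a toy matrix. It assumes the similarity data is a square units-by-units array in which larger values mean more similar templates; the numbers are invented, so check the shape and meaning of your own data before reusing the idea.

```python
import numpy as np

# Toy symmetric similarity matrix for three hypothetical units.
toy_similarity = np.array([
    [1.0, 0.2, 0.9],
    [0.2, 1.0, 0.4],
    [0.9, 0.4, 1.0],
])

# Each unit is perfectly similar to itself, so mask the diagonal first.
masked = toy_similarity.copy()
np.fill_diagonal(masked, -np.inf)

# argmax works on the flattened array; unravel_index recovers (row, column).
i, j = np.unravel_index(np.argmax(masked), masked.shape)
print(f"Most similar pair: rows {i} and {j}, similarity {toy_similarity[i, j]}")
```

Note that `i` and `j` are row and column indices, not unit ids; on real data you would map them back through the analyzer's list of unit ids.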

`SpikeInterface` also supports lots of plotting functions that are related to extensions (see more: https://spikeinterface.readthedocs.io/en/latest/modules/widgets.html#available-plotting-functions). Let's plot the template similarity, then the spike amplitudes, a unit summary, the amplitude distributions and the quality metrics.

In [None]:
si.plot_template_similarity(analyzer_in_folder)

In [None]:
%matplotlib widget
si.plot_amplitudes(analyzer_in_folder, backend="ipywidgets")

In [None]:
si.plot_unit_summary(analyzer_in_folder, unit_id="1")

In [None]:
si.plot_all_amplitudes_distributions(analyzer_in_folder)

In [None]:
si.plot_quality_metrics(analyzer_in_folder, backend="ipywidgets")

The analyzer is key to playing with your data, so we'll now do some exercises to help us explore it more. To do these tasks you'll need some basic `numpy` and `pandas` skills. For numpy, this page might help: https://numpy.org/doc/stable/user/absolute_beginners.html#indexing-and-slicing. For pandas, maybe this: https://pandas.pydata.org/docs/getting_started/intro_tutorials/03_subset_data.html. Or ask someone!

**Exercise 1**: Try out a widget we've not seen here. Try and explain to yourself what it does.

**Exercise 2**: Get the signal-to-noise ratio ("snr" quality metric) for all units and plot a histogram.

**Exercise 3**: Plot the auto-correlograms (using `plot_autocorrelograms`) of the units with the worst and the best "isi_violations_ratio".

**Exercise 4**: Use the SI API to find which channel has the extremal template (hint: `get_template_extremum_channel`). Then use this information and `matplotlib` to plot the template on the extremal channel.

# Export

We are trying to improve the `export` options for the `SortingAnalyzer`. We recently introduced the ability to export to Pynapple (https://pynapple.org/) using `to_pynapple_tsgroup`:

In [None]:
spikes = si.to_pynapple_tsgroup(analyzer_in_folder)
spikes

You can use `Pynapple` to combine spike and behavioural data:

In [None]:
import numpy as np
import pynapple as nap
import matplotlib.pyplot as plt

behaviour = nap.load_file("/Users/christopherhalcrow/Desktop/sub-25_day-23_ses-OF1_beh.nwb")
P_x, P_y = behaviour['P_x'], behaviour['P_y']

tc = nap.compute_2d_tuning_curves(
    spikes,
    np.stack([P_x, P_y], axis=1),
    nb_bins=(40,40),
)[0]

plt.imshow(tc[0])
plt.show()

Another export option is `export_report`. This creates a nice little report about all your units:

In [None]:
si.export_report(analyzer_in_folder, output_folder="my_report")