# DataJoint Workflow for Neuropixels Analysis with Kilosort

+ This notebook describes the steps for interacting with data processed by the workflow.

In [None]:
import os
os.chdir('..')

In [None]:
import datajoint as dj
import matplotlib.pyplot as plt
import numpy as np

from workflow.pipeline import lab, subject, session, ephys, probe

## Workflow architecture

+ This workflow is assembled from 4 DataJoint elements:
     + [element-lab](https://github.com/datajoint/element-lab)
     + [element-animal](https://github.com/datajoint/element-animal)
     + [element-session](https://github.com/datajoint/element-session)
     + [element-array-ephys](https://github.com/datajoint/element-array-ephys)

+ Below is the diagram describing the core components of the fully assembled pipeline.


In [None]:
dj.Diagram(subject.Subject) + dj.Diagram(session.Session) + dj.Diagram(probe) + dj.Diagram(ephys)

## Browsing the data with DataJoint query and fetch 

+ DataJoint provides a rich set of functions for querying and fetching data. For detailed tutorials, visit our [general tutorial site](https://playground.datajoint.io/).

### `subject.Subject` and `session.Session` tables

In [None]:
subject.Subject()

In [None]:
session_key = (session.Session & 'subject="subject5"' & 'session_datetime = "2020-05-12 04:13:07"').fetch1('KEY')

### `ephys.ProbeInsertion` and `ephys.EphysRecording` tables

+ These tables store the recordings made within a particular session from one or more probes.

In [None]:
ephys.ProbeInsertion & session_key

In [None]:
ephys.EphysRecording & session_key

### `ephys.ClusteringTask` and `ephys.Clustering` tables

+ Spike sorting is performed on a per-probe basis, with the details stored in `ClusteringTask` and `Clustering`.

In [None]:
ephys.ClusteringTask * ephys.Clustering & session_key

### Spike-sorting results are stored in `ephys.CuratedClustering` and `ephys.WaveformSet.Waveform`

In [None]:
ephys.CuratedClustering.Unit & session_key

Let's pick one probe insertion and one clustering parameter set (`paramset_idx`), and further inspect the clustering results.

In [None]:
curation_key = (ephys.CuratedClustering & session_key & 'insertion_number = 1' & 'paramset_idx=0').fetch1('KEY')

In [None]:
ephys.CuratedClustering.Unit & curation_key

### Generate a raster plot for the "good" units

In [None]:
# This is a query over all "good" units (not a single key), so name it accordingly
good_units = ephys.CuratedClustering.Unit & curation_key & 'cluster_quality_label = "good"'

In [None]:
units, unit_spiketimes = good_units.fetch('unit', 'spike_times')

In [None]:
x = np.hstack(unit_spiketimes)
y = np.hstack([np.full_like(s, u) for u, s in zip(units, unit_spiketimes)])

In [None]:
fig, ax = plt.subplots(1, 1, figsize=(32, 16))
ax.plot(x, y, '|')
ax.set_xlabel('Time (s)');
ax.set_ylabel('Unit');
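
The coordinate construction used for the raster can be checked on synthetic data without a database connection. The sketch below uses hypothetical unit ids and spike times (not values from the pipeline) to show that `x` collects every spike time while `y` repeats each unit id once per spike, so the two arrays line up point for point.

```python
import numpy as np

# Hypothetical stand-ins for the fetched `units` and `unit_spiketimes`:
# three units with different numbers of spikes (times in seconds).
units = np.array([3, 7, 12])
unit_spiketimes = [
    np.array([0.10, 0.45, 1.20]),
    np.array([0.30]),
    np.array([0.05, 0.90]),
]

# x: all spike times concatenated into one flat array
x = np.hstack(unit_spiketimes)
# y: for each unit, an array shaped like its spike times and filled with the unit id
y = np.hstack([np.full_like(s, u) for u, s in zip(units, unit_spiketimes)])

print(x.shape, y.shape)
print(y)
```

Because `np.full_like` copies the dtype of the spike-time array, the unit ids in `y` come out as floats, which is harmless for plotting.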

### Plot waveform of a unit

In [None]:
unit_key = (ephys.CuratedClustering.Unit & curation_key & 'unit = 1').fetch1('KEY')

In [None]:
ephys.CuratedClustering.Unit * ephys.WaveformSet.Waveform & unit_key

In [None]:
unit_data = (ephys.CuratedClustering.Unit * ephys.WaveformSet.PeakWaveform & unit_key).fetch1()

In [None]:
unit_data

In [None]:
# Express the sampling rate in kHz so that (sample index / rate) gives time in ms
sampling_rate = (ephys.EphysRecording & curation_key).fetch1('sampling_rate') / 1000  # in kHz
waveform = unit_data['peak_electrode_waveform']
plt.plot(np.arange(waveform.size) / sampling_rate, waveform)
plt.xlabel('Time (ms)');
plt.ylabel(r'Voltage ($\mu$V)');
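
The time axis above comes from dividing each sample index by the sampling rate in kHz. As a minimal sketch, assuming a 30 kHz recording and an arbitrary waveform length of 82 samples (both values are illustrative, not read from the pipeline), the conversion looks like this:

```python
import numpy as np

# Assumed values for illustration only
sampling_rate_hz = 30000.0   # hypothetical recording rate in Hz
n_samples = 82               # hypothetical number of samples in one waveform

# Hz -> kHz, so that index / rate is expressed in milliseconds
sampling_rate_khz = sampling_rate_hz / 1000

# Time of each sample relative to the first one, in ms
t_ms = np.arange(n_samples) / sampling_rate_khz

print(t_ms[:3])
print(t_ms[-1])
```

Under these assumptions consecutive samples are 1/30 ms apart, and sample index 30 falls at 1 ms.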

## Summary
This notebook highlights the major tables in the workflow and visualizes some of the processed results.