# Array Electrophysiology Analysis for the DataJoint Workflow: How to Run Locally

This notebook walks through populating the `ephys` tables for a single session with SpikeInterface, using the computations defined in the `array-ephys` pipeline. Run it on a machine that has access to the data files for the selected session.

**_Note:_**

- The examples in this notebook use a sample dataset. Replace these entries with your actual database entries to access and analyze your data.



### **Key Steps**

- **Setup**

- **Step 1: Select Session of Interest**

- **Step 2: Populate `Ephys` Schema**


#### **Setup**


First, import the necessary packages for the data pipeline and essential schemas.


In [1]:
import os

if os.path.basename(os.getcwd()) == "notebooks":
    os.chdir("..")

In [2]:
import datajoint as dj
import numpy as np
import matplotlib.pyplot as plt
import datetime

In [3]:
from workflow.pipeline import culture, ephys, ephys_sorter

[2024-07-25 17:05:40,859][INFO]: Connecting milagros@db.datajoint.com:3306
[2024-07-25 17:05:42,450][INFO]: Connected milagros@db.datajoint.com:3306


#### **Step 1: Select Session of Interest**


In [4]:
session_key = {
    "organoid_id": "O09",
    "experiment_start_time": datetime.datetime(2023, 5, 18, 12, 25),
    "start_time": "2023-05-18 12:25:00",
    "end_time": "2023-05-18 12:26:30",
}
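
The session key mixes a `datetime` object with plain timestamp strings. As a quick sanity check before querying, the sketch below parses the two string values from the key above with the standard library and computes the length of the selected window. The variable names are illustrative only and are not part of the pipeline.

```python
import datetime

# Same start/end values as in the session key above.
start_time = datetime.datetime.fromisoformat("2023-05-18 12:25:00")
end_time = datetime.datetime.fromisoformat("2023-05-18 12:26:30")

# The string form parses to the same value as the explicit constructor
# used for "experiment_start_time".
same_start = start_time == datetime.datetime(2023, 5, 18, 12, 25)

# Length of the selected window in seconds.
duration_s = (end_time - start_time).total_seconds()
print(same_start, duration_s)
```

For this sample session the window is 90 seconds long.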

In [5]:
ephys.EphysSession * ephys.EphysSessionProbe & session_key

| organoid_id (e.g. O17) | experiment_start_time | insertion_number | start_time | end_time | session_type | probe (unique identifier for this model of probe, e.g. serial number) | port_id | used_electrodes (list of electrode IDs used in this session; if null, all electrodes are used) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| O09 | 2023-05-18 12:25:00 | 0 | 2023-05-18 12:25:00 | 2023-05-18 12:26:30 | spike_sorting | Q983 | A | =BLOB= |


#### **Step 2: Populate `Ephys` Schema**


In [6]:
ephys_key = {**session_key, "insertion_number": 0}
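
The `{**session_key, ...}` form builds a new dictionary rather than modifying `session_key` in place, so the original key stays usable for session-level queries. A minimal, database-free sketch of that behavior, using a shortened copy of the key for illustration:

```python
# Shortened copy of the session key, for illustration only.
session_key = {"organoid_id": "O09", "start_time": "2023-05-18 12:25:00"}

# Dict unpacking copies the existing entries and adds the new one.
ephys_key = {**session_key, "insertion_number": 0}

print("insertion_number" in session_key)  # the original key is unchanged
print(ephys_key)
```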

In [7]:
ephys.EphysSessionInfo.heading

# Store header information from the first session file.
organoid_id          : varchar(4)                   # e.g. O17
experiment_start_time : datetime                     # 
insertion_number     : tinyint unsigned             # 
start_time           : datetime                     # 
end_time             : datetime                     # 
---
session_info         : longblob                     # Session header info from intan .rhd file. Get this from the first session file.

In [8]:
ephys.EphysSessionInfo.populate(ephys_key)

In [9]:
ephys.EphysSessionInfo & ephys_key

| organoid_id (e.g. O17) | experiment_start_time | insertion_number | start_time | end_time | session_info (Session header info from intan .rhd file. Get this from the first session file.) |
| --- | --- | --- | --- | --- | --- |
| O09 | 2023-05-18 12:25:00 | 0 | 2023-05-18 12:25:00 | 2023-05-18 12:26:30 | =BLOB= |


In [10]:
ephys.ClusteringParamSet()

| paramset_idx | clustering_method | paramset_desc | param_set_hash | params (dictionary of all applicable parameters) |
| --- | --- | --- | --- | --- |
| 0 | spykingcircus2 | Default parameters for spyking circus2 using SpikeInterface v0.100.1 | b6fb9ec2-768c-66b0-2b71-9b8ac91e94da | =BLOB= |
| 1 | spykingcircus2 | Default parameter set for spyking circus2 using SpikeInterface v0.101.* | 434894d0-eb7b-db6c-80e6-638a1322c568 | =BLOB= |
| 2 | kilosort2 | kilosort2 with SpikeInterface version 0.101+ | 79a731f3-f1b6-c110-5f8a-e25227464de7 | =BLOB= |
| 101 | spykingcircus2 | Spyking circus2 using SpikeInterface v0.101.* and `include_multi_channel_metrics=False` | fd4eb67f-5784-a6ae-6cd8-25a429cad653 | =BLOB= |


In [11]:
ephys.ClusteringTask & ephys_key & "paramset_idx=101"

| organoid_id (e.g. O17) | experiment_start_time | insertion_number | start_time | end_time | paramset_idx | clustering_output_dir (clustering output directory relative to the clustering root data directory) |
| --- | --- | --- | --- | --- | --- | --- |
| O09 | 2023-05-18 12:25:00 | 0 | 2023-05-18 12:25:00 | 2023-05-18 12:26:30 | 101 | O09-12_raw/202305181225_202305181226/O09/spykingcircus2_101 |


In [12]:
task_key = (ephys.ClusteringTask & ephys_key & "paramset_idx=101").fetch1("KEY")
task_key

{'organoid_id': 'O09',
 'experiment_start_time': datetime.datetime(2023, 5, 18, 12, 25),
 'insertion_number': 0,
 'start_time': datetime.datetime(2023, 5, 18, 12, 25),
 'end_time': datetime.datetime(2023, 5, 18, 12, 26, 30),
 'paramset_idx': 101}

In [13]:
ephys_sorter.PreProcessing.populate(task_key)

In [14]:
ephys_sorter.PreProcessing & task_key

| organoid_id (e.g. O17) | experiment_start_time | insertion_number | start_time | end_time | paramset_idx | execution_time (datetime of the start of this step) | execution_duration (execution duration in hours) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| O09 | 2023-05-18 12:25:00 | 0 | 2023-05-18 12:25:00 | 2023-05-18 12:26:30 | 101 | 2024-07-24 19:47:58 | 0.00165199 |


In [15]:
ephys_sorter.SIClustering.populate(task_key)

In [16]:
ephys_sorter.SIClustering & task_key

| organoid_id (e.g. O17) | experiment_start_time | insertion_number | start_time | end_time | paramset_idx | execution_time (datetime of the start of this step) | execution_duration (execution duration in hours) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| O09 | 2023-05-18 12:25:00 | 0 | 2023-05-18 12:25:00 | 2023-05-18 12:26:30 | 101 | 2024-07-24 19:54:24 | 0.0263145 |


In [17]:
# This is the output directory of the spike sorting
ephys.ClusteringTask & ephys_key & "paramset_idx=101"

| organoid_id (e.g. O17) | experiment_start_time | insertion_number | start_time | end_time | paramset_idx | clustering_output_dir (clustering output directory relative to the clustering root data directory) |
| --- | --- | --- | --- | --- | --- | --- |
| O09 | 2023-05-18 12:25:00 | 0 | 2023-05-18 12:25:00 | 2023-05-18 12:26:30 | 101 | O09-12_raw/202305181225_202305181226/O09/spykingcircus2_101 |
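
The directory name in the row above appears to encode the session window and the parameter set. The sketch below checks that reading against this one row. The naming pattern (`<start>_<end>` at minute resolution, then `<clustering_method>_<paramset_idx>`) is an assumption inferred from the sample output, not a documented convention of the pipeline.

```python
import datetime
from pathlib import PurePosixPath

# Relative output directory copied from the ClusteringTask row above.
output_dir = PurePosixPath(
    "O09-12_raw/202305181225_202305181226/O09/spykingcircus2_101"
)

start = datetime.datetime(2023, 5, 18, 12, 25)
end = datetime.datetime(2023, 5, 18, 12, 26, 30)

# Assumed pattern: start and end timestamps formatted to the minute.
window = f"{start:%Y%m%d%H%M}_{end:%Y%m%d%H%M}"
# Assumed pattern: clustering method name followed by paramset_idx.
sorter_dir = f"spykingcircus2_{101}"

print(output_dir.parts)
print(window == output_dir.parts[1], sorter_dir == output_dir.parts[-1])
```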


In [18]:
ephys_sorter.PostProcessing.populate(task_key)

In [19]:
ephys_sorter.PostProcessing & task_key

| organoid_id (e.g. O17) | experiment_start_time | insertion_number | start_time | end_time | paramset_idx | execution_time (datetime of the start of this step) | execution_duration (execution duration in hours) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| O09 | 2023-05-18 12:25:00 | 0 | 2023-05-18 12:25:00 | 2023-05-18 12:26:30 | 101 | 2024-07-24 19:57:32 | 0.0359366 |
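
Because `execution_duration` is stored in hours, the values for short recordings are hard to read at a glance. A small sketch that converts the three durations reported above to seconds; the numbers are copied from the outputs of the `PreProcessing`, `SIClustering`, and `PostProcessing` cells, and the dictionary names are illustrative.

```python
# execution_duration values (in hours) copied from the outputs above.
durations_h = {
    "PreProcessing": 0.00165199,
    "SIClustering": 0.0263145,
    "PostProcessing": 0.0359366,
}

# Convert hours to seconds.
durations_s = {step: hours * 3600 for step, hours in durations_h.items()}

for step, seconds in durations_s.items():
    print(f"{step}: {seconds:.1f} s")
```

For this 90-second sample session, the three steps took roughly 6, 95, and 129 seconds.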


In [None]:
ephys.CuratedClustering.populate(task_key)

In [None]:
ephys.CuratedClustering & task_key

In [None]:
ephys.WaveformSet.populate(task_key)

In [None]:
ephys.WaveformSet & task_key

In [None]:
ephys.QualityMetrics.populate(task_key)

In [None]:
ephys.QualityMetrics & task_key
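
The cells above call `populate(task_key)` on each table in a fixed order. That pattern can be wrapped in a small loop, sketched below. `populate_in_order` and `make_stub` are hypothetical helpers introduced here for illustration; the stub classes only mimic the `populate(key)` call so the example runs without a database connection.

```python
def populate_in_order(tables, key):
    """Call populate(key) on each table in sequence; return the names that ran."""
    ran = []
    for table in tables:
        table.populate(key)  # same call form as in the cells above
        ran.append(table.__name__)
    return ran


def make_stub(name, log):
    """Build a stand-in class exposing populate(key) that records calls in `log`."""
    def populate(cls, key):
        log.append((cls.__name__, key))
    return type(name, (), {"populate": classmethod(populate)})


# Stand-ins for the real tables, so this runs without the pipeline.
log = []
steps = [
    make_stub(name, log)
    for name in ("PreProcessing", "SIClustering", "PostProcessing")
]
task_key = {"organoid_id": "O09", "paramset_idx": 101}  # shortened key

ran = populate_in_order(steps, task_key)
print(ran)
```

With the real pipeline imported, the same loop could be given `[ephys_sorter.PreProcessing, ephys_sorter.SIClustering, ephys_sorter.PostProcessing, ephys.CuratedClustering, ephys.WaveformSet, ephys.QualityMetrics]`, which is the order used in this notebook.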