# Sorting Notebook

This notebook downloads and sorts electrophysiology data recorded with an Intan headstage and stored in the .rhd format.

The data are intracranial recordings from a mouse, acquired with a 16-channel microarray. The paper can be found here: https://doi.org/10.1371/journal.pone.0221510


# Getting Set Up

Open a terminal and make sure the "sorter" environment is active.

```
conda deactivate
conda activate sorter
```

Navigate to the correct directory and list the available folders.

```
cd ~/codespace
box folders:items 352606395707
```

Download the patient-level folder, which contains multiple session folders.
Replace `123456789` with the correct folder ID.
```
box folders:download 123456789 --destination="data"
```

In [None]:
# List the folders in the parent directory, copy the ID associated with the patient recording
!box folders:items 352606395707

----- Folder 352606396623 -----
Type: folder
ID: '352606396623'
Sequence ID: '0'
ETag: '0'
Name: Intan_RDH_2000

----- Folder 354522525287 -----
Type: folder
ID: '354522525287'
Sequence ID: '0'
ETag: '0'
Name: Intan_RDH_2000 (1)


In [2]:
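# Use the %cd magic: `!cd` runs in a separate subshell and does not change the notebook's working directory
%cd ~/codespace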
!box folders:download 352606396623 --destination="data"

[?25l[36m⠋[39m Starting download[2K[1G[36m⠙[39m Starting download[2K[1G[36m⠹[39m Starting download[2K[1G[36m⠸[39m Starting download[2K[1G[36m⠼[39m Creating folder 352606396623 at Intan_RDH_2000[2K[1G[36m⠴[39m Creating folder 352606396623 at Intan_RDH_2000[2K[1G[36m⠦[39m Creating folder 352605477299 at Intan_RDH_2000/Session1[2K[1G[36m⠧[39m Creating folder 352605477299 at Intan_RDH_2000/Session1[2K[1G[36m⠇[39m Creating folder 352605477299 at Intan_RDH_2000/Session1[2K[1G[36m⠏[39m Downloading file 2054370143795 to Intan_RDH_2000/Session1/raw/Intan RHD file1.rhd[2K[1G[1A[2K[1G[36m⠋[39m Downloading file 2054370143795 to Intan_RDH_2000/Session1/raw/Intan RHD file1.rhd[2K[1G[1A[2K[1G[36m⠙[39m Downloading file 2054370143795 to Intan_RDH_2000/Session1/raw/Intan RHD file1.rhd[2K[1G[1A[2K[1G[36m⠹[39m Downloading file 2054370143795 to Intan_RDH_2000/Session1/raw/Intan RHD file1.rhd[2K[1G[1A[2K[1G[36m⠸[39m Downloading file 20543701

In [None]:
import os
from pathlib import Path

import probeinterface as pi
import spikeinterface.full as si
from spikeinterface.sorters import run_sorter

# Folder layout: ~/codespace/data/<patient>/<session>/{raw,sorted}
codespace = Path.home() / "codespace"
base_folder = codespace / "data"
patient = "Intan_RDH_2000"
session = "Session1"
session_location = base_folder / patient / session
sorted_data = session_location / "sorted"
sorter_output_folder = sorted_data / "sorter_folder"
analyzer_folder = sorted_data / "analyzer_folder"

os.chdir(session_location)

# Raw Intan file for this session, built from the paths above instead of a hard-coded home directory
intan_file = session_location / "raw" / "Intan RHD file1.rhd"
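
Because the raw file name can differ between sessions, it may help to look the file up rather than type it. This is a minimal sketch under the assumption that every session keeps its recordings in a `raw` subfolder, as the download log above suggests. The helper name `find_rhd_files` is hypothetical, and the demonstration runs on a throwaway folder containing an empty placeholder file.

```python
from pathlib import Path
import tempfile


def find_rhd_files(session_folder):
    """Return the .rhd files under <session_folder>/raw, sorted by name."""
    raw_folder = Path(session_folder) / "raw"
    if not raw_folder.is_dir():
        raise FileNotFoundError(f"No raw folder at {raw_folder}")
    return sorted(raw_folder.glob("*.rhd"))


# Demonstration on a temporary folder with an empty placeholder file
with tempfile.TemporaryDirectory() as tmp:
    demo_session = Path(tmp) / "Session1"
    (demo_session / "raw").mkdir(parents=True)
    (demo_session / "raw" / "Intan RHD file1.rhd").touch()
    found_names = [p.name for p in find_rhd_files(demo_session)]

print(found_names)
```

In the notebook, `find_rhd_files(session_location)` would be called on the real session folder.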

# Load recording into SpikeInterface

In [32]:
# Load the Intan recording (stream "0" is assumed to hold the amplifier channels of the RHD file)
rec = si.read_intan(intan_file, stream_id="0")
rec

In [None]:
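# Create custom probe geometry: 16 contacts in a single column, 50 apart in y
# (uncomment and run once to build `probe` and write the probe JSON file)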

# probe = pi.Probe(ndim=2)
# positions = []

# for i in range(16):
#     positions.append([0, i * 50])
# probe.set_contacts(positions = positions, shapes = "circle", shape_params = {'radius':5})

# probe.set_device_channel_indices(range(16))
# probe.set_contact_ids([f"ch{i}" for i in range(16)])

# probe_path = codespace / "sorting_script/neuronexus-A16x1_2mm_50_177_A16.json"
# pi.write_probeinterface(probe_path, probe)
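
The geometry in the cell above is a single column of 16 contacts with a constant spacing of 50 along y. The sketch below reproduces only the position arithmetic in plain Python so the layout can be checked without probeinterface. The function name `linear_contact_positions` and the parameter name `pitch` are hypothetical, and the physical unit of the spacing is whatever unit the probe object uses.

```python
def linear_contact_positions(n_contacts=16, pitch=50.0):
    """Return [x, y] positions for a single-column (linear) probe.

    All contacts share x = 0 and contact i sits at y = i * pitch.
    """
    return [[0.0, i * pitch] for i in range(n_contacts)]


positions = linear_contact_positions()

# Distance between the first and last contact along the shank
span = positions[-1][1] - positions[0][1]
print(len(positions), span)
```

A list like `positions` has the same shape as the one passed to `probe.set_contacts` in the cell above.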


In [4]:
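# Attach Probe to Recording
# `probe` must be defined before this cell runs; here it is loaded from the probe JSON file
probe_path = codespace / "sorting_script/neuronexus-A16x1_2mm_50_177_A16.json"
probe = pi.read_probeinterface(probe_path).probes[0]

n_rec = rec.get_num_channels()
n_probe = probe.get_contact_count()

# Check the channel count before attaching so a mismatch gives a clear error
if n_probe != n_rec:
    raise ValueError(f"Probe contacts ({n_probe}) != recording channels ({n_rec}). "
                     f"Pick the correct probe variant or subset/remap accordingly.")

rec = rec.set_probe(probe)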

In [None]:
# Run Kilosort
sorting_KS4 = run_sorter(
    sorter_name="kilosort4",
    recording=rec,
    folder=sorter_output_folder,
    remove_existing_folder=True,
    verbose=True,
)

write_binary_recording (no parallelization): 100%|██████████| 1201/1201 [00:08<00:00, 143.73it/s]
kilosort.run_kilosort:  
kilosort.run_kilosort: Computing preprocessing variables.
kilosort.run_kilosort: ----------------------------------------
kilosort.run_kilosort: N samples: 24000480
kilosort.run_kilosort: N seconds: 1200.024
kilosort.run_kilosort: N batches: 401
kilosort.run_kilosort: Preprocessing filters computed in 0.36s; total 0.36s
kilosort.run_kilosort:  
kilosort.run_kilosort: Resource usage after preprocessing
kilosort.run_kilosort: ********************************************************
kilosort.run_kilosort: CPU usage:     4.00 %
kilosort.run_kilosort: Mem used:      3.90 %     |       4.89 GB
kilosort.run_kilosort: Mem avail:    119.55 / 124.44 GB
kilosort.run_kilosort: ------------------------------------------------------
kilosort.run_kilosort: GPU usage:    `conda install pynvml` for GPU usage
kilosort.run_kilosort: GPU memory:    1.86 %     |      0.27   /    14.56 

kilosort4 run time 43.42s
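
The numbers in the log above can be cross-checked with a little arithmetic. This sketch derives the sampling rate from the logged sample count and duration, then recomputes the batch and chunk counts. The batch size of 60,000 samples is an assumption (believed to be the Kilosort4 default) and the 1 s chunk length is assumed from the `chunk_duration="1s"` setting used elsewhere in this notebook. Neither value is printed in the log itself.

```python
import math

# Values copied from the log above
n_samples = 24_000_480
duration_s = 1200.024

# Sampling rate implied by the log
fs = n_samples / duration_s

# Assumption: Kilosort4 batch size of 60,000 samples (not shown in the log)
batch_size = 60_000
n_batches = math.ceil(n_samples / batch_size)

# Assumption: 1 s chunks when writing the binary recording
chunk_duration_s = 1.0
n_chunks = math.ceil(duration_s / chunk_duration_s)

print(round(fs), n_batches, n_chunks)
```

Under these assumptions the results agree with the 401 batches and 1201 chunks reported in the log, for a 20 kHz recording lasting about 20 minutes.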


In [None]:
# Create Sorting Analyzer
import spikeinterface.full as si

# Load Recording
recording = si.read_intan(intan_file, stream_id = "0")
recording = recording.set_probe(probe, in_place=False)
recording = si.unsigned_to_signed(recording)
recording_filtered = si.bandpass_filter(recording)

job_kwargs = dict(n_jobs=-1, progress_bar=True, chunk_duration="1s")

sorting_analyzer = si.create_sorting_analyzer(
    sorting=sorting_KS4,
    recording=recording_filtered,
    folder=analyzer_folder,
    format="binary_folder",
    overwrite=True,
    **job_kwargs,
)

sorting_analyzer.compute("random_spikes", method="uniform", max_spikes_per_unit=500)
sorting_analyzer.compute("waveforms", **job_kwargs)
sorting_analyzer.compute("templates", **job_kwargs)
sorting_analyzer.compute("noise_levels")
sorting_analyzer.compute("unit_locations", method = "monopolar_triangulation")
sorting_analyzer.compute("isi_histograms")
sorting_analyzer.compute("correlograms", window_ms=100, bin_ms=5)
sorting_analyzer.compute("principal_components", n_components=3, mode="by_channel_global", whiten=True, **job_kwargs)
sorting_analyzer.compute("quality_metrics", metric_names=["snr", "firing_rate"])
sorting_analyzer.compute("template_similarity")
sorting_analyzer.compute("spike_amplitudes", **job_kwargs)

estimate_sparsity (workers: 32 processes): 100%|██████████| 1201/1201 [00:00<00:00, 2837.88it/s]
compute_waveforms (workers: 32 processes): 100%|██████████| 1201/1201 [00:00<00:00, 1845.96it/s]
noise_level (no parallelization): 100%|██████████| 20/20 [00:00<00:00, 97.42it/s]
Fitting PCA: 100%|██████████| 28/28 [00:00<00:00, 98.48it/s] 
Projecting waveforms: 100%|██████████| 28/28 [00:00<00:00, 505.31it/s]
spike_amplitudes (workers: 32 processes): 100%|██████████| 1201/1201 [00:01<00:00, 1140.97it/s]


<spikeinterface.postprocessing.spike_amplitudes.ComputeSpikeAmplitudes at 0x7ab9983d4ad0>
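
The `snr` and `firing_rate` metrics computed above can serve as a first pass before manual curation. The sketch below shows one possible way to flag units using plain dictionaries. The function name `select_units`, both threshold values, and the metric values in `example_metrics` are made up for illustration and are not results from this recording.

```python
def select_units(metrics, min_snr=5.0, min_firing_rate_hz=0.1):
    """Return the IDs of units whose SNR and firing rate meet both thresholds.

    `metrics` maps unit ID -> {"snr": float, "firing_rate": float}.
    The default thresholds are hypothetical, not recommended values.
    """
    return [
        unit_id
        for unit_id, m in metrics.items()
        if m["snr"] >= min_snr and m["firing_rate"] >= min_firing_rate_hz
    ]


# Made-up metric values for illustration only
example_metrics = {
    0: {"snr": 8.2, "firing_rate": 3.1},
    1: {"snr": 2.4, "firing_rate": 12.0},
    2: {"snr": 6.7, "firing_rate": 0.05},
}

kept = select_units(example_metrics)
print(kept)

# With the real analyzer, a dictionary of this shape could probably be built with
# sorting_analyzer.get_extension("quality_metrics").get_data().to_dict("index")
# (assumption: get_data() returns a pandas DataFrame indexed by unit ID).
```

Only unit 0 passes both thresholds in this toy example. Unit 1 fails on SNR and unit 2 on firing rate.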

In [12]:
analyzer_folder

PosixPath('/home/marco/codespace/data/Intan_RDH_2000/Session1/sorted/analyzer_folder')

# Run the curation GUI

Run the following command in a terminal, replacing the path with the analyzer folder of the session you want to curate.

sigui --mode=web --curation "/home/marco/codespace/data/Intan_RDH_2000/Session2/sorted/analyzer_folder"

In [None]:
# Run Curation GUI

# TODO: figure out how to clear the cache between runs so that old curation data is removed and phantom labels do not appear.

# The web app panel caches the curation JSON, so it needs to be flushed between sessions.
# import spikeinterface.full as si
# from spikeinterface_gui import run_mainwindow

# analyzer_folder = r"/home/marco/codespace/data/Intan_RDH_2000/Session2/sorted/analyzer_folder"

# sorting_analyzer = si.load_sorting_analyzer(folder=analyzer_folder)

# run_mainwindow(sorting_analyzer, mode="web", curation=True)

Found available port: 57861
Launching server at http://localhost:57861




<spikeinterface_gui.backend_panel.PanelMainWindow at 0x7abb963c1610>

In [None]:
# Identify the patient folder by name and copy its ID
!box folders:items 352606395707

----- Folder 352606396623 -----
Type: folder
ID: '352606396623'
Sequence ID: '0'
ETag: '0'
Name: Intan_RDH_2000

----- Folder 354522525287 -----
Type: folder
ID: '354522525287'
Sequence ID: '0'
ETag: '0'
Name: Intan_RDH_2000 (1)


In [None]:
# List the folders inside the patient folder; each one represents a session
# Use the patient folder ID copied above to list the sessions
# Identify the correct session and copy its ID
!box folders:items 352606396623

----- Folder 352605477299 -----
Type: folder
ID: '352605477299'
Sequence ID: '0'
ETag: '0'
Name: Session1

----- Folder 352604968054 -----
Type: folder
ID: '352604968054'
Sequence ID: '0'
ETag: '0'
Name: Session2


In [None]:
# Upload the sorted folder to the correct session folder, using the ID copied from above
!box folders:upload "~/codespace/data/Intan_RDH_2000/Session1/sorted" --parent-folder 352605477299

[36mType:[39m folder
[36mID:[39m '354623627238'
[36mSequence ID:[39m '0'
[36mETag:[39m '0'
[36mName:[39m sorted
[36mCreated At:[39m '2025-12-06T13:02:43-08:00'
[36mModified At:[39m '2025-12-06T13:02:43-08:00'
[36mDescription:[39m ''
[36mSize:[39m 0
[36mPath Collection:[39m
[36m    Total Count:[39m 4
[36m    Entries:[39m
[36m        -[39m
[36m            Type:[39m folder
[36m            ID:[39m '0'
[36m            Sequence ID:[39m null
[36m            ETag:[39m null
[36m            Name:[39m All Files
[36m        -[39m
[36m            Type:[39m folder
[36m            ID:[39m '352606395707'
[36m            Sequence ID:[39m '0'
[36m            ETag:[39m '0'
[36m            Name:[39m Cloud_Sorter
[36m        -[39m
[36m            Type:[39m folder
[36m            ID:[39m '352606396623'
[36m            Sequence ID:[39m '0'
[36m            ETag:[39m '0'
[36m            Name:[39m Intan_RDH_2000
[36m        -[39m
[36m            Typ
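
Before running an upload like the one above, it can be useful to confirm that the local `sorted` folder actually contains files. This is a minimal sketch that counts files and bytes under a folder. The helper name `summarize_folder` is hypothetical, and the demonstration uses a temporary folder with placeholder files whose names are invented for the example.

```python
from pathlib import Path
import tempfile


def summarize_folder(folder):
    """Return (number of files, total size in bytes) under `folder`, recursively."""
    files = [p for p in Path(folder).rglob("*") if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)
    return len(files), total_bytes


# Demonstration on a temporary folder with placeholder files (names are invented)
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "sorted"
    (demo / "sorter_folder").mkdir(parents=True)
    (demo / "sorter_folder" / "example.bin").write_bytes(b"\x00" * 128)
    (demo / "analyzer_folder").mkdir()
    (demo / "analyzer_folder" / "example.json").write_text("{}")
    n_files, n_bytes = summarize_folder(demo)

print(n_files, n_bytes)
```

In the notebook, `summarize_folder(sorted_data)` would report on the real folder before it is uploaded.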

In [None]:
# Upload the sorted folder to the correct session folder, using the ID copied from above
!box folders:upload "/home/marco/codespace/data/Intan_RDH_2000/Session2/sorted" --parent-folder 352604968054

[36mType:[39m folder
[36mID:[39m '354623985218'
[36mSequence ID:[39m '0'
[36mETag:[39m '0'
[36mName:[39m sorted
[36mCreated At:[39m '2025-12-06T13:06:28-08:00'
[36mModified At:[39m '2025-12-06T13:06:28-08:00'
[36mDescription:[39m ''
[36mSize:[39m 0
[36mPath Collection:[39m
[36m    Total Count:[39m 4
[36m    Entries:[39m
[36m        -[39m
[36m            Type:[39m folder
[36m            ID:[39m '0'
[36m            Sequence ID:[39m null
[36m            ETag:[39m null
[36m            Name:[39m All Files
[36m        -[39m
[36m            Type:[39m folder
[36m            ID:[39m '352606395707'
[36m            Sequence ID:[39m '0'
[36m            ETag:[39m '0'
[36m            Name:[39m Cloud_Sorter
[36m        -[39m
[36m            Type:[39m folder
[36m            ID:[39m '352606396623'
[36m            Sequence ID:[39m '0'
[36m            ETag:[39m '0'
[36m            Name:[39m Intan_RDH_2000
[36m        -[39m
[36m            Typ