# Sorting Notebook

This notebook downloads and sorts electrophysiology data collected with an Intan headstage and stored in the .rhd format.

The data are intracranial mouse recordings from a 16-channel microarray. The paper can be found here: https://doi.org/10.1371/journal.pone.0221510


# Getting Set Up

1. Open a terminal. Make sure the `sorter` conda environment is active by running these commands in the terminal:

```
conda deactivate
conda activate sorter
```
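
The active environment can also be checked from inside the notebook. This is a minimal sketch, assuming the kernel was started from the activated environment so that conda's `CONDA_DEFAULT_ENV` variable is set; the helper name `active_conda_env` is hypothetical and not part of `zsort`:

```python
import os


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None if it is unset."""
    env = os.environ if environ is None else environ
    return env.get("CONDA_DEFAULT_ENV")


env_name = active_conda_env()
if env_name != "sorter":
    # Only a warning: the kernel may have been launched in a different way.
    print(f"Expected the 'sorter' environment, found: {env_name!r}")
```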


In [None]:
# 2. Import scripts
from sorting_scripts import zsort

In [None]:
# 3. Set Patient and Session Paths
patient = "Intan_RDH_2000"
session = "Session1"

path_dict = zsort.set_paths(patient, session)
path_dict

1
Found Intan file: /data/Intan_RDH_2000/Session1/raw/Intan RHD file1.rhd


{'patient': 'Intan_RDH_2000',
 'session': 'Session1',
 'repo_root': PosixPath('/home/marco/codespace/sorting_script'),
 'data_root': PosixPath('/data'),
 'session_location': PosixPath('/data/Intan_RDH_2000/Session1'),
 'sorted_data': PosixPath('/data/Intan_RDH_2000/Session1/Intan_RDH_2000-Session1-sorted'),
 'sorter_output_folder': PosixPath('/data/Intan_RDH_2000/Session1/Intan_RDH_2000-Session1-sorted/Intan_RDH_2000-Session1-sorter_folder'),
 'analyzer_folder': PosixPath('/data/Intan_RDH_2000/Session1/Intan_RDH_2000-Session1-sorted/Intan_RDH_2000-Session1-analyzer_folder'),
 'intan_file': PosixPath('/data/Intan_RDH_2000/Session1/raw/Intan RHD file1.rhd')}
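
The folder names in the dictionary above follow a `<patient>-<session>-<suffix>` pattern. The sketch below reproduces that pattern with `pathlib`; it is not the actual `zsort.set_paths` implementation, and the function name `sketch_paths` and its `data_root` default are assumptions made for illustration:

```python
from pathlib import PurePosixPath


def sketch_paths(patient, session, data_root="/data"):
    """Build session paths following the '<patient>-<session>-<suffix>' pattern."""
    session_location = PurePosixPath(data_root) / patient / session
    prefix = f"{patient}-{session}"
    sorted_data = session_location / f"{prefix}-sorted"
    return {
        "patient": patient,
        "session": session,
        "data_root": PurePosixPath(data_root),
        "session_location": session_location,
        "sorted_data": sorted_data,
        "sorter_output_folder": sorted_data / f"{prefix}-sorter_folder",
        "analyzer_folder": sorted_data / f"{prefix}-analyzer_folder",
    }


paths = sketch_paths("Intan_RDH_2000", "Session1")
print(paths["analyzer_folder"])
```

Deriving every folder from `patient` and `session` keeps the sorter output, the analyzer, and the curated data next to the raw session they came from.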

# Load recording into SpikeInterface

In [None]:
# Generate a synthetic dataset
# from spikeinterface import generation as gen

# # Load Recording, creates recording object in memory

# rec, drift, sort = gen.generate_drifting_recording(duration = 120)

In [None]:
# 4. Load File

import spikeinterface.full as si

recording_path = path_dict["intan_file"]

rec = si.read_intan(recording_path, stream_id = "0")
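
Before attaching a probe, it can help to confirm what was loaded. The sketch below relies on accessor methods of SpikeInterface recording objects (`get_num_channels`, `get_sampling_frequency`, `get_total_duration`, `get_dtype`). The helper `describe_recording` is a hypothetical name, and the stand-in class with illustrative values exists only so the example runs without the data file:

```python
def describe_recording(rec):
    """Collect basic properties of a recording into a dictionary."""
    return {
        "num_channels": rec.get_num_channels(),
        "sampling_frequency_hz": rec.get_sampling_frequency(),
        "duration_s": rec.get_total_duration(),
        "dtype": str(rec.get_dtype()),
    }


class _StandInRecording:
    """Stand-in exposing the same accessors; the values are illustrative only."""

    def get_num_channels(self):
        return 16

    def get_sampling_frequency(self):
        return 20000.0

    def get_total_duration(self):
        return 1200.024

    def get_dtype(self):
        return "uint16"


summary = describe_recording(_StandInRecording())
for key, value in summary.items():
    print(f"{key}: {value}")
```

In the notebook, `describe_recording(rec)` would be called on the object returned by `si.read_intan`.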

In [None]:
# 5. Set Probe
probe = "neuronexus-A16x1_2mm_50_177_A16.json"
rec = zsort.set_probe(rec, path_dict, probe)

no probe, attaching one


# Load sorter and analyzer if they exist

# If they do not exist, run the sorter and create analyzer

In [None]:
# 6. Sort

sort = zsort.sort(rec, path_dict)

[zsort] recording dtype is unsigned (uint16); converting to signed


write_binary_recording (no parallelization):   0%|          | 0/1201 [00:00<?, ?it/s]

kilosort.run_kilosort:  
kilosort.run_kilosort: Computing preprocessing variables.
kilosort.run_kilosort: ----------------------------------------
kilosort.run_kilosort: N samples: 24000480
kilosort.run_kilosort: N seconds: 1200.024
kilosort.run_kilosort: N batches: 401
kilosort.run_kilosort: Preprocessing filters computed in 0.45s; total 0.45s
kilosort.run_kilosort:  
kilosort.run_kilosort: Resource usage after preprocessing
kilosort.run_kilosort: ********************************************************
kilosort.run_kilosort: CPU usage:     7.10 %
kilosort.run_kilosort: Mem used:      5.70 %     |       3.55 GB
kilosort.run_kilosort: Mem avail:    58.55 / 62.10 GB
kilosort.run_kilosort: ------------------------------------------------------
kilosort.run_kilosort: GPU usage:    `conda install pynvml` for GPU usage
kilosort.run_kilosort: GPU memory:    1.86 %     |      0.27   /    14.58 GB
kilosort.run_kilosort: Allocated:     0.06 %     |      0.01   /    14.58 GB
kilosort.run_kilosor

kilosort4 run time 44.02s
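
The numbers in the log above can be cross-checked with a little arithmetic. The sample count and duration are copied from the log; the batch size of 60000 samples and the 1 s chunk duration are assumptions (they are not printed in the output), chosen because they are consistent with the reported 401 batches and the 1201 steps of the progress bar:

```python
import math

n_samples = 24_000_480  # "N samples" from the Kilosort log
n_seconds = 1200.024    # "N seconds" from the Kilosort log

# Sampling rate implied by the two values above.
sampling_rate_hz = round(n_samples / n_seconds)

# Assumption: batch size of 60000 samples (not shown in the log).
batch_size = 60_000
n_batches = math.ceil(n_samples / batch_size)

# Assumption: write_binary_recording processes the data in 1 s chunks.
chunk_duration_s = 1.0
n_chunks = math.ceil(n_seconds / chunk_duration_s)

print(sampling_rate_hz, n_batches, n_chunks)  # 20000 401 1201
```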


In [None]:
# 7. Analyze
zsort.analyze(rec, sort, path_dict)

No valid analyzer found, creating a new one
Reason: This folder does not exists /data/Intan_RDH_2000/Session1/Intan_RDH_2000-Session1-sorted/Intan_RDH_2000-Session1-analyzer_folder
[zsort] recording dtype is unsigned (uint16); converting to signed


estimate_sparsity (workers: 16 processes):   0%|          | 0/1201 [00:00<?, ?it/s]



compute_waveforms (workers: 16 processes):   0%|          | 0/1201 [00:00<?, ?it/s]

noise_level (workers: 16 processes):   0%|          | 0/20 [00:00<?, ?it/s]

Fitting PCA:   0%|          | 0/32 [00:00<?, ?it/s]

Projecting waveforms:   0%|          | 0/32 [00:00<?, ?it/s]

Compute : spike_amplitudes (workers: 16 processes):   0%|          | 0/1201 [00:00<?, ?it/s]

SortingAnalyzer: 16 channels - 32 units - 1 segments - binary_folder - sparse - has recording
Loaded 11 extensions: random_spikes, waveforms, templates, noise_levels, unit_locations, isi_histograms, correlograms, principal_components, template_similarity, spike_amplitudes, quality_metrics
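
The message `[zsort] recording dtype is unsigned (uint16); converting to signed` appears in both the sort and the analyze output. How `zsort` performs the conversion is not shown here. A common approach, sketched with NumPy under that caveat, is to subtract the midpoint of the unsigned range (32768 for 16 bits) and store the result as `int16`; the function name `uint16_to_int16` is hypothetical:

```python
import numpy as np


def uint16_to_int16(traces):
    """Shift unsigned 16-bit samples so that mid-scale (32768) maps to zero."""
    traces = np.asarray(traces, dtype=np.uint16)
    # Widen to int32 first so the subtraction cannot wrap around.
    return (traces.astype(np.int32) - 32768).astype(np.int16)


raw = np.array([0, 32768, 65535], dtype=np.uint16)
signed = uint16_to_int16(raw)
print(signed)  # [-32768      0  32767]
```

SpikeInterface also ships an `unsigned_to_signed` preprocessing step that serves the same purpose on recording objects.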

## 8/9. Launch Curator from Terminal (make sure to replace the path with the correct one)
## Copy and paste this into the terminal
```
sigui --mode=web --curation "/data/Intan_RDH_2000/Session1/Intan_RDH_2000-Session1-sorted/Intan_RDH_2000-Session1-analyzer_folder"
```
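
To avoid typing the path by hand, the command can be assembled from the analyzer folder stored in `path_dict`. A small sketch, assuming the same `sigui` flags as in the command above; the helper name `sigui_command` is hypothetical:

```python
import shlex


def sigui_command(analyzer_folder):
    """Return a sigui command line, quoting the path for the shell if needed."""
    return f"sigui --mode=web --curation {shlex.quote(str(analyzer_folder))}"


example_folder = (
    "/data/Intan_RDH_2000/Session1/Intan_RDH_2000-Session1-sorted/"
    "Intan_RDH_2000-Session1-analyzer_folder"
)
print(sigui_command(example_folder))
```

In the notebook, `print(sigui_command(path_dict["analyzer_folder"]))` would give a line that can be pasted into the terminal.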

In [None]:
# 10. Save Curated Data
curated_data = zsort.save_curated_data(patient, session, sort, path_dict)

compute_waveforms (no parallelization):   0%|          | 0/1201 [00:00<?, ?it/s]

noise_level (no parallelization):   0%|          | 0/20 [00:00<?, ?it/s]

Fitting PCA:   0%|          | 0/3 [00:00<?, ?it/s]

Projecting waveforms:   0%|          | 0/3 [00:00<?, ?it/s]

Compute : spike_amplitudes (no parallelization):   0%|          | 0/1201 [00:00<?, ?it/s]

Wrote: /data/Intan_RDH_2000/Session1/Intan_RDH_2000-Session1-sorted/Intan_RDH_2000-Session1-curated_analyzer.zarr


### View curated data, just for fun

```
sigui --mode=web --curation "/data/Intan_RDH_2000/Session1/Intan_RDH_2000-Session1-sorted/Intan_RDH_2000-Session1-curated_analyzer.zarr"
```

In [8]:
zsort.generate_figures(curated_data, path_dict)

In [None]:
# Interact with Box via the API

!box folders:items 352606395707

----- Folder 352606396623 -----
Type: folder
ID: '352606396623'
Sequence ID: '0'
ETag: '0'
Name: Intan_RDH_2000

----- Folder 354522525287 -----
Type: folder
ID: '354522525287'
Sequence ID: '0'
ETag: '0'
Name: Intan_RDH_2000 (1)


In [28]:
!box folders:items 352606396623

----- Folder 352605477299 -----
Type: folder
ID: '352605477299'
Sequence ID: '0'
ETag: '0'
Name: Session1

----- Folder 352604968054 -----
Type: folder
ID: '352604968054'
Sequence ID: '0'
ETag: '0'
Name: Session2


In [29]:
!box folders:items 352605477299

----- Folder 352607353389 -----
Type: folder
ID: '352607353389'
Sequence ID: '0'
ETag: '0'
Name: raw

----- Folder 354623627238 -----
Type: folder
ID: '354623627238'
Sequence ID: '0'
ETag: '0'
Name: sorted


In [30]:
!box folders:upload 354623627238

Could not read directory 354623627238

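The upload attempt with only a folder ID fails because `folders:upload` is given the destination folder ID where a local directory path is expected. The successful call in this notebook passes the local path first and the Box parent folder ID with `-p`. The sketch below assembles that call and catches the mistake early; `box_upload_command` is a hypothetical helper, and its argument order simply mirrors the successful command:

```python
import tempfile
from pathlib import Path


def box_upload_command(local_folder, parent_folder_id):
    """Return the argument list for `box folders:upload <path> -p <parent id>`."""
    local_folder = Path(local_folder)
    if not local_folder.is_dir():
        raise NotADirectoryError(f"Not a local directory: {local_folder}")
    return ["box", "folders:upload", str(local_folder), "-p", str(parent_folder_id)]


# Demonstration with a temporary directory standing in for the analyzer folder.
with tempfile.TemporaryDirectory() as tmp_dir:
    command = box_upload_command(tmp_dir, 354623627238)
    print(" ".join(command))
```

The returned list can be passed to `subprocess.run(command, check=True)` once the Box CLI is installed and authenticated.
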
In [58]:
!box folders:upload "/home/marco/codespace/data/Intan_RDH_2000/Session1/sorted/cleaned_analyzer.zarr" -p 354623627238

Type: folder
ID: '355655922049'
Sequence ID: '0'
ETag: '0'
Name: cleaned_analyzer.zarr
Created At: '2025-12-12T11:40:22-08:00'
Modified At: '2025-12-12T11:40:22-08:00'
Description: ''
Size: 0
Path Collection:
    Total Count: 5
    Entries:
        -
            Type: folder
            ID: '0'
            Sequence ID: null
            ETag: null
            Name: All Files
        -
            Type: folder
            ID: '352606395707'
            Sequence ID: '0'
            ETag: '0'
            Name: Cloud_Sorter
        -
            Type: folder
            ID: '352606396623'
            Sequence ID: '0'
            ETag: '0'
            Name: Intan_RDH_2000
        -