# Download ground truths

author: laquitainesteeve@gmail.com

purpose: download ground truths from the DANDI Archive

Execution time: 3 secs

Special hardware: none; runs on CPU and does not require a GPU.

# Setup

Activate the dandi virtual environment (envs/dandi.yml), then register it as a Jupyter kernel:

```bash
python -m ipykernel install --user --name ground_truth --display-name "ground_truth"
```
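
Before running the cells below, it can help to confirm that the active kernel actually sees the packages this notebook imports. The following is a minimal sketch, not part of the original notebook: `missing_packages` is a hypothetical helper, and the package list is taken from the import cell, plus `tables` (PyTables), which `pd.read_hdf` relies on.

```python
from importlib.util import find_spec

# Top-level packages imported by this notebook; "tables" (PyTables) is
# included because pandas needs it to read HDF5 files with pd.read_hdf.
REQUIRED = ["numpy", "pandas", "tables", "dandi", "pynwb", "spikeinterface", "dateutil"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("missing packages:", ", ".join(missing))
else:
    print("all required packages are available")
```

If the check reports missing packages, the kernel is probably not the one registered from envs/dandi.yml.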

In [6]:
%%time 
%load_ext autoreload
%autoreload 2

# import python packages
import os
import shutil
import uuid
from datetime import datetime
from time import time

import numpy as np
import pandas as pd
from dateutil.tz import tzlocal
from dandi.dandiapi import DandiAPIClient
from pynwb import NWBHDF5IO
from pynwb.file import NWBFile, Subject
import spikeinterface
import spikeinterface as si
import spikeinterface.extractors as se
import spikeinterface.sorters as ss

print("spikeinterface", spikeinterface.__version__)

proj_path = "/home/steeve/steeve/epfl/code/spikebias"
os.chdir(proj_path)

# custom package
from src.nodes.dataloader.dataloader import SortingLoader

# setup parameters
SAVE = True # save sorting extractor

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload
spikeinterface 0.101.2
CPU times: user 783 μs, sys: 0 ns, total: 783 μs
Wall time: 696 μs


## Functions

In [9]:
def save_ground_truth_metadata(property_data_path: str, Sorting, ground_truth_path: str, save: bool):
    """Add cell properties to the ground truth Sorting Extractor

    Args:
        property_data_path (str): path of the cell properties h5 file (e.g., assets/metadata/silico_neuropixels/npx_evoked/cell_properties.h5)
        Sorting: ground truth Sorting Extractor
        ground_truth_path (str): path where the ground truth is saved
        save (bool): whether to save the Sorting Extractor
    
    Returns:
        Sorting Extractor with the cell properties set
    """
    # read cell properties from the h5 file
    df = pd.read_hdf(property_data_path, key="cell_properties")
    
    # add as property to Sorting Extractor
    for prop in df.columns:
        Sorting.set_property(prop, df[prop].values.tolist())
    
    # make a "layers" copy of "layer"
    # for convenience
    Sorting.set_property("layers", df["layer"].values.tolist())

    # save Sorting Extractor
    if save:
        shutil.rmtree(ground_truth_path, ignore_errors=True)
        Sorting.save(folder=ground_truth_path)
    return Sorting
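
`set_property` expects one value per unit when no unit ids are passed, so a cell properties table whose row count differs from the number of units would fail partway through the loop. A minimal sketch of a length check before attaching properties follows. `attach_properties` and `StubSorting` are hypothetical names used for illustration; the stub only mimics the `get_unit_ids` and `set_property` methods of a Sorting Extractor so the example runs without data, and the property values are toy values, not taken from the dataset.

```python
class StubSorting:
    """Hypothetical stand-in exposing only the two methods used below."""

    def __init__(self, unit_ids):
        self._unit_ids = list(unit_ids)
        self.properties = {}

    def get_unit_ids(self):
        return self._unit_ids

    def set_property(self, key, values):
        self.properties[key] = list(values)


def attach_properties(sorting, columns):
    """Attach each column as a unit property after checking its length.

    columns maps a property name to a list with one value per unit,
    e.g. {name: df[name].tolist() for name in df.columns}.
    """
    n_units = len(sorting.get_unit_ids())
    # validate everything first so a failure leaves the extractor unchanged
    for name, values in columns.items():
        if len(values) != n_units:
            raise ValueError(f"property '{name}' has {len(values)} values for {n_units} units")
    for name, values in columns.items():
        sorting.set_property(name, list(values))
    return sorting


# toy values, not taken from the dataset
sorting = attach_properties(
    StubSorting(unit_ids=[11, 12, 13]),
    {"layer": ["L1", "L4", "L5"]},
)
print(sorting.properties["layer"])
```

Validating all columns before setting any of them keeps the extractor unchanged when the table and the units do not line up.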

## Load ground truth

### npx_spont

note: the Sorting Extractor already contains the cell metadata

In [None]:
%%time

# load dandiset (npx, spontaneous, 40 kHz)
dandiset_id = '001250'
filepath = 'sub-001/sub-001_ecephys.nwb' # ground truth spikes and unit metadata of biophy npx spontaneous
SFREQ = 40000 # sampling frequency
TSTART = 0 # default - these are timestamps

# Instantiate and load sorting
loader = SortingLoader(dandiset_id, filepath, SFREQ, TSTART)
GroundTruth = loader.load_sorting()

# write (overwrite any existing folder so the cell can be rerun)
if SAVE:
    GroundTruth.save(folder=os.path.join(proj_path, "dataset/00_raw/ground_truth_npx_spont"), overwrite=True)

# report
print('\n', GroundTruth)
GroundTruth
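
`SortingLoader` is a project-specific class whose implementation is not shown in this notebook. As a rough sketch of what such a loader might do, the function below resolves the asset's download URL with the DANDI API client and streams it with SpikeInterface's `NwbSortingExtractor`. `stream_ground_truth` is a hypothetical name, and the choice of the `draft` version and of `stream_mode="remfile"` (which needs the `remfile` package) are assumptions, not details taken from the project.

```python
def stream_ground_truth(dandiset_id, filepath, sampling_frequency, t_start=0.0, version="draft"):
    """Stream ground truth units from an NWB file hosted on the DANDI Archive."""
    # imported lazily so the function can be defined without these packages installed
    from dandi.dandiapi import DandiAPIClient
    import spikeinterface.extractors as se

    # resolve the S3 URL of the asset inside the dandiset
    with DandiAPIClient() as client:
        asset = client.get_dandiset(dandiset_id, version).get_asset_by_path(filepath)
        s3_url = asset.get_content_url(follow_redirects=1, strip_query=True)

    # spike times are stored in seconds; frames are derived from
    # sampling_frequency and t_start
    return se.NwbSortingExtractor(
        file_path=s3_url,
        stream_mode="remfile",
        sampling_frequency=sampling_frequency,
        t_start=t_start,
    )


# Example call (requires network access), using the values from the cell above:
# GroundTruth = stream_ground_truth("001250", "sub-001/sub-001_ecephys.nwb", 40000)
```

Streaming avoids downloading the whole NWB file, at the cost of requiring network access every time the extractor reads spike times.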

### npx_evoked

note: here we add cell metadata that was not saved in the DANDI dataset

In [5]:
%%time

# load dandiset (npx, evoked, 20 kHz)
DANDISET_ID = '001250'
FILEPATH = 'sub-002-fitted/sub-002-fitted_ecephys.nwb' # ground truth spikes and unit metadata
CELL_PROPERTIES_PATH = "assets/metadata/silico_neuropixels/npx_evoked/cell_properties.h5"
SAVE_PATH = "dataset/00_raw/ground_truth_npx_evoked"
SFREQ = 20000 # sampling frequency
TSTART = 0    # default - these are timestamps

# load ground truth cells and timestamps
loader = SortingLoader(DANDISET_ID, FILEPATH, SFREQ, TSTART)
GroundTruth = loader.load_sorting()

# add cells metadata
GroundTruth = save_ground_truth_metadata(CELL_PROPERTIES_PATH, GroundTruth, SAVE_PATH, save=False)

# write
if SAVE:
    GroundTruth.save(folder=os.path.join(proj_path, SAVE_PATH), overwrite=True)

# report
print('\n', GroundTruth)
GroundTruth


 NwbSortingExtractor: 1836 units - 1 segments - 20.0kHz
  file_path: https://dandiarchive.s3.amazonaws.com/blobs/9d6/6ed/9d66ed40-af31-43aa-b4ba-246d2206dcad
CPU times: user 1.22 s, sys: 270 ms, total: 1.5 s
Wall time: 24.3 s
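
Once the cell properties are attached, a quick sanity check is to count units per layer. The sketch below assumes the extractor exposes `get_property("layer")`, as SpikeInterface extractors do; `units_per_layer` and `StubSorting` are hypothetical names, and the layer labels are made-up toy values rather than results from this dataset.

```python
from collections import Counter


def units_per_layer(sorting, key="layer"):
    """Count how many units carry each value of the given property."""
    values = sorting.get_property(key)
    if values is None:
        raise KeyError(f"property '{key}' is not set on this extractor")
    # str() normalizes numpy strings to plain Python strings
    return dict(Counter(str(v) for v in values))


class StubSorting:
    """Hypothetical stand-in that only implements get_property."""

    def __init__(self, properties):
        self._properties = properties

    def get_property(self, key):
        return self._properties.get(key)


# toy labels, not taken from the dataset
counts = units_per_layer(StubSorting({"layer": ["L1", "L5", "L5", "L6"]}))
print(counts)
```

With the real extractor, this would be called as `units_per_layer(GroundTruth)` after the metadata step.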
