## Bombcell getting started script

To do:
- JF: add MATLAB changes (SpikeGLX meta)
- JF: generate output plots
- move all the squeeze() and astype() calls into the loading function
- what are the errors / warnings in the main function?
- how to load output, check and modify param, then re-generate and save
- double-check that nothing is hard-coded
- double-check that all Python names match the MATLAB names, converted to snake_case
- check comments and function headers

### Load Python packages

In [2]:
import os
import sys
from pathlib import Path

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Add bombcell to Python path if NOT installed with pip
demo_dir = Path(os.getcwd())
pyBombCell_dir = demo_dir.parent
sys.path.append(str(pyBombCell_dir))

In [3]:
%load_ext autoreload
%autoreload 2

import bombcell as bc



### Define data paths

By default, `ks_dir` points to Bombcell's toy dataset.

In [108]:
ks_dir = demo_dir / "toy_data"  # Replace with your Kilosort directory
raw_dir = None  # Leave as None if there is no raw data
save_path = "~/Downloads/quality_metrics_JF093"  # Replace with the directory where bombcell's output should be saved

# Local dataset used for this run: overrides the toy-data ks_dir above (replace with your own paths)
ks_dir = "/home/netshare/znas-lab/Share/JulieF/for_sam/JF093_2023-03-06_site1/kilosort2/site1"
raw_file = "/home/netshare/znas-lab/Share/JulieF/for_sam/JF093_2023-03-06_site1/site1/2023-03-06_JF093_1_g0_t0_bc_decompressed.imec0.ap.bin"

# If a raw data directory with a meta folder is not given,
# set the gain manually here
gain_to_uV = np.nan
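
The `save_path` above starts with `~`. Whether bombcell expands the tilde itself is not stated here, so one cautious option is to resolve it explicitly with `pathlib` before passing it on. This is a minimal sketch; the name `save_path_resolved` is an illustrative assumption and not part of bombcell.

```python
from pathlib import Path

save_path = "~/Downloads/quality_metrics_JF093"

# Expand "~" to the current user's home directory
save_path_resolved = Path(save_path).expanduser()

print(save_path_resolved)
```

The resolved path can then be passed wherever `save_path` is used.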

### Get parameters

In [137]:
# ephys_raw_data and gain_to_uV will be None if no raw_file is given
ephys_raw_data, meta_path, gain_to_uV = bc.manage_if_raw_data(raw_file, gain_to_uV)

param = bc.get_default_parameters(ks_dir, raw_file=raw_file, ephys_meta_dir=meta_path)
param["compute_distance_metrics"] = 0
param["compute_drift"] = 0
param["compute_time_chunks"] = 0

Using found decompressed data 2023-03-06_JF093_1_g0_t0_bc_decompressed.imec0.ap.bin


### Run bombcell, get unit types and save results
To save the results as Parquet, either PyArrow or fastparquet must be installed.
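
Before running, it may help to check that one of these engines is importable. The sketch below uses only the standard library; the helper name `available_parquet_engines` is hypothetical and not part of bombcell.

```python
import importlib.util


def available_parquet_engines(candidates=("pyarrow", "fastparquet")):
    """Return the names in `candidates` that can be imported in this environment."""
    return [name for name in candidates if importlib.util.find_spec(name) is not None]


engines = available_parquet_engines()
if engines:
    print("Parquet engine(s) available:", ", ".join(engines))
else:
    print("No Parquet engine found; install pyarrow or fastparquet before saving.")
```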

In [None]:
import warnings
warnings.filterwarnings("error") 

(
    quality_metrics,
    param,
    unit_type,
    unit_type_string,
) = bc.run_bombcell(
    ks_dir, raw_dir, save_path, param
)
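
Note that `warnings.filterwarnings("error")` changes the filter for the rest of the session, so every later warning raises an exception. If that is only wanted while debugging a single call, a scoped alternative is `warnings.catch_warnings()`. The sketch below uses a hypothetical `noisy_step()` as a stand-in for the call under test; it does not reproduce any warning that bombcell actually emits.

```python
import warnings


def noisy_step():
    """Hypothetical stand-in for a call that emits a warning."""
    warnings.warn("example warning", RuntimeWarning)
    return 42


# Inside this block, warnings are raised as exceptions
with warnings.catch_warnings():
    warnings.simplefilter("error")
    try:
        noisy_step()
        raised = False
    except RuntimeWarning:
        raised = True

# Alternatively, record warnings instead of raising them
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = noisy_step()

print(raised, result, [str(w.message) for w in caught])
```

On leaving each `with` block, the previous warning filters are restored.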

In [139]:
unit_type_string

array(['NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE',
       'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE',
       'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE',
       'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE',
       'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE',
       'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE',
       'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE',
       'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE',
       'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE',
       'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE',
       'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE',
       'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE',
       'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE',
       'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE', 'NOISE',
       'NOISE', 'NOI
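
Printing the full array is hard to read for many units; a per-label count is usually a quicker sanity check. The sketch below uses a short hypothetical list in place of `unit_type_string`. Only `'NOISE'` appears in the output above, so the other labels are placeholders for illustration.

```python
from collections import Counter

# Hypothetical stand-in for unit_type_string
unit_type_string = ["NOISE", "GOOD", "NOISE", "MUA", "NOISE", "GOOD"]

# For a NumPy array, use Counter(unit_type_string.tolist()) instead
unit_type_counts = Counter(unit_type_string)

for label, count in unit_type_counts.most_common():
    print(f"{label}: {count}")
```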

(Optional) Look at a table showing why each unit fails.

In [None]:
qm_table = bc.make_qm_table(
    quality_metrics, param, unique_templates, unit_type_string
)
# JF, to check:
# - the table does not include distance metrics, drift metrics, amplitude, ...?
# - replace "somatic" with the 2 quality metrics for that
# - use more sensible table headers (e.g. "Peaks" should be "n_Peaks", "Good unit" should be "unit_type")
qm_table



{'phy_cluster_id': array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.,
        13., 14.]),
 'cluster_id': array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12.,
        13., 14.]),
 'use_these_times_start': array([3.25366667e-01, 3.25366667e-01, 2.16032537e+03, 2.88032537e+03,
        3.25366667e-01, 2.52032537e+03, 7.20325367e+02, 1.80032537e+03,
        7.20325367e+02, 3.60325367e+02, 1.08032537e+03, 3.25366667e-01,
        2.16032537e+03, 3.25366667e-01, 1.08032537e+03]),
 'use_these_times_stop': array([1800.32536667, 4320.32536667, 3960.32536667, 3600.32536667,
        1440.32536667, 3960.32536667, 3960.32536667, 2880.32536667,
        2520.32536667,  720.32536667, 2520.32536667, 3960.32536667,
        3240.32536667, 3960.32536667, 3240.32536667]),
 'RPV_use_tauR_est': array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]),
 'percent_missing_gaussian': array([41.29685164,  1.        ,  1.        ,  1.        ,  6.7279165 ,
        9