### CPL Data Analysis Pipeline

This notebook demonstrates how to use the CPL data analysis pipeline.
For instructions on how to install the pipeline, see the README.

<br>

This pipeline can be started from the shell by running the following commands:

```bash
git clone 
python -m ~/path/to/cpl_pipeline/repo
```

Or by opening an IPython terminal and interacting with the pipeline using the following commands:

```python
from pathlib import Path

import cpl_pipeline as cpl

my_data_path = Path(r'C:\Users\cpl_lab\data\session_1')
my_data = cpl.Dataset(my_data_path)
my_data.initialize_parameters(accept_params=True)
my_data.extract_data()
my_data.detect_spikes()
my_data.cluster_spikes()
my_data.sort_spikes(3)  # electrode number; repeat for each electrode
my_data.make_unit_plots()
my_data.post_sorting()
```
---

### Steps:
1. Initialize a new dataset
2. Load a dataset
3. Extract data
4. Detect spikes
5. Cluster spikes
6. Sort spikes
7. Make plots
8. Post sorting

In [2]:
# Import modules.
# An error here means there is a problem with the installation,
# most likely that the pipeline's environment is not active.

%load_ext autoreload
%autoreload 2

from pathlib import Path  # this takes care of os-specific path issues
import cpl_pipeline as cpl
# this should be the path to the directory containing your raw data
# for a single session. You can copy-paste this path from the file explorer

# Windows:
# datapath = Path(r'C:\Users\cpl_lab\data\session_1')
# Mac:
# datapath = Path('/Users/cpl_lab/data/session_1')

# move the data file to its own directory; if more than one session
# is included, they will be merged into a single dataset
datapath = Path('/Users/flynnoconnell/data/r35_session_1')  
if not datapath.exists():
    raise FileNotFoundError(f"The above path: {datapath} does not exist.")

print(f"Path to data:        {datapath}")
print("Directory contents:  "
      f"{[f.name for f in datapath.iterdir()]}")

Path to data:        /Users/flynnoconnell/data/r35_session_1
Directory contents:  ['spike_clustering', 'analysis_params', 'detect_spikes', 'animal_date_drug_post.smr', 'r35_session_1_dataset.p', 'r35_session_1.h5']
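
Because every recording in the session directory is merged into one dataset, it can be worth counting the raw files before initializing. The following is a minimal sketch using only `pathlib`. The helper name `find_raw_files` and the assumption that raw recordings carry a `.smr` extension (as in the listing above) are illustrative and not part of `cpl_pipeline`.

```python
from pathlib import Path
import tempfile


def find_raw_files(session_dir, suffix=".smr"):
    """Return the raw recording files in session_dir, sorted by name."""
    # Hypothetical helper: assumes raw recordings end in `suffix`.
    return sorted(p for p in Path(session_dir).iterdir()
                  if p.is_file() and p.suffix.lower() == suffix)


# Demonstrate on a throwaway directory that mimics a session folder.
with tempfile.TemporaryDirectory() as tmp:
    session = Path(tmp)
    (session / "animal_date_drug_post.smr").touch()
    (session / "r35_session_1.h5").touch()
    (session / "analysis_params").mkdir()

    raw_files = find_raw_files(session)
    raw_names = [p.name for p in raw_files]
    if len(raw_files) > 1:
        print("More than one recording found; they would be merged:", raw_names)
    else:
        print("Raw recordings:", raw_names)
```

In the notebook, the same helper could be called as `find_raw_files(datapath)` right after the existence check.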


### Load or Initialize a dataset

Initialize a new dataset by passing the path to the raw data directory to the `Dataset` class.

In [3]:
my_data = cpl.Dataset(datapath)
my_data.initialize_parameters(accept_params=False)  # pass accept_params=True to skip changing parameters
print(my_data)

# if you want to load a previously initialized dataset, use load_dataset() instead:
# my_data = cpl.load_dataset(datapath)


Using r35_session_1 as name for dataset
Using default logfile /Users/flynnoconnell/cpl_pipeline/logs/r35_session_1_dataset.log.
Existing h5 file found. Using /Users/flynnoconnell/data/r35_session_1/r35_session_1.h5.
   electrode   name  port units  offsets  scales  sampling_rate     SonpyType  \
3          3  U1_OB    26  Volt      0.0     1.0       18518.52  DataType.Adc   
4          4  U1_PC    27  Volt      0.0     1.0       18518.52  DataType.Adc   
5          5  U2_PC    28  Volt      0.0     1.0       18518.52  DataType.Adc   
6          6  U3_PC    29  Volt      0.0     1.0       18518.52  DataType.Adc   

   unit    lfp  event  
3  True  False  False  
4  True  False  False  
5  True  False  False  
6  True  False  False  
Writing all parameters to json file in analysis_params folder...
Writing clustering_params.json to /Users/flynnoconnell/data/r35_session_1/analysis_params/clustering_params.json
Writing spike_array_params.json to /Users/flynnoconnell/data/r35_session_1/analy
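
Choosing between `cpl.Dataset(...)` and `cpl.load_dataset(...)` comes down to whether a saved dataset already exists in the session directory. Here is a sketch of that check, assuming the save file follows the `<name>_dataset.p` pattern seen in the directory listing. `find_saved_dataset` is a hypothetical helper, not a `cpl_pipeline` function.

```python
from pathlib import Path
import tempfile


def find_saved_dataset(session_dir):
    """Return the first `*_dataset.p` file in session_dir, or None."""
    # Assumption: saved datasets are named `<name>_dataset.p`.
    matches = sorted(Path(session_dir).glob("*_dataset.p"))
    return matches[0] if matches else None


with tempfile.TemporaryDirectory() as tmp:
    session = Path(tmp)
    before = find_saved_dataset(session)           # nothing saved yet
    (session / "r35_session_1_dataset.p").touch()
    after = find_saved_dataset(session)            # save file now present
    found_name = after.name

print(before, found_name)

# In the notebook this could drive the choice between the two calls:
# saved = find_saved_dataset(datapath)
# my_data = cpl.load_dataset(datapath) if saved else cpl.Dataset(datapath)
```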

### 1. Extraction

Extract data from the raw data files into an HDF5 (`.h5`) file that stores the extracted data. As the output below shows, an existing `.h5` file for the session is deleted and rewritten.

In [4]:
my_data.extract_data()
print(my_data)

Deleting existing h5 file...Done!
Creating empty HDF5 store with raw data groups
Writing r35_session_1.h5 ...Done!

Creating empty arrays in hdf5 store for raw data...Done!
Extracting data from Spike2 file


ic| /Users/flynnoconnell/repos/cpl_pipeline/cpl_pipeline/base/dataset.py:584 in _process_spike2data()
    f"Extracting electrode {electrode_idx}/{electrodes[-1]}": 'Extracting electrode 3/6'


Writing electrode3 to /Users/flynnoconnell/data/r35_session_1/r35_session_1.h5...Writing electrode_map to /Users/flynnoconnell/data/r35_session_1/r35_session_1.h5...


ic| /Users/flynnoconnell/repos/cpl_pipeline/cpl_pipeline/base/dataset.py:597 in _process_spike2data()
    "Time vector saved to h5 file": 'Time vector saved to h5 file'
ic| /Users/flynnoconnell/repos/cpl_pipeline/cpl_pipeline/base/dataset.py:584 in _process_spike2data()
    f"Extracting electrode {electrode_idx}/{electrodes[-1]}": 'Extracting electrode 4/6'


Writing electrode4 to /Users/flynnoconnell/data/r35_session_1/r35_session_1.h5...

ic| /Users/flynnoconnell/repos/cpl_pipeline/cpl_pipeline/base/dataset.py:584 in _process_spike2data()
    f"Extracting electrode {electrode_idx}/{electrodes[-1]}": 'Extracting electrode 5/6'


Writing electrode_map to /Users/flynnoconnell/data/r35_session_1/r35_session_1.h5...
Writing electrode5 to /Users/flynnoconnell/data/r35_session_1/r35_session_1.h5...

ic| /Users/flynnoconnell/repos/cpl_pipeline/cpl_pipeline/base/dataset.py:584 in _process_spike2data()
    f"Extracting electrode {electrode_idx}/{electrodes[-1]}": 'Extracting electrode 6/6'


Writing electrode_map to /Users/flynnoconnell/data/r35_session_1/r35_session_1.h5...
Writing electrode6 to /Users/flynnoconnell/data/r35_session_1/r35_session_1.h5...Writing electrode_map to /Users/flynnoconnell/data/r35_session_1/r35_session_1.h5...
Saving dataset: r35_session_1... 
Saving to /Users/flynnoconnell/data/r35_session_1/r35_session_1_dataset.p

Data Extraction Complete
--------------------
dataset :: r35_session_1
Root Directory : /Users/flynnoconnell/data/r35_session_1
Save File : /Users/flynnoconnell/data/r35_session_1/r35_session_1_dataset.p
Log File : /Users/flynnoconnell/cpl_pipeline/logs/r35_session_1_dataset.log

Object creation date: 12/30/23
h5 File: /Users/flynnoconnell/data/r35_session_1/r35_session_1.h5

--------------------
Processing Status
--------------------
initialize_parameters     True
extract_data              True
create_trial_list         False
mark_dead_channels        False
detect_spikes             False
spike_clustering          False
cleanup_clu

### 2. Detection

Turn the continuous ADC signal on each electrode into discrete spike events.
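
The pipeline's own detection code is not shown in this notebook. As a rough illustration of the idea, the sketch below thresholds a synthetic trace and cuts a snapshot around each crossing, using the sampling rate and snapshot window (0.75 ms before, 1 ms after) reported in the parameter printout. The negative threshold at five robust standard deviations, the synthetic signal, and the function name `detect_threshold_crossings` are assumptions made for illustration, and the 300 to 3000 Hz bandpass step is omitted.

```python
import numpy as np


def detect_threshold_crossings(signal, sampling_rate,
                               before_ms=0.75, after_ms=1.0, n_sd=5.0):
    """Return (spike_indices, waveforms) for negative threshold crossings."""
    # Robust noise estimate: median absolute value scaled to a Gaussian SD.
    noise_sd = np.median(np.abs(signal)) / 0.6745
    threshold = -n_sd * noise_sd

    # Indices where the signal first drops below the threshold.
    below = signal < threshold
    crossings = np.flatnonzero(below[1:] & ~below[:-1]) + 1

    # Convert the snapshot window from milliseconds to samples.
    n_before = int(round(before_ms * 1e-3 * sampling_rate))
    n_after = int(round(after_ms * 1e-3 * sampling_rate))

    spike_idx, waveforms = [], []
    for idx in crossings:
        # Skip events too close to the edges to cut a full snapshot.
        if idx - n_before < 0 or idx + n_after > signal.size:
            continue
        spike_idx.append(int(idx))
        waveforms.append(signal[idx - n_before:idx + n_after])
    return np.array(spike_idx), np.array(waveforms)


# Synthetic trace: a small background oscillation plus three injected spikes.
fs = 18518.52
t = np.arange(20_000) / fs
signal = 0.5 * np.sin(2 * np.pi * 50.0 * t)
true_times = [2_000, 8_000, 15_000]
signal[true_times] -= 30.0

spike_idx, snippets = detect_threshold_crossings(signal, fs)
print(spike_idx, snippets.shape)
```

Each row of `snippets` is one spike waveform, which is the kind of input a later clustering step would work on.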

In [5]:
# DETECTION ----
my_data.detect_spikes()
# my_data.edit_clustering_params()


Running Spike Detection
-------------------
Parameters
file_dir             /Users/flynnoconnell/data/r35_session_1
data_quality         clean
sampling_rate        18518.52
clustering_params    
    Max Number of Clusters      12
    Max Number of Iterations    5000
    Convergence Criterion       1e-05
    GMM random restarts         20
    
data_params          
    V_cutoff for disconnected headstage     1500
    Max rate of cutoff breach per second    0.2
    Max allowed seconds with a breach       10
    Max allowed breaches per second         20
    Intra-cluster waveform amp SD cutoff    3
    
bandpass_params      
    Lower freq cutoff    300
    Upper freq cutoff    3000
    
spike_snapshot       
    Time before spike (ms)    0.75
    Time after spike (ms)     1
    


  0%|          | 0/4 [00:00<?, ?it/s]ic| /Users/flynnoconnell/repos/cpl_pipeline/cpl_pipeline/analysis/cluster.py:715 in run()
    f"Running spike detection on electrode {electrode}": 'Running spike detection on electrode 3'


[ 0.17852783  0.13000488  0.04135132 ...  0.04180908 -0.01251221
 -0.03219604]
Scaling waveforms by energy
Computing waveform energy


 25%|██▌       | 1/4 [00:09<00:27,  9.13s/it]ic| /Users/flynnoconnell/repos/cpl_pipeline/cpl_pipeline/analysis/cluster.py:715 in run()
    f"Running spike detection on electrode {electrode}": 'Running spike detection on electrode 4'


[ 0.30502319  0.18188477  0.0617981  ... -0.02685547 -0.04745483
 -0.01663208]
Scaling waveforms by energy
Computing waveform energy


 50%|█████     | 2/4 [00:15<00:14,  7.43s/it]ic| /Users/flynnoconnell/repos/cpl_pipeline/cpl_pipeline/analysis/cluster.py:715 in run()
    f"Running spike detection on electrode {electrode}": 'Running spike detection on electrode 5'


[ 0.10986328  0.06225586  0.0227356  ...  0.03097534 -0.05569458
 -0.12237549]
Scaling waveforms by energy
Computing waveform energy


 75%|███████▌  | 3/4 [00:23<00:07,  7.53s/it]ic| /Users/flynnoconnell/repos/cpl_pipeline/cpl_pipeline/analysis/cluster.py:715 in run()
    f"Running spike detection on electrode {electrode}": 'Running spike detection on electrode 6'


[ 0.08773804 -0.00946045 -0.04364014 ...  0.03921509  0.04516602
  0.00183105]
Scaling waveforms by energy
Computing waveform energy


100%|██████████| 4/4 [00:31<00:00,  7.85s/it]

electrode    Result    Cutoff (s)
  3            1         1047.0
  4            1         1047.0
  5            1         1047.0
  6            1         1047.0
1 - Success
0 - No data or no spikes
-1 - Error
Writing electrode_map to /Users/flynnoconnell/data/r35_session_1/r35_session_1.h5...
Saving dataset: r35_session_1... 
Saving to /Users/flynnoconnell/data/r35_session_1/r35_session_1_dataset.p
Spike Detection Complete
------------------





[(None, None, None),
 (None, None, None),
 (None, None, None),
 (3, 1, 1047.0),
 (4, 1, 1047.0),
 (5, 1, 1047.0),
 (6, 1, 1047.0)]
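
The list returned by `detect_spikes()` can be turned into a per-electrode summary using the result codes printed above. This is a small sketch that copies the printed values by hand. Reading each tuple as `(electrode, result, cutoff)` follows the column order of the summary table, and treating all-`None` entries as channels that were not processed is an assumption.

```python
# Return value of detect_spikes() as printed above.
results = [
    (None, None, None),
    (None, None, None),
    (None, None, None),
    (3, 1, 1047.0),
    (4, 1, 1047.0),
    (5, 1, 1047.0),
    (6, 1, 1047.0),
]

# Result codes as listed in the detection summary.
RESULT_LABELS = {1: "Success", 0: "No data or no spikes", -1: "Error"}

# Keep only entries that carry an electrode number (assumed: processed channels).
summary = {
    electrode: (RESULT_LABELS[code], cutoff)
    for electrode, code, cutoff in results
    if electrode is not None
}
failed = [e for e, (label, _) in summary.items() if label != "Success"]

for electrode, (label, cutoff) in summary.items():
    print(f"electrode {electrode}: {label}, cutoff {cutoff} s")
print("Electrodes needing attention:", failed)
```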

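### 3. Clustering

Group the detected spikes on each electrode into clusters, using the clustering parameters printed during detection (up to 12 clusters, 20 GMM random restarts).
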
In [None]:
my_data.cluster_spikes()

ic| /Users/flynnoconnell/repos/cpl_pipeline/cpl_pipeline/analysis/cluster.py:998 in __init__()
    rec_dirs: PosixPath('/Users/flynnoconnell/data/r35_session_1')
    channel_number: 3
    out_dir: None
    params: {'bandpass_params': {'Lower freq cutoff': 300, 'Upper freq cutoff': 3000},
             'clustering_params': {'Convergence Criterion': 1e-05,
                                   'GMM random restarts': 20,
                                   'Max Number of Clusters': 12,
                                   'Max Number of Iterations'

### 4. Sorting

In [None]:
my_data.sort_spikes(3)  # electrode number; repeat for each electrode
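
Since sorting is run one electrode at a time, it can help to list the electrodes to visit first. The sketch below uses a hand-copied subset of the electrode table printed during initialization. How `sort_spikes` behaves when called in a loop (for example, whether it prompts for input) is not shown in this notebook, so the call is left commented out.

```python
# Hand-copied subset of the electrode table printed during initialization.
electrode_map = [
    {"electrode": 3, "name": "U1_OB", "unit": True},
    {"electrode": 4, "name": "U1_PC", "unit": True},
    {"electrode": 5, "name": "U2_PC", "unit": True},
    {"electrode": 6, "name": "U3_PC", "unit": True},
]

# Only electrodes flagged as unit channels are candidates for sorting.
unit_electrodes = [row["electrode"] for row in electrode_map if row["unit"]]

for electrode in unit_electrodes:
    print(f"Sorting electrode {electrode}")
    # my_data.sort_spikes(electrode)  # uncomment inside the notebook
```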

### Load pickled data

In [None]:
pickle_path = Path.home() / 'data' / 'serotonin' / 'raw' / 'session_1' / 'session_1_dataset.p'
my_obj = cpl.load_pickled_object(pickle_path)
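
For readers unfamiliar with `.p` files, the sketch below shows a plain pickle round trip with the standard library. It assumes the saved dataset is an ordinary Python pickle. `cpl.load_pickled_object` may do more than this, and `load_pickle` is a hypothetical stand-in, not the pipeline's implementation.

```python
import pickle
import tempfile
from pathlib import Path


def load_pickle(path):
    """Load and return the object stored in a pickle file."""
    # Only unpickle files you created yourself: pickle can run arbitrary code.
    with open(path, "rb") as fh:
        return pickle.load(fh)


with tempfile.TemporaryDirectory() as tmp:
    pickle_file = Path(tmp) / "session_1_dataset.p"
    original = {"name": "session_1", "electrodes": [3, 4, 5, 6]}

    # Write a small stand-in object, then read it back.
    with open(pickle_file, "wb") as fh:
        pickle.dump(original, fh)
    restored = load_pickle(pickle_file)

print(restored)
```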

In [None]:

my_data.make_unit_plots()

In [None]:
my_data.post_sorting()