# Data processing


!! `PROBLEM: this dataset contains only the low-pass filtered LFP signal (below 1000 Hz)` !!

`The raw Neuropixels recording traces are not available for the Allen Visual Coding experiment.`

author: steeve.laquitaine@epfl.ch  
date: 2023.09.06  
last modified: 2023.09.07  
status: OK  
display-status: OK  
regression: None  
duration: 


## Setup

Activate env from `allensdk.txt`

In [18]:
%load_ext autoreload
%autoreload 2

import os
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import spikeinterface.extractors as se 

# move to project path
PROJ_PATH = "/gpfs/bbp.cscs.ch/project/proj68/home/laquitai/bernstein_2023/"
os.chdir(PROJ_PATH)

from src.nodes.utils import get_config
from src.nodes.postpro.allen_cell_types import load_session_data
from src.nodes.dataeng.allen import allen
from src.nodes.prepro import preprocess
from src.nodes.dataeng.silico import recording, probe_wiring

# SETUP PARAMETERS

EXPERIMENT = "supp/allen_neuropixels"  # the experiment 
SIMULATION_DATE = "2023_08_30"    # the run (date)
PARV_SESSION_ID = 829720705       # session optotagged for parvalbumin interneurons (reliable laser, 1.82 GB)
PARV_PROBE_ID = 832129154         # this session has 5 probes; we take the first
data_conf, param_conf = get_config(EXPERIMENT, SIMULATION_DATE).values()
RAW_DATA = data_conf["raw"]["input"]
manifest_path = os.path.join(RAW_DATA, "manifest.json")

# NWB_PATH = data_conf["recording"]["input"]
# WRITE_PATH = data_conf["probe_wiring"]["output"]
# GT_SORTING_PATH = data_conf["sorting"]["simulation"]["ground_truth"]["input"]

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload
2023-09-07 10:20:43,201 - root - utils.py - get_config - INFO - Reading experiment config.
2023-09-07 10:20:43,225 - root - utils.py - get_config - INFO - Reading experiment config. - done


### Download dataset (as needed)

The cell below downloads the following files only if they are not already cached:

- channels.csv
- manifest.json
- probes.csv
- sessions.csv
- units.csv

**Data description**:

- The index column is a unique ID that serves as the key for accessing each session's physiology data.
- There is one session per mouse.
- The age, sex, and genotype of the mouse.
- The number of probes, channels, and units for each session.
- The brain structures recorded (CCFv3 acronyms).
- The gray-period stimulus (a blank gray screen) is never assigned a block; this is where spontaneous activity is collected.

In [7]:
# get data from a mouse session optotagged for parvalbumin interneurons
session = load_session_data(PARV_SESSION_ID, manifest_path)

TypeError: load_session_data() missing 1 required positional argument: 'manifest_path'

### Download lfp trace for a session and a probe

In [None]:
# load the session (NWB file), using the cached copy if it was already downloaded
session = allen.download_data_from_a_session(PARV_SESSION_ID, manifest_path)

# list the session's probes located in visual cortex
probe_ids = allen.find_probes_in_visual_cortex(session, manifest_path)

# download the LFP traces of PARV_PROBE_ID (NWB file)
parv_lfp = session.get_lfp(PARV_PROBE_ID)

### Load recording (LFP)

In [14]:
# load
RecordingExtr = se.NwbRecordingExtractor(
    "/gpfs/bbp.cscs.ch/project/proj68/scratch/laquitai/raw/npx_allen/session_829720705/probe_832129154_lfp.nwb"
)

  warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
  warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."


### Cast as SI recording extractor

In [15]:
# write (3 mins)
recording.write(RecordingExtr, data_conf)

write_binary_recording with n_jobs = 1 and chunk_size = None


### Wire probe to recording

The collected recording is already wired with a probe.

In [19]:
# takes 3 min

# write wired probe to designated path
RecordingExtr = recording.load(data_conf)
probe_wiring.write(RecordingExtr, data_conf)

write_binary_recording with n_jobs = 1 and chunk_size = None
2023-09-07 10:24:56,267 - root - probe_wiring.py - write - INFO - Probe wiring done in  208.5 secs


In [21]:
RecordingExtr.sampling_frequency

1249.9998432394418

### Preprocess

In [20]:
# takes 32 min

# preprocess (8 min)
Preprocessed = preprocess.run(data_conf, param_conf)

# write
preprocess.write(Preprocessed, data_conf)

# sanity check: the recording should be flagged as filtered
print(Preprocessed.is_filtered())

ValueError: Digital filter critical frequencies must be 0 < Wn < 1
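
This `ValueError` is the message SciPy's filter design raises when a cutoff, expressed as a fraction of the Nyquist frequency, falls outside the open interval (0, 1). The recording is sampled at about 1250 Hz, so its Nyquist frequency is about 625 Hz, and any band-pass cutoff at or above that value cannot be designed. The cutoffs stored in `param_conf` are not shown in this notebook, so the sketch below uses assumed values (300 to 6000 Hz, a common spike band for 30 kHz recordings) and a hypothetical helper, `normalized_band`, only to illustrate the check; it is not the project's preprocessing code.

```python
def normalized_band(freq_min, freq_max, sampling_frequency):
    """Return (low, high) cutoffs as fractions of the Nyquist frequency.

    Raises ValueError if either cutoff falls outside (0, Nyquist).
    """
    nyquist = sampling_frequency / 2.0
    band = (freq_min / nyquist, freq_max / nyquist)
    if not all(0.0 < wn < 1.0 for wn in band):
        raise ValueError(
            f"Band {freq_min}-{freq_max} Hz does not fit below the Nyquist "
            f"frequency ({nyquist:.1f} Hz) of a {sampling_frequency:.1f} Hz recording."
        )
    return band


# sampling frequency reported by RecordingExtr.sampling_frequency
LFP_SAMPLING_FREQUENCY = 1249.9998432394418

# ASSUMED spike-band cutoffs (not read from param_conf): 6000 Hz exceeds Nyquist
try:
    normalized_band(300.0, 6000.0, LFP_SAMPLING_FREQUENCY)
    band_is_valid = True
except ValueError as error:
    band_is_valid = False
    print(error)

# a band that lies entirely below Nyquist passes the check
low, high = normalized_band(1.0, 300.0, LFP_SAMPLING_FREQUENCY)
```

Running a check like this before `preprocess.run` would surface the mismatch between the configured band and the LFP sampling rate in a fraction of a second rather than partway through a long job.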

### References


https://allensdk.readthedocs.io/en/latest/_static/examples/nb/ecephys_session.html

https://allensdk.readthedocs.io/en/latest/_static/examples/nb/ecephys_data_access.html

Find FFI (parvalbumin) neurons: https://allensdk.readthedocs.io/en/latest/_static/examples/nb/ecephys_optotagging.html