Link to more in-depth descriptions here: https://blackrockneurotech.com/research/wp-content/ifu/LB-0023-7.00_NEV_File_Format.pdf

**Nev obj structure**

It has three main attributes/functions: basic_header, getdata(), and extended_headers. The documentation mentions others, such as processroicomments(), but the files we have do not contain the data those need.

nevobj.basic_header returns a dictionary with the following keys/values:
- **key**: 'FileTypeID', **value**: str (e.g. 'NEURALEV')
- **key**: 'FileSpec', **value**: str with float (e.g. '2.3')
- **key**: 'AddFlags', **value**: int (likely bool 1/0)
- **key**: 'BytesInHeader', **value**: int
- **key**: 'BytesInDataPackets', **value**: int
- **key**: 'TimeStampResolution', **value**: int
- **key**: 'SampleTimeResolution', **value**: int
- **key**: 'TimeOrigin', **value**: datetime.datetime
- **key**: 'CreatingApplication', **value**: str (e.g. 'File Dialog v7.0.4')
- **key**: 'Comment', **value**: str
- **key**: 'NumExtendedHeaders', **value**: int
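
The header fields above are enough to convert raw timestamps into seconds. The sketch below uses a hand-written dict as a stand-in for `nevobj.basic_header`, and assumes that 'TimeStampResolution' is the number of timestamp ticks per second (which is how the NEV spec linked above describes it). The helper name `ticks_to_seconds` is hypothetical, not part of brpylib.

```python
# Stand-in for nevobj.basic_header; values are illustrative only.
basic_header = {
    "FileTypeID": "NEURALEV",
    "FileSpec": "2.3",
    "TimeStampResolution": 30000,  # assumed: timestamp ticks per second
}


def ticks_to_seconds(ticks, header):
    """Convert raw timestamp ticks to seconds using the header resolution."""
    return [t / header["TimeStampResolution"] for t in ticks]


seconds = ticks_to_seconds([0, 15000, 30000], basic_header)
print(seconds)
```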

nevobj.getdata() takes a long time and returns a dictionary with the following structure:
- **key**: spike_events, **value**: dict
    - **key**: TimeStamps, **value**: list
        - A list of times (integers) in ascending order at which spikes occur (**note**: this is NOT the same as the total duration of the session)
        - The length of this list should, in theory, equal the number of spikes (aka threshold crossings)
    - **key**: Unit, **value**: list
        - A list whose length is equal to that of TimeStamps. In all files I've opened, this list contains only 0s
    - **key**: Channel, **value**: list
        - A list that contains the channel number that corresponds to the spike event time in TimeStamps
        - If the first entry of TimeStamps is 30 and the first entry of Channel is 2, that means that a spike occurred in channel 2 at time 30
    - **key**: Waveforms, **value**: array
        - Array shape: num timestamps x num channels
        - The columns of this array contain the activity of the corresponding channel
- **key**: digital_events, **value**: dict (**note: not all files have digital_events - FR and Cage usually do not**)
    - **key**: Timestamps, **value**: list
        - Not the same timestamps as spike_events. Different lengths and values.
    - **key**: InsertionReason, **value**: list
        - A list whose length is equal to Timestamps. In all files I've opened, this list contains only 1s
    - **key**: UnparsedData, **value**: list
        - A list with integers that encode various task- and trial-related information. Details can be found here:  https://github.com/limblab/Behavior/blob/master/src/target/words.h
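
Since Channel and TimeStamps are parallel lists, spike times can be grouped per channel by walking both lists together. This is a minimal sketch on a made-up dict shaped like the `spike_events` entry described above; `spikes_by_channel` is a hypothetical helper, not a brpylib function.

```python
from collections import defaultdict

# Stand-in for the dict returned by nevobj.getdata(); values are made up.
output = {
    "spike_events": {
        "TimeStamps": [30, 45, 45, 90],
        "Channel": [2, 1, 2, 1],
        "Unit": [0, 0, 0, 0],
    }
}


def spikes_by_channel(spike_events):
    """Map each channel number to the list of its spike timestamps."""
    grouped = defaultdict(list)
    for channel, timestamp in zip(spike_events["Channel"], spike_events["TimeStamps"]):
        grouped[channel].append(timestamp)
    return dict(grouped)


by_channel = spikes_by_channel(output["spike_events"])
print(by_channel)
```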
        
nevobj.extended_headers returns a list of dicts; the number of dicts equals 'NumExtendedHeaders' in nevobj.basic_header. Each run of 3 consecutive dicts corresponds to one electrode and contains the following info:
- dict1
    - **key**: 'PacketID', **value**: str (e.g. 'NEUEVWAV')
    - **key**: 'ElectrodeID', **value**: int
    - **key**: 'PhysicalConnector', **value**: int
    - **key**: 'ConnectorPin', **value**: int
    - **key**: 'DigitizationFactor', **value**: int
    - **key**: 'EnergyThreshold', **value**: int
    - **key**: 'HighThreshold', **value**: int
    - **key**: 'LowThreshold', **value**: int
    - **key**: 'NumSortedUnits', **value**: int
    - **key**: 'BytesPerWaveform', **value**: int
    - **key**: 'SpikeWidthSamples', **value**: int
    - **key**: 'EmptyBytes', **value**: bytes
- dict2: electrode number
    - **key**: PacketID, **value**: str 'NEUEVLBL'
    - **key**: ElectrodeID, **value**: int (e.g. 1, should correspond to dict1)
    - **key**: Label, **value**: str (the actual electrode number - e.g. 'elec78')
    - **key**: EmptyBytes, **value**: bytes (e.g. b'\x00\x00\x00\x00\x00\x00')
- dict3: filter information (type, frequency)
    - **key**: PacketID, **value**: str (e.g. 'NEUEVFLT')
    - **key**: ElectrodeID, **value**: int (e.g. 1, should correspond to dicts 1,2)
    - **key**: HighFreqCorner, **value**: str with float (e.g. '250.0 Hz')
    - **key**: HighFreqOrder, **value**: int (e.g. 4)
    - **key**: HighFreqType, **value**: str (e.g. 'butterworth')
    - **key**: LowFreqCorner, **value**: str with float (e.g. '7500.0 Hz')
    - **key**: LowFreqOrder, **value**: int (e.g. 3)
    - **key**: LowFreqType, **value**: str (e.g. 'butterworth')
    - **key**: EmptyBytes, **value**: bytes
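
Because the three dicts for one electrode share an 'ElectrodeID', they can be collected into one record per electrode instead of relying on their position in the list. The sketch below works on a hand-written list shaped like the structure above (most fields omitted, values illustrative); `headers_by_electrode` is a hypothetical helper, and it assumes any header without an 'ElectrodeID' key can simply be skipped.

```python
from collections import defaultdict

# Stand-in for nevobj.extended_headers: three packets for one electrode.
extended_headers = [
    {"PacketID": "NEUEVWAV", "ElectrodeID": 1, "SpikeWidthSamples": 48},
    {"PacketID": "NEUEVLBL", "ElectrodeID": 1, "Label": "elec78"},
    {"PacketID": "NEUEVFLT", "ElectrodeID": 1, "HighFreqCorner": "250.0 Hz"},
]


def headers_by_electrode(headers):
    """Group header dicts as {ElectrodeID: {PacketID: header}}."""
    grouped = defaultdict(dict)
    for hdr in headers:
        if "ElectrodeID" not in hdr:
            continue  # assumption: non-electrode packets are not needed here
        grouped[hdr["ElectrodeID"]][hdr["PacketID"]] = hdr
    return dict(grouped)


electrodes = headers_by_electrode(extended_headers)
print(electrodes[1]["NEUEVLBL"]["Label"])
```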
        
**Nsx obj structure**

nsxobj.basic_header produces a dictionary with the following structure: 
- **key**: 'FileTypeID', **value**: str (e.g.  'NEURALCD')
- **key**: 'FileSpec', **value**: str with float (e.g.  '2.3')
- **key**: 'BytesInHeader', **value**: int (e.g.  8762)
- **key**: 'Label', **value**: str representing a rate (e.g.  '2 kS/s')
- **key**: 'Comment', **value**: str 
- **key**: 'Period', **value**: int (e.g.  15)
- **key**: 'TimeStampResolution', **value**: int (e.g.  30000)
- **key**: 'TimeOrigin', **value**: datetime.datetime (e.g.  datetime.datetime(2023, 2, 14, 21, 41, 36, 14000))
- **key**: 'ChannelCount', **value**: int (e.g.  128)
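
The example values above are consistent with each other if 'Period' is read as the number of timestamp ticks between samples: 30000 / 15 = 2000 samples per second, matching the '2 kS/s' label. A minimal sketch under that assumption, using a hand-written dict in place of `nsxobj.basic_header` (`sampling_rate_hz` is a hypothetical helper):

```python
# Stand-in for nsxobj.basic_header, using the example values listed above.
basic_header = {"Label": "2 kS/s", "Period": 15, "TimeStampResolution": 30000}


def sampling_rate_hz(header):
    """Sampling rate in Hz, assuming Period counts ticks between samples."""
    return header["TimeStampResolution"] / header["Period"]


rate = sampling_rate_hz(basic_header)
print(rate)
```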

nsxobj.getdata() takes a few seconds and returns a dict with the following structure:
- **key**: elec_ids, **value**: list
    - List of electrode IDs
- **key**: start_time_s, **value**: float
    - Usually 0.
- **key**: data_time_s, **value**: str
    - In all of the files I've opened, this string has been 'all'.
- **key**: downsample, **value**: int
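    - Possibly a boolean (1 or 0), though it may instead echo a downsampling factor passed to getdata(), in which case 1 would mean no downsampling. In all the files I've opened, it's been 1.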
- **key**: data, **value**: list
    - A list containing a numpy array. This contains emg, force, and other data.
- **key**: data_headers, **value**: list whose only element is a dict
    - **key**: Timestamp, **value**: int (always 0 from what I've seen)
    - **key**: NumDataPoints, **value**: int (with number of total data points - should equal time length of file times sampling rate)
- **key**: ExtendedHeaderIndices, **value**: list
    - A list containing the unique electrode ids - i.e. the length of this list equals the number of electrodes. Values usually between 1 - num electrodes.
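
The relationship noted for NumDataPoints (number of points = duration × sampling rate) gives a quick sanity check on a file. The sketch below uses made-up numbers in a dict shaped like the getdata() output; the 2000 Hz rate is an assumed value for illustration.

```python
# Stand-in for the relevant part of nsxobj.getdata(); values are made up.
output = {"data_headers": [{"Timestamp": 0, "NumDataPoints": 600000}]}
sampling_rate = 2000  # Hz, assumed for this example

# Duration implied by the number of samples and the sampling rate.
duration_s = output["data_headers"][0]["NumDataPoints"] / sampling_rate
print(duration_s)
```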

nsxobj.extended_headers produces a list of dicts, one for each electrode. The dicts contain the following info:
- **key**: 'Type', **value**: str (e.g. 'CC')
- **key**: 'ElectrodeID', **value**: int (e.g. 1)
- **key**: 'ElectrodeLabel', **value**: str (e.g. 'elec109')
- **key**: 'PhysicalConnector', **value**: int (e.g. 1)
- **key**: 'ConnectorPin', **value**: int (e.g. 1)
- **key**: 'MinDigitalValue', **value**: int (e.g. -32764)
- **key**: 'MaxDigitalValue', **value**: int (e.g. 32764)
- **key**: 'MinAnalogValue', **value**: int (e.g. -8191)
- **key**: 'MaxAnalogValue', **value**: int (e.g. 8191)
- **key**: 'Units', **value**: str (e.g. 'uV')
- **key**: 'HighFreqCorner', **value**: str with float in Hz (e.g. '0.3 Hz')
- **key**: 'HighFreqOrder', **value**: int (e.g. 1)
- **key**: 'HighFreqType', **value**: str (e.g. 'butterworth')
- **key**: 'LowFreqCorner', **value**: str with float in Hz (e.g. '250.0 Hz')
- **key**: 'LowFreqOrder', **value**: int (e.g. 4)
- **key**: 'LowFreqType', **value**: str (e.g. 'butterworth')
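
The Min/Max digital and analog values describe how raw integer samples relate to physical units. The sketch below assumes a plain linear mapping between the two ranges, which is the usual reading of these fields; whether getdata() already applies this scaling is not established here. `digital_to_analog` is a hypothetical helper.

```python
# Stand-in for one entry of nsxobj.extended_headers (example values from above).
hdr = {
    "MinDigitalValue": -32764,
    "MaxDigitalValue": 32764,
    "MinAnalogValue": -8191,
    "MaxAnalogValue": 8191,
    "Units": "uV",
}


def digital_to_analog(value, header):
    """Linearly map a raw digital sample onto the analog range (in header Units)."""
    d_min, d_max = header["MinDigitalValue"], header["MaxDigitalValue"]
    a_min, a_max = header["MinAnalogValue"], header["MaxAnalogValue"]
    scale = (a_max - a_min) / (d_max - d_min)  # analog units per digital step
    return a_min + (value - d_min) * scale


print(digital_to_analog(4, hdr), hdr["Units"])
```

With these example values each digital step corresponds to 0.25 uV.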

**EMG Files**

- sessions_key - sessions table
- paper_key  - ignore
- filename - nsx file
- file_id - auto increments
- **rec_system** - 'Cerebus'. If there is EMG, 'Cerebus, Jim Baker's wired'. Talk to Xuan about matching up .rhd files with .nev files. .rhd files are 'DSPW'
- sampling_rate - based on nsx file type. for .rhd, check headers
- **emg_quality** - don't worry for now
- **emg_notes** - don't worry for now
- **muscle_list** - sometimes in the nsx extended headers, but not always; so far it seems to show up only in some FR files (e.g. NsxFileObj.extended_headers[hdr_idx]['ElectrodeLabel'] returns 'EMG_FCU1'). Plots that look like EMGs oscillate around 0. Check whether any files have EMG-looking plots that are not properly labeled with muscle names
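
When the labels are present, a muscle list can be pulled from the extended headers by filtering on the label prefix. This sketch assumes that EMG channels are labeled with an 'EMG_' prefix, as in the 'EMG_FCU1' example; the other labels and the helper name `emg_muscle_list` are made up for illustration.

```python
# Stand-in for nsxobj.extended_headers; only 'EMG_FCU1' comes from a real file.
extended_headers = [
    {"ElectrodeID": 1, "ElectrodeLabel": "elec109"},
    {"ElectrodeID": 129, "ElectrodeLabel": "EMG_FCU1"},
    {"ElectrodeID": 130, "ElectrodeLabel": "EMG_ECR"},  # hypothetical label
]


def emg_muscle_list(headers, prefix="EMG_"):
    """Return muscle names from labels that start with the assumed EMG prefix."""
    return [
        h["ElectrodeLabel"][len(prefix):]
        for h in headers
        if h["ElectrodeLabel"].startswith(prefix)
    ]


muscles = emg_muscle_list(extended_headers)
print(muscles)
```

An empty result would flag a file whose EMG channels, if any, are not labeled with muscle names.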

Some dates have nsx and ccf files, but no nev files. Are we interested in these? If so, how do we account for sessions key, given that data for the sessions table is pulled from nev files?

In [2]:
# Import dependencies
import pandas as pd
import numpy as np
from sqlalchemy import create_engine
import os
from os import path, system
import sys
sys.path.append('/Users/aajanquail/Desktop/Jupyter_Notebooks/Miller_Lab/xds/xds_python/')
import load_intan_rhd_format
from sys import platform
import glob
from scipy import signal
import matplotlib.pyplot as plt
import xml.etree.ElementTree as ET
import time
# from PyQt5.QtWidgets import QFileDialog

# brpylib is the module that contains functions/classes that allow us to open and extract data from .nev and .nsx files
# from Python_Utilities import brpylib
# from Python_Utilities import brMiscFxns
from Python_Utilities_Kev import brpylib

In [3]:
sampling_rate_dict = {'ns1': 500,'ns2': 1000,'ns3': 2000,'ns4': 10000,'ns5': 30000}
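
A small lookup helper shows how this dict could be used to fill in sampling_rate from an nsx filename. It is only a sketch: `sampling_rate_for` is a hypothetical name, and returning None for other extensions (such as .rhd, whose rate has to come from the file headers) is a design choice made here.

```python
import os

sampling_rate_dict = {'ns1': 500, 'ns2': 1000, 'ns3': 2000, 'ns4': 10000, 'ns5': 30000}


def sampling_rate_for(filename):
    """Look up the sampling rate from the file extension, or None if unknown."""
    ext = os.path.splitext(filename)[1].lstrip('.').lower()
    return sampling_rate_dict.get(ext)


print(sampling_rate_for('20221103_Pancake_WI_001.ns3'))
```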

In [4]:
cerebus_data_dict = {}
base_dir = '/Volumes/L_MillerLab/data/'

for monkey in sorted(os.listdir(base_dir)):
#     if monkey not in ['.DS_Store','archive','Backed_up_data', 'Behavior','chewie-delete','CompiledCOFiles','DeepLabCutVids','DLC_models','DPZ','FSMIT_DataRestore_03172021', 'Han_13B1_target','IMU','Jarvis','Jango_redo','Jango_target_redo','LoadCell','Mihili_12A3_target','OldCerebusTest','Rats','Rats_target','Test data','Thumbs.db']:
    if (monkey == 'Pop_18E3'):
        print(monkey)
        cerebus_data_dict[monkey] = {}
        monkey_path = os.path.join(base_dir, monkey)
        x = [i for i in os.listdir(monkey_path) if 'cerebus' in i.lower()]
        if len(x) != 0:
            cerebus_path = os.path.join(monkey_path, x[0])
        else:
            cerebus_path = monkey_path
        print(cerebus_path)
        date_list = glob.glob(f"{cerebus_path}/*")
        nev_list = glob.glob(f"{cerebus_path}/*/*.nev")
        nsx_list = glob.glob(f"{cerebus_path}/*/*.ns*")
        ccf_list = glob.glob(f"{cerebus_path}/*/*.ccf")
        rhd_list = glob.glob(f"{cerebus_path}/*/*.rhd")
        print(len(nev_list), len(nsx_list), len(ccf_list), len(rhd_list))
        cerebus_data_dict[monkey]['date_list'] = sorted(date_list)[:-3]
        cerebus_data_dict[monkey]['nev_list'] = sorted(nev_list)
        cerebus_data_dict[monkey]['nsx_list'] = sorted(nsx_list)
        cerebus_data_dict[monkey]['ccf_list'] = sorted(ccf_list)
        cerebus_data_dict[monkey]['rhd_list'] = sorted(rhd_list)

Pop_18E3
/Volumes/L_MillerLab/data/Pop_18E3/CerebusData
766 858 607 238


In [29]:
nev_list_trunc = [i.split('/')[-2] for i in cerebus_data_dict['Pop_18E3']['nev_list']]
nsx_list_trunc = [i.split('/')[-2] for i in cerebus_data_dict['Pop_18E3']['nsx_list']]
rhd_list_trunc = [i.split('/')[-2] for i in cerebus_data_dict['Pop_18E3']['rhd_list']]

In [30]:
from collections import Counter
counts = Counter(el for lst in (nev_list_trunc, nsx_list_trunc, rhd_list_trunc) for el in set(lst))

Sometimes all three file types exist for a date, sometimes only two, and a few dates have only one.

In [42]:
nsx_count = Counter(el for el in nsx_list_trunc)
nev_count = Counter(el for el in nev_list_trunc)
rhd_count = Counter(el for el in rhd_list_trunc)

In [43]:
len(nsx_count), len(nev_count), len(rhd_count)

(175, 186, 46)
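
The per-date membership checks in the cells that follow can also be expressed with set operations, which list the mismatched dates directly. The sketch below uses a handful of dates taken from the outputs in this notebook rather than the full lists; the variable names are illustrative.

```python
# Small samples of session dates per file type (subset of the real listings).
nev_dates = {"20190327", "20190329", "20190812"}
nsx_dates = {"20190329", "20200212"}
rhd_dates = {"20190812"}

nev_without_nsx = sorted(nev_dates - nsx_dates)      # nev recorded, no nsx
nsx_without_nev = sorted(nsx_dates - nev_dates)      # nsx recorded, no nev
all_three = sorted(nev_dates & nsx_dates & rhd_dates)

print(nev_without_nsx, nsx_without_nev, all_three)
```

In the real data the same expressions would be built from `set(nev_count)`, `set(nsx_count)`, and `set(rhd_count)`.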

In [41]:
for k in nev_count.keys():
    print(k, k in nsx_count.keys())

20190327 False
20190329 True
20190402 False
20190403 True
20190409 True
20190415 False
20190423 True
20190424 True
20190426 False
20190429 True
20190430 True
20190503 True
20190506 True
20190509 False
20190603 True
20190604 True
20190605 True
20190606 True
20190607 True
20190611 True
20190614 True
20190620 True
20190628 False
20190703 True
20190708 True
20190710 True
20190724 True
20190726 False
20190729 True
20190730 True
20190807 True
20190808 True
20190809 True
20190812 False
20190814 False
20190821 False
20190903 True
20190906 True
20190909 True
20190910 True
20190913 True
20190916 True
20191008 True
20191011 True
20191021 True
20191028 True
20191101 True
20191105 True
20191108 True
20191112 True
20191120 True
20191122 True
20200205 True
20200206 True
20200207 True
20200210 True
20200213 True
20200217 True
20200220 True
20200224 True
20200226 True
20200227 True
20200228 True
20200304 True
20200309 True
20200310 True
20200311 True
20200313 True
20200316 True
20200317 True
20200320 T

In [40]:
for k in nsx_count.keys():
    print(k, k in nev_count.keys())

20190329 True
20190403 True
20190409 True
20190423 True
20190424 True
20190429 True
20190430 True
20190503 True
20190506 True
20190603 True
20190604 True
20190605 True
20190606 True
20190607 True
20190611 True
20190614 True
20190620 True
20190703 True
20190708 True
20190710 True
20190724 True
20190729 True
20190730 True
20190807 True
20190808 True
20190809 True
20190903 True
20190906 True
20190909 True
20190910 True
20190913 True
20190916 True
20191008 True
20191011 True
20191021 True
20191028 True
20191101 True
20191105 True
20191108 True
20191112 True
20191120 True
20191122 True
20200205 True
20200206 True
20200207 True
20200210 True
20200212 False
20200213 True
20200217 True
20200220 True
20200224 True
20200226 True
20200227 True
20200228 True
20200304 True
20200309 True
20200310 True
20200311 True
20200313 True
20200316 True
20200317 True
20200320 True
20200610 True
20200617 True
20200626 True
20200629 True
20200703 True
20200714 True
20200717 True
20200720 True
20200724 True
20200

In [45]:
for k in rhd_count.keys():
    print(k, k in nsx_count.keys())

20190812 False
20190814 False
20190821 False
20200313 True
20200320 True
20200626 True
20200717 True
20200724 True
20200731 True
20200821 True
20200904 True
20200925 True
20201005 True
20201020 True
20201118 True
20201120 True
20201125 True
20201128 True
20210602 True
20210609 True
20210611 True
20210616 True
20210618 True
20210625 True
20210630 True
20210702 True
20210709 True
20210716 True
20210721 True
20210723 True
20210730 True
20210806 True
20210902 True
20210908 True
20210917 True
20210922 True
20211001 True
20211020 True
20220214 True
20220304 True
20220308 True
20220312 True
20220314 True
20220321 True
20220322 True
20220328 True


In [35]:
nev_count

Counter({'20190327': 1,
         '20190329': 1,
         '20190402': 3,
         '20190403': 2,
         '20190409': 1,
         '20190415': 1,
         '20190423': 1,
         '20190424': 2,
         '20190426': 3,
         '20190429': 1,
         '20190430': 2,
         '20190503': 2,
         '20190506': 2,
         '20190509': 2,
         '20190603': 2,
         '20190604': 3,
         '20190605': 3,
         '20190606': 4,
         '20190607': 4,
         '20190611': 3,
         '20190614': 4,
         '20190620': 5,
         '20190628': 3,
         '20190703': 2,
         '20190708': 3,
         '20190710': 2,
         '20190724': 2,
         '20190726': 4,
         '20190729': 1,
         '20190730': 2,
         '20190807': 2,
         '20190808': 3,
         '20190809': 3,
         '20190812': 2,
         '20190814': 4,
         '20190821': 2,
         '20190903': 1,
         '20190906': 1,
         '20190909': 2,
         '20190910': 2,
         '20190913': 3,
         '201909

In [33]:
nsx_count

Counter({'20190329': 1,
         '20190403': 2,
         '20190409': 1,
         '20190423': 1,
         '20190424': 2,
         '20190429': 1,
         '20190430': 2,
         '20190503': 2,
         '20190506': 2,
         '20190603': 2,
         '20190604': 3,
         '20190605': 3,
         '20190606': 4,
         '20190607': 4,
         '20190611': 3,
         '20190614': 4,
         '20190620': 5,
         '20190703': 2,
         '20190708': 3,
         '20190710': 2,
         '20190724': 2,
         '20190729': 1,
         '20190730': 2,
         '20190807': 2,
         '20190808': 3,
         '20190809': 3,
         '20190903': 1,
         '20190906': 2,
         '20190909': 4,
         '20190910': 2,
         '20190913': 3,
         '20190916': 2,
         '20191008': 1,
         '20191011': 2,
         '20191021': 2,
         '20191028': 2,
         '20191101': 3,
         '20191105': 2,
         '20191108': 1,
         '20191112': 2,
         '20191120': 2,
         '201911

In [31]:
counts

Counter({'20200317': 2,
         '20220203': 2,
         '20201128': 3,
         '20190403': 2,
         '20190903': 2,
         '20200310': 2,
         '20200207': 2,
         '20190812': 2,
         '20210526': 2,
         '20201217': 2,
         '20210806': 3,
         '20190429': 2,
         '20220314': 3,
         '20191108': 2,
         '20220607': 2,
         '20211203': 2,
         '20211116': 2,
         '20211119': 2,
         '20200210': 2,
         '20220322': 3,
         '20210902': 3,
         '20210816': 2,
         '20190906': 2,
         '20200820': 2,
         '20200320': 3,
         '20200228': 2,
         '20200213': 2,
         '20210609': 3,
         '20210602': 3,
         '20210713': 2,
         '20190509': 1,
         '20220621': 2,
         '20190809': 2,
         '20220214': 3,
         '20190423': 2,
         '20190409': 2,
         '20190503': 2,
         '20190606': 2,
         '20210924': 2,
         '20190603': 2,
         '20200206': 2,
         '202009

In [8]:
rhd = load_intan_rhd_format.read_data(cerebus_data_dict['Pancake_20K3']['rhd_list'][0])


Reading Intan Technologies RHD2000 Data File, Version 1.5

n signal groups 7
Found 32 amplifier channels.
Found 0 auxiliary input channels.
Found 0 supply voltage channels.
Found 0 board ADC channels.
Found 2 board digital input channels.
Found 0 board digital output channels.
Found 0 temperature sensors channels.

File contains 300.021 seconds of data.  Amplifiers were sampled at 2.01 kS/s.

Allocating memory for data...
Reading data from file...
10% done...
20% done...
30% done...
40% done...
50% done...
60% done...
70% done...
80% done...
90% done...
Parsing data...
No missing timestamps in data.
Done!  Elapsed time: 43.7 seconds


In [10]:
type(rhd)

dict

In [11]:
rhd.keys()

dict_keys(['t_amplifier', 't_dig', 'spike_triggers', 'notes', 'frequency_parameters', 'amplifier_channels', 'amplifier_data', 'board_dig_in_channels', 'board_dig_in_data'])

In [12]:
rhd['frequency_parameters']

{'dsp_enabled': 1,
 'actual_dsp_cutoff_frequency': 42.73942565917969,
 'actual_lower_bandwidth': 29.94453239440918,
 'actual_upper_bandwidth': 499.939208984375,
 'desired_dsp_cutoff_frequency': 30.0,
 'desired_lower_bandwidth': 30.0,
 'desired_upper_bandwidth': 500.0,
 'notch_filter_frequency': 0,
 'desired_impedance_test_frequency': 1000.0,
 'actual_impedance_test_frequency': 0.0,
 'amplifier_sample_rate': 2011.060546875,
 'aux_input_sample_rate': 502.76513671875,
 'supply_voltage_sample_rate': 33.51767578125,
 'board_adc_sample_rate': 2011.060546875,
 'board_dig_in_sample_rate': 2011.060546875}

In [41]:
# fields to obtain from nev files
sessions_key = []
array_serial = []
paper_key = []
filename = []
file_id = []
setting_file = []
is_sorted = []
sorted_by = []
num_chans = []
num_units = []
rec_system = []
connect_type = []
connect_serial = []
threshold_quality = []
threshold_notes = []

shortened_nev_list = [cerebus_data_dict['Pancake_20K3']['nev_list'][1]]
for nev_filename in shortened_nev_list:
    # open file
    print(nev_filename)
    nevobj = brpylib.NevFile(nev_filename)
    output = nevobj.getdata(elec_ids='all')
    
    fname = nev_filename.split('/')[-1][:-4]
    filename.append(fname)
    is_sorted.append(fname[-2:] == '-s')
    num_chans.append(len(set(output['spike_events']['Channel'])))
    rec_system.append('Cerebus')

/Volumes/L_MillerLab/data/Pancake_20K3/Cerebus_data/20221103/20221103_Pancake_WI_001.nev

20221103_Pancake_WI_001.nev opened


KeyboardInterrupt: 
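
The filename handling in the loop above can be done without opening the file, which avoids the slow getdata() call when only the name-derived fields are needed. This is a sketch using os.path instead of string splitting; `parse_nev_filename` is a hypothetical helper, and the '-s' suffix meaning "sorted" is taken from the check in the loop.

```python
import os


def parse_nev_filename(nev_path):
    """Return (filename without extension, is_sorted flag) for a .nev path."""
    stem = os.path.splitext(os.path.basename(nev_path))[0]
    return stem, stem.endswith('-s')


fname, sorted_flag = parse_nev_filename(
    '/Volumes/L_MillerLab/data/Pancake_20K3/Cerebus_data/20221103/20221103_Pancake_WI_001.nev'
)
print(fname, sorted_flag)
```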