Link to more in-depth descriptions here: https://blackrockneurotech.com/research/wp-content/ifu/LB-0023-7.00_NEV_File_Format.pdf

**Nev obj structure**

A Nev object has three main attributes/functions: basic_header, getdata(), and extended_headers. The documentation mentions others, such as processroicomments(), but the files we have do not have those.

nevobj.basic_header returns a dictionary with the following keys/values:
- **key**: 'FileTypeID', **value**: str (e.g. 'NEURALEV')
- **key**: 'FileSpec', **value**: str with float (e.g. '2.3')
- **key**: 'AddFlags', **value**: int (likely bool 1/0)
- **key**: 'BytesInHeader', **value**: int
- **key**: 'BytesInDataPackets', **value**: int
- **key**: 'TimeStampResolution', **value**: int
- **key**: 'SampleTimeResolution', **value**: int
- **key**: 'TimeOrigin', **value**: datetime.datetime
- **key**: 'CreatingApplication', **value**: str (e.g. 'File Dialog v7.0.4')
- **key**: 'Comment', **value**: str
- **key**: 'NumExtendedHeaders', **value**: int

nevobj.getdata() takes a long time and returns a dictionary with the following structure:
- **key**: spike_events, **value**: dict
    - **key**: TimeStamps, **value**: list
        - A list of times (integers) in ascending order at which spikes occur (**note**: this is NOT the same as the total duration of the session)
        - The length of this list should, in theory, equal the number of spikes (aka threshold crossings)
    - **key**: Unit, **value**: list
        - A list whose length is equal to TimeStamps. In all files I've opened, this list contains only 0s
    - **key**: Channel, **value**: list
        - A list that contains the channel number that corresponds to the spike event time in TimeStamps
        - If the first entry of TimeStamps is 30 and the first entry of Channel is 2, that means that a spike occurred in channel 2 at time 30
    - **key**: Waveforms, **value**: array
        - Array shape: num timestamps x num channels
        - The columns of this array contain the activity of the corresponding channel
- **key**: digital_events, **value**: dict (**note: not all files have digital_events - FR and Cage usually do not**)
    - **key**: Timestamps, **value**: list
        - Not the same timestamps as spike_events. Different lengths and values.
    - **key**: InsertionReason, **value**: list
        - A list whose length is equal to Timestamps. In all files I've opened, this list contains only 1s
    - **key**: UnparsedData, **value**: list
        - A list with integers that encode various task- and trial-related information. Details can be found here:  https://github.com/limblab/Behavior/blob/master/src/target/words.h
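
As a rough sketch of how the parallel TimeStamps and Channel lists fit together, the snippet below groups spike times by channel and converts them to seconds. The `spike_events` dict and the resolution value are made-up stand-ins for `nevobj.getdata()['spike_events']` and `nevobj.basic_header['TimeStampResolution']`, and the helper `spike_times_by_channel` is hypothetical (it is not part of brpylib). It assumes timestamps count clock ticks at `TimeStampResolution` ticks per second.

```python
from collections import defaultdict

def spike_times_by_channel(spike_events, timestamp_resolution):
    """Group spike times (converted to seconds) by channel number."""
    times = defaultdict(list)
    # TimeStamps and Channel are parallel lists: entry i of each describes spike i
    for ts, chan in zip(spike_events['TimeStamps'], spike_events['Channel']):
        times[chan].append(ts / timestamp_resolution)
    return dict(times)

# Made-up example data in the layout described above
spike_events = {'TimeStamps': [30, 45, 60, 90], 'Channel': [2, 1, 2, 1]}
by_channel = spike_times_by_channel(spike_events, timestamp_resolution=30000)
print(by_channel[2])  # spike times on channel 2, in seconds
```

Grouping by channel up front avoids repeatedly scanning the full TimeStamps list when looking at one electrode at a time.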
        
nevobj.extended_headers returns a list of dicts; the number of dicts equals 'NumExtendedHeaders' in nevobj.basic_header. 3 dicts in a row correspond to one electrode and contain the following info:
- dict1
    - **key**: 'PacketID', **value**: str (e.g. 'NEUEVWAV')
    - **key**: 'ElectrodeID', **value**: int
    - **key**: 'PhysicalConnector', **value**: int
    - **key**: 'ConnectorPin', **value**: int
    - **key**: 'DigitizationFactor', **value**: int
    - **key**: 'EnergyThreshold', **value**: int
    - **key**: 'HighThreshold', **value**: int
    - **key**: 'LowThreshold', **value**: int
    - **key**: 'NumSortedUnits', **value**: int
    - **key**: 'BytesPerWaveform', **value**: int
    - **key**: 'SpikeWidthSamples', **value**: int
    - **key**: 'EmptyBytes', **value**: bytes
- dict2: electrode number
    - **key**: PacketID, **value**: str 'NEUEVLBL'
    - **key**: ElectrodeID, **value**: int (e.g. 1, should correspond to dict1)
    - **key**: Label, **value**: str (the actual electrode number - e.g. 'elec78')
    - **key**: EmptyBytes, **value**: bytes (e.g. b'\x00\x00\x00\x00\x00\x00')
- dict3: filter information (type, frequency)
    - **key**: PacketID, **value**: str (e.g. 'NEUEVFLT')
    - **key**: ElectrodeID, **value**: int (e.g. 1, should correspond to dicts 1,2)
    - **key**: HighFreqCorner, **value**: str with float (e.g. '250.0 Hz')
    - **key**: HighFreqOrder, **value**: int (e.g. 4)
    - **key**: HighFreqType, **value**: str (e.g. 'butterworth')
    - **key**: LowFreqCorner, **value**: str with float (e.g. '7500.0 Hz')
    - **key**: LowFreqOrder, **value**: int (e.g. 3)
    - **key**: LowFreqType, **value**: str (e.g. 'butterworth')
    - **key**: EmptyBytes, **value**: bytes
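
Since the three dicts for one electrode share an ElectrodeID, one way to make them easier to query is to regroup the flat list by electrode. This is only a sketch: `group_headers_by_electrode` is a hypothetical helper, and the example headers are abbreviated, made-up stand-ins for `nevobj.extended_headers` (the SpikeWidthSamples value is illustrative). It assumes each per-electrode dict carries both 'PacketID' and 'ElectrodeID', as described above.

```python
from collections import defaultdict

def group_headers_by_electrode(extended_headers):
    """Return {ElectrodeID: {PacketID: header_dict}} from the flat header list."""
    grouped = defaultdict(dict)
    for hdr in extended_headers:
        # Skip any header that does not describe a single electrode
        if 'ElectrodeID' not in hdr:
            continue
        grouped[hdr['ElectrodeID']][hdr['PacketID']] = hdr
    return dict(grouped)

# Abbreviated, made-up headers for one electrode
example_headers = [
    {'PacketID': 'NEUEVWAV', 'ElectrodeID': 1, 'SpikeWidthSamples': 48},
    {'PacketID': 'NEUEVLBL', 'ElectrodeID': 1, 'Label': 'elec78'},
    {'PacketID': 'NEUEVFLT', 'ElectrodeID': 1, 'HighFreqCorner': '250.0 Hz'},
]
grouped = group_headers_by_electrode(example_headers)
print(grouped[1]['NEUEVLBL']['Label'])
```

Keying on ElectrodeID rather than on list position means the lookup still works if a file does not store exactly three dicts per electrode.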
        
**Nsx obj structure**

nsxobj.basic_header produces a dictionary with the following structure: 
- **key**: 'FileTypeID', **value**: str (e.g.  'NEURALCD')
- **key**: 'FileSpec', **value**: str with float (e.g.  '2.3')
- **key**: 'BytesInHeader', **value**: int (e.g.  8762)
- **key**: 'Label', **value**: str representing a rate (e.g.  '2 kS/s')
- **key**: 'Comment', **value**: str 
- **key**: 'Period', **value**: int (e.g.  15)
- **key**: 'TimeStampResolution', **value**: int (e.g.  30000)
- **key**: 'TimeOrigin', **value**: datetime.datetime (e.g.  datetime.datetime(2023, 2, 14, 21, 41, 36, 14000))
- **key**: 'ChannelCount', **value**: int (e.g.  128)
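
The example values above are consistent with the sampling rate being 'TimeStampResolution' divided by 'Period' (30000 / 15 = 2000, matching the '2 kS/s' label). A minimal sketch of that calculation, assuming 'Period' is the number of clock ticks between samples; `nsx_sampling_rate` is a hypothetical helper and `example_header` is a hand-written stand-in for `nsxobj.basic_header`:

```python
def nsx_sampling_rate(basic_header):
    """Sampling rate in Hz, assuming Period counts clock ticks between samples."""
    return basic_header['TimeStampResolution'] / basic_header['Period']

# Hand-written stand-in using the example values listed above
example_header = {'Label': '2 kS/s', 'Period': 15, 'TimeStampResolution': 30000}
rate_hz = nsx_sampling_rate(example_header)
print(rate_hz)  # 2000.0
```

Reading the rate from the header may be more robust than inferring it from the file extension, since it does not depend on how the file was named.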

nsxobj.getdata() takes a few seconds and returns a dict with the following structure:
- **key**: elec_ids, **value**: list
    - List of electrode IDs
- **key**: start_time_s, **value**: float
    - Usually 0.
- **key**: data_time_s, **value**: str
    - In all of the files I've opened, this string has been 'all'.
- **key**: downsample, **value**: int
    - Likely boolean - 1s or 0s. In all the files I've opened, it's been 1.
- **key**: data, **value**: list
    - A list containing a numpy array. This contains EMG, force, and other data.
- **key**: data_headers, **value**: list whose only element is a dict
    - **key**: Timestamp, **value**: int (always 0 from what I've seen)
    - **key**: NumDataPoints, **value**: int (with number of total data points - should equal time length of file times sampling rate)
- **key**: ExtendedHeaderIndices, **value**: list
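    - A list with one entry per electrode, i.e. the length of this list equals the number of electrodes. Each entry is used as an index into nsxobj.extended_headers to find that electrode's header. Values are usually between 1 and the number of electrodes.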
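
To show how these pieces connect, here is a sketch that looks up each channel's label through ExtendedHeaderIndices and then pulls out one channel's samples by label. The `output` and `extended_headers` objects are small made-up stand-ins for `nsxobj.getdata()` and `nsxobj.extended_headers`; the helper names are hypothetical. It assumes the array in `data` has one row per channel, in the same order as elec_ids, which should be checked against the brpylib version in use.

```python
import numpy as np

def channel_labels(output, extended_headers):
    """Return the ElectrodeLabel for each channel, in elec_ids order."""
    labels = []
    for ch_idx, _ in enumerate(output['elec_ids']):
        hdr_idx = output['ExtendedHeaderIndices'][ch_idx]
        labels.append(extended_headers[hdr_idx]['ElectrodeLabel'])
    return labels

def channel_data(output, extended_headers, label):
    """Return the samples of the channel with the given label."""
    row = channel_labels(output, extended_headers).index(label)
    # ASSUMPTION: data[0] is a (channels x samples) array
    return output['data'][0][row]

# Made-up stand-ins with two channels and three samples each
extended_headers = [
    {'ElectrodeID': 129, 'ElectrodeLabel': 'Fx'},
    {'ElectrodeID': 130, 'ElectrodeLabel': 'Fy'},
]
output = {
    'elec_ids': [129, 130],
    'ExtendedHeaderIndices': [0, 1],
    'data': [np.array([[1, 2, 3], [4, 5, 6]])],
}
fy = channel_data(output, extended_headers, 'Fy')
print(fy)
```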

nsxobj.extended_headers produces a list of dicts, one for each electrode. The dicts contain the following info:
- **key**: 'Type', **value**: str (e.g. 'CC')
- **key**: 'ElectrodeID', **value**: int (e.g. 1)
- **key**: 'ElectrodeLabel', **value**: str (e.g. 'elec109')
- **key**: 'PhysicalConnector', **value**: int (e.g. 1)
- **key**: 'ConnectorPin', **value**: int (e.g. 1)
- **key**: 'MinDigitalValue', **value**: int (e.g. -32764)
- **key**: 'MaxDigitalValue', **value**: int (e.g. 32764)
- **key**: 'MinAnalogValue', **value**: int (e.g. -8191)
- **key**: 'MaxAnalogValue', **value**: int (e.g. 8191)
- **key**: 'Units', **value**: str (e.g. 'uV')
- **key**: 'HighFreqCorner', **value**: str with float in hz (e.g. '0.3 Hz')
- **key**: 'HighFreqOrder', **value**: int (e.g. 1)
- **key**: 'HighFreqType', **value**: str (e.g. 'butterworth')
- **key**: 'LowFreqCorner', **value**: str with float in hz (e.g. '250.0 Hz')
- **key**: 'LowFreqOrder', **value**: int (e.g. 4)
- **key**: 'LowFreqType', **value**: str (e.g. 'butterworth')
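
The min/max digital and analog values suggest a linear mapping from raw integer samples to physical units. The sketch below applies that mapping; it assumes the relationship is linear between the two ranges, and `digital_to_analog` is a hypothetical helper rather than a brpylib function. Whether getdata() already applies this scaling may depend on the brpylib version, so check before applying it to real data.

```python
import numpy as np

def digital_to_analog(samples, header):
    """Linearly map raw digital samples onto the analog range given in the header."""
    dig_range = header['MaxDigitalValue'] - header['MinDigitalValue']
    ana_range = header['MaxAnalogValue'] - header['MinAnalogValue']
    scale = ana_range / dig_range  # analog units per digital step
    samples = np.asarray(samples, dtype=float)
    return (samples - header['MinDigitalValue']) * scale + header['MinAnalogValue']

# Header values taken from the examples listed above
header = {
    'MinDigitalValue': -32764, 'MaxDigitalValue': 32764,
    'MinAnalogValue': -8191, 'MaxAnalogValue': 8191,
    'Units': 'uV',
}
converted = digital_to_analog([0, 4, 32764], header)
print(converted, header['Units'])
```

With the example header, one digital step corresponds to 0.25 of the listed unit.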

**Force Files**

- sessions_key - from sessions table
- paper_key - ignore for now
- filename
- file_id - auto
- **rec_system** - 'Cerebus'
- sampling_rate - based on nsx file type
- **force_quality** - leave for now
- **force_notes** - leave for now
- **force_labels** - sometimes in the nsx extended headers, but not always. For FR files they seem to appear only sometimes (e.g. NsxFileObj.extended_headers[hdr_idx]['ElectrodeLabel'] returns Force_x), even though FR files should not have forces.
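
One possible way to fill in force_labels is to filter a file's electrode labels for names that look like force channels. The pattern below is an assumption based only on the label styles seen in this notebook ('Fx', 'Fy', 'Force_x'); other files may use different names, and `find_force_labels` is a hypothetical helper.

```python
import re

# ASSUMPTION: force channels are labelled like 'Fx', 'Fy', 'Fz' or 'Force_<something>'
FORCE_PATTERN = re.compile(r'^(F[xyz]|Force_.+)$', re.IGNORECASE)

def find_force_labels(labels):
    """Return the labels that match the assumed force-channel naming pattern."""
    return [lbl for lbl in labels if FORCE_PATTERN.match(lbl.strip())]

# Made-up label list in the style of the extended headers
labels = ['EMG_FCU1', 'EMG_FDP2', 'video_sync', 'Fx', 'Fy']
force_labels = find_force_labels(labels)
print(force_labels)  # ['Fx', 'Fy']
```

Anchoring the pattern at both ends keeps EMG labels that merely start with "F" after the prefix (e.g. 'EMG_FCU1') from being picked up.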

In [2]:
!pwd

/Users/aajanquail/Desktop/Jupyter_Notebooks/Miller_Lab/proc-aajan/Database_Migration


In [3]:
# Import dependencies
import pandas as pd
import numpy as np
from sqlalchemy import create_engine
import os
from os import path, system
import sys
import glob
from scipy import signal
import matplotlib.pyplot as plt
import xml.etree.ElementTree as ET
import time
# from PyQt5.QtWidgets import QFileDialog

# brpylib is the module that contains functions/classes that allow us to open and extract data from .nev and .nsx files
# from Python_Utilities import brpylib
# from Python_Utilities import brMiscFxns
sys.path.insert(0, '/Users/aajanquail/Desktop/Jupyter_Notebooks/Miller_Lab/')
from Python_Utilities_Kev import brpylib

In [4]:
sampling_rate_dict = {'ns1': 500, 'ns2': 1000, 'ns3': 2000, 'ns4': 10000, 'ns5': 30000, 'ns6': 0}

In [6]:
cerebus_data_dict = {}
base_dir = '/Volumes/L_MillerLab/data/'
for monkey in sorted(os.listdir(base_dir)):
#     if monkey not in ['.DS_Store','archive','Backed_up_data', 'Behavior','chewie-delete','CompiledCOFiles','DeepLabCutVids','DLC_models','DPZ','FSMIT_DataRestore_03172021', 'Han_13B1_target','IMU','Jarvis','Jango_redo','Jango_target_redo','LoadCell','Mihili_12A3_target','OldCerebusTest','Rats','Rats_target','Test data','Thumbs.db']:
    if (monkey == 'Pancake_20K3') or (monkey == 'Pop_18E3'):
        print(monkey)
        cerebus_data_dict[monkey] = {}
        monkey_path = os.path.join(base_dir, monkey)
        x = [i for i in os.listdir(monkey_path) if 'cerebus' in i.lower()]
        if len(x) != 0:
            cerebus_path = os.path.join(monkey_path, x[0])
        else:
            cerebus_path = monkey_path
        print(cerebus_path)
        nev_list = glob.glob(f"{cerebus_path}/*/*.nev")
        nsx_list = glob.glob(f"{cerebus_path}/*/*.ns*")
        ccf_list = glob.glob(f"{cerebus_path}/*/*.ccf")
        print(len(nev_list), len(nsx_list), len(ccf_list))
        cerebus_data_dict[monkey]['nev_list'] = nev_list
        cerebus_data_dict[monkey]['nsx_list'] = nsx_list
        cerebus_data_dict[monkey]['ccf_list'] = ccf_list

Pancake_20K3
/Volumes/L_MillerLab/data/Pancake_20K3/Cerebus_data
116 88 90
Pop_18E3
/Volumes/L_MillerLab/data/Pop_18E3/CerebusData
766 858 607


In [28]:
l3 = [i for i in cerebus_data_dict['Pancake_20K3']['nsx_list'] if i[-1] == '3']
l6 = [i for i in cerebus_data_dict['Pancake_20K3']['nsx_list'] if i[-1] == '6']

In [30]:
# Print the sampling-rate 'Label' from the basic header of each .ns3 file
for file in l3:
    nsxobj = brpylib.NsxFile(file)
    print(nsxobj.basic_header['Label'])


20211214_Pancake__FR_001.ns3 opened
2 kS/s

20230214_Pancake_WM_002.ns3 opened
2 kS/s

20230214_Pancake_WM_001.ns3 opened
2 kS/s

20230214_Pancake_WM_003.ns3 opened
2 kS/s

20210828_Pancake__FR_.ns3 opened
2 kS/s

20220921_Pancake_WS_Pre_Con_01.ns3 opened
2 kS/s

20220921_Pancake_PG_Pre_Con_02.ns3 opened
2 kS/s

20220921_Pancake_PG_Post_Con_03.ns3 opened
2 kS/s

20220921_Pancake_WS_Post_Con_04.ns3 opened
2 kS/s

20220920_Pancake_FR_004.ns3 opened
2 kS/s

20220920_Pancake_FR_005.ns3 opened
2 kS/s

20220920_Pancake_FR_002.ns3 opened
2 kS/s

20220920_Pancake_FR_003.ns3 opened
2 kS/s

20220920_Pancake_FR_001.ns3 opened
2 kS/s

20220121_Pancake_PG_001.ns3 opened
2 kS/s

20220106_Pancake_FR_001.ns3 opened
2 kS/s

20220103_Pancake_FR_001.ns3 opened
2 kS/s

20220103_Pancake_FR_002.ns3 opened
2 kS/s

20210826_Pancake__FR_001.ns3 opened
2 kS/s

20221102_Pancke_PG_Post_Cyp_01.ns3 opened
2 kS/s

20221102_Pancke_PG_Pre_Cyp_01.ns3 opened
2 kS/s

20210917_Pancake_PG_001.ns3 opened
2 kS/s

20210917_P

In [29]:
# Print the sampling-rate 'Label' from the basic header of each .ns6 file
for file in l6:
    nsxobj = brpylib.NsxFile(file)
    print(nsxobj.basic_header['Label'])


20220921_Pancake_PG_Pre_Con_02.ns6 opened
raw

20220921_Pancake_PG_Post_Con_03.ns6 opened
raw

20220921_Pancake_WS_Pre_Con_01.ns6 opened
raw

20220921_Pancake_WS_Post_Con_04.ns6 opened
raw

20221102_Pancke_PG_Post_Cyp_01.ns6 opened
raw

20221102_Pancke_PG_Pre_Cyp_01.ns6 opened
raw

20220916_Pancake_PG_Pre_Cyp_04.ns6 opened
raw

20220916_Pancake_PG_Pre_Cyp_03.ns6 opened
raw

20220916_Pancake_WS_Post_Cyp_02.ns6 opened
raw

20220916_Pancake_PG_Pre_Cyp_02.ns6 opened
raw

20220916_Pancake_PG_Post_Cyp_01.ns6 opened
raw

20220916_Pancake_WS_Pre_Cyp_01.ns6 opened
raw

20221027_Pancake_PG_01.ns6 opened
raw

20221028_Pancake_PG_01.ns6 opened
raw


# Remember to change code in brpylib.py to allow for large files > 1GB!!

In [4]:
force_dict = {'filename': [], 'rec_system': [], 'sampling_rate': [], 'force_labels': []}

for monkey in cerebus_data_dict.keys():
    for file in cerebus_data_dict[monkey]['nsx_list']:
        filename = file.split('/')[-1]
        # Skip free-reaching and cage recordings, which should not contain force channels
        if (('FR' not in filename) and ('freereaching' not in filename) and ('Cage' not in filename)):
            nsxobj = brpylib.NsxFile(file)
            output_nsx = nsxobj.getdata()
            # getdata() appears to return 0 when it skips a file (e.g. output larger than 1 GB)
            if output_nsx == 0:
                continue
            # Look up each channel's label through its extended-header index
            force_labels_lst = []
            for ch_idx, _ in enumerate(output_nsx['elec_ids']):
                hdr_idx = output_nsx['ExtendedHeaderIndices'][ch_idx]
                label = nsxobj.extended_headers[hdr_idx]['ElectrodeLabel']
                force_labels_lst.append(label)

            # Store every label for the file as one comma-separated string
            # (previously only the last label was appended)
            force_dict['force_labels'].append(','.join(force_labels_lst))
            force_dict['filename'].append(filename)
            force_dict['rec_system'].append('Cerebus')
            force_dict['sampling_rate'].append(sampling_rate_dict[filename[-3:]])


20230214_Pancake_WM_002.ns3 opened

20230214_Pancake_WM_001.ns3 opened

20230214_Pancake_WM_003.ns3 opened

20220921_Pancake_WS_Pre_Con_01.ns3 opened

20220921_Pancake_PG_Pre_Con_02.ns6 opened
Output data requested is larger than 1 GB, skipping

20220921_Pancake_PG_Post_Con_03.ns6 opened
Output data requested is larger than 1 GB, skipping

20220921_Pancake_PG_Pre_Con_02.ns3 opened

20220921_Pancake_PG_Post_Con_03.ns3 opened

20220921_Pancake_WS_Pre_Con_01.ns6 opened
Output data requested is larger than 1 GB, skipping

20220921_Pancake_WS_Post_Con_04.ns3 opened

20220921_Pancake_WS_Post_Con_04.ns6 opened
Output data requested is larger than 1 GB, skipping

20220121_Pancake_PG_001.ns3 opened

20221102_Pancke_PG_Post_Cyp_01.ns3 opened

20221102_Pancke_PG_Pre_Cyp_01.ns3 opened

20221102_Pancke_PG_Post_Cyp_01.ns6 opened
Output data requested is larger than 1 GB, skipping

20221102_Pancke_PG_Pre_Cyp_01.ns6 opened
Output data requested is larger than 1 GB, skipping

20210917_Pancake_PG_001.n


20190916_Pop_horiz_wm_001.ns3 opened

Pop_20220322_PG_001.ns3 opened

Pop_20220322_PG_002.ns3 opened

20190708_Pop_horizWM_001.ns3 opened


KeyboardInterrupt: 

In [7]:
elect_labels

{'FR': {'20211214_Pancake__FR_001.ns3': ['video_sync', 'Fx', 'Fy'],
  '20210828_Pancake__FR_.ns3': ['video_sync'],
  '20220920_Pancake_FR_004.ns3': ['EMG_FCU1',
   'EMG_FDP2',
   'EMG_FDS2',
   'EMG_APB',
   'EMG_FPB',
   'EMG_Lum',
   'EMG_ECU',
   'EMG_ECR',
   'video_sync'],
  '20220920_Pancake_FR_005.ns3': ['EMG_FCU1',
   'EMG_FDP2',
   'EMG_FDS2',
   'EMG_APB',
   'EMG_FPB',
   'EMG_Lum',
   'EMG_ECU',
   'EMG_ECR',
   'video_sync'],
  '20220920_Pancake_FR_002.ns3': ['EMG_FCU1',
   'EMG_FDP2',
   'EMG_FDS2',
   'EMG_APB',
   'EMG_FPB',
   'EMG_Lum',
   'EMG_ECU',
   'EMG_ECR',
   'video_sync'],
  '20220920_Pancake_FR_003.ns3': ['EMG_FCU1',
   'EMG_FDP2',
   'EMG_FDS2',
   'EMG_APB',
   'EMG_FPB',
   'EMG_Lum',
   'EMG_ECU',
   'EMG_ECR',
   'video_sync'],
  '20220920_Pancake_FR_001.ns3': ['EMG_FCU1',
   'EMG_FDP2',
   'EMG_FDS2',
   'EMG_APB',
   'EMG_FPB',
   'EMG_Lum',
   'EMG_ECU',
   'EMG_ECR',
   'video_sync'],
  '20220106_Pancake_FR_001.ns3': ['video_sync'],
  '20220103_Panc

In [8]:
for key, labels in elect_labels['Non FR'].items():
    if 'Fx' in labels:
        print(key, labels)

20220921_Pancake_WS_Pre_Con_01.ns3 ['Fx', 'Fy']
20220921_Pancake_PG_Pre_Con_02.ns3 ['Fx', 'Fy']
20220921_Pancake_PG_Post_Con_03.ns3 ['Fx', 'Fy']
20220921_Pancake_WS_Post_Con_04.ns3 ['Fx', 'Fy']
20220121_Pancake_PG_001.ns3 ['Fx', 'Fy']
20221102_Pancke_PG_Post_Cyp_01.ns3 ['Fx', 'Fy']
20221102_Pancke_PG_Pre_Cyp_01.ns3 ['Fx', 'Fy']
20210917_Pancake_PG_001.ns3 ['Fx', 'Fy']
20210917_Pancake_PG_002.ns3 ['Fx', 'Fy']
20210823_Pancake20K2_PG_001.ns3 ['Fx', 'Fy']
20210913_Pancake_PG_001.ns3 ['Fx', 'Fy']
20220916_Pancake_PG_Pre_Cyp_02.ns3 ['Fx', 'Fy']
20220916_Pancake_PG_Pre_Cyp_04.ns3 ['Fx', 'Fy']
20220916_Pancake_WS_Post_Cyp_02.ns3 ['Fx', 'Fy']
20220916_Pancake_PG_Pre_Cyp_03.ns3 ['Fx', 'Fy']
20220916_Pancake_WS_Pre_Cyp_01.ns3 ['Fx', 'Fy']
20220916_Pancake_PG_Post_Cyp_01.ns3 ['Fx', 'Fy']
20221027_Pancake_PG_01.ns3 ['Fx', 'Fy']
20210831_Pancake_KG_001.ns3 ['Fx', 'Fy']
20220124_Pancake_PG_001.ns3 ['Fx', 'Fy']
20210911_Pancake_PG_001.ns3 ['Fx', 'Fy']
20210920_Pancake_KG_001.ns3 ['Fx', 'Fy']
20210920

In [41]:
sessions_key = []   # from sessions table
paper_key = []      # ignore for now
filename = []
file_id = []        # auto
rec_system = []     # 'Cerebus'
sampling_rate = []  # based on nsx file type - what about rhd?
force_quality = []  # leave for now
force_notes = []    # leave for now
force_labels = []   # sometimes in nsx extended headers, but not always - seems to be only sometimes for FR files - e.g. NsxFileObj.extended_headers[hdr_idx]['ElectrodeLabel'] returns Force_x - FR should not have forces, but...?
is_sorted = []      # True if the filename ends in '-s' (filled in below; was previously undefined)

shortened_nev_list = [cerebus_data_dict['Pancake_20K3']['nev_list'][1]]
for nev_filename in shortened_nev_list:
    # open file
    print(nev_filename)
    nevobj = brpylib.NevFile(nev_filename)
    output = nevobj.getdata(elec_ids='all')
    
    fname = nev_filename.split('/')[-1][:-4]
    filename.append(fname)
    is_sorted.append(fname[-2:] == '-s')
    rec_system.append('Cerebus')

/Volumes/L_MillerLab/data/Pancake_20K3/Cerebus_data/20221103/20221103_Pancake_WI_001.nev

20221103_Pancake_WI_001.nev opened


KeyboardInterrupt: 