Link to more in-depth descriptions here: https://blackrockneurotech.com/research/wp-content/ifu/LB-0023-7.00_NEV_File_Format.pdf

**Nev obj structure**

It has three main attributes/methods: basic_header, extended_headers, and getdata(). The documentation mentions others, such as processroicomments(), but the files we have do not have those.

nevobj.basic_header returns a dictionary with the following keys/values:
- **key**: 'FileTypeID', **value**: str (e.g. 'NEURALEV')
- **key**: 'FileSpec', **value**: str with float (e.g. '2.3')
- **key**: 'AddFlags', **value**: int (likely bool 1/0)
- **key**: 'BytesInHeader', **value**: int
- **key**: 'BytesInDataPackets', **value**: int
- **key**: 'TimeStampResolution', **value**: int
- **key**: 'SampleTimeResolution', **value**: int
- **key**: 'TimeOrigin', **value**: datetime.datetime
- **key**: 'CreatingApplication', **value**: str (e.g. 'File Dialog v7.0.4')
- **key**: 'Comment', **value**: str
- **key**: 'NumExtendedHeaders', **value**: int

nevobj.getdata() takes a long time and returns a dictionary with the following structure:
- **key**: spike_events, **value**: dict
    - **key**: TimeStamps, **value**: list
        - A list of times (integers) in ascending order at which spikes occur (**note**: this is NOT the same as the total duration of the session)
        - The length of this list should, in theory, equal the number of spikes (aka threshold crossings)
    - **key**: Unit, **value**: list
        - A list whose length is equal to that of TimeStamps. In all files I've opened, this list contains only 0s
    - **key**: Channel, **value**: list
        - A list that contains the channel number that corresponds to the spike event time in TimeStamps
        - If the first entry of TimeStamps is 30 and the first entry of Channel is 2, that means that a spike occurred in channel 2 at time 30
    - **key**: Waveforms, **value**: array
        - Array shape: num timestamps x num samples per waveform
        - Each row holds the waveform snippet for the spike at the same index in TimeStamps. The second dimension is not the channel count: one file with 128 unique channels had a Waveforms array of shape (2355452, 96)
- **key**: digital_events, **value**: dict (**note: not all files have digital_events - FR and Cage usually do not**)
    - **key**: Timestamps, **value**: list
        - Not the same timestamps as spike_events. Different lengths and values.
    - **key**: InsertionReason, **value**: list
        - A list whose length is equal to Timestamps. In all files I've opened, this list contains only 1s
    - **key**: UnparsedData, **value**: list
        - A list with integers that encode various task- and trial-related information. Details can be found here:  https://github.com/limblab/Behavior/blob/master/src/target/words.h
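
The pairing of TimeStamps and Channel described above can be sketched as follows. This is a minimal sketch on a hand-made dict that mimics the spike_events structure. The values, the helper name `spike_times_by_channel`, and the assumption that timestamps are tick counts at 'TimeStampResolution' ticks per second (30000 here) are illustrative and not taken from a real file.

```python
from collections import defaultdict

def spike_times_by_channel(spike_events, timestamp_resolution):
    """Group spike times (converted to seconds) by channel number."""
    times = defaultdict(list)
    for ts, chan in zip(spike_events['TimeStamps'], spike_events['Channel']):
        times[chan].append(ts / timestamp_resolution)
    return dict(times)

# Hand-made stand-in for nevobj.getdata()['spike_events'] (illustrative values)
fake_spike_events = {
    'TimeStamps': [30, 45, 60, 90],
    'Channel': [2, 1, 2, 1],
    'Unit': [0, 0, 0, 0],
}
by_channel = spike_times_by_channel(fake_spike_events, timestamp_resolution=30000)
print(by_channel[2])  # spike times on channel 2, in seconds
```

Keeping the result as a plain dict of lists makes it easy to count spikes per channel with `len(by_channel[chan])`.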
        
nevobj.extended_headers returns a list of dicts; the number of dicts equals 'NumExtendedHeaders' in nevobj.basic_header. Three consecutive dicts correspond to one electrode and contain the following info:
- dict1
    - **key**: 'PacketID', **value**: str (e.g. 'NEUEVWAV')
    - **key**: 'ElectrodeID', **value**: int
    - **key**: 'PhysicalConnector', **value**: int
    - **key**: 'ConnectorPin', **value**: int
    - **key**: 'DigitizationFactor', **value**: int
    - **key**: 'EnergyThreshold', **value**: int
    - **key**: 'HighThreshold', **value**: int
    - **key**: 'LowThreshold', **value**: int
    - **key**: 'NumSortedUnits', **value**: int
    - **key**: 'BytesPerWaveform', **value**: int
    - **key**: 'SpikeWidthSamples', **value**: int
    - **key**: 'EmptyBytes', **value**: bytes
- dict2: electrode label
    - **key**: 'PacketID', **value**: str (e.g. 'NEUEVLBL')
    - **key**: 'ElectrodeID', **value**: int (e.g. 1, should correspond to dict1)
    - **key**: 'Label', **value**: str (the actual electrode number - e.g. 'elec78')
    - **key**: 'EmptyBytes', **value**: bytes (e.g. b'\x00\x00\x00\x00\x00\x00')
- dict3: filter information (type, frequency)
    - **key**: 'PacketID', **value**: str (e.g. 'NEUEVFLT')
    - **key**: 'ElectrodeID', **value**: int (e.g. 1, should correspond to dicts 1 and 2)
    - **key**: 'HighFreqCorner', **value**: str with float (e.g. '250.0 Hz')
    - **key**: 'HighFreqOrder', **value**: int (e.g. 4)
    - **key**: 'HighFreqType', **value**: str (e.g. 'butterworth')
    - **key**: 'LowFreqCorner', **value**: str with float (e.g. '7500.0 Hz')
    - **key**: 'LowFreqOrder', **value**: int (e.g. 3)
    - **key**: 'LowFreqType', **value**: str (e.g. 'butterworth')
    - **key**: 'EmptyBytes', **value**: bytes
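
Since the three dicts for an electrode share an 'ElectrodeID', one way to work with them is to group by that ID instead of relying on list position. The sketch below uses a hand-made list in place of nevobj.extended_headers; the helper name `headers_by_electrode` and the example values are made up for illustration.

```python
def headers_by_electrode(extended_headers):
    """Group header dicts by ElectrodeID, then by PacketID."""
    merged = {}
    for hdr in extended_headers:
        if 'ElectrodeID' not in hdr:
            continue  # skip any header that does not describe an electrode
        merged.setdefault(hdr['ElectrodeID'], {})[hdr['PacketID']] = hdr
    return merged

# Hand-made stand-in for nevobj.extended_headers (only a few keys shown)
fake_headers = [
    {'PacketID': 'NEUEVWAV', 'ElectrodeID': 1, 'NumSortedUnits': 0},
    {'PacketID': 'NEUEVLBL', 'ElectrodeID': 1, 'Label': 'elec78'},
    {'PacketID': 'NEUEVFLT', 'ElectrodeID': 1, 'HighFreqCorner': '250.0 Hz'},
    {'PacketID': 'NEUEVWAV', 'ElectrodeID': 2, 'NumSortedUnits': 0},
    {'PacketID': 'NEUEVLBL', 'ElectrodeID': 2, 'Label': 'elec12'},
    {'PacketID': 'NEUEVFLT', 'ElectrodeID': 2, 'HighFreqCorner': '250.0 Hz'},
]
electrodes = headers_by_electrode(fake_headers)
print(electrodes[1]['NEUEVLBL']['Label'])
```

Grouping by ID keeps working if the list contains extra headers that do not follow the three-per-electrode pattern.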
        
**Nsx obj structure**

nsxobj.basic_header produces a dictionary with the following structure: 
- **key**: 'FileTypeID', **value**: str (e.g.  'NEURALCD')
- **key**: 'FileSpec', **value**: str with float (e.g.  '2.3')
- **key**: 'BytesInHeader', **value**: int (e.g.  8762)
- **key**: 'Label', **value**: str representing a rate (e.g.  '2 kS/s')
- **key**: 'Comment', **value**: str 
- **key**: 'Period', **value**: int (e.g.  15)
- **key**: 'TimeStampResolution', **value**: int (e.g.  30000)
- **key**: 'TimeOrigin', **value**: datetime.datetime (e.g. datetime.datetime(2023, 2, 14, 21, 41, 36, 14000))
- **key**: 'ChannelCount', **value**: int (e.g.  128)
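
The example values above fit together if 'Period' is read as the number of 'TimeStampResolution' ticks between samples, which is an assumption on my part based on the example numbers. A minimal sketch, with the helper name `nsx_sampling_rate` made up for illustration:

```python
def nsx_sampling_rate(basic_header):
    """Sampling rate in Hz, assuming Period counts clock ticks per sample."""
    return basic_header['TimeStampResolution'] / basic_header['Period']

# Values copied from the example header entries listed above
example_header = {'TimeStampResolution': 30000, 'Period': 15, 'Label': '2 kS/s'}
rate_hz = nsx_sampling_rate(example_header)
print(rate_hz)  # 2000.0, consistent with the '2 kS/s' label
```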

nsxobj.getdata() takes a few seconds and returns a dict with the following structure:
- **key**: elec_ids, **value**: list
    - List of electrode id's
- **key**: start_time_s, **value**: float
    - Usually 0.
- **key**: data_time_s, **value**: str
    - In all of the files I've opened, this string has been 'all'.
- **key**: downsample, **value**: int
    - Most likely the downsampling factor passed to getdata() rather than a boolean (1 would mean no downsampling). In all the files I've opened, it's been 1.
- **key**: data, **value**: list
    - A list containing a numpy array. This contains emg, force, and other data.
- **key**: data_headers, **value**: list whose only element is a dict
    - **key**: Timestamp, **value**: int (always 0 from what I've seen)
    - **key**: NumDataPoints, **value**: int (with number of total data points - should equal time length of file times sampling rate)
- **key**: ExtendedHeaderIndices, **value**: list
    - A list containing the unique electrode ids - i.e. the length of this list equals the number of electrodes. Values usually between 1 - num electrodes.
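
The relation noted for NumDataPoints (time length of the file times sampling rate) can be turned around to estimate the recording duration. This is a small sketch on a made-up data_headers list; `recording_duration_s` is a hypothetical helper, and the sampling rate has to be supplied separately.

```python
def recording_duration_s(data_headers, sampling_rate_hz):
    """Total duration in seconds, summed over all data segments."""
    total_points = sum(h['NumDataPoints'] for h in data_headers)
    return total_points / sampling_rate_hz

# Made-up stand-in for nsxobj.getdata()['data_headers']
fake_data_headers = [{'Timestamp': 0, 'NumDataPoints': 120000}]
duration = recording_duration_s(fake_data_headers, sampling_rate_hz=2000)
print(duration)  # 60.0 seconds for this made-up header
```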

nsxobj.extended_headers produces a list of dicts, one for each electrode. The dicts contain the following info:
- **key**: 'Type', **value**: str (e.g. 'CC')
- **key**: 'ElectrodeID', **value**: int (e.g. 1)
- **key**: 'ElectrodeLabel', **value**: str (e.g. 'elec109')
- **key**: 'PhysicalConnector', **value**: int (e.g. 1)
- **key**: 'ConnectorPin', **value**: int (e.g. 1)
- **key**: 'MinDigitalValue', **value**: int (e.g. -32764)
- **key**: 'MaxDigitalValue', **value**: int (e.g. 32764)
- **key**: 'MinAnalogValue', **value**: int (e.g. -8191)
- **key**: 'MaxAnalogValue', **value**: int (e.g. 8191)
- **key**: 'Units', **value**: str (e.g. 'uV')
- **key**: 'HighFreqCorner', **value**: str with float in hz (e.g. '0.3 Hz')
- **key**: 'HighFreqOrder', **value**: int (e.g. 1)
- **key**: 'HighFreqType', **value**: str (e.g. 'butterworth')
- **key**: 'LowFreqCorner', **value**: str with float in hz (e.g. '250.0 Hz')
- **key**: 'LowFreqOrder', **value**: int (e.g. 4)
- **key**: 'LowFreqType', **value**: str (e.g. 'butterworth')
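
The Min/Max digital and analog values describe a linear mapping from raw integer samples to physical units. The sketch below shows that mapping with the example values listed above; `digital_to_analog` is a hypothetical helper. Whether getdata() already applies this scaling depends on the brpylib version, so check before applying it to real data.

```python
import numpy as np

def digital_to_analog(raw, ext_header):
    """Linearly map raw digital values onto the analog range in the header."""
    dig_min, dig_max = ext_header['MinDigitalValue'], ext_header['MaxDigitalValue']
    ana_min, ana_max = ext_header['MinAnalogValue'], ext_header['MaxAnalogValue']
    scale = (ana_max - ana_min) / (dig_max - dig_min)  # analog units per digital step
    return (np.asarray(raw, dtype=float) - dig_min) * scale + ana_min

# Example values taken from the extended header entries listed above
example_ext_header = {
    'MinDigitalValue': -32764, 'MaxDigitalValue': 32764,
    'MinAnalogValue': -8191, 'MaxAnalogValue': 8191, 'Units': 'uV',
}
scaled = digital_to_analog([-32764, 0, 32764], example_ext_header)
print(scaled)  # values expressed in the header's Units (uV here)
```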

**Threshold Files**

- sessions_key - from sessions table
- **array_serial** - search the ccf file with a regular expression for a .cmp file name; the name of the .cmp file contains the array serial number
- paper_key - ignore
- filename
- file_id
- **setting_file** - name of ccf file
- is_sorted - sorted files should have a -s suffix in the filename
- sorted_by - leave blank (maybe check metadata of file)
- num_chans - len(set(output['spike_events']['Channel'])), which should equal len(output_nsx['ExtendedHeaderIndices'])
- **num_units** - for unsorted files, equal to num_chans; otherwise take a look at the shape of the data array
- **rec_system** - 'Cerebus'
- **connect_type** - should be a string containing something like 'analog', 'digital', 'wireless' - check blackrock documentation 
- **connect_serial** - check docs? If not there don't worry
- **threshold_quality** - check daily logs
- **threshold_notes** - check daily logs
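
Several of these fields can be read straight off the file path. Below is a minimal sketch, assuming the filename starts with an 8-digit YYYYMMDD date and that sorted files end in '-s' before the extension. `describe_recording` and the extra 'rec_date' field are made up for illustration and are not part of the table.

```python
import os
from datetime import datetime

def describe_recording(path):
    """Derive filename, is_sorted, recording date, and rec_system from a path."""
    fname = os.path.splitext(os.path.basename(path))[0]
    return {
        'filename': fname,
        'is_sorted': fname.endswith('-s'),
        'rec_date': datetime.strptime(fname[:8], '%Y%m%d').date(),  # hypothetical field
        'rec_system': 'Cerebus',
    }

info = describe_recording(
    '/Volumes/L_MillerLab/data/Pop_18E3/CerebusData/20210712/20210712_Pop_FR_01-s.nev'
)
print(info)
```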

**Questions**

- No .cmp or array serial anywhere in the ccf file. Checked both in Matlab using ParseCCF and by looking at tags in Python.
    - Search for the implant database on the labwiki. On the server: General Lab Info/Implants
    - For some nev files, the part of the brain corresponding to the array appears in the filename (e.g. Mihili in 2014); for others it doesn't (e.g. Pop)
    - An array inventory Excel file exists in /Volumes/fsmresfiles/Basic_Sciences/Phys/L_MillerLab/limblab/lab_folder/Lab-Wide Animal Info/Implants/Blackrock Array Info
    - Make sure to look at implants and removals - some arrays have been removed (e.g. Pop)
    - ########################################################################
    - monkeys in arrays table
        - array(['Thor', 'Mini', 'Tiki', 'Pedro', 'Kramer', 'Arthur', 'Theo',
       'Fashizzle', 'Keedoo', 'Louie', 'Chewie', 'Jaco', 'Fidel', 'Spike'],dtype=object)
    - array_inventory.xls does not indicate removal of arrays or whether a monkey has multiple arrays
    - The surgeries folders are complicated: there are separate folders for implants and explants, and I can try to infer the arrays from the dates of those folders, but that is not straightforward

- For '/Volumes/L_MillerLab/data/Pancake_20K3/Cerebus_data/20221103/20221103_Pancake_WI_001.nev', num channels (output['spike_events']['Waveforms'].shape[1]) != num units (len(set(output['spike_events']['Channel']))), but it's not listed with -s. Labeling issue? Or am I doing something wrong?
    - look at the sorted files, find unique combos of unit/channel, and discount unit 255 (considered bad data)
- Nothing on connect type in either matlab or python
    - connect type in daily logs. All cage stuff will be with wireless system.
    - Checked daily logs, but only exists for some monkeys - at least several monkeys don't have anything
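
The suggestion above for counting units in sorted files (unique unit/channel combinations, ignoring unit 255) can be sketched like this. The dict is a hand-made stand-in for output['spike_events'], and `count_units` is a hypothetical helper. Whether unit 0 should also be excluded is left open here.

```python
def count_units(spike_events, bad_unit=255):
    """Count unique (channel, unit) pairs, skipping the unit code treated as bad data."""
    pairs = {
        (chan, unit)
        for chan, unit in zip(spike_events['Channel'], spike_events['Unit'])
        if unit != bad_unit
    }
    return len(pairs)

# Hand-made stand-in for output['spike_events'] from a sorted file
fake_sorted_events = {
    'Channel': [1, 1, 1, 2, 2, 3],
    'Unit': [1, 2, 1, 1, 255, 255],
}
print(count_units(fake_sorted_events))
```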

In [1]:
# Import dependencies
import pandas as pd
import numpy as np
from sqlalchemy import create_engine
import os
from os import path, system
import sys
from sys import platform
import glob
from scipy import signal
import matplotlib.pyplot as plt
import xml.etree.ElementTree as ET
import time
# from PyQt5.QtWidgets import QFileDialog

# brpylib is the module that contains functions/classes that allow us to open and extract data from .nev and .nsx files
# from Python_Utilities import brpylib
# from Python_Utilities import brMiscFxns
from Python_Utilities_Kev import brpylib

In [3]:
sampling_rate_dict = {'ns1': 500,'ns2': 1000,'ns3': 2000,'ns4': 10000,'ns5': 30000}

In [8]:
pattern

re.compile(r'surge', re.IGNORECASE|re.UNICODE)

In [12]:
import os
import re

root_path = "/Volumes/L_MillerLab/limblab/lab_folder/Animal-Miscellany"
search_term = "surge"

# Compile a case-insensitive regex pattern with the search term
pattern = re.compile(search_term, re.IGNORECASE)

# Use os.walk to traverse all subdirectories
for root, dirs, files in os.walk(root_path):
    # Loop over all directory names
    for dir_name in dirs:
        # Check if the directory name matches the regex pattern
        if pattern.search(dir_name):
            # If it does, print the full path to the directory
            dir_path = os.path.join(root, dir_name)
            print(dir_path)

/Volumes/L_MillerLab/limblab/lab_folder/Animal-Miscellany/Yanny_18J1/Surgeries
/Volumes/L_MillerLab/limblab/lab_folder/Animal-Miscellany/Groot_19L2/Surgeries
/Volumes/L_MillerLab/limblab/lab_folder/Animal-Miscellany/Groot_19L2/Surgeries/complications after surgery
/Volumes/L_MillerLab/limblab/lab_folder/Animal-Miscellany/Tot_20K4/Surgery
/Volumes/L_MillerLab/limblab/lab_folder/Animal-Miscellany/Rocket_19L1/Surgeries
/Volumes/L_MillerLab/limblab/lab_folder/Animal-Miscellany/Rocket_19L1/Surgeries/Cuneate Implant 06302020/surgery notes
/Volumes/L_MillerLab/limblab/lab_folder/Animal-Miscellany/Rocket_19L1/Surgeries/Cuneate Implant 06302020/Rocket surgery pictures
/Volumes/L_MillerLab/limblab/lab_folder/Animal-Miscellany/Butter_17D2/Surgeries
/Volumes/L_MillerLab/limblab/lab_folder/Animal-Miscellany/Pancake_20K2/Surgeries
/Volumes/L_MillerLab/limblab/lab_folder/Animal-Miscellany/Sherry_20C1/Surgery
/Volumes/L_MillerLab/limblab/lab_folder/Animal-Miscellany/Sherry_20C1/Surgery/UTDright_202106

In [4]:
root_path = '/Volumes/fsmresfiles/Basic_Sciences/Phys/L_MillerLab/limblab/lab_folder/Animal-Miscellany'
extension = ".cmp"
for root, dirs, files in os.walk(root_path):
    for file in files:
        if file.endswith(extension):
            file_path = os.path.join(root, file)
            # The first folder below root_path is the monkey folder (e.g. 'Yanny_18J1')
            monkey = os.path.relpath(file_path, root_path).split(os.sep)[0]
            monkey_name = monkey.split('_')[0]
            print(monkey_name, file)

Yanny SN 6250-002470 array 1052-3.cmp
Groot SN 6250-002338.cmp
Tot SN 6251-002471 array 1066-5.cmp
Rocket SN 6250-002469.cmp
Rocket A2_3a_together.cmp
Rocket 3a.cmp
Rocket SN 6251-002088.cmp
Rocket SN 6250-002385.cmp
Butter SN 6250-001799.cmp
Butter SN 6250-001799.cmp
Pancake SN 6250-002468 array 1059-12.cmp
Greyson SN 6250-001696.cmp
Greyson SN 6250-002085.cmp
Jaco Jaco_Grid_Map_1025-0397.cmp
Crackle SN 6250-002067.cmp
Crackle SN 6251-001695.cmp
Crackle S1_3a_together.cmp
Snap SN 6250-002068.cmp
Han SN 6251-001459.cmp
Duncan SN 6251-002087.cmp
Duncan SN 6251-001804.cmp
Duncan SN 4566-002186.cmp
Pop SN 6250-002084.cmp
Pop SN 6250-002339 .cmp
Pop SN 6250-002086.cmp
Pop SN 6250-002085.cmp
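
Most of the .cmp names listed above embed the serial number as 'SN dddd-dddddd', so a regular expression can pull it out. This is a sketch under that assumption: the pattern and the helper name `serial_from_cmp_name` are mine, and names that do not follow the pattern (e.g. 'A2_3a_together.cmp') simply return None.

```python
import re

# Assumed pattern: 'SN', optional whitespace, four digits, a hyphen, six digits
SERIAL_PATTERN = re.compile(r'SN\s*(\d{4}-\d{6})')

def serial_from_cmp_name(cmp_name):
    """Return the serial number embedded in a .cmp file name, or None."""
    match = SERIAL_PATTERN.search(cmp_name)
    return match.group(1) if match else None

for name in ['SN 6250-002470 array 1052-3.cmp', 'SN 6250-002339 .cmp', 'A2_3a_together.cmp']:
    print(name, '->', serial_from_cmp_name(name))
```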


In [9]:
array_inventory_filename = "/Volumes/fsmresfiles/Basic_Sciences/Phys/L_MillerLab/limblab/lab_folder/Lab-Wide Animal Info/Implants/Blackrock Array Info/array_inventory.xls"
sheet_name = "Latest Inventory"
array_inv = pd.read_excel(array_inventory_filename,sheet_name)

In [10]:
array_inv.shape

(85, 9)

In [11]:
array_inv.head()

Unnamed: 0,SN,Received,Elec. Length (mm),Lead Length (cm),Type,Implanted,Monkey,Site,Notes
0,,,,,Pt (ICS-96),2002-09-24 00:00:00,Gilbert,,
1,,,,,Pt (ICS-96),2003-01-08 00:00:00,Gilbert,,
2,04Jan06F-7,9/3/04?,1.0,3.5,Pt (ICS-96),,Tito?,,
3,02Jan1E-9,2003-04-24 00:00:00,1.0,3.5,Pt (ICS-96),,Tito?,,
4,04Jan06F-4,9/3/04?,,,Pt (ICS-96),2004-12-01 00:00:00,Animal,,


In [21]:
array_inv[array_inv[['SN', 'Monkey']].notnull().all(axis=1)]

Unnamed: 0,SN,Received,Elec. Length (mm),Lead Length (cm),Type,Implanted,Monkey,Site,Notes
2,04Jan06F-7,9/3/04?,1.0,3.5,Pt (ICS-96),,Tito?,,
3,02Jan1E-9,2003-04-24 00:00:00,1.0,3.5,Pt (ICS-96),,Tito?,,
4,04Jan06F-4,9/3/04?,,,Pt (ICS-96),2004-12-01 00:00:00,Animal,,
6,04Jan06D-4,10/21/04?,1.0,3.0,Pt (ICS-96),2007-05-08 00:00:00,Thor,Right M1,
7,?,,1.5,,Pt (ICS-96),2007-03-12 00:00:00,Fitz,Right M1,
...,...,...,...,...,...,...,...,...,...
76,6251-002088,,1.0,5.0,IrOx,2021-05-25 00:00:00,Rocket,Right arm area 2,
77,6251-002087,,1.0,,IrOx,2019-02-05 00:00:00,Duncan,Left arm area 2,From the wiki
78,6250-002084,,1.5,,IrOx,2019-03-19 00:00:00,Pop,Left hand M1,
81,1024-000890,,,,,,Chips,Rt cuneate,From the implant database


In [12]:
cerebus_data_dict = {}
base_dir = '/Volumes/L_MillerLab/data/'
for monkey in sorted(os.listdir(base_dir)):
#     if monkey not in ['.DS_Store','archive','Backed_up_data', 'Behavior','chewie-delete','CompiledCOFiles','DeepLabCutVids','DLC_models','DPZ','FSMIT_DataRestore_03172021', 'Han_13B1_target','IMU','Jarvis','Jango_redo','Jango_target_redo','LoadCell','Mihili_12A3_target','OldCerebusTest','Rats','Rats_target','Test data','Thumbs.db']:
    if (monkey == 'Pancake_20K3') or (monkey == 'Pop_18E3'):
        print(monkey)
        cerebus_data_dict[monkey] = {}
        monkey_path = os.path.join(base_dir, monkey)
        x = [i for i in os.listdir(monkey_path) if 'cerebus' in i.lower()]
        if len(x) != 0:
            cerebus_path = os.path.join(monkey_path, x[0])
        else:
            cerebus_path = monkey_path
        print(cerebus_path)
        nev_list = glob.glob(f"{cerebus_path}/*/*.nev")
        nsx_list = glob.glob(f"{cerebus_path}/*/*.ns*")
        ccf_list = glob.glob(f"{cerebus_path}/*/*.ccf")
        print(len(nev_list), len(nsx_list), len(ccf_list))
        cerebus_data_dict[monkey]['nev_list'] = nev_list
        cerebus_data_dict[monkey]['nsx_list'] = nsx_list
        cerebus_data_dict[monkey]['ccf_list'] = ccf_list

Pancake_20K3
/Volumes/L_MillerLab/data/Pancake_20K3/Cerebus_data
116 88 90
Pop_18E3
/Volumes/L_MillerLab/data/Pop_18E3/CerebusData
766 858 607


In [40]:
cerebus_data_dict['Pancake_20K3']['ccf_list'][0]

'/Volumes/L_MillerLab/data/Pancake_20K3/Cerebus_data/20211214/20211214_Pancake__FR_001.ccf'

In [8]:
tree = ET.parse(cerebus_data_dict['Pancake_20K3']['ccf_list'][0])

In [13]:
type(tree)

xml.etree.ElementTree.ElementTree

In [14]:
from inspect import getmembers, isclass, isfunction

In [15]:
getmembers(ET, isclass)

[('C14NWriterTarget', xml.etree.ElementTree.C14NWriterTarget),
 ('Element', xml.etree.ElementTree.Element),
 ('ElementTree', xml.etree.ElementTree.ElementTree),
 ('ParseError', xml.etree.ElementTree.ParseError),
 ('QName', xml.etree.ElementTree.QName),
 ('TreeBuilder', xml.etree.ElementTree.TreeBuilder),
 ('XMLParser', xml.etree.ElementTree.XMLParser),
 ('XMLPullParser', xml.etree.ElementTree.XMLPullParser),
 ('_Element_Py', xml.etree.ElementTree.Element),
 ('_ListDataStream', xml.etree.ElementTree._ListDataStream)]

In [10]:
root = tree.getroot()

In [12]:
root

<Element 'CCF' at 0x11b522570>

In [16]:
print(ET.tostring(root))

IOPub data rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
`--NotebookApp.iopub_data_rate_limit`.

Current values:
NotebookApp.iopub_data_rate_limit=1000000.0 (bytes/sec)
NotebookApp.rate_limit_window=3.0 (secs)



In [22]:
print(root.attrib)

{'Version': '12'}


In [39]:
for i in root:
    print(i.tag)

FilterInfo
ChanInfo
Sorting
SysInfo
LNC
AnalogOutput
NTrodeInfo
AdaptInfo
Session


In [60]:
def print_tags(elem, depth=0, max_depth=6, indent=7):
    """Print the tag of elem and of its descendants, indented by depth."""
    print(' ' * (indent * depth) + elem.tag)
    if depth < max_depth:
        for child in elem:
            print_tags(child, depth + 1, max_depth, indent)

for i in root:
    if i.tag == 'ChanInfo':
        print_tags(i)

ChanInfo
       ChanInfo_item
              chid
              type
              dlen
              chan
              proc
              bank
              term
              caps
                     chancaps
                     doutcaps
                     dinpcaps
                     aoutcaps
                     ainpcaps
                     spkcaps
              scale
                     physcalin
                            digmin
                            digmax
                            anamin
                            anamax
                            anagain
                            anaunit
                     physcalout
                            digmin
                            digmax
                            anamin
                            anamax
                            anagain
                            anaunit
                     scalin
                            digmin
                            digmax
                            anamin

                                   hoop_item
                                          valid
                                          time
                                          min
                                          max
                                   hoop_item
                                          valid
                                          time
                                          min
                                          max
                                   hoop_item
                                          valid
                                          time
                                          min
                                          max
                                   hoop_item
                                          valid
                                          time
                                          min
                                          max
                            hoop
                                   hoop

                                          axis_item
                                          axis_item
                                          axis_item
                            Phi
                            Valid
                     unitmapping_item
                            Override
                            center
                                   center_item
                                   center_item
                                   center_item
                            axes
                                   axis
                                          axis_item
                                          axis_item
                                          axis_item
                                   axis
                                          axis_item
                                          axis_item
                                          axis_item
                                   axis
                                          axis_item
          

                                          max
                                   hoop_item
                                          valid
                                          time
                                          min
                                          max
                                   hoop_item
                                          valid
                                          time
                                          min
                                          max
                                   hoop_item
                                          valid
                                          time
                                          min
                                          max
                            hoop
                                   hoop_item
                                          valid
                                          time
                                          min
                                       

- sessions_key - from sessions table
- **array_serial** - search the ccf file with a regular expression for a .cmp file name; the name of the .cmp file contains the array serial number
- paper_key - ignore
- filename
- file_id - autopopulate
- **setting_file** - name of ccf file?
- is_sorted - sorted files should have a -s suffix in the filename
- sorted_by - leave blank (maybe check metadata of file)
- num_chans - len(set(output['spike_events']['Channel'])), which should equal len(output_nsx['ExtendedHeaderIndices'])
- **num_units** - for unsorted files, equal to num_chans; otherwise take a look at the shape of the data array
- **rec_system** - 'Cerebus'
- **connect_type** - should be a string containing something like 'analog', 'digital', 'wireless' - check blackrock documentation 
- **connect_serial** - check docs? If not there don't worry
- **threshold_quality** - check daily logs
- **threshold_notes** - check daily logs

In [41]:
# fields to obtain from nev files
sessions_key = []
array_serial = []
paper_key = []
filename = []
file_id = []
setting_file = []
is_sorted = []
sorted_by = []
num_chans = []
num_units = []
rec_system = []
connect_type = []
connect_serial = []
threshold_quality = []
threshold_notes = []

shortened_nev_list = [cerebus_data_dict['Pancake_20K3']['nev_list'][1]]
for nev_filename in shortened_nev_list:
    # open file
    print(nev_filename)
    nevobj = brpylib.NevFile(nev_filename)
    output = nevobj.getdata(elec_ids='all')
    
    fname = nev_filename.split('/')[-1][:-4]
    filename.append(fname)
    is_sorted.append(fname[-2:] == '-s')
    num_chans.append(len(set(output['spike_events']['Channel'])))
    rec_system.append('Cerebus')

/Volumes/L_MillerLab/data/Pancake_20K3/Cerebus_data/20221103/20221103_Pancake_WI_001.nev

20221103_Pancake_WI_001.nev opened


KeyboardInterrupt: 

In [58]:
nevobj = brpylib.NevFile('/Volumes/L_MillerLab/data/Pop_18E3/CerebusData/20210712/20210712_Pop_FR_01.nev')
nevobj2 = brpylib.NevFile('/Volumes/L_MillerLab/data/Pop_18E3/CerebusData/20210712/20210712_Pop_FR_01-s.nev')


20210712_Pop_FR_01.nev opened

20210712_Pop_FR_01-s.nev opened


In [61]:
len(nevobj2.extended_headers[::3][:-1])

144

In [59]:
nevobj_sortedunits = {}
nevobj2_sortedunits = {}
for d in nevobj.extended_headers[::3][:-1]:
    nevobj_sortedunits[d['ElectrodeID']] = d['NumSortedUnits']
for d in nevobj2.extended_headers[::3][:-1]:
    nevobj2_sortedunits[d['ElectrodeID']] = d['NumSortedUnits']

In [60]:
nevobj_sortedunits

{1: 0,
 2: 0,
 3: 0,
 4: 0,
 5: 0,
 6: 0,
 7: 0,
 8: 0,
 9: 0,
 10: 0,
 11: 0,
 12: 0,
 13: 0,
 14: 0,
 15: 0,
 16: 0,
 17: 0,
 18: 0,
 19: 0,
 20: 0,
 21: 0,
 22: 0,
 23: 0,
 24: 0,
 25: 0,
 26: 0,
 27: 0,
 28: 0,
 29: 0,
 30: 0,
 31: 0,
 32: 0,
 33: 0,
 34: 0,
 35: 0,
 36: 0,
 37: 0,
 38: 0,
 39: 0,
 40: 0,
 41: 0,
 42: 0,
 43: 0,
 44: 0,
 45: 0,
 46: 0,
 47: 0,
 48: 0,
 49: 0,
 50: 0,
 51: 0,
 52: 0,
 53: 0,
 54: 0,
 55: 0,
 56: 0,
 57: 0,
 58: 0,
 59: 0,
 60: 0,
 61: 0,
 62: 0,
 63: 0,
 64: 0,
 65: 0,
 66: 0,
 67: 0,
 68: 0,
 69: 0,
 70: 0,
 71: 0,
 72: 0,
 73: 0,
 74: 0,
 75: 0,
 76: 0,
 77: 0,
 78: 0,
 79: 0,
 80: 0,
 81: 0,
 82: 0,
 83: 0,
 84: 0,
 85: 0,
 86: 0,
 87: 0,
 88: 0,
 89: 0,
 90: 0,
 91: 0,
 92: 0,
 93: 0,
 94: 0,
 95: 0,
 96: 0,
 97: 0,
 98: 0,
 99: 0,
 100: 0,
 101: 0,
 102: 0,
 103: 0,
 104: 0,
 105: 0,
 106: 0,
 107: 0,
 108: 0,
 109: 0,
 110: 0,
 111: 0,
 112: 0,
 113: 0,
 114: 0,
 115: 0,
 116: 0,
 117: 0,
 118: 0,
 119: 0,
 120: 0,
 121: 0,
 122: 0,
 123: 0,
 

In [62]:
nevobj2_sortedunits

{1: 0,
 2: 0,
 3: 0,
 4: 0,
 5: 0,
 6: 0,
 7: 0,
 8: 0,
 9: 0,
 10: 0,
 11: 0,
 12: 0,
 13: 0,
 14: 0,
 15: 0,
 16: 0,
 17: 0,
 18: 0,
 19: 0,
 20: 0,
 21: 0,
 22: 0,
 23: 0,
 24: 0,
 25: 0,
 26: 0,
 27: 0,
 28: 0,
 29: 0,
 30: 0,
 31: 0,
 32: 0,
 33: 0,
 34: 0,
 35: 0,
 36: 0,
 37: 0,
 38: 0,
 39: 0,
 40: 0,
 41: 0,
 42: 0,
 43: 0,
 44: 0,
 45: 0,
 46: 0,
 47: 0,
 48: 0,
 49: 0,
 50: 0,
 51: 0,
 52: 0,
 53: 0,
 54: 0,
 55: 0,
 56: 0,
 57: 0,
 58: 0,
 59: 0,
 60: 0,
 61: 0,
 62: 0,
 63: 0,
 64: 0,
 65: 0,
 66: 0,
 67: 0,
 68: 0,
 69: 0,
 70: 0,
 71: 0,
 72: 0,
 73: 0,
 74: 0,
 75: 0,
 76: 0,
 77: 0,
 78: 0,
 79: 0,
 80: 0,
 81: 0,
 82: 0,
 83: 0,
 84: 0,
 85: 0,
 86: 0,
 87: 0,
 88: 0,
 89: 0,
 90: 0,
 91: 0,
 92: 0,
 93: 0,
 94: 0,
 95: 0,
 96: 0,
 97: 0,
 98: 0,
 99: 0,
 100: 0,
 101: 0,
 102: 0,
 103: 0,
 104: 0,
 105: 0,
 106: 0,
 107: 0,
 108: 0,
 109: 0,
 110: 0,
 111: 0,
 112: 0,
 113: 0,
 114: 0,
 115: 0,
 116: 0,
 117: 0,
 118: 0,
 119: 0,
 120: 0,
 121: 0,
 122: 0,
 123: 0,
 

In [None]:
np.arange(len(nevobj2.extended_headers))

In [55]:
sorted_files_sorted_units = {}
for nev_path in cerebus_data_dict['Pancake_20K3']['nev_list']:
    # Sorted files carry a '-s' suffix right before the extension
    if nev_path.endswith('-s.nev'):
        fname = nev_path.split('/')[-1]
        nevobj = brpylib.NevFile(nev_path)
        # Take every third header (the NEUEVWAV dicts) and drop the trailing entry
        ext_head = nevobj.extended_headers[::3][:-1]
        sorted_units_dict = {}
        for d in ext_head:
            sorted_units_dict['Electrode {}'.format(d['ElectrodeID'])] = d['NumSortedUnits']
        sorted_files_sorted_units[fname] = sorted_units_dict


20220921_Pancake_PG_Post_Con_03-s.nev opened

20220921_Pancake_PG_Pre_Con_02-s.nev opened

20220921_Pancake_WS_Pre_Con_01-s.nev opened

20220921_Pancake_WS_Post_Con_04-s.nev opened

20221102_Pancke_PG_Post_Cyp_01-s.nev opened

20221102_Pancke_PG_Pre_Cyp_01-s.nev opened

20220623_Pancake_FR_001-s.nev opened

20220915_Pancake_FR_004-01-s.nev opened

20220907_Pancake_WS_Pre_Caff_02-s.nev opened

20220907_Pancake_PG_Pre_Caff_01-s.nev opened

20220907_Pancake_WS_Post_Caff_03-s.nev opened

20220907_Pancake_PG_Post_Caff_05-s.nev opened

20220907_Pancake_PG_Post_Caff_04-s.nev opened

20220628_Pancake_FR_003-s.nev opened

20220729_Pancake_PG_Post_Tiz_02-s.nev opened

20220729_Pancake_PG_Post_Tiz_01-s.nev opened


In [56]:
sorted_files_sorted_units

{'20220921_Pancake_PG_Post_Con_03-s.nev': {'Electrode 1': 0,
  'Electrode 2': 0,
  'Electrode 3': 0,
  'Electrode 4': 0,
  'Electrode 5': 0,
  'Electrode 6': 0,
  'Electrode 7': 0,
  'Electrode 8': 0,
  'Electrode 9': 0,
  'Electrode 10': 0,
  'Electrode 11': 0,
  'Electrode 12': 0,
  'Electrode 13': 0,
  'Electrode 14': 0,
  'Electrode 15': 0,
  'Electrode 16': 0,
  'Electrode 17': 0,
  'Electrode 18': 0,
  'Electrode 19': 0,
  'Electrode 20': 0,
  'Electrode 21': 0,
  'Electrode 22': 0,
  'Electrode 23': 0,
  'Electrode 24': 0,
  'Electrode 25': 0,
  'Electrode 26': 0,
  'Electrode 27': 0,
  'Electrode 28': 0,
  'Electrode 29': 0,
  'Electrode 30': 0,
  'Electrode 31': 0,
  'Electrode 32': 0,
  'Electrode 33': 0,
  'Electrode 34': 0,
  'Electrode 35': 0,
  'Electrode 36': 0,
  'Electrode 37': 0,
  'Electrode 38': 0,
  'Electrode 39': 0,
  'Electrode 40': 0,
  'Electrode 41': 0,
  'Electrode 42': 0,
  'Electrode 43': 0,
  'Electrode 44': 0,
  'Electrode 45': 0,
  'Electrode 46': 0,
  

In [57]:
sorted_files_sorted_units['20220729_Pancake_PG_Post_Tiz_01-s.nev']

{'Electrode 1': 0,
 'Electrode 2': 0,
 'Electrode 3': 0,
 'Electrode 4': 0,
 'Electrode 5': 0,
 'Electrode 6': 0,
 'Electrode 7': 0,
 'Electrode 8': 0,
 'Electrode 9': 0,
 'Electrode 10': 0,
 'Electrode 11': 0,
 'Electrode 12': 0,
 'Electrode 13': 0,
 'Electrode 14': 0,
 'Electrode 15': 0,
 'Electrode 16': 0,
 'Electrode 17': 0,
 'Electrode 18': 0,
 'Electrode 19': 0,
 'Electrode 20': 0,
 'Electrode 21': 0,
 'Electrode 22': 0,
 'Electrode 23': 0,
 'Electrode 24': 0,
 'Electrode 25': 0,
 'Electrode 26': 0,
 'Electrode 27': 0,
 'Electrode 28': 0,
 'Electrode 29': 0,
 'Electrode 30': 0,
 'Electrode 31': 0,
 'Electrode 32': 0,
 'Electrode 33': 0,
 'Electrode 34': 0,
 'Electrode 35': 0,
 'Electrode 36': 0,
 'Electrode 37': 0,
 'Electrode 38': 0,
 'Electrode 39': 0,
 'Electrode 40': 0,
 'Electrode 41': 0,
 'Electrode 42': 0,
 'Electrode 43': 0,
 'Electrode 44': 0,
 'Electrode 45': 0,
 'Electrode 46': 0,
 'Electrode 47': 0,
 'Electrode 48': 0,
 'Electrode 49': 0,
 'Electrode 50': 0,
 'Electro

In [35]:
sorted_pancake_nevs = [filename for filename in cerebus_data_dict['Pancake_20K3']['nev_list'] if '-s' in filename]
sorted_pancake_nevobjs = [brpylib.NevFile(filename) for filename in sorted_pancake_nevs]
for nevobj in sorted_pancake_nevobjs:
    # NumSortedUnits from each electrode's first header dict
    print([d['NumSortedUnits'] for d in nevobj.extended_headers[::3][:-1]])

In [51]:
len(set(output['spike_events']['Channel']))

128

In [46]:
set(output['spike_events']['Unit'])

{0}

In [50]:
len(output['spike_events']['Unit'])

2355452

In [49]:
output['spike_events']['Waveforms'].shape

(2355452, 96)

In [2]:
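from urllib.parse import quote_plus

dbName = "staging_db"
userName = "limblab"
# Read the password from the environment instead of hard-coding it in the notebook
# (LIMBLAB_DB_PASSWORD is an assumed variable name)
sesame = os.environ["LIMBLAB_DB_PASSWORD"]

# this is set up using an SSH tunnel; quote_plus escapes special characters in the password
engine = create_engine(f"mysql+pymysql://{userName}:{quote_plus(sesame)}@127.0.0.1:3306/{dbName}")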

In [6]:
# checking that the connection works
arrays = pd.read_sql_query('select * from staging_db.arrays', engine)

In [16]:
# load the monkeys table
monkeys = pd.read_sql_query('select * from staging_db.monkeys', engine)

In [8]:
arrays

Unnamed: 0,serial,array_type,monkey_id,electrode_length,lead_length,num_leads,implant_date,removal_date,map_file,implant_location,loc_ML,loc_AP,crani_Medial,crani_Lateral,crani_Anterior,crani_Posterior,comments,array_material
0,1017-033,Utah,5E2,1.5,4.0,,2008-01-08,2008-01-08,,,,,,,,,,
1,1024-0226,Utah,7H1,1.0,3.5,,2009-01-12,,,rightM1,10.0,12.0,1.0,24.0,26.0,-10.0,,
2,1024-0350,Utah,4C1,1.0,3.5,,2010-04-15,2010-06-01,,leftS1,17.0,4.0,8.7,20.7,-0.3,19.7,Pedestal was re-affixed onto the skull after 1...,
3,1024-0387,Utah,4C2,1.0,3.0,,2010-06-14,,,left S1,,,,,,,\r,
4,1024-0393,Utah,4C1,1.0,3.0,,2009-07-04,,,right S1,,,,,,,\r,
5,1024-0585,Utah,10I1,1.0,3.0,,2011-09-14,,,(used in acute proc),,,,,,,"""did not work in first surgery; sent back to ...",
6,1024-0589,Utah,10I1,1.0,3.0,,2012-02-01,,,left S1,,,,,,,\r,
7,1025-0225,Utah,5E1,1.5,3.0,,2008-07-17,,,left M1,,,,,,,\r,
8,1025-0234,Utah,7H2,1.5,3.5,,2008-06-10,,,right M1,,,,,,,\r,
9,1025-0255,Utah,5E1,1.5,3.5,,2008-07-17,,,left S1,,,,,,,\r,


In [17]:
monkeys

Unnamed: 0,name,ccm_id,usda_id,species,retired
0,Kramer,10I1,060967,Rhesus,1
1,Louie,10I2,060149,Rhesus,1
2,Spike,10I3,051641,Rhesus,1
3,Jango,12A1,,Rhesus,1
4,Kevin,12A2,,Rhesus,1
5,Mihili,12A3,0803917,Rhesus,1
6,Chips,12H1,8121,Rhesus,1
7,Fish,12H2,,Rhesus,1
8,Han,13B1,,Rhesus,1
9,Lando,13B2,090855,Rhesus,1


In [22]:
pd.merge(arrays,monkeys,how="inner",left_on='monkey_id',right_on='ccm_id').name.unique()

array(['Thor', 'Mini', 'Tiki', 'Pedro', 'Kramer', 'Arthur', 'Theo',
       'Fashizzle', 'Keedoo', 'Louie', 'Chewie', 'Jaco', 'Fidel', 'Spike'],
      dtype=object)

In [13]:
arrays.monkey_id.unique()

array(['5E2', '7H1', '4C1', '4C2', '10I1', '5E1', '7H2', '3F1', '9I3',
       '10I2', '8I2', '8I1', '9I2', '10I3'], dtype=object)

In [10]:
arrays[arrays['monkey_id']=='10I1']

Unnamed: 0,serial,array_type,monkey_id,electrode_length,lead_length,num_leads,implant_date,removal_date,map_file,implant_location,loc_ML,loc_AP,crani_Medial,crani_Lateral,crani_Anterior,crani_Posterior,comments,array_material
5,1024-0585,Utah,10I1,1.0,3.0,,2011-09-14,,,(used in acute proc),,,,,,,"""did not work in first surgery; sent back to ...",
6,1024-0589,Utah,10I1,1.0,3.0,,2012-02-01,,,left S1,,,,,,,\r,


In [11]:
arrays[arrays['monkey_id']=='5E2']

Unnamed: 0,serial,array_type,monkey_id,electrode_length,lead_length,num_leads,implant_date,removal_date,map_file,implant_location,loc_ML,loc_AP,crani_Medial,crani_Lateral,crani_Anterior,crani_Posterior,comments,array_material
0,1017-033,Utah,5E2,1.5,4.0,,2008-01-08,2008-01-08,,,,,,,,,,
13,1025-0302,Utah,5E2,1.5,3.5,,2009-03-31,,,left M1,,,,,,,\r,


In [None]:
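# Group the array records by monkey_id: {monkey_id: [record, ...]}
# (assumes the intended DataFrame is `arrays`, loaded above; `df` is never defined)
arrays_dict = {}
for _, row in arrays.iterrows():
    arrays_dict.setdefault(row['monkey_id'], []).append(row.to_dict())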

In [None]:
cerebus_data_dict

In [None]:
if nevfile date between 
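
The unfinished cell above appears to be heading toward matching a .nev file's recording date against the implant window of each array. A minimal sketch of that idea follows, assuming that a missing removal_date means the array is still implanted. The DataFrame rows are made up (only the column names come from the arrays table), and `arrays_implanted_on` is a hypothetical helper.

```python
import pandas as pd

def arrays_implanted_on(arrays, monkey_id, rec_date):
    """Rows of `arrays` for monkey_id whose implant window contains rec_date."""
    rec_date = pd.Timestamp(rec_date)
    implanted = pd.to_datetime(arrays['implant_date'])
    removed = pd.to_datetime(arrays['removal_date'])  # missing removal becomes NaT
    in_window = (implanted <= rec_date) & (removed.isna() | (removed >= rec_date))
    return arrays[(arrays['monkey_id'] == monkey_id) & in_window]

# Made-up rows that reuse the column names of the arrays table
fake_arrays = pd.DataFrame({
    'serial': ['A-001', 'A-002', 'A-003'],
    'monkey_id': ['10I1', '10I1', '4C1'],
    'implant_date': ['2011-09-14', '2012-02-01', '2010-04-15'],
    'removal_date': ['2011-12-01', None, '2010-06-01'],
})
match = arrays_implanted_on(fake_arrays, '10I1', '2012-03-10')
print(match['serial'].tolist())
```

If more than one row comes back, the monkey had several arrays implanted on that date, and the filename or daily logs would be needed to pick the right one.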