# Working with NWB in Python 

In [1]:
import numpy as np
import pandas as pd 
from matplotlib import pyplot as plt
from pynwb import NWBHDF5IO

## Reading our NWB file

To access the data in our NWB file, we must first read the file. This is done in two steps:
- create an `NWBHDF5IO` object for our file
- read our file

The first step uses the `NWBHDF5IO` class to create an `NWBHDF5IO` object that opens our HDF5-backed file. Once we have done this, we can use the `read()` method to return the contents of our NWB file. For more information on how to read NWB files, please visit the *Reading data from an NWB file* section of the <a href='https://pynwb.readthedocs.io/en/latest/tutorials/general/file.html'>NWB Basics Tutorial</a>. For more information on the `NWBHDF5IO` class, please visit the <a href='https://pynwb.readthedocs.io/en/latest/pynwb.html#pynwb.NWBHDF5IO'>original documentation</a>.

In [2]:
# first read the file 
io = NWBHDF5IO('000017/sub-Cori/sub-Cori_ses-20161214T120000.nwb', 'r')
nwb_file = io.read()
print('File read.')

File read.


## File Hierarchy: Groups, Datasets, and Attributes

The NWB file is composed of various Groups, Datasets, and Attributes. The datasets and their corresponding metadata are encapsulated within these Groups. The `fields` attribute returns a dictionary containing the metadata of the Groups of our NWB file. The dictionary `keys` are the various Groups within the file, which we will use to access our datasets.

In [3]:
# nwb_file.fields

In [4]:
# Get the Groups for the nwb file 
nwb_fields = nwb_file.fields
print(nwb_fields.keys())

dict_keys(['acquisition', 'analysis', 'scratch', 'stimulus', 'stimulus_template', 'processing', 'devices', 'electrode_groups', 'imaging_planes', 'icephys_electrodes', 'ogen_sites', 'intervals', 'lab_meta_data', 'session_description', 'identifier', 'session_start_time', 'timestamps_reference_time', 'file_create_date', 'keywords', 'epoch_tags', 'electrodes', 'subject', 'trials', 'units', 'experiment_description', 'lab', 'institution', 'experimenter', 'related_publications'])


Each NWB file will have information on where the experiment was conducted, what lab conducted the experiment, and a description of the experiment. This information can be accessed using the `institution`, `lab`, and `experiment_description` attributes of our `nwb_file`, respectively.

In [5]:
# Get Meta-Data from NWB file 
print('The experiment within this NWB file was conducted at {} in the lab of {}. The experiment is detailed as follows: {}'.format(nwb_file.institution, nwb_file.lab, nwb_file.experiment_description))

The experiment within this NWB file was conducted at University College London in the lab of The Carandini and Harris Lab. The experiment is detailed as follows: Large-scale Neuropixels recordings across brain regions of mice during a head-fixed visual discrimination task. 


We can access metadata from each group in our `nwb_file` with the following syntax: `nwb_file.group`, where `group` is the name of the group. This is ordinary attribute access, the same as for any other Python object. The `acquisition` group contains datasets of acquisition data. We can look at the `description` field in the metadata to understand what each dataset in the group contains.

In [37]:
# example showing how to return meta data from groups in nwb file 
# 'acquisition' is the first group in our file 
nwb_file.acquisition

{'lickPiezo': lickPiezo pynwb.base.TimeSeries at 0x140477258920848
 Fields:
   comments: no comments
   conversion: 1.0
   data: <HDF5 dataset "data": shape (1314000,), type "<f8">
   description: Voltage values from a thin-film piezo connected to the lick spout, so that values are proportional to deflection of the spout and licks can be detected as peaks of the signal.
   rate: 0.002000031887945625
   resolution: -1.0
   starting_time: 33.65250410481991
   starting_time_unit: seconds
   unit: V,
 'wheel_position': wheel_position pynwb.base.TimeSeries at 0x140477258920976
 Fields:
   comments: The wheel has radius 31 mm and 1440 ticks per revolution, so multiply by 2*pi*r/tpr=0.135 to convert to millimeters. Positive velocity (increasing numbers) correspond to clockwise turns (if looking at the wheel from behind the mouse), i.e. turns that are in the correct direction for stimuli presented to the left. Likewise negative velocity corresponds to right choices.
   conversion: 0.135
   dat

In this file, the acquisition group contains two different datasets, `lickPiezo` and `wheel_position`. To access the actual data array of these datasets, we must first subset our dataset of interest from the group. We can then use `data[:]` to return the full data array, or a slice such as `data[:20]` to return only part of it.

In [7]:
# select our dataset of interest 
dataset = 'lickPiezo'
lickPiezo_ds = nwb_file.acquisition[dataset]

# return first 20 values in data array 
lickPiezo_data_array = lickPiezo_ds.data[:20]

print(lickPiezo_data_array)

[3.18436567 3.4575181  3.63766762 3.71285813 3.68129979 3.55107522
 3.33830297 3.06425065 2.75209663 2.42396281 2.09870512 1.7906713
 1.50938029 1.25993802 1.04387904 0.85846741 0.69874522 0.56921781
 0.46425845 0.37914098]
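
The `comments` field of `wheel_position` in the acquisition output above gives the tick-to-millimeter factor as 2*pi*r/tpr, with a wheel radius of 31 mm and 1440 ticks per revolution. The following is a minimal sketch of that arithmetic; the tick counts are made-up values for illustration, and `ticks_to_mm` is a hypothetical helper, not part of pynwb.

```python
import math

WHEEL_RADIUS_MM = 31.0   # from the wheel_position comments
TICKS_PER_REV = 1440     # from the wheel_position comments

# Distance travelled along the wheel rim per encoder tick, in millimeters.
mm_per_tick = 2 * math.pi * WHEEL_RADIUS_MM / TICKS_PER_REV


def ticks_to_mm(ticks):
    """Convert a sequence of encoder tick counts to millimeters."""
    return [t * mm_per_tick for t in ticks]


example_ticks = [0, 10, -10, 1440]  # made-up tick counts
distances = ticks_to_mm(example_ticks)

print(round(mm_per_tick, 3))  # 0.135, matching the `conversion` field
print(distances)
```

A full revolution (1440 ticks) maps to the wheel circumference, 2*pi*31 mm, which is a quick sanity check on the factor.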


The `processing` group in our `nwb_file` contains all of our processed data for scientific analysis. Within the processing group, the `behavior` module holds multiple subgroups. `BehavioralEpochs`, `BehavioralEvents`, `BehavioralTimeSeries`, and `PupilTracking` are separate groups encapsulated within `behavior`, and each contains its own datasets.

In [8]:
# return metadata for the processing group
nwb_file.processing

{'behavior': behavior pynwb.base.ProcessingModule at 0x140477258921296
 Fields:
   data_interfaces: {
     BehavioralEpochs <class 'pynwb.behavior.BehavioralEpochs'>,
     BehavioralEvents <class 'pynwb.behavior.BehavioralEvents'>,
     BehavioralTimeSeries <class 'pynwb.behavior.BehavioralTimeSeries'>,
     PupilTracking <class 'pynwb.behavior.PupilTracking'>
   }
   description: behavior module}

If we subset `PupilTracking` from `behavior`, we can see that it contains two datasets. As before, we can subset our dataset of interest and return the actual data array by executing `data[:]`.

In [9]:
# assign behavior group to variable 
behavior = nwb_file.processing['behavior']

# subset PupilTracking group from behavior group 
pupil_tracking = behavior['PupilTracking']
print(pupil_tracking)

PupilTracking pynwb.behavior.PupilTracking at 0x140477260165776
Fields:
  time_series: {
    eye_area <class 'pynwb.base.TimeSeries'>,
    eye_xy_positions <class 'pynwb.base.TimeSeries'>
  }



In [10]:
# subset the eye_xy_positions dataset
eye_xy_positions = pupil_tracking['eye_xy_positions']
print(eye_xy_positions)

# return first 10 entries in actual data array
print('\n Eye (x,y) positions:')
print(eye_xy_positions.data[:10])

eye_xy_positions pynwb.base.TimeSeries at 0x140477260166224
Fields:
  comments: The 2D position of the center of the pupil in the video frame. This is not registered to degrees visual angle, but could be used to detect saccades or other changes in eye position.
  conversion: 1.0
  data: <HDF5 dataset "data": shape (267759, 2), type "<f8">
  description: Features extracted from the video of the right eye.
  interval: 1
  resolution: -1.0
  timestamps: <HDF5 dataset "timestamps": shape (267759,), type "<f8">
  timestamps_unit: seconds
  unit: arb. unit


 Eye (x,y) positions:
[[1.56599408 0.06802196]
 [1.58213078 0.05515582]
 [1.5796662  0.14542675]
 [1.57286532 0.13382707]
 [1.57185696 0.01200337]
 [1.58985626 0.08226199]
 [1.62204413 0.26782588]
 [1.59213104 0.13432399]
 [1.6127692  0.33010592]
 [1.63706748 0.28426869]]
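
The `comments` field notes that these positions could be used to detect saccades or other changes in eye position. As a rough sketch of a first step in that direction, the block below computes the frame-to-frame displacement of the pupil center for the ten samples printed above. It works on a hard-coded copy of those values rather than the file, and it stops short of choosing any saccade threshold, which would be an analysis decision outside this notebook.

```python
import numpy as np

# First 10 (x, y) pupil positions, copied from the output above.
eye_xy = np.array([
    [1.56599408, 0.06802196],
    [1.58213078, 0.05515582],
    [1.5796662, 0.14542675],
    [1.57286532, 0.13382707],
    [1.57185696, 0.01200337],
    [1.58985626, 0.08226199],
    [1.62204413, 0.26782588],
    [1.59213104, 0.13432399],
    [1.6127692, 0.33010592],
    [1.63706748, 0.28426869],
])

# Difference between consecutive frames, then Euclidean length of each step.
deltas = np.diff(eye_xy, axis=0)
step_sizes = np.linalg.norm(deltas, axis=1)

print(step_sizes)
```

The result has one entry fewer than the input, since each value describes the movement between two adjacent frames. The units are arbitrary, as stated in the `unit` field.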


The `intervals` Group contains datasets from trials of our experiment, sub-experiments that were conducted, and/or epochs. For the example below, we will look into the `trials` dataset. You can return the `trials` data as a dataframe by using the `to_dataframe` method.

In [11]:
# Select the group of interest 
intervals = nwb_file.intervals

# Subset the dataset from the group and assign it as a dataframe
interval_trials_df = intervals['trials'].to_dataframe()
interval_trials_df.head()

| id | start_time | stop_time | included | go_cue | visual_stimulus_time | visual_stimulus_left_contrast | visual_stimulus_right_contrast | response_time | response_choice | feedback_time | feedback_type | rep_num |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 62.900284 | 67.423484 | True | 66.296625 | 65.269408 | 1.0 | 0.0 | 66.419612 | 1.0 | 66.456227 | 1 | 1.0 |
| 1 | 68.420838 | 73.604476 | True | 72.077117 | 71.202703 | 0.0 | 0.5 | 72.602206 | -1.0 | 72.640326 | 1 | 1.0 |
| 2 | 74.602902 | 78.006757 | True | 76.877593 | 76.05238 | 1.0 | 0.5 | 77.001671 | 1.0 | 77.038396 | 1 | 1.0 |
| 3 | 79.003653 | 84.506778 | True | 81.996875 | 81.235263 | 0.0 | 0.0 | 83.502065 | 0.0 | 83.531699 | 1 | 1.0 |
| 4 | 85.501795 | 88.621336 | True | 87.462962 | 86.800952 | 0.5 | 1.0 | 87.617727 | 1.0 | 87.628565 | -1 | 1.0 |
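
Once the trials are in a dataframe, ordinary pandas operations apply. As a small sketch, the block below derives two per-trial quantities from the five rows shown above: the trial duration and the delay between the go cue and the response. The dataframe is rebuilt by hand from the printed values so the example runs without the file, and the derived column names `duration` and `response_latency` are our own choices, not fields of the NWB file.

```python
import pandas as pd

# Subset of the columns and rows printed above, re-entered by hand.
trials = pd.DataFrame(
    {
        "start_time": [62.900284, 68.420838, 74.602902, 79.003653, 85.501795],
        "stop_time": [67.423484, 73.604476, 78.006757, 84.506778, 88.621336],
        "go_cue": [66.296625, 72.077117, 76.877593, 81.996875, 87.462962],
        "response_time": [66.419612, 72.602206, 77.001671, 83.502065, 87.617727],
    },
    index=pd.Index(range(5), name="id"),
)

# Derived columns (names are illustrative): all values are in seconds.
trials["duration"] = trials["stop_time"] - trials["start_time"]
trials["response_latency"] = trials["response_time"] - trials["go_cue"]

print(trials[["duration", "response_latency"]])
```

With the real file, the same two assignments can be applied directly to `interval_trials_df`.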


Each column of the trials table has a `description` attribute that gives a short description of that column.

In [32]:
print(intervals['trials']['start_time'].description)

Start time of epoch, in seconds


For more information on the different Groups and the hierarchical structure of an NWB file, please visit the <a href='https://nwb-schema.readthedocs.io/en/latest/format.html#nwb-n-file'>NWB:N file section</a> of the NWB Format documentation.

## Possible Analyses 

In [54]:
# test cell 
io2 = NWBHDF5IO('000017/sub-Cori/sub-Cori_ses-20161217T120000.nwb', 'r')
io3 = NWBHDF5IO('000017/sub-Cori/sub-Cori_ses-20161218T120000.nwb', 'r')

nwb_file2 = io2.read()
nwb_file3 = io3.read()