# NWB User Days Tutorial - Extracellular Electrophysiology

## Introduction
In this tutorial, we will create fake data for a hypothetical extracellular electrophysiology experiment with a freely moving animal. The types of data we will convert are:
- Subject information (species, strain, age, etc.) 
- Animal position
- Trials
- LFP
- Spike times



## Installing PyNWB
First, install PyNWB using pip or conda. You will need Python 3.5+ installed.
- `pip install pynwb`
- `conda install -c conda-forge pynwb`

## Set up the NWB file
An NWB file represents a single session of an experiment. Each file must have a session description, identifier, and session start time. Create a new `NWBFile` object with those and additional metadata. For all PyNWB constructors, we recommend using keyword arguments.

In [1]:
from pynwb import NWBFile
from datetime import datetime
from dateutil import tz

start_time = datetime(2018, 4, 25, 2, 30, 3, tzinfo=tz.gettz('US/Pacific'))

nwbfile = NWBFile(
    session_description='Mouse exploring an open field',
    identifier='Mouse5_Day3',
    session_start_time=start_time,
    session_id='session_1234',                                # optional
    experimenter='My Name',                                   # optional
    lab='My Lab Name',                                        # optional
    institution='University of My Institution',               # optional
    related_publications='DOI:10.1016/j.neuron.2016.12.011'   # optional
)
nwbfile

root pynwb.file.NWBFile at 0x1951404318792
Fields:
  experimenter: ['My Name']
  file_create_date: [datetime.datetime(2020, 5, 5, 9, 6, 50, 421191, tzinfo=tzlocal())]
  identifier: Mouse5_Day3
  institution: University of My Institution
  lab: My Lab Name
  related_publications: ['DOI:10.1016/j.neuron.2016.12.011']
  session_description: Mouse exploring an open field
  session_id: session_1234
  session_start_time: 2018-04-25 02:30:03-07:00
  timestamps_reference_time: 2018-04-25 02:30:03-07:00
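
Times stored elsewhere in the file (trial boundaries, timestamps, spike times) are expressed in seconds relative to `timestamps_reference_time`, which in the output above matches the session start time. Here is a minimal sketch, using only the standard library, of converting such a relative time back to an absolute one. The fixed UTC-7 offset and the helper name `to_absolute` are assumptions made for illustration and are not part of PyNWB.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-7 offset stands in for US/Pacific on this date.
pacific_daylight = timezone(timedelta(hours=-7))
session_start = datetime(2018, 4, 25, 2, 30, 3, tzinfo=pacific_daylight)


def to_absolute(seconds_since_start):
    """Convert a time in seconds relative to session start to a datetime."""
    return session_start + timedelta(seconds=seconds_since_start)


# A trial that starts 1.0 s into the session
first_trial_start = to_absolute(1.0)
print(first_trial_start.isoformat())
print(first_trial_start.astimezone(timezone.utc).isoformat())
```

Keeping the start time timezone-aware means the same instant can be reported in any timezone without ambiguity.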

## Subject information
Create a `Subject` object to store information about the experimental subject, such as age, species, genotype, sex, and a freeform description. Then set `nwbfile.subject` to the `Subject` object.

<img src="images/subject.svg" width="200">

In [2]:
from pynwb.file import Subject

nwbfile.subject = Subject(
    age='9 months', 
    description='mouse 5',
    species='Mus musculus', 
    sex='M'
)

## SpatialSeries
Many types of data have special data types in NWB. To store the spatial position of a subject, we will use the `SpatialSeries` and `Position` classes. `SpatialSeries` is a subclass of `TimeSeries`. `TimeSeries` is a common base class for measurements sampled over time, and provides fields for data and time (regularly or irregularly sampled).

<img src="images/position.png" width="600">

Create a `SpatialSeries` object named `'SpatialSeries'` with some fake data.

In [3]:
import numpy as np
from pynwb.behavior import SpatialSeries

# create fake data with shape (100, 2)
# the first dimension should always represent time
position_data = np.array([np.linspace(0, 10, 100),
                          np.linspace(0, 8, 100)]).T
position_timestamps = np.arange(100) / 200  # one timestamp per sample (100 samples at 200 Hz)

spatial_series_obj = SpatialSeries(
    name='SpatialSeries', 
    description='(x,y) position in open field',
    data=position_data,
    timestamps=position_timestamps,
    reference_frame='(0,0) is bottom left corner'
)
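
Because the first dimension of `data` represents time, `timestamps` needs exactly one entry per row of `data`. Below is a small sanity check in plain Python. The 200 Hz sampling rate, the list-based stand-ins for the arrays, and the variable names are assumptions for illustration only.

```python
n_samples = 100
sampling_rate = 200.0  # Hz, assumed for illustration

# Stand-in for the (100, 2) position array: one (x, y) pair per sample.
position_rows = [(10 * i / (n_samples - 1), 8 * i / (n_samples - 1))
                 for i in range(n_samples)]

# One timestamp per sample, in seconds, evenly spaced at the sampling rate.
timestamps = [i / sampling_rate for i in range(n_samples)]

# The time axis of the data and the timestamps must have the same length.
assert len(timestamps) == len(position_rows), "need one timestamp per row"
print(len(position_rows), len(position_rows[0]), timestamps[:3])
```

A check like this, run before constructing the `SpatialSeries`, catches a mismatch early rather than at analysis time.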

To help data analysis and visualization tools know that this `SpatialSeries` object represents the position of the animal, we will store the `SpatialSeries` object inside of a `Position` object.

In [4]:
from pynwb.behavior import Position

position_obj = Position(spatial_series=spatial_series_obj)

NWB differentiates between raw, acquired data, which should never change, and processed data, which is the result of a data preprocessing algorithm and could change. Next, let's assume that the animal's position was computed from a video tracking algorithm, so it would be classified as processed data. Since processed data can be very diverse, let's store the animal's position data in a processing module that we create specifically for behavioral data.

In [5]:
behavior_module = nwbfile.create_processing_module(
    name='behavior', 
    description='processed behavioral data'
)
behavior_module.add(position_obj)

Position pynwb.behavior.Position at 0x1951687640520
Fields:
  spatial_series: {
    SpatialSeries <class 'pynwb.behavior.SpatialSeries'>
  }

## Write to file

Now, let's try writing the NWB file that we have built so far.

In [6]:
from pynwb import NWBHDF5IO

with NWBHDF5IO('ecephys_tutorial.nwb', 'w') as io:
    io.write(nwbfile)

We can then read the file and print it to inspect its contents. To reach the `SpatialSeries` data that we created, index by the names of the objects that contain it: the `'behavior'` processing module, then the `Position` object (named `'Position'` by default), then `'SpatialSeries'`.

In [7]:
with NWBHDF5IO('ecephys_tutorial.nwb', 'r') as io:
    read_nwbfile = io.read()
    print(read_nwbfile)
    print(read_nwbfile.processing['behavior']['Position']['SpatialSeries'])

root pynwb.file.NWBFile at 0x1951403843144
Fields:
  experimenter: ['My Name']
  file_create_date: [datetime.datetime(2020, 5, 5, 9, 6, 50, 421191, tzinfo=tzoffset(None, -25200))]
  identifier: Mouse5_Day3
  institution: University of My Institution
  lab: My Lab Name
  processing: {
    behavior <class 'pynwb.base.ProcessingModule'>
  }
  related_publications: ['DOI:10.1016/j.neuron.2016.12.011']
  session_description: Mouse exploring an open field
  session_id: session_1234
  session_start_time: 2018-04-25 02:30:03-07:00
  subject: subject pynwb.file.Subject at 0x1951688124488
Fields:
  age: 9 months
  description: mouse 5
  sex: M
  species: Mus musculus

  timestamps_reference_time: 2018-04-25 02:30:03-07:00

SpatialSeries pynwb.behavior.SpatialSeries at 0x1951690279688
Fields:
  comments: no comments
  conversion: 1.0
  data: <HDF5 dataset "data": shape (100, 2), type "<f8">
  description: (x,y) position in open field
  interval: 1
  reference_frame: (0,0) is bottom left corner
  

## Trials

`DynamicTable` objects are used to store tabular metadata throughout NWB, including electrodes and sorted units. They offer flexibility for tabular data by allowing required columns, optional columns, and custom columns.

<img src="images/trials.svg" width="500">

Trials are stored in a `TimeIntervals` object, which is a subclass of `DynamicTable`. Let's continue adding to our `NWBFile` by creating a new column for the trials table named `'correct'`, which will hold one boolean per trial.

In [8]:
nwbfile.add_trial_column(name='correct', description='whether the trial was correct')
nwbfile.add_trial(start_time=1.0, stop_time=5.0, correct=True)
nwbfile.add_trial(start_time=6.0, stop_time=10.0, correct=False)
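
Conceptually, the trials table is a set of equal-length columns with one value per trial. The sketch below mimics that layout with a plain dictionary of lists. It is not the PyNWB API, and `add_row` is a hypothetical helper written for this illustration.

```python
# Column-oriented stand-in for the trials table:
# two required columns plus one custom column.
trials = {"start_time": [], "stop_time": [], "correct": []}


def add_row(table, **values):
    """Append one row, insisting on a value for every column."""
    if set(values) != set(table):
        raise ValueError("expected columns: {}".format(sorted(table)))
    for column, value in values.items():
        table[column].append(value)


add_row(trials, start_time=1.0, stop_time=5.0, correct=True)
add_row(trials, start_time=6.0, stop_time=10.0, correct=False)

# Column-wise access makes per-trial computations straightforward.
durations = [stop - start
             for start, stop in zip(trials["start_time"], trials["stop_time"])]
correct_rows = [i for i, ok in enumerate(trials["correct"]) if ok]
print(durations, correct_rows)
```

Requiring a value for every column on each row is what keeps the columns the same length, which is why a custom column has to be declared before rows are added.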

## Electrodes table
Extracellular electrodes are stored in an `electrodes` table, which is also a `DynamicTable`. The `electrodes` table has several required fields: `x`, `y`, `z`, `imp` (impedance), `location`, `filtering`, and `group`.

<img src="images/Electrodes.png" width="800">

Here, we also demonstrate how to add optional columns to a table by adding the `'label'` column.

In [9]:
nwbfile.add_electrode_column(name='label', description='label of electrode')
shank_channels = [4, 3]  # set up 2 shanks: 4 electrodes on the first, 3 on the second

electrode_counter = 0
device = nwbfile.create_device('array')
for shankn, nelecs in enumerate(shank_channels):
    # create an electrode group for this shank
    electrode_group = nwbfile.create_electrode_group(
        name='shank{}'.format(shankn),
        description='electrode group for shank {}'.format(shankn),
        device=device,
        location='brain area'
    )
    # add electrodes to the electrode table
    for ielec in range(nelecs):
        nwbfile.add_electrode(
            x=5.3, y=1.5, z=8.5, imp=np.nan,
            location='unknown', 
            filtering='unknown',
            group=electrode_group,
            label='shank{}elec{}'.format(shankn, ielec)
        )
        electrode_counter += 1
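
To see which rows the nested loop produces without building a file, the same iteration can be dry-run in plain Python. This is only a sketch of the loop structure; the names `labels` and `rows_per_shank` are illustrative.

```python
shank_channels = [4, 3]  # number of electrodes on each shank

# Same nesting as the loop above: outer loop over shanks, inner over electrodes.
labels = ['shank{}elec{}'.format(shankn, ielec)
          for shankn, nelecs in enumerate(shank_channels)
          for ielec in range(nelecs)]

# Map each shank index to the electrode-table rows that belong to it.
rows_per_shank = {}
row = 0
for shankn, nelecs in enumerate(shank_channels):
    rows_per_shank[shankn] = list(range(row, row + nelecs))
    row += nelecs

print(len(labels))
print(labels)
print(rows_per_shank)
```

The total number of rows equals `sum(shank_channels)`, which is the value `electrode_counter` holds once the loop finishes.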

## Links
In the above loop, we create `ElectrodeGroup` objects using `nwbfile.create_electrode_group`. When we add an electrode, we pass in the `ElectrodeGroup` object for the `'group'` argument. This creates a reference from the `electrodes` table to individual `ElectrodeGroup` objects, one per row.

In order to create our `ElectricalSeries` object, we will also need to create a `DynamicTableRegion` of electrodes. A `DynamicTableRegion` is a type of link that allows you to reference specific rows of a `DynamicTable`.

In [10]:
# create a table region object that refs to a set of rows of the table by index
all_table_region = nwbfile.create_electrode_table_region(region=list(range(electrode_counter)), description='all electrodes')
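
One way to picture a `DynamicTableRegion` is as a list of row indices plus a reference to the table they point into. The sketch below models that idea in plain Python. `TableRegion` is a hypothetical class written for illustration, not the PyNWB implementation, and the toy table holds only a label per row.

```python
class TableRegion:
    """Hypothetical stand-in: row indices that refer to rows of a table."""

    def __init__(self, table, region, description):
        region = list(region)
        n_rows = len(table)
        bad = [i for i in region if not 0 <= i < n_rows]
        if bad:
            raise IndexError("rows {} are not in the table".format(bad))
        self.table = table          # a reference, not a copy
        self.region = region
        self.description = description

    def __len__(self):
        return len(self.region)

    def __getitem__(self, i):
        # Resolve the i-th entry of the region to the row it references.
        return self.table[self.region[i]]


# Toy electrodes table with seven rows
electrode_rows = [{"label": "electrode{}".format(i)} for i in range(7)]

all_region = TableRegion(electrode_rows, range(7), "all electrodes")
subset_region = TableRegion(electrode_rows, [4, 5, 6], "last three electrodes")
print(len(all_region), subset_region[0]["label"])
```

Because the region stores indices rather than copies of the rows, any change to a row in the table is visible through every region that references it.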

## LFP
`LFP` objects hold one or more `ElectricalSeries` objects; `ElectricalSeries` is another subclass of `TimeSeries`. Here, we put an `ElectricalSeries` named `'ElectricalSeries'` in an `LFP` object, in a `ProcessingModule` named `'ecephys'`.

<img src="images/lfp.png" width="800">

In [11]:
from pynwb.ecephys import ElectricalSeries, LFP

lfp_data = np.random.randn(50, 4)
lfp_elec_series = ElectricalSeries(
    name='ElectricalSeries', 
    data=lfp_data, 
    electrodes=all_table_region, 
    rate=200.
)
lfp = LFP(electrical_series=lfp_elec_series)

ecephys_module = nwbfile.create_processing_module(
    name='ecephys', 
    description='extracellular electrophysiology data'
)
ecephys_module.add(lfp)

LFP pynwb.ecephys.LFP at 0x1951693749128
Fields:
  electrical_series: {
    ElectricalSeries <class 'pynwb.ecephys.ElectricalSeries'>
  }
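
Unlike the `SpatialSeries` earlier, this `ElectricalSeries` specifies `rate` rather than explicit `timestamps`, so the time of each sample is implied by its index. Here is a sketch of that relationship, assuming the series starts at time 0. The helper name `sample_times` is hypothetical.

```python
def sample_times(n_samples, rate, starting_time=0.0):
    """Time in seconds of each sample of a regularly sampled series."""
    return [starting_time + i / rate for i in range(n_samples)]


# 50 LFP samples at 200 Hz, as in the cell above
lfp_times = sample_times(n_samples=50, rate=200.0)
duration = len(lfp_times) / 200.0  # total time covered by the samples
print(lfp_times[:3], lfp_times[-1], duration)
```

Storing a rate and a starting time instead of a full timestamp array saves space and makes it explicit that the sampling is regular.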

## Spike Times
Spike times are stored in the `Units` table, which is another subclass of `DynamicTable`. You can add columns to the `Units` table just like you did for the electrodes and trials tables. Here, we generate some random spike data and populate the table.

In [12]:
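poisson_lambda = 20  # expected number of spikes per unit
firing_rate = 20     # mean firing rate in Hz
n_units = 10
for _ in range(n_units):
    # draw a spike count, then build spike times from exponential inter-spike intervals
    n_spikes = np.random.poisson(lam=poisson_lambda)
    spike_times = np.cumsum(np.random.exponential(1/firing_rate, n_spikes))
    nwbfile.add_unit(spike_times=spike_times)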
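
Taking the cumulative sum of exponentially distributed inter-spike intervals yields spike times with the statistics of a homogeneous Poisson process. The same construction is sketched below with only the standard library so that each step is explicit. Unlike the cell above, the number of spikes is fixed rather than drawn at random, and the seed and the function name `poisson_spike_train` are illustrative choices.

```python
import random
from itertools import accumulate


def poisson_spike_train(firing_rate, n_spikes, rng):
    """Spike times (s) built from exponential inter-spike intervals."""
    # expovariate takes the rate (1 / mean interval) as its argument
    intervals = [rng.expovariate(firing_rate) for _ in range(n_spikes)]
    return list(accumulate(intervals))


rng = random.Random(0)  # seeded so the sketch is repeatable
example_spike_times = poisson_spike_train(firing_rate=20, n_spikes=20, rng=rng)
print(len(example_spike_times))
print(example_spike_times[:3])
```

Because every interval is non-negative, the resulting spike times are sorted in increasing order, which is the form downstream tools generally expect.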

## Ragged arrays

Spike times are an example of a ragged array: it is like a matrix, but each row has a different number of elements. We can represent this type of data as an indexed column of the units `DynamicTable`. These indexed columns have two components: the vector data object that holds the data, and the vector index object that holds the indices in the vector that mark the row breaks.

<img src="images/ragged_array.png" width="800">
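
To make the two components concrete, here is a sketch in plain Python. The flat list plays the role of the vector data, and the index stores the position in that list where each row ends. The function names `to_ragged` and `get_row` are hypothetical, not PyNWB calls.

```python
def to_ragged(rows):
    """Flatten rows of unequal length into (data, index) lists."""
    data, index = [], []
    for row in rows:
        data.extend(row)
        index.append(len(data))  # exclusive end of this row in `data`
    return data, index


def get_row(data, index, i):
    """Recover row i from the flattened representation."""
    start = index[i - 1] if i > 0 else 0
    return data[start:index[i]]


# Three units with 3, 1, and 2 spikes respectively
spikes_per_unit = [[0.1, 0.4, 0.9], [0.2], [0.3, 0.5]]
data, index = to_ragged(spikes_per_unit)
print(data)                     # [0.1, 0.4, 0.9, 0.2, 0.3, 0.5]
print(index)                    # [3, 4, 6]
print(get_row(data, index, 2))  # [0.3, 0.5]
```

Reading one row only requires two entries of the index and one contiguous slice of the data, which is why a single unit's spike times can be read without loading the others.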

## Write the file

In [13]:
with NWBHDF5IO('ecephys_tutorial.nwb', 'w') as io:
    io.write(nwbfile)

## Reading NWB data
Data arrays are read lazily from the file. Accessing `TimeSeries.data` does not read the data values; it returns an `h5py` dataset object that can be indexed to read data. Index this object just like a numpy array to read only a specific section, or use `[:]` to read the entire array. The file must still be open when you index, which is why the reads below happen inside the `with` block.

In [14]:
with NWBHDF5IO('ecephys_tutorial.nwb', 'r') as io:
    read_nwbfile = io.read()

    print(read_nwbfile.processing['ecephys']['LFP']['ElectricalSeries'].data[:])

[[ 1.34121418e+00  8.70466499e-01  1.65889276e-02  1.16631916e+00]
 [-2.27022384e-01  9.63311736e-01  1.22233751e+00 -2.95633052e-01]
 [ 5.49780303e-01 -1.50784388e+00  1.78149678e+00 -1.98098081e+00]
 [ 1.44715525e-01 -4.19153868e-02 -2.25883996e-01  1.96636483e+00]
 [ 9.26385890e-01  9.26868415e-01  7.95483800e-01  6.45757203e-01]
 [ 6.99176751e-01 -3.19913693e-01  3.65440853e-01  2.64859343e-01]
 [-9.12358220e-01 -5.28023405e-01 -1.17523708e-01  9.93684567e-01]
 [-1.27738577e-01 -9.45271093e-01 -1.03879539e+00 -1.24304423e-01]
 [-6.47448194e-01  6.45679915e-01  1.07806809e+00 -6.40907078e-01]
 [-5.79912668e-01  4.19258823e-01  5.52501173e-01  8.66680845e-01]
 [ 1.53898868e+00 -1.47429134e+00  5.99631278e-01  8.13361478e-01]
 [ 3.48978106e-01  4.35619738e-01 -6.84280776e-01 -1.33263445e+00]
 [ 9.16439039e-01 -1.94143353e+00  8.46763421e-01  5.89745531e-01]
 [-2.03635529e+00  6.57251184e-01  6.48895481e-01 -3.15091519e-01]
 [ 1.94723699e+00 -5.83620724e-01 -1.40833365e+00 -2.74594228e

## Accessing data regions
It is often preferable to read only a portion of the data. To do this, index or slice into the `data` attribute. The following takes elements 0:10 in the first dimension and 0:3 in the second dimension from the LFP data we have written.

Accessing ragged arrays is similar: `read_nwbfile.units['spike_times'][0]` only reads the times from the 0th unit.

In [15]:
with NWBHDF5IO('ecephys_tutorial.nwb', 'r') as io:
    read_nwbfile = io.read()

    print('section of lfp:')
    print(read_nwbfile.processing['ecephys']['LFP']['ElectricalSeries'].data[:10, :3])
    print('')
    print('spike times from 0th unit:')
    print(read_nwbfile.units['spike_times'][0])

section of lfp:
[[ 1.34121418  0.8704665   0.01658893]
 [-0.22702238  0.96331174  1.22233751]
 [ 0.5497803  -1.50784388  1.78149678]
 [ 0.14471552 -0.04191539 -0.225884  ]
 [ 0.92638589  0.92686841  0.7954838 ]
 [ 0.69917675 -0.31991369  0.36544085]
 [-0.91235822 -0.52802341 -0.11752371]
 [-0.12773858 -0.94527109 -1.03879539]
 [-0.64744819  0.64567991  1.07806809]
 [-0.57991267  0.41925882  0.55250117]]

spike times from 0th unit:
[0.0079322  0.04033125 0.04591251 0.20966924 0.27518436 0.28255893
 0.29601483 0.30625737 0.34076845 0.35842703 0.45232809 0.53963696
 0.5879278  0.59880164 0.63510865 0.77355787 0.78247874 0.90127017
 0.928094   0.9786283  1.10822076 1.16956127 1.17937646 1.20978309
 1.27515263 1.30065718 1.35858139]


# Learn more!

## Python tutorials
### See our tutorials for more details about your data type:
* [Extracellular electrophysiology](https://pynwb.readthedocs.io/en/stable/tutorials/domain/ecephys.html#sphx-glr-tutorials-domain-ecephys-py)
* [Calcium imaging](https://pynwb.readthedocs.io/en/stable/tutorials/domain/ophys.html#sphx-glr-tutorials-domain-ophys-py)
* [Intracellular electrophysiology](https://pynwb.readthedocs.io/en/stable/tutorials/domain/icephys.html#sphx-glr-tutorials-domain-icephys-py)

### Check out other tutorials that teach advanced NWB topics:
* [Iterative data write](https://pynwb.readthedocs.io/en/stable/tutorials/general/iterative_write.html#sphx-glr-tutorials-general-iterative-write-py)
* [Extensions](https://pynwb.readthedocs.io/en/stable/tutorials/general/extensions.html#sphx-glr-tutorials-general-extensions-py)
* [Advanced HDF5 I/O](https://pynwb.readthedocs.io/en/stable/tutorials/general/advanced_hdf5_io.html#sphx-glr-tutorials-general-advanced-hdf5-io-py)


## MATLAB tutorials
* [Extracellular electrophysiology](https://neurodatawithoutborders.github.io/matnwb/tutorials/html/ecephys.html)
* [Calcium imaging](https://neurodatawithoutborders.github.io/matnwb/tutorials/html/ophys.html)
* [Intracellular electrophysiology](https://neurodatawithoutborders.github.io/matnwb/tutorials/html/icephys.html)
