# NWB Tutorial - Extracellular Electrophysiology

## Introduction
In this tutorial, we will create an NWB file for a hypothetical extracellular electrophysiology experiment with a freely moving animal. The types of data we will convert are:
- Subject information (species, strain, age, etc.) 
- Animal position
- Trials
- LFP
- Spike times

## Installing PyNWB
First, install PyNWB using pip or conda. You will need Python 3.5+ installed.
- `pip install pynwb`
- `conda install -c conda-forge pynwb`

## Set up the NWB file
An NWB file represents a single session of an experiment. Each file must have a session description, identifier, and session start time. Create a new `NWBFile` object with those and additional metadata. For all PyNWB constructors, we recommend using keyword arguments.

In [None]:
from pynwb import NWBFile
from datetime import datetime
from dateutil import tz

start_time = datetime(2018, 4, 25, 2, 30, 3, tzinfo=tz.gettz('US/Pacific'))

nwbfile = NWBFile(
    session_description='Mouse exploring an open field',
    identifier='Mouse5_Day3',
    session_start_time=start_time,
    session_id='session_1234',                                # optional
    experimenter='My Name',                                   # optional
    lab='My Lab Name',                                        # optional
    institution='University of My Institution',               # optional
    related_publications='DOI:10.1016/j.neuron.2016.12.011'   # optional
)
print(nwbfile)
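
The `identifier` is meant to be unique to the file, and `session_start_time` is best given with an explicit time zone. Below is a minimal sketch of generating both with only the standard library. It assumes that a random UUID is an acceptable identifier for your lab; the variable names and the fixed UTC-7 offset are illustrative and not part of the tutorial data.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Assumption: a random UUID is acceptable as a unique file identifier.
identifier = str(uuid.uuid4())

# A fixed UTC-7 offset stands in for a named time zone here.
utc_minus_7 = timezone(timedelta(hours=-7))
session_start = datetime(2018, 4, 25, 2, 30, 3, tzinfo=utc_minus_7)

print(identifier)
print(session_start.isoformat())
```

Either value can then be passed to the `NWBFile` constructor in place of the hard-coded ones above.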

## Subject information
Create a `Subject` object to store information about the experimental subject, such as age, species, genotype, sex, and a freeform description. Then set `nwbfile.subject` to the `Subject` object.

<img src="images/subject.svg" width="150">

Each of these fields is free-form text, so any values will be valid, but here are our recommendations:
- For age, we recommend using the [ISO 8601 Duration format](https://en.wikipedia.org/wiki/ISO_8601#Durations), e.g., P90D for 90 days old
- For species, we recommend using the formal Latin binomial name (e.g., mouse: *Mus musculus*, human: *Homo sapiens*)
- For sex, we recommend using F (female), M (male), U (unknown), and O (other)

In [None]:
from pynwb.file import Subject

nwbfile.subject = Subject(
    subject_id='001',
    age='P90D', 
    description='mouse 5',
    species='Mus musculus', 
    sex='M'
)
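
If you know the subject's date of birth, the ISO 8601 age string can be computed rather than typed by hand. The helper below is a hypothetical sketch (the function name and the example dates are assumptions) that expresses the age in whole days.

```python
from datetime import date


def age_as_iso8601_days(date_of_birth: date, session_date: date) -> str:
    """Return the age at the session as an ISO 8601 duration in whole days."""
    age_days = (session_date - date_of_birth).days
    if age_days < 0:
        raise ValueError("session_date is before date_of_birth")
    return f"P{age_days}D"


# Hypothetical date of birth, 90 days before a session on 2018-04-25.
age = age_as_iso8601_days(date(2018, 1, 25), date(2018, 4, 25))
print(age)  # P90D
```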

## SpatialSeries and Position
Many types of data can be stored in specialized classes in NWB. To store the spatial position of an animal, use the `SpatialSeries` and `Position` classes. 

`SpatialSeries` is a subclass of `TimeSeries`. `TimeSeries` is a common base class for measurements sampled over time, and provides fields for data and time (regularly or irregularly sampled).

<img src="images/SpatialSeries.svg" width="200">

Create a `SpatialSeries` object named `'SpatialSeries'` with some fake data.

In [None]:
import numpy as np
from pynwb.behavior import SpatialSeries

# create fake data with shape (50, 2)
# the first dimension should always represent time
position_data = np.array([np.linspace(0, 10, 50),
                          np.linspace(0, 8, 50)]).T
position_timestamps = np.linspace(0, 50, num=50) / 200  # 50 timestamps from 0 to 0.25 s

spatial_series_obj = SpatialSeries(
    name='SpatialSeries', 
    description='(x,y) position in open field',
    data=position_data,
    timestamps=position_timestamps,
    reference_frame='(0,0) is bottom left corner'
)

You can print the `SpatialSeries` object to view its contents.

In [None]:
print(spatial_series_obj)

To help data analysis and visualization tools know that this `SpatialSeries` object represents the position of the animal, store the `SpatialSeries` object inside of a `Position` object, which can hold one or more `SpatialSeries` objects.

<img src="images/Position2.svg" width="450">

In [None]:
from pynwb.behavior import Position

position_obj = Position(spatial_series=spatial_series_obj)

NWB differentiates between raw, *acquired data*, which should never change, and *processed data*, which are the results of preprocessing algorithms and could change. Let's assume that the animal's position was computed from a video tracking algorithm, so it would be classified as processed data. Since processed data can be very diverse, NWB allows us to create processing modules, which are like folders, to store related processed data or data that comes from a single algorithm. 

Create a processing module called "behavior" for storing behavioral data in the `NWBFile` and add the `Position` object to the module.

In [None]:
behavior_module = nwbfile.create_processing_module(
    name='behavior', 
    description='processed behavioral data'
)
behavior_module.add(position_obj)

<img src="images/behavior.svg" width="600">

## Write to file

Now, write the NWB file that we have built so far.

In [None]:
from pynwb import NWBHDF5IO

with NWBHDF5IO('ecephys_tutorial.nwb', 'w') as io:
    io.write(nwbfile)

We can then read the file and print it to inspect its contents. We can also print the `SpatialSeries` data that we created by referencing the names of the objects in the hierarchy that contain it. The processing module called `'behavior'` contains our `Position` object. By default, the `Position` object is named `'Position'`. The `Position` object contains our `SpatialSeries` object named `'SpatialSeries'`.

In [None]:
with NWBHDF5IO('ecephys_tutorial.nwb', 'r') as io:
    read_nwbfile = io.read()
    print(read_nwbfile.processing['behavior']['Position']['SpatialSeries'])

We can also use the HDFView tool to inspect the resulting NWB file.

<img src="images/position_hdfview.png" width="400">

## Trials

Trials are stored in a `TimeIntervals` object, which is a subclass of `DynamicTable`. `DynamicTable` objects are used to store tabular metadata throughout NWB, including for trials, electrodes, and sorted units. They offer flexibility for tabular data by allowing required columns, optional columns, and custom columns.

<img src="images/trials.svg" width="300">

The trials DynamicTable can be thought of as a table with this structure:

<img src="images/trials_example.png" width="400">

Continue adding to our `NWBFile` by creating a new column for the trials table named `'correct'`, which will be a boolean array.

In [None]:
nwbfile.add_trial_column(name='correct', description='whether the trial was correct')
nwbfile.add_trial(start_time=1.0, stop_time=5.0, correct=True)
nwbfile.add_trial(start_time=6.0, stop_time=10.0, correct=False)

We can view the trials table in tabular form by converting it to a pandas dataframe.

In [None]:
nwbfile.trials.to_dataframe()

## Electrodes table
Extracellular electrodes are stored in an `electrodes` table, which is also a `DynamicTable`. `electrodes` has several required fields: x, y, z, impedance, location, filtering, and electrode group.

<img src="images/Electrodes.png" width="500">

Use the following code to add electrodes for a multi-shank probe with two shanks, the first with 4 electrodes and the second with 3. We will also add a custom column named `'label'` to the table.

In [None]:
nwbfile.add_electrode_column(name='label', description='label of electrode')
shank_channels = [4, 3]  # set up two shanks, the first with 4 electrodes and the second with 3 electrodes

electrode_counter = 0
device = nwbfile.create_device('array')
for shankn, nelecs in enumerate(shank_channels):
    # create an electrode group for this shank
    electrode_group = nwbfile.create_electrode_group(
        name='shank{}'.format(shankn),
        description='electrode group for shank {}'.format(shankn),
        device=device,
        location='brain area'
    )
    # add electrodes to the electrode table
    for ielec in range(nelecs):
        nwbfile.add_electrode(
            x=5.3, y=1.5, z=8.5, imp=np.nan,
            location='unknown', 
            filtering='unknown',
            group=electrode_group,
            label='shank{}elec{}'.format(shankn, ielec)
        )
        electrode_counter += 1

As with the trials table, we can view the electrodes table in tabular form by converting it to a pandas dataframe.

In [None]:
nwbfile.electrodes.to_dataframe()

## Links
In the above loop, we created `ElectrodeGroup` objects in the `NWBFile`, and when we added an electrode to the `NWBFile`, we passed in the `ElectrodeGroup` object for the required `'group'` argument. This creates a reference from the `electrodes` table to individual `ElectrodeGroup` objects, one per row (electrode).

## ElectricalSeries and DynamicTableRegion

Voltage data are stored in `ElectricalSeries` objects. `ElectricalSeries` is a subclass of `TimeSeries` specialized for voltage data. In order to create our `ElectricalSeries` object, we will need to reference a set of rows in the `electrodes` table to indicate which electrodes were recorded. We will do this by creating a `DynamicTableRegion`, which is a type of link that allows you to reference specific rows of a `DynamicTable`, such as the `electrodes` table, by row indices.

Create a `DynamicTableRegion` that references all rows of the `electrodes` table.

In [None]:
all_table_region = nwbfile.create_electrode_table_region(
    region=list(range(electrode_counter)), 
    description='all electrodes'
)

Now create an `ElectricalSeries` object to hold LFP data collected during the experiment.

<img src="images/electricalseries.png" width="800">

In [None]:
from pynwb.ecephys import ElectricalSeries

# one column per electrode referenced by all_table_region
lfp_data = np.random.randn(50, electrode_counter)
lfp_elec_series = ElectricalSeries(
    name='ElectricalSeries', 
    data=lfp_data, 
    electrodes=all_table_region, 
    rate=200.
)

## LFP
To help data analysis and visualization tools know that this `ElectricalSeries` object represents LFP data, store the `ElectricalSeries` object inside of an `LFP` object. Then place the `LFP` object in a `ProcessingModule` named `'ecephys'`. This is analogous to how we stored the `SpatialSeries` object inside of a `Position` object and stored the `Position` object in a `ProcessingModule` named `'behavior'` earlier.

<img src="images/lfp.png" width="800">

In [None]:
from pynwb.ecephys import LFP

lfp = LFP(electrical_series=lfp_elec_series)

ecephys_module = nwbfile.create_processing_module(
    name='ecephys', 
    description='extracellular electrophysiology data'
)
ecephys_module.add(lfp)

## Spike Times
Spike times are stored in the `Units` table, which is another subclass of `DynamicTable`. We can add columns to the `Units` table just like we did for the electrodes and trials tables. 

Generate some random spike data and populate the `Units` table using `nwbfile.add_unit`. Then display the `Units` table as a pandas dataframe.

In [None]:
nwbfile.add_unit_column(name='quality', description='sorting quality')

poisson_lambda = 20
firing_rate = 20
n_units = 10
for _ in range(n_units):  # one iteration per simulated unit
    n_spikes = np.random.poisson(lam=poisson_lambda)
    spike_times = np.round(np.cumsum(np.random.exponential(1/firing_rate, n_spikes)), 5)
    nwbfile.add_unit(spike_times=spike_times, quality='good', waveform_mean=[1, 2, 3, 4, 5])

nwbfile.units.to_dataframe()
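
The fake spike times above are built by cumulatively summing exponentially distributed inter-spike intervals, which is one way to sample a homogeneous Poisson process. The standalone sketch below isolates that step and checks the empirical firing rate. The seed and the number of spikes are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded so the sketch is repeatable

firing_rate = 20.0  # spikes per second
n_spikes = 1000

# Inter-spike intervals of a homogeneous Poisson process are exponential
# with mean 1 / firing_rate; their cumulative sum gives the spike times.
intervals = rng.exponential(scale=1.0 / firing_rate, size=n_spikes)
spike_times = np.cumsum(intervals)

# with many spikes, the empirical rate should be close to firing_rate
estimated_rate = n_spikes / spike_times[-1]
print(estimated_rate)
```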

## Write the file

In [None]:
with NWBHDF5IO('ecephys_tutorial.nwb', 'w') as io:
    io.write(nwbfile)

## Reading NWB data
Data arrays are read lazily from the file. Accessing `TimeSeries.data` does not read the data values; it returns an `h5py` dataset object that can be indexed to read data. Index this object just like a numpy array to read only a specific section of the array, or use `[:]` to read the entire array.

In [None]:
with NWBHDF5IO('ecephys_tutorial.nwb', 'r') as io:
    read_nwbfile = io.read()

    print(read_nwbfile.processing['ecephys']['LFP']['ElectricalSeries'].data[:])

## Accessing data regions
It is often preferable to read only a portion of the data. To do this, index or slice into the `data` attribute. The code below prints elements 0:10 in the first dimension and 0:3 in the second dimension of the LFP data we have written.

Accessing data from a `DynamicTable` is similar: `read_nwbfile.units['spike_times'][0]` only reads the times from the 0th unit.

In [None]:
with NWBHDF5IO('ecephys_tutorial.nwb', 'r') as io:
    read_nwbfile = io.read()

    print('section of lfp:')
    print(read_nwbfile.processing['ecephys']['LFP']['ElectricalSeries'].data[:10,:3])
    print('')
    print('spike times from 0th unit:')
    print(read_nwbfile.units['spike_times'][0])

# Learn more!

## Python tutorials
### See our tutorials for more details about your data type:
* [Extracellular electrophysiology](https://pynwb.readthedocs.io/en/stable/tutorials/domain/ecephys.html#sphx-glr-tutorials-domain-ecephys-py)
* [Calcium imaging](https://pynwb.readthedocs.io/en/stable/tutorials/domain/ophys.html#sphx-glr-tutorials-domain-ophys-py)
* [Intracellular electrophysiology](https://pynwb.readthedocs.io/en/stable/tutorials/domain/icephys.html#sphx-glr-tutorials-domain-icephys-py)

### Check out other tutorials that teach advanced NWB topics:
* [Iterative data write](https://pynwb.readthedocs.io/en/stable/tutorials/general/iterative_write.html#sphx-glr-tutorials-general-iterative-write-py)
* [Extensions](https://pynwb.readthedocs.io/en/stable/tutorials/general/extensions.html#sphx-glr-tutorials-general-extensions-py)
* [Advanced HDF5 I/O](https://pynwb.readthedocs.io/en/stable/tutorials/general/advanced_hdf5_io.html#sphx-glr-tutorials-general-advanced-hdf5-io-py)


## MATLAB tutorials
* [Extracellular electrophysiology](https://neurodatawithoutborders.github.io/matnwb/tutorials/html/ecephys.html)
* [Calcium imaging](https://neurodatawithoutborders.github.io/matnwb/tutorials/html/ophys.html)
* [Intracellular electrophysiology](https://neurodatawithoutborders.github.io/matnwb/tutorials/html/icephys.html)
