# ASAP TDT dataset tutorial 

This tutorial demonstrates how to access an NWB file from the public dataset using `pynwb`.

This dataset contains recordings of single-unit activity from globus pallidus-internus (GPi) in monkeys performing a choice reaction time reaching task. The neuronal activity was recorded using 16-contact linear probes (0.5–1.0 MΩ, V-probe, Plexon) or glass-insulated tungsten microelectrodes (0.5–1.5 MΩ, Alpha Omega). The neuronal data were amplified (4×, 2 Hz–7.5 kHz) and digitized at approximately 24.414 kHz (16-bit resolution; Tucker Davis Technologies). The neuronal data were high-pass filtered (Fpass: 200 Hz, MATLAB FIRPM) and thresholded, and candidate action potentials were sorted into clusters in principal components space (Off-line Sorter, Plexon).

Contents:

- [Reading an NWB file](#read-nwb)
- [Access Subject metadata](#access-subject)
- [Access Trials](#access-trials)
- [Access Recording](#access-recording)
- [Access Units](#access-units)
- [View NWB files](#view-nwb)


The schematic below shows where the source data is stored in the NWB file:

![Alt text](../asap_tdt_uml.png)


# Reading an NWB file <a name="read-nwb"></a>

This section demonstrates how to read an NWB file using `pynwb`.

Based on the [NWB File Basics](https://pynwb.readthedocs.io/en/stable/tutorials/general/plot_file.html#sphx-glr-tutorials-general-plot-file-py) tutorial from [PyNWB](https://pynwb.readthedocs.io/en/stable/#).

An [NWBFile](https://pynwb.readthedocs.io/en/stable/pynwb.file.html#pynwb.file.NWBFile) represents a single session of an experiment. Each NWBFile must have a `session_description`, an `identifier`, and a `session_start_time`.

Reading is carried out using the [NWBHDF5IO](https://pynwb.readthedocs.io/en/stable/pynwb.html#pynwb.NWBHDF5IO) class. To read the NWB file, open it in read mode ("r") and call `read()` to retrieve an NWBFile object.

In [None]:
from pynwb import NWBHDF5IO

# The file path to a .nwb file
nwbfile_path = "/Volumes/LaCie/CN_GCP/Turner/nwbfiles/stub_Gaia_G_140721_1.nwb"
io = NWBHDF5IO(path=nwbfile_path, mode="r", load_namespaces=True)
nwbfile = io.read()

nwbfile

Importantly, the `session_start_time` is the reference time for all timestamps in the file. For instance, an event with a timestamp of 0 in the file occurred exactly at the session start time.
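
Because every timestamp is an offset in seconds from the session start, converting it to wall-clock time is a single addition. A minimal sketch, using a made-up session start time and made-up event offsets (neither comes from the dataset):

```python
from datetime import datetime, timedelta, timezone

# Hypothetical session start; in practice this would be nwbfile.session_start_time.
session_start_time = datetime(2020, 1, 1, 10, 0, 0, tzinfo=timezone.utc)

# Hypothetical event timestamps, in seconds relative to the session start.
event_offsets_s = [0.0, 1.5, 62.25]

# Absolute time of each event = session start + relative offset.
event_datetimes = [session_start_time + timedelta(seconds=s) for s in event_offsets_s]

for offset, absolute in zip(event_offsets_s, event_datetimes):
    print(f"{offset:8.2f} s -> {absolute.isoformat()}")
```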

The `session_start_time` is extracted from the `Tanksummary`, `CollectDate` structure in the .mat file that contains the events, the curated units, and optionally the stimulation data.

In [None]:
nwbfile.session_start_time

# Access subject metadata <a name="access-subject"></a>

This section demonstrates how to access the [Subject](https://pynwb.readthedocs.io/en/stable/pynwb.file.html#pynwb.file.Subject) field in an NWB file.

The [Subject](https://pynwb.readthedocs.io/en/stable/pynwb.file.html#pynwb.file.Subject) field can be accessed as `nwbfile.subject`.

In [None]:
nwbfile.subject

The MPTP status is stored in a [TurnerLabMetaData](https://github.com/catalystneuro/ndx-turner-metadata) container which extends [pynwb.file.LabMetaData](https://pynwb.readthedocs.io/en/stable/pynwb.file.html#pynwb.file.LabMetaData), and can be accessed as `nwbfile.lab_meta_data["MPTPMetaData"]`.


In [None]:
nwbfile.lab_meta_data["MPTPMetaData"]

# Access trials <a name="access-trials"></a>

Behavior trials are stored in `nwbfile.trials`. The `start_time` denotes the start time of each trial in seconds relative to the global session start time (using the "starttime" column from the `.mat` file containing the events). The `stop_time` denotes the end time of each trial in seconds relative to the global session start time (using the "endtime" column from the `.mat` file).

`nwbfile.trials` can be converted to a pandas DataFrame for convenient analysis using `nwbfile.trials.to_dataframe()`.


In [None]:
trials = nwbfile.trials.to_dataframe()

trials[:10]

In [None]:
trials[trials["target"] == "Left"][:3]
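
Since the trials table is an ordinary DataFrame, derived quantities follow from standard pandas operations. The sketch below uses a small hand-made table with invented values in place of `nwbfile.trials.to_dataframe()`. Only the `start_time`, `stop_time` and `target` column names are taken from the text above, and the "Right" label is an assumption.

```python
import pandas as pd

# Stand-in for nwbfile.trials.to_dataframe(); all values are invented.
trials = pd.DataFrame(
    {
        "start_time": [0.0, 4.0, 9.5, 15.0],
        "stop_time": [2.5, 7.0, 12.0, 18.5],
        "target": ["Left", "Right", "Left", "Right"],
    }
)

# Trial duration in seconds.
trials["duration"] = trials["stop_time"] - trials["start_time"]

# Mean trial duration per target.
mean_duration = trials.groupby("target")["duration"].mean()
print(mean_duration)
```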

# Access Recording <a name="access-recording"></a>

This section demonstrates how to access the raw `ElectricalSeries` data.

`NWB` organizes data into different groups depending on the type of data. Groups can be thought of as folders within the file. Here are some of the groups within an NWBFile and the types of data they are intended to store:

- `acquisition`: raw, acquired data that should never change
- `processing`: processed data, typically the results of preprocessing algorithms, that could change

## Raw ElectricalSeries

The raw ElectricalSeries data is stored in a [pynwb.ecephys.ElectricalSeries](https://pynwb.readthedocs.io/en/stable/pynwb.ecephys.html#pynwb.ecephys.ElectricalSeries) object which is added to `nwbfile.acquisition`. The data can be accessed as `nwbfile.acquisition["ElectricalSeries"]`.

The data in `ElectricalSeries` is stored as a two dimensional array: the first dimension is time, the second dimension represents electrodes/channels.


In [None]:
electrical_series = nwbfile.acquisition["ElectricalSeries"]

In [None]:
import numpy as np
import pandas as pd
import warnings
warnings.simplefilter(action='ignore', category=pd.errors.PerformanceWarning)

data = electrical_series.data[:100000, :3]
df = pd.DataFrame(data)
df["time"] = np.arange(0, data.shape[0])
df.set_index("time", inplace=True)
df.columns.name = "electrodes"

import plotly.express as px

fig = px.line(df, facet_row="electrodes", facet_row_spacing=0.01)

# show the x axis, hide the y axes, and leave both zoomable
fig.update_xaxes(visible=True, fixedrange=False)
fig.update_yaxes(visible=False, fixedrange=False)

# remove facet/subplot labels
fig.update_layout(annotations=[], overwrite=True)

# strip down the rest of the plot
fig.update_layout(
    showlegend=True,
    plot_bgcolor="white",
    margin=dict(t=10, l=10, b=10, r=10)
)

fig.show(config=dict(displayModeBar=True))


We can access the sampling frequency of the ElectricalSeries as `nwbfile.acquisition["ElectricalSeries"].rate`, and the starting time (in seconds, relative to `session_start_time`) as `nwbfile.acquisition["ElectricalSeries"].starting_time`:

In [None]:
electrical_series.rate, electrical_series.starting_time
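
Together, `rate` and `starting_time` determine the time of every sample: sample `i` occurs at `starting_time + i / rate` seconds after the session start. A minimal sketch with placeholder values (the rate below is only the approximate figure quoted in the introduction, and the helper name `sample_times` is made up for illustration):

```python
def sample_times(n_samples, rate, starting_time=0.0):
    """Return the time in seconds of each sample of a regularly sampled series."""
    return [starting_time + i / rate for i in range(n_samples)]


# Placeholder values; in practice use electrical_series.rate and
# electrical_series.starting_time.
rate = 24414.0
starting_time = 0.0

times = sample_times(5, rate, starting_time)
print(times)
```

The same expression applied to `np.arange(data.shape[0])` would give a time axis in seconds for the plot above, in place of sample indices.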

The electrodes table describes the electrodes that generated this data. Extracellular electrodes are stored in an "electrodes" table, which is a [DynamicTable](https://hdmf.readthedocs.io/en/stable/hdmf.common.table.html#hdmf.common.table.DynamicTable) and can be converted to a pandas DataFrame for convenient analysis using `nwbfile.electrodes.to_dataframe()`.

In [None]:
nwbfile.electrodes.to_dataframe()

## Filtered ElectricalSeries


The processed ecephys data is stored in "processing/ecephys" which can be accessed as `nwbfile.processing["ecephys"]`.
Within this processing module we can access the container of filtered traces as `nwbfile.processing["ecephys"]["Processed"]` which can hold multiple processed `ElectricalSeries` objects.


In [None]:
nwbfile.processing["ecephys"]

In [None]:
nwbfile.processing["ecephys"]["Processed"]

In [None]:
nwbfile.processing["ecephys"]["Processed"]["ElectricalSeriesProcessedGPi"]

# Access Units <a name="access-units"></a>

Spike times are stored in the `Units` table, which is a DynamicTable and can be converted to a pandas DataFrame for convenient analysis using `nwbfile.units.to_dataframe()`.

The `spike_times` and other metadata stored in `nwbfile.units` are extracted from the .mat file that contains the "units" structure.

_Note_:
The spike times from the Plexon files are added to "processing/ecephys" processing module and can be accessed as `nwbfile.processing["ecephys"]["units"]`.

In [None]:
nwbfile.units.to_dataframe()
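
Spike times and trial boundaries are both expressed relative to the session start, so spikes can be assigned to trials by a simple interval lookup. A sketch under stated assumptions: the spike times and trial intervals below are invented, and `spikes_in_window` is a hypothetical helper, not part of `pynwb`. It assumes the spike times of a unit are sorted in ascending order.

```python
from bisect import bisect_left


def spikes_in_window(spike_times, start, stop):
    """Return the spikes with start <= t < stop from a sorted list of spike times."""
    lo = bisect_left(spike_times, start)
    hi = bisect_left(spike_times, stop)
    return spike_times[lo:hi]


# Invented spike times (s) for one unit and invented (start_time, stop_time) pairs.
spike_times = [0.2, 0.9, 1.4, 4.1, 4.3, 5.0, 9.8]
trial_windows = [(0.0, 2.5), (4.0, 7.0), (9.5, 12.0)]

for start, stop in trial_windows:
    spikes = spikes_in_window(spike_times, start, stop)
    rate_hz = len(spikes) / (stop - start)
    print(f"trial {start:.1f}-{stop:.1f} s: {len(spikes)} spikes, {rate_hz:.2f} Hz")
```

With the real file, the spike times of one unit would come from the `spike_times` column of the units table, and the windows from the `start_time` and `stop_time` columns of the trials table.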

# View NWB files <a name="view-nwb"></a>


In [None]:
from nwbwidgets import nwb2widget

nwb2widget(nwbfile)

NWB files can also be viewed with [Neurosift](https://github.com/flatironinstitute/neurosift), a platform for the visualization of neuroscience data in the web browser.