# Water Maze NWB Tutorial

This tutorial demonstrates how to load and explore an NWB file containing behavioral data from the Morris water maze experiment, part of the [Rat Behavioural Phenotyping Pipeline Methodologies](https://sidb.org.uk/facilities/rat-behavioural-phenotyping-pipeline/).

**Contents:**
- [Reading an NWB file locally](#read-nwb)
- [Access Subject Metadata](#access-subject)
- [Access Devices Metadata](#access-devices)
- [Access Behavioral Videos](#access-behavior)
- [Access Trials](#access-trials)


# Reading an NWB file locally <a id="read-nwb"></a>

This section demonstrates how to read an NWB file using `pynwb`.

Based on the [NWB File Basics](https://pynwb.readthedocs.io/en/stable/tutorials/general/plot_file.html#sphx-glr-tutorials-general-plot-file-py) tutorial from [PyNWB](https://pynwb.readthedocs.io/en/stable/#).

An [NWBFile](https://pynwb.readthedocs.io/en/stable/pynwb.file.html#pynwb.file.NWBFile) represents a single session of an experiment. Each NWBFile must have a `session_description`, an `identifier`, and a `session_start_time`.

Reading is carried out using the [NWBHDF5IO](https://pynwb.readthedocs.io/en/stable/pynwb.html#pynwb.NWBHDF5IO) class. To read the NWB file, open it in read mode (`"r"`) and call `read()` to retrieve an NWBFile object.

In [None]:
from pynwb import NWBHDF5IO
from pathlib import Path

# Define the path to the NWB file
directory = Path("/Users/weian/data/Auditory Fear Conditioning/nwbfiles")
session_id = "WM_2_Reference_Day1"
subject_id = "302_Arid1b(2)"
nwbfile_path = directory / f"sub-{subject_id}_ses-{session_id}.nwb"
io = NWBHDF5IO(path=nwbfile_path, mode="r", load_namespaces=True)
nwbfile = io.read()
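
The cell above leaves the file handle `io` open for the rest of the notebook. When only a few top-level fields are needed, `NWBHDF5IO` can also be used as a context manager so the file is closed automatically. The following is a minimal sketch under that assumption; the helper name `read_session_summary` is hypothetical and not part of PyNWB.

```python
from pathlib import Path


def read_session_summary(nwbfile_path):
    """Return a few top-level fields from an NWB file, closing it afterwards.

    Hypothetical helper for illustration; not part of PyNWB.
    """
    nwbfile_path = Path(nwbfile_path)
    if not nwbfile_path.exists():
        raise FileNotFoundError(f"NWB file not found: {nwbfile_path}")

    # Imported lazily so the path check above runs even if pynwb is not installed.
    from pynwb import NWBHDF5IO

    with NWBHDF5IO(path=str(nwbfile_path), mode="r", load_namespaces=True) as io:
        nwbfile = io.read()
        # Copy plain Python values out before the file is closed.
        return {
            "identifier": nwbfile.identifier,
            "session_description": nwbfile.session_description,
            "session_start_time": nwbfile.session_start_time,
        }
```

Calling `read_session_summary(nwbfile_path)` returns a plain dictionary that remains usable after the file is closed. Keep the file open, as in the cell above, if you intend to read datasets lazily later on.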

In [None]:
print(nwbfile.experiment_description)

'This experiment is part of the Rat Behavioural Phenotyping Pipeline Methodologies https://sidb.org.uk/facilities/rat-behavioural-phenotyping-pipeline/.\nAs an initial effort to characterize the rat lines, these models are being behaviorally phenotyped at SIDB.\nRat models are phenotyped according to a rigorous pipeline that assesses behaviours relevant to autism with intellectual disability, such as social and motor skills, and learning and sensory processing.\nDifferent cohorts of rats are run through different subsets of tasks to overcome potential order effects on behaviours and to limit the number of tasks each animal is put through.\nThis experiment investigated spatial learning and memory in different rat models\n(Arid1b: https://rgd.mcw.edu/rgdweb/report/strain/main.html?id=14394518,\nGrin2b: https://rgd.mcw.edu/rgdweb/report/strain/main.html?id=14394515,\nScn2a: https://rgd.mcw.edu/rgdweb/report/strain/main.html?id=25394530) using the Morris Water Maze.\nThe experimenter was b

In [None]:
print(nwbfile.session_description)

'Long-term spatial memory assessment.\nIn long term spatial memory testing the platform remained in the same location over days and was 1-2 cm below water level.\nSpatial cues were available.\nIn the first swim of each day the platform only became available after 1 min.\n'

Importantly, the `session_start_time` is the reference time for all timestamps in the file. For instance, an event with a timestamp of 0 occurred exactly at the session start time.
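
To make the reference-time idea concrete, the sketch below converts relative timestamps into absolute datetimes by adding them to the session start time. The start time and the four relative values are copied from outputs shown elsewhere in this tutorial (the `session_start_time` output and the `start_time` column of the trials table); the variable names are only illustrative.

```python
from datetime import datetime, timedelta, timezone

# Session start time as shown in this tutorial (UTC+01:00).
session_start_time = datetime(2022, 9, 16, 10, 18, tzinfo=timezone(timedelta(hours=1)))

# Trial start times in seconds relative to the session start (from the trials table).
relative_start_times = [0.0, 1320.0, 2820.0, 3840.0]

# Absolute time = session start time + relative timestamp.
absolute_start_times = [
    session_start_time + timedelta(seconds=t) for t in relative_start_times
]

for relative, absolute in zip(relative_start_times, absolute_start_times):
    print(f"{relative:>7.1f} s -> {absolute.isoformat()}")
```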

The `session_start_time` is extracted from the CSV table (e.g. "Reference_analysis.csv"), which contains "Date" and "Time" columns.
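
As a rough illustration of how "Date" and "Time" columns can be combined into a single timezone-aware datetime, consider the sketch below. The CSV excerpt, the date and time formats, and the fixed UTC+01:00 offset are all assumptions made for the example; the actual layout of the analysis CSV and the conversion code used to build this file may differ.

```python
import csv
from datetime import datetime, timedelta, timezone
from io import StringIO

# Hypothetical CSV excerpt: the real file's date and time formats may differ.
csv_text = "Date,Time\n16/09/2022,10:18\n"

row = next(csv.DictReader(StringIO(csv_text)))

# Assumed formats: day/month/year and 24-hour hours:minutes.
naive_start = datetime.strptime(f"{row['Date']} {row['Time']}", "%d/%m/%Y %H:%M")

# Attach a fixed UTC+01:00 offset (assumption, chosen to match the offset
# in the session_start_time output shown in this tutorial).
session_start_time = naive_start.replace(tzinfo=timezone(timedelta(hours=1)))
print(session_start_time.isoformat())
```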

In [None]:
print(nwbfile.session_start_time)

datetime.datetime(2022, 9, 16, 10, 18, tzinfo=tzoffset(None, 3600))

# Access Subject Metadata <a name="access-subject"></a>

This section demonstrates how to access the [Subject](https://pynwb.readthedocs.io/en/stable/pynwb.file.html#pynwb.file.Subject) field in an NWB file.

The [Subject](https://pynwb.readthedocs.io/en/stable/pynwb.file.html#pynwb.file.Subject) field can be accessed as `nwbfile.subject`.


In [7]:
nwbfile.subject
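
Individual fields of the subject can be read as attributes, for example `nwbfile.subject.subject_id`. The sketch below gathers the populated fields into a dictionary. The attribute names are standard `Subject` fields in PyNWB, but which of them are filled in for this dataset is not shown here, so the helper `summarize_subject` (a hypothetical name) skips empty ones. A stand-in object with placeholder values is used so the example runs without the NWB file.

```python
from types import SimpleNamespace

# Standard optional attributes of pynwb.file.Subject.
SUBJECT_FIELDS = (
    "subject_id", "species", "strain", "genotype",
    "sex", "age", "date_of_birth", "description",
)


def summarize_subject(subject):
    """Collect the populated Subject fields into a plain dict (hypothetical helper)."""
    summary = {}
    for field in SUBJECT_FIELDS:
        value = getattr(subject, field, None)
        if value is not None:
            summary[field] = value
    return summary


# Stand-in object so the sketch runs without the NWB file;
# with the real file, call summarize_subject(nwbfile.subject) instead.
example_subject = SimpleNamespace(subject_id="302_Arid1b(2)", species=None, sex=None)
print(summarize_subject(example_subject))
```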

# Access Devices Metadata <a name="access-devices"></a>

This section demonstrates how to access the [Device](https://pynwb.readthedocs.io/en/stable/pynwb.device.html#module-pynwb.device) field in an NWB file.

The [Device](https://pynwb.readthedocs.io/en/stable/pynwb.device.html#module-pynwb.device) field can be accessed as `nwbfile.devices`.

In [None]:
nwbfile.devices
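
`nwbfile.devices` behaves like a dictionary keyed by device name, so it can be iterated to print a short summary per device. The following sketch assumes only the standard `description` and `manufacturer` attributes of a PyNWB `Device`; the helper name `describe_devices` and the stand-in device are hypothetical placeholders, not values from this dataset.

```python
from types import SimpleNamespace


def describe_devices(devices):
    """Return one line of text per device (hypothetical helper)."""
    lines = []
    for name, device in devices.items():
        description = getattr(device, "description", None) or "no description"
        manufacturer = getattr(device, "manufacturer", None) or "unknown manufacturer"
        lines.append(f"{name}: {description} ({manufacturer})")
    return lines


# Stand-in mapping with placeholder values; with the real file, pass nwbfile.devices.
example_devices = {
    "ExampleCamera": SimpleNamespace(description="placeholder", manufacturer=None),
}
for line in describe_devices(example_devices):
    print(line)
```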

# Access Behavioral Videos <a name="access-behavior"></a>

Each behavioral video is stored in the NWB file as an `ImageSeries` that links to an external video file rather than embedding the frames. There is one `ImageSeries` per trial.

In [19]:
# List all behavioral video acquisitions
for name in nwbfile.acquisition:
    if "BehavioralVideo" in name:
        print(name)

BehavioralVideoTrial1
BehavioralVideoTrial2
BehavioralVideoTrial3
BehavioralVideoTrial4


In [12]:
video = nwbfile.acquisition["BehavioralVideoTrial1"]
video

Data type,uint8
Shape,"(0, 0, 0)"
Array size,0.00 bytes
Chunk shape,
Compression,
Compression opts,
Compression ratio,undefined

Data type,object
Shape,"(1,)"
Array size,8.00 bytes
Chunk shape,
Compression,
Compression opts,
Compression ratio,0.5

Data type,int64
Shape,"(1,)"
Array size,8.00 bytes


The video filepath:

In [13]:
video.external_file[0]

'/Volumes/T9/Behavioural Pipeline/Water Maze/Arid1b/Arid1b(2)_WM/2_Reference/302_00.avi'
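
Because the stored path is absolute and points to an external volume, the video may not be reachable on another machine. The sketch below collects the file path of every behavioral video and checks whether it exists locally. The helper `list_video_files` and the stand-in acquisition mapping are hypothetical; the sketch assumes each `ImageSeries` references a single file, consistent with the shape `(1,)` reported above.

```python
from pathlib import Path
from types import SimpleNamespace


def list_video_files(acquisition, prefix="BehavioralVideo"):
    """Map each behavioral video name to (path, exists_on_disk). Hypothetical helper."""
    video_files = {}
    for name in acquisition:
        if not name.startswith(prefix):
            continue
        # Take the first (assumed only) external file of this ImageSeries.
        path = Path(str(acquisition[name].external_file[0]))
        video_files[name] = (path, path.exists())
    return video_files


# Stand-in acquisition mapping with a placeholder path;
# with the real file, pass nwbfile.acquisition.
example_acquisition = {
    "BehavioralVideoTrial1": SimpleNamespace(external_file=["/path/that/does/not/exist/302_00.avi"]),
    "OtherSeries": SimpleNamespace(external_file=["unused"]),
}
for name, (path, exists) in list_video_files(example_acquisition).items():
    print(name, path, "found" if exists else "missing")
```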

The metadata of the device used to record the behavioral video:

In [18]:
video.device

# Access Trials <a id="access-trials"></a>

The `nwbfile.trials` table contains one row per swim trial, including start and stop times and trial-specific metrics such as distance travelled, trial duration, and average swim speed.

The source of this table is the Water Maze analysis CSV file, generated by the WaterMaze software.

You can convert the trials table to a `pandas.DataFrame` for easier exploration:



In [22]:
trials = nwbfile.trials.to_dataframe()
trials

id,start_time,stop_time,platform,time_to_platform,trial_duration,distance_travelled_cm,average_speed,percent_time_near_walls,platform_quadrant,time_in_NE_quadrant,...,time_in_NW_quadrant,percent_time_in_NW_quadrant,time_to_platform_NW,time_in_SW_quadrant,percent_time_in_SW_quadrant,time_to_platform_SW,time_in_SE_quadrant,percent_time_in_SE_quadrant,time_to_platform_SE,timeseries
0,0.0,103.05,NE_A,102.76,103.05,3217.592,31.224,1.359,1,35.117,...,13.206,12.816,33.506,19.009,18.447,0.0,35.717,34.66,0.0,"[(0, 1030, BehavioralVideoTrial1 pynwb.image.I..."
1,1320.0,1329.349,NE,9.051,9.349,303.463,32.459,0.0,1,3.936,...,0.0,0.0,0.0,0.0,0.0,0.0,5.413,57.895,0.0,"[(0, 93, BehavioralVideoTrial2 pynwb.image.Ima..."
2,2820.0,2827.957,NE,5.949,7.957,156.39,19.654,0.0,1,5.599,...,2.358,29.63,0.0,0.0,0.0,0.0,0.0,0.0,0.0,"[(0, 79, BehavioralVideoTrial3 pynwb.image.Ima..."
3,3840.0,3847.8,NE,7.501,7.8,253.789,32.537,2.532,1,1.284,...,0.296,3.797,0.0,3.456,44.304,2.6,2.765,35.443,4.6,"[(0, 78, BehavioralVideoTrial4 pynwb.image.Ima..."
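
A quick sanity check on the table is to compare related columns. In the four rows displayed above, `trial_duration` equals `stop_time - start_time`, and `average_speed` matches `distance_travelled_cm / trial_duration` to the displayed precision. The sketch below verifies both relations using values copied from those rows. These relations are an observation about the displayed data, not a documented guarantee of the WaterMaze output.

```python
import math

# Values copied from the first columns of the trials table above.
rows = [
    {"start_time": 0.0, "stop_time": 103.05, "trial_duration": 103.05,
     "distance_travelled_cm": 3217.592, "average_speed": 31.224},
    {"start_time": 1320.0, "stop_time": 1329.349, "trial_duration": 9.349,
     "distance_travelled_cm": 303.463, "average_speed": 32.459},
    {"start_time": 2820.0, "stop_time": 2827.957, "trial_duration": 7.957,
     "distance_travelled_cm": 156.39, "average_speed": 19.654},
    {"start_time": 3840.0, "stop_time": 3847.8, "trial_duration": 7.8,
     "distance_travelled_cm": 253.789, "average_speed": 32.537},
]

checks = []
for trial_id, row in enumerate(rows):
    duration_from_times = row["stop_time"] - row["start_time"]
    speed_from_distance = row["distance_travelled_cm"] / row["trial_duration"]
    # Allow for rounding to three decimals in the displayed values.
    duration_ok = math.isclose(duration_from_times, row["trial_duration"], abs_tol=5e-3)
    speed_ok = math.isclose(speed_from_distance, row["average_speed"], abs_tol=5e-3)
    checks.append((duration_ok, speed_ok))
    print(f"trial {trial_id}: duration consistent={duration_ok}, speed consistent={speed_ok}")
```

With the DataFrame from the cell above, the same comparison can be written column-wise, for example `trials["stop_time"] - trials["start_time"]` against `trials["trial_duration"]`.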


In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

# Set the style
plt.style.use('seaborn-v0_8')
fig, axes = plt.subplots(2, 2, figsize=(14, 10))
fig.suptitle('Water Maze Trial Metrics', fontsize=16)

# Plot 1: Path Length across trials
sns.barplot(x=trials.index, y='distance_travelled_cm', data=trials, ax=axes[0, 0], palette='viridis')
axes[0, 0].set_title('Path Length by Trial')
axes[0, 0].set_xlabel('Trial Number')
axes[0, 0].set_ylabel('Path Length (cm)')

# Plot 2: Trial Duration
sns.barplot(x=trials.index, y='trial_duration', data=trials, ax=axes[0, 1], palette='viridis')
axes[0, 1].set_title('Trial Duration')
axes[0, 1].set_xlabel('Trial Number')
axes[0, 1].set_ylabel('Duration (s)')

# Plot 3: Swim Speed
sns.barplot(x=trials.index, y='average_speed', data=trials, ax=axes[1, 0], palette='viridis')
axes[1, 0].set_title('Average Swim Speed by Trial')
axes[1, 0].set_xlabel('Trial Number')
axes[1, 0].set_ylabel('Speed (cm/s)')

# Plot 4: Time spent in each quadrant (if available)
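# Keep only the time_in_<XX>_quadrant columns (seconds); skip the percent_* columns
quadrant_columns = [col for col in trials.columns if col.startswith('time_in_') and col.endswith('_quadrant')]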
if quadrant_columns:
    quadrant_data = trials[quadrant_columns].mean()
    sns.barplot(x=quadrant_data.index, y=quadrant_data.values, ax=axes[1, 1], palette='viridis')
    axes[1, 1].set_title('Avg Time in Quadrants')
    axes[1, 1].set_xlabel('Quadrant')
    axes[1, 1].set_ylabel('Time (s)')
    axes[1, 1].tick_params(axis='x', rotation=45)
else:
    # Alternative plot if quadrant data is not available
    if 'stop_time' in trials.columns and 'start_time' in trials.columns:
        sns.scatterplot(x=trials.index, y=trials['stop_time']-trials['start_time'], 
                        ax=axes[1, 1], s=80, color='purple')
        axes[1, 1].set_title('Trial Duration (from timestamps)')
        axes[1, 1].set_xlabel('Trial Number')
        axes[1, 1].set_ylabel('Duration (s)')

plt.tight_layout()
plt.subplots_adjust(top=0.9)
plt.show()

The trials table also has a `timeseries` column that holds references to the video associated with each trial.


In [58]:
# Access the external video from the trials table for the first trial
video_reference = nwbfile.trials["timeseries"][0][0]
video_object = video_reference.timeseries
video_object

Data type,uint8
Shape,"(0, 0, 0)"
Array size,0.00 bytes
Chunk shape,
Compression,
Compression opts,
Compression ratio,undefined

Data type,object
Shape,"(1,)"
Array size,8.00 bytes
Chunk shape,
Compression,
Compression opts,
Compression ratio,0.5

Data type,int64
Shape,"(1,)"
Array size,8.00 bytes
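
Each entry in the `timeseries` column is a reference with three fields: `idx_start`, `count`, and `timeseries`. The sketch below unpacks such references and relates `count` to the trial duration. It uses a stand-in named tuple with the same field names as PyNWB's `TimeSeriesReference`, and values copied from the trials table shown earlier. Interpreting `count` as the number of video frames in the trial is an assumption.

```python
from collections import namedtuple

# Stand-in with the same field names as pynwb.base.TimeSeriesReference.
ExampleReference = namedtuple("ExampleReference", ["idx_start", "count", "timeseries"])

# (idx_start, count, name) and trial_duration values copied from the trials table.
examples = [
    (ExampleReference(0, 1030, "BehavioralVideoTrial1"), 103.05),
    (ExampleReference(0, 93, "BehavioralVideoTrial2"), 9.349),
    (ExampleReference(0, 79, "BehavioralVideoTrial3"), 7.957),
    (ExampleReference(0, 78, "BehavioralVideoTrial4"), 7.8),
]

estimated_rates = {}
for reference, trial_duration in examples:
    idx_start, count, series_name = reference
    # Samples per second, assuming `count` samples span the whole trial.
    estimated_rates[series_name] = count / trial_duration
    print(
        f"{series_name}: samples {idx_start}..{idx_start + count - 1}, "
        f"about {estimated_rates[series_name]:.1f} samples/s"
    )
```

With the real file, the same fields are available on `nwbfile.trials["timeseries"][0][0]`, where the `timeseries` field is the `ImageSeries` object itself rather than its name.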
