# Prerequisites

If you run this on your own machine, you need to install Jupyter and the neuroconv package, and download the data.

However, if you are using either Binder or GitHub Codespaces for the workshop, these steps have already been done for you, so you don't need to install anything.


1- Download, install and open GUIDE: https://nwb-guide.readthedocs.io/en/stable/

2- Get the datasets on your local machine:

Option 1 for IDA members: Connect to Gaia/@ida/Equipements_Materials_and_Platforms/PLATEFORMES/PF_Signal/NWB_Workshop2025

Option 2 for Pasteur members: Download the 3 NWB datasets from Pasteur Drive (39 GB in total)

https://drive.pasteur.fr/s/Jp4ibFHEQ8rz2Km

https://drive.pasteur.fr/s/5kSEqaE6mqPtSnH 
        
https://drive.pasteur.fr/s/iLGB2FH6wSd9kpD

with the following password: ILoveNWB@2025
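
Before starting, it can help to confirm that the three files are where you expect them. The following is a minimal sketch: the `find_missing` helper and the `../data` location are assumptions made for illustration, while the file names are the ones used later in this workshop.

```python
from pathlib import Path

# File names used later in this workshop.
EXPECTED_FILES = [
    "sub-390_ses-17.nwb",
    "spikeglx_kilosort_BASIL_NWBfile.nwb",
    "Karthala_Suite2P_NWBfile.nwb",
]


def find_missing(data_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from data_dir (hypothetical helper)."""
    data_dir = Path(data_dir)
    return [name for name in expected if not (data_dir / name).is_file()]


# "../data" is an assumed location; point this at your own download folder.
missing = find_missing("../data")
print("All datasets found." if not missing else f"Missing: {missing}")
```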


# NWB workshop
In this section, you will load NWB datasets with the no-code tool "GUIDE".

## Dataset 1 : BASIL acquisition 

1- In the GUIDE app, click Explore, load and find in your local repository the file sub-390_ses-17.nwb

2- Expand the « acquisition » Tab. 

3- Check « Lick » and click « View 1 item ». Zoom in to 10 s of the experiment.

4- Close and now check 3 of these items: Lick, Reward, TTLTrigcamera1, TTLTrigsound, TrialType. Zoom in on 3 trials.

5- Close and click « open in pynwb ». With a few lines of code, you can open and visualize the data in a notebook as follows:

In [None]:
# import libraries
from pathlib import Path
from pynwb import NWBHDF5IO
import matplotlib.pyplot as plt

In [None]:
data_dir = Path("../data") # Data path
data_dir.mkdir(exist_ok=True) # Create data dir

In [None]:
%%bash
# Download BASIL dataset on the binder workspace
if [[ ! -e "../data/sub-390_ses-17.nwb" ]]; then
   wget --user "5kSEqaE6mqPtSnH" --password "ILoveNWB@2025" "https://drive.pasteur.fr/public.php/webdav" -O ../data/sub-390_ses-17.nwb
fi        

In [None]:
# Load the dataset, look at the data structure and plot timeseries
folder_path = "Y:/Equipements_Materials_and_Platforms/PLATEFORMES/PF_Signal/NWB_Workshop2025/sub-390_ses-17.nwb"
#folder_path = "../data/sub-390_ses-17.nwb"
io = NWBHDF5IO(folder_path, mode='r')
nwbfile = io.read()
nwbfile

In [None]:
# Visualize timeseries
plt.plot(nwbfile.acquisition['Lick'].data[100:20000])
plt.plot(nwbfile.acquisition['TTLtrigsounds'].data[100:20000])

In [None]:
io.close()
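
The plot above selects data by sample index (`data[100:20000]`). To select a window expressed in seconds instead, such as the 10 s suggested in the GUIDE steps, the indices can be derived from the sampling rate. This is a minimal sketch, assuming a regularly sampled series described by a `rate` and a `starting_time`; `window_to_slice` is a hypothetical helper and the 1000 Hz rate is illustrative, not the actual rate of the BASIL file.

```python
def window_to_slice(start_s, duration_s, rate, starting_time=0.0):
    """Convert a time window in seconds to a slice of sample indices.

    Assumes a regularly sampled series: sample i is at starting_time + i / rate.
    """
    first = max(0, round((start_s - starting_time) * rate))
    last = max(first, round((start_s + duration_s - starting_time) * rate))
    return slice(first, last)


# Illustrative values only: a 10 s window starting at 5 s, sampled at 1000 Hz.
window = window_to_slice(start_s=5.0, duration_s=10.0, rate=1000.0)
print(window)  # slice(5000, 15000, None)
```

While the file is still open, such a slice could be applied as `nwbfile.acquisition['Lick'].data[window]`, using the series' own `rate` and `starting_time`. This only applies when the series stores a rate rather than explicit timestamps.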

## Dataset 2 : Ephys acquisition

Here, we explore a dataset including : 

- Raw electrophysiology data acquired with spikeglx 

- Preprocessing / spike sorting performed with Kilosort

- Behavior/stimuli data acquired with BASIL.

Dataset courtesy : Pierre Platel, DSAPM.

In GUIDE app :

1- In the GUIDE app, click Explore, load and find in your local repository the file spikeglx_kilosort_BASIL_NWBfile.nwb 

2- Expand the «unit» Tab and have a look at the units description. 

3- Check the raster plot and click « view 1 item ». 

Zoom on 10s

Click « view more units »

4- Close and click « open in pynwb ». With a few lines of code, you can open and visualize the data in a notebook as follows.

Because of storage limits, you cannot download the data on Binder. We provide the lines of code to do so, but running them will crash your notebook. If you are interested, you can create a local Python environment and copy these lines of code to visualize and process your data as follows.

### Look at the data structure and plot spike raster plot

In [None]:
#%%bash
## Download ephys and spike sorting dataset
#if [[ ! -e "../data/spikeglx_kilosort_BASIL_NWBfile.nwb" ]]; then
#    wget --user "Jp4ibFHEQ8rz2Km" --password "ILoveNWB@2025" "https://drive.pasteur.fr/public.php/webdav" -O ../data/spikeglx_kilosort_BASIL_NWBfile.nwb
#fi            

In [None]:
#folder_path = "../data/spikeglx_kilosort_BASIL_NWBfile.nwb"
folder_path = "Y:/Equipements_Materials_and_Platforms/PLATEFORMES/PF_Signal/NWB_Workshop2025/spikeglx_kilosort_BASIL_NWBfile.nwb"
io = NWBHDF5IO(folder_path, mode='r')
nwbfile = io.read()
nwbfile

In [None]:
import numpy as np

units = nwbfile.units
units_spike_times = units["spike_times"]
# bin size for counting spikes
time_resolution = 0.01

# start and end times (relative to the stimulus at 0 seconds) that we want to examine and align spikes to
window_start_time = -0.1
window_end_time = 2

# time bins used
n_bins = int((window_end_time - window_start_time) / time_resolution)
bin_edges = np.linspace(window_start_time, window_end_time, n_bins, endpoint=True)

# useful throughout analysis
n_units = len(units_spike_times)

In [None]:
n_trials = 1

# 3D spike matrix to be populated with spike counts
spike_matrix = np.zeros((n_units, len(bin_edges), n_trials))

# populate 3D spike matrix for each unit for each stimulus trial by counting spikes into bins
for unit_idx in range(n_units):
    spike_times = units_spike_times[unit_idx]
    
    # get spike times that fall within the bin's time range relative to the stim time        
    first_bin_time = bin_edges[0]
    last_bin_time = bin_edges[-1]
    first_spike_in_range, last_spike_in_range = np.searchsorted(spike_times, [first_bin_time, last_bin_time])
    spike_times_in_range = spike_times[first_spike_in_range:last_spike_in_range]

    # convert spike times into relative time bin indices
    bin_indices = ((spike_times_in_range - (first_bin_time)) / time_resolution).astype(int)
    
    # mark that there is a spike at these bin times for this unit on this stim trial
    for bin_idx in bin_indices:
        spike_matrix[unit_idx, bin_idx, 0] += 1

spike_matrix.shape
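
The cell above counts, for each unit, how many spikes fall into each time bin. Stripped of the NWB specifics, the core of that step can be sketched with the standard library alone. The `bin_spikes` helper and the spike times below are made up for illustration. Note that `n_bins` bins of width `bin_width` are delimited by `n_bins + 1` edges.

```python
from bisect import bisect_left


def bin_spikes(spike_times, window_start, window_end, bin_width):
    """Count spikes per bin in [window_start, window_end).

    spike_times must be sorted in ascending order.
    """
    n_bins = round((window_end - window_start) / bin_width)
    counts = [0] * n_bins
    # Keep only the spikes inside the window (binary search on the sorted times).
    first = bisect_left(spike_times, window_start)
    last = bisect_left(spike_times, window_end)
    for t in spike_times[first:last]:
        # Clamp to guard against floating-point rounding at the last edge.
        idx = min(int((t - window_start) / bin_width), n_bins - 1)
        counts[idx] += 1
    return counts


# Made-up spike times (seconds) for a single unit.
counts = bin_spikes([-0.25, -0.05, 0.004, 0.006, 1.5, 2.5], -0.1, 2.0, 0.01)
print(len(counts), sum(counts))  # 210 4
```

The two spikes outside the window (-0.25 s and 2.5 s) are dropped, and the two spikes at 0.004 s and 0.006 s land in the same 10 ms bin.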

In [None]:
import matplotlib.pyplot as plt

trial = 0
fig, ax = plt.subplots(figsize=(10,10))

ax.set_title("Unit Spikes")
ax.set_xlabel("Time (s)")
ax.set_ylabel("Unit #")

img = ax.imshow(spike_matrix[:,:,trial], extent=[window_start_time,window_end_time,0,n_units], aspect=0.001, vmin=0, vmax=1)
cbar = fig.colorbar(img, shrink=0.5)
cbar.set_label("# Spikes")

In [None]:
io.close()

## Dataset 3 : 2P Ca2+ imaging acquisition

Here, we explore a dataset including : 

- Preprocessing: segmentation and fluorescence trace extraction performed with Suite2P on data acquired with a two-photon microscope (Karthala or Mega2P, at the Hearing Institute)

Note that we could also have included the BASIL acquisition (to visualize sound stimuli) and the raw calcium imaging stack in the same NWB file.

Dataset courtesy : Amel Saoudi (TGTD) and Anthony Lourdiane (DSAPM)

In GUIDE app :

1- In the GUIDE app, click Explore, load and find in your local repository the file Karthala_Suite2P_NWBfile.nwb

2- Expand the «processing/ophys» Tab.

3- Check the « ImageSegmentation » and click « view 1 item ». 

Look at cell segmentation of each single plane (4 in total)

4- Uncheck « Image segmentation » and check « Segmentation Images ».

Click view 1 item and visualize correlation and mean images.

5- Close, Uncheck and now check Fluorescence/RoiResponseSeriesChan1Plane0

Click View 1 item

Choose  # visible chans: 10.

You can also zoom on a smaller time window

6- Close and click « open in pynwb ». With a few lines of code, you can open and visualize the data in a notebook as follows.

Because of storage limits, you cannot download the data on Binder. We provide the lines of code to do so, but running them will crash your notebook. If you are interested, you can create a local Python environment and copy these lines of code to visualize and process your data as follows.

### Look at the data structure and plot calcium imaging raster plot

In [None]:
#%%bash
## Download Calcium imaging and Suite2P dataset
#if [[ ! -e "../data/Karthala_Suite2P_NWBfile.nwb" ]]; then
#    wget --user "q6m5SXzmmA8fZwD" --password "ILoveNWB@2025" "https://drive.pasteur.fr/public.php/webdav" -O ../data/Karthala_Suite2P_NWBfile.nwb
#fi            

In [None]:
#folder_path = "../data/Karthala_Suite2P_NWBfile.nwb"
folder_path = "Y:/Equipements_Materials_and_Platforms/PLATEFORMES/PF_Signal/NWB_Workshop2025/Karthala_Suite2P_NWBfile.nwb"
io = NWBHDF5IO(folder_path, mode='r')
nwbfile = io.read()
nwbfile

In [None]:
X = nwbfile.processing['ophys']
Y = X.data_interfaces['Fluorescence']
Z = Y.roi_response_series['RoiResponseSeriesChan1Plane0']
starting_time = Z.starting_time[()]
rate = Z.rate
data = Z.data
n_cells = data.shape[1]

print(f'starting_time: {starting_time}')
print(f'rate: {rate}')
print(f'data shape: {data.shape}')
print(f'n cells: {n_cells}')

In [None]:
fig, ax = plt.subplots(figsize=(10,10))

ax.set_title("Raw fluorescence trace")
ax.set_xlabel("Time (frame)")
ax.set_ylabel("Cells #")

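img = ax.imshow(np.transpose(data[0:50000,1:50]), aspect=500)
# Create a colorbar for this figure (the earlier `cbar` belongs to the spike raster plot)
cbar = fig.colorbar(img, shrink=0.5)
cbar.set_label("Fluorescence (a.u.)")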

In [None]:
io.close()

## Converting to NWB with neuroconv

We will deal with three problems that we think arise often.
The first one is having a type of data that you would like to convert to NWB format and for which there is already a converter in [<img src=https://neuroconv.readthedocs.io/en/main/_images/neuroconv_logo.png width=50 height=50>](https://neuroconv.readthedocs.io/en/main/). The available converters can be found in the neuroconv [gallery](https://neuroconv.readthedocs.io/en/main/conversion_examples_gallery/index.html).
The second one is already having an NWB file and some data, for which there is a neuroconv data interface, that you would like to add to the NWB file.
The third problem is having an NWB file from which you would like to extract some data and create a new file.


### Problem 1a:
We will start with the simplest case scenario.
We have a single source of data. The data that we will use are from a csv file, 
and to convert them to NWB we will use the [`CsvTimeIntervalsInterface`](https://neuroconv.readthedocs.io/en/main/conversion_examples_gallery/text/csv.html) from neuroconv.
The procedure will be the same for any type of data and the corresponding neuroconv interface.

In [None]:
# Import requirement for the conversion
from neuroconv.datainterfaces import CsvTimeIntervalsInterface
from datetime import datetime
from zoneinfo import ZoneInfo
from pathlib import Path

#### Create CSV file if not available

In [None]:
import random
import pandas as pd
data_dir = Path("../data") # Data path
data_dir.mkdir(exist_ok=True) # Create data dir
csv_file_path = data_dir / "mydata.csv" # Form file data path
random.seed(42) # set seed
starts = []  # interval start times (s)
ends = []  # interval end times (s)
vals = []  # one random value per interval
s0 = 0  # end time of the previous interval
while (not ends) or ends[-1] < 120 :
    starts.append(s0 + random.randrange(1,4))
    end = starts[-1] + random.randrange(1,8)
    vals.append(random.uniform(0,1))
    s0 = end
    ends.append(end)
trial_times = pd.DataFrame({'start_time':starts, 'end_time':ends, 'value':vals })
# Write csv file if it doesn't exist
if not csv_file_path.exists():
    trial_times.to_csv(csv_file_path,index=False)

In [None]:
### Show first 5 content lines of csv file
pd.read_csv(csv_file_path).head()

In [None]:
### Show first 5 content lines of csv file with bash
### 1 line for the header + 5 lines for the contents
!head -n6 ../data/mydata.csv
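
The table has one row per interval, with `start_time`, `end_time` and `value` columns. Before converting, it can be worth checking that every interval has a positive duration and that consecutive intervals do not overlap. The following is a minimal sketch using only the standard library; `check_intervals` is a hypothetical helper and the sample rows are made up.

```python
import csv
import io


def check_intervals(csv_text):
    """Return a list of problems found in a start_time/end_time table (hypothetical helper)."""
    problems = []
    previous_end = None
    for row_number, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=1):
        start, end = float(row["start_time"]), float(row["end_time"])
        if end <= start:
            problems.append(f"row {row_number}: end_time <= start_time")
        if previous_end is not None and start < previous_end:
            problems.append(f"row {row_number}: overlaps the previous interval")
        previous_end = end
    return problems


# Made-up rows in the same layout as mydata.csv.
sample = "start_time,end_time,value\n1,4,0.2\n6,9,0.7\n8,12,0.1\n"
print(check_intervals(sample))  # ['row 3: overlaps the previous interval']
```

To check the real file, the text could be obtained with `Path(csv_file_path).read_text()`.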

#### Create interface

In [None]:
csv_interface = CsvTimeIntervalsInterface(file_path=csv_file_path, verbose=False)

#### Get metadata from file

In [None]:
metadata = csv_interface.get_metadata()
metadata

#### Attempt to create NWB file
This is expected to fail. We just want to get an idea of the error messages and how to interpret them.

In [None]:
prob1a_nwb_file = data_dir/"problem_1a.nwb"
csv_interface.run_conversion(nwbfile_path = prob1a_nwb_file, metadata=metadata)

#### Add missing metadata
From the error message above we see that we are missing the session_start_time which is a required property.
We will add that information in the metadata dictionary.

In [None]:
session_start_time = datetime(2025,1,10,11,45,0, tzinfo=ZoneInfo("Europe/Paris"))
metadata["NWBFile"]["session_start_time"] = session_start_time
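
The `session_start_time` above is built with an explicit `tzinfo`. The short sketch below illustrates the difference between a naive and a time-zone-aware `datetime`, using only the standard library. The offset shown follows from Paris being on CET (UTC+1) in January.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Same date and time as above, without and with a time zone.
naive = datetime(2025, 1, 10, 11, 45, 0)
aware = datetime(2025, 1, 10, 11, 45, 0, tzinfo=ZoneInfo("Europe/Paris"))

print(naive.tzinfo)       # None
print(aware.isoformat())  # 2025-01-10T11:45:00+01:00

# An aware datetime converts unambiguously to UTC.
print(aware.astimezone(timezone.utc).isoformat())  # 2025-01-10T10:45:00+00:00
```

An aware timestamp identifies the same instant regardless of where the file is later read, which is why we attach the time zone here.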

#### Create NWB file
Now that we have the required metadata, we can try again to create the NWB file.

In [None]:
csv_interface.run_conversion(nwbfile_path = prob1a_nwb_file, metadata=metadata)

#### Look at the generated NWB file
##### Use the pynwb library to read the file
First we look at the file using the pynwb library.

In [None]:
from pynwb import NWBHDF5IO
import pynwb

In [None]:
# Ideally we would use a with statement 
# as in the commented code
#
## with NWBHDF5IO(prob1a_nwb_file, mode = 'r') as io:
##    nwbfile = io.read()
##    nwbfile
# we instead use the code below to be able to
# see a nice version of the file.
io = NWBHDF5IO(prob1a_nwb_file, mode = 'r')
nwbfile = io.read()
nwbfile

In [None]:
io.close() # Don't forget to close the file

##### Use NWBwidgets to read the file.
This is just a taste of how NWBwidgets works; we will take a closer look at NWBwidgets later on.

In [None]:
from nwbwidgets import nwb2widget

io = NWBHDF5IO(prob1a_nwb_file, mode = 'r')
nwbfile = io.read()
nwb2widget(nwbfile)

In [None]:
io.close()

##### Use h5py to read the file
Since our file is an HDF5 file, we can use any of the HDF5 libraries to read it. Here we are using the h5py library.

In [None]:
import h5py
f = h5py.File(prob1a_nwb_file, 'r')

In [None]:
list(f.keys())

In [None]:
my_intervals = f.get('intervals')
my_print = lambda x,_: print(x)
my_intervals.visititems(my_print)
print(my_intervals.get('trials').get('value')[:])

In [None]:
f.close()

###  Problem 1b
We again want to generate an NWB file from already available data.
However, in addition to the CSV file, we now have some TIFF images that
should also be included in the generated NWB file.
#### Download tiff file

In [None]:
%%bash
# Download movie file if not already available
if [[ ! -e "../data/demoMovie.tif" ]]; then
   wget https://github.com/flatironinstitute/CaImAn/raw/refs/heads/main/example_movies/demoMovie.tif -O ../data/demoMovie.tif
fi        

In [None]:
from neuroconv.datainterfaces import TiffImagingInterface
from neuroconv import NWBConverter
movie_path = data_dir / 'demoMovie.tif'
prob1b_nwb_file = data_dir / 'problem_1b.nwb'


class MyConverter(NWBConverter):
    data_interface_classes = dict (
        csvIntervals = CsvTimeIntervalsInterface,
        movieRecording = TiffImagingInterface )

sourceData = dict(
      csvIntervals = dict(file_path=csv_file_path),
      movieRecording = dict(file_path=movie_path, sampling_frequency=15.0))

dual_converter = MyConverter(sourceData)

metadata = dual_converter.get_metadata()
metadata
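
In the converter above, each key of `sourceData` (`csvIntervals`, `movieRecording`) names the entry of `data_interface_classes` that should receive those arguments. The toy sketch below illustrates that pairing with plain Python stand-ins rather than the neuroconv classes. `FakeInterface` and `build_interfaces` are hypothetical, and the real `NWBConverter` does more than this (schema validation, for instance).

```python
class FakeInterface:
    """Stand-in for a data interface: it just records its keyword arguments."""

    def __init__(self, **source_data):
        self.source_data = source_data


def build_interfaces(interface_classes, source_data):
    """Instantiate each interface with the source data stored under the same key."""
    unknown = set(source_data) - set(interface_classes)
    if unknown:
        raise KeyError(f"No interface declared for: {sorted(unknown)}")
    return {
        name: interface_classes[name](**kwargs)
        for name, kwargs in source_data.items()
    }


interface_classes = dict(csvIntervals=FakeInterface, movieRecording=FakeInterface)
source_data = dict(
    csvIntervals=dict(file_path="mydata.csv"),
    movieRecording=dict(file_path="demoMovie.tif", sampling_frequency=15.0),
)
interfaces = build_interfaces(interface_classes, source_data)
print(interfaces["movieRecording"].source_data["sampling_frequency"])  # 15.0
```

A misspelled key in the source data is therefore worth checking first when an interface does not seem to receive its arguments.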

In [None]:
metadata["NWBFile"]["session_start_time"] = session_start_time
dual_converter.run_conversion(metadata=metadata, nwbfile_path=prob1b_nwb_file)

#### Look at the generated file

In [None]:
io = NWBHDF5IO(prob1b_nwb_file, mode='r') 
nwbfile = ### Fill in the missing code to read the file
nwbfile

In [None]:
io.close() # Remember to close the file handle

### Problem 2
In this problem we have an NWB file with some data (the file we
created in problem 1a) and we have acquired some new data (the TIFF
file from problem 1b). We want to have all the data in a single file.
We will use two approaches:
1. Append the data to an existing NWB file on disk.
2. Create a new NWB file in memory and save it.

#### Create an appropriate interface
First, we create an interface for the TIFF file.

In [None]:
tiff_interface = TiffImagingInterface(file_path=movie_path, sampling_frequency=15.0)

#### Append data to an existing NWB file
We copy the file we created in problem 1a

In [None]:
# First we create a copy of the file we created in  problem 1a
import shutil
prob2a_nwb_file = data_dir / "problem_2a.nwb"

shutil.copyfile(prob1a_nwb_file, prob2a_nwb_file)

In [None]:
tiff_interface.run_conversion(prob2a_nwb_file) 
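
The call above appends to `problem_2a.nwb` because the file already exists and `overwrite` keeps its default value of `False`. The decision, as it appears in the `run_conversion` source quoted further down in this notebook, reduces to the sketch below. `resolve_mode` is a hypothetical helper, and the temporary files are empty placeholders rather than real NWB files.

```python
import tempfile
from pathlib import Path


def resolve_mode(nwbfile_path, overwrite=False):
    """Mirror the decision: append only if the file exists and overwrite is False."""
    file_initially_exists = Path(nwbfile_path).exists()
    append_mode = file_initially_exists and not overwrite
    return "append" if append_mode else "write"


with tempfile.TemporaryDirectory() as tmp:
    existing = Path(tmp) / "problem_2a.nwb"
    existing.touch()  # empty placeholder, not a real NWB file
    modes = [
        resolve_mode(existing),                    # file exists, overwrite=False
        resolve_mode(existing, overwrite=True),    # file exists, overwrite=True
        resolve_mode(Path(tmp) / "new_file.nwb"),  # file does not exist
    ]

print(modes)  # ['append', 'write', 'write']
```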

#### Read in the generated file
Use pynwb to read the generated file

In [None]:
io =  ### Fill in missing code to read the NWB file
nwbfile = io.read()
nwbfile

In [None]:
io.close() # Close the file handle

#### Create an NWB file in memory and save it

In [None]:
prob2b_nwb_file = data_dir / "problem_2b.nwb"
with NWBHDF5IO(prob1a_nwb_file, mode = 'r') as fin, NWBHDF5IO(prob2b_nwb_file, mode = 'w' ) as fout:
    prob1a = fin.read() # Read nwb file from prob1a
    tiff_interface.add_to_nwbfile(prob1a) # Add the photon information to prob1a, modifies in place
    fout.export(fin, nwbfile=prob1a) # Export the new file

##### Read in the generated file

In [None]:
# Write the code to read in the file
# 1. Open a filehandle
# 2. Read the file in the variable my_file
# 3. Write the variable as the last statement in the block to display it in the notebook

In [None]:
# Close the filehandle you opened in the previous code block.

### Problem 3

In this problem we are looking at the scenario where we have an NWB file already. 
However, we would like to remove some information and save the result as an NWB file.
We will start with the NWB file we created in problem 1b and remove the TwoPhotonSeries
from acquisition. Note that you can only pop items from LabelledDict objects.

In [None]:
fin = NWBHDF5IO(prob1b_nwb_file, mode = 'r')
prob1b = fin.read()
prob1b

In [None]:
type(prob1b.acquisition)

In [None]:
two_photon = prob1b.acquisition.pop('TwoPhotonSeries')
prob1b

In [None]:
prob3_nwb_file = data_dir / "problem_3.nwb"
with NWBHDF5IO(prob3_nwb_file, mode = 'w' ) as fout:
    fout.export(fin, nwbfile=prob1b) 

In [None]:
fin.close()

#### Read in the generated file

In [None]:
with NWBHDF5IO(prob3_nwb_file, mode='r') as fin:
    print(fin.read())


## NWBwidgets
A closer look at NWB widgets. We will look at a file from the DANDI archive. 
Run the code block below and then follow the instructions on the displayed widget.
 1. Select DANDI using the radio button. 
 2. Select dandiset 4. 
 3. Select the nwb file for sub-P27CS
 4. Press the button load file.

In [None]:
from nwbwidgets.panel import Panel
Panel()

## Writing your own neuroconv interface
We will take a look at how to write a simple neuroconv interface.
Let's assume we have some TTL signals that we have saved in a matlab file.
We would like to create an interface to convert such files to the nwb format.

### Create a mat file with the data we would like to convert.
We will create a mat file with some random data. The file will also contain
a label, and a frequency.

In [None]:
import numpy as np
from scipy.io import savemat
data = np.outer(
    np.random.choice(a=[0,1], p=[0.8, 0.2],replace=True, size=100), 
    np.ones(10)).reshape(-1)
matdict = {'data': data, 'freq': 1000, 'label':'TTLStrobe'}
savemat(data_dir/"test.mat", matdict)

In [None]:
from scipy.io import loadmat
res = loadmat(data_dir/"test.mat")
#res['freq'][0][0] # Obtains the frequency
#res['label'][0]   # Obtains the label
res

### Looking at BaseDataInterface
You can see that it is an abstract class and that we need to override the `add_to_nwbfile`
and `__init__` methods of the BaseDataInterface.

```
class BaseDataInterface(ABC):
    """Abstract class defining the structure of all DataInterfaces."""

    display_name: Union[str, None] = None
    keywords: tuple[str] = tuple()
    associated_suffixes: tuple[str] = tuple()
    info: Union[str, None] = None

    @classmethod
    def get_source_schema(cls) -> dict:
        """Infer the JSON schema for the source_data from the method signature (annotation typing)."""
        return get_json_schema_from_method_signature(cls, exclude=["source_data"])

    @classmethod
    def validate_source(cls, source_data: dict, verbose: bool = False):
        """Validate source_data against Converter source_schema."""
        cls._validate_source_data(source_data=source_data, verbose=verbose)

    def _validate_source_data(self, source_data: dict, verbose: bool = False):

        encoder = _NWBSourceDataEncoder()
        # The encoder produces a serialized object, so we deserialized it for comparison

        serialized_source_data = encoder.encode(source_data)
        decoded_source_data = json.loads(serialized_source_data)
        source_schema = self.get_source_schema()
        validate(instance=decoded_source_data, schema=source_schema)
        if verbose:
            print("Source data is valid!")

    @validate_call
    def __init__(self, verbose: bool = False, **source_data):
        self.verbose = verbose
        self.source_data = source_data

        self._validate_source_data(source_data=source_data, verbose=verbose)

    def get_metadata_schema(self) -> dict:
        """Retrieve JSON schema for metadata."""
        metadata_schema = load_dict_from_file(Path(__file__).parent / "schemas" / "base_metadata_schema.json")
        return metadata_schema

    def get_metadata(self) -> DeepDict:
        """Child DataInterface classes should override this to match their metadata."""
        metadata = DeepDict()
        metadata["NWBFile"]["session_description"] = ""
        metadata["NWBFile"]["identifier"] = str(uuid.uuid4())

        # Add NeuroConv watermark (overridden if going through the GUIDE)
        neuroconv_version = importlib.metadata.version("neuroconv")
        metadata["NWBFile"]["source_script"] = f"Created using NeuroConv v{neuroconv_version}"
        metadata["NWBFile"]["source_script_file_name"] = __file__  # Required for validation

        return metadata

    def validate_metadata(self, metadata: dict, append_mode: bool = False) -> None:
        """Validate the metadata against the schema."""
        encoder = _NWBMetaDataEncoder()
        # The encoder produces a serialized object, so we deserialized it for comparison

        serialized_metadata = encoder.encode(metadata)
        decoded_metadata = json.loads(serialized_metadata)
        metdata_schema = self.get_metadata_schema()
        if append_mode:
            # Eliminate required from NWBFile
            nwbfile_schema = metdata_schema["properties"]["NWBFile"]
            nwbfile_schema.pop("required", None)

        validate(instance=decoded_metadata, schema=metdata_schema)

    def get_conversion_options_schema(self) -> dict:
        """Infer the JSON schema for the conversion options from the method signature (annotation typing)."""
        return get_json_schema_from_method_signature(self.add_to_nwbfile, exclude=["nwbfile", "metadata"])

    def create_nwbfile(self, metadata: Optional[dict] = None, **conversion_options) -> NWBFile:
        """
        Create and return an in-memory pynwb.NWBFile object with this interface's data added to it.

        Parameters
        ----------
        metadata : dict, optional
            Metadata dictionary with information used to create the NWBFile.
        **conversion_options
            Additional keyword arguments to pass to the `.add_to_nwbfile` method.

        Returns
        -------
        nwbfile : pynwb.NWBFile
            The in-memory object with this interface's data added to it.
        """
        if metadata is None:
            metadata = self.get_metadata()

        nwbfile = make_nwbfile_from_metadata(metadata=metadata)
        self.add_to_nwbfile(nwbfile=nwbfile, metadata=metadata, **conversion_options)

        return nwbfile

    @abstractmethod
    def add_to_nwbfile(self, nwbfile: NWBFile, **conversion_options) -> None:
        """
        Define a protocol for mapping the data from this interface to NWB neurodata objects.

        These neurodata objects should also be added to the in-memory pynwb.NWBFile object in this step.

        Parameters
        ----------
        nwbfile : pynwb.NWBFile
            The in-memory object to add the data to.
        **conversion_options
            Additional keyword arguments to pass to the `.add_to_nwbfile` method.
        """
        raise NotImplementedError

    def run_conversion(
        self,
        nwbfile_path: FilePath,
        nwbfile: Optional[NWBFile] = None,
        metadata: Optional[dict] = None,
        overwrite: bool = False,
        backend: Optional[Literal["hdf5", "zarr"]] = None,
        backend_configuration: Optional[Union[HDF5BackendConfiguration, ZarrBackendConfiguration]] = None,
        **conversion_options,
    ):
        """
        Run the NWB conversion for the instantiated data interface.

        Parameters
        ----------
        nwbfile_path : FilePathType
            Path for where the data will be written or appended.
        nwbfile : NWBFile, optional
            An in-memory NWBFile object to write to the location.
        metadata : dict, optional
            Metadata dictionary with information used to create the NWBFile when one does not exist or overwrite=True.
        overwrite : bool, default: False
            Whether to overwrite the NWBFile if one exists at the nwbfile_path.
            The default is False (append mode).
        backend : {"hdf5", "zarr"}, optional
            The type of backend to use when writing the file.
            If a `backend_configuration` is not specified, the default type will be "hdf5".
            If a `backend_configuration` is specified, then the type will be auto-detected.
        backend_configuration : HDF5BackendConfiguration or ZarrBackendConfiguration, optional
            The configuration model to use when configuring the datasets for this backend.
            To customize, call the `.get_default_backend_configuration(...)` method, modify the returned
            BackendConfiguration object, and pass that instead.
            Otherwise, all datasets will use default configuration settings.
        """

        backend = _resolve_backend(backend, backend_configuration)
        no_nwbfile_provided = nwbfile is None  # Otherwise, variable reference may mutate later on inside the context

        if metadata is None:
            metadata = self.get_metadata()

        file_initially_exists = Path(nwbfile_path).exists() if nwbfile_path is not None else False
        append_mode = file_initially_exists and not overwrite

        self.validate_metadata(metadata=metadata, append_mode=append_mode)

        with make_or_load_nwbfile(
            nwbfile_path=nwbfile_path,
            nwbfile=nwbfile,
            metadata=metadata,
            overwrite=overwrite,
            backend=backend,
            verbose=getattr(self, "verbose", False),
        ) as nwbfile_out:
            if no_nwbfile_provided:
                self.add_to_nwbfile(nwbfile=nwbfile_out, metadata=metadata, **conversion_options)

            if backend_configuration is None:
                backend_configuration = self.get_default_backend_configuration(nwbfile=nwbfile_out, backend=backend)

            configure_backend(nwbfile=nwbfile_out, backend_configuration=backend_configuration)

    @staticmethod
    def get_default_backend_configuration(
        nwbfile: NWBFile,
        # TODO: when all H5DataIO prewraps are gone, introduce Zarr safely
        # backend: Union[Literal["hdf5", "zarr"]],
        backend: Literal["hdf5"] = "hdf5",
    ) -> Union[HDF5BackendConfiguration, ZarrBackendConfiguration]:
        """
        Fill and return a default backend configuration to serve as a starting point for further customization.

        Parameters
        ----------
        nwbfile : pynwb.NWBFile
            The in-memory object with this interface's data already added to it.
        backend : "hdf5", default: "hdf5"
            The type of backend to use when creating the file.
            Additional backend types will be added soon.

        Returns
        -------
        backend_configuration : HDF5BackendConfiguration or ZarrBackendConfiguration
            The default configuration for the specified backend type.
        """
        return get_default_backend_configuration(nwbfile=nwbfile, backend=backend)


```

### Imports 
We will use the following imports in constructing our class.
The notebook format is not really appropriate for creating 
a class, this is something you would likely want to do
as a python module. We are only using the notebook presentation
to allow you to easily follow along.

In [None]:
from typing import Optional
from neuroconv import BaseDataInterface
from pydantic import FilePath
from pydantic.validate_call_decorator import validate_call
from scipy.io import loadmat
from pynwb import NWBFile, TimeSeries

### Extend base data interface class
#### First attempt

In [None]:
class MatTTL(BaseDataInterface):
    """ My class to convert matlab files to TTL """
    

In [None]:
MatTTL(True) # This is intended to fail
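
The failure comes from Python's abstract base class machinery rather than from anything specific to NWB: a subclass that does not override every `@abstractmethod` cannot be instantiated. The toy sketch below reproduces this with stand-in classes. `Base`, `Incomplete` and `Complete` are hypothetical names, and a plain list stands in for the NWB file.

```python
from abc import ABC, abstractmethod


class Base(ABC):
    """Toy stand-in for an abstract interface."""

    @abstractmethod
    def add_to_nwbfile(self, nwbfile):
        raise NotImplementedError


class Incomplete(Base):
    """Does not override add_to_nwbfile, so it stays abstract."""


class Complete(Base):
    def add_to_nwbfile(self, nwbfile):
        nwbfile.append("data")  # a list stands in for the NWB file here


try:
    Incomplete()
    instantiated = True
except TypeError as error:
    instantiated = False
    print(f"TypeError: {error}")

target = []
Complete().add_to_nwbfile(target)
print(target)  # ['data']
```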

#### Second attempt
We need to add the `add_to_nwbfile` method to fix the previous error.

In [None]:
class MatTTL(BaseDataInterface):
    """ My class to convert matlab files to TTL """
    

    def add_to_nwbfile(self, nwbfile: NWBFile, metadata: Optional[dict], **conversion_options) -> None:
        """
        Define a protocol for mapping the data from this interface to NWB neurodata objects.

        These neurodata objects should also be added to the in-memory pynwb.NWBFile object in this step.

        Parameters
        ----------
        nwbfile : pynwb.NWBFile
            The in-memory object to add the data to.
        **conversion_options
            Additional keyword arguments to pass to the `.add_to_nwbfile` method.
        """
        ts = TimeSeries(name=self.name, 
                        data=self.data, 
                        unit="V", 
                        starting_time=self.starting_time, 
                        rate= self.rate)
        nwbfile.add_acquisition(ts)

In [None]:
MatTTL(verbose=True)

#### Third attempt
Trying to use the interface above will generate run-time errors as the variables we have used are not already available. We will fix that by adding a constructor to our class.

In [None]:


class MatTTL(BaseDataInterface):
    """ My class to convert matlab files to TTL """
    @validate_call
    def __init__(self,
                 file_path: FilePath,
                 verbose: bool = True
                 ):
        super().__init__(verbose,file_path=file_path)
        res = loadmat(file_path) # Read matlab file
        self.starting_time = 0.0 # Assume that the starting time is always the start time of the session
        self.name = res.get('label', ['TTLSignal'])[0]
        self.rate = float(res.get('freq', [[1000]])[0][0])
        self.data = res.get('data').reshape(-1)

    def add_to_nwbfile(self, nwbfile: NWBFile, metadata: Optional[dict], **conversion_options) -> None:
        ts = TimeSeries(name=self.name, 
                        data=self.data, 
                        unit="V", 
                        starting_time=self.starting_time, 
                        rate= self.rate)
        nwbfile.add_acquisition(ts)


In [None]:
MatTTL(file_path=data_dir/"test.mat", verbose=True)

#### Using our interface

In [None]:
# Import the class rather than the module so the name `datetime`
# keeps the same meaning as in the earlier cells
from datetime import datetime
from zoneinfo import ZoneInfo
mat_file = data_dir/"test.mat"
mat_nwb_file = data_dir /"test.nwb"
mat_interface = MatTTL(file_path=mat_file, verbose=True)
metadata = mat_interface.get_metadata()
metadata['NWBFile']['session_start_time'] = datetime.now(tz=ZoneInfo("Europe/Paris"))
mat_interface.run_conversion(mat_nwb_file, metadata= metadata)

#### Reading the generated file.

In [None]:
from pynwb import NWBHDF5IO
fin = NWBHDF5IO(mat_nwb_file, mode = 'r')
mat_nwb = fin.read()
mat_nwb

In [None]:
fin.close()

We can extend this by implementing other methods of the base class, for example the `get_metadata()` method.

## Pynapple
[Pynapple](https://pynapple.org/) is a python package aiming to make timeseries analysis easier.
In the documentation you can find instructions for working with the following concepts.
- Timeseries
- Perievent
- Correlation
- Tuning curves
- Spectrogram
- Filtering

In this workshop we will only take a cursory look at a small fraction of this functionality.

#### Create some data series
We start by creating some data series that we will 
use down the line to show some of the pynapple functionality

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from pynwb import TimeSeries, NWBHDF5IO, NWBFile
from pathlib import Path
data_dir = Path("../data")
prob1b_nwb_file = data_dir / "problem_1b.nwb"

t = 200.0 # We will make our longest time series to be 200 seconds
# Sample rates in Hz
f1 = 48000.0 
f2 = 96000.0
f3 = 10000.0
# Time series starts
s1 = 10.0
s3 = s2 = 0.0
# The time series objects
y1 = TimeSeries("sin48k",
                data=np.sin(np.linspace(s1,t,num=int((t-s1)*f1))),
                unit = "V",
                starting_time=s1,
                rate=f1)


y2 = TimeSeries("sin96k",
                data=np.sin(np.linspace(s2,t,num=int((t-s2)*f2))),
                unit = "V",
                starting_time=s2,
                rate=f2)
              
t3 = 50
tpoints = np.linspace(0,t3,num=int(t3*f3))
y3 = TimeSeries("composite",
                data= np.sin(2*np.pi*5*tpoints)+np.sin(2*np.pi*50*tpoints)+np.sin(2*np.pi*1000*tpoints),
                rate=f3,
                unit="V",
                starting_time=0.0)

# Put all the time series in a list
my_timeseries = [y1, y2, y3]
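
A TimeSeries defined by `starting_time` and `rate` does not store timestamps; they are implied as `starting_time + i / rate`. The sketch below, with a made-up sample count, compares these implied timestamps with the points produced by `np.linspace`. Because `np.linspace` includes the end point by default, its step is marginally larger than `1 / rate`. The difference is negligible for this workshop, but worth knowing when exact sample times matter.

```python
import numpy as np

rate = 10000.0        # Hz, same value as f3 above
starting_time = 0.0   # seconds
n_samples = 50        # kept tiny for the illustration

# Timestamps implied by (starting_time, rate): t_i = starting_time + i / rate
implied = starting_time + np.arange(n_samples) / rate

# np.linspace includes the end point by default, so its step is slightly larger
end = starting_time + n_samples / rate
closed = np.linspace(starting_time, end, num=n_samples)
half_open = np.linspace(starting_time, end, num=n_samples, endpoint=False)

print(implied[1] - implied[0])   # step of the implied timestamps
print(closed[1] - closed[0])     # step when the end point is included
```

Passing `endpoint=False` to `np.linspace` reproduces the implied timestamps.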

In [None]:
with NWBHDF5IO(prob1b_nwb_file, mode = 'a') as fio:
    prob1b = fio.read() # Read nwb file from prob1a
    for tseries in my_timeseries:
        prob1b.add_acquisition(tseries)
    fio.write(prob1b) # Export the appended file

### Read the NWB file with Pynapple
Before running the next code cell, restart the kernel. This reduces memory usage, which matters on Binder where only 2 GB of RAM are available.

In [None]:
import pynapple as nap
from pathlib import Path
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
data_dir = Path("../data")
csv_file_path = data_dir / "mydata.csv" # Path to the CSV data file
prob1b_nwb_file = data_dir / "problem_1b.nwb"
nwb = nap.load_file(prob1b_nwb_file)
print(nwb)

### Pynapple objects
Pynapple provides objects that make it easier to work with various types of time series.
#### Timeseries
1. 1d timeseries (Tsd)
2. 2d timeseries aka timeframes (TsdFrame)
3. n-dimensional timeseries (TsdTensor)

It also provides auxiliary objects such as time intervals (IntervalSet), timestamps (Ts), and a way to group several timestamps or 1D time series (TsGroup).

In [None]:
nwb["sin96k"]

In [None]:
nwb["TwoPhotonSeries"]

### Interval set
We will create the intervals we care about. Note that pynapple has a bug that
makes the displayed data look wrong. This is only a pretty-printer error, which
will hopefully be fixed soon.

In [None]:
interval_table = pd.read_csv(csv_file_path) # Read the CSV file only once
intervals = nap.IntervalSet(start=interval_table.start_time,
                            end=interval_table.end_time)
intervals

In [None]:
intervals[16] # The stored data are correct; only the representation above is wrong

### Time support 
All time series have a time support property. You can also set the time support using the restrict method.

In [None]:
nwb["sin96k"].time_support

In [None]:
nwb["TwoPhotonSeries"].time_support

### Restrict functionality
We can restrict a time series object to a single interval or to an IntervalSet.

In [None]:
nwb["sin96k"].restrict(intervals[16]) 

In [None]:
nwb["TwoPhotonSeries"].restrict(intervals)
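
Conceptually, restricting keeps only the samples whose timestamps fall inside one of the intervals. The plain NumPy sketch below illustrates this with made-up timestamps and intervals. It is not pynapple's implementation, and treating both interval ends as inclusive is an assumption of the sketch.

```python
import numpy as np

t = np.arange(20) * 0.5          # made-up timestamps: 0.0, 0.5, ..., 9.5 s
values = np.sin(t)               # made-up signal

starts = np.array([1.0, 6.0])    # made-up interval starts (s)
ends = np.array([2.0, 7.0])      # made-up interval ends (s)

# Mark every sample that lies inside at least one interval
mask = np.zeros(t.shape, dtype=bool)
for start, end in zip(starts, ends):
    mask |= (t >= start) & (t <= end)

restricted_t = t[mask]
restricted_values = values[mask]
print(restricted_t)
```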

#### Bin Count and Bin averaging
We can count the samples in each bin or average them per bin. The resulting timestamps are the bin centers.

In [None]:
nwb["TwoPhotonSeries"].count(ep=intervals)

In [None]:
ycur = nwb["sin96k"]
ycur.count(1050,time_units='ms') # Count in bins of 1050ms

In [None]:
ycur.bin_average(np.pi) # The bin averages are about ±2/pi, as expected for sin(t) over half a period

In [None]:
ycur.bin_average(1,ep=intervals[0:5]) # Average in 1-second bins, restricted to intervals[0:5]
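
The ±2/pi value can be checked by hand: the integral of sin(t) over half a period is 2, so the mean over a bin of width pi is 2/pi, with alternating sign. The NumPy sketch below reproduces bin counting and bin averaging on a made-up sine sampled at 1 kHz. It only illustrates the idea and is not how pynapple computes it.

```python
import numpy as np

rate = 1000.0                                  # Hz, made up for the sketch
t = np.arange(int(4 * np.pi * rate)) / rate    # timestamps covering [0, 4*pi)
y = np.sin(t)

bin_size = np.pi
edges = np.arange(5) * bin_size                # 0, pi, 2*pi, 3*pi, 4*pi
bin_index = np.digitize(t, edges) - 1          # bin that each sample falls in

counts = np.bincount(bin_index, minlength=4)   # samples per bin
means = np.bincount(bin_index, weights=y, minlength=4) / counts
centers = edges[:-1] + bin_size / 2            # bin centers, used as timestamps

print(counts)
print(means, 2 / np.pi)
```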

### Threshold signals

In [None]:
from matplotlib import pyplot as plt

In [None]:
yp = ycur.restrict(nap.IntervalSet(start=[0],end=[8*np.pi]))
plt.plot(yp)
plt.plot(yp.threshold(0))
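
Thresholding keeps the samples above the threshold and, with them, the epochs during which the signal stays above it. The NumPy sketch below finds those epochs for a made-up sine by locating where the boolean mask switches on and off. It is an illustration of the idea rather than pynapple's own code.

```python
import numpy as np

rate = 100.0                                   # Hz, made up for the sketch
t = np.arange(int(4 * np.pi * rate)) / rate    # two periods of sin(t)
y = np.sin(t)

above = y > 0.0                                # samples above the threshold

# Pad with False so that runs touching either end are detected as well
padded = np.concatenate(([False], above, [False]))
changes = np.diff(padded.astype(int))
start_idx = np.flatnonzero(changes == 1)       # first sample of each run
end_idx = np.flatnonzero(changes == -1) - 1    # last sample of each run

epochs = [(t[s], t[e]) for s, e in zip(start_idx, end_idx)]
print(epochs)
```

For sin(t) the signal is positive on (0, pi) and (2*pi, 3*pi), so two epochs are expected.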

### Use numpy function
We can apply NumPy functions directly, as shown in the example
below. Note that the result is a TsdFrame.

In [None]:
yp = np.mean(nwb["TwoPhotonSeries"],1)
print(type(yp))
yp

In [None]:
plt.plot(yp[:,0:4]) # Convenience in plotting timeseries items (plot the first 4 columns)
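
The reason a 2D object comes back can be seen on a plain array. Assuming the imaging data is laid out as (time, rows, columns), which is an assumption about this dataset, averaging over axis 1 removes the rows axis and leaves one value per time point and column. The array below is made up for the illustration.

```python
import numpy as np

# Made-up stack of 2 frames with 3 rows and 4 columns: (time, rows, columns)
frames = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)

row_mean = np.mean(frames, 1)    # average over axis 1 (the rows)
print(frames.shape, "->", row_mean.shape)
```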

### Autocorrelation functions
Pynapple provides a function to compute autocorrelograms.

In [None]:
ts_group=nap.TsGroup({0:nwb['sin96k'], 1:nwb['sin48k']},time_support=nap.IntervalSet(start=0,end=15))
print(ts_group)

In [None]:
ts_group=nap.TsGroup({0:nwb['sin96k'].bin_average(2e-3), 1:nwb['sin48k'].bin_average(1e-3)},time_support=nap.IntervalSet(start=4*np.pi,end=6*np.pi))
print(ts_group)

In [None]:
autocorrs = nap.compute_autocorrelogram(group=ts_group,binsize=np.pi/16, windowsize=4*np.pi)

In [None]:
plt.plot(autocorrs)
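
An autocorrelogram of timestamps is essentially a histogram of the time differences between pairs of events. The sketch below builds an un-normalized version with NumPy for made-up, perfectly regular events. Pynapple applies its own normalization, so the numbers are not directly comparable with the plot above.

```python
import numpy as np

timestamps = np.arange(20) * 0.5   # 20 made-up events, one every 0.5 s
bin_size = 0.5                     # seconds
window = 1.0                       # look at lags within +/- 1 s

# All pairwise time differences, without pairing an event with itself
lags = timestamps[:, None] - timestamps[None, :]
lags = lags[~np.eye(len(timestamps), dtype=bool)]

# Bins centered on the lags -1.0, -0.5, 0.0, 0.5 and 1.0
edges = np.arange(-window - bin_size / 2, window + bin_size, bin_size)
counts, _ = np.histogram(lags, bins=edges)
centers = edges[:-1] + bin_size / 2

print(centers)
print(counts)
```

Regular events produce peaks at multiples of the event period and an empty bin at lag 0, since self-pairs are excluded.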

In [None]:
%reset -f out array

### Signal processing functions
Pynapple provides functions for computing power spectral densities and for filtering.

In [None]:
psd = nap.compute_power_spectral_density(nwb['composite']) # Power spectral density
plt.plot(psd)
plt.xlim(0,1100)
plt.xlabel("Frequency in Hz")
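
The three peaks in the plot correspond to the three sine components of the composite signal. As a cross-check, the sketch below rebuilds a one-second version of that signal with NumPy and locates the strongest frequencies with a plain FFT. The scaling of `power` is a simple choice made for this sketch and may differ from the one pynapple uses.

```python
import numpy as np

fs = 10000.0                      # Hz, same as f3 above
n = 10000                         # one second of data gives 1 Hz resolution
t = np.arange(n) / fs
x = (np.sin(2 * np.pi * 5 * t)
     + np.sin(2 * np.pi * 50 * t)
     + np.sin(2 * np.pi * 1000 * t))

spectrum = np.fft.rfft(x)                 # FFT of a real signal
freqs = np.fft.rfftfreq(n, d=1 / fs)      # matching frequency axis in Hz
power = np.abs(spectrum) ** 2 / n         # simple, un-calibrated power estimate

# Frequencies of the three strongest components
peaks = np.sort(freqs[np.argsort(power)[-3:]])
print(peaks)
```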

In [None]:
signal_50hz = nap.apply_bandpass_filter(nwb['composite'],(20,80), mode='sinc',transition_bandwidth=1e-4) # Apply a 20-80 Hz band-pass filter
plt.plot(signal_50hz)
plt.plot(nap.Tsd(t=np.linspace(0,2/50,100),d=np.sin(2*np.pi*50*np.linspace(0,2/50,100))))
plt.plot(nwb['composite'])
plt.xlim(0,2/50)

In [None]:
bpass_filter = nap.get_filter_frequency_response((40,60), 10000, filter_type="bandpass", 
                                           mode="sinc",transition_bandwidth=1e-4)

In [None]:
plt.plot(bpass_filter)
plt.xlim(0,1000)
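
To get a feeling for what a windowed-sinc band-pass filter looks like, the sketch below builds one by hand with NumPy and evaluates its gain at the three frequencies of the composite signal. The filter length and the Blackman window are assumptions of this sketch, so the exact shape will differ from the response plotted above.

```python
import numpy as np

fs = 10000.0            # sampling rate in Hz, as in the cell above
low, high = 40.0, 60.0  # pass band in Hz, as in the cell above
n_taps = 4001           # odd filter length (assumed; longer gives sharper edges)


def windowed_sinc_lowpass(cutoff_hz: float) -> np.ndarray:
    """Low-pass FIR kernel: ideal sinc response tapered by a Blackman window."""
    n = np.arange(n_taps) - (n_taps - 1) / 2
    kernel = np.sinc(2 * cutoff_hz / fs * n) * np.blackman(n_taps)
    return kernel / kernel.sum()  # unit gain at 0 Hz


# Band-pass = (low-pass at the upper edge) - (low-pass at the lower edge)
bandpass = windowed_sinc_lowpass(high) - windowed_sinc_lowpass(low)

# Magnitude response on a 1 Hz grid (zero-padded FFT of the kernel)
response = np.abs(np.fft.rfft(bandpass, n=int(fs)))
freqs = np.fft.rfftfreq(int(fs), d=1 / fs)

for f in (5, 50, 1000):
    print(f"{f} Hz gain: {response[f]:.3f}")
```

The gain should be close to 1 at 50 Hz and close to 0 at 5 Hz and 1000 Hz, which is why only the 50 Hz component survives the filtering.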

In [None]:
nwb.close() # Close nwb file

## Other resources

OpenScope Databook (Allen Institute) : https://alleninstitute.github.io/openscope_databook/