# Prerequisites

If you were running this on your own machine, you would have
to install Jupyter and the neuroconv package, and download the data.

For the workshop, however, these steps have already been done
for you, so you don't need to install anything.

# NWB workshop

## Converting to NWB with neuroconv

We will deal with three problems that we think arise often.
The first is that you have some type of data that you would like
to convert to NWB format. The second is that you already
have an NWB file and some data you would like to add to it.
The third is that you have an NWB file and you would like to
extract some data from it.


### Problem 1a
We will start with the simplest scenario: a single source of data.
The data come from a csv file, and to convert them to NWB
we will use the `CsvTimeIntervalsInterface`.

In [2]:
# Import requirement for the conversion
from neuroconv.datainterfaces import CsvTimeIntervalsInterface
from datetime import datetime
from zoneinfo import ZoneInfo
from pathlib import Path

#### Create CSV file if not available

In [21]:
import random
import pandas as pd
data_dir = Path("../data") # Data path
data_dir.mkdir(exist_ok=True) # Create data dir
csv_file_path = data_dir / "mydata.csv" # Form file data path
random.seed(42) # set seed
if not csv_file_path.exists():
    # Create csv file if it doesn't exist
    n = 100
    starts  = [-1]*n
    ends  = [-1]*n
    vals = [-1]*n
    s0 = 0
    for i in range(n):
        starts[i]= s0 + random.randrange(5)
        ends[i] = starts[i] + random.randrange(1,50)
        vals[i] = random.uniform(0,1)
        s0 = ends[i]


    pd.DataFrame({'start_time':starts, 'end_time':ends, 'value':vals }).to_csv(csv_file_path,index=False)
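
The loop above starts each interval at or after the end of the previous one, so the intervals never overlap. A minimal sketch that repeats the same generation logic in memory and checks this; the helper `make_intervals` is hypothetical and not part of the notebook, and it uses its own `random.Random` instance rather than the global seed:

```python
import random


def make_intervals(n, seed=42):
    """Hypothetical helper mirroring the loop above; returns (starts, ends, vals)."""
    rng = random.Random(seed)
    starts, ends, vals = [], [], []
    previous_end = 0
    for _ in range(n):
        start = previous_end + rng.randrange(5)  # gap of 0 to 4 after the previous interval
        end = start + rng.randrange(1, 50)       # duration of 1 to 49
        starts.append(start)
        ends.append(end)
        vals.append(rng.uniform(0, 1))
        previous_end = end
    return starts, ends, vals


starts, ends, vals = make_intervals(100)

# Every interval has a positive duration
positive_duration = all(e > s for s, e in zip(starts, ends))
# Each interval starts at or after the end of the previous one
non_overlapping = all(s_next >= e_prev for s_next, e_prev in zip(starts[1:], ends))
print(positive_duration, non_overlapping)
```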

#### Create interface

In [None]:
csv_interface = CsvTimeIntervalsInterface(file_path=csv_file_path, verbose=False)

#### Get metadata

In [None]:
metadata = csv_interface.get_metadata()
metadata

#### Attempt to create NWB file

In [None]:
prob1a_nwb_file = data_dir/"problem_1a.nwb"
csv_interface.run_conversion(nwbfile_path = prob1a_nwb_file)

#### Add missing metadata

In [None]:
session_start_time = datetime(2025,1,10,11,45,0, tzinfo=ZoneInfo("Europe/Paris"))
metadata["NWBFile"]["session_start_time"] = session_start_time
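
The session start time above carries a time zone, which makes the timestamp unambiguous. A small standard-library sketch of the difference between a naive and a time-zone-aware `datetime`; the fallback to a fixed UTC+1 offset is an assumption added here so the example also runs on systems without a time zone database:

```python
from datetime import datetime, timedelta, timezone

try:
    from zoneinfo import ZoneInfo
    paris = ZoneInfo("Europe/Paris")
except Exception:
    # Fallback (assumption): fixed UTC+1 offset if no time zone database is installed
    paris = timezone(timedelta(hours=1))

naive = datetime(2025, 1, 10, 11, 45, 0)
aware = datetime(2025, 1, 10, 11, 45, 0, tzinfo=paris)

print(naive.tzinfo)       # None: no time zone attached
print(aware.utcoffset())  # 1:00:00 (Paris is UTC+1 in January)
print(aware.isoformat())  # 2025-01-10T11:45:00+01:00
```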

#### Create NWB file

In [None]:
csv_interface.run_conversion(nwbfile_path = prob1a_nwb_file, metadata=metadata)

#### Look at the generated NWB file

In [38]:
from pynwb import NWBHDF5IO
import pynwb

In [None]:
# Ideally we would use a with statement 
# as in the commented code
#
## with NWBHDF5IO(prob1a_nwb_file, mode = 'r') as io:
##    nwbfile = io.read()
##    nwbfile
# we instead use the code below to be able to
# see a nice version of the file.
io = NWBHDF5IO(prob1a_nwb_file, mode = 'r')
nwbfile = io.read()
nwbfile

In [None]:
io.close() # Don't forget to close the file
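
One general point to keep in mind when choosing between the two styles: an object tied to a file handle can no longer be read once the handle is closed. The sketch below illustrates this with a plain text file from the standard library, not with an NWB file, so it is only an analogy for what happens with `NWBHDF5IO`:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")
    with open(path, "w") as f:
        f.write("hello")

    # The with statement closes the file as soon as the block ends
    with open(path) as f:
        handle = f
    closed_after_with = handle.closed

    # Reading through the closed handle fails
    try:
        handle.read()
        read_error = None
    except ValueError as err:
        read_error = str(err)

print(closed_after_with)
print(read_error)
```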

#### Use nwbwidgets to look at the file
This is just a taste of how nwbwidgets works; we will take a closer look at it later on.

In [None]:
from nwbwidgets import nwb2widget

io = NWBHDF5IO(prob1a_nwb_file, mode = 'r')
nwbfile = io.read()
nwb2widget(nwbfile)

In [None]:
io.close()

### Problem 1b
We again want to generate an NWB file from already available data.
This time, in addition to the csv file, we have some tiff images that
should also go into the generated NWB file.

#### Download tiff file

In [None]:
%%bash
# Download movie file if not already available
if [[ ! -e "../data/demoMovie.tif" ]]; then
   wget https://github.com/flatironinstitute/CaImAn/raw/refs/heads/main/example_movies/demoMovie.tif -O ../data/demoMovie.tif
fi        

In [None]:
from neuroconv.datainterfaces import TiffImagingInterface
from neuroconv import NWBConverter
movie_path = data_dir / 'demoMovie.tif'
prob1b_nwb_file = data_dir / 'problem_1b.nwb'


class MyConverter(NWBConverter):
    data_interface_classes = dict (
        csvIntervals = CsvTimeIntervalsInterface,
        movieRecording = TiffImagingInterface )

sourceData = dict(
      csvIntervals = dict(file_path=csv_file_path),
      movieRecording = dict(file_path=movie_path, sampling_frequency=15.0))

dual_converter = MyConverter(sourceData)

metadata = dual_converter.get_metadata()
metadata

In [None]:
metadata["NWBFile"]["session_start_time"] = session_start_time
dual_converter.run_conversion(metadata=metadata, nwbfile_path=prob1b_nwb_file)

### Problem 2
In this problem we have an NWB file with some data (the file we
created in problem 1a) and we have acquired some new data (the tiff
file from problem 1b). We want to have all the data in a single file.
We will use two approaches:
1. Append the data to an existing NWB file on disk.
2. Create a new NWB file in memory and save it.

#### Create an appropriate interface
First we create an appropriate interface

In [None]:
tiff_interface = TiffImagingInterface(file_path=movie_path, sampling_frequency=15.0)

#### Append data to an existing NWB file
We copy the file we created in problem 1a and append to the copy, so the original file stays unchanged.

In [None]:
# First we create a copy of the file we created in  problem 1a
import shutil
prob2a_nwb_file = data_dir / "problem_2a.nwb"

shutil.copyfile(prob1a_nwb_file, prob2a_nwb_file)

In [None]:
tiff_interface.run_conversion(prob2a_nwb_file)
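
The call above appends because the target file already exists and `overwrite` keeps its default of `False`. The sketch below copies the two lines of `run_conversion` that make this decision (they appear in the `BaseDataInterface` source quoted later in this document) into a hypothetical helper `is_append_mode`, and uses empty placeholder files rather than real NWB files:

```python
import tempfile
from pathlib import Path


def is_append_mode(nwbfile_path, overwrite=False):
    """Hypothetical helper mirroring the append/create decision in run_conversion."""
    file_initially_exists = Path(nwbfile_path).exists() if nwbfile_path is not None else False
    return file_initially_exists and not overwrite


with tempfile.TemporaryDirectory() as tmp_dir:
    existing = Path(tmp_dir) / "existing.nwb"
    existing.touch()  # empty stand-in for a file such as problem_2a.nwb
    missing = Path(tmp_dir) / "missing.nwb"

    results = {
        "existing, overwrite=False": is_append_mode(existing),
        "existing, overwrite=True": is_append_mode(existing, overwrite=True),
        "missing, overwrite=False": is_append_mode(missing),
    }

print(results)
```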

#### Create an NWB file in memory and save it

In [None]:
prob2b_nwb_file = data_dir / "problem_2b.nwb"
with NWBHDF5IO(prob1a_nwb_file, mode = 'r') as fin, NWBHDF5IO(prob2b_nwb_file, mode = 'w' ) as fout:
    prob1a = fin.read() # Read nwb file from prob1a
    tiff_interface.add_to_nwbfile(prob1a) # Add the photon information to prob1a, modifies in place
    fout.export(fin, nwbfile=prob1a) # Export the new file

### Problem 3

In this problem we look at the scenario where we already have an NWB file,
but we would like to remove some information and save the result as a new NWB file.
We will start with the NWB file we created in problem 1b and remove the TwoPhotonSeries
from acquisition. Note that you can pop items only from LabelledDict containers.

In [None]:
fin = NWBHDF5IO(prob1b_nwb_file, mode = 'r')
prob1b = fin.read()
prob1b

In [None]:
type(prob1b.acquisition)

In [None]:
two_photon = prob1b.acquisition.pop('TwoPhotonSeries')
prob1b
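
The `pop` call above both removes the item from the container and returns it. A minimal sketch with a plain `dict` as a stand-in for `prob1b.acquisition`; that `LabelledDict` follows the same semantics for `pop` is an assumption here, and the stored value is only a placeholder string:

```python
# Plain dict used as a stand-in for prob1b.acquisition (assumption)
acquisition = {"TwoPhotonSeries": "imaging data placeholder"}

removed = acquisition.pop("TwoPhotonSeries")  # returns the removed value
still_present = "TwoPhotonSeries" in acquisition
print(removed)
print(still_present)

# Popping the same key a second time raises KeyError
try:
    acquisition.pop("TwoPhotonSeries")
    second_pop_failed = False
except KeyError:
    second_pop_failed = True
print(second_pop_failed)
```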

In [None]:
prob3_nwb_file = data_dir / "problem_3.nwb"
with NWBHDF5IO(prob3_nwb_file, mode = 'w' ) as fout:
    fout.export(fin, nwbfile=prob1b) 

In [None]:
fin.close()

## NWBwidgets
A closer look at nwbwidgets. We will look at a file from the DANDI archive. Select DANDI using the radio button, then select dandiset 4. From that dandiset we will look at the NWB file for sub-P27CS.

In [None]:
from nwbwidgets.panel import Panel
Panel()

# Writing your own neuroconv interface
We will take a look at how to write a simple neuroconv interface.
Let's assume we have some TTL signals that we have saved in a MATLAB file.
We would like to create an interface to convert such files to the NWB format.

## Create a mat file with the data to test our code
We will create a mat file with some random data. The file will also include
a label and a frequency.

In [26]:
import numpy as np
from scipy.io import savemat
data = np.outer(
    np.random.choice(a=[0,1], p=[0.8, 0.2],replace=True, size=100), 
    np.ones(10)).reshape(-1)
matdict = {'data': data, 'freq': 1000, 'label':'TTLStrobe'}
savemat(data_dir/"test.mat", matdict)
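
The `np.outer(...).reshape(-1)` expression above stretches each random 0/1 value into a run of 10 identical samples, which gives the signal its pulse-like shape. A small sketch with only 5 values; the seeded generator `rng` is an addition for reproducibility and is not used in the notebook:

```python
import numpy as np

rng = np.random.default_rng(0)
bits = rng.choice([0, 1], p=[0.8, 0.2], size=5)  # one random level per pulse

# Outer product with a vector of ones: row i holds bits[i] repeated 10 times
stretched = np.outer(bits, np.ones(10)).reshape(-1)

# np.repeat produces the same samples (as integers instead of floats)
same_as_repeat = np.array_equal(stretched, np.repeat(bits, 10))
print(stretched.shape)  # (50,)
print(same_as_repeat)   # True
```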

In [38]:
from scipy.io import loadmat
res = loadmat(data_dir/"test.mat")
# res['freq'][0][0]  # the frequency
res['label'][0]      # the label

'TTLStrobe'
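
The indexing `res['freq'][0][0]` and `res['label'][0]` is needed because `loadmat` returns MATLAB-style arrays: with the default settings a scalar comes back as a 1x1 array and a 1-D array as a row vector. A sketch that round-trips a small dictionary through an in-memory buffer (the buffer and the 20-sample array are choices made for this example only):

```python
import io

import numpy as np
from scipy.io import loadmat, savemat

buffer = io.BytesIO()  # in-memory file so nothing is written to disk
savemat(buffer, {"data": np.zeros(20), "freq": 1000, "label": "TTLStrobe"})
buffer.seek(0)
res = loadmat(buffer)

print(res["freq"].shape)  # (1, 1): scalars come back as 1x1 arrays
print(res["data"].shape)  # (1, 20): 1-D arrays come back as row vectors

freq = float(res["freq"][0][0])
label = str(res["label"][0])
data = res["data"].ravel()  # back to a 1-D array
```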

## Looking at BaseDataInterface
You can see that it is an abstract base class and that we need to override the `add_to_nwbfile`
and `__init__` methods of the BaseDataInterface.

```
class BaseDataInterface(ABC):
    """Abstract class defining the structure of all DataInterfaces."""

    display_name: Union[str, None] = None
    keywords: tuple[str] = tuple()
    associated_suffixes: tuple[str] = tuple()
    info: Union[str, None] = None

    @classmethod
    def get_source_schema(cls) -> dict:
        """Infer the JSON schema for the source_data from the method signature (annotation typing)."""
        return get_json_schema_from_method_signature(cls, exclude=["source_data"])

    @classmethod
    def validate_source(cls, source_data: dict, verbose: bool = False):
        """Validate source_data against Converter source_schema."""
        cls._validate_source_data(source_data=source_data, verbose=verbose)

    def _validate_source_data(self, source_data: dict, verbose: bool = False):

        encoder = _NWBSourceDataEncoder()
        # The encoder produces a serialized object, so we deserialized it for comparison

        serialized_source_data = encoder.encode(source_data)
        decoded_source_data = json.loads(serialized_source_data)
        source_schema = self.get_source_schema()
        validate(instance=decoded_source_data, schema=source_schema)
        if verbose:
            print("Source data is valid!")

    @validate_call
    def __init__(self, verbose: bool = False, **source_data):
        self.verbose = verbose
        self.source_data = source_data

        self._validate_source_data(source_data=source_data, verbose=verbose)

    def get_metadata_schema(self) -> dict:
        """Retrieve JSON schema for metadata."""
        metadata_schema = load_dict_from_file(Path(__file__).parent / "schemas" / "base_metadata_schema.json")
        return metadata_schema

    def get_metadata(self) -> DeepDict:
        """Child DataInterface classes should override this to match their metadata."""
        metadata = DeepDict()
        metadata["NWBFile"]["session_description"] = ""
        metadata["NWBFile"]["identifier"] = str(uuid.uuid4())

        # Add NeuroConv watermark (overridden if going through the GUIDE)
        neuroconv_version = importlib.metadata.version("neuroconv")
        metadata["NWBFile"]["source_script"] = f"Created using NeuroConv v{neuroconv_version}"
        metadata["NWBFile"]["source_script_file_name"] = __file__  # Required for validation

        return metadata

    def validate_metadata(self, metadata: dict, append_mode: bool = False) -> None:
        """Validate the metadata against the schema."""
        encoder = _NWBMetaDataEncoder()
        # The encoder produces a serialized object, so we deserialized it for comparison

        serialized_metadata = encoder.encode(metadata)
        decoded_metadata = json.loads(serialized_metadata)
        metdata_schema = self.get_metadata_schema()
        if append_mode:
            # Eliminate required from NWBFile
            nwbfile_schema = metdata_schema["properties"]["NWBFile"]
            nwbfile_schema.pop("required", None)

        validate(instance=decoded_metadata, schema=metdata_schema)

    def get_conversion_options_schema(self) -> dict:
        """Infer the JSON schema for the conversion options from the method signature (annotation typing)."""
        return get_json_schema_from_method_signature(self.add_to_nwbfile, exclude=["nwbfile", "metadata"])

    def create_nwbfile(self, metadata: Optional[dict] = None, **conversion_options) -> NWBFile:
        """
        Create and return an in-memory pynwb.NWBFile object with this interface's data added to it.

        Parameters
        ----------
        metadata : dict, optional
            Metadata dictionary with information used to create the NWBFile.
        **conversion_options
            Additional keyword arguments to pass to the `.add_to_nwbfile` method.

        Returns
        -------
        nwbfile : pynwb.NWBFile
            The in-memory object with this interface's data added to it.
        """
        if metadata is None:
            metadata = self.get_metadata()

        nwbfile = make_nwbfile_from_metadata(metadata=metadata)
        self.add_to_nwbfile(nwbfile=nwbfile, metadata=metadata, **conversion_options)

        return nwbfile

    @abstractmethod
    def add_to_nwbfile(self, nwbfile: NWBFile, **conversion_options) -> None:
        """
        Define a protocol for mapping the data from this interface to NWB neurodata objects.

        These neurodata objects should also be added to the in-memory pynwb.NWBFile object in this step.

        Parameters
        ----------
        nwbfile : pynwb.NWBFile
            The in-memory object to add the data to.
        **conversion_options
            Additional keyword arguments to pass to the `.add_to_nwbfile` method.
        """
        raise NotImplementedError

    def run_conversion(
        self,
        nwbfile_path: FilePath,
        nwbfile: Optional[NWBFile] = None,
        metadata: Optional[dict] = None,
        overwrite: bool = False,
        backend: Optional[Literal["hdf5", "zarr"]] = None,
        backend_configuration: Optional[Union[HDF5BackendConfiguration, ZarrBackendConfiguration]] = None,
        **conversion_options,
    ):
        """
        Run the NWB conversion for the instantiated data interface.

        Parameters
        ----------
        nwbfile_path : FilePathType
            Path for where the data will be written or appended.
        nwbfile : NWBFile, optional
            An in-memory NWBFile object to write to the location.
        metadata : dict, optional
            Metadata dictionary with information used to create the NWBFile when one does not exist or overwrite=True.
        overwrite : bool, default: False
            Whether to overwrite the NWBFile if one exists at the nwbfile_path.
            The default is False (append mode).
        backend : {"hdf5", "zarr"}, optional
            The type of backend to use when writing the file.
            If a `backend_configuration` is not specified, the default type will be "hdf5".
            If a `backend_configuration` is specified, then the type will be auto-detected.
        backend_configuration : HDF5BackendConfiguration or ZarrBackendConfiguration, optional
            The configuration model to use when configuring the datasets for this backend.
            To customize, call the `.get_default_backend_configuration(...)` method, modify the returned
            BackendConfiguration object, and pass that instead.
            Otherwise, all datasets will use default configuration settings.
        """

        backend = _resolve_backend(backend, backend_configuration)
        no_nwbfile_provided = nwbfile is None  # Otherwise, variable reference may mutate later on inside the context

        if metadata is None:
            metadata = self.get_metadata()

        file_initially_exists = Path(nwbfile_path).exists() if nwbfile_path is not None else False
        append_mode = file_initially_exists and not overwrite

        self.validate_metadata(metadata=metadata, append_mode=append_mode)

        with make_or_load_nwbfile(
            nwbfile_path=nwbfile_path,
            nwbfile=nwbfile,
            metadata=metadata,
            overwrite=overwrite,
            backend=backend,
            verbose=getattr(self, "verbose", False),
        ) as nwbfile_out:
            if no_nwbfile_provided:
                self.add_to_nwbfile(nwbfile=nwbfile_out, metadata=metadata, **conversion_options)

            if backend_configuration is None:
                backend_configuration = self.get_default_backend_configuration(nwbfile=nwbfile_out, backend=backend)

            configure_backend(nwbfile=nwbfile_out, backend_configuration=backend_configuration)

    @staticmethod
    def get_default_backend_configuration(
        nwbfile: NWBFile,
        # TODO: when all H5DataIO prewraps are gone, introduce Zarr safely
        # backend: Union[Literal["hdf5", "zarr"]],
        backend: Literal["hdf5"] = "hdf5",
    ) -> Union[HDF5BackendConfiguration, ZarrBackendConfiguration]:
        """
        Fill and return a default backend configuration to serve as a starting point for further customization.

        Parameters
        ----------
        nwbfile : pynwb.NWBFile
            The in-memory object with this interface's data already added to it.
        backend : "hdf5", default: "hdf5"
            The type of backend to use when creating the file.
            Additional backend types will be added soon.

        Returns
        -------
        backend_configuration : HDF5BackendConfiguration or ZarrBackendConfiguration
            The default configuration for the specified backend type.
        """
        return get_default_backend_configuration(nwbfile=nwbfile, backend=backend)


```

## Imports 
We will use the following imports in constructing our class.
The notebook format is not really appropriate for creating
a class; this is something you would likely want to do
in a Python module. We are only using the notebook here
for ease of use and convenience.

In [1]:
from typing import Optional
from neuroconv import BaseDataInterface
from pydantic import FilePath
from pydantic import validate_call
from scipy.io import loadmat
from pynwb import NWBFile, TimeSeries

## Extend base data interface class

In [8]:
class MatTTL(BaseDataInterface):
    """My class to convert TTL signals stored in MATLAB files to NWB."""
    

In [17]:
MatTTL(True)

TypeError: Can't instantiate abstract class MatTTL with abstract method add_to_nwbfile
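
The error above is standard Python behaviour for abstract base classes: a subclass cannot be instantiated until it overrides every method marked with `@abstractmethod`. A minimal standard-library sketch; the classes `Base`, `Incomplete` and `Complete` are hypothetical, and a list stands in for an NWB file:

```python
from abc import ABC, abstractmethod


class Base(ABC):
    @abstractmethod
    def add_to_nwbfile(self, nwbfile):
        """Subclasses must provide this method."""


class Incomplete(Base):
    """Does not override add_to_nwbfile, so it stays abstract."""


class Complete(Base):
    def add_to_nwbfile(self, nwbfile):
        nwbfile.append("added")  # a list stands in for an NWBFile here


try:
    Incomplete()
    instantiation_failed = False
except TypeError as err:
    instantiation_failed = True
    print("TypeError:", err)

container = []
Complete().add_to_nwbfile(container)
print(container)  # ['added']
```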

In [27]:
class MatTTL(BaseDataInterface):
    """My class to convert TTL signals stored in MATLAB files to NWB."""
    

    def add_to_nwbfile(self, nwbfile: NWBFile, metadata: Optional[dict], **conversion_options) -> None:
        """
        Define a protocol for mapping the data from this interface to NWB neurodata objects.

        These neurodata objects should also be added to the in-memory pynwb.NWBFile object in this step.

        Parameters
        ----------
        nwbfile : pynwb.NWBFile
            The in-memory object to add the data to.
        **conversion_options
            Additional keyword arguments to pass to the `.add_to_nwbfile` method.
        """
        ts = TimeSeries(name=self.name, 
                        data=self.data, 
                        unit="V", 
                        starting_time=self.starting_time, 
                        rate= self.rate)
        nwbfile.add_acquisition(ts)

In [28]:
MatTTL(verbose=True)

Source data is valid!


<__main__.MatTTL at 0x7f50d4bef650>

In [33]:


class MatTTL(BaseDataInterface):
    """My class to convert TTL signals stored in MATLAB files to NWB."""
    @validate_call
    def __init__(self,
                 file_path: FilePath,
                 verbose: bool = True
                 ):
        super().__init__(verbose,file_path=file_path)
        res = loadmat(file_path) # Read matlab file
        self.starting_time = 0.0 # Assume that the starting time is always the start of the session
        self.name = res.get('label', ['TTLSignal'])[0]
        self.rate = float(res.get('freq', [[1000]])[0][0])
        # loadmat returns 1-D MATLAB vectors as 2-D arrays, here with shape (1, n_samples)
        self.data = res.get('data')

    def add_to_nwbfile(self, nwbfile: NWBFile, metadata: Optional[dict], **conversion_options) -> None:
        ts = TimeSeries(name=self.name, 
                        data=self.data, 
                        unit="V", 
                        starting_time=self.starting_time, 
                        rate= self.rate)
        nwbfile.add_acquisition(ts)
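
In the class above, `@validate_call` together with the `FilePath` annotation means the path is checked before the body of `__init__` runs. A sketch of that behaviour on a standalone function, assuming pydantic v2 is installed; the function `file_name` and the temporary files are hypothetical and only serve the example:

```python
import tempfile
from pathlib import Path

from pydantic import FilePath, ValidationError, validate_call


@validate_call
def file_name(file_path: FilePath) -> str:
    """Hypothetical function: returns the file name once the path has been validated."""
    return file_path.name


with tempfile.TemporaryDirectory() as tmp_dir:
    real_file = Path(tmp_dir) / "test.mat"
    real_file.touch()

    # A str is accepted and converted to a Path pointing at an existing file
    name = file_name(str(real_file))

    # A path that does not point to an existing file is rejected
    try:
        file_name(Path(tmp_dir) / "missing.mat")
        rejected = False
    except ValidationError:
        rejected = True

print(name)
print(rejected)
```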


In [34]:
MatTTL(file_path=data_dir/"test.mat", verbose=True)

Source data is valid!


<__main__.MatTTL at 0x7f50d514d9d0>

In [40]:
from datetime import datetime
from zoneinfo import ZoneInfo
mat_file = data_dir/"test.mat"
mat_nwb_file = data_dir /"test.nwb"
mat_interface = MatTTL(file_path=mat_file, verbose=True)
metadata = mat_interface.get_metadata()
metadata['NWBFile']['session_start_time'] = datetime.now(tz=ZoneInfo("Europe/Paris"))
mat_interface.run_conversion(mat_nwb_file, metadata=metadata)

Source data is valid!
NWB file saved at ../data/test.nwb!


In [41]:
fin = NWBHDF5IO(mat_nwb_file, mode = 'r')
mat_nwb = fin.read()
mat_nwb

| Field | Value |
|---|---|
| Data type | float64 |
| Shape | (1, 1000) |
| Array size | 7.81 KiB |
| Chunk shape | (1, 1000) |
| Compression | gzip |
| Compression opts | 4 |
| Compression ratio | 60.60606060606061 |
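
The dataset shown above has shape (1, 1000) because `loadmat` returned the signal as a row vector. Since a `TimeSeries` treats the first dimension of its data as time, you may prefer to flatten the array before storing it. The sketch below shows the flattening and how `rate` and `starting_time` determine the time of each sample; the zero-filled array is a placeholder for the real signal, and flattening is a suggestion rather than what the class above does:

```python
import numpy as np

rate = 1000.0                # Hz, the 'freq' value stored in the mat file
starting_time = 0.0          # seconds
data = np.zeros((1, 1000))   # placeholder with the same shape as the dataset above

samples = data.ravel()       # shape (1000,): one value per time point
# Sample i is taken at starting_time + i / rate
timestamps = starting_time + np.arange(samples.size) / rate
duration = samples.size / rate

print(samples.shape)  # (1000,)
print(duration)       # 1.0
```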


In [None]:
fin.close()