# 1. Introduction

The ndx-icephys-meta extension at https://github.com/oruebel/ndx-icephys-meta defines 5 new main tables for organizing ICEphys metadata:

* IntracellularRecordings,
* SimultaneousRecordingsTable,
* SequentialRecordingsTable,
* RepetitionsTable,
* ExperimentalConditionsTable

To make these more easily accessible and to allow us to store the tables in ``/general/intracellular_ephys`` in the NWBFile, the extension also extends NWBFile itself via the new type:
* ICEphysFile

ICEphysFile makes all 5 tables accessible via corresponding properties, manages creation of the tables, and provides convenience functions for populating the tables. In addition, ICEphysFile declares SweepTable as deprecated in the schema and issues warnings if SweepTable is used.

**Note:** Upon merging of this proposal with the NWB core, the ``ICEphysFile`` neurodata_type would be removed and NWBFile updated accordingly instead. 

In [1]:
# Standard Python imports
from datetime import datetime
from dateutil.tz import tzlocal
import numpy as np

In [2]:
# Standard PyNWB imports
from pynwb.icephys import VoltageClampStimulusSeries, VoltageClampSeries
from pynwb import NWBHDF5IO

In [3]:
# Imports needed from the ndx-icephys-meta
from ndx_icephys_meta.icephys import ICEphysFile

# 2. Create a basic NWB File for testing

**The following parts of the code are for basic setup only and are unchanged from what is in the current NWB release**

In [4]:
# Create the file
nwbfile = ICEphysFile(
            session_description='my first synthetic recording',
            identifier='EXAMPLE_ID',
            session_start_time=datetime.now(tzlocal()),
            experimenter='Dr. Bilbo Baggins',
            lab='Bag End Laboratory',
            institution='University of Middle Earth at the Shire',
            experiment_description='I went on an adventure with thirteen dwarves to reclaim vast treasures.',
            session_id='LONELYMTN')
# Add a device
device = nwbfile.create_device(name='Heka ITC-1600')
# Add an intracellular electrode
electrode = nwbfile.create_icephys_electrode(name="elec0",
                                        description='a mock intracellular electrode',
                                        device=device)
# Create an ic-ephys stimulus
stimulus = VoltageClampStimulusSeries(
            name="ccss",
            data=[1, 2, 3, 4, 5],
            starting_time=123.6,
            rate=10e3,
            electrode=electrode,
            gain=0.02,
            sweep_number=np.uint64(15))
# Create an ic-response
response = VoltageClampSeries(
            name='vcs',
            data=[0.1, 0.2, 0.3, 0.4, 0.5],
            conversion=1e-12,
            resolution=np.nan,
            starting_time=123.6,
            rate=20e3,
            electrode=electrode,
            gain=0.02,
            capacitance_slow=100e-12,
            resistance_comp_correction=70.0,
            sweep_number=np.uint64(15))

In the current workflow, we would now add our response and stimulus to the file via:

```
nwbfile.add_stimulus(stimulus)
nwbfile.add_acquisition(response)
```

This workflow is still fine, but when using the new metadata tables we can skip this step: the ``add_intracellular_recording`` function adds the stimulus and response to the NWBFile for us if they are not already part of it.

# 3. Construct our Intracellular Electrophysiology Metadata tables

**The parts in this section are what constitute the new elements from this proposal.**

### 3.1 Add an intracellular recording
Add an intracellular recording consisting of an electrode, stimulus, and response. Optionally, the user may set the ``id`` field for the recording.

In [5]:
rowindex = nwbfile.add_intracellular_recording(electrode=electrode,
                                               stimulus=stimulus,
                                               response=response,
                                               id=10)

**Note:** Any time we add a row to one of our tables, the corresponding add function returns the integer index of the newly created row. The rowindex is used in subsequent tables that reference rows in our table.

**Note:** If the ``id`` is omitted, then PyNWB will automatically number recordings in sequence (i.e., the id is the same as the row number).

**Note:** The IntracellularRecordings table is optional and will be created automatically by ICEphysFile the first time the table is being modified. 

**Note:** If the given electrode, stimulus, or response are not part of the nwbfile object, then they will be automatically added to it here.

**Note:** The IntracellularRecordings, SimultaneousRecordingsTable, SequentialRecordingsTable, RepetitionsTable, and ExperimentalConditionsTable tables all enforce unique ids when adding rows. That is, adding an intracellular recording with an id that already exists results in an error, e.g.:

In [6]:
try:
    nwbfile.add_intracellular_recording(electrode=electrode,
                                        stimulus=stimulus,
                                        response=response,
                                        id=10)
except ValueError as e:
    print("ValueError raised with message: '%s' "  % str(e))

ValueError raised with message: 'id 10 already in the table' 
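
The notes above describe three behaviors: each add function returns the row index, a missing ``id`` defaults to the row number, and duplicate ids are rejected. The following is a minimal pure-Python sketch of those rules. ``ToyTable`` and ``add_row`` are hypothetical names used only for illustration; this is not how the extension itself is implemented.

```python
class ToyTable:
    """Hypothetical stand-in for a metadata table (illustration only)."""

    def __init__(self):
        self.ids = []

    def add_row(self, id=None):
        # The new row is appended at the end, so its index is the current length
        row_index = len(self.ids)
        # If no id is given, fall back to the row number
        if id is None:
            id = row_index
        # Reject ids that are already present in the table
        if id in self.ids:
            raise ValueError("id %i already in the table" % id)
        self.ids.append(id)
        # Other tables reference this row by its index, not by its id
        return row_index


table = ToyTable()
first = table.add_row(id=10)
second = table.add_row(id=11)
try:
    table.add_row(id=10)
    duplicate_error = None
except ValueError as e:
    duplicate_error = str(e)
print(first, second, duplicate_error)
```

Note that the returned index (0, 1, ...) is independent of the chosen ids (10, 11, ...), which is why later tables store indices rather than ids.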


**Note:** We may optionally also specify the relevant time range for a stimulus and/or response as part of the intracellular recording. This is useful, e.g., when the recordings of the stimulus and response do not align (e.g., when recording of the response started before recording of the stimulus).

In [7]:
rowindex2 = nwbfile.add_intracellular_recording(electrode=electrode,
                                                stimulus=stimulus,
                                                stimulus_start_index=1,
                                                stimulus_index_count=3,
                                                response=response,
                                                response_start_index=2,
                                                response_index_count=3,
                                                id=11)

**Note:** A recording may optionally also consist of just an ``electrode`` and ``stimulus`` or an ``electrode`` and ``response``, but at least one of ``stimulus`` or ``response`` is required. If either ``stimulus`` or ``response`` is missing, then the ``stimulus`` and ``response`` are internally set to the same ``TimeSeries`` and the ``start_index`` and ``index_count`` for the missing parameter are set to ``-1``.

In [8]:
rowindex3 = nwbfile.add_intracellular_recording(electrode=electrode,
                                                response=response,
                                                id=12)
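
To make the range and missing-series rules concrete, here is a small pure-Python sketch that builds ``(start_index, index_count, name)`` tuples for the three recordings added above. The helpers ``make_reference`` and ``make_recording`` and the dict-based series are hypothetical, and the default range (start at 0, cover all remaining samples) is an assumption chosen to match the values displayed for these recordings later in this notebook; the extension's actual logic may differ.

```python
def make_reference(series, start_index=None, index_count=None):
    """Build a (start_index, index_count, name) tuple for a toy series dict."""
    if start_index is None:
        start_index = 0  # assumed default: start at the first sample
    if index_count is None:
        # assumed default: cover all samples from start_index to the end
        index_count = len(series["data"]) - start_index
    return (start_index, index_count, series["name"])


def make_recording(stimulus=None, response=None,
                   stimulus_range=(None, None), response_range=(None, None)):
    """Return (stimulus_reference, response_reference) for one recording."""
    if stimulus is None and response is None:
        raise ValueError("at least one of stimulus or response is required")
    if stimulus is None:
        # Missing stimulus: point at the response series and mark the range as -1
        stim_ref = (-1, -1, response["name"])
    else:
        stim_ref = make_reference(stimulus, *stimulus_range)
    if response is None:
        # Missing response: point at the stimulus series and mark the range as -1
        resp_ref = (-1, -1, stimulus["name"])
    else:
        resp_ref = make_reference(response, *response_range)
    return stim_ref, resp_ref


stimulus = {"name": "ccss", "data": [1, 2, 3, 4, 5]}
response = {"name": "vcs", "data": [0.1, 0.2, 0.3, 0.4, 0.5]}

full = make_recording(stimulus, response)
partial = make_recording(stimulus, response,
                         stimulus_range=(1, 3), response_range=(2, 3))
response_only = make_recording(response=response)
print(full, partial, response_only)
```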

**WARNING** For brevity, the examples above reuse the same ``response`` and ``stimulus`` in all rows of the ``intracellular_recordings`` table. While this is allowed, in most practical cases the ``stimulus`` and ``response`` will change between intracellular recordings.

### 3.2 Add a simultaneous recording

Before adding a simultaneous recording in Section 3.2.2, we'll take a brief detour to illustrate how we can add custom columns to tables before (see Section 3.2.1) and after (see Section 3.2.3) we have populated the table with data.

#### 3.2.1 Define a custom column for a simultaneous recording before populating the table

Before we add a simultaneous recording, let's create a custom data column in our simultaneous recordings table. We can create columns at the beginning (i.e., before we populate the table with rows/data) or we can add columns after we have already populated the table with rows. Here we will show the former. For this, we first need to get access to our table.

In [9]:
print(nwbfile.icephys_simultaneous_recordings)

None


The simultaneous recordings table is optional, and since we have not populated it with any data yet, it does not actually exist yet. To make sure the table is created, we can use ``get_icephys_simultaneous_recordings()`` instead, which creates the table if it does not exist yet.

In [10]:
icephys_simultaneous_recordings = nwbfile.get_icephys_simultaneous_recordings()
icephys_simultaneous_recordings.add_column(name='simultaneous_recording_tag', description='A custom tag for simultaneous_recordings')

As we can see, we have now successfully created a new custom column.

In [11]:
print(icephys_simultaneous_recordings.colnames)

('recordings', 'simultaneous_recording_tag')


**Note:** The same process applies to all our other tables as well. Here we use the corresponding ``get_intracellular_recordings``, ``get_icephys_sequential_recordings``, ``get_icephys_repetitions`` and ``get_icephys_conditions`` functions instead. In general, we can always use the get functions instead of accessing the property of the file.
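
The difference between reading the property and calling the get function can be summarized by the following pure-Python sketch. ``ToyFile`` and its dict-based table are hypothetical stand-ins used only to illustrate the create-on-first-access pattern; they are not the extension's classes.

```python
class ToyFile:
    """Hypothetical file object with one optional table (illustration only)."""

    def __init__(self):
        # The optional table does not exist until it is needed
        self.simultaneous_recordings = None

    def get_simultaneous_recordings(self):
        # Create the table on first access, then always return the same object
        if self.simultaneous_recordings is None:
            self.simultaneous_recordings = {"colnames": ["recordings"], "rows": []}
        return self.simultaneous_recordings


toy_file = ToyFile()
before = toy_file.simultaneous_recordings        # property access: still None
table = toy_file.get_simultaneous_recordings()   # get function: creates the table
table["colnames"].append("simultaneous_recording_tag")
print(before, toy_file.simultaneous_recordings["colnames"])
```

After the first call to the get function, the property and the get function return the same table object.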

#### 3.2.2 Add a simultaneous recording
Add a single simultaneous recording consisting of a set of intracellular recordings. Again, setting the ``id`` for a simultaneous recording is optional. This table is also optional and will be created automatically by ICEphysFile. The ``recordings`` argument of the ``add_icephys_simultaneous_recording`` function is simply a list of ints with the indices of the corresponding rows in the IntracellularRecordings table.

**Note:** Since we created our custom ``simultaneous_recording_tag`` column earlier, we now also need to populate this custom field for every row we add to the simultaneous recordings table.

In [12]:
rowindex = nwbfile.add_icephys_simultaneous_recording(recordings=[rowindex, rowindex2, rowindex3], 
                                     id=12, 
                                     simultaneous_recording_tag='LabTag1')

**Note:** The ``recordings`` argument is the list of indices of the rows in our intracellular recordings table that we want to reference. The indices are determined by the order in which we added the elements to the table.

If we don't know the row indices, but only the ids of the intracellular recordings that we want to reference, then we can search for the ids as follows:

In [13]:
row_indicies = (nwbfile.intracellular_recordings.id == [10,])
print(row_indicies)

[0]


**Note:** The same is true for our other tables as well, i.e., referencing is done by indices of rows (NOT ids). If we only know ids, we can search for them in the same manner on the other tables as well, e.g., ``nwbfile.icephys_simultaneous_recordings.id == 15``. In the search we can use a list of integer ids or a single int.
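
Conceptually, the id search maps ids to row positions. The following pure-Python sketch shows that mapping for the ids used in this notebook. ``ids_to_row_indices`` is a hypothetical helper, and returning the matches in table order is an assumption of this sketch rather than documented behavior of the library.

```python
def ids_to_row_indices(table_ids, wanted):
    """Return the row indices whose id matches a single int or a list of ints."""
    if isinstance(wanted, int):
        wanted = [wanted]  # accept a single id as well as a list of ids
    # Walk the table in row order and keep the positions of matching ids
    return [row for row, id_ in enumerate(table_ids) if id_ in wanted]


intracellular_ids = [10, 11, 12]  # ids in the order the rows were added
single = ids_to_row_indices(intracellular_ids, [10])
scalar = ids_to_row_indices(intracellular_ids, 12)
several = ids_to_row_indices(intracellular_ids, [12, 10])
print(single, scalar, several)
```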

#### 3.2.3 Define a custom column for a simultaneous recording after adding rows

Depending on the lab workflow, it may be useful to add complete columns to a table (here the simultaneous recordings table) after we have already populated the table with rows. We can do this the same way as before, only now we need to provide a data array to populate the values for the existing rows. E.g.:

In [14]:
nwbfile.icephys_simultaneous_recordings.add_column(name='simultaneous_recording_type', description='Description of the type of simultaneous_recording', data=['SweepType1', ])
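
The reason a data array is needed here is that every column must hold one value per existing row. The sketch below illustrates that constraint on a toy dict-based table. ``add_column`` in this block is a hypothetical pure-Python function, and the explicit length check is an assumption for illustration; the real ``add_column`` may validate differently.

```python
def add_column(table, name, data=None):
    """Add a column to a toy table stored as {'id': [...], 'columns': {...}}."""
    if name in table["columns"]:
        raise ValueError("column '%s' already exists" % name)
    values = [] if data is None else list(data)
    # A column needs exactly one value for each row already in the table
    if len(values) != len(table["id"]):
        raise ValueError("column '%s' needs %i values, got %i"
                         % (name, len(table["id"]), len(values)))
    table["columns"][name] = values


# Before any rows exist, a column can be added without data
empty_table = {"id": [], "columns": {}}
add_column(empty_table, "simultaneous_recording_tag")

# After rows exist, data for the existing rows must be supplied
filled_table = {"id": [12], "columns": {"simultaneous_recording_tag": ["LabTag1"]}}
add_column(filled_table, "simultaneous_recording_type", data=["SweepType1"])
try:
    add_column(filled_table, "missing_values")
    length_error = None
except ValueError as e:
    length_error = str(e)
print(filled_table["columns"], length_error)
```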

### 3.3 Add a sequential recording
Add a single sequential recording consisting of a set of simultaneous recordings. Again, setting the ``id`` for a sequential recording is optional. This table is also optional and will be created automatically by ICEphysFile. The ``simultaneous_recordings`` argument of the ``add_icephys_sequential_recording`` function is simply a list of ints with the indices of the corresponding rows in the SimultaneousRecordingsTable.

In [15]:
rowindex = nwbfile.add_icephys_sequential_recording(simultaneous_recordings=[0], stimulus_type='square', id=15)

### 3.4 Add a repetition
Add a single repetition consisting of a set of sequential recordings. Again, setting the ``id`` for a repetition is optional. This table is also optional and will be created automatically by ICEphysFile. The ``sequential_recordings`` argument of the ``add_icephys_repetition`` function is simply a list of ints with the indices of the corresponding rows in the SequentialRecordingsTable.

In [16]:
rowindex = nwbfile.add_icephys_repetition(sequential_recordings=[0], id=17)

### 3.5 Add a condition
Add a single condition consisting of a set of repetitions. Again, setting the ``id`` for a condition is optional. This table is also optional and will be created automatically by ICEphysFile. The ``repetitions`` argument of the ``add_icephys_experimental_condition`` function is simply a list of ints with the indices of the corresponding rows in the RepetitionsTable.

In [17]:
rowindex = nwbfile.add_icephys_experimental_condition(repetitions=[0], id=19)

To add additional columns to any of the tables, we can use the ``.add_column`` function on the corresponding table after it has been created. We can then also add new items with the new column values.

In [18]:
nwbfile.icephys_experimental_conditions.add_column(name='tag', data=np.arange(1), description='integer tag for a experimental condition')
rowindex = nwbfile.add_icephys_experimental_condition(repetitions=[0], id=21, tag=3)
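
Each table in Sections 3.2 to 3.5 references rows of the table below it by row index. As a rough illustration of how those references chain together, the following pure-Python sketch walks from experimental conditions down to intracellular recordings using the ids and indices from this notebook. The list-of-dicts layout and the ``flatten`` helper are hypothetical and are not the extension's data model.

```python
# Toy tables mirroring the ids and row references used in this notebook
intracellular_ids = [10, 11, 12]
simultaneous = [{"id": 12, "recordings": [0, 1, 2]}]
sequential = [{"id": 15, "simultaneous_recordings": [0]}]
repetitions = [{"id": 17, "sequential_recordings": [0]}]
conditions = [{"id": 19, "repetitions": [0]},
              {"id": 21, "repetitions": [0]}]


def flatten():
    """Resolve every condition down to the intracellular recordings it covers."""
    rows = []
    for cond in conditions:
        for rep_idx in cond["repetitions"]:
            rep = repetitions[rep_idx]  # follow the row index, not the id
            for seq_idx in rep["sequential_recordings"]:
                seq = sequential[seq_idx]
                for sim_idx in seq["simultaneous_recordings"]:
                    sim = simultaneous[sim_idx]
                    for rec_idx in sim["recordings"]:
                        rows.append((cond["id"], rep["id"], seq["id"],
                                     sim["id"], intracellular_ids[rec_idx]))
    return rows


flat_rows = flatten()
for row in flat_rows:
    print(row)
```

Because both conditions reference the same repetition, each of the three intracellular recordings appears once per condition, giving six flattened rows.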

# 4. Accessing the tables

All icephys metadata tables are available as attributes on the nwbfile. The following simply displays the tables to show their content.

In [19]:
nwbfile.intracellular_recordings.to_dataframe()

| id | stimulus | response | electrode |
|---|---|---|---|
| 10 | (0, 5, ccss pynwb.icephys.VoltageClampStimulus... | (0, 5, vcs pynwb.icephys.VoltageClampSeries at... | elec0 pynwb.icephys.IntracellularElectrode at ... |
| 11 | (1, 3, ccss pynwb.icephys.VoltageClampStimulus... | (2, 3, vcs pynwb.icephys.VoltageClampSeries at... | elec0 pynwb.icephys.IntracellularElectrode at ... |
| 12 | (-1, -1, vcs pynwb.icephys.VoltageClampSeries ... | (0, 5, vcs pynwb.icephys.VoltageClampSeries at... | elec0 pynwb.icephys.IntracellularElectrode at ... |


In [20]:
nwbfile.icephys_simultaneous_recordings.to_dataframe()

| id | recordings | simultaneous_recording_tag | simultaneous_recording_type |
|---|---|---|---|
| 12 | s... | LabTag1 | SweepType1 |


In [21]:
nwbfile.icephys_sequential_recordings.to_dataframe()

| id | simultaneous_recordings | stimulus_type |
|---|---|---|
| 15 | rec... | square |


In [22]:
nwbfile.icephys_repetitions.to_dataframe()

| id | sequential_recordings |
|---|---|
| 17 | simultaneous_rec... |


In [23]:
nwbfile.icephys_experimental_conditions.to_dataframe()

| id | repetitions | tag |
|---|---|---|
| 19 | sequential_rec... | 0 |
| 21 | sequential_rec... | 3 |


As the tables in the hierarchy are optional, a given NWBFile may not contain all of them, e.g., it may lack an ``experimental_conditions`` table. To provide a consistent interface for users, and to avoid having to manually check which tables exist in a given file to find the top-most table in the hierarchy, we provide the ``get_icephys_meta_parent_table`` function. With this we can retrieve the highest table in the icephys metadata hierarchy that actually exists in the file. For example, since we populate all tables here, ``get_icephys_meta_parent_table`` will return the ``experimental_conditions`` table.

In [24]:
print(nwbfile.get_icephys_meta_parent_table())

experimental_conditions ndx_icephys_meta.icephys.ExperimentalConditionsTable at 0x4901307024
Fields:
  colnames: ['repetitions' 'tag']
  columns: (
    repetitions_index <class 'hdmf.common.table.VectorIndex'>,
    repetitions <class 'hdmf.common.table.DynamicTableRegion'>,
    tag <class 'hdmf.common.table.VectorData'>
  )
  description: A table for grouping different intracellular recording repetitions together that belong to the same experimental experimental_conditions.
  id: id <class 'hdmf.common.table.ElementIdentifiers'>
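
The lookup described above amounts to walking the hierarchy from the top and returning the first table that exists. The sketch below illustrates that idea in pure Python. ``HIERARCHY``, ``get_parent_table``, and the dict of tables are hypothetical names for illustration, and the ordering is taken from the table hierarchy described in this notebook; this is not the extension's implementation.

```python
# Table names ordered from the top of the hierarchy to the bottom
HIERARCHY = [
    "experimental_conditions",
    "repetitions",
    "sequential_recordings",
    "simultaneous_recordings",
    "intracellular_recordings",
]


def get_parent_table(tables):
    """Return the name of the highest table that exists, or None if none do."""
    for name in HIERARCHY:
        if tables.get(name) is not None:
            return name
    return None


# A file with all tables populated, as in this notebook
all_tables = {name: [] for name in HIERARCHY}
# A file where only the two lowest tables were created
partial_tables = {"intracellular_recordings": [], "simultaneous_recordings": []}

top_full = get_parent_table(all_tables)
top_partial = get_parent_table(partial_tables)
top_empty = get_parent_table({})
print(top_full, top_partial, top_empty)
```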



# 5. Read/Write the data

**Read/Write of the file is unchanged from what is in the current NWB release**

In [25]:
# Write our test file
testpath = "test_icephys_file.h5"
with NWBHDF5IO(testpath, 'w') as io:
    io.write(nwbfile)

In [26]:
# Read the data back in
with NWBHDF5IO(testpath, 'r') as io:
    infile = io.read() 

# 6. Validate the file

### 6.1 Confirm that the data we have written is what we expect

In the following we read the data again and do asserts to confirm that the data in the low-level h5py datasets on disk matches the data we expect.

In [27]:
# Read the data back in
with NWBHDF5IO(testpath, 'r') as io:
    infile = io.read() 
   
    # assert intracellular_recordings
    assert np.all(infile.intracellular_recordings.id[:] == nwbfile.intracellular_recordings.id[:])
   
    # Assert that the ids and the VectorData, VectorIndex, and table target of the recordings column of the simultaneous recordings table are correct
    assert np.all(infile.icephys_simultaneous_recordings.id[:] == nwbfile.icephys_simultaneous_recordings.id[:])
    assert np.all(infile.icephys_simultaneous_recordings['recordings'].target.data[:] == nwbfile.icephys_simultaneous_recordings['recordings'].target.data[:])
    assert np.all(infile.icephys_simultaneous_recordings['recordings'].data[:] == nwbfile.icephys_simultaneous_recordings['recordings'].data[:])
    assert infile.icephys_simultaneous_recordings['recordings'].target.table.name == nwbfile.icephys_simultaneous_recordings['recordings'].target.table.name 
    
    # Assert that the ids and the VectorData, VectorIndex, and table target of the simultaneous_recordings column of the sequential recordings table are correct
    assert np.all(infile.icephys_sequential_recordings.id[:] == nwbfile.icephys_sequential_recordings.id[:])
    assert np.all(infile.icephys_sequential_recordings['simultaneous_recordings'].target.data[:] == nwbfile.icephys_sequential_recordings['simultaneous_recordings'].target.data[:])
    assert np.all(infile.icephys_sequential_recordings['simultaneous_recordings'].data[:] == nwbfile.icephys_sequential_recordings['simultaneous_recordings'].data[:])
    assert infile.icephys_sequential_recordings['simultaneous_recordings'].target.table.name == nwbfile.icephys_sequential_recordings['simultaneous_recordings'].target.table.name 
    
    # Assert that the ids and the VectorData, VectorIndex, and table target of the sequential_recordings column of the Repetitions table are correct
    assert np.all(infile.icephys_repetitions.id[:] == nwbfile.icephys_repetitions.id[:])
    assert np.all(infile.icephys_repetitions['sequential_recordings'].target.data[:] == nwbfile.icephys_repetitions['sequential_recordings'].target.data[:])
    assert np.all(infile.icephys_repetitions['sequential_recordings'].data[:] == nwbfile.icephys_repetitions['sequential_recordings'].data[:])
    assert infile.icephys_repetitions['sequential_recordings'].target.table.name == nwbfile.icephys_repetitions['sequential_recordings'].target.table.name 
    
    # Assert that the ids and the VectorData, VectorIndex, and table target of the repetitions column of the experimental conditions table are correct
    assert np.all(infile.icephys_experimental_conditions.id[:] == nwbfile.icephys_experimental_conditions.id[:])
    assert np.all(infile.icephys_experimental_conditions['repetitions'].target.data[:] == nwbfile.icephys_experimental_conditions['repetitions'].target.data[:])
    assert np.all(infile.icephys_experimental_conditions['repetitions'].data[:] == nwbfile.icephys_experimental_conditions['repetitions'].data[:])
    assert infile.icephys_experimental_conditions['repetitions'].target.table.name == nwbfile.icephys_experimental_conditions['repetitions'].target.table.name 
    assert np.all(infile.icephys_experimental_conditions['tag'][:] == nwbfile.icephys_experimental_conditions['tag'][:])
    
    # Show all the tables for visual validation
    print(infile.intracellular_recordings.name)
    display(infile.intracellular_recordings.to_dataframe())
    print(infile.icephys_simultaneous_recordings.name)
    display(infile.icephys_simultaneous_recordings.to_dataframe())
    print(infile.icephys_sequential_recordings.name)
    display(infile.icephys_sequential_recordings.to_dataframe())
    print(infile.icephys_repetitions.name)
    display(infile.icephys_repetitions.to_dataframe())
    print(infile.icephys_experimental_conditions.name)
    display(infile.icephys_experimental_conditions.to_dataframe())
    print('All Metadata')
    display(infile.icephys_experimental_conditions.to_hierarchical_dataframe())

intracellular_recordings


| id | stimulus | response | electrode |
|---|---|---|---|
| 10 | (0, 5, ccss pynwb.icephys.VoltageClampStimulus... | (0, 5, vcs pynwb.icephys.VoltageClampSeries at... | elec0 pynwb.icephys.IntracellularElectrode at ... |
| 11 | (1, 3, ccss pynwb.icephys.VoltageClampStimulus... | (2, 3, vcs pynwb.icephys.VoltageClampSeries at... | elec0 pynwb.icephys.IntracellularElectrode at ... |
| 12 | (-1, -1, vcs pynwb.icephys.VoltageClampSeries ... | (0, 5, vcs pynwb.icephys.VoltageClampSeries at... | elec0 pynwb.icephys.IntracellularElectrode at ... |


simultaneous_recordings


| id | recordings | simultaneous_recording_tag | simultaneous_recording_type |
|---|---|---|---|
| 12 | s... | LabTag1 | SweepType1 |


sequential_recordings


| id | simultaneous_recordings | stimulus_type |
|---|---|---|
| 15 | rec... | square |


repetitions


| id | sequential_recordings |
|---|---|
| 17 | simultaneous_rec... |


experimental_conditions


| id | repetitions | tag |
|---|---|---|
| 19 | sequential_rec... | 0 |
| 21 | sequential_rec... | 3 |


All Metadata


| experimental_conditions_id | experimental_conditions_tag | repetitions_id | sequential_recordings_id | sequential_recordings_stimulus_type | simultaneous_recordings_id | simultaneous_recordings_simultaneous_recording_tag | simultaneous_recordings_simultaneous_recording_type | (intracellular_recordings, id) | (intracellular_recordings, stimulus) | (intracellular_recordings, response) | (intracellular_recordings, electrode) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 19 | 0 | 17 | 15 | square | 12 | LabTag1 | SweepType1 | 10 | [0, 5, ccss pynwb.icephys.VoltageClampStimulus... | [0, 5, vcs pynwb.icephys.VoltageClampSeries at... | elec0 pynwb.icephys.IntracellularElectrode at ... |
| 19 | 0 | 17 | 15 | square | 12 | LabTag1 | SweepType1 | 11 | [1, 3, ccss pynwb.icephys.VoltageClampStimulus... | [2, 3, vcs pynwb.icephys.VoltageClampSeries at... | elec0 pynwb.icephys.IntracellularElectrode at ... |
| 19 | 0 | 17 | 15 | square | 12 | LabTag1 | SweepType1 | 12 | [-1, -1, vcs pynwb.icephys.VoltageClampSeries ... | [0, 5, vcs pynwb.icephys.VoltageClampSeries at... | elec0 pynwb.icephys.IntracellularElectrode at ... |
| 21 | 3 | 17 | 15 | square | 12 | LabTag1 | SweepType1 | 10 | [0, 5, ccss pynwb.icephys.VoltageClampStimulus... | [0, 5, vcs pynwb.icephys.VoltageClampSeries at... | elec0 pynwb.icephys.IntracellularElectrode at ... |
| 21 | 3 | 17 | 15 | square | 12 | LabTag1 | SweepType1 | 11 | [1, 3, ccss pynwb.icephys.VoltageClampStimulus... | [2, 3, vcs pynwb.icephys.VoltageClampSeries at... | elec0 pynwb.icephys.IntracellularElectrode at ... |
| 21 | 3 | 17 | 15 | square | 12 | LabTag1 | SweepType1 | 12 | [-1, -1, vcs pynwb.icephys.VoltageClampSeries ... | [0, 5, vcs pynwb.icephys.VoltageClampSeries at... | elec0 pynwb.icephys.IntracellularElectrode at ... |


### 6.2 Validate the file via the NWB validator

In [28]:
import subprocess
valres = subprocess.run(["python", "-m", "pynwb.validate", testpath], capture_output=True)
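
When a validator is run as a separate process like this, the return code and the captured output streams are what tell us whether it succeeded. The following generic sketch shows one way to collect them. ``run_and_report`` is a hypothetical helper, and the two commands are stand-in Python one-liners rather than the NWB validator; using ``sys.executable`` instead of ``"python"`` is a design choice of this sketch so that the child process uses the same interpreter as the notebook.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run a command and collect its return code and decoded output streams."""
    result = subprocess.run(cmd, capture_output=True)
    return {
        "returncode": result.returncode,
        "ok": result.returncode == 0,  # by convention, 0 means success
        "stdout": result.stdout.decode("utf8"),
        "stderr": result.stderr.decode("utf8"),
    }


# Stand-in for a run that succeeds and prints a message to stdout
ok_report = run_and_report([sys.executable, "-c", "print('no errors found')"])
# Stand-in for a run that fails: sys.exit with a string writes it to stderr
fail_report = run_and_report(
    [sys.executable, "-c", "import sys; sys.exit('validation failed')"])

print(ok_report["returncode"], fail_report["returncode"])
```

A non-zero return code together with a traceback on stderr indicates that the process itself failed, which is different from the validator running to completion and reporting problems in the file.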

In [29]:
print("Validation Return Code: %s" % str(valres.returncode))
print()
print("Validation stderr")
print("-----------------")
print(valres.stderr.decode('utf8'))
print("Validation stdout")
print("------------------")
print(valres.stdout.decode('utf8'))

Validation Return Code: 1

Validation stderr
-----------------
Traceback (most recent call last):
  File "/Users/oruebel/anaconda3/envs/py4nwb/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Users/oruebel/anaconda3/envs/py4nwb/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/oruebel/Devel/nwb/pynwb/src/pynwb/validate.py", line 126, in <module>
    main()
  File "/Users/oruebel/Devel/nwb/pynwb/src/pynwb/validate.py", line 72, in main
    ns_deps = NWBHDF5IO.load_namespaces(catalog, path)
  File "/Users/oruebel/Devel/nwb/hdmf/src/hdmf/utils.py", line 483, in func_call
    return func(args[0], **parsed['args'])
  File "/Users/oruebel/Devel/nwb/hdmf/src/hdmf/backends/hdf5/h5tools.py", line 126, in load_namespaces
    d.update(namespace_catalog.load_namespaces('namespace', reader=reader))
  File "/Users/oruebel/Devel/nwb/hdmf/src/hdmf/utils.py", line 483, in func_call
    return func(args[0], **parsed['arg