# 1. Introduction

The ndx-icephys-meta extension at https://github.com/oruebel/ndx-icephys-meta defines 5 new main tables for organizing ICEphys metadata:

* IntracellularRecordings,
* Sweeps,
* SweepSequences,
* Runs,
* Conditions

To make these tables more easily accessible and to allow us to store them in ``/general/intracellular_ephys`` in the NWBFile, the extension also extends NWBFile itself via the new type:
* ICEphysFile

ICEphysFile makes all 5 tables accessible via corresponding properties, manages creation of the tables, and provides convenience functions for populating the tables. In addition, ICEphysFile declares SweepTable as deprecated in the schema and issues warnings if SweepTable is used.

**Note:** Upon merging of this proposal with the NWB core, the ``ICEphysFile`` neurodata_type would be removed and NWBFile updated accordingly instead. 

In [1]:
# Standard Python imports
from datetime import datetime
from dateutil.tz import tzlocal
import numpy as np

In [2]:
# Standard PyNWB imports
from pynwb.icephys import CurrentClampStimulusSeries, VoltageClampSeries
from pynwb import NWBHDF5IO

In [3]:
# Imports needed from the ndx-icephys-meta
from ndx_icephys_meta.icephys import ICEphysFile

# 2. Create a basic NWB File for testing

**The following parts of the code are for basic setup only and are unchanged from what is in the current NWB release**

In [4]:
# Create the file
nwbfile = ICEphysFile(
            session_description='my first synthetic recording',
            identifier='EXAMPLE_ID',
            session_start_time=datetime.now(tzlocal()),
            experimenter='Dr. Bilbo Baggins',
            lab='Bag End Laboratory',
            institution='University of Middle Earth at the Shire',
            experiment_description='I went on an adventure with thirteen dwarves to reclaim vast treasures.',
            session_id='LONELYMTN')
# Add a device
device = nwbfile.create_device(name='Heka ITC-1600')
# Add an intracellular electrode
electrode = nwbfile.create_ic_electrode(name="elec0",
                                        description='a mock intracellular electrode',
                                        device=device)
# Create an ic-ephys stimulus
stimulus = CurrentClampStimulusSeries(
            name="ccss",
            data=[1, 2, 3, 4, 5],
            starting_time=123.6,
            rate=10e3,
            electrode=electrode,
            gain=0.02,
            sweep_number=15)
# Create an ic-response
response = VoltageClampSeries(
            name='vcs',
            data=[0.1, 0.2, 0.3, 0.4, 0.5],
            conversion=1e-12,
            resolution=np.nan,
            starting_time=123.6,
            rate=20e3,
            electrode=electrode,
            gain=0.02,
            capacitance_slow=100e-12,
            resistance_comp_correction=70.0,
            sweep_number=15)

In the current workflow, we would now add our response and stimulus to the file via:

```
nwbfile.add_stimulus(stimulus)
nwbfile.add_acquisition(response)
```

This workflow still works, but with the new metadata tables this step can be skipped: the ``add_intracellular_recording`` function adds the stimulus and response to the NWBFile for us if they are not already part of it.
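The "add only if not already part of the file" behavior can be pictured with a plain-Python sketch. The `MiniFile` class and its methods are hypothetical stand-ins invented for illustration, not the ndx-icephys-meta implementation; series are modeled as simple `(name, data)` tuples.

```python
class MiniFile:
    """Hypothetical stand-in for a file that tracks series by name."""

    def __init__(self):
        self.stimulus = {}
        self.acquisition = {}
        self.recordings = []

    def add_stimulus(self, name, data):
        self.stimulus[name] = data

    def add_acquisition(self, name, data):
        self.acquisition[name] = data

    def add_recording(self, stimulus=None, response=None):
        # Register each series only if the file does not already contain it.
        if stimulus is not None and stimulus[0] not in self.stimulus:
            self.add_stimulus(*stimulus)
        if response is not None and response[0] not in self.acquisition:
            self.add_acquisition(*response)
        self.recordings.append((stimulus, response))
        return len(self.recordings) - 1  # row index of the new recording


f = MiniFile()
# Adding the stimulus explicitly first is harmless but not required.
f.add_stimulus("ccss", [1, 2, 3, 4, 5])
row = f.add_recording(stimulus=("ccss", [1, 2, 3, 4, 5]),
                      response=("vcs", [0.1, 0.2, 0.3, 0.4, 0.5]))
print(row, sorted(f.stimulus), sorted(f.acquisition))
```

In this sketch the stimulus is registered once even though it is passed twice, and the response is registered on the fly.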

# 3. Construct our Intracellular Electrophysiology Metadata tables

**The parts in this section constitute the new elements of this proposal.**

### Add an intracellular recording
Add an intracellular recording consisting of an electrode, stimulus, and response. Optionally, the user may set the ``id`` field for the recording.

In [5]:
nwbfile.add_intracellular_recording(electrode=electrode,
                                    stimulus=stimulus,
                                    response=response,
                                    id=10)

0

**Note:** A recording may optionally also consist of just an electrode and stimulus or an electrode and response, but at least one of stimulus or response is required.

**Note:** If the ``id`` is omitted, then PyNWB will automatically number recordings in sequence (i.e., the id is the same as the row number).

**Note:** The IntracellularRecordings table is optional and will be created automatically by ICEphysFile the first time the table is modified.

**Note:** If the given electrode, stimulus, or response are not part of the nwbfile object, then they will be automatically added to it here.
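The rule that a recording needs an electrode plus at least one of stimulus or response can be sketched as a small argument check. The function name, the exception type, and the messages are assumptions made for illustration; the extension may validate these arguments differently.

```python
def check_recording(electrode, stimulus=None, response=None):
    """Hypothetical check: an electrode plus at least one of stimulus/response."""
    if electrode is None:
        raise ValueError("electrode is required")
    if stimulus is None and response is None:
        raise ValueError("at least one of stimulus or response is required")
    return {"electrode": electrode, "stimulus": stimulus, "response": response}


# Both partial forms are accepted in this sketch.
stimulus_only = check_recording("elec0", stimulus="ccss")
response_only = check_recording("elec0", response="vcs")

# An electrode alone is rejected.
try:
    check_recording("elec0")
    error = None
except ValueError as e:
    error = str(e)
print(error)
```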

**Note:** The IntracellularRecordings, Sweeps, SweepSequences, Runs, and Conditions tables all enforce unique ids when adding rows. For example, adding an intracellular recording with an id that already exists results in an error:

In [6]:
try:
    nwbfile.add_intracellular_recording(electrode=electrode,
                                        stimulus=stimulus,
                                        response=response,
                                        id=10)
except ValueError as e:
    print("ValueError raised with message: '%s' "  % str(e))

ValueError raised with message: 'id 10 already in the table' 
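The two id rules described above (sequential numbering by default, an error on duplicates) can be sketched in plain Python. `MiniTable` is a hypothetical class written for illustration only; it mirrors the error message shown above but is not the extension's code.

```python
class MiniTable:
    """Hypothetical table that stores one id per row."""

    def __init__(self):
        self.ids = []

    def add_row(self, id=None):
        # Default: number rows in sequence, so the id equals the row index.
        if id is None:
            id = len(self.ids)
        # Reject ids that are already present.
        if id in self.ids:
            raise ValueError("id %i already in the table" % id)
        self.ids.append(id)
        return len(self.ids) - 1  # row index of the new row


table = MiniTable()
first = table.add_row(id=10)
try:
    table.add_row(id=10)
    duplicate_error = None
except ValueError as e:
    duplicate_error = str(e)
print(first, duplicate_error)
```

Note that the return value is the row index, not the id, which is why adding a row with `id=10` to an empty table yields 0.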


### Add a sweep
Add a single sweep consisting of a set of intracellular recordings. Again, setting the ``id`` for a sweep is optional. This table is also optional and will be created automatically by ICEphysFile. The ``recordings`` argument of the ``add_ic_sweep`` function is simply a list of ints with the indices of the corresponding rows in the IntracellularRecordings table.

In [7]:
nwbfile.add_ic_sweep(recordings=[0], id=12)

0

**Note:** The ``recordings`` argument is the list of indices of the rows in our intracellular recordings table that we want to reference. The indices are determined by the order in which we added the elements to the table.

If we don't know the row indices, but only the ids of the intracellular recordings that we want to reference, then we can search for the ids as follows:

In [8]:
row_indices = (nwbfile.intracellular_recordings.id == [10,])
print(row_indices)

[0]


**Note:** The same is true for our other tables as well, i.e., referencing is done by row indices (NOT ids). If we only know the ids, we can search for them in the same manner on the other tables, e.g., ``nwbfile.ic_sweeps.id == 12``. The search accepts either a list of integer ids or a single int.
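To make the distinction between ids and row indices concrete, here is a minimal plain-Python sketch of such a lookup. The helper `rows_for_ids` and the id values are hypothetical; the sketch only mimics the idea of the `id == ...` search and is not how the library implements it.

```python
def rows_for_ids(table_ids, wanted):
    """Return the row indices whose id is in `wanted` (an int or a list of ints)."""
    if isinstance(wanted, int):
        wanted = [wanted]
    wanted = set(wanted)
    return [row for row, id_ in enumerate(table_ids) if id_ in wanted]


recording_ids = [10, 11, 42]  # hypothetical id column, in insertion order
print(rows_for_ids(recording_ids, [10, 42]))  # -> [0, 2]
print(rows_for_ids(recording_ids, 11))        # -> [1]
```

The returned row indices are what the ``recordings``, ``sweeps``, ``sweep_sequences``, and ``runs`` arguments expect.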

### Add a sweep sequence
Add a single sweep sequence consisting of a set of sweeps. Again, setting the ``id`` for a sweep sequence is optional. This table is also optional and will be created automatically by ICEphysFile. The ``sweeps`` argument of the ``add_ic_sweep_sequence`` function is simply a list of ints with the indices of the corresponding rows in the Sweeps table.

In [9]:
nwbfile.add_ic_sweep_sequence(sweeps=[0], id=15)

0

### Add a run
Add a single run consisting of a set of sweep sequences. Again, setting the ``id`` for a run is optional. This table is also optional and will be created automatically by ICEphysFile. The ``sweep_sequences`` argument of the ``add_ic_run`` function is simply a list of ints with the indices of the corresponding rows in the SweepSequences table.

In [10]:
nwbfile.add_ic_run(sweep_sequences=[0], id=17)

0

### Add a condition
Add a single condition consisting of a set of runs. Again, setting the ``id`` for a condition is optional. This table is also optional and will be created automatically by ICEphysFile. The ``runs`` argument of the ``add_ic_condition`` function is simply a list of ints with the indices of the corresponding rows in the Runs table.

In [11]:
nwbfile.add_ic_condition(runs=[0], id=19)

0
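Each row of the Sweeps, SweepSequences, Runs, and Conditions tables references a variable-length list of row indices, so these columns are ragged. The validation code in Section 6 inspects the VectorData and VectorIndex behind these columns; in HDMF-style tables, such a column is commonly stored as one flat data array plus an index of cumulative end offsets. The following plain-Python sketch illustrates that layout under this assumption. The function names and the example rows are hypothetical.

```python
def pack_ragged(rows):
    """Flatten a list of lists into (data, index) with cumulative end offsets."""
    data, index = [], []
    for row in rows:
        data.extend(row)
        index.append(len(data))  # end offset of this row in the flat data
    return data, index


def unpack_row(data, index, i):
    """Recover row i from the flat data and the end-offset index."""
    start = index[i - 1] if i > 0 else 0
    return data[start:index[i]]


# Three hypothetical sweeps referencing rows of the recordings table
sweeps = [[0], [1, 2], [3, 4, 5]]
data, index = pack_ragged(sweeps)
print(data)                        # [0, 1, 2, 3, 4, 5]
print(index)                       # [1, 3, 6]
print(unpack_row(data, index, 1))  # [1, 2]
```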

To add additional columns to any of the tables, we can use the ``.add_column`` function on the corresponding table after it has been created. We can then also add new rows that include values for the new column.

In [12]:
nwbfile.ic_conditions.add_column(name='tag', data=np.arange(1), description='integer tag for a sweep')
nwbfile.add_ic_condition(runs=[0], id=21, tag=3)

1
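In the cell above, `data=np.arange(1)` supplies one value for the single condition that already exists, which is why that row ends up with a tag of 0. A plain-Python sketch of this bookkeeping follows; `MiniColumnTable` is a hypothetical class, and the length check and error messages are assumptions for illustration rather than the library's behavior.

```python
class MiniColumnTable:
    """Hypothetical table with ids and user-defined columns."""

    def __init__(self):
        self.ids = []
        self.columns = {}

    def add_row(self, id, **values):
        # Every existing column needs a value for the new row.
        if set(values) != set(self.columns):
            raise ValueError("row must provide a value for every column")
        self.ids.append(id)
        for name, value in values.items():
            self.columns[name].append(value)
        return len(self.ids) - 1

    def add_column(self, name, data):
        # A new column must cover every row that already exists.
        data = list(data)
        if len(data) != len(self.ids):
            raise ValueError("column needs %i values" % len(self.ids))
        self.columns[name] = data


conditions = MiniColumnTable()
conditions.add_row(id=19)
conditions.add_column(name="tag", data=range(1))  # one value for the existing row
row = conditions.add_row(id=21, tag=3)
print(row, conditions.ids, conditions.columns)
```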

# 4. Accessing the tables

All icephys metadata tables are available as attributes on the nwbfile. The following cells convert each table to a pandas DataFrame to show its content.

In [13]:
nwbfile.intracellular_recordings.to_dataframe()

| id | stimulus | response | electrode |
|----|----------|----------|-----------|
| 10 | (0, 5, ccss pynwb.icephys.CurrentClampStimulus... | (0, 5, vcs pynwb.icephys.VoltageClampSeries at... | elec0 pynwb.icephys.IntracellularElectrode at ... |


In [14]:
nwbfile.ic_sweeps.to_dataframe()

| id | recordings |
|----|------------|
| 12 | s... |


In [15]:
nwbfile.ic_sweep_sequences.to_dataframe()

| id | sweeps |
|----|--------|
| 15 | rec... |


In [16]:
nwbfile.ic_runs.to_dataframe()

| id | sweep_sequences |
|----|-----------------|
| 17 | ... |


In [17]:
nwbfile.ic_conditions.to_dataframe()

| id | runs | tag |
|----|------|-----|
| 19 | sweep_se... | 0 |
| 21 | sweep_se... | 3 |


# 5. Read/Write the data

**Read/Write of the file is unchanged from what is in the current NWB release**

In [18]:
# Write our test file
testpath = "test_icephys_file.h5"
with NWBHDF5IO(testpath, 'w') as io:
    io.write(nwbfile)

In [19]:
# Read the data back in
with NWBHDF5IO(testpath, 'r') as io:
    infile = io.read() 

# 6. Validate that the data we have written is the same

In the following we read the data again and use assertions to confirm that the data in the low-level h5py datasets on disk matches the data we expect.

In [20]:
# Read the data back in
with NWBHDF5IO(testpath, 'r') as io:
    infile = io.read() 
   
    # assert intracellular_recordings
    assert np.all(infile.intracellular_recordings.id[:] == nwbfile.intracellular_recordings.id[:])
   
    # Assert that the ids and the VectorData, VectorIndex, and table target of the recordings column of the Sweeps table are correct
    assert np.all(infile.ic_sweeps.id[:] == nwbfile.ic_sweeps.id[:])
    assert np.all(infile.ic_sweeps['recordings'].target.data[:] == nwbfile.ic_sweeps['recordings'].target.data[:])
    assert np.all(infile.ic_sweeps['recordings'].data[:] == nwbfile.ic_sweeps['recordings'].data[:])
    assert infile.ic_sweeps['recordings'].target.table.name == nwbfile.ic_sweeps['recordings'].target.table.name 
    
    # Assert that the ids and the VectorData, VectorIndex, and table target of the sweeps column of the SweepSequences table are correct
    assert np.all(infile.ic_sweep_sequences.id[:] == nwbfile.ic_sweep_sequences.id[:])
    assert np.all(infile.ic_sweep_sequences['sweeps'].target.data[:] == nwbfile.ic_sweep_sequences['sweeps'].target.data[:])
    assert np.all(infile.ic_sweep_sequences['sweeps'].data[:] == nwbfile.ic_sweep_sequences['sweeps'].data[:])
    assert infile.ic_sweep_sequences['sweeps'].target.table.name == nwbfile.ic_sweep_sequences['sweeps'].target.table.name 
    
    # Assert that the ids and the VectorData, VectorIndex, and table target of the sweep_sequences column of the Runs table are correct
    assert np.all(infile.ic_runs.id[:] == nwbfile.ic_runs.id[:])
    assert np.all(infile.ic_runs['sweep_sequences'].target.data[:] == nwbfile.ic_runs['sweep_sequences'].target.data[:])
    assert np.all(infile.ic_runs['sweep_sequences'].data[:] == nwbfile.ic_runs['sweep_sequences'].data[:])
    assert infile.ic_runs['sweep_sequences'].target.table.name == nwbfile.ic_runs['sweep_sequences'].target.table.name 
    
    # Assert that the ids and the VectorData, VectorIndex, and table target of the runs column of the Conditions table are correct
    assert np.all(infile.ic_conditions.id[:] == nwbfile.ic_conditions.id[:])
    assert np.all(infile.ic_conditions['runs'].target.data[:] == nwbfile.ic_conditions['runs'].target.data[:])
    assert np.all(infile.ic_conditions['runs'].data[:] == nwbfile.ic_conditions['runs'].data[:])
    assert infile.ic_conditions['runs'].target.table.name == nwbfile.ic_conditions['runs'].target.table.name 
    assert np.all(infile.ic_conditions['tag'][:] == nwbfile.ic_conditions['tag'][:])
    
    # Show all the tables for visual validation
    print(infile.intracellular_recordings.name)
    display(infile.intracellular_recordings.to_dataframe())
    print(infile.ic_sweeps.name)
    display(infile.ic_sweeps.to_dataframe())
    print(infile.ic_sweep_sequences.name)
    display(infile.ic_sweep_sequences.to_dataframe())
    print(infile.ic_runs.name)
    display(infile.ic_runs.to_dataframe())
    print(infile.ic_conditions.name)
    display(infile.ic_conditions.to_dataframe())
    print('All Metadata')
    display(infile.ic_conditions.to_hierarchical_dataframe())

intracellular_recordings


| id | stimulus | response | electrode |
|----|----------|----------|-----------|
| 10 | (0, 5, ccss pynwb.icephys.CurrentClampStimulus... | (0, 5, vcs pynwb.icephys.VoltageClampSeries at... | elec0 pynwb.icephys.IntracellularElectrode at ... |


sweeps


| id | recordings |
|----|------------|
| 12 | s... |


sweep_sequences


| id | sweeps |
|----|--------|
| 15 | rec... |


runs


| id | sweep_sequences |
|----|-----------------|
| 17 | ... |


conditions


| id | runs | tag |
|----|------|-----|
| 19 | sweep_se... | 0 |
| 21 | sweep_se... | 3 |


All Metadata


| conditions_id | conditions_tag | runs_id | sweep_sequences_id | sweeps_id | (intracellular_recordings, id) | (intracellular_recordings, stimulus) | (intracellular_recordings, response) | (intracellular_recordings, electrode) |
|---|---|---|---|---|---|---|---|---|
| 19 | 0 | 17 | 15 | 12 | 10 | [0, 5, ccss pynwb.icephys.CurrentClampStimulus... | [0, 5, vcs pynwb.icephys.VoltageClampSeries at... | elec0 pynwb.icephys.IntracellularElectrode at ... |
| 21 | 3 | 17 | 15 | 12 | 10 | [0, 5, ccss pynwb.icephys.CurrentClampStimulus... | [0, 5, vcs pynwb.icephys.VoltageClampSeries at... | elec0 pynwb.icephys.IntracellularElectrode at ... |
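The "All Metadata" view produced by `to_hierarchical_dataframe` has one row per intracellular recording, with the ids of the enclosing sweep, sweep sequence, run, and condition repeated alongside it. A minimal plain-Python sketch of that flattening is given here, assuming each table is just a list of `(id, child_row_indices)` pairs. This layout and the `flatten` helper are hypothetical and chosen for illustration; they are not the library's data model.

```python
def flatten(tables, level=0, row=0, prefix=()):
    """Yield one tuple of ids per leaf row reachable from tables[level][row]."""
    row_id, children = tables[level][row]
    path = prefix + (row_id,)
    if level == len(tables) - 1:
        yield path  # reached the recordings table
        return
    for child in children:
        yield from flatten(tables, level + 1, child, path)


# Ids mirror the example in this notebook; children are row indices, not ids.
conditions = [(19, [0]), (21, [0])]
runs = [(17, [0])]
sweep_sequences = [(15, [0])]
sweeps = [(12, [0])]
recordings = [(10, [])]
tables = [conditions, runs, sweep_sequences, sweeps, recordings]

rows = [path for r in range(len(conditions)) for path in flatten(tables, 0, r)]
print(rows)  # [(19, 17, 15, 12, 10), (21, 17, 15, 12, 10)]
```

Both conditions reference the same run, so the same recording appears once per condition, as in the table above.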
