tutorial covering DynamicTable basics #320

cechava · 2021-10-14T14:03:43Z

hdmf_common objects covered:

VectorData
VectorIndex
DynamicTable
DynamicTableRegion

resolves issue #318

codecov · 2021-10-14T14:17:00Z

Codecov Report

Merging #320 (ceb23ec) into master (1647dc4) will not change coverage.
The diff coverage is n/a.

❗ Current head ceb23ec differs from pull request most recent head c83da30. Consider uploading reports for the commit c83da30 to get more accurate results

@@           Coverage Diff           @@
##           master     #320   +/-   ##
=======================================
  Coverage   85.66%   85.66%           
=======================================
  Files         125      125           
  Lines        4709     4709           
=======================================
  Hits         4034     4034           
  Misses        675      675

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1647dc4...c83da30. Read the comment docs.

bendichter

Looking good, @cechava ! You are definitely on the right track.

If no value is passed to id, the corresponding field within the DynamicTable object is an empty array.

When you nwbExport, does this create a DynamicTable that contains an ElementIdentifiers with a different length from the columns in the table? That is not allowed

You may also supply an optional row ID.

What happens if an ID is not provided?

Ragged array columns

Could you add info about util.create_indexed_column?

Now, you can write the file, read it back, and run:

Can you demonstrate this?

cechava · 2021-10-14T16:18:57Z

@bendichter is there a way to write a bare-bones DynamicTable object to file?
In Python HDMF one can do this with HDF5IO from the hdmf.backends.hdf5 module.

In the DataPipe tutorial a numeric matrix is first put into a TimeSeries object (which is itself a DynamicTable) before putting into a NWBFile and exported. A bare-boned DynamicTable object might have non-numeric data so this is not the best.
Do you know of any other way?

bendichter · 2021-10-14T16:21:42Z

Sorry, I don’t understand the question. Is this about DynamicTables or DataPipes?

On Thu, Oct 14, 2021 at 12:19 PM cechava ***@***.***> wrote: @bendichter <https://github.com/bendichter> is there a way to write a bare-bones DynamicTable object to file? In Python HDMF one can do this with HDF5IO from the hdmf.backends.hdf5 module. In the DataPipe tutorial a numeric matrix is first put into a TimeSeries object (which is itself a DynamicTable) before putting into a NWBFile and exported. A bare-boned DynamicTable object might have non-numeric data so this is not the best. Do you know of any other way? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#320 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAGOEEQBOSDTLOJWFBJRTHTUG37HZANCNFSM5F7X26TA> .

-- Ben Dichter, PhD Data Science Consultant personal website <http://bendichter.com/>

cechava · 2021-10-14T16:37:37Z

Tagging @ln-vidrio, in case they know
The question is about DynamicTables. Is there a straightforward way to write a bare-bones DynamicTable of the type below to file?

col1_exp = types.hdmf_common.VectorData( ...
    'description', 'column #1', ...
    'data', types.untyped.DataPipe('data', [1; 2], ...  # data must be numerical
        'maxSize', [Inf, 1], ...
        'axis', 1 ...
    ) ...
);

col2_exp = types.hdmf_common.VectorData('description', 'column #2',...
    'data',types.untyped.DataPipe('data', [10; 20], ...  #data must be numerical
        'maxSize', [Inf, 1], ...
        'axis', 1 ...
    ) ...
);

ids_exp = types.hdmf_common.ElementIdentifiers('data', ...
    types.untyped.DataPipe('data', [0;1], ...  # data must be numerical
        'maxSize', [1, Inf], ...
        'axis', 1 ...
    ) ...
);

table_exp = types.hdmf_common.DynamicTable('description','an expandable table',...
    'colnames', {'col1','col2'},...
    'col1',col1_exp,'col2',col2_exp,...
    'id', ids_exp);

cechava · 2021-10-14T21:22:15Z

@bendichter
Another issue is the behavior of ObjectView. You mentioned that it is possible to pass in the object itself rather than the full hdf5 path to the object. However, it doesn't seem to be the case

For example, when I run the following code to create a DynamicTable with ragged array columns:

col1 = types.hdmf_common.VectorData( ...
    'description', 'column #1', ...
    'data', {'1a'; '1b'; '1c'; '2a'} ...
);
col1_index = types.hdmf_common.VectorIndex( ...
    'description', 'column #1 index',...
    'target',types.untyped.ObjectView('/col1'), ...  % object view of target column
    'data', [3; 4] ...
);
table_ragged_col = types.hdmf_common.DynamicTable( ...
    'description', 'an example table', ...
    'colnames', {'col1','col1_index'}, ...
    'col1', col1, ...
    'col1_index', col1_index, ...
    'id', types.hdmf_common.ElementIdentifiers('data', [0; 1]) ...  % 0-indexed, for compatibility with Python
);
table_ragged_col.getRow(1)

The output is, as expected:

       col1       col1_index
    __________    __________

    {3×1 cell}        3

However, using the suggested method of casting the object itself as an object view:

col1_index = types.hdmf_common.VectorIndex( ...
    'description', 'column #1 index',...
    'target',types.untyped.ObjectView(col1), ...  % object view of target column
    'data', [3; 4] ...
);

table_ragged_col = types.hdmf_common.DynamicTable( ...
    'description', 'an example table', ...
    'colnames', {'col1','col1_index'}, ...
    'col1', col1, ...
    'col1_index', col1_index, ...
    'id', types.hdmf_common.ElementIdentifiers('data', [0; 1]) ...  % 0-indexed, for compatibility with Python
);

table_ragged_col.getRow(1)

Return the clearly wrong output.

     col1     col1_index
    ______    __________

    {'1a'}        3

Clearly, VectorIndex column is not formatting the ragged array column properly in the second scenario.
Let me know if there's something I'm not doing right.

For now, I will keep the HDF5 path in the tutorials since that produces the expected behavior.

cechava · 2021-10-14T21:28:45Z

I can open an issue with the above, if preferable

…o a ragged array column

bendichter · 2021-10-14T22:21:27Z

@cechava good catch, but I think you are identifying an issue with getRow, not with ObjectView. Yes, I think this should be its own issue.

…ged array column

…ials_hdmf

…nctions

…ials_hdmf

bendichter

@cechava, this is really excellent! You've already put a lot of work into this, and it's almost done! I made a few edits, for small typos and cleaned up spacing a bit. I wanted to bring up two points though:

Lines 60-73: Are you sure that should be a cell array? It seems like it should be a standard tensor.

Lines 108-129: Does this do the same thing as table_ragged_numeric.getRow(1, 'columns', {'col_num'})? If so, maybe we should just have that one and remove this, since I think getRow is easier to understand.

cechava · 2021-12-06T17:39:46Z

@bendichter thanks for feedback.

Lines 60-73: I have corrected this in latest commit. The multidimensional table is properly formatted.

Lines 108-129: To answer your question first, the following would produce the same output:
table_ragged_numeric.getRow(1, 'columns', {'col_num'}).col_num{1}
The purpose of including this bit is to give an example of how to use the read_indexed_column utility function. I agree that getRow is easier to understand, but it returns a table rather than the data itself. Because this is a utility function that we use in other NWB tutorials, I think this bit is worth keeping. Let me know if you agree.

bendichter · 2021-12-06T19:31:52Z

It was a utility function that I wrote before `getRow` was available. I'm thinking maybe we should deprecate it in favor of `getRow`, and adjust the tutorial accordingly (we need to update them to use `addRow` and `addColumn` anyway). What do you think? It's easy to get the matrix from the MATLAB Table object, right?

…

On Mon, Dec 6, 2021 at 9:39 AM cechava ***@***.***> wrote: @bendichter <https://github.com/bendichter> thanks for feedback. Lines 60-73: I have corrected this in latest commit. The multidimensional table is properly formatted. Lines 108-129: To answer your question first, the following would produce the same output: table_ragged_numeric.getRow(1, 'columns', {'col_num'}).col_num{1} The purpose of including this bit is to give an example of how to use the read_indexed_column utility function. I agree that getRow is easier to understand, but it returns a table rather than the data itself. Because this is a utility function that we use in other NWB tutorials, I think this bit is worth keeping. Let me know if you agree. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#320 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAGOEEQUGMQVANVRDT4PEZLUPTYO3ANCNFSM5F7X26TA> .

-- Ben Dichter, PhD Data Science Consultant personal website <http://bendichter.com/>

cechava · 2021-12-06T23:00:10Z

yes, it's easy to get the matrix from the table object. I agree with deprecating and updating tutorials.

bendichter · 2021-12-07T04:26:57Z

start and stop time are the wrong shape

bendichter · 2021-12-07T04:56:04Z

Can you please test writing for these? Lines 60-73 was still incorrect, and throws an error when writing:

col5 = types.hdmf_common.VectorData( ...
    'description', 'column #5',...
    'data', ...
    { ...
        [ ...
            [1, 2, 3]; ...
            [4, 5, 6] ...
        ]; ...
        [ ...
            [11,12,13]; ...
            [14,15,16] ...
        ] ...
    } ...
);

multi_dim_table = types.core.TimeIntervals( ...
    'description','test table', ...
    'colnames', {'col5', 'start_time', 'stop_time'}, ...
    'start_time', types.hdmf_common.VectorData('description', 'hi', 'data', [1.; 2.]), ...
    'stop_time', types.hdmf_common.VectorData('description', 'hi', 'data', [1.; 2.]), ...
    'col5', col5, ...
    'id', types.hdmf_common.ElementIdentifiers('data', [0; 1]) ...  % 0-indexed, for compatibility with Python
);
>> file.intervals.set('multi', multi_dim_table);
>> nwbExport(file, 'test_multi.nwb');

it needs to be e.g.

data = [[1, 2, 3]; [4, 5, 6]; [7, 8, 9]];
data(:, :, 2) = [ [11, 12, 13]; [14, 15, 16]; [17, 18, 19]];

col5 = types.hdmf_common.VectorData( ...
    'description', 'column #5',...
    'data', data ...
);

multi_dim_table = types.core.TimeIntervals( ...
    'description','test table', ...
    'colnames', {'col5', 'start_time', 'stop_time'}, ...
    'start_time', types.hdmf_common.VectorData('description', 'hi', 'data', [1.; 2.]), ...
    'stop_time', types.hdmf_common.VectorData('description', 'hi', 'data', [1.; 2.]), ...
    'col5', col5, ...
    'id', types.hdmf_common.ElementIdentifiers('data', [0; 1]) ...  % 0-indexed, for compatibility with Python
);

cechava · 2021-12-07T19:37:45Z

The reason for using cell arrays is to follow the lead of this snippet which adds multi-dimensional rows to a DynamicTable object:

matnwb/+tests/+system/DynamicTableTest.m

Line 14 in 131fdff

'randomvalues', {rand(5,2);rand(3,2)},...

Below I test defining a multi-dimensional column by either : 1) adding rows to a table or 2) defining the column from the beginning

% -------- Build row by row
file= NwbFile( ...
    'session_start_time', '2021-01-01 00:00:00', ...
    'identifier', 'ident1', ...
    'session_description', 'test' ...
    );

colnames = {'start_time', 'stop_time', 'randomvalues'};
multi_dim_table1 = types.core.TimeIntervals(...
    'description', 'test dynamic table column',...
    'colnames', colnames);
            
for i = 1:2
    multi_dim_table1.addRow(...
        'start_time', i,...
        'stop_time', i+1,...
        'randomvalues', {rand(5,2)},...
        'id', i-1);
end
file.intervals.set('multi', multi_dim_table1);
nwbExport(file, 'test_multi_addRow.nwb');

% -------- Define in one go

start_times = types.hdmf_common.VectorData( ...
    'description', 'start_times', ...
    'data', [1;2] ...
);

stop_times = types.hdmf_common.VectorData( ...
    'description', 'stop_times', ...
    'data', [2;3] ...
);

randomvalues = types.hdmf_common.VectorData( ...
    'description', 'random values',...
    'data', ...
    [ ...
        {rand(5,2)}; ...
        {rand(5,2)} ...
    ] ...
);

multi_dim_table2 = types.core.TimeIntervals( ...
    'description','test table', ...
    'colnames', {'randomvalues', 'start_time', 'stop_time'}, ...
    'start_time', start_times, ...
    'stop_time', stop_times, ...
    'randomvalues', randomvalues, ...
     'id', types.hdmf_common.ElementIdentifiers('data', [0; 1]) ...
 );
file.intervals.set('multi', multi_dim_table2);
nwbExport(file, 'test_multi_oneGo.nwb');

The second approach leads to a write error. The output of the toTable utility function is the same for both

>> multi_dim_table1.toTable()

ans =

  2×4 table

    id    start_time    stop_time    randomvalues
    __    __________    _________    ____________

    0         1             2        {5×2 double}
    1         2             3        {5×2 double}

>> multi_dim_table2.toTable()

ans =

  2×4 table

    id    randomvalues    start_time    stop_time
    __    ____________    __________    _________

    0     {5×2 double}        1             2    
    1     {5×2 double}        2             3

The difference between both these tables is in the fact that the table that was built row-by-row has an index for the multidimensional column.

>> multi_dim_table1.vectordata

ans = 

  2×1 Set array with properties:

          randomvalues: [types.hdmf_common.VectorData]
    randomvalues_index: [types.hdmf_common.VectorIndex]

>> multi_dim_table2.vectordata

ans = 

  Set with properties:

    randomvalues: [types.hdmf_common.VectorData]

We probably want to do something similar for any table that is created with multidimensional columns from the beginning. Do you agree @ln-vidrio?

Including the error traceback below for reference:

Error using unicode2native
At least one dimension must be equal to 1.

Error in io.mapData2H5 (line 62)
            data{i} = char(unicode2native(char(data{i})));

Error in io.writeDataset (line 5)
[tid, sid, data] = io.mapData2H5(fid, data, varargin{:});

Error in types.untyped.MetaClass/write_base (line 26)
                    io.writeDataset(fid, fullpath, obj.data, 'forceArray');

Error in types.untyped.MetaClass/export (line 66)
            refs = obj.write_base(fid, fullpath, refs);

Error in types.hdmf_common.Data/export (line 39)
        refs = export@types.untyped.MetaClass(obj, fid, fullpath, refs);

Error in types.hdmf_common.VectorData/export (line 53)
        refs = export@types.hdmf_common.Data(obj, fid, fullpath, refs);

Error in types.untyped.Set/export (line 181)
                    refs = v.export(fid, propfp, refs);

Error in types.hdmf_common.DynamicTable/export (line 114)
            refs = obj.vectordata.export(fid, fullpath, refs);

Error in types.core.TimeIntervals/export (line 91)
        refs = export@types.hdmf_common.DynamicTable(obj, fid, fullpath, refs);

Error in types.untyped.Set/export (line 181)
                    refs = v.export(fid, propfp, refs);

Error in types.core.NWBFile/export (line 1016)
            refs = obj.intervals.export(fid, [fullpath '/intervals'], refs);

Error in NwbFile/export (line 61)
                refs = export@types.core.NWBFile(obj, output_file_id, '/', {});

Error in nwbExport (line 35)
    export(nwb(i), filename);

Error in hdmf_tutorial_test (line 227)
nwbExport(file, 'test_multi_oneGo.nwb');

cechava · 2021-12-07T20:05:45Z

though looking at how things are done in PyNWB, I see @bendichter's point. I generated an analogous table with multi-dimensional columns in pynwb:

from datetime import datetime
from dateutil.tz import tzlocal
from pynwb import NWBFile
import numpy as np
from pynwb import NWBHDF5IO

start_time = datetime(2017, 4, 3, 11, tzinfo=tzlocal())

nwbfile = NWBFile(session_description='demonstrate NWBFile basics',  # required
                 identifier='NWB123',  # required
                 session_start_time=start_time)

nwbfile.add_trial_column(name='randomvalues', description='')

for i in range(2):
   nwbfile.add_trial(start_time=float(i+1), stop_time=float(i+2), randomvalues=np.random.rand(3,2),id = i)
with NWBHDF5IO('pynwb_multi_test.nwb', 'w') as io:
   io.write(nwbfile)

The following is the format of that multi-dimensional column:

The different rows are stacked along the third dimension. In contrast, the way it is done in matnwb now produces this:

I think it's worth trying to match the way multi-dimensional columns are written to file.

lawrence-mbf · 2021-12-07T20:46:40Z

So I'm not too clear on the precise issue. I think the misunderstanding is that addRow is intended to literally only add one "row" (defined as only one ID per call of the function). The cell arrays in addRow are used solely to create and test nested ragged arrays and has nothing to do with multidimensional rows nor are used with any types.hdmf_common.* classes.

In fact, calling this snippet twice:

for i = 1:2
  DynamicTable.addRow('randomvalues', rand(1, 3,2));
end

Should produce something similar as in pynwb as concatenation is automatically done on the first dimension (in MATLAB).

cechava · 2021-12-07T20:52:44Z

I see. Thanks for clarifying. I misunderstood the intention behind the cell array

cechava · 2021-12-07T21:04:25Z

ok so properly defining multidimensional columns encounters an error with the new checkConfig utility function. I have opened an issue and will fix the function so it can handle multi-dimensional columns. I will also add to the DynamicTable tests so that multidimensional columns are covered in case something breaks them in the future.

…ials_hdmf

…nsional columns; wont work until PR 367 is merged

…ials_hdmf

cechava · 2022-01-13T20:11:59Z

depends on PR #387 and #388 to work properly

…ials_hdmf

cechava · 2022-02-02T10:19:19Z

updated and tested after merging of PR #387 and #388

change title add link to README

* tutorial covering DynamicTable basics * style changes * added warnings about empty id fields; added demo of how to add rows to a ragged array column * added warnings about empty id fields; added demo of how to add rows to a ragged array column * aded demo of utils.read_indexed_column to get row elements from a ragged array column * ObjectView takes object instead of full HDF5 path * added expandable table demo to tutorial * added info related to addColumns, toTable, and configCheck utility functions * fixed typo * fixed a few typos * fixed multidimensional table * uncommented setup * removal of read_indexed_column utility function * corrected and tested multi-dimensional column tables * updated hdmf tutorial to work with rows-last convention for multidimensional columns; wont work until PR 367 is merged * progress save * updated tutorials for expected behavior of addrow * updated tutorial for multidimensional columns * uncomment setup * change name to dynamic_tables.mlx * add html change title add link to README Co-authored-by: cechava <cesar@Cesars-MacBook-Pro-2.local> Co-authored-by: bendichter <ben.dichter@gmail.com> Co-authored-by: cechava <cesar@cesars-mbp-2.attlocal.net>

tutorial covering DynamicTable basics

282c8bd

bendichter self-requested a review October 14, 2021 15:12

bendichter requested changes Oct 14, 2021

View reviewed changes

style changes

2512664

cechava added 2 commits October 14, 2021 17:56

added warnings about empty id fields; added demo of how to add rows t…

b891548

…o a ragged array column

added warnings about empty id fields; added demo of how to add rows t…

ab90e47

…o a ragged array column

cechava marked this pull request as draft October 14, 2021 21:59

cechava added 10 commits October 14, 2021 20:39

aded demo of utils.read_indexed_column to get row elements from a rag…

811a988

…ged array column

Merge branch 'master' of https://github.com/cechava/matnwb into tutor…

bcaab2b

…ials_hdmf

ObjectView takes object instead of full HDF5 path

ae4b9b0

Merge branch 'master' of https://github.com/cechava/matnwb into tutor…

0f05dfa

…ials_hdmf

Merge branch 'master' of https://github.com/cechava/matnwb into tutor…

0cf59df

…ials_hdmf

added expandable table demo to tutorial

d74cf7a

Merge branch 'master' of https://github.com/cechava/matnwb into tutor…

9f0f6db

…ials_hdmf

Merge branch 'master' of https://github.com/cechava/matnwb into tutor…

7447cba

…ials_hdmf

added info related to addColumns, toTable, and configCheck utility fu…

4904e34

…nctions

fixed typo

1d9bc7a

cechava marked this pull request as ready for review December 2, 2021 05:28

Merge branch 'master' of https://github.com/cechava/matnwb into tutor…

31c76e6

…ials_hdmf

cechava mentioned this pull request Dec 2, 2021

NWB DynamicTable --> MATLAB Table converter #111

Closed

7 tasks

cechava requested a review from bendichter December 2, 2021 16:12

fixed a few typos

4b85736

bendichter requested changes Dec 3, 2021

View reviewed changes

cechava added 2 commits December 6, 2021 11:27

fixed multidimensional table

e082b16

uncommented setup

8997134

removal of read_indexed_column utility function

b160bda

cechava and others added 6 commits December 9, 2021 11:20

Merge branch 'master' of https://github.com/cechava/matnwb into tutor…

13a225e

…ials_hdmf

corrected and tested multi-dimensional column tables

a4dd6bc

updated hdmf tutorial to work with rows-last convention for multidime…

0a97301

…nsional columns; wont work until PR 367 is merged

Merge branch 'master' of https://github.com/cechava/matnwb into tutor…

05d748f

…ials_hdmf

progress save

7f8744a

updated tutorials for expected behavior of addrow

6222562

cechava added 3 commits February 1, 2022 21:17

Merge branch 'master' of https://github.com/cechava/matnwb into tutor…

a069a5b

…ials_hdmf

updated tutorial for multidimensional columns

9c7bf8a

uncomment setup

ceb23ec

cechava requested a review from bendichter February 2, 2022 10:18

bendichter added 2 commits February 16, 2022 19:56

change name to dynamic_tables.mlx

56be015

add html

c83da30

change title add link to README

bendichter merged commit 412290d into NeurodataWithoutBorders:master Feb 17, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

tutorial covering DynamicTable basics #320

tutorial covering DynamicTable basics #320

cechava commented Oct 14, 2021 •

edited

Loading

codecov bot commented Oct 14, 2021 •

edited

Loading

bendichter left a comment

cechava commented Oct 14, 2021

bendichter commented Oct 14, 2021 via email

cechava commented Oct 14, 2021 •

edited

Loading

cechava commented Oct 14, 2021 •

edited

Loading

cechava commented Oct 14, 2021

bendichter commented Oct 14, 2021

bendichter left a comment

cechava commented Dec 6, 2021

bendichter commented Dec 6, 2021 via email

cechava commented Dec 6, 2021

bendichter commented Dec 7, 2021

bendichter commented Dec 7, 2021

cechava commented Dec 7, 2021

cechava commented Dec 7, 2021 •

edited

Loading

lawrence-mbf commented Dec 7, 2021

cechava commented Dec 7, 2021

cechava commented Dec 7, 2021

cechava commented Jan 13, 2022 •

edited

Loading

cechava commented Feb 2, 2022

tutorial covering DynamicTable basics #320

tutorial covering DynamicTable basics #320

Conversation

cechava commented Oct 14, 2021 • edited Loading

codecov bot commented Oct 14, 2021 • edited Loading

Codecov Report

bendichter left a comment

Choose a reason for hiding this comment

cechava commented Oct 14, 2021

bendichter commented Oct 14, 2021 via email

cechava commented Oct 14, 2021 • edited Loading

cechava commented Oct 14, 2021 • edited Loading

cechava commented Oct 14, 2021

bendichter commented Oct 14, 2021

bendichter left a comment

Choose a reason for hiding this comment

cechava commented Dec 6, 2021

bendichter commented Dec 6, 2021 via email

cechava commented Dec 6, 2021

bendichter commented Dec 7, 2021

bendichter commented Dec 7, 2021

cechava commented Dec 7, 2021

cechava commented Dec 7, 2021 • edited Loading

lawrence-mbf commented Dec 7, 2021

cechava commented Dec 7, 2021

cechava commented Dec 7, 2021

cechava commented Jan 13, 2022 • edited Loading

cechava commented Feb 2, 2022

cechava commented Oct 14, 2021 •

edited

Loading

codecov bot commented Oct 14, 2021 •

edited

Loading

cechava commented Oct 14, 2021 •

edited

Loading

cechava commented Oct 14, 2021 •

edited

Loading

cechava commented Dec 7, 2021 •

edited

Loading

cechava commented Jan 13, 2022 •

edited

Loading