Wrap a subset of the libhdf5 table interface methods #777

musm · 2020-12-10T18:58:31Z

Methods from #511

musm · 2020-12-10T23:01:40Z

src/api_helpers.jl

+
+function h5tb_get_field_info(loc_id, table_name)
+    nfields, = h5tb_get_table_info(loc_id, table_name)
+    field_names = Vector{UInt8}[fill(0x00,255) for i in 1:2]


I don't know of any way to determine the required length of these buffers.
The example at https://confluence.hdfgroup.org/display/HDF5/H5TB_GET_FIELD_INFO just allocates 255-byte buffers.

musm · 2020-12-10T23:43:16Z

src/api_helpers.jl

+    field_sizes = Vector{Csize_t}(undef,nfields)
+    field_offsets = Vector{Csize_t}(undef,nfields)
+    type_size = Ref{Csize_t}()
+    while true


To work around the problem of determining the string buffer size, I arbitrarily set it to 64 initially and then keep doubling it until the string fits (detected by the presence of a null byte).

Unfortunately, I don't think this is valid or safe. libhdf5 just does a strcpy (https://bitbucket.hdfgroup.org/projects/HDFFV/repos/hdf5/browse/hl/src/H5TB.c?at=refs%2Fheads%2Fhdf5_1_10#3092), so a long name overflows the buffer.

Right, which is why we check whether there is a 0x00 in the buffer, which would indicate that the name was copied correctly. If the name didn't fit, it would overflow the buffer and no null byte would be detected, in which case we increase the buffer size.

In my local testing with a long field name, this strategy worked.

But overflowing means you're overwriting other bytes in memory; this is a classic buffer overflow attack vector.
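
A small illustration of the point above, offered as a hedged sketch and not as anything from the PR: the pure-Julia snippet below simulates an unbounded, strcpy-style copy inside one ordinary array, so nothing is actually corrupted. The 8-byte "name buffer", the neighboring bytes, and the helper simulated_strcpy! are all hypothetical. It shows that by the time the null-byte check reports that the name did not fit, the bytes next to the buffer have already been overwritten.

```julia
# Simulated unbounded copy, modeled on C's strcpy: writes every byte of `src`
# plus a terminating NUL starting at `dest_start`, without checking any capacity.
# (Hypothetical helper; it only ever writes inside `memory`.)
function simulated_strcpy!(memory::Vector{UInt8}, dest_start::Int, src::String)
    bytes = codeunits(src)
    for (i, b) in enumerate(bytes)
        memory[dest_start + i - 1] = b
    end
    memory[dest_start + length(bytes)] = 0x00
    return memory
end

# One contiguous block standing in for process memory:
# bytes 1-8 are the caller's zero-filled name buffer,
# bytes 9-16 are unrelated neighboring data.
memory = vcat(fill(0x00, 8), fill(0xaa, 8))
neighbor_before = memory[9:16]

# 12 bytes plus a NUL are copied into a buffer that holds only 8.
simulated_strcpy!(memory, 1, "a_long_field")

# The "did it fit?" check from the thread: look for a NUL inside the buffer.
name_fits = 0x00 in view(memory, 1:8)
neighbor_clobbered = memory[9:16] != neighbor_before

println("fits = ", name_fits, ", neighbor clobbered = ", neighbor_clobbered)
# prints: fits = false, neighbor clobbered = true
```

In a real call into libhdf5 the neighboring bytes belong to whatever happens to live next to the buffer, so retrying with a larger buffer cannot undo the damage.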

It appears this is just a wrapper over the datatype querying functions, so unfortunately the safe thing to do might be to pass a C_NULL pointer for the names (the function implementation shows it should then skip filling them in) and do the loop of h5t_get_member_name over the table dataset's datatype ourselves.

Hmm, good point. I don't know what we can do other than perhaps loop through and call H5Tget_member_name to determine the required buffer size manually. At which point we're just reimplementing this function?

Heck, even their example is susceptible to overflow:
https://confluence.hdfgroup.org/display/HDF5/H5TB_GET_FIELD_INFO

Yeah, I'm kind of disappointed with both the documentation and implementation of this function :-\

OK, I updated this PR to avoid the buffer overflow by computing the column names via h5t_get_member_name.
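
The approach settled on here could look roughly like the sketch below. It is not the code merged in this PR. It assumes a recent HDF5.jl in which the low-level wrappers live in the HDF5.API submodule (older versions may expose them directly in the HDF5 module), and that h5t_get_member_name takes a 0-based member index and returns a Julia String. The helper names compound_member_names and table_field_names are hypothetical.

```julia
using HDF5

# Collect the member (column) names of a compound datatype one at a time.
# Assumption: HDF5.API.h5t_get_member_name returns an owned Julia String
# for a 0-based member index, so no caller-sized buffer is involved.
function compound_member_names(type_id)
    nmembers = HDF5.API.h5t_get_nmembers(type_id)
    return [HDF5.API.h5t_get_member_name(type_id, i - 1) for i in 1:nmembers]
end

# Hypothetical helper: column names of the table dataset `table_name`
# located under `loc_id`, read from the dataset's own datatype.
function table_field_names(loc_id, table_name)
    dset_id = HDF5.API.h5d_open(loc_id, table_name, HDF5.API.H5P_DEFAULT)
    try
        type_id = HDF5.API.h5d_get_type(dset_id)
        try
            return compound_member_names(type_id)
        finally
            HDF5.API.h5t_close(type_id)   # release the datatype handle
        end
    finally
        HDF5.API.h5d_close(dset_id)       # release the dataset handle
    end
end
```

The field sizes and offsets can still come from the libhdf5 call with C_NULL passed for the names, as suggested earlier in the thread; only the names are gathered this way.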

musm · 2020-12-10T23:44:05Z

This is now ready for review.

musm · 2020-12-13T17:48:59Z

Sans objections I plan to merge soon

musm added 2 commits December 10, 2020 13:57

Wrap several table interface functions

10aa29c

High level wrappers and tests

bcda026

musm force-pushed the tableint branch from 8f16943 to bcda026 on December 10, 2020 22:55

musm mentioned this pull request Dec 10, 2020

wrap a subset of H5TB #511

Closed

musm added 3 commits December 10, 2020 18:21

Handle column names through resizing

2dc4327

Finish tests

c314d42

Fix up x86 test because the HDF5 docs are lying to us

b439d67

Manually compute column names, in order to avoid libhdf5 triggering a buffer overflow

a2b4e56

musm changed the title from "Wrap several table interface functions" to "Wrap a subset of the libhdf5 table interface methods" on Dec 11, 2020

musm added 2 commits December 11, 2020 17:05

Update api_helpers.jl

b3b3573

Update api_helpers.jl

128980d

jmert reviewed Dec 14, 2020

Update src/api_helpers.jl

e38548a

Co-authored-by: jmert <2965436+jmert@users.noreply.github.com>

musm merged commit bbac593 into JuliaIO:master Dec 14, 2020

musm deleted the tableint branch December 14, 2020 15:49
