Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check for overflow when calculating on-disk attribute data size #2459

Merged
merged 3 commits into from Mar 2, 2023

Conversation

e4t
Copy link
Contributor

@e4t e4t commented Feb 11, 2023

A bogus hdf5 file may contain dataspace messages with sizes which lead to the on-disk data sizes to exceed what is addressable. When calculating the size, make sure, the multiplication does not
overflow.
The test case was crafted in a way that the overflow caused the size to be 0.

This fixes CVE-2021-37501 / Bug #2458.

Also remove duplicate code.

e4t added a commit to e4t/hdf5 that referenced this pull request Feb 13, 2023
…roup#2459)

A bogus hdf5 file may contain dataspace messages with sizes
which lead to the on-disk data sizes to exceed what is addressable.
When calculating the size, make sure, the multiplication does not
overflow.
The test case was crafted in a way that the overflow caused the
size to be 0.

This fixes CVE-2021-37501 / Bug HDFGroup#2458.

Signed-off-by: Egbert Eich <eich@suse.com>
@e4t e4t force-pushed the CVE-2021-37501-develop branch 3 times, most recently from 29e4e67 to 0b9d9bf Compare February 13, 2023 11:06
src/H5private.h Outdated
* A macro for checking whether a multiplication has overflown
* r is assumed to be the result of a prior multiplication of a and b
*/
#define H5_CHECK_MUL_OVERFLOW(r, a, b, err) \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this macro be surrounded by "#ifndef NDEBUG" like the rest of the macros?
If not then maybe it needs to be a different kind of macro - do we have a section for runtime NDEBUG type macros?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@byrnHDF

Shouldn't this macro be surrounded by "#ifndef NDEBUG" like the rest of the macros? If not then maybe it needs to be a different kind of macro - do we have a section for runtime NDEBUG type macros?

I don't think so. This is not mean for debugging. It is meant to catch 'bogus' HDF5 files which were not created by HDF5 itself. All the CVEs reported for HDF5 so far had hand crafted reproducers, not ones that have been generated with HDF5 itself.

With debugging enabled, I'm sure a lot of bogus files will be identified, but debugging does a lot more and slows down execution even more.

In some environments where HDF5 is used it may be irrelevant to check for bogus and possible malicious data, but in general one may want to have enough validation in place to make sure invalid files are caught.

Now, the problem at hand seems to be rather benign and I don't believe that this overflow can be used for a serious exploit (the reproducer leads to an invalid heap access on read). Also, I'm unable to foresee any way it may. However, people trying to target certain institutions will spend the effort to look for such ways. The fact that HDF5 has caught the attention of folks researching potential security vulnerabilities - I've counted > 60 CVEs over the past 6 years - may suggest that the 'bad guys' could do so likewise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the macro location, if there is a better place to put it, let me know, I'd be happy to move it there. I've created a macro instead of coding it directly as I assumed it may be useful in other places as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know the answer, I just noticed something not like the others. We will need a library dev to answer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest a fix in place in H5O__attr_decode() of H5Oattr.c as follows:

  • if(ds_size != 0 && ((attr->shared->data_size/ds_size) != dt_size))
  •    HGOTO_ERROR(H5E_RESOURCE, H5E_OVERFLOW, NULL, "data size exceeds addressable range")
    

Note that dt_size cannot be zero, if so, it will flag as error few lines back:

if (0 == (dt_size = H5T_get_size(attr->shared->dt)))
    HGOTO_ERROR(H5E_ATTR, H5E_CANTGET, NULL, "unable to get datatype size")

With the above fix, h5dump output on the corrupted file will be something like this:

GROUP "input_1" {h5dump error: error getting attribute information

}

I also suggest to add a test to verify the fix with the corrupted file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had created a more generic macro in the hope it may be useful in other cases as well. But it can certainly be done explicitly. If we do so and since we already know that dt_sizew != 0, we could take it as the divisor and simply do:

if ((attr->shared->data_size/dt_size) != ds_size)
       HGOTO_ERROR(...)

bmwiedemann pushed a commit to bmwiedemann/openSUSE that referenced this pull request Feb 17, 2023
https://build.opensuse.org/request/show/1066251
by user eeich + dimstar_suse
- Fix CVE-2021-37501 - overflow in calculation of data buffer due to bogus
  input file (bsc#1207973).
  HDFGroup/hdf5#2458
  HDFGroup/hdf5#2459
  Check-for-overflow-when-calculating-on-disk-attribute-data-size-2459.patch
  Remove-duplicate-code.patch (forwarded request 1066178 from eeich)
Signed-off-by: Egbert Eich <eich@suse.com>
e4t added a commit to e4t/hdf5 that referenced this pull request Feb 25, 2023
…roup#2459)

A bogus hdf5 file may contain dataspace messages with sizes
which lead to the on-disk data sizes to exceed what is addressable.
When calculating the size, make sure, the multiplication hasn't
overflown.
The test case was crafted in a way that the overflow caused the
size to be 0.

This fixes CVE-2021-37501 / Bug HDFGroup#2458.

Signed-off-by: Egbert Eich <eich@suse.com>
Co-authored-by: Allen Byrne <byrn@hdfgroup.org>
e4t added a commit to e4t/hdf5 that referenced this pull request Feb 25, 2023
…roup#2459)

A bogus hdf5 file may contain dataspace messages with sizes
which lead to the on-disk data sizes to exceed what is addressable.
When calculating the size, make sure, the multiplication hasn't
overflown.
The test case was crafted in a way that the overflow caused the
size to be 0.

This fixes CVE-2021-37501 / Bug HDFGroup#2458.

Signed-off-by: Egbert Eich <eich@suse.com>
Co-authored-by: Allen Byrne <byrn@hdfgroup.org>
e4t and others added 2 commits February 27, 2023 09:14
Bogus sizes in this test case causes the on-disk data size
calculation in H5O__attr_decode() to overflow so that the
calculated size becomes 0. This causes the read to overflow
and h5dump to segfault.
This test case was crafted, the test file was not directly
generated by HDF5.
Test case from:
https://github.com/ST4RF4LL/Something_Found/blob/main/HDF5_v1.13.0_h5dump_heap_overflow.md

Signed-off-by: Egbert Eich <eich@suse.com>
…roup#2459)

A bogus hdf5 file may contain dataspace messages with sizes
which lead to the on-disk data sizes to exceed what is addressable.
When calculating the size, make sure, the multiplication hasn't
overflown.
The test case was crafted in a way that the overflow caused the
size to be 0.

This fixes CVE-2021-37501 / Bug HDFGroup#2458.

Signed-off-by: Egbert Eich <eich@suse.com>
Co-authored-by: Allen Byrne <byrn@hdfgroup.org>
@lrknox lrknox merged commit b16ec83 into HDFGroup:develop Mar 2, 2023
@e4t
Copy link
Contributor Author

e4t commented Mar 2, 2023

Thanks for merging!
@byrnHDF , @vchoi-hdfgroup , @derobins , @lrknox would you like me to add reproducers for CVEs we had fixed earlier as well? I can give it a go, but would only do so if there is interest.

@lrknox lrknox added merge to 1.12 Merge - To 1.14 This needs to be merged to HDF5 1.14 labels Mar 2, 2023
lrknox pushed a commit to lrknox/hdf5 that referenced this pull request Mar 17, 2023
…roup#2459)

* Remove duplicate code

Signed-off-by: Egbert Eich <eich@suse.com>

* Add test case for CVE-2021-37501

Bogus sizes in this test case causes the on-disk data size
calculation in H5O__attr_decode() to overflow so that the
calculated size becomes 0. This causes the read to overflow
and h5dump to segfault.
This test case was crafted, the test file was not directly
generated by HDF5.
Test case from:
https://github.com/ST4RF4LL/Something_Found/blob/main/HDF5_v1.13.0_h5dump_heap_overflow.md
lrknox pushed a commit to lrknox/hdf5 that referenced this pull request Mar 17, 2023
…roup#2459)

* Remove duplicate code

Signed-off-by: Egbert Eich <eich@suse.com>

* Add test case for CVE-2021-37501

Bogus sizes in this test case causes the on-disk data size
calculation in H5O__attr_decode() to overflow so that the
calculated size becomes 0. This causes the read to overflow
and h5dump to segfault.
This test case was crafted, the test file was not directly
generated by HDF5.
Test case from:
https://github.com/ST4RF4LL/Something_Found/blob/main/HDF5_v1.13.0_h5dump_heap_overflow.md
derobins added a commit that referenced this pull request Mar 18, 2023
* Elaborate how cd_values get stored (#2522)

* Enclose MESG in do...while loop (#2576)

Enclose MSG macro in a do...while loop

* Add a clang-format comment about permissions (#2577)

* Check for overflow when calculating on-disk attribute data size (#2459)

* Remove duplicate code

Signed-off-by: Egbert Eich <eich@suse.com>

* Add test case for CVE-2021-37501

Bogus sizes in this test case causes the on-disk data size
calculation in H5O__attr_decode() to overflow so that the
calculated size becomes 0. This causes the read to overflow
and h5dump to segfault.
This test case was crafted, the test file was not directly
generated by HDF5.
Test case from:
https://github.com/ST4RF4LL/Something_Found/blob/main/HDF5_v1.13.0_h5dump_heap_overflow.md

---------

Co-authored-by: Mark (he/his) C. Miller <miller86@llnl.gov>
Co-authored-by: glennsong09 <43005495+glennsong09@users.noreply.github.com>
Co-authored-by: Dana Robinson <43805+derobins@users.noreply.github.com>
Co-authored-by: Egbert Eich <eich@suse.com>
lrknox pushed a commit to lrknox/hdf5 that referenced this pull request Mar 19, 2023
…roup#2459)

* Remove duplicate code

Signed-off-by: Egbert Eich <eich@suse.com>

* Add test case for CVE-2021-37501

Bogus sizes in this test case causes the on-disk data size
calculation in H5O__attr_decode() to overflow so that the
calculated size becomes 0. This causes the read to overflow
and h5dump to segfault.
This test case was crafted, the test file was not directly
generated by HDF5.
Test case from:
https://github.com/ST4RF4LL/Something_Found/blob/main/HDF5_v1.13.0_h5dump_heap_overflow.md
@Mingli-Yu
Copy link

Does the change will merge to 1.8 branch or does the hdf5 1.8.21 have the issue? Thanks!

lrknox added a commit that referenced this pull request Apr 4, 2023
* Enclose MESG in do...while loop (#2576)

Enclose MSG macro in a do...while loop

* Add a clang-format comment about permissions (#2577)

* Remove an obsolete comment from the MDS test (#2578)

The seed is now broadcast from rank 0, so the warning about multiple
machines having different seeds is unnecessary.

* Subfiling VFD - fix issues with I/O concentrator selection strategies (#2571)

Fix multiple bugs with the SELECT_IOC_EVERY_NTH_RANK and
SELECT_IOC_TOTAL I/O concentrator selection strategies and add a
regression test for them

* Check for overflow when calculating on-disk attribute data size (#2459)

* Remove duplicate code

Signed-off-by: Egbert Eich <eich@suse.com>

* Add test case for CVE-2021-37501

Bogus sizes in this test case causes the on-disk data size
calculation in H5O__attr_decode() to overflow so that the
calculated size becomes 0. This causes the read to overflow
and h5dump to segfault.
This test case was crafted, the test file was not directly
generated by HDF5.
Test case from:
https://github.com/ST4RF4LL/Something_Found/blob/main/HDF5_v1.13.0_h5dump_heap_overflow.md

---------

Co-authored-by: glennsong09 <43005495+glennsong09@users.noreply.github.com>
Co-authored-by: Dana Robinson <43805+derobins@users.noreply.github.com>
Co-authored-by: jhendersonHDF <jhenderson@hdfgroup.org>
Co-authored-by: Egbert Eich <eich@suse.com>
brtnfld pushed a commit to brtnfld/hdf5 that referenced this pull request May 17, 2023
…roup#2459)

* Remove duplicate code

Signed-off-by: Egbert Eich <eich@suse.com>

* Add test case for CVE-2021-37501

Bogus sizes in this test case causes the on-disk data size
calculation in H5O__attr_decode() to overflow so that the
calculated size becomes 0. This causes the read to overflow
and h5dump to segfault.
This test case was crafted, the test file was not directly
generated by HDF5.
Test case from:
https://github.com/ST4RF4LL/Something_Found/blob/main/HDF5_v1.13.0_h5dump_heap_overflow.md
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Merge - To 1.14 This needs to be merged to HDF5 1.14
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet

6 participants