Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

H5Easy support for complex integral type #828

Merged
merged 6 commits into from Oct 19, 2023

Conversation

unbtorsten
Copy link
Contributor

@unbtorsten unbtorsten commented Oct 17, 2023

Description

Among other functionality, H5Easy allows reading and writing of vectors of standard types.
However, an assertion prevents use of vector of complex integral numbers for H5Easy::load and H5Easy::dump.

Example:

std::vector<std::complex<int16_t>> ds;
ds = H5Easy::load<std::vector<std::complex<int16_t>>>(const File& file, const std::string& path)

H5Easy::dump(const File& file, const std::string& path, ds)

This pull request modifies the the assertion to allow integral types and enable H5Easy support for complex integral type.

How to test this?
I have tested the modified code in a non-public application for complex<int16_t>. I have not tested other cases at this time, but would be happy to extend HighFive test's where appropriate. I am not familiar with the development of this library at this time. Furthermore, this pull requests extends HighFive's unit tests.

Test System

  • OS: Debian 12
  • Compiler: gcc 12.2.0
  • (Using hdf5 1.10.8)

@1uc
Copy link
Collaborator

1uc commented Oct 18, 2023

Thank you for your PR. What makes you confident that std::complex supports integral types?

Reading cppreference I find:

T the type of the real and imaginary parts. The behavior is unspecified (and may fail to compile) if T is not a cv-unqualified standard (until C++23) floating-point type and undefined if T is not NumericType.

https://en.cppreference.com/w/cpp/numeric/complex
(Edit: incorrect link.)

@unbtorsten
Copy link
Contributor Author

unbtorsten commented Oct 18, 2023

Good point! I am using the complex type in a context where I do not have to worry (and was not thinking) about operations on the integral types. These can indeed yield unexpected behaviour, if not done right, i.e. may have to return floating point types even for an integral input type.

For example, considering the documentation at

it seems as if abs and arg can handle the integers, but polar cannot.

That said, I think that it is up to a programmer to consider the consequences of working with integral types. A compiler may intervene in case of undefined functions or operators. I do not see that the scope of HighFive, i.e. storage and retrieval of data, would be affected by these constraints.

@1uc
Copy link
Collaborator

1uc commented Oct 18, 2023

I agree that HighFive doesn't need have an opinion on whether std::complex<int> is supported.

Next, I think we should find out how widely supported std::complex<int> is by adding unit-tests that a) perform a write-read cycle; b) check that the real and imaginary parts got serialized as the appropriate type.

@unbtorsten
Copy link
Contributor Author

unbtorsten commented Oct 18, 2023

Could a corresponding unit test be derived from the test at https://github.com/BlueBrain/HighFive/blob/master/tests/unit/tests_high_five_easy.cpp#L79 ?

I was unable to run the unit tests as currently available from the master branch using the script https://github.com/BlueBrain/HighFive/blob/master/tests/test_project_integration.sh. What am I missing here? How do I run the unit tests?

@1uc
Copy link
Collaborator

1uc commented Oct 18, 2023

Yes, that seems like a starting point. I usually use cmake and ctest directly:

cmake -B build-test .
cmake --build build-test --parallel && ctest --test-dir build-test  

If you want to turn off Boost (which is On by default) you add -DHIGHFIVE_USE_BOOST=Off to the first line. Depending on your 'luck' (system) you might need to cd build-test and do the ctest there.

You probably also want to add --output-on-failure, --stop-on-failure or --rerun-failed to the ctest command. You can filter what's run with -E <regex>.

@unbtorsten
Copy link
Contributor Author

unbtorsten commented Oct 18, 2023

Pull request now includes tests for complex types. All tests pass:

100% tests passed, 0 tests failed out of 282

Datatypes are verified to be correct using h5py 3.7.0 on Debian 12:

In [1]: hf = h5.File('h5easy_scalar.h5', 'r')

In [2]: hf['path/to/a'].dtype
Out[2]: dtype('<f8')

In [3]: hf['path/to/b'].dtype
Out[3]: dtype('int32')

In [4]: hf['path/to/c'].dtype
Out[4]: dtype('O')

In [5]: hf['path/to/d'].dtype
Out[5]: dtype('complex128')

In [6]: hf['path/to/e'].dtype
Out[6]: dtype([('r', '<i4'), ('i', '<i4')])

In [7]: hf = h5.File('h5easy_vector1d.h5', 'r')

In [8]: hf['path/to/a'].dtype
Out[8]: dtype('uint64')

In [9]: hf['path/to/b'].dtype
Out[9]: dtype('complex128')

In [10]: hf['path/to/c'].dtype
Out[10]: dtype([('r', '<i4'), ('i', '<i4')])

HDF compass 0.7 correctly displays the complex integral quantities as compound type with components 'r' and 'i', albeit it does not display the numbers as complex quantities. vitables 3.0.2 refuses to deal with the datatype. However, this is an issue with the implementation in the respective projects and not related to hdf5 or HighFive.

@codecov
Copy link

codecov bot commented Oct 18, 2023

Codecov Report

All modified lines are covered by tests ✅

Comparison is base (0e8d63b) 84.62% compared to head (cadb79b) 84.68%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #828      +/-   ##
==========================================
+ Coverage   84.62%   84.68%   +0.06%     
==========================================
  Files          68       68              
  Lines        4792     4812      +20     
==========================================
+ Hits         4055     4075      +20     
  Misses        737      737              
Files Coverage Δ
include/highfive/bits/H5DataType_misc.hpp 79.59% <ø> (ø)
tests/unit/tests_high_five_easy.cpp 100.00% <100.00%> (ø)

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@1uc
Copy link
Collaborator

1uc commented Oct 19, 2023

I see you've had your share of compiler fun.

HDF compass 0.7 correctly displays the complex integral quantities as compound type with components 'r' and 'i', albeit it does not display the numbers as complex quantities. vitables 3.0.2 refuses to deal with the datatype.

From what I can tell HDF5 doesn't have a complex datatype (probably because C doesn't have one). Hence, projects need to make their own convention, e.g. via a compound datatype. What HighFive does seems to match what h5py does. When converting to a Numpy array, it uses the existing numpy complex types. It seems numpy only support floating-point complex numbers, at least I can spot integer-based floating point numbers:
https://numpy.org/doc/stable/user/basics.types.html

This probably explains why vitables (which probably uses h5py and numpy) doesn't know what to do with these datatypes.

@unbtorsten given the limitation this faces in Python, does the above still solve your problem?

@1uc
Copy link
Collaborator

1uc commented Oct 19, 2023

In [1]: import h5py
In [2]: import numpy as np

In [3]: with h5py.File("h5easy_scalar.h5", "r") as h5:
   ...:     x = np.array(h5["/path/to/e"])
   ...: 

In [4]: x
Out[4]: array((12345, -54321), dtype=[('r', '<i4'), ('i', '<i4')])

In [5]: x.dtype
Out[5]: dtype([('r', '<i4'), ('i', '<i4')])

In [6]: with h5py.File("h5easy_scalar.h5", "r") as h5:
   ...:     y = np.array(h5["/path/to/d"])
   ...: 

In [7]: y
Out[7]: array(1.2345-5.4321j)

In [8]: y.dtype
Out[8]: dtype('complex128')

@unbtorsten
Copy link
Contributor Author

unbtorsten commented Oct 19, 2023

This probably explains why vitables (which probably uses h5py and numpy) doesn't know what to do with these datatypes.

Indeed. I remember the struggles of early vitables (version 2.x?) before it implemented the complex-convention of a r/i-compound..

given the limitation this faces in Python, does the above still solve your problem?

It does, and it does so reliably. The data storage has been working without issues since I amended HighFive. h5py (v3.7.0) readily reads in the compound as complex128 floating point in case of vectors. For scalars, numbers can be casted without hurdle.

Copy link
Collaborator

@1uc 1uc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the nice PR.

@1uc 1uc merged commit 4d68087 into BlueBrain:master Oct 19, 2023
35 checks passed
@unbtorsten unbtorsten deleted the complex-integral-type branch October 19, 2023 13:29
@unbtorsten
Copy link
Contributor Author

My pleasure and the least I could do. This project has allowed me to integrate hdf5 very efficiently. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants