Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

writing an armadillo matrix produces a transposed matrix in h5 #76

Open
scontini76 opened this issue Jan 10, 2022 · 2 comments
Open

writing an armadillo matrix produces a transposed matrix in h5 #76

scontini76 opened this issue Jan 10, 2022 · 2 comments

Comments

@scontini76
Copy link

Consider this code:

int main()
{
    arma::mat M = arma::linspace(1, 10, 10);
    M.reshape(2, 5);

    std::cout << "my M shape is:"
              << "\n";

    M.brief_print();
    auto h = h5::create("test.h5", H5F_ACC_TRUNC);
    h5::write(h, "/test", M);

    std::cout << "reading back..."
              << "\n";

    M = h5::read<arma::mat>("test.h5", "/test");
    std::cout << "my M shape is:"
              << "\n";

    M.brief_print();

    return 0;
}

That give this output:

my M shape is:
[matrix size: 2x5]
    1.0000    3.0000    5.0000    7.0000    9.0000
    2.0000    4.0000    6.0000    8.0000   10.0000
reading back...
my M shape is:
[matrix size: 5x2]
    1.0000    6.0000
    2.0000    7.0000
    3.0000    8.0000
    4.0000    9.0000
    5.0000   10.0000

That looks wrong in h5dump:

HDF5 "test.h5" {
GROUP "/" {
   DATASET "test" {
      DATATYPE  H5T_IEEE_F64LE
      DATASPACE  SIMPLE { ( 5, 2 ) / ( 5, 2 ) }
      DATA {
      (0,0): 1, 2,
      (1,0): 3, 4,
      (2,0): 5, 6,
      (3,0): 7, 8,
      (4,0): 9, 10
      }
   }
}
}

What I expected is:

my M shape is:
[matrix size: 2x5]
    1.0000    3.0000    5.0000    7.0000    9.0000
    2.0000    4.0000    6.0000    8.0000   10.0000
reading back...
my M shape is:
[matrix size: 2x5]
    1.0000    3.0000    5.0000    7.0000    9.0000
    2.0000    4.0000    6.0000    8.0000   10.0000

Which is fine in h5dump too:

HDF5 "test.h5" {
GROUP "/" {
   DATASET "test" {
      DATATYPE  H5T_IEEE_F64LE
      DATASPACE  SIMPLE { ( 2, 5 ) / ( 2, 5 ) }
      DATA {
      (0,0): 1, 2, 3, 4, 5,
      (1,0): 6, 7, 8, 9, 10
      }
   }
}
}

To have what I expected I must revert the commit "arma matrix rows/cols swapped to correct C/Fortran order mismatch" (be0a191)

Can you confirm that something is wrong in the library or I'm doing something stupid? :)

@steven-varga
Copy link
Owner

Great find! Hmmm there is more to this, get to it in a sec... The top one shouldn't be happening: write - read == 0 whereas the h5dump case is not quite straightforward.

The first case is a clear bug. Thanks for letting me know! The second one is complicated: Linear algebra libraries mark the layout with transpose flag but HDF5 doesn't have this feature, leading to:

  1. give up zero copy (performance) and copy each dataset from col major to row major. That is lot's of copying! Considering that the majority of scientific platforms are colmajor. In cases where not enough memory is left then it will fail
  2. as a feature request: convince the HDFGroup to add a flag marking transpose

Will take a closer look when I can, until I am adding @gheber to the conversation.
best:
steve

@scontini76
Copy link
Author

Ok, let me know if you need some tests from me.
Then I wish HDFGroup consider to add a transpose flag!
In the meanwhile I'll transpose the data before writing them, or I'll do the trick of reverting be0a191

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants