DataOut hdf5 does not support large files #13345

Closed
tjhei opened this issue Feb 7, 2022 · 6 comments

tjhei commented Feb 7, 2022

It looks like the hdf5 output code is not careful about using 64-bit integers for global data like the number of cells:

// Compute the global total number of nodes/cells and determine the offset of
// the data for this process
unsigned int global_node_cell_count[2]   = {0, 0};
unsigned int global_node_cell_offsets[2] = {0, 0};
# ifdef DEAL_II_WITH_MPI
ierr = MPI_Allreduce(local_node_cell_count,
                     global_node_cell_count,
                     2,
                     MPI_UNSIGNED,
                     MPI_SUM,
                     comm);
AssertThrowMPI(ierr);
ierr = MPI_Exscan(local_node_cell_count,
                  global_node_cell_offsets,
                  2,
                  MPI_UNSIGNED,
                  MPI_SUM,
                  comm);
AssertThrowMPI(ierr);

The fix should be pretty easy, as the HDF5 types like hsize_t are 64 bit, but this needs testing.
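
A minimal sketch of what the widened counters could look like (not the actual patch; it reuses ierr, comm, local_node_cell_count and AssertThrowMPI from the snippet above and assumes local_node_cell_count has itself been changed to std::uint64_t[2]):

// Sketch only: widen the counters to 64 bit (std::uint64_t from <cstdint>)
// so that meshes with more than 2^32-1 nodes or cells do not overflow.
// Assumes local_node_cell_count is also declared as std::uint64_t[2].
std::uint64_t global_node_cell_count[2]   = {0, 0};
std::uint64_t global_node_cell_offsets[2] = {0, 0};
# ifdef DEAL_II_WITH_MPI
ierr = MPI_Allreduce(local_node_cell_count,
                     global_node_cell_count,
                     2,
                     MPI_UINT64_T,
                     MPI_SUM,
                     comm);
AssertThrowMPI(ierr);
ierr = MPI_Exscan(local_node_cell_count,
                  global_node_cell_offsets,
                  2,
                  MPI_UINT64_T,
                  MPI_SUM,
                  comm);
AssertThrowMPI(ierr);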

bangerth commented Feb 8, 2022

In this one place, one could also quite easily replace the MPI_Allreduce by an MPI_Iallreduce so that the two operations can run in parallel.
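
A sketch of what that overlap could look like, assuming the 64-bit counters from above (the Exscan then runs while the nonblocking Allreduce is in flight, and MPI_Wait guarantees the result is available before it is used):

MPI_Request request;
ierr = MPI_Iallreduce(local_node_cell_count,
                      global_node_cell_count,
                      2,
                      MPI_UINT64_T,
                      MPI_SUM,
                      comm,
                      &request);
AssertThrowMPI(ierr);
// The Exscan proceeds while the Iallreduce is still in progress.
ierr = MPI_Exscan(local_node_cell_count,
                  global_node_cell_offsets,
                  2,
                  MPI_UINT64_T,
                  MPI_SUM,
                  comm);
AssertThrowMPI(ierr);
// Wait for the Allreduce before global_node_cell_count is read.
ierr = MPI_Wait(&request, MPI_STATUS_IGNORE);
AssertThrowMPI(ierr);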

tjhei commented Feb 8, 2022

so that the two operations can run in parallel.

Not sure it is worth it to maybe save 1e-5 s, considering we are doing something that is likely much, much slower right after. I would like to get it working correctly first. :-)

@singima I will let you work on this issue next...

bangerth commented Feb 8, 2022

Absolutely, get it working right first :-)

tjhei commented May 10, 2022

#13626

tjhei commented May 10, 2022

@zjiaqi2018 and @singima together tested the output and got a ~ 200 GB .h5 file:

[jiaqi2@node0029 bin]$ ./h5dump -H  /scratch1/jiaqi2/step-40_edit/solution00006.h5
HDF5 "/scratch1/jiaqi2/step-40_edit/solution00006.h5" {
GROUP "/" {
   DATASET "cells" {
      DATATYPE  H5T_STD_U32LE
      DATASPACE  SIMPLE { ( 4290361600, 4 ) / ( 4290361600, 4 ) }
   }
   DATASET "nodes" {
      DATATYPE  H5T_IEEE_F64LE
      DATASPACE  SIMPLE { ( 4321061764, 2 ) / ( 4321061764, 2 ) }
   }
   DATASET "subdomain" {
      DATATYPE  H5T_IEEE_F64LE
      DATASPACE  SIMPLE { ( 4321061764, 1 ) / ( 4321061764, 1 ) }
   }
   DATASET "u" {
      DATATYPE  H5T_IEEE_F64LE
      DATASPACE  SIMPLE { ( 4321061764, 1 ) / ( 4321061764, 1 ) }
   }
}
}

with a correct .xdmf:

<?xml version="1.0" ?>
<!DOCTYPE Xdmf SYSTEM "Xdmf.dtd" []>
<Xdmf Version="2.0">
  <Domain>
    <Grid Name="CellTime" GridType="Collection" CollectionType="Temporal">
      <Grid Name="mesh" GridType="Uniform">
        <Time Value="0"/>
        <Geometry GeometryType="XY">
          <DataItem Dimensions="4321061764 2" NumberType="Float" Precision="8" Format="HDF">
            solution00006.h5:/nodes
          </DataItem>
        </Geometry>
        <Topology TopologyType="Quadrilateral" NumberOfElements="4290361600">
          <DataItem Dimensions="4290361600 4" NumberType="UInt" Format="HDF">
            solution00006.h5:/cells
          </DataItem>
        </Topology>
        <Attribute Name="subdomain" AttributeType="Scalar" Center="Node">
          <DataItem Dimensions="4321061764 1" NumberType="Float" Precision="8" Format="HDF">
            solution00006.h5:/subdomain
          </DataItem>
        </Attribute>
        <Attribute Name="u" AttributeType="Scalar" Center="Node">
          <DataItem Dimensions="4321061764 1" NumberType="Float" Precision="8" Format="HDF">
            solution00006.h5:/u
          </DataItem>
        </Attribute>
      </Grid>
    </Grid>
  </Domain>
</Xdmf>
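
For reference, 4321061764 is larger than 4294967295 (the largest 32-bit unsigned value), so these extents could not have been represented with the old counters. A minimal sketch of how one could double-check the extents programmatically with the HDF5 C API, using the file path from the h5dump call above:

#include <hdf5.h>
#include <cstdio>

int main()
{
  // Open the file and the "/nodes" dataset read-only.
  const hid_t file  = H5Fopen("/scratch1/jiaqi2/step-40_edit/solution00006.h5",
                              H5F_ACC_RDONLY,
                              H5P_DEFAULT);
  const hid_t dset  = H5Dopen2(file, "/nodes", H5P_DEFAULT);
  const hid_t space = H5Dget_space(dset);

  // hsize_t is 64 bit, so extents beyond 2^32-1 are reported correctly.
  hsize_t dims[2] = {0, 0};
  H5Sget_simple_extent_dims(space, dims, nullptr);
  std::printf("nodes: %llu x %llu\n",
              static_cast<unsigned long long>(dims[0]),
              static_cast<unsigned long long>(dims[1]));

  H5Sclose(space);
  H5Dclose(dset);
  H5Fclose(file);
  return 0;
}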

tjhei commented May 10, 2022

This now works correctly, thanks guys.
