Single precision ParaView output #658

Thomas-Ulrich · 2022-08-26T14:51:00Z

In #649, @sebwolf-de improved checkpointing for single precision by allowing the checkpointed data to be written in either double or float precision (introducing HDF_C_REAL in the hdf5 writer).
We could apply the same recipe and write ParaView output in single or double precision.
This would, however, require some changes in the xdmfwriter.
(Overall, I would always write output in single precision to save storage.)

Thomas-Ulrich · 2023-06-20T12:49:16Z

I double-checked: the ParaView output is written in the datatype used in the simulation.
That is, single precision simulations are written in float, and double precision simulations in double.
(This means that writing the dataset in float when computing in double would only require a cast and a change to the type of the xdmfwriter template.)

But for datasets with a limited number of cells, e.g. fault or surface output, the output size will not be decreased by a factor of 2 in single precision, because of the large alignment and block size used.
On NG, we have a disc block size of:

di73yeq4@login03:/hppfs/work/pr45fi/di73yeq4/Examples/tpv12_13> stat -fc %s .
16777216

which might have motivated:

export XDMFWRITER_ALIGNMENT=8388608
export XDMFWRITER_BLOCK_SIZE=8388608

Looking at the surface output of the latest Turkey simulation (double precision), we can see that the block size used leads to a data file about 1.3 times larger than expected:

      <DataItem NumberType="UInt" Precision="4" Format="XML" Dimensions="3 2">0 0 1 1 1 2466245</DataItem>
      <DataItem NumberType="Float" Precision="8" Format="Binary" Dimensions="1 3145728">Turkey_ext4_o6_el_ev1-surface_cell/mesh0/v1.bin</DataItem>

(3145728 * 8 = 3 * XDMFWRITER_BLOCK_SIZE, where 3 = ceil(2466245 * 8 / XDMFWRITER_BLOCK_SIZE).)

If we were writing in single precision, the dataset would be written in n = ceil(2466245 * 4 / 8388608) = 2 blocks, that is, the output would still take about 67% of the size of the double output (and not 50%). This shows the limits of writing the output in float with large blocks, and explains the potential gain of rewriting the output sequentially.
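
The arithmetic above can be checked with a short sketch. It assumes, as the Dimensions entry in the XDMF file suggests, that each dataset is padded to a whole number of blocks. `padded_bytes` is a hypothetical helper written for this comment, not part of the xdmfwriter.

```python
BLOCK_SIZE = 8388608  # XDMFWRITER_BLOCK_SIZE from the export above
N_CELLS = 2466245     # cell count from the XDMF header of the Turkey surface output


def padded_bytes(n_values, bytes_per_value, block_size=BLOCK_SIZE):
    """Size on disk, assuming the dataset is padded to a whole number of blocks."""
    raw = n_values * bytes_per_value
    n_blocks = -(-raw // block_size)  # integer ceiling division
    return n_blocks * block_size


raw_double = N_CELLS * 8
double_on_disk = padded_bytes(N_CELLS, 8)
single_on_disk = padded_bytes(N_CELLS, 4)

print("double: blocks =", double_on_disk // BLOCK_SIZE,
      "padded/raw =", round(double_on_disk / raw_double, 2))
print("single: blocks =", single_on_disk // BLOCK_SIZE,
      "single/double =", round(single_on_disk / double_on_disk, 2))
```

With these numbers the double output occupies 3 blocks and the single output 2 blocks, so the ratio is 2/3 rather than 1/2.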

Note that on Frontera the disc block size is much smaller:

ulrich@login2:/scratch1/09160/ulrich$ stat -fc %s .
4096

I wonder if anyone has ever tried decreasing XDMFWRITER_BLOCK_SIZE on this machine.
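
As a rough way to see what a smaller block size would change, the sketch below reads the same value that `stat -fc %s` prints (available on Unix as `f_bsize` from `os.statvfs`) and computes the padding of the double precision surface dataset for a given block size. It again assumes padding to a whole number of blocks, and it says nothing about write performance, which would have to be measured. `filesystem_block_size` and `padding_bytes` are hypothetical helper names.

```python
import os


def filesystem_block_size(path="."):
    """Block size reported by the filesystem (same value as `stat -fc %s`). Unix only."""
    return os.statvfs(path).f_bsize


def padding_bytes(n_bytes, block_size):
    """Bytes added when n_bytes is rounded up to a multiple of block_size (assumption)."""
    return -n_bytes % block_size


surface_bytes = 2466245 * 8  # Turkey surface output in double precision

for block_size in (8388608, 4096, filesystem_block_size(".")):
    pad = padding_bytes(surface_bytes, block_size)
    print(f"block size {block_size}: padding {pad} bytes "
          f"({100 * pad / surface_bytes:.3f}% of the raw data)")
```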

sebwolf-de · 2023-06-20T14:18:49Z

I think nobody has tried to tweak XDMFWRITER_BLOCK_SIZE. The 8388608 is just a magic number (although motivated) that everybody copies around.
Maybe we can write some documentation on how to choose the optimal block size.

krenzland · 2023-06-22T07:10:36Z

Note that on Frontera the disc block size is much smaller:
ulrich@login2:/scratch1/09160/ulrich$ stat -fc %s .
4096

I think this is because Frontera uses a different file system:
https://frontera-portal.tacc.utexas.edu/user-guide/files/#striping-large-files

Not sure how to tune the writers for this one.
