add compression to netcdf decomp files #1407

Open
edhartnett opened this issue May 4, 2019 · 6 comments

@edhartnett
Collaborator

In my performance test I am seeing a decomp file of >1GB. Wow, that's big.

I can turn on compression so that when a netCDF-4 file is used for a netCDF decomp file, it is automatically compressed.

@jedwards4b
Contributor

You can do that, but it has a huge effect on performance and is only available with netCDF-4 in serial.

@edhartnett
Collaborator Author

I mean compress the decomp file itself. The decomp file would have to be netCDF-4, but decomp files are always written sequentially. I'm not sure it would have a noticeable effect on performance, but I can test that.
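
As a rough way to gauge what compression might buy, here is a minimal sketch using only the Python standard library. It does not touch netCDF or PIO. It relies on the fact that netCDF-4 per-variable compression is zlib deflate, optionally preceded by a byte-shuffle filter, and imitates both steps by hand. The map contents (100,000 consecutive 32-bit indices) and the helper names `shuffle_bytes` and `unshuffle_bytes` are assumptions made for illustration; a real decomp map may compress quite differently.

```python
import struct
import zlib


def shuffle_bytes(raw: bytes, width: int = 4) -> bytes:
    """Group byte 0 of every element, then byte 1, and so on."""
    return b"".join(raw[i::width] for i in range(width))


def unshuffle_bytes(shuffled: bytes, width: int = 4) -> bytes:
    """Invert shuffle_bytes by re-interleaving the byte planes."""
    n = len(shuffled) // width
    planes = [shuffled[i * n:(i + 1) * n] for i in range(width)]
    return bytes(b for group in zip(*planes) for b in group)


# Hypothetical map: 100,000 consecutive little-endian 32-bit indices.
n_elements = 100_000
raw = struct.pack(f"<{n_elements}i", *range(n_elements))

plain = zlib.compress(raw)                    # deflate only
shuffled = zlib.compress(shuffle_bytes(raw))  # shuffle, then deflate

# The shuffle step is lossless: decompressing and unshuffling restores the map.
assert unshuffle_bytes(zlib.decompress(shuffled)) == raw

print("raw bytes:            ", len(raw))
print("deflate only:         ", len(plain))
print("shuffle, then deflate:", len(shuffled))
```

For this highly regular input the shuffled variant should come out much smaller than deflate alone, because the high-order bytes of neighbouring indices are nearly identical. It says nothing about write-time cost, which still needs a real test.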

@wkliao
Contributor

wkliao commented May 6, 2019

FYI. I wrote a C program to convert decomp files to a classic NetCDF file.
dat2nc.c
See "Prepare the data decomposition file in NetCDF file format" in README for a short description.

@edhartnett
Collaborator Author

@wkliao the format that the PIO library now outputs is also understood by the PIO library. So if you can convert to that format, you will no longer need your own conversion format.

Also I would like to see an ncdump -h of your file...

@wkliao
Contributor

wkliao commented May 6, 2019

The purpose of my dat2nc.c is also to reduce the size of decomp files.
This program reads .dat files and coalesces the individual array elements into
offset-length pairs. An example of the nc file header is shown in README.
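
The coalescing idea can be sketched in a few lines of Python. This is not the code in dat2nc.c; the function name `coalesce` and the sample indices are hypothetical, and the sketch assumes the indices for one process arrive in the order they should be stored.

```python
def coalesce(indices):
    """Merge runs of consecutive indices into parallel offset and length lists."""
    offsets, lengths = [], []
    for idx in indices:
        # Extend the current run if idx directly follows its last element.
        if offsets and idx == offsets[-1] + lengths[-1]:
            lengths[-1] += 1
        else:
            offsets.append(idx)
            lengths.append(1)
    return offsets, lengths


# Seven array elements collapse into three offset-length pairs.
offsets, lengths = coalesce([0, 1, 2, 3, 10, 11, 20])
print(offsets, lengths)  # [0, 10, 20] [4, 2, 1]
```

The saving depends entirely on how contiguous each process's elements are: a fully contiguous map shrinks to one pair, while a map with no two neighbouring elements doubles in size.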

@edhartnett
Collaborator Author

OK, maybe I should convert to this format. ;-)

Here's what I see in your README:

% ncmpidump -h f_case_866x72_16p.nc
  netcdf f_case_866x72_16p {
  // file format: CDF-1
  dimensions:
      num_decomp = 3 ;
      decomp_nprocs = 16 ;
      D1.total_nreqs = 47 ;
      D2.total_nreqs = 407 ;
      D3.total_nreqs = 29304 ;
  variables:
      int D1.nreqs(decomp_nprocs) ;
          D1.nreqs:description = "Number of noncontiguous requests per process" ;
      int D1.offsets(D1.total_nreqs) ;
          D1.offsets:description = "Flattened starting indices of noncontiguous requests" ;
      int D1.lengths(D1.total_nreqs) ;
          D1.lengths:description = "Lengths of noncontiguous requests" ;
      int D2.nreqs(decomp_nprocs) ;
          D2.nreqs:description = "Number of noncontiguous requests per process" ;
      int D2.offsets(D2.total_nreqs) ;
          D2.offsets:description = "Flattened starting indices of noncontiguous requests" ;
      int D2.lengths(D2.total_nreqs) ;
          D2.lengths:description = "Lengths of noncontiguous requests" ;
      int D3.nreqs(decomp_nprocs) ;
          D3.nreqs:description = "Number of noncontiguous requests per process" ;
      int D3.offsets(D3.total_nreqs) ;
          D3.offsets:description = "Flattened starting indices of noncontiguous requests" ;
      int D3.lengths(D3.total_nreqs) ;
          D3.lengths:description = "Lengths of noncontiguous requests" ;

  // global attributes:
      :command_line = "./dat2nc -o f_case_866x72_16p.nc -1 datasets/piodecomp16tasks16io01dims_ioid_514.dat -2 datasets/piodecomp16tasks16io01dims_ioid_516.dat -3 datasets/piodecomp16tasks16io02dims_ioid_548.dat " ;
      :D1.ndims = 1 ;
      :D1.dims = 866 ;
      :D1.max_nreqs = 4 ;
      :D1.min_nreqs = 2 ;
      :D2.ndims = 1 ;
      :D2.dims = 866 ;
      :D2.max_nreqs = 39 ;
      :D2.min_nreqs = 13 ;
      :D3.ndims = 2 ;
      :D3.dims = 72, 866 ;
      :D3.max_nreqs = 2808 ;
      :D3.min_nreqs = 936 ;
  }
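
Reading the variable descriptions in that header, one plausible way to get back each process's element list is sketched below in Python. It assumes the pairs in `offsets` and `lengths` are stored process by process, so that a running sum of `nreqs` marks where each process's slice begins; that layout is my inference from the descriptions, not something checked against dat2nc.c. The function names and sample values are hypothetical.

```python
from itertools import accumulate


def split_requests(nreqs, offsets, lengths):
    """Return, for each process, its list of (offset, length) pairs."""
    ends = list(accumulate(nreqs))
    starts = [0] + ends[:-1]
    pairs = list(zip(offsets, lengths))
    return [pairs[s:e] for s, e in zip(starts, ends)]


def expand(pairs):
    """Turn (offset, length) pairs back into individual element indices."""
    return [off + k for off, length in pairs for k in range(length)]


# Hypothetical decomposition over 3 processes with 6 requests in total.
nreqs = [2, 1, 3]
offsets = [0, 8, 4, 12, 20, 30]
lengths = [4, 2, 4, 1, 2, 3]

# Mirrors the header: the total_nreqs dimension is the sum of nreqs.
assert sum(nreqs) == len(offsets) == len(lengths)

per_proc = split_requests(nreqs, offsets, lengths)
print(expand(per_proc[0]))  # [0, 1, 2, 3, 8, 9]
```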

For comparison, here's what I have now (but so far unused by anyone):


netcdf darray_no_async_decomp {
dimensions:
	dims = 2 ;
	task = 16 ;
	map_element = 4 ;
variables:
	int global_size(dims) ;
	int maplen(task) ;
	int map(task, map_element) ;

// global attributes:
		:PIO_library_version = "2.4.2" ;
		:max_maplen = 4 ;
		:title = "Example Decomposition from darray_no_async.c" ;
		:history = "This file is created by the program darray_no_async in the PIO C library" ;
		:source = "Decomposition file produced by PIO library." ;
		:array_order = "C" ;
		:backtrace = "/home/ed/tmp/ParallelIO/src/clib/.libs/libpio.so.1(pioc_write_nc_decomp_int+0x57d) [0x7f7d89452806]\n",
			"/home/ed/tmp/ParallelIO/src/clib/.libs/libpio.so.1(PIOc_write_nc_decomp+0x4a1) [0x7f7d89451e9f]\n",
			"/home/ed/tmp/ParallelIO/examples/c/.libs/darray_no_async(+0x1a81) [0x55a2eb54ea81]\n",
			"/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7) [0x7f7d88bb8b97]\n",
			"/home/ed/tmp/ParallelIO/examples/c/.libs/darray_no_async(+0x100a) [0x55a2eb54e00a]\n",
			"" ;
data:

 global_size = 8, 8 ;

 maplen = 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4 ;

 map =
  0, 1, 2, 3,
  4, 5, 6, 7,
  8, 9, 10, 11,
  12, 13, 14, 15,
  16, 17, 18, 19,
  20, 21, 22, 23,
  24, 25, 26, 27,
  28, 29, 30, 31,
  32, 33, 34, 35,
  36, 37, 38, 39,
  40, 41, 42, 43,
  44, 45, 46, 47,
  48, 49, 50, 51,
  52, 53, 54, 55,
  56, 57, 58, 59,
  60, 61, 62, 63 ;
}
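
To make the layout concrete, here is a small Python sketch that rebuilds the `maplen` and `map` values shown in this dump from `global_size` and the task count. It assumes an even block split, which is what this particular example uses; the function name `block_decomp` is hypothetical and nothing here calls the PIO library.

```python
def block_decomp(global_size, ntasks):
    """Split a flattened array evenly across tasks, one contiguous block each."""
    total = 1
    for dim in global_size:
        total *= dim
    if total % ntasks:
        raise ValueError("this sketch only handles an even split")
    per_task = total // ntasks
    maplen = [per_task] * ntasks
    map_rows = [list(range(t * per_task, (t + 1) * per_task))
                for t in range(ntasks)]
    return maplen, map_rows


# Same shape as the dump: an 8 x 8 array over 16 tasks, 4 elements per task.
maplen, map_rows = block_decomp([8, 8], 16)
print(maplen[0], map_rows[0], map_rows[15])  # 4 [0, 1, 2, 3] [60, 61, 62, 63]
```

Each row of `map` lists every element explicitly, so the variable grows with the array size rather than with the number of contiguous runs.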

@edwardhartnett edwardhartnett added this to To do in PIO v2.5.3 Sep 15, 2020
@edwardhartnett edwardhartnett added this to To do in PIO v2.5.4 via automation Jan 19, 2021
@edwardhartnett edwardhartnett removed this from To do in PIO v2.5.3 Jan 19, 2021
@edwardhartnett edwardhartnett self-assigned this Jan 19, 2021
@edwardhartnett edwardhartnett added this to To do in PIO v2.5.5 via automation Apr 23, 2021
@edwardhartnett edwardhartnett removed this from To do in PIO v2.5.4 Apr 23, 2021
4 participants