No way to specify chunking for string variables when writing to netcdf #453

Closed
Kirill888 opened this issue May 18, 2018 · 0 comments

Kirill888 (Contributor) commented May 18, 2018

datacube.storage.netcdf_writer.create_variable

silently truncates chunk sizes that have too many dimensions, then, for string variables, adds the extra dimension back in by converting them from a 1-D array of strings to a 2-D array of bytes.

    if 'chunksizes' in kwargs:
        maxsizes = [len(nco.dimensions[dim]) for dim in var.dims]
        kwargs['chunksizes'] = [min(chunksize, maxsize) if chunksize and maxsize else chunksize
                                for maxsize, chunksize in zip(maxsizes, kwargs['chunksizes'])]

The code above is meant to clamp each element of chunksizes to the maximum size of the corresponding dimension. Because zip stops at the end of its shorter input, it also truncates chunksizes to the number of dimensions present in the original variable (which lacks the extra _nchar dimension).
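The truncation is easy to reproduce in isolation. Below is a minimal sketch that uses plain Python lists in place of the netCDF dimension lookup; the dimension length and chunk sizes are made-up values for illustration, not taken from a real dataset.

```python
# Hypothetical values: a 1-D string variable with 10 entries. The writer later
# adds a second (_nchar) dimension, so the caller supplies two chunk sizes.
maxsizes = [10]          # one entry per dimension of the original variable
chunksizes = [100, 32]   # one entry per dimension of the variable as stored

# Same clamping expression as in the issue: zip stops after the first pair.
clamped = [min(chunksize, maxsize) if chunksize and maxsize else chunksize
           for maxsize, chunksize in zip(maxsizes, chunksizes)]

print(clamped)  # [10] -- the chunk size for the extra dimension is dropped
```

The result has one element where the stored 2-D variable needs two, which is the mismatch described in this issue.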

Later on, an exception is thrown from inside the netCDF library, complaining about a mismatch in the chunksizes parameter.
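One possible way to avoid dropping entries, offered as a sketch and not as the change that was actually committed, is to pad the shorter list so that chunk sizes without a known dimension length pass through unchanged. The helper name clamp_chunksizes is hypothetical, and whether the chunk size for the _nchar dimension should also be clamped is left open here.

```python
from itertools import zip_longest


def clamp_chunksizes(chunksizes, maxsizes):
    """Clamp each chunk size to its dimension length, keeping extra entries.

    Hypothetical helper: entries of `chunksizes` beyond the end of `maxsizes`
    (for example, for a dimension that is added later) are returned unchanged.
    """
    if len(maxsizes) > len(chunksizes):
        raise ValueError("fewer chunk sizes than dimensions")
    # zip_longest pads the missing maxsize with None, so the conditional
    # falls through to the unclamped chunksize for those positions.
    return [min(chunksize, maxsize) if chunksize and maxsize else chunksize
            for maxsize, chunksize in zip_longest(maxsizes, chunksizes)]


print(clamp_chunksizes([100, 32], [10]))  # [10, 32]
```

With this approach the list keeps its original length, so any remaining mismatch is between the caller's chunk sizes and the stored variable, not something introduced by the clamping step.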

Related to #452 (without setting chunksizes, compression settings will be sub-optimal).

Kirill888 added a commit that referenced this issue May 18, 2018

Kirill888 closed this May 30, 2018
