Support for the new compression arguments. #7551

markelg · 2023-02-23T09:32:56Z

Use a dict for the arguments and update it with the encoding, so all variables are passed.

Closes #7388 (Xarray does not support full range of netcdf-python compression options)
Tests added
User visible changes (including notable bug fixes) are documented in whats-new.rst
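
The dict-and-update pattern described above can be sketched as follows. This is a minimal illustration, not the code in xarray/backends/netCDF4_.py: the function name `build_create_variable_kwargs` and the default values are assumptions made for the example. The point is only that updating a dict of defaults with the encoding forwards every key the user supplied, including ones the wrapper does not list explicitly.

```python
def build_create_variable_kwargs(name, datatype, dimensions, encoding):
    """Merge user-supplied encoding into default keyword arguments.

    Hypothetical helper for illustration; names and defaults are assumed.
    """
    # Defaults for the long-standing options (illustrative values).
    kwargs = {
        "varname": name,
        "datatype": datatype,
        "dimensions": dimensions,
        "zlib": False,
        "complevel": 4,
        "shuffle": True,
        "fletcher32": False,
        "contiguous": False,
        "chunksizes": None,
    }
    # Updating with the encoding forwards every key, including ones this
    # helper never names (for example "compression" or "blosc_shuffle").
    kwargs.update(encoding)
    return kwargs


kwargs = build_create_variable_kwargs(
    "temperature",
    "f4",
    ("time", "x"),
    {"compression": "zstd", "complevel": 5},
)
print(kwargs["compression"], kwargs["complevel"])
```

With an explicit keyword list, a key such as `compression` would be dropped unless the wrapper was updated for it; with the merged dict it reaches the underlying call unchanged.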

headtr1ck

This should also fix #7634, and cover future encoding updates of netcdf.

xarray/backends/netCDF4_.py

Should close pydata#7634

djhoese · 2023-03-24T00:59:35Z

Just curious, what is the status of this PR? Our library's tests choose which compression arguments to pass to xarray based on the xarray version. We had hoped this PR would be included in 2023.3.0.

jhamman · 2023-03-27T02:15:54Z

@djhoese - I think we're just missing tests for this. No reason it can't go out in the next release.

sfinkens · 2023-05-24T12:15:18Z

@markelg Thanks a lot for adding this! Do you have time to finalize it in the near future? If not, I could also take a look at the tests if you like.

garciampred · 2023-05-24T12:54:46Z

This is currently stuck waiting until the problems with the last netcdf-c versions are fixed in a new release. See the issues (#7388).

When they are fixed I will write the tests if I have time. But of course any help and suggestions are welcome.

markelg · 2023-06-21T13:36:09Z

I did my best to write a test. Please check it out. It would be nice to test the lossy compression (significant_digits) too, but I could use some help with that as I don't have much more time to dedicate to this at the moment.

There are a lot of combinations possible with the encoding parameters that netCDF4 accepts, and they change the output file encoding in different ways. The test does not check all of them. Also blosc apparently only works with blosc_shuffle=1, although 0 and 2 are available too according to the documentation. I had to parametrize the shape of the test data in order to increase it in this test, as blosc does not work with small chunk sizes.
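
For readers who want to try the options discussed here, the following is a hedged sketch of how the new arguments might be passed through xarray's per-variable `encoding`. The helper names `compression_encoding` and `write_compressed` are hypothetical. Whether a given codec actually works depends on how the installed netCDF-C library and its filter plugins were built, and the `blosc_shuffle=1` setting simply follows the observation in the comment above.

```python
def compression_encoding(var_names, compression="zlib", complevel=4, **extra):
    """Return an xarray-style encoding mapping, same settings per variable.

    Hypothetical helper; extra keys (e.g. blosc_shuffle) are passed through.
    """
    settings = {"compression": compression, "complevel": complevel, **extra}
    # Copy the settings so each variable gets its own independent dict.
    return {name: dict(settings) for name in var_names}


def write_compressed(ds, path, **settings):
    """Write an xarray Dataset with the netCDF4 engine.

    Requires xarray, netCDF4 and a netCDF-C build that ships the codec.
    """
    encoding = compression_encoding(list(ds.data_vars), **settings)
    ds.to_netcdf(path, engine="netcdf4", encoding=encoding)


enc = compression_encoding(["var2"], compression="blosc_lz4", blosc_shuffle=1)
print(enc)
```

Only `compression_encoding` runs in the snippet itself; `write_compressed` is shown for context and is not called, since it needs a Dataset and a suitable library build.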

zklaus · 2023-09-28T12:48:08Z

Could someone (@rabernat?) restart the CI here, at least to get viewable logs again? Following #7388 (comment), this seems to be the next step.

rabernat · 2023-09-28T12:55:39Z

I'm not seeing an option to rerun the jobs, I think due to the age of the last run.

I think the best bet would be to merge latest upstream changes from main (needed anyway to move forward) and push to this branch, which will trigger a new CI run.

zklaus · 2023-09-28T13:07:54Z

I think merging main was the right thing to do anyways. For the future: If you just close and reopen the PR, the CI will be restarted.

rabernat · 2023-09-28T13:15:07Z

Running the tests revealed a bunch of new errors in test_combine.py. I wonder if these are really related to this PR.

@markelg - are you available to respond to the review comments and finish up this PR?

markelg · 2023-09-28T13:27:44Z

> Running the tests revealed a bunch of new errors in test_combine.py. I wonder if these are really related to this PR.
>
> @markelg - are you available to respond to the review comments and finish up this PR?

I can do this next week; I am busy until Tuesday. I think I already answered the comments, though. Edit: I see now that I missed the first one; I will commit the suggested change.

It would be nice if someone reviewed the tests. They were hard to write, as the compression arguments seem to interact with each other in complicated ways.

zklaus · 2023-09-28T13:37:09Z

The good news is that with the exception of ubuntu-latest/py3.9 min-all-deps, no libnetcdf compression errors are showing up, right? That test pulls in an older version of the library, probably due to some dependency issues?

xarray/tests/test_backends.py

kmuehlbauer · 2023-12-19T07:58:51Z

@markelg Thanks! I've now fully cleaned up my earlier changes, fixed the test-decorator and moved the whats-new entry to correct position. Added one suggestion for the dict-comparison.

The last remaining issue is with the Windows builds. I have no expertise with Windows, so I have to rely on others' opinions on how to resolve it.

xarray/tests/__init__.py

rabernat · 2023-12-20T14:34:58Z

I just wanted to thank everyone working on this PR. It seems like a pretty tricky puzzle to put together! It will be great once we have these new compression options available via Xarray.

How can we help?

kmuehlbauer · 2023-12-20T14:39:18Z

@rabernat Thanks for stopping by.

I've lost track of whether netcdf-c/netcdf4-python (here the conda-forge builds) have the new compressions compiled in or not. The question is, can xarray do anything about the failing Windows runs or not?

If not, we could just skip windows in the tests for now.

rabernat · 2023-12-20T14:48:51Z

I would be in favor of xfailing the Windows tests and moving forward with tested support for these new compression options on Linux and macOS. Packaging NetCDF for Windows is probably not our job as Xarray devs. We should raise upstream issues as needed to bring these problems to the attention of whoever maintains those packages.

This is a significant improvement for Xarray users and has already been a lot of work for the developers. So we should try to wrap it up asap.
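
As an illustration of the "xfail on Windows" idea, here is a minimal sketch using only the standard library's `unittest`; this is a swapped-in technique, since xarray's own test suite uses pytest. The decorator name `xfail_on_windows` and the test class are hypothetical.

```python
import sys
import unittest

ON_WINDOWS = sys.platform.startswith("win")


def xfail_on_windows(test_func):
    """Mark a test as an expected failure on Windows only."""
    return unittest.expectedFailure(test_func) if ON_WINDOWS else test_func


class CompressionRoundTrip(unittest.TestCase):  # hypothetical test case
    @xfail_on_windows
    def test_new_codecs(self):
        # Placeholder body; a real test would write and re-read a file
        # using one of the new compression options.
        self.assertTrue(True)
```

In a pytest-based suite the same intent is usually expressed with `pytest.mark.xfail(sys.platform == "win32", reason="...")`.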

xarray/tests/test_backends.py

dopplershift · 2023-12-20T19:59:03Z

(cc @WardF @DennisHeimbigner for awareness)

DennisHeimbigner · 2023-12-20T20:33:21Z

Is there any specific issue that netcdf-c needs to address?

markelg · 2023-12-21T10:36:05Z

> Is there any specific issue that netcdf-c needs to address?

It looks like the netCDF windows build in conda-forge does not support the new compression filters. But as others said, I think it is OK to merge this PR and close the issues linked to it. And someone can open a new issue here or upstream to report this problem in windows.

rabernat · 2023-12-21T15:06:53Z

I'd love to be able to merge this. There are just two tiny mypy errors left before everything is green:

xarray/core/dataset.py:174: error: Unused "type: ignore" comment  [unused-ignore]
xarray/core/dataarray.py:83: error: Unused "type: ignore" comment  [unused-ignore]

These files were unchanged by this PR. Any idea what is going on here?

headtr1ck · 2023-12-21T15:19:57Z

> I'd love to be able to merge this. There are just two tiny mypy errors left before everything is green:
>
> xarray/core/dataset.py:174: error: Unused "type: ignore" comment  [unused-ignore]
> xarray/core/dataarray.py:83: error: Unused "type: ignore" comment  [unused-ignore]
>
> These files were unchanged by this PR. Any idea what is going on here?

These are already in main. We can merge anyway.

dcherian · 2023-12-21T15:24:33Z

Thanks for your patience here @markelg!

Support for the new compression arguments.

f8a0014

Use a dict for the arguments and update it with the encoding, so all variables are passed.

github-actions bot added io topic-backends labels Feb 23, 2023

djhoese mentioned this pull request Mar 13, 2023

Repair CF writer tests failing with libnetcdf >= 4.9.0 pytroll/satpy#2412

Merged

lsetiawan mentioned this pull request Mar 15, 2023

Unpinning netCDF4 version causes failure OSOceanAcoustics/echopype#988

Open

Merge branch 'main' into fix_compression_options

42c5a97

headtr1ck reviewed Mar 18, 2023

xarray/backends/netCDF4_.py (outdated)

xarray/backends/netCDF4_.py

markelg and others added 2 commits March 22, 2023 17:33

significant_digit and other missing keys added

7abf103

Should close pydata#7634

[pre-commit.ci] auto fixes from pre-commit.com hooks

fa408c5

for more information, see https://pre-commit.ci

djhoese mentioned this pull request Mar 24, 2023

Update condition of xarray version in CF writer tests pytroll/satpy#2422

Merged

3 tasks

Merge branch 'main' into fix_compression_options

1b9f6db

markelg and others added 6 commits June 20, 2023 14:08

Merge branch 'pydata:main' into fix_compression_options

8c0c584

test for the new compression argument

e489a32

[pre-commit.ci] auto fixes from pre-commit.com hooks

fff18a8

for more information, see https://pre-commit.ci

move the new test to TestNetCDF4Data

a89f816

fixed conflict

2feafc5

[pre-commit.ci] auto fixes from pre-commit.com hooks

371d2d7

for more information, see https://pre-commit.ci

markelg mentioned this pull request Jun 22, 2023

Xarray does not support full range of netcdf-python compression options #7388

Closed

Merge branch 'main' into fix_compression_options

e74ec56

kmuehlbauer added 2 commits December 19, 2023 07:47

fix whats-new.rst

4ae2bca

fix requires-decorator

f8f37f4

kmuehlbauer reviewed Dec 19, 2023

xarray/tests/test_backends.py (outdated)

headtr1ck reviewed Dec 19, 2023

xarray/tests/__init__.py (outdated)

kmuehlbauer reviewed Dec 19, 2023

xarray/tests/__init__.py (outdated)

fix for asserts of other tests that use test data

b7e56fe

kmuehlbauer reviewed Dec 20, 2023

xarray/tests/__init__.py (outdated)

Apply suggestions from code review

138a439

kmuehlbauer reviewed Dec 20, 2023

xarray/tests/__init__.py

kmuehlbauer and others added 3 commits December 20, 2023 08:47

Update xarray/tests/__init__.py

9e25e6e

[pre-commit.ci] auto fixes from pre-commit.com hooks

4e59c9a

for more information, see https://pre-commit.ci

Merge branch 'main' into fix_compression_options

674d35f

kmuehlbauer reviewed Dec 20, 2023

xarray/tests/test_backends.py

Update xarray/tests/test_backends.py

bc74cd8

rabernat approved these changes Dec 21, 2023

dcherian merged commit a04900d into pydata:main Dec 21, 2023
25 of 27 checks passed

bzah mentioned this pull request Jan 5, 2024

Add support for netCDF4.EnumType #8147

Merged

4 tasks
