Preserve chunks in CF Writer #1254
Conversation
Codecov Report
@@            Coverage Diff            @@
##           master    #1254     +/-  ##
=========================================
  Coverage   90.51%   90.52%
=========================================
  Files         228      228
  Lines       33334    33377      +43
=========================================
+ Hits        30173    30214      +41
- Misses       3161     3163       +2
I just noticed that coordinate variables like latitude or longitude can also be chunked.
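Coordinate variables backed by dask arrays carry chunk information just like data variables, which is why a chunk-preserving writer has to inspect them too. A minimal sketch of that situation (the variable names are illustrative, not taken from the writer's code):

```python
import dask.array as da
import xarray as xr

# A non-dimension coordinate (latitude) backed by a chunked dask array
lat = xr.DataArray(da.zeros((100, 200), chunks=(50, 100)), dims=('y', 'x'))
data = xr.DataArray(da.ones((100, 200), chunks=(50, 100)),
                    dims=('y', 'x'), coords={'latitude': lat})

# Both the data variable and its coordinate report dask chunks
print(data.data.chunks)                     # ((50, 50), (100, 100))
print(data.coords['latitude'].data.chunks)  # ((50, 50), (100, 100))
```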
force-pushed from c27beb2 to 1951de4 (Compare)
The logic is sound, so I'm good with this. I have two comments about cleanliness though.
@@ -616,6 +632,7 @@ def save_datasets(self, datasets, filename=None, groups=None, header_attrs=None,
             root.attrs['Conventions'] = CF_VERSION

+        # Remove satpy-specific kwargs
         to_netcdf_kwargs = copy.deepcopy(to_netcdf_kwargs)  # may contain dictionaries (encoding)
         satpy_kwargs = ['overlay', 'decorate', 'config_files']
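The hunk above strips satpy-only keyword arguments before the remainder is forwarded to xarray's `to_netcdf`. A standalone sketch of that idea, using the names from the diff (the wrapper function `remove_satpy_kwargs` itself is hypothetical):

```python
import copy

def remove_satpy_kwargs(to_netcdf_kwargs):
    # Deep-copy first: the kwargs may contain nested dictionaries
    # (e.g. 'encoding') that must not be mutated for the caller.
    to_netcdf_kwargs = copy.deepcopy(to_netcdf_kwargs)
    satpy_kwargs = ['overlay', 'decorate', 'config_files']
    for kwarg in satpy_kwargs:
        to_netcdf_kwargs.pop(kwarg, None)
    return to_netcdf_kwargs

kwargs = {'encoding': {'foo': {'chunksizes': (1024, 1024)}},
          'decorate': True, 'engine': 'h5netcdf'}
clean = remove_satpy_kwargs(kwargs)
print(clean)  # {'encoding': {'foo': {'chunksizes': (1024, 1024)}}, 'engine': 'h5netcdf'}
```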
I know you didn't do this, but it is too bad we have to do this. I'm sure it will come back to bite us in the future.
@sfinkens what is the status on this? Do you think you can have it ready by Friday?
@mraspaud Yes, I think that's possible!
Replace DataArray.drop with .drop_vars
@mraspaud Done!
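For context on the commit above: xarray deprecated `DataArray.drop` in favour of the more explicit `.drop_vars`. A minimal sketch of the replacement (the example data is illustrative):

```python
import numpy as np
import xarray as xr

arr = xr.DataArray(np.arange(4), dims='x',
                   coords={'x': [0, 1, 2, 3], 'aux': 7})

# Previously: arr.drop('aux') -- now deprecated
arr = arr.drop_vars('aux')
print('aux' in arr.coords)  # False
```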
LGTM, two small style comments inline.
This PR updates the encoding applied by the CF writer so that generated netCDF files are chunked like the original dask arrays. Chunks specified by the user via `encoding={'foo': {'chunksizes': (1024, 1024)}}` take precedence.

I tested this with a LEO scene (`avhrr_gaclac_l1b`), a single geostationary scene (`ahi_hsd`) and a timeseries of `ahi_hsd` scenes (via MultiScene). The latter has some problems with time bounds (see #1242), but the chunking works fine. I also tested both the `netCDF4` and the `h5netcdf` backends.

Edit: I also removed the new `_satpy*` attributes before writing datasets to netCDF.
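The precedence rule described above (default the netCDF chunksizes to the dask chunks, but let a user-supplied encoding entry win) can be sketched as follows. This is an illustration, not the writer's actual implementation; `build_encoding` is a hypothetical helper:

```python
def build_encoding(dask_chunks, user_encoding=None):
    """Derive per-variable netCDF encoding from dask chunks.

    dask_chunks: per-axis tuples of block sizes, as reported by
    a dask array's .chunks attribute.
    """
    # Default: one chunksize per dimension, taken from the first
    # dask block along each axis.
    encoding = {'chunksizes': tuple(c[0] for c in dask_chunks)}
    # User-specified encoding entries override the defaults.
    encoding.update(user_encoding or {})
    return encoding

print(build_encoding(((550, 550), (1100,))))
# {'chunksizes': (550, 1100)}
print(build_encoding(((550, 550), (1100,)),
                     {'chunksizes': (1024, 1024)}))
# {'chunksizes': (1024, 1024)}
```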