Stream ARCO ERA5 data #357
This reverts commit 0001b99.
ScottEilerman left a comment:
Pending confirmation that the diffs in the data look fine, this is ready to go! Thanks for helping get it cleaned up!
@ScottEilerman I'm still getting the warning: in the docs notebook when instantiating the
What seems odd here is that charset-normalizer is a dependency of requests and, at least in a fresh environment, seems to get installed via both conda and pip. Is it possible that the environment where these docs notebooks are running is in a weird state?
That was indeed the case! Thanks!
This is good to merge.
This PR is in collaboration with @ScottEilerman. I started from #355, but somehow could not push to it directly. The PR implements streaming ERA5 data directly from the cloud so that users do not need to pre-download their ERA5 data.
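For readers unfamiliar with cloud streaming, here is a minimal sketch of what reading ERA5 directly from the public ARCO Zarr store can look like with plain xarray. It is not the `ERA5ARCODataset` implementation from this PR: the store URL, the helper names `to_era5_longitude` and `open_era5_point`, and the anonymous-access option are assumptions for illustration, and the call needs `xarray`, `zarr`, and `gcsfs` installed.

```python
# Assumption: the public ARCO ERA5 Zarr store on Google Cloud Storage.
# The class added in this PR may use a different store or access path.
ARCO_ERA5_URL = "gs://gcp-public-data-arco-era5/ar/full_37-1h-0p25deg-chunk-1.zarr-v3"


def to_era5_longitude(lon_deg: float) -> float:
    """Map a longitude in [-180, 180] onto ERA5's 0..360 convention."""
    return lon_deg % 360.0


def open_era5_point(lat: float, lon: float, start: str, end: str, variables: list[str]):
    """Lazily open the store and select the grid cell nearest (lat, lon).

    Hypothetical helper: nothing is downloaded until values are accessed.
    """
    import xarray as xr  # imported lazily; also requires zarr and gcsfs

    ds = xr.open_zarr(
        ARCO_ERA5_URL,
        chunks=None,                        # plain lazy arrays, no dask
        storage_options={"token": "anon"},  # anonymous access to the public bucket
    )
    return (
        ds[variables]
        .sel(time=slice(start, end))
        .sel(latitude=lat, longitude=to_era5_longitude(lon), method="nearest")
    )
```

A call such as `open_era5_point(40.0, -105.0, "2020-02-02", "2020-02-03", ["2m_temperature"])` would then fetch only the chunks covering that point and time window, which is why no pre-download step is needed.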
Changes
- `ERA5ARCODataset` class.
- `gcsfs` as an optional dependency to support cloud streaming.
- `stream`, to isolate streaming tests: selecting it runs only the streaming tests, and they are excluded otherwise. This allows faster development since streaming tests take longer.
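The exact test command did not survive in the text above, so the following is only a sketch of one common way to get this behavior in pytest: a `conftest.py` that runs only `stream`-marked tests when a flag is passed and deselects them otherwise. The flag name `--stream` and the marker description are assumptions, not necessarily what this PR uses.

```python
# conftest.py (sketch; the --stream flag name is an assumption)


def pytest_addoption(parser):
    # Register a command-line switch that selects the streaming tests.
    parser.addoption(
        "--stream",
        action="store_true",
        default=False,
        help="run only tests marked with @pytest.mark.stream",
    )


def pytest_configure(config):
    # Declare the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line("markers", "stream: test streams data from the cloud")


def pytest_collection_modifyitems(config, items):
    # With --stream keep only marked tests; without it keep only unmarked ones.
    want_stream = config.getoption("--stream")
    selected, deselected = [], []
    for item in items:
        is_stream = item.get_closest_marker("stream") is not None
        (selected if is_stream == want_stream else deselected).append(item)
    if deselected:
        config.hook.pytest_deselected(items=deselected)
        items[:] = selected
```

Under these assumptions, `pytest --stream` would run only the streaming tests, while a plain `pytest` run would skip them and stay fast.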
Other changes:
- Reorganized internal logic for handling `use_dask` and `read_zarr` workflows.
- Cleaned up `conftest.py` by removing unnecessary `request` parameters from fixtures.
- Adjusted fixture `start_time` values in `conftest.py` to more intuitive dates (changed from January 31 to February 2) since source data does not include January.

- Tests added
- Passes `pre-commit run --all-files`
- Changes are documented in `docs/releases.md`
- New functions/methods are listed in `docs/api.rst`
- New functionality has documentation