Add retrieval function for ERA5 reanalysis data #1264

Open
wants to merge 44 commits into
base: main

AdamRJensen
Member

@AdamRJensen AdamRJensen commented Jul 27, 2021

  • I am familiar with the contributing guidelines
  • Tests added
  • Updates entries to docs/sphinx/source/api.rst for API changes.
  • Adds description and name entries in the appropriate "what's new" file in docs/sphinx/source/whatsnew for all changes. Includes link to the GitHub Issue with :issue:`num` or this Pull Request with :pull:`num`. Includes contributor name and/or GitHub username (link with :ghuser:`user`).
  • New code is fully documented. Includes numpydoc compliant docstrings, examples, and comments where necessary.
  • Pull request is nearly complete and ready for detailed review.
  • Maintainer: Appropriate GitHub Labels and Milestone are assigned to the Pull Request and linked Issue.

Description
ERA5 is a global reanalysis dataset produced by ECMWF that provides hourly irradiance data for the entire world starting in 1979.

The proposed function retrieves ERA5 data from the Climate Data Store (CDS) using the cdsapi Python package.
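For readers unfamiliar with cdsapi, a retrieval of this kind might look roughly like the sketch below. It is not the code in this PR: `build_era5_request` and `retrieve_era5` are hypothetical helpers, and the request keys and the dataset name `reanalysis-era5-single-levels` are assumptions based on common CDS request conventions. Only `retrieve_era5` touches the network, so the request itself can be built and inspected offline.

```python
import datetime


def build_era5_request(latitude, longitude, start, end, variables,
                       grid=(0.25, 0.25)):
    """Assemble a CDS request dictionary for one point (hypothetical helper)."""
    # CDS expects the area as [north, west, south, east]; a single point
    # repeats the same coordinates.
    area = [latitude, longitude, latitude, longitude]
    return {
        'product_type': 'reanalysis',   # assumed request key
        'format': 'netcdf',             # assumed request key
        'variable': list(variables),
        'date': f'{start:%Y-%m-%d}/{end:%Y-%m-%d}',
        'time': [f'{hour:02d}:00' for hour in range(24)],
        'grid': list(grid),
        'area': area,
    }


def retrieve_era5(request, target, dataset='reanalysis-era5-single-levels'):
    """Submit the request through cdsapi and save the result to `target`."""
    import cdsapi  # optional dependency, imported only when needed
    client = cdsapi.Client()  # credentials come from the environment or ~/.cdsapirc
    client.retrieve(dataset, request, target)
    return target


# Building the request needs no network access or API key.
request = build_era5_request(
    latitude=55.7, longitude=12.5,
    start=datetime.date(2020, 1, 1), end=datetime.date(2020, 1, 2),
    variables=['surface_solar_radiation_downwards'],
)
```

Calling `retrieve_era5(request, 'era5.nc')` would then queue the request on the CDS and download the file once it is ready, which can take a while.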

@AdamRJensen
Member Author

AdamRJensen commented Aug 9, 2021

To run the ERA5 tests, CDSAPI_KEY needs to be set as an environment variable.

The API key can be obtained by creating a CDS account. Note that the tests do not need a local file with the key information, as described here.
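One way to keep the test suite usable for contributors without a CDS account is to skip the ERA5 tests when the variable is missing. The following is a minimal sketch, not the approach taken in this PR: `cds_key_available` and `requires_cds_key` are hypothetical names, and the standard-library `unittest.skipUnless` stands in for whatever skip decorator the test suite actually uses.

```python
import os
import unittest


def cds_key_available(environ=os.environ):
    """Return True when a non-empty CDSAPI_KEY is set (hypothetical helper)."""
    return bool(environ.get('CDSAPI_KEY', '').strip())


# Decorator that skips a test when no key is configured.
requires_cds_key = unittest.skipUnless(
    cds_key_available(),
    'CDSAPI_KEY environment variable not set',
)


class TestERA5Remote(unittest.TestCase):
    @requires_cds_key
    def test_key_is_configured(self):
        # Placeholder body; a real test would request data from the CDS.
        self.assertTrue(os.environ['CDSAPI_KEY'])
```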

@wholmgren Sorry to bother, but I think maybe you're the only one who has access to the pvlib email and Azure?

Member

@kandersolar kandersolar left a comment

Looking pretty good, a few minor notes below. It's a shame that the CDS API is so slow to return data, but I guess it is what it is. Ready for @pvlib/pvlib-maintainer to take a look if anyone has the time.

@wholmgren wholmgren mentioned this pull request Aug 20, 2021
24 tasks
@kandersolar
Member

@wholmgren said in #1214 (comment) that we should think through how xarray and dask are used in pvlib, and now that 0.9.0 is out I guess we can do that. This PR as it currently stands adds three new optional dependencies: xarray, dask, and cdsapi. #1274 currently needs pydap as well. I assume that what needs to be thought through is whether we are willing to add these new dependencies; if there are other questions to answer please point that out. Here are my thoughts:

In the context of this PR, as I understand it (@AdamRJensen please correct any errors), I think dask is only needed for reading multiple files in a single read_era5 call, so we might be able to drop dask without losing much functionality. And if we do restrict to single files, I suspect we could drop xarray as well and replace it with a lower-level netCDF library (can h5py read netCDF?) combined with something like pvlib.forecast.ForecastModel._netcdf2pandas.
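If xarray, dask, or cdsapi do stay optional, one common pattern is to import them only inside the function that needs them and raise a clear error when they are missing. A minimal sketch, assuming a hypothetical helper `import_optional` (this is not an existing pvlib function):

```python
import importlib


def import_optional(name, purpose):
    """Import an optional dependency or raise an informative ImportError."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise ImportError(
            f"{name} is required for {purpose}; "
            f"install it with `pip install {name}`"
        ) from exc
```

A reader function could then start with `xr = import_optional('xarray', 'reading ERA5 files')`, so that importing pvlib itself never depends on xarray being installed.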

#1274 is less clear. MERRA2 data retrieval is supposed to be done using OPeNDAP instead of basic HTTP requests like most APIs. My very limited understanding of OPeNDAP suggests that it is advisable to use a dedicated library for it instead of just using requests like we normally do. xarray combined with pydap is one way to access the data, but I am pretty reluctant to use pydap: the project seems abandoned and we could only get things working with an unreleased development version. I think there are alternatives worth looking into, but I don't know yet whether xarray will need to be involved.
