Improve data source downloading #342

Open
sjpfenninger opened this issue Apr 10, 2024 · 0 comments
Labels
data, enhancement (New feature or request)

Comments

@sjpfenninger
Member

What can be improved?

The curl command is currently repeated throughout the rules; it would probably be better to wrap it in a single helper that every rule calls.

This helper could optionally print a simple log message like "Downloading from {url} to {output}", which would help with understanding and debugging.
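
A minimal sketch of what such a wrapper might look like, assuming it lives in a Python module that the rules can import. The names `build_curl_command` and `download`, the `runner` parameter, and the particular curl flags are all hypothetical choices for illustration, not what the workflow currently uses:

```python
import logging
import subprocess
from pathlib import Path

logger = logging.getLogger(__name__)


def build_curl_command(url: str, output: str) -> list[str]:
    """Build the curl invocation in one place.

    The flags are an assumption: fail on HTTP errors, follow redirects,
    and stay quiet except for error messages.
    """
    return [
        "curl", "--silent", "--show-error", "--location", "--fail",
        "--output", str(output), url,
    ]


def download(url: str, output: str, runner=subprocess.run) -> None:
    """Log the transfer, then run curl (hypothetical helper).

    `runner` is injectable so the helper can be exercised without network access.
    """
    # Make sure the target directory exists before curl writes into it.
    Path(output).parent.mkdir(parents=True, exist_ok=True)
    logger.info("Downloading from %s to %s", url, output)
    # check=True raises CalledProcessError if curl exits with a non-zero status.
    runner(build_curl_command(url, output), check=True)
```

Each rule could then call `download(url, output)` instead of spelling out the curl command itself, so any change to flags or logging would happen in one place.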

The specification of data sources could also be improved, e.g. by checking an md5 hash to protect against the source file having changed, and by optionally specifying which dimensions and variables are expected in downloaded netCDF files, such as:

gridded-temperature-data:
    url: https://zenodo.org/records/6557643/files/temperature.nc?download=1
    output_file: data/automatic/gridded-weather/temperature.nc
    md5: 5ee6d0152f174c56d42b0a144c13c822
    nc_dims: time, site
    nc_vars: temperature, lat, lon
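
A rough sketch of how an entry like the one above might be validated after download, assuming it has been loaded into a dict with the same keys. The function names are hypothetical, and the dimension and variable names actually found in the file are passed in as plain sets so the sketch does not depend on a particular netCDF library (with xarray they could, for instance, be `set(ds.dims)` and `set(ds.variables)`):

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def split_names(value):
    """Turn a comma-separated string such as 'time, site' into a set of names."""
    return {name.strip() for name in value.split(",") if name.strip()}


def check_source(spec, found_dims=(), found_vars=()):
    """Return a list of problems for one data source entry (hypothetical helper).

    Every check is optional: a key missing from `spec` is simply skipped.
    """
    problems = []
    if "md5" in spec:
        actual = md5sum(spec["output_file"])
        if actual != spec["md5"]:
            problems.append(f"md5 mismatch: expected {spec['md5']}, got {actual}")
    missing_dims = split_names(spec.get("nc_dims", "")) - set(found_dims)
    if missing_dims:
        problems.append(f"missing dimensions: {sorted(missing_dims)}")
    missing_vars = split_names(spec.get("nc_vars", "")) - set(found_vars)
    if missing_vars:
        problems.append(f"missing variables: {sorted(missing_vars)}")
    return problems
```

Returning a list of problems rather than raising on the first one is a design choice made here so that a single log message could report everything wrong with a source at once.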

Version

1.2.0.dev

sjpfenninger added the enhancement and data labels on Apr 10, 2024