Custom aggregation #147
I think the good to know page is really neat!
I am having issues using `to_dataset`. It is unclear to me whether I don't understand what it is supposed to do or it is not working...
Co-authored-by: juliettelavoie <juliette.lavoie@hotmail.ca>
Co-authored-by: RondeauG <38501935+RondeauG@users.noreply.github.com>
@RondeauG I re-edited the docstring because it felt a bit redundant.
Co-authored-by: RondeauG <38501935+RondeauG@users.noreply.github.com>
I think it's good to go!
Pull Request Checklist:
`number`) and pull request (:pull:`number`) has been added
What kind of change does this PR introduce?
- New `to_dataset` method on `DataCatalog`. Same as `to_dask`, but exposes options to change the aggregation control:
  - `concat_on` to list columns over which the datasets are concatenated.
  - `ensemble_on` to list columns over which a `realization` dimension is created.
The goal of this function is to reduce code complexity in some common cases where one wants a dataset with all members and experiments (for example).
I also added a "good to know" page to the doc: a place to list all sorts of miscellaneous information that users of xscen should be aware of. The first section is about how to open data.
Example:
The output:
The same code with pure xscen:
The added value of this PR seems evident to me here.
Where?
As it was developed for an ensemble case, I implemented it as a single-dataset output, but that seems a bit limited. In an xscen world, this could be implemented at the `search_data_catalogs` level.
However, it felt to me that the `search_data_catalogs`/`extract_dataset` combo is best used for raw data. Once at the ensemble step, simple `DataCatalog.search` calls are often enough. Hence the need to have this on `DataCatalog`.
Also, this could be moved upstream to `intake-esm`, but the `ensemble_on` and `calendar` args seem a bit too xclim-specific...