Add a function to aggregate a variable to a region from subregions #207
using latest `pyam.append()` beauty
for consistency with `filter()`
Hey @danielhuppmann quick first question. I think we already have an |
Probably need to ask @gidden about this one I think? Otherwise looks very nice to me |
[1] e.g., for
is similar to (except for the "average" issue in [2])
where [2] calling |
I meant |
Ok, just getting back to this now. I agree that there are differences between the two, but my suspicion here is that we should harmonize them into one function (or two, but different). The original goal of What do you think? |
Agree that the two could, maybe should, be refactored into a data-manipulation and a “paint” function. But...
Merging these two features will require a lot of kwargs... I did go down that rabbit hole to harmonise them and I did not see a light after an hour or two, so I abandoned the effort. We can venture into it again together next week... |
Per in-person discussions, we have tentatively agreed to break this into two functions:
|
@gidden, merge conflicts resolved, should be good to be merged once the CI passes |
Just catching up on this now
So in this PR you add |
Yes (but renaming the functions to And as suggested in the inline discussion above, we could refactor the existing |
hey @danielhuppmann. it turns out the geopandas dep is more complicated than initially thought - it seems some underlying datasets have changed (perhaps ISO names? I would need to dig further)... For the moment, can you try updating the CI conda installs to |
cherrypicked from `gidden:testci`
f51bf94
to
e38afc5
Woo, thanks so much @danielhuppmann. can you please add that conversation to the |
hah, beat me to it =) |
Please confirm that this PR has done the following:
Description of PR
This PR refactors the `check_aggregate_regions()` function into a separate `aggregate_region()` function and a `check_...()` function to use it in scenario postprocessing (calculating regional values rather than just checking that the aggregation is correct).

The important issue here is that any variable components that are only defined at the `region` (e.g., 'World') should be added to the regional total. Say that CO2 emissions from air travel are only accounted for at the global level; then the following output should be returned:

Input
Expected output
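
A minimal sketch of the intended behaviour, written in plain pandas rather than against the pyam API. The function name `aggregate_region_sketch`, the variable name `Emissions|CO2|Aviation`, and all numbers are hypothetical and chosen only for illustration; this is not the implementation in this PR.

```python
import pandas as pd


def aggregate_region_sketch(df, variable, region="World",
                            subregions=None, components=None):
    """Sum `variable` over subregions and add components that are
    only defined at the `region` level (illustrative sketch only)."""
    if subregions is None:
        # default: every region in the data except the target region
        subregions = [r for r in df["region"].unique() if r != region]

    group_cols = ["model", "scenario", "unit", "year"]

    # sum of the variable over all subregions
    in_subregions = df["region"].isin(subregions)
    sub = df[(df["variable"] == variable) & in_subregions]
    total = sub.groupby(group_cols)["value"].sum()

    if components is None:
        # sub-categories of `variable` reported at `region` but at no subregion
        # (simplification: matches any depth below `variable`)
        is_component = df["variable"].str.startswith(variable + "|")
        at_region = set(df.loc[(df["region"] == region) & is_component, "variable"])
        at_subregions = set(df.loc[in_subregions, "variable"])
        components = sorted(at_region - at_subregions)

    comp = df[(df["region"] == region) & df["variable"].isin(components)]
    extra = comp.groupby(group_cols)["value"].sum()

    # regional total = subregion sum + region-only components
    return total.add(extra, fill_value=0)


# hypothetical data: aviation emissions are only reported for 'World'
rows = [
    ("m", "s", "R1", "Emissions|CO2", "Mt CO2/yr", 2010, 3.0),
    ("m", "s", "R2", "Emissions|CO2", "Mt CO2/yr", 2010, 4.0),
    ("m", "s", "World", "Emissions|CO2|Aviation", "Mt CO2/yr", 2010, 1.0),
]
cols = ["model", "scenario", "region", "variable", "unit", "year", "value"]
data = pd.DataFrame(rows, columns=cols)

result = aggregate_region_sketch(data, "Emissions|CO2")
print(result)  # the single entry equals 3 + 4 + 1 = 8
```

In this sketch, passing `components` explicitly overrides the automatic detection, mirroring the `components` kwarg described in the refactoring notes.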
Refactoring as part of this PR
While implementing, there were a number of issues that I believed should be improved even though they break the API.
@znicholls, you implemented the first version of this, can you mark the items below if you agree with the change?
- `[check_]aggregate[_regions]()` functions: refactor `units` to `unit` for closer resemblance to `df.filter()`
- `check_aggregate...()` functions return the index columns in the standard IAMC order (i.e., `[model, scenario, region, variable, unit]`)
- `check_aggregate_regions()` was renamed to `check_aggregate_region()` because it only ever checks one region at a time
- `check_aggregate_regions()` takes a kwarg `subregions` for an optional list of regions to aggregate and a kwarg `components` for variable components to be included at the `region` level (before, it was not possible to select custom variable components, and `components` referred to the custom subregions, differing from the use in `check_aggregate()`)
- `_apply_filters()` interprets `col=None` as no filter applied, i.e., all `True` (before, it would return all `False`). This streamlines passing the unit filter from the `[check_]aggregate[_regions]()` functions and is in line with how pandas treats `slice(None)`.
- `_apply_filters()` takes `kwargs`, not a `dict`
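
To illustrate the `col=None` convention described above, here is a rough sketch of a keyword-based filter helper. The name `apply_filters_sketch` and its exact-match logic are assumptions made for this example; the real `_apply_filters()` in pyam may handle filter values differently.

```python
import pandas as pd


def apply_filters_sketch(df, **kwargs):
    """Return a boolean mask over the rows of `df`.

    A filter value of None means 'no filter applied' (all True),
    analogous to how pandas treats slice(None).
    """
    keep = pd.Series(True, index=df.index)
    for col, value in kwargs.items():
        if value is None:
            continue  # no restriction on this column
        # accept a single value or a collection of values
        values = list(value) if isinstance(value, (list, tuple, set)) else [value]
        keep &= df[col].isin(values)
    return keep


# hypothetical data for illustration
data = pd.DataFrame({
    "region": ["World", "R1", "R2"],
    "unit": ["Mt CO2/yr", "Mt CO2/yr", "EJ/yr"],
})

mask_none = apply_filters_sketch(data, unit=None)      # keeps every row
mask_unit = apply_filters_sketch(data, unit="EJ/yr")   # keeps only the last row
mask_both = apply_filters_sketch(data, region=["R1", "R2"], unit="Mt CO2/yr")

# pandas analogue: slice(None) selects everything
all_rows = data.loc[slice(None)]
```

With this convention, a caller can forward an optional `unit` argument unchanged instead of branching on whether it was given.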