@tsmbland tsmbland commented Jan 28, 2025

This is the first of two PRs (the other is #585) concerning the broadcast_techs function and the correct formatting of data.

To give some background, technologies datasets (and also other datasets such as prices and demand) can take two formats:

  • a fully-explicit format, in which every combination of technology, region and year is specified, with each of these as a separate dimension
  • a flattened format, in which certain combinations of technology/region/year are represented in a single "asset" dimension. In this case, each coordinate represents an asset, and will hold technology data for the particular combination of technology/region/year relevant to that asset

Let's say, for example, you had a capacity dataset for a series of assets:

import xarray as xr

capacity = xr.DataArray(
    data=[10, 50],
    dims=["asset"],
    coords={
        "region": (["asset"], ["R1", "R2"]),
        "technology": (["asset"], ["gasboiler", "heatpump"]),
    },
)

You want to find the capital costs of installing each asset to this level of capacity. First, you need to find the appropriate cap_par value for each asset. Then you multiply this by the capacity to get the overall capital costs.

If your cap_par dataset is in a format like this (i.e. fully-explicit):

cap_par = xr.DataArray(
    data=[[1, 2, 3], [4, 5, 6]],
    dims=['technology', 'region'],
    coords={'technology': ['gasboiler', 'heatpump'],
            'region': ['R1', 'R2', 'R3']},
)

you first need to select the appropriate value for each asset (i.e. gasboiler/R1 for the first asset and heatpump/R2 for the second asset). You can do this with the broadcast_techs function using capacity as a template, which will convert the data to a flattened format:

<xarray.DataArray (asset: 2)> Size: 16B
array([1, 5])
Coordinates:
    technology  (asset) <U9 72B 'gasboiler' 'heatpump'
    region      (asset) <U2 16B 'R1' 'R2'
Dimensions without coordinates: asset

You can then multiply this with your capacity dataset to get the desired capital costs for each asset (which will be an array with a single "asset" dimension).
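
To make the selection step concrete, here is a minimal sketch using plain xarray vectorised indexing. It is not the broadcast_techs implementation, only an illustration of the same idea under the assumption that the template carries "technology" and "region" coordinates along the "asset" dimension. The names cap_par_flat and capital_costs are made up for this example.

```python
import xarray as xr

# Flattened capacity data: one entry per asset.
capacity = xr.DataArray(
    data=[10, 50],
    dims=["asset"],
    coords={
        "region": (["asset"], ["R1", "R2"]),
        "technology": (["asset"], ["gasboiler", "heatpump"]),
    },
)

# Fully-explicit technology data: one value per technology/region pair.
cap_par = xr.DataArray(
    data=[[1, 2, 3], [4, 5, 6]],
    dims=["technology", "region"],
    coords={
        "technology": ["gasboiler", "heatpump"],
        "region": ["R1", "R2", "R3"],
    },
)

# Pointwise selection: both indexers share the "asset" dimension, so xarray
# picks one value per asset instead of forming a technology x region grid.
# (Illustrative stand-in for broadcast_techs, not its actual code.)
cap_par_flat = cap_par.sel(
    technology=capacity["technology"],
    region=capacity["region"],
)

# Both operands now have a single "asset" dimension.
capital_costs = cap_par_flat * capacity

print(cap_par_flat.dims, cap_par_flat.values.tolist())
print(capital_costs.dims, capital_costs.values.tolist())
```

The flattened cap_par holds 1 and 5 (gasboiler/R1 and heatpump/R2), so the capital costs come out as 10 and 250, each along the single "asset" dimension.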

If you didn't use broadcast_techs to convert the technology data, you'd end up with separate "asset", "technology" and "region" dimensions after the multiplication, which doesn't make any sense. In fact, any array with an "asset" dimension should never have separate "technology" or "region" dimensions - if this ever crops up it should raise serious alarm bells (see #585 for one such mistake).

I'm keen to avoid any mistakes like this going forward, so I thought it was worth investing some time to make sure that these mistakes could never happen. The most robust solution I could think of was to add a check to patched_broadcast_compat_data to ensure that no array with an "asset" dimension also has a "region" or "technology" dimension (also "installed").
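
As a rough illustration of the rule being enforced, the check could look something like the sketch below. This is not the code added to patched_broadcast_compat_data; check_dimensions and FORBIDDEN_WITH_ASSET are hypothetical names, and the sketch works on finished DataArray objects rather than inside xarray's broadcasting machinery.

```python
import numpy as np
import xarray as xr

# Dimensions that must never appear alongside "asset" (as listed above).
FORBIDDEN_WITH_ASSET = {"region", "technology", "installed"}


def check_dimensions(array: xr.DataArray) -> None:
    """Hypothetical helper: raise if "asset" coexists with a forbidden dimension."""
    if "asset" not in array.dims:
        return  # Fully-explicit arrays are not affected by the rule.
    clashes = FORBIDDEN_WITH_ASSET.intersection(array.dims)
    if clashes:
        raise ValueError(
            f'Array with an "asset" dimension also has dimensions: {sorted(clashes)}'
        )


# A correctly flattened array passes silently.
good = xr.DataArray(np.zeros(2), dims=["asset"])
check_dimensions(good)

# An array with the mixed dimensions described above is rejected.
bad = xr.DataArray(np.zeros((2, 2, 3)), dims=["asset", "technology", "region"])
try:
    check_dimensions(bad)
except ValueError as err:
    print(err)
```

Raising an error rather than warning means a badly formatted array fails at the operation that produced it, instead of propagating silently into later calculations.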

This flagged numerous issues in the tests, mostly due to poor design of fixtures, which I have gone through and fixed.

Fortunately, the regression tests raised only one issue, concerning the supply function in multi-region models. I've fixed this in a separate PR (#585), which is why the regression tests are failing in this PR, but I've kept the two separate to make reviewing easier.

Also in this PR:

  • Added a doctest to broadcast_techs
  • Hardcoded the name of the asset dimension in broadcast_techs - this will always be "asset" and never anything else, so no need to make it configurable
  • Deleted gross_margin and its test, as this isn't used anywhere

I suggest reviewing this and #585 together. Obviously the tests are failing in this PR, but this is fixed in #585. In this PR, I'd suggest focusing on the code in src, and not worrying too much about the tests, as this is all just boring refactoring.

Close #619

@tsmbland tsmbland changed the title Check array dimensions after multiplication Xarray patch: Check array dimensions after operation Jan 29, 2025
@tsmbland tsmbland changed the title Xarray patch: Check array dimensions after operation Xarray patch: Check array dimensions Jan 29, 2025
@tsmbland tsmbland marked this pull request as ready for review January 29, 2025 23:28
@tsmbland tsmbland mentioned this pull request Jan 30, 2025
@tsmbland tsmbland requested a review from dalonsoa January 30, 2025 15:50
@dalonsoa dalonsoa left a comment

Very nice!!

@tsmbland tsmbland merged commit bc99cfb into main Feb 3, 2025
8 of 14 checks passed
@tsmbland tsmbland deleted the check_arrays branch February 3, 2025 16:02
Development

Successfully merging this pull request may close these issues.

Prevent arrays from having separate "asset" and "region" dimensions