Improve reporting features #149

gidden · 2019-01-16T09:40:32Z

It should be possible to “report” or “post-process” a message_ix.Scenario (given a sufficient amount of configuration metadata) to generate output (IAMC-compliant pd.DataFrame or file) that can be directly submitted to IIASA databases for either single- or multi-model assessments.

Tasks

Identify requirements — done as of 2019-02-20, see https://github.com/iiasa/message_ix/wiki/Reporting.
Develop enhancement/strategy proposal (similar to PEPs or NEPs); done as of 2019-03-01, in two stages (see the wiki page for further details):
1. For version 1.2.0 (April 2019): pyam-based message_ix.Reporting class for early release.
2. For version 1.3 or 2.0 (later): dask/graph-based architecture using pyam, with deprecation.
Implementation & examples:
- ~~WIP - do not merge: generic MESSAGEix reporting class #142 —led by @danielhuppmann, includes tutorial updates & examples.~~
- Improve reporting (phase 2) ixmp#150 and Improve reporting (phase 2) #206.


OFR-IIASA · 2019-01-16T10:11:42Z

The first document I would like to add was originally intended as part of the message_ix documentation. This document explains the mathematical operations carried out for currently reported variables.
https://iiasahub.sharepoint.com/:u:/s/ene/MESSAGEix/Ea0QmEl4TB5BqxeIyzERqFIBkq54eBAeFDwq92xR9nTe7A?e=eRCodt

OFR-IIASA · 2019-01-16T10:16:02Z

The second document, which may not reflect the most current version of the reporting, shows for every variable calculated in the model the exact calculation process and therefore provides a detailed overview of required operations.
https://iiasahub.sharepoint.com/:x:/s/ene/MESSAGEix/ERxvb_nkeTZGkD6iXpgznfoBTBxiqui1L50Gs4WqqyjFxQ?e=Maf3HA

OFR-IIASA · 2019-01-16T13:03:57Z

There are several important features required for the current reporting. Please feel free to add features.

specify single technologies with filters on modes, commodities, etc.
special treatment of global variables: don't report; calculate mean, max, or min from regions; or take a weighted average based on another variable.
perform additional operations on results: multiplication with factors (see for example investments or prices)
unit conversion
use data stored as timeseries: e.g. factors for f-gases; globiom reporting can be moved to such an application.
calculation of temporary variables which are not reported
account for the fact that not all technologies defined in the reporting are also part of the model, i.e. the reporting for the global model would be the same for all three SSPs, but not all the technologies in SSP1 are also included in SSP3 (resources, for example)
special aggregates: non-hierarchy variables (sums across variables from different hierarchy levels)
reporting of historical data: pre-firstmodelyear
cumulating data over time: e.g. for variable cumulative resource extraction
the variable tree for reporting powerplant parameters is always the same (e.g. variable O&M costs, capital costs, etc.), but it would be very tiresome to define these hierarchies multiple times, once for each of the variables individually.
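
To make two of the items above concrete, here is a minimal sketch of (a) aggregating over a configured list of technologies when some of them are missing from the model, and (b) cumulating a series over time. All technology names and numbers are made up, and `aggregate` and `cumulate` are hypothetical helpers, not part of message_ix.

```python
from itertools import accumulate

# Made-up model output: annual extraction keyed by (technology, year).
extraction = {
    ("coal_extr", 2020): 5.0,
    ("coal_extr", 2030): 4.0,
    ("gas_extr", 2020): 2.0,
    ("gas_extr", 2030): 3.0,
}

# One reporting configuration shared by several scenarios; "oil_extr" is
# configured but absent from this particular model.
configured = ["coal_extr", "gas_extr", "oil_extr"]


def aggregate(data, techs):
    """Sum per year over `techs`, silently skipping technologies not in `data`."""
    totals = {}
    for (tech, year), value in data.items():
        if tech in techs:
            totals[year] = totals.get(year, 0.0) + value
    return totals


def cumulate(series):
    """Running total over sorted years (period length is ignored in this sketch)."""
    years = sorted(series)
    return dict(zip(years, accumulate(series[y] for y in years)))


annual = aggregate(extraction, configured)
cumulative = cumulate(annual)
print(annual)
print(cumulative)
```

A real cumulative variable would probably also need to weight each value by the duration of its period; that is left out here.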

khaeru · 2019-01-17T13:17:43Z

[Note for future readers that there is a separate, non-public Google Doc containing requirements discussion.]

khaeru · 2019-01-17T13:57:54Z

I left a comment on #150, but this comment also responds to the discussion in #151. Hopefully this is the right place for it 🤷‍♂️

Other software efforts (dask (detailed example), TensorFlow, many others) use the pattern of a graph in which:

nodes represent tasks or atomic operations.
edges represent data.

#150 and #151 invert these, so that nodes are data and edges are (sort of) tasks. I don't see that it's necessary to invent a new pattern, and in the process cut ourselves off from libraries that would simplify the codebase/slow the accumulation of technical debt. Everything discussed so far can be expressed in the common pattern as tasks:

Perform basic arithmetic: addition, multiplication, division, of two or more values.
- Note that "disaggregation" is just array multiplication, yielding a result of higher dimension.
Take a simple sum over one or more dimension(s) of an array (input: which dimension(s)).
- This covers "cumulative" anything.
Take a weighted sum (input: the weights).
Take a sum across disparate items ("special aggregates" per @OFR-IIASA) (input: which items to include).
Other low-level/numpy-like operations, e.g. min, max, mean, median.
Convert units (can introspect units from inputs, or take separate inputs describing the source/target units).
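
As a rough illustration of the common pattern, the sketch below stores a graph in a dask-style dictionary, where each value is either plain data or a task tuple `(func, *args)`. It does not import dask; the small `get` evaluator, the key names, and all numbers are assumptions made for this example.

```python
def get(graph, key):
    """Evaluate `key` in a dask-style graph.

    A value is either plain data or a task tuple (func, *args); string
    arguments that are keys of the graph are evaluated recursively.
    """
    node = graph[key]
    if isinstance(node, tuple) and node and callable(node[0]):
        func, *args = node
        resolved = [
            get(graph, a) if isinstance(a, str) and a in graph else a for a in args
        ]
        return func(*resolved)
    return node


def total(values):
    """Simple sum over the only dimension (here: region)."""
    return sum(values.values())


def weighted_sum(values, weights):
    """Sum of values weighted per key."""
    return sum(values[k] * weights[k] for k in values)


def convert(value, factor):
    """Unit conversion by a constant factor."""
    return value * factor


graph = {
    # Data (made-up numbers)
    "activity": {"R1": 60.0, "R2": 40.0},  # GWa
    "price": {"R1": 10.0, "R2": 20.0},
    "share": {"R1": 0.25, "R2": 0.75},
    "GWa_to_EJ": 0.031536,  # 1 GWa = 8760 h * 3600 s * 1e9 W
    # Tasks
    "activity:global": (total, "activity"),
    "activity:global:EJ": (convert, "activity:global", "GWa_to_EJ"),
    "price:global": (weighted_sum, "price", "share"),
}

activity_ej = get(graph, "activity:global:EJ")
price_global = get(graph, "price:global")
print(activity_ej, price_global)
```

Each operation is an ordinary function, so adding a new kind of task means writing one function and one graph entry.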

In both the dask and tf semantics, even the basic action of yielding a fixed value (of 0 or more dimensions) is a task/operation/node. In the present discussion, that covers:

Yield values of specific GAMS objects (more generally, retrieving any value from the ixmp API).
Yield auxiliary/non-model data, e.g. weights, conversion factors, intensities not present in the model, historical data, etc.
Yield configuration values (i.e. "list of items to sum into a certain aggregate").
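
A minimal sketch of such "source" tasks, assuming a plain dictionary as a stand-in for a solved scenario. `FAKE_SCENARIO`, `fetch`, and `constant` are hypothetical; a real implementation would call the ixmp API where `fetch` does a dictionary lookup.

```python
from functools import partial

# Stand-in for a solved scenario; the real ixmp API is NOT used here.
FAKE_SCENARIO = {"ACT": {("coal_ppl", 2030): 1.5, ("gas_ppl", 2030): 2.5}}


def fetch(scenario, name):
    """Task with no upstream inputs: return one model quantity by name."""
    return scenario[name]


def constant(value):
    """Task that yields a fixed value of any dimensionality."""
    return value


sources = {
    # Model data
    "ACT": partial(fetch, FAKE_SCENARIO, "ACT"),
    # Auxiliary, non-model data (made-up weights)
    "weight": partial(constant, {"coal_ppl": 0.4, "gas_ppl": 0.6}),
    # Configuration: which items go into an aggregate
    "aggregate:fossil": partial(constant, ["coal_ppl", "gas_ppl"]),
}

# Nothing is read until a task is called.
values = {key: task() for key, task in sources.items()}
print(values["aggregate:fossil"])
```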

Non-mathematical manipulations of data are also tasks, e.g.:

Rename variables, e.g. from MESSAGEix internal names to IAMC names or others.
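
Renaming can be sketched as one more task that takes data plus a mapping. The mapping below is illustrative only and the `rename` helper is hypothetical.

```python
# Illustrative mapping from internal technology names to IAMC-style names.
IAMC_NAMES = {
    "coal_ppl": "Secondary Energy|Electricity|Coal",
    "gas_ppl": "Secondary Energy|Electricity|Gas",
}


def rename(data, mapping):
    """Return a copy of `data` with keys renamed; unmapped keys are kept as-is."""
    return {mapping.get(key, key): value for key, value in data.items()}


internal = {"coal_ppl": 1.5, "gas_ppl": 2.5, "wind_ppl": 0.7}  # made-up values
reported = rename(internal, IAMC_NAMES)
print(reported)
```

Whether unmapped names should pass through, be dropped, or raise an error is a design choice this sketch does not settle.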

Using the common pattern, almost all of the requirements can be met by defining an exhaustive collection of tasks, and then by composing and manipulating graphs. We would provide both low- and high-level shorthands for such manipulation, e.g.:

duplicating structures (per the comment above about "powerplant parameters").
reading structures from file.
using user-provided tasks (= computations).
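
For instance, duplicating a structure could be a helper that generates the same branch of graph entries for every parameter. This is only a sketch: `plant_branch` is hypothetical, and `"select"` and `"collect"` are placeholder operation names rather than functions from any library.

```python
def plant_branch(parameter, technologies):
    """Build graph entries reporting one `parameter` for every technology."""
    entries = {
        f"{parameter}|{tech}": ("select", parameter, tech) for tech in technologies
    }
    # Parent entry that gathers all children of this branch.
    entries[parameter] = ("collect", *sorted(entries))
    return entries


technologies = ["coal_ppl", "gas_ppl"]
parameters = ["Capital Cost", "Variable O&M Cost"]

# The hierarchy is written once and repeated for each parameter.
graph = {}
for parameter in parameters:
    graph.update(plant_branch(parameter, technologies))

for key, task in graph.items():
    print(key, "<-", task)
```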

Note in particular the print_and_return task in the dask example linked above. We would define reports as tasks that each take specific other data as input, then format them to an expected return value (e.g. pyam.IamDataFrame or something else). By requesting the report, the computation of the data that it depends on is triggered. The user can then write the return value to the file formats of their choice.
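
A minimal sketch of a report as a task, again using a dask-style dictionary without importing dask. The evaluator, the loader functions, and the record values are assumptions; plain dictionaries stand in for a pyam.IamDataFrame.

```python
executed = []  # records which tasks actually ran


def get(graph, key, cache=None):
    """Evaluate `key`, computing only the tasks it depends on, each at most once."""
    cache = {} if cache is None else cache
    if key not in cache:
        func, *deps = graph[key]
        cache[key] = func(*(get(graph, d, cache) for d in deps))
        executed.append(key)
    return cache[key]


def load_activity():
    return {2020: 7.0, 2030: 9.0}  # made-up values


def load_unused():
    return {2020: 0.0}


def to_iamc_rows(data):
    """Format as IAMC-style records, one per year."""
    return [
        {"model": "m", "scenario": "s", "region": "World",
         "variable": "Primary Energy", "unit": "EJ/yr", "year": y, "value": v}
        for y, v in sorted(data.items())
    ]


graph = {
    "activity": (load_activity,),
    "unused": (load_unused,),
    "report": (to_iamc_rows, "activity"),
}

rows = get(graph, "report")
print(executed)  # "unused" is not in this list: it was never requested
```

The returned rows could then be written out with, for example, `csv.DictWriter`; the choice of file format stays with the user.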

khaeru · 2019-03-01T13:00:50Z

I updated the description here to match the results of today's (2019-03-01) discussion. Further details are on the MESSAGEix OneNote folder for this date.

khaeru · 2019-03-01T13:04:42Z

I've set the milestone for this issue to 1.2.0. Once #142 is merged, the milestone for this issue can be switched to 1.3.0 or 2.0 (whichever we'll target for the dask-based reporting).

khaeru · 2019-06-25T09:05:10Z

Closing this as resolved by the experimental reporting modules in the ixmp 0.2 (just now including iiasa/ixmp#150) and message_ix 1.2 (later today, including #206) releases.

We can use separate, smaller issues to iterate on these features as needed.

gidden added this to the Reporting Revamp milestone Jan 16, 2019

gidden mentioned this issue Jan 16, 2019

Reporting Enhancement Proposal A #150

Closed

danielhuppmann mentioned this issue Jan 16, 2019

Reporting Enhancement Proposal B #151

Closed

khaeru changed the title ~~Reporting Revamp Overview~~ Improve reporting features Mar 1, 2019

khaeru modified the milestones: Reporting Revamp, 1.2.0 Mar 1, 2019

khaeru added the reporting label Mar 1, 2019

This was referenced Mar 1, 2019

WIP: Improve reporting (phase 2) #176

Closed

Improve reporting (phase 2) iiasa/ixmp#126

Closed

Python 2.7 reaches end-of-life on 2020-01-01 #177

Closed

This was referenced Jun 13, 2019

Improve reporting (phase 2) iiasa/ixmp#144

Closed

Improve reporting (phase 2) #206

Merged

gidden mentioned this issue Jun 19, 2019

Improve reporting (phase 2) iiasa/ixmp#150

Merged

29 tasks

khaeru closed this as completed Jun 25, 2019
