# Westeros Tutorial - Introducing reporting

‘Reporting’ is the term used in the MESSAGEix ecosystem to refer to any calculations performed _after_ the MESSAGE mathematical optimization problem has been solved.

This tutorial introduces the reporting features provided by the ``ixmp`` and ``message_ix`` packages.
It was developed by Paul Natsuo Kishimoto ([@khaeru](https://github.com/khaeru)) for a MESSAGEix training workshop held at IIASA in October 2019.
Participants in the MESSAGEix workshops of June and September 2020 contributed feedback.
<!-- Add a line here if you revise the tutorial! -->

**Pre-requisites**
- You have the *MESSAGEix* framework installed and working.
  In particular, you should have installed ``message_ix[report]``, which requires ``ixmp[report]``, ``genno[compat]``, and ``plotnine``.
- Complete tutorial Part 1 (``westeros_baseline.ipynb``).
  - Understand the following MESSAGEix terms: ‘variable’, ‘parameter’.
- Open the [‘Reporting’ page in the MESSAGEix documentation](https://docs.messageix.org/en/stable/reporting.html);
  bookmark it or keep it open in a tab.
  Some text in this tutorial is drawn from that page,
  and it provides a concise reference for concepts explained below.

## Introduction
### What does ‘reporting’ include?

Individual modelers will make different distinctions between—on one hand—the internals of an optimization model and—on the other—reporting, ‘post-processing’, ‘analysis’, and other tasks.
Doing valid research using models like MESSAGE requires that we understand these differences clearly, as well as how we choose to communicate them.

For example, we might say:

> The MESSAGE model shows that total secondary energy (electricity) output in Westeros in the year 720 is 9 GWa.

But, if we are using the model from `westeros_baseline.ipynb`:
1. The raw data from the `Scenario`, after `.solve()` has been called,
   **only** tells us the $\mathrm{ACT}$ variable has certain values.
2. To get the “9 GWa” figure, we must:
   1. Compute the product of activity ($\mathrm{ACT}$, which is dimensionless) and output efficiency ($\mathrm{output}$, in GWa/year), then
   2. Sum across the $t$/`technology` dimension, and finally
   3. Select the single value for the period ($y^A$/`year`) 720.
   
In this example, steps (2.1), (2.2), and (2.3) are ‘reporting’.
Even an intuitive concept like “total secondary energy” **is not a direct output** of the model:
it must be reported.
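
The three steps can be sketched in plain Python, without any MESSAGEix packages.
The technology names, the dictionary layout, and all numbers below are made up for illustration; they are not the Westeros baseline results.

```python
# Illustrative values only: NOT results from the Westeros baseline scenario.
# act[(technology, year)]: activity, dimensionless
act = {
    ("tech_a", 710): 2.0,
    ("tech_a", 720): 3.0,
    ("tech_b", 710): 1.0,
    ("tech_b", 720): 1.5,
}
# output[technology]: output efficiency, in GWa/year (hypothetical values)
output = {"tech_a": 1.0, "tech_b": 2.0}

# Step 2.1: product of activity and output efficiency
product = {(t, y): value * output[t] for (t, y), value in act.items()}

# Step 2.2: sum across the technology dimension, leaving only the year
total_by_year = {}
for (t, y), value in product.items():
    total_by_year[y] = total_by_year.get(y, 0.0) + value

# Step 2.3: select the single value for the period 720
total_720 = total_by_year[720]
print(total_720)
```

None of the three intermediate objects is stored in the solved scenario; each is produced by a calculation performed afterwards.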

Next, we may want to create a plot of electricity output by year.
Some modelers consider this part of ‘reporting’; for others, ‘reporting’ is complete when the values needed for the plot are written to a file, which they can then use with their favourite plotting tool.

## Reporting features in MESSAGEix

The reporting features in ``ixmp`` and ``message_ix`` are developed to support the complicated reporting and multiple workflows required by the IIASA ECE Program for research projects involving large, detailed models such as the [MESSAGEix-GLOBIOM global model](https://docs.messageix.org/global/).
While powerful enough for this purpose, they are also intended to be user-friendly, flexible, and customizable.

The main Python class used for reporting is `message_ix.Reporter`,
which extends the classes `ixmp.Reporter` and `genno.Computer`.
A reporting workflow has two steps:

1. **Prepare** or describe all tasks the Reporter may possibly handle, using ``Reporter.add()`` and other helper methods.
2. **Execute** a subset of these tasks using ``Reporter.get()``, in order to generate one or more quantities, files, reports, etc.

This two-step process allows the Reporter to deliver good performance,
by excluding unneeded tasks
and storing intermediate results that are used in multiple places
(in order to avoid recomputing them).

## Concepts: task graph, node, edge, key

The Reporter is built around a **graph** of *nodes* and *edges*; specifically, a *directed, acyclic graph* (DAG).
This is a simple and generic mathematical concept wherein:
- Every edge has a direction; *from* one node *to* another.
- There are no recursive loops in the graph; that is, no node is its own ancestor.
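
As a rough illustration of these two properties (and not of how the Reporter itself is implemented), the standard-library `graphlib` module, available in Python 3.9 or later, can order the nodes of a small graph and detect a cycle.
The node names `n1`, `n2`, and `n3` are placeholders.

```python
from graphlib import CycleError, TopologicalSorter

# Map each node to the set of nodes it depends on. Edges therefore point
# *from* each dependency *to* the node that uses it.
graph = {"n1": set(), "n2": set(), "n3": {"n1", "n2"}}

# In an acyclic graph, every node can be ordered after all of its ancestors
order = list(TopologicalSorter(graph).static_order())
print(order)

# Adding an edge from "n3" back to "n1" makes "n1" its own ancestor: a cycle
cyclic = {"n1": {"n3"}, "n2": set(), "n3": {"n1", "n2"}}
try:
    list(TopologicalSorter(cyclic).static_order())
    has_cycle = False
except CycleError:
    has_cycle = True
print(has_cycle)
```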

In the Reporter's graph, every **node** represents some kind of **task**.
These can include:

- Numerical *calculation*.
- Other *computations* like manipulating data formats, writing files, etc.
- Simply collecting the outputs of 1 or more *other* nodes.

The node is labeled with a unique name for the output, quantity, or effect it produces.
We call this a **key**.

The task represented by one node may depend on certain inputs that are outputs of other nodes.
These output-to-input links are represented by the **edges** of the graph.

### Example

The following equation

> $C = A + B$

…can be represented as:
- A node named "A" that outputs the raw value of A.
- A node named "B" that outputs the raw value of B.
- A node named "C" that computes a sum of its inputs.
- An edge from "A" to "C", indicating that the value of A is an input to C.
- An edge from "B" to "C".

We use the Reporter to describe this equation (step 1 of the 2-step workflow):

In [None]:
from message_ix.report import Reporter

# Create a new Reporter object
rep = Reporter()

# Add two nodes
#
# These have no inputs and don't execute any code.
# They only return a literal value: integer 1 or 2
rep.add("A", 1)
rep.add("B", 2)


def compute_sum(*inputs):
    """A function that adds 1 or more inputs together."""
    return sum(inputs)


# Add one more node
rep.add("C", compute_sum, "A", "B")

Here is a detailed explanation, in words, of what we just did:
- We use the ``Reporter.add()`` method to build the graph.

  (Remember: you can type ``Reporter.add?`` or ``rep.add?`` in a new cell to use Jupyter's help features;
  or look at the documentation page linked above.)
- The first argument to ``add()`` is the key (name) of the node.
- The remaining arguments describe the computation to be performed:
  - For nodes "A" and "B", these are simply the raw or literal value to be produced by the node.
  - For node "C" there are 3 items: ``compute_sum, "A", "B"``.
    Let's break that down further:
    - The first item, ``compute_sum``, is a reference to a Python function that we have defined.
    - The second item, `"A"`, is a string.
      This matches the key we already gave to another node in the graph.
    - Likewise, the third item, `"B"`.

At this point, we have given the Reporter a series of instructions like "In order to compute 'C', first compute 'A' and 'B', then run the function ``compute_sum()`` on their values." But **we have not yet executed** any of these tasks.

Let's do that now.
We trigger the calculation of `"C"` (step 2 in the 2-step workflow),
which gives the expected value:

In [None]:
rep.get("C")

We can use ``Reporter.describe()`` to see a text representation
of the steps used in this calculation.
The task graph is printed out as a hierarchical list:

In [None]:
print(rep.describe("C"))

This description shows how the Reporter traverses the graph in order to calculate the quantity we asked for:

1. The desired value is from node "C".
2. "C" is computed by running a function, specifically `compute_sum()`.
3. The first argument to the function is "A".
4. "A" is the name of another node.
5. Node "A" gives a literal value `int(1)`, which the Reporter stores for later use.
6. The Reporter returns to "C" and moves on to the next argument: "B".
7. Steps (4) and (5) are repeated for "B", giving `int(2)`.
8. All of the arguments to "C" have been processed.
9. The function for "C" is called.

   Instead of the (string) keys "A" and "B", this function is given the computed `int` values from steps (5) and (7) respectively.
   In effect, the Reporter runs: ``compute_sum(1, 2)``.
10. The result is returned.
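
The traversal above can be imitated with a short recursive function.
This is only a sketch under simplifying assumptions: the plain `dict` used as a graph, the `trace` and `cache` variables, and the `get()` function are invented for this illustration and are not the actual Reporter implementation.

```python
def compute_sum(*inputs):
    """Add 1 or more inputs together."""
    return sum(inputs)


# A toy task graph: each key maps either to a literal value, or to a tuple of
# (function, key of 1st input, key of 2nd input, ...)
graph = {"A": 1, "B": 2, "C": (compute_sum, "A", "B")}

trace = []  # Order in which nodes are visited
cache = {}  # Results stored for later use


def get(key):
    """Return the value for `key`, first computing any inputs it needs."""
    if key in cache:
        return cache[key]
    trace.append(key)
    task = graph[key]
    if isinstance(task, tuple):
        func, *input_keys = task
        # Replace each input key with its computed value, then call the function
        result = func(*[get(k) for k in input_keys])
    else:
        result = task  # Literal value
    cache[key] = result
    return result


print(get("C"))
print(trace)
```

Because results are kept in `cache`, asking for `"C"` a second time in this sketch does not visit any node again.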

We can also use `rep.visualize()` to generate an image with a simplified version of `rep.describe()`.
(This requires installing the ``graphviz`` package, so is not included in this tutorial.
[See the documentation](https://genno.readthedocs.io/en/latest/api.html#genno.Computer.visualize).)

![Visualization of A + B = C](westeros_report-C.svg)

In [None]:
# rep.visualize(key="C", filename="westeros_report-C.svg", rankdir="LR")

In [None]:
# Store *rep* for the solutions at the bottom of the notebook
rep1 = rep


## From simple to complicated

In this first example nodes "A" and "B" are, at most, 1 step away from the node we requested.
The values each computes are used only once,
and they run very quickly.

In more realistic examples, the task graph can have:
- Long chains of calculations, each depending on the output of its ancestors;
- Multiple connections, so that for instance the result of task "A" is used in 2 or more places; and/or
- Tasks that perform operations on lots of data, load large files, access a network, or are otherwise slow.

However, the Reporter can still follow the same,
simple procedure to traverse the graph and calculate the results.

### Exercise 1

Add a node "X" to the graph that returns the literal value 42.

In [None]:
# Add your code here

After adding "X", what do you think will be the result when you run the following cell?
Why?

Write down your answer before trying the code.

(Answers and code blocks that solve all exercises are listed at the bottom of the tutorial—don't peek!)

In [None]:
print(rep.describe("C"))

### Exercise 2
Extend the Reporter to describe the following equation:

> $E = A + D \times \frac{A}{A + B} = A + D \times \frac{A}{C}; \qquad D = 12$

In [None]:
# Some helper functions you can use
def product(a, b):
    return a * b


def ratio(a, b):
    return a / b


# Add your code here:

## Concepts: Quantities, Keys, and data formats

In the last section, $A$, $B$, and so on were *scalar* variables with a single value.
In energy systems modeling, including with MESSAGE*ix*, we usually deal with scientific **quantities** that have the following properties:

1. They are **multi-dimensional**, with 1 or more *dimensions* and *labels* that identify *coordinates* along those dimensions such as specific time periods, technologies, etc.
2. They are **sparse**: there is not necessarily a value for every possible combination of labels; and
3. They have **units** of measurement associated, like "exajoule" or "kilometre",
   that we want to preserve and keep consistent as we do calculations.

Mathematically, we can say the following:

$$
\begin{align}
A_{ij} & = \left[a_{i,j} \right] \\
i & \in I = \left\{ i1, i2, i3, ... \right\} \\
j & \in J = \left\{ j1, j2, j3, ... \right\} \\
a_{i,j} & \in \left\{ \mathbb{R}, \mathrm{NaN} \right\} \\
a_{i,j} & \, [=]\, \mathrm{units\ of\ measurement\ for\ A}
\end{align}
$$
…where ‘NaN’ means “not a number,” i.e. a missing value.

### Kinds of data

In other MESSAGEix tutorials, we emphasized the distinction between…

- **parameters** or “input data” that are fixed/exogenous to the core optimization model; and
- **variables** that the optimizer can change in order to find the optimal model solution.
  These are “output data” from the core.

During the reporting phase of a modeling workflow,
the optimal solution *has already been computed*.
So this distinction is no longer crucial,
and we often want to mix together parameter and variable data in our calculations
—plus other “non-model data” from files and other places.
Thus we use the term “quantity” for everything.

### Dimensionality of quantities

Some of the quantities in the MESSAGE mathematical formulation have many dimensions.
For some calculations, we may want to “get rid of” some of these dimensions,
or handle them in specific ways.

For instance, $\mathrm{output}$ has ten dimensions: $\mathrm{output}_{chh^Dn^Dn^Llmty^Vy^A}$.
We might be interested in the total output in a given period ($y^A$),
but not concerned about different vintages of a technology ($y^V$).
In this case, we don't really want the 10-dimensional quantity
—instead we want its **partial sum** over all values of $y^V$.

**Notation.**
Consider a quantity with three dimensions, $A_{ijk}$, and another with two, $B_{kl}$, and a scalar $C$.
We define partial sums over every possible combination of dimensions:

$$
\begin{align}
A_{ij} & = \left[ a_{i,j} \right], \quad
  & a_{i,j} = \sum_{k}{a_{i,j,k}} \ \forall \ i, j
  & \quad\mathrm{similarly } A_{ik}, A_{jk} \\
A_{i} & = \left[ a_i \right], \quad
  & a_i = \sum_j\sum_{k}{a_{i,j,k}} \ \forall\  i
  & \quad\mathrm{similarly } A_j, A_k \\
A & = \sum_i\sum_j\sum_k{a_{i,j,k}}
  & & \mathrm{(a\ scalar)}
\end{align}
$$
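
To make the notation concrete, here is a minimal sketch of partial sums using a plain `dict` as a sparse quantity.
The labels, the values, and the `partial_sum()` helper are invented for this illustration; the reporting code uses its own Quantity class rather than dictionaries.

```python
from collections import defaultdict

# Sparse 3-dimensional quantity A_ijk: only some (i, j, k) combinations
# have a value. For example, there is no entry for ("i2", "j2", ...).
dims = ("i", "j", "k")
a_ijk = {
    ("i1", "j1", "k1"): 1.0,
    ("i1", "j1", "k2"): 2.0,
    ("i1", "j2", "k1"): 3.0,
    ("i2", "j1", "k1"): 4.0,
}


def partial_sum(data, dims, drop):
    """Sum `data` over the dimensions named in `drop`."""
    keep = [n for n, d in enumerate(dims) if d not in drop]
    result = defaultdict(float)
    for labels, value in data.items():
        result[tuple(labels[n] for n in keep)] += value
    return dict(result)


a_ij = partial_sum(a_ijk, dims, drop={"k"})  # A_ij
a_i = partial_sum(a_ijk, dims, drop={"j", "k"})  # A_i
a = partial_sum(a_ijk, dims, drop=set(dims))  # A, a scalar
print(a_ij, a_i, a, sep="\n")
```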

Note that $A$ and $B$ share one dimension, $k$, but the other dimensions are distinct.
We specify that simple arithmetic operations result in a quantity whose dimensions are the union of the dimensions of the operands. In other words:

$$
\begin{align}
C + A_{i} = X_{i} = \left[ x_{i} \right],
  & \quad x_{i} = C + a_{i} \ \forall \ i \\
A_{jk} \times B_{kl} = Y_{jkl} = \left[ y_{j,k,l} \right],
  & \quad y_{j,k,l} = a_{j,k} \times b_{k,l} \ \forall \ j, k, l \\
A_{j} - B_{j} = Z_{j} = \left[ z_{j} \right],
  & \quad z_{j} = a_{j} - b_{j} \ \forall \ j \\
\end{align}
$$

As a result of this rule:
- The difference $Z_j$ has the same dimensionality as *both* of its operands.
- The sum $X_i$ has the same dimensionality as *one* of its operands.
- The product $Y_{jkl}$ has a *different* dimensionality from each of its operands.

These operations are called **broadcasting** and **alignment**:
- The scalar value $C$ is *broadcast* across all labels on the dimension $i$ that it lacks, in order to calculate $x_i$.
- $A_{jk}$ and $B_{kl}$ are *aligned* on matching values of $k$, and each is *broadcast* over the dimension it lacks: $l$ and $j$, respectively.
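
Both operations can be sketched with plain dictionaries.
All labels and values here are invented for illustration, and the comprehensions only mimic what a labeled-array library would do automatically.

```python
# Invented labels and values, for illustration only
c = 10.0  # Scalar C
a_i = {"i1": 1.0, "i2": 2.0}  # A_i
a_jk = {("j1", "k1"): 2.0, ("j2", "k1"): 3.0, ("j1", "k2"): 5.0}  # A_jk
b_kl = {("k1", "l1"): 10.0, ("k1", "l2"): 100.0}  # B_kl

# X_i = C + A_i: the scalar is broadcast across every label of i
x_i = {i: c + value for i, value in a_i.items()}

# Y_jkl = A_jk * B_kl: align on matching labels of k; A is broadcast across
# l, which it lacks, and B is broadcast across j, which it lacks
y_jkl = {
    (j, k_a, l): a_value * b_value
    for (j, k_a), a_value in a_jk.items()
    for (k_b, l), b_value in b_kl.items()
    if k_a == k_b
}
print(x_i)
print(y_jkl)
```

In this sketch, the label `k2` appears in `a_jk` but not in `b_kl`, so the product has no entries for it.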

### Keys

In the first code example, `"C"` was the node label or key that we used to refer to the output of a certain calculation, even before it had been computed.
Likewise, the Python string `"A"` is a key.
When computed, node `"A"` returns a Python `int(1)`—an object representing its actual *value*.

In step 1 of the reporting workflow, tasks are described using *only* keys.
No *values* are created until step 2—and *only* the values needed to provide the result of `Reporter.get()`.

For multi-dimensional calculations, we need keys that distinguish $A_i$—the partial sum of $A_{ijk}$ used in the calculation of $X_i$—from $A_{jk}$—a *different* partial sum used in the calculation of $Y_{jkl}$.
It is **not** sufficient to refer to both as `"A"`, since this is ambiguous about what calculation we want to perform.

For this purpose we use the `Key` class provided by message_ix/ixmp/genno.

A Key has a name, zero or more dimensions, and an optional tag:

In [None]:
from message_ix.report import Key

# from ixmp.report import Key  # Same class
# from genno import Key        # Same class

# Quantity named "A" with 3 dimensions
k1 = Key("A", ("i", "j", "k"))

k1

The Key class allows many kinds of manipulations to get related keys.
For example, we can drop one dimension from the key for $A_{ijk}$ to get the key referring to its partial sum $A_{ik}$:

In [None]:
# "/" operator = drop a dimension
k1 / "j"

In [None]:
k2 = Key("A:i-k")  # Construct a Key from a string
k2 == k1 / "j"  #    Compare Keys with one another
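
To see why such manipulations need no data, consider the toy class below.
`ToyKey` is a hypothetical stand-in written for this tutorial text, not the real `Key` class; it assumes the string form `name:dim1-dim2:tag` and supports only the two operations shown above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ToyKey:
    """Toy stand-in for a key: name, dimensions, optional tag."""

    name: str
    dims: Tuple[str, ...] = ()
    tag: Optional[str] = None

    @classmethod
    def from_str(cls, value: str) -> "ToyKey":
        """Parse a string like "A:i-k" or "A:i-k:tag"."""
        name, _, rest = value.partition(":")
        dims, _, tag = rest.partition(":")
        return cls(name, tuple(dims.split("-")) if dims else (), tag or None)

    def __truediv__(self, dim: str) -> "ToyKey":
        """Return the key for the partial sum over `dim`."""
        return ToyKey(self.name, tuple(d for d in self.dims if d != dim), self.tag)

    def __str__(self) -> str:
        parts = [self.name, "-".join(self.dims)]
        if self.tag:
            parts.append(self.tag)
        return ":".join(parts)


k1_toy = ToyKey("A", ("i", "j", "k"))
print(str(k1_toy / "j"))
print(k1_toy / "j" == ToyKey.from_str("A:i-k"))
```

The toy class only manipulates names: dropping a dimension produces a new label, and no values are summed at this point.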

### Quantity values

To represent the **values** of quantities from a model or produced by reporting calculations, message_ix uses the `genno.Quantity` class.
Quantity is derived from [`xarray.DataArray`](http://xarray.pydata.org/en/stable/data-structures.html#dataarray)—a labeled, multi-dimensional array, with attributes.

The combination of Key and Quantity lets the Reporter (and you!) handle multi-dimensional data, while automatically handling alignment and broadcasting. 

## Automated reporting

A `message_ix.Reporter` for a specific `Scenario` is created using the `.from_scenario()` method.
This method automatically adds many nodes to the graph based on (a) the contents of the Scenario and (b) the well-defined mathematical formulation of MESSAGE.

### Demonstration

In [None]:
from ixmp import Platform

from message_ix.testing import make_westeros

mp = Platform()
scen = make_westeros(mp, emissions=True, solve=True)

In [None]:
from ixmp.report import configure

# Create a reporter from the existing Scenario
rep = Reporter.from_scenario(scen)

# Reporter uses the Python package 'pint' to handle units. "-", used in the Westeros
# tutorial, is not a defined SI unit. We tell the Reporter to replace it with ""
# (dimensionless) everywhere it appears.
configure(units={"replace": {"-": ""}})

What is in this `rep` object that we've just created?

In [None]:
len(rep.graph)

Over 16,000 nodes!

Remember: `rep` simply *describes* these operations.
None of them is executed until or unless you `get()` them.

Let's look at some of the automatically populated content of the graph:

In [None]:
# Return the full-dimensionality Key for the MESSAGE parameter "output"
output = rep.full_key("output")
output

In [None]:
# Return the full-dimensionality Key for the MESSAGE variable "ACT"
ACT = rep.full_key("ACT")
ACT

What would happen if we were to `get()` this key?

In [None]:
print(rep.describe(ACT))

We can see:
- The Reporter will call a function named `data_for_quantity()`.

  This and all other built-in operators are [described in the MESSAGEix documentation](https://docs.messageix.org/en/stable/reporting.html#ixmp.reporting.utils.data_for_quantity).
- The function gets some direct arguments: `"var", "ACT", "lvl"`.

  From the documentation, we can see this indicates the level (rather than marginal) of an ixmp `"var"`iable (rather than parameter) named `"ACT"`.
- The next argument is ‘scenario’, another node in the graph.
- This node gives a reference to the same Scenario object we passed to `Reporter.from_scenario()`.

In short, if we `.get()` this key `ACT`,
the Reporter will extract a 6-dimensional quantity from the Scenario object and return it.
The other ~16,000 tasks will not be executed.

Let's try:

In [None]:
rep.get(ACT)

### More automated contents

As mentioned, because `Reporter.from_scenario()` knows that `scen` follows the MESSAGE mathematical formulation, it can automatically fill the graph with tasks to derive quantities that we may want to work with.

For example: the activity ($\mathrm{ACT}$) for various technologies ($t$) has no units;
units are not understood by GAMS or by ixmp.
The amounts of specific commodities ($c$) produced by each $t$, **with** units, are given by the product:

$$\mathrm{ACT} \times \mathrm{output}$$

The Reporter automatically prepares this calculation, with the name "out" (the documentation contains [the names for all automatic quantities](https://docs.messageix.org/en/latest/reporting.html#message_ix.reporting.Reporter.from_scenario)):

In [None]:
out = rep.full_key("out")
out

In [None]:
# Show what would be done
print(rep.describe(out))

In [None]:
rep.get(out)

### Automatic partial sums

In this example, some dimensions like mode ($m$), time ($h$) and time_dest ($h^D$) don't contain useful information: they have the same label for every value.
We also have a single-region model, so we don't need node_loc ($n^L$) or node_dest ($n^D$) either.
We can instead ask for a partial sum.

**Exercise:** review the notation above and satisfy yourself that for $A_{ijk}$, where $i \in I$ and $\lvert I \rvert = 1$ (that is, there is only one label along the dimension $i$), $a_{j,k} = a_{i,j,k} \,\forall\, j, k$. That is, a partial sum over dimension $i$ is the same as ‘dropping’ the dimension $i$.

`Key.drop()` lets us derive the key for this partial sum from the one we already have.
This doesn't perform any calculation; it simply returns a new Key with fewer dimensions:

In [None]:
out2 = out.drop("h", "hd", "m", "nd", "nl")

# out2 = out / ("h", "hd", "m", "nd", "nl")  # Same result
# out2 = out / Key("_:h-hd-m-nd-nl")         # Same result

out2

This partial sum is *also* already described in the Reporter:

In [None]:
print(rep.describe(out2))

In [None]:
rep.get(out2)

### File output

As noted above, the labeled, multi-dimensional Quantity is used so that values passing between reporting calculations are in a consistent, easy-to-manipulate format.

For research purposes, we often want to transform data into other, particular formats or write it to file, in order to feed it into other tools such as existing analysis or plotting codes, both our own and our collaborators'. Reporter provides multiple ways to do this.

For instance, we can `get()` a Quantity and write it directly to a file in a single step:

In [None]:
rep.write(out2, "output.csv")

The file appears in the same directory where we started the Jupyter notebook.

**Exercise:** Try using an .xlsx file name in the above.

We can also define a conversion to a different data format.
This is described in the next section.

## Describing additional computations

The previous section showed how to find and retrieve the results of computations for tasks automatically added by `Reporter.from_scenario()`.
Reporter also provides many helper methods to describe **additional** computations in step 1 of the workflow.

After using these methods, we can continue to describe further calculations using them as input (step 1); or we can `get()` them (step 2).

### Converting to IAMC data structure

Here, we'll use the built-in operator [`as_pyam`](https://genno.readthedocs.io/en/latest/compat-pyam.html#genno.compat.pyam.operator.as_pyam).

This converts data from a Quantity object to the `pyam.IamDataFrame` class from the [`pyam`](https://pyam-iamc.readthedocs.io) package.
`pyam` is built around the [data file format](https://data.ene.iiasa.ac.at/database/) used by the [Integrated Assessment Modeling Consortium](http://www.globalchange.umd.edu/iamc/) (IAMC), and offers plotting and further calculation features.

In [None]:
def format_variable(df):
    """Callback function to construct a label for the IAMC "variable" column.

    The IAMC format does not support "level", "technology", or "commodity" dimensions,
    so we prepare a label like "final energy|technology name|commodity name".
    """
    df["variable"] = df["l"] + " energy|" + df["t"] + "|" + df["c"]
    return df.drop(["c", "l", "t"], axis=1)


# Add a node that converts data to a pyam.IamDataFrame object
# - Give the quantity to convert: out, with partial sum across some dimensions.
# - Use the MESSAGE "node_loc" dimension for the IAMC "region" dimension.
# - Use the MESSAGE "year_act" dimension for the IAMC "year" dimension.
# - Use format_variable() to collapse "l"/"t"/"c" into the IAMC "variable" dimension.
new_key = rep.add(
    "as_pyam",
    out / ("h", "hd", "m", "nd", "yv"),
    rename=dict(nl="region", ya="year"),
    collapse=format_variable,
)

new_key

Note that nothing was computed: we're still in step 1 of the reporting workflow!

However, the method *did* return a new key for the node added to the graph.
This key has the **tag** `'iamc'` added at the end.

We describe the added computation, then execute it to get a `pyam.IamDataFrame`:

In [None]:
print(rep.describe(new_key))

In [None]:
iamc_df = rep.get(new_key)
iamc_df

(Note that, unlike a pandas.DataFrame, the contents of a pyam.IamDataFrame are not displayed by default.)

After we have retrieved the `pyam` object, we can use its built-in methods to filter and plot the data:

In [None]:
%matplotlib inline

(
    iamc_df.filter(
        model="Westeros Electrified", scenario="baseline", region="Westeros"
    ).plot()
)

### Custom computations

Thus far we've described reporting calculations using simple, atomic operators,
including those automatically added by `Reporter.from_scenario()`.

However—just as in the first, introductory example—operators are merely Python functions.
This means they can be **any function**—no matter how complex!
Thus, it is easy to connect any existing analysis codes into the graph.

To demonstrate this, we add several nodes, each using a custom function.
- `as_tidy_data()` operates on the internal Quantity value to coerce it into a `pandas.DataFrame` in a specific format.
- `my_plot()` uses a different Python plotting package named [`plotnine`](https://plotnine.readthedocs.io) that implements a “grammar of graphics,” similar to R's `ggplot` package. It returns a plot object without drawing it.
- `save_plot()` saves the plot to file.
- `show_plot()` shows the plot directly in this notebook.

Finally, we define a node `"do both"` whose task is simply a Python list of other keys.
This form of task means “compute each of the nodes, and return their outputs in a list.”

In [None]:
import pandas as pd
import plotnine as p9

from message_ix.report import Quantity


def as_tidy_data(qty: Quantity) -> pd.DataFrame:
    """Convert `qty` to a tidy data frame, as expected by plotnine."""
    return qty.to_series().rename("value").reset_index()


def my_plot(data: pd.DataFrame):
    """Computation that returns a plotnine plot object."""
    # 'Aes'thetic mappings between column names and parts of the plot
    aes = p9.aes(x="ya", y="value", color="t + ' ' + c", shape="l")

    # Set up the plot but don't draw it
    plot = (
        p9.ggplot(data, aes)  # Create the plot
        + p9.geom_line()  # Add a line
        + p9.geom_point()  # Add points
        + p9.labs(  # Label axes & legend
            x="Year",
            y="Energy output",
            color="Tech & commodity",
            shape="Level",
        )
    )

    print("my_plot() runs only once")
    return plot


def save_plot(obj) -> str:
    obj.save("westeros_report.pdf", verbose=False)
    return "Saved to westeros_report.pdf"


def show_plot(obj) -> str:
    obj.show()
    return "Drawn in notebook"


# Add nodes to the graph
rep.add("tidy", as_tidy_data, out2 / "yv")
rep.add("plot", my_plot, "tidy")
rep.add("save", save_plot, "plot")
rep.add("show", show_plot, "plot")
rep.add("do both", ["save", "show"])

In [None]:
print(rep.describe("do both"))

# With Graphviz installed
# rep.visualize(key="do both", filename="westeros_report-do-both.svg")

![Visualization of "do both"](westeros_report-do-both.svg)

Note that Reporter will avoid calling `my_plot()` repeatedly.
Instead, it will be called just once, and the resulting object is stored.
When the `"save"` and `"show"` nodes are executed,
the *same* object is passed to each of `save_plot()` and `show_plot()` in turn.

In [None]:
rep.get("do both")

In a real-world reporting workflow,
a key like `"do both"` could refer to **many** plots.
The Reporter would compute all the data necessary for these plots, generate them, and save them, all on a single `Reporter.get()` call.

## Wrapping up

`message_ix`, ixmp, and genno offer many other reporting features not covered by this tutorial.

See the [`message-ix` reporting documentation](https://docs.messageix.org/en/stable/reporting.html) to learn how to:
- Add exogenous (non-model) data to be used in other calculations, with `Reporter.add_file()`.
- Use a function to add many nodes at once, with `Reporter.apply()`.

You can also read documentation for other packages:
- [`genno`](https://genno.readthedocs.io/en/stable/) that provides the core features underlying `message_ix.Reporter`.
- [`message-ix-models`](https://docs.messageix.org/projects/models/en/stable/api/report/index.html) that:
  - Builds on `message_ix.report` to provide additional features tailored to the MESSAGEix-GLOBIOM model family and IIASA ECE Program research workflows.
  - Applies these for model variants like MESSAGEix-Transport.
  - Uses `genno.Computer` also for preparation of model *input* data.

We would greatly appreciate:
- Reports of your experience using the reporting features in your work, and
- Issues and pull requests to extend the feature set.

## Solutions to exercises

### Exercise 1

The result does not change, because "X" is not needed to calculate "C".

### Exercise 2

One solution involves adding some intermediate nodes—call them "foo1" and "foo2":

In [None]:
# Restore the saved value rep1
rep = rep1

rep.add("D", 12)
rep.add("foo1", (ratio, "A", "C"))
rep.add("foo2", (product, "D", "foo1"))
rep.add("E", (compute_sum, "A", "foo2"))
print(rep.describe("E"))

rep.get("E")

Another solution is to define a new anonymous function that computes "E" in a single step:

In [None]:
rep.add("D", 12)
rep.add("E", (lambda a, c, d: a + d * (a / c), "A", "C", "D"))
print(rep.describe("E"))

rep.get("E")
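
As a check on either solution, the expected value can be worked out by hand.
With $A = 1$ and $B = 2$ from the first example, $C = A + B = 3$, so:

$$
\begin{align}
E & = A + D \times \frac{A}{C} \\
  & = 1 + 12 \times \frac{1}{3} \\
  & = 5
\end{align}
$$

Both code solutions should return a value equal to this.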

In [None]:
mp.close_db()