# Westeros tutorial, part VII: Reporting

‘Reporting’ is the term used in the MESSAGEix ecosystem to refer to any calculations performed _after_ the MESSAGE mathematical optimization problem has been solved (using ``Scenario.solve()``).

This tutorial introduces the reporting features provided by the ``ixmp`` and ``message_ix`` packages.
It was developed by Paul Natsuo Kishimoto ([@khaeru](https://github.com/khaeru)) for a MESSAGE training workshop held at IIASA in October 2019.
<!-- Add a line here if you revise the tutorial! -->

## Introduction

### Pre-requisites

- Complete tutorial Part I (``westeros_baseline.ipynb``)
  - Understand the following MESSAGEix terms: ‘variable’, ‘parameter’
- Read the [‘Reporting’ page in the MESSAGEix documentation](https://message.iiasa.ac.at/en/stable/reporting.html).
  Some text in this tutorial is drawn from that page.
  - Understand the following terms: ‘quantity’.

### What does ‘reporting’ include?

Individual modelers often draw different lines between the internals of an optimization model, on one hand, and reporting, ‘post-processing’, ‘analysis’, and other tasks, on the other.
Doing valid research using models like MESSAGE requires that we understand these differences clearly, as well as how we choose to communicate them.

For example, we might say: “The MESSAGE model shows that total secondary energy (electricity) output in Westeros in the year 720 is 9 GWa.”

But, if we are using the model from ``westeros_baseline.ipynb``:
1. The raw data from the ``Scenario`` after the ``.solve()`` command **only** tells us the ``ACT`` variable has certain values.
2. To get the 9 GWa figure, we must:
   1. Compute the product of activity (``ACT``, which is dimensionless) and output efficiency (``output`` in GWa/year), then
   2. Sum across the ``technology`` dimension, and finally
   3. Select the value for the ``year`` 720.
   
In the above example, the steps under 2 are reporting.
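
To make these three steps concrete, here is a minimal sketch in plain Python. The activity and efficiency numbers and the technology names (``coal_ppl``, ``wind_ppl``) are invented for illustration and are not the Westeros results; the point is only the shape of the calculation.

```python
# Hypothetical ACT values keyed by (technology, year); NOT actual Westeros results
act = {
    ("coal_ppl", 710): 3.0,
    ("wind_ppl", 710): 1.0,
    ("coal_ppl", 720): 4.0,
    ("wind_ppl", 720): 2.0,
}

# Hypothetical output efficiency keyed by technology
output = {"coal_ppl": 1.0, "wind_ppl": 1.0}

# Step 1: product of activity and output efficiency
out = {(tec, year): value * output[tec] for (tec, year), value in act.items()}

# Step 2: sum across the technology dimension
by_year = {}
for (tec, year), value in out.items():
    by_year[year] = by_year.get(year, 0.0) + value

# Step 3: select the value for the year 720
result = by_year[720]
print(result)  # 6.0 with the made-up numbers above
```

Every line after the definition of ``act`` and ``output`` is ‘reporting’ in the sense used here: the solved model supplies only the raw values.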

Next, we may want to create a plot of electricity output by year.
Some modelers consider this part of ‘reporting’; for others, ‘reporting’ is complete when the values needed for the plot are written to a file, which they can then use with their favourite plotting tool.

### Reporting features in MESSAGEix

The reporting features in ``ixmp`` and ``message_ix`` were developed to support the complicated reporting and multiple workflows required by the IIASA Energy program for scenario analysis projects involving large, detailed models such as the [MESSAGE-GLOBIOM global model](https://message.iiasa.ac.at/global/).
While powerful enough for this purpose, they are also intended to be user-friendly, flexible, and customizable.

## The Graph

``ixmp.Reporter`` is based around a *graph* of *nodes* and *edges*; specifically, a *directed, acyclic graph*.
This means:
- Every edge has a direction: *from* one node *to* another.
- There are no recursive loops in the graph; i.e. no node is its own ancestor.
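
These two properties can be illustrated without ``ixmp`` at all. The sketch below uses the standard-library ``graphlib`` module (Python 3.9 or later) and hypothetical node names ``x``, ``y``, and ``z``; it is not how ``Reporter`` stores its graph, only a way to see what ‘directed’ and ‘acyclic’ mean in practice.

```python
from graphlib import CycleError, TopologicalSorter

# Map each node to the set of nodes it depends on (its incoming edges).
# Hypothetical graph: 'z' depends on 'x' and 'y'.
graph = {"x": set(), "y": set(), "z": {"x", "y"}}

# Because the graph is acyclic, its nodes can be put in an order in which
# every node appears after all of its ancestors.
order = list(TopologicalSorter(graph).static_order())
print(order)  # 'z' always comes last

# Making 'x' depend on 'z' creates a loop: 'x' becomes its own ancestor.
cyclic = {"x": {"z"}, "y": set(), "z": {"x", "y"}}
try:
    list(TopologicalSorter(cyclic).static_order())
    has_cycle = False
except CycleError:
    has_cycle = True
print(has_cycle)
```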

In the reporting graph, every node is a *calculation*—or, more generally, a *computation*.
We label the node with the name of the quantity it produces and the operation, calculation, or computation it performs.

Every computation node depends on certain inputs; these are represented by the *edges* of the graph.

For example, the following equation:

> $C = A + B$

is represented by:
- A node 'A' that provides the raw value of A.
- A node 'B' that provides the raw value of B.
- A node 'C' that computes a sum of its inputs.
- An edge from 'A' to 'C'
- An edge from 'B' to 'C'

Let's do this in code:

In [None]:
import ixmp

# Create a new Reporter object
rep = ixmp.Reporter()

# Add two nodes to the graph
rep.add('A', 1)
rep.add('B', 2)

# Add one node and two edges to the graph
rep.add('C', (lambda *inputs: sum(inputs), 'A', 'B'))

We use the ``add()`` method of a Reporter object to build the graph.
(Remember, you can type ``rep.add?`` in a new cell to use Jupyter's help features; or look at the documentation page linked above.)

Let's break this down:
- The first argument is the name of the node, which we call a **key**.
- The second argument describes the computation to be performed:
  - For A and B, these are simply the raw or literal value to be produced by the node.
  - For C, it is ``(lambda *inputs: sum(inputs), 'A', 'B')``; let's break that down further.
    - The first argument (``lambda *inputs: sum(inputs)``) is an anonymous function ([read more](https://docs.python.org/3/tutorial/controlflow.html#lambda-expressions)) that computes the sum of its inputs.
    - The remaining arguments (``'A', 'B'``) are keys that reference other nodes in the graph.
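
To build intuition for how a tuple of the form ``(function, key, key, ...)`` becomes a value, here is a simplified sketch of an evaluator written in plain Python. The ``evaluate`` function and the ``graph`` dictionary are hypothetical names introduced only for this illustration; the real ``Reporter`` is more capable (for example, this sketch assumes every argument after the function is a key, and it does not cache results).

```python
def evaluate(graph, key):
    """Return the value of the node `key` in `graph` (illustrative only)."""
    task = graph[key]
    if isinstance(task, tuple) and task and callable(task[0]):
        # A computation: the first element is a function, the rest are keys
        func, *arg_keys = task
        # Resolve each key to a value first, then call the function
        return func(*(evaluate(graph, k) for k in arg_keys))
    # Anything else is a raw or literal value
    return task


# The same three nodes as above, stored in an ordinary dictionary
graph = {
    "A": 1,
    "B": 2,
    "C": (lambda *inputs: sum(inputs), "A", "B"),
}

value = evaluate(graph, "C")
print(value)  # 3
```

Note that nothing is computed when the graph is defined; work happens only when a key is requested.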

Next, let's trigger the calculation of 'C', which gives the expected value:

In [None]:
rep.get('C')

We can use ``Reporter.describe()`` to see the steps used in this calculation.
The graph is printed out as a hierarchical list:

In [None]:
print(rep.describe('C'))

### Exercise 1
Extend the graph to compute the following equation.

> $E = A + D \times \frac{A}{A + B} = A + D \times \frac{A}{C}, \quad D = 12$

(Code blocks that solve all exercises are listed at the bottom of the tutorial—don't peek!)

In [None]:
# Some helper functions you can use
def sum_calc(*inputs):
    return sum(inputs)

def product(a, b):
    return a * b

def ratio(a, b):
    return a / b

# Replace 'C' with a reference to sum_calc (instead of an anonymous function)
rep.add('C', (sum_calc, 'A', 'B'))

# Add your code here:

## Big Graphs






## Solutions to exercises

### Exercise 1

One solution involves using intermediate nodes:

In [None]:
rep.add('D', 12)
rep.add('foo1', (ratio, 'A', 'C'))
rep.add('foo2', (product, 'D', 'foo1'))
rep.add('E', (sum_calc, 'A', 'foo2'))
rep.get('E')

Another solution is to define a new or anonymous function:

In [None]:
rep.add('D', 12)
rep.add('E', (lambda a, c, d: a + d * (a / c), 'A', 'C', 'D'))
rep.get('E')
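
Either solution can be checked against a direct calculation. The following sketch uses only the values given in this tutorial (A = 1, B = 2, D = 12) and plain Python arithmetic, without the Reporter:

```python
import math

# Values of the raw nodes defined earlier in the tutorial
A, B, D = 1, 2, 12

# C is the sum of A and B
C = A + B

# E = A + D * (A / C)
E = A + D * (A / C)
print(E)

# Floating-point division is involved, so compare with a tolerance
assert math.isclose(E, 5.0)
```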