# OpenMod4Africa Workshop Madrid 2024: the pyam package

[![License:
MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![python](https://img.shields.io/badge/python-≥3.10,<3.13-blue?logo=python&logoColor=white)](https://github.com/IAMconsortium/pyam)
[![Code style:
black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

Copyright 2022 (c) Daniel Huppmann; this repository is released under the [MIT
License](LICENSE).

This repository is based on the work done by Daniel Huppmann for *ENGAGE Capacity
Building Workshop: the pyam package*
<https://github.com/danielhuppmann/ENGAGE-pyam-tutorial/>.

## Overview

This repository holds a [Jupyter notebook](tutorial-notebook.ipynb) for a live-demo of
the **pyam** package given as part of the OpenMod4Africa workshop on June 18, 2024.

The Jupyter notebook is based on the *ENGAGE Capacity Building Workshop: the pyam
package* (<https://github.com/danielhuppmann/ENGAGE-pyam-tutorial/>) which itself is
based on the advanced assignment of the [Modelling
Lab](https://github.com/danielhuppmann/climate-risks-academy-2021), which was part of
the *Climate Risks Academy 2021* organized by the European University Institute (EUI)
Florence School of Banking and Finance in cooperation with Oliver Wyman.

The scenario data used in this tutorial notebook is taken from the [OpenMod4Africa
Internal Scenario Explorer](https://data.ece.iiasa.ac.at/openmod4africa-internal).

### Requirements

You can install the **pyam** package using the following command -
note the subtle naming difference on [pypi.org](https://pypi.org/project/pyam-iamc/).

```console
pip install pyam-iamc
```

[Read the docs](https://pyam-iamc.readthedocs.io/en/stable/install.html) for alternative installation options.

In [None]:
import pyam

## Import and inspect the scenario data

In this example, we use an *Excel* file, but pyam also supports *csv* and *frictionless data*.
Details can be found in the pyam docs [here](https://pyam-iamc.readthedocs.io/en/stable/api/io.html#input-output-file-formats).

Just calling an **IamDataFrame** prints an overview of all index dimensions and coordinates.

In [None]:
tutorial_df = pyam.IamDataFrame("data/ngfs_data_snapshot.xlsx")
tutorial_df

Because there are more scenarios and variables than can be displayed in one line, the summary only shows a few items.

We can easily display all items of an index dimension or a coordinate individually using attributes of the **IamDataFrame**.

In [None]:
tutorial_df.variable

In [None]:
tutorial_df.scenario

In [None]:
tutorial_df.region
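
To make the idea of index dimensions more concrete, the following sketch mimics in plain Python what attributes such as `variable` or `scenario` return. The rows, their values and the helper `unique_values` are hypothetical and only illustrate the long data format; this is not how **pyam** is implemented.

```python
# Hypothetical rows in a long, IAMC-style format (all values are made up)
rows = [
    {"model": "model_a", "scenario": "scen_1", "region": "World",
     "variable": "Primary Energy|Coal", "unit": "EJ/yr", "year": 2020, "value": 150.0},
    {"model": "model_a", "scenario": "scen_1", "region": "World",
     "variable": "Primary Energy|Gas", "unit": "EJ/yr", "year": 2020, "value": 130.0},
    {"model": "model_a", "scenario": "scen_2", "region": "World",
     "variable": "Primary Energy|Coal", "unit": "EJ/yr", "year": 2020, "value": 150.0},
]


def unique_values(rows, dimension):
    """Return the sorted unique entries of one dimension across all rows."""
    return sorted({row[dimension] for row in rows})


print(unique_values(rows, "variable"))
print(unique_values(rows, "scenario"))
```

Each attribute of the **IamDataFrame** plays the role of one such call: it lists every distinct entry along that dimension.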

For the remainder of this notebook, we only use the global data from this scenario ensemble.  
Therefore, we [filter()](https://pyam-iamc.readthedocs.io/en/stable/api/iamdataframe.html#pyam.IamDataFrame.filter)
to the data of interest...

In [None]:
df = tutorial_df.filter(region="World")
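
As a rough illustration of what filtering does, the sketch below keeps only the rows whose dimensions match the given patterns, treating `*` as a wildcard. The helpers `matches` and `filter_rows` and the sample rows are assumptions made for this example; the actual **pyam** method supports more options than shown here.

```python
import re


def matches(value, pattern):
    """Check a string against a pattern in which `*` stands for any characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, value) is not None


def filter_rows(rows, **criteria):
    """Keep only the rows for which every given dimension matches its pattern."""
    return [
        row
        for row in rows
        if all(matches(row[dim], pattern) for dim, pattern in criteria.items())
    ]


# Hypothetical rows, reduced to the two dimensions used in this sketch
rows = [
    {"region": "World", "variable": "Primary Energy|Coal"},
    {"region": "World", "variable": "Temperature"},
    {"region": "Africa", "variable": "Primary Energy|Gas"},
]

world = filter_rows(rows, region="World")
energy = filter_rows(rows, variable="Primary Energy*")
print(len(world), len(energy))
```

Several criteria passed at once are combined with a logical "and" in this sketch, so each additional keyword narrows the selection further.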

## A few simple plots

As a first step to get an idea of the scenario data, let's draw some [plots](https://pyam-iamc.readthedocs.io/en/stable/api/plotting.html).

Why not start with the temperature?

In [None]:
df.filter(variable="Temperature").plot(legend=dict(loc="outside right"))

Let's apply some styling by model and scenario...

In [None]:
df.filter(variable="Temperature").plot(color="scenario", linestyle="model", legend=dict(loc="outside right"))

## Unit conversion

Working with different units is a constant headache (and source of errors) when handling energy systems data.

To simplify such tasks, **pyam** incorporates the [**iam-units**](https://github.com/iamconsortium/units) package,
a community resource for units commonly used in energy-systems modelling, integrated assessment and climate research.

In [None]:
df_coal = df.filter(model="MESSAGEix-GLOBIOM 1.1", scenario="Current Policies", variable="Primary Energy|Coal")
df_coal.timeseries()

In [None]:
df_coal.convert_unit("EJ/yr", "PWh/yr").timeseries()
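
The arithmetic behind this particular conversion can be checked by hand. The sketch below is not part of **pyam** or **iam-units**; the function `ej_to_pwh` is a hypothetical helper that only spells out the conversion factor between exajoules and petawatt-hours.

```python
JOULE_PER_EJ = 1e18  # 1 exajoule = 10^18 J
JOULE_PER_PWH = 1e15 * 3600  # 1 petawatt-hour = 10^15 W * 3600 s = 3.6 * 10^18 J


def ej_to_pwh(value_ej):
    """Convert an energy value from EJ to PWh via joules."""
    return value_ej * JOULE_PER_EJ / JOULE_PER_PWH


# An illustrative value of 100 EJ/yr corresponds to roughly 27.8 PWh/yr
print(round(ej_to_pwh(100.0), 3))
```

In other words, values in `PWh/yr` are smaller than the same values in `EJ/yr` by a factor of 3.6.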

The **iam-units** package also includes a module to convert different greenhouse-gas emissions
by alternative global-warming-potential (GWP) metrics.

See [this tutorial](https://pyam-iamc.readthedocs.io/en/stable/tutorials/unit_conversion.html#4.-Use-contexts-to-specify-conversion-metrics) for more information!

## Computing aggregates

If you look at the list of variables in the scenario data, you'll see that we initially only have sub-categories of *Primary Energy*.

However, **pyam** offers a number of useful functions to aggregate (or downscale) by sectors or regions.

### Aggregation by sector

By default, the [aggregate()](https://pyam-iamc.readthedocs.io/en/stable/api/iamdataframe.html#pyam.IamDataFrame.aggregate) method
takes all components of the given variable, in this case `Primary Energy|*`.<br />
It returns a new **IamDataFrame** - and the cell displays the summary. You will see that the object has exactly one variable now.

In [None]:
df.aggregate("Primary Energy")

In [None]:
df.aggregate("Primary Energy").plot(legend=dict(loc="outside right"))

Or you can use the [timeseries()](https://pyam-iamc.readthedocs.io/en/stable/api/iamdataframe.html#pyam.IamDataFrame.timeseries) method to show the timeseries data in wide format.

In [None]:
df.aggregate("Primary Energy").timeseries()
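
Conceptually, aggregating by sector sums up the sub-categories of a variable for each year. The plain-Python sketch below illustrates this under the assumption that only components exactly one level below the aggregate are counted; the function `aggregate` and all values are made up for this example and do not reproduce the **pyam** implementation.

```python
from collections import defaultdict


def aggregate(data, variable):
    """Sum all direct sub-categories `variable|<name>` for each year."""
    prefix = variable + "|"
    totals = defaultdict(float)
    for (name, year), value in data.items():
        # Only take components exactly one level below the aggregate
        if name.startswith(prefix) and "|" not in name[len(prefix):]:
            totals[year] += value
    return dict(totals)


# Hypothetical values in EJ/yr, keyed by (variable, year)
data = {
    ("Primary Energy|Coal", 2020): 150.0,
    ("Primary Energy|Gas", 2020): 130.0,
    ("Primary Energy|Coal|w/ CCS", 2020): 1.0,  # deeper level, not counted again
    ("Primary Energy|Coal", 2030): 120.0,
    ("Primary Energy|Gas", 2030): 140.0,
}

print(aggregate(data, "Primary Energy"))
```

Skipping the deeper levels avoids double counting, because `Primary Energy|Coal|w/ CCS` is already part of `Primary Energy|Coal`.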

It is often convenient to directly append computed timeseries data to the original object.
For that, you can use the `append=True` keyword argument.

In [None]:
df.aggregate("Primary Energy", append=True)

In [None]:
df

When displaying the variables of the **IamDataFrame** again, there is now an additional variable `Primary Energy`.

In [None]:
df.filter(variable="Primary Energy").data

### Aggregation by region

In the interest of time, the features for regional aggregation and downscaling are not shown in this notebook.

Take a look at [this tutorial](https://pyam-iamc.readthedocs.io/en/stable/tutorials/aggregating_downscaling_consistency.html)
for more information!

## Categorization of scenarios by their temperature outcome

We often want to categorize scenarios by some metrics or indicators.
As an example, let us divide scenarios into groups "above 2C" and "below 2C".

First, we assign **all** scenarios to the "above 2C" group,
and then use the [categorize()](https://pyam-iamc.readthedocs.io/en/stable/api/iamdataframe.html#pyam.IamDataFrame.categorize) method
to re-assign all scenarios that satisfy certain criteria.

In [None]:
df.set_meta(meta="above 2C", name="warming-category")
df.to_excel("df_warming_category_meta.xlsx")

In [None]:
df.filter(variable="Temperature").timeseries()

In [None]:
df.categorize(
    "warming-category", "below 2C",
    criteria={"Temperature": {"up": 2.0}},
)

We can inspect the assignment of categories via the `meta` attribute.

In [None]:
df.to_excel("df_warming_category_meta.xlsx")
df.meta
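
The logic of this two-step categorization can be sketched in plain Python: every scenario starts in a default group and is re-assigned only if all of its values stay at or below an upper bound. The function `categorize` and the temperature values below are hypothetical and serve as an illustration only.

```python
def categorize(timeseries, default, category, upper_bound):
    """Assign `category` to scenarios whose values never exceed `upper_bound`."""
    categories = {}
    for scenario, values in timeseries.items():
        if max(values) <= upper_bound:
            categories[scenario] = category
        else:
            categories[scenario] = default
    return categories


# Made-up temperature pathways (in degrees Celsius) for two hypothetical scenarios
temperature = {
    "scen_low": [1.2, 1.5, 1.7],
    "scen_high": [1.2, 1.9, 2.8],
}

print(categorize(temperature, "above 2C", "below 2C", 2.0))
```

A scenario that crosses the threshold in any single year stays in the default group, which is why the default is assigned first.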

We can now use this categorization to assign styles for the plots of other variables.

In [None]:
(
    df.filter(variable="Primary Energy|Gas")
    .plot(color="warming-category", linestyle="model", fill_between=True, final_ranges=True)
)

Of course, **pyam** also supports a lot of other plot types and styles -
check out the [plotting gallery](https://pyam-iamc.readthedocs.io/en/stable/gallery/index.html)!

## Algebraic operations

**pyam** can also perform algebraic operations directly on the timeseries data.

All algebraic-operations functions (
[add()](https://pyam-iamc.readthedocs.io/en/stable/api/iamdataframe.html#pyam.IamDataFrame.add),
[subtract()](https://pyam-iamc.readthedocs.io/en/stable/api/iamdataframe.html#pyam.IamDataFrame.subtract),
[multiply()](https://pyam-iamc.readthedocs.io/en/stable/api/iamdataframe.html#pyam.IamDataFrame.multiply),
[divide()](https://pyam-iamc.readthedocs.io/en/stable/api/iamdataframe.html#pyam.IamDataFrame.divide)
) follow the syntax:

```
df.<method>(a, b, c) => a <op> b = c
```

If possible, **pyam** will try to keep the unit consistent during the operation.  
This feature is supported by the **pint** and the **iam-units** packages,
see [here](https://github.com/iamconsortium/units).
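
The `a <op> b = c` pattern can be illustrated without **pyam**. In the sketch below, the helper `binary_op` is a hypothetical stand-in for the four methods, the values are made up, and unit handling is left out entirely.

```python
import operator


def binary_op(data, a, b, name, op):
    """Apply `op` to the variables `a` and `b` year by year and call the result `name`."""
    years = data[a].keys() & data[b].keys()  # only years present in both variables
    return {name: {year: op(data[a][year], data[b][year]) for year in sorted(years)}}


# Hypothetical values in EJ/yr, keyed by variable and year
data = {
    "Primary Energy": {2020: 500.0, 2030: 550.0},
    "Primary Energy|Coal": {2020: 150.0, 2030: 110.0},
}

non_coal = binary_op(
    data, "Primary Energy", "Primary Energy|Coal", "Primary Energy|Non-Coal", operator.sub
)
share = binary_op(
    data, "Primary Energy|Coal", "Primary Energy", "Share of coal", operator.truediv
)
print(non_coal)
print(share)
```

The third argument only names the result, so the same two inputs can feed several derived variables.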

### Computing the amount of primary energy that is not coal

First, we subtract coal from total primary energy and draw a simple plot.  
For this section, we will use a downselected version of the scenario data that only has global values.

In [None]:
df.subtract("Primary Energy", "Primary Energy|Coal", "Primary Energy|Non-Coal").plot()

### Computing coal as a share of primary energy

Next, we can also compute the share of coal relative to total primary energy, and again draw the plot.

In [None]:
df.divide("Primary Energy|Coal", "Primary Energy", "Share of coal").plot(legend=dict(loc="outside right"))

Note that **pyam** has automatically changed the unit on the y-axis.
Dividing `EJ/yr` by `EJ/yr` yields a dimensionless value.

### Compute ratio of energy sources between different scenarios

So far, we used the algebraic operations on the (default) *variable* axis.
But **pyam** also supports these operations on any other axis of the timeseries data!

Now, we compute the relative indicator between the *Net Zero 2050* and the *Current Policies* scenarios,
and again plot the resulting timeseries data.  
For simplicity, we only perform this computation on primary-energy values (including the sub-categories)
of the *REMIND* model.

In [None]:
df_pe = df.filter(model="REMIND-MAgPIE 2.1-4.2", variable="Primary Energy*")

In [None]:
(
    df_pe.divide("Net Zero 2050", "Current Policies", "Ratio", axis="scenario")
    .plot(legend=dict(loc="outside right"))
)
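
Operating along the *scenario* axis means that, for each variable and year, the value of one scenario is combined with the value of another. The sketch below shows the idea in plain Python; the function `divide_scenarios` and all values are hypothetical and only illustrate the pairing of values.

```python
def divide_scenarios(data, numerator, denominator):
    """For each (variable, year), divide the value of one scenario by that of another."""
    ratio = {}
    for key, value in data[numerator].items():
        # Only compute the ratio where both scenarios have a value
        if key in data[denominator]:
            ratio[key] = value / data[denominator][key]
    return ratio


# Made-up values in EJ/yr, keyed by scenario and then by (variable, year)
data = {
    "Net Zero 2050": {
        ("Primary Energy|Coal", 2050): 10.0,
        ("Primary Energy|Gas", 2050): 40.0,
    },
    "Current Policies": {
        ("Primary Energy|Coal", 2050): 100.0,
        ("Primary Energy|Gas", 2050): 80.0,
    },
}

ratio = divide_scenarios(data, "Net Zero 2050", "Current Policies")
print(ratio)
```

A ratio below one means that the energy source is used less in the first scenario than in the second.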

In [None]:
import matplotlib.pyplot as plt

As a final illustration, this tutorial shows how to use **matplotlib** and **pyam** to create several plots next to each other.

In [None]:
baseline = "Current Policies"
# Sort the scenarios so that the order of the panels is the same on every run
scenario = sorted(set(df.scenario) - {baseline})

# We first create a matplotlib figure with several "axes" objects (i.e., individual plots)
fig, ax = plt.subplots(1, len(scenario), figsize=(15, 5), sharey=True)

# Then, we iterate over the axes, plotting the results for each scenario as we go along
for i, s in enumerate(scenario):
    (
        df_pe.divide(s, baseline, "Ratio", axis="scenario")
        .plot(ax=ax[i], legend=dict(loc="outside right") if i==len(scenario) - 1 else False)
    )

    # We can also modify the axes objects directly to produce a better figure
    ax[i].set_title(s)

## Retrieve data directly from a Scenario Explorer

So far, we have worked with downloaded data in the form of an Excel file.
However, it is possible to download data directly from a Scenario Explorer.

To view the different databases available, we create a connection and list the valid connections:

In [None]:
conn = pyam.iiasa.Connection()
conn.valid_connections

Explore what's in the OpenMod4Africa internal database:

In [None]:
conn = pyam.iiasa.Connection('openmod4africa_internal')

In [None]:
conn.models()

In [None]:
conn.scenarios()

Let's get some OpenMod4Africa data then:

In [None]:
df = pyam.read_iiasa('openmod4africa')

In [None]:
df

More details about downloading data directly from a Scenario Explorer can be found in the [pyam docs](https://pyam-iamc.readthedocs.io/en/stable/tutorials/iiasa_dbs.html).

<div class="alert alert-info">
    
**Curious about more pyam features?** Check out all the pyam tutorials on our [ReadTheDocs page](https://pyam-iamc.readthedocs.io/en/stable/tutorials.html)!

</div>