# Validation of the PyPSA-Africa Electric Demand

This notebook investigates the quality of the African electricity consumption
data by comparing PyPSA demand with official Nigerian data and Our World in Data (incl. BP & Ember).

To reproduce the findings obtained in this notebook,
please run the full snakemake workflow for Africa.
To do so, set ``countries = ["Africa"]`` in the ``config.yaml`` file.

Note: An unoptimized prepared network is sufficient for this notebook.

## Preparation

### Import packages

In [1]:
# import packages

import logging
import os

import pypsa
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

logger = logging.getLogger(__name__)

pd.set_option("display.max_columns", None)
pd.set_option("display.max_colwidth", 70)

### Set main directory to root folder

In [2]:
# set current folders
import sys

sys.path.append("../../")  # adds path to $ .../pypsa-africa
from scripts._helpers import sets_path_to_root

sets_path_to_root("pypsa-africa")  # moves path to root

## 1. Load data 

#### Our World in Data
Retrieved from their GitHub: https://github.com/owid/energy-data/tree/master \
Web interface: https://ourworldindata.org/energy 

OWID lists its data sources as follows:
- Energy consumption (primary energy, energy mix and energy intensity): sourced from a combination of three sources: the BP Statistical Review of World Energy, the EIA and the SHIFT Data Portal.
- Electricity generation (electricity generation and electricity mix): sourced from a combination of three sources: the BP Statistical Review of World Energy, the Ember Data Explorer and the Ember European Electricity Review.
- Other variables: collected from a variety of sources (United Nations, World Bank, Gapminder, Maddison Project Database, etc.). More information is available in the OWID codebook.


In [3]:
from scripts._helpers import three_2_two_digits_country  # helper from the pypsa-africa scripts

url = "https://nyc3.digitaloceanspaces.com/owid-public/data/energy/owid-energy-data.csv"
df = pd.read_csv(url)
df = df.loc[:, ["iso_code", "country", "year", "electricity_demand"]]
df = df[df["iso_code"].notna()]  # drops rows without an ISO code (e.g. Antarctica)
df["iso_code_2"] = df.loc[:, "iso_code"].apply(lambda x: three_2_two_digits_country(x))
electricity_demand_owid = df
electricity_demand_owid.tail(2)
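
The conversion from three-letter to two-letter ISO codes can be illustrated with a small lookup table. This is only a sketch: the implementation of `three_2_two_digits_country` is not shown in this notebook, so the dictionary `ISO3_TO_ISO2` and the toy frame `df_toy` below are hypothetical stand-ins, not the project's helper.

```python
import pandas as pd

# Hypothetical lookup table (assumption); the real helper in
# scripts/_helpers.py may work differently and covers all countries.
ISO3_TO_ISO2 = {"NGA": "NG", "ZAF": "ZA", "KEN": "KE"}

# Toy frame with one code ("XXX") that is missing from the table.
df_toy = pd.DataFrame({"iso_code": ["NGA", "ZAF", "KEN", "XXX"]})

# Series.map yields NaN for codes that are not in the table instead of raising.
df_toy["iso_code_2"] = df_toy["iso_code"].map(ISO3_TO_ISO2)

print(df_toy)
```

Using `Series.map` with a dictionary avoids a Python-level `lambda` per row and makes unmapped codes visible as `NaN`, which can then be inspected before filtering.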

#### PyPSA network

Requires a solved or unsolved network.

In [4]:
solved_network_path = os.path.join(os.getcwd(), "networks", "elec_s_420_ec.nc")
pypsa_network = pypsa.Network(solved_network_path)
electricity_demand_pypsa = pypsa_network.loads_t.p_set
electricity_demand_pypsa.head(2)

## 2. Validate

#### Steps:
- Align the country coverage of the dataframes
- Pick the year of interest for 'Our World in Data'
- Align dataframe naming and temporal resolution
- Merge the dataframes


Reduce "Our World in Data" to contain the same countries as PyPSA

In [5]:
country_in_network = (
    electricity_demand_pypsa.columns.to_frame()["Load"].apply(lambda x: x[0:2]).values
)
electricity_demand_owid_mini = electricity_demand_owid[
    electricity_demand_owid["iso_code_2"].isin(country_in_network)
]
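
The country filter above relies on the load column names starting with the two-letter country code. A minimal sketch of the same idea on toy data, using the vectorised `.str` accessor instead of `apply`, could look as follows. The frames `loads` and `owid` and all values in them are made up for illustration.

```python
import pandas as pd

# Toy stand-in for pypsa_network.loads_t.p_set: columns are bus names whose
# first two characters are assumed to be the ISO 3166-1 alpha-2 code.
loads = pd.DataFrame(
    [[1.0, 2.0, 3.0], [1.5, 2.5, 3.5]],
    columns=["NG0 0", "NG0 1", "ZA0 0"],
)

# Toy stand-in for the OWID table (made-up values).
owid = pd.DataFrame(
    {
        "iso_code_2": ["NG", "ZA", "DE"],
        "year": [2020, 2020, 2020],
        "electricity_demand": [10.0, 20.0, 30.0],
    }
)

# Unique two-letter prefixes of the load columns.
countries_in_network = loads.columns.str[:2].unique()

# Keep only OWID rows whose country also appears in the network.
owid_mini = owid[owid["iso_code_2"].isin(countries_in_network)]

print(list(countries_in_network))
print(owid_mini)
```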

Select the OWID electricity demand (in TWh) for the year of interest

In [6]:
year_owid = 2020  # option
df = electricity_demand_owid_mini
df = df[df["year"] == year_owid]
electricity_demand_owid_mini = df
electricity_demand_owid_mini.head(2)

Create the total electricity demand in TWh from PyPSA for the demand year chosen in `config.yaml`

In [7]:
# shorten the column names to the two-letter country code (for groupby in the next step)
electricity_demand_pypsa.columns = (
    electricity_demand_pypsa.columns.to_frame()["Load"].apply(lambda x: x[0:2]).values
)

Align PyPSA dataframe to 'Our World in Data'

In [8]:
from scripts._helpers import two_2_three_digits_country  # _helpers are from pypsa
import yaml

with open("config.yaml", "r") as file:
    config = yaml.safe_load(file)

df = electricity_demand_pypsa
df = pd.DataFrame(
    (df.sum().T.groupby([df.columns]).sum() / 10**6).round(2)
)  # MWh to TWh
df = df.reset_index()
df = df.rename(columns={0: "electricity_demand", "index": "iso_code_2"})
df["year"] = config["load_options"]["prediction_year"]
df["iso_code"] = df.loc[:, "iso_code_2"].apply(lambda x: two_2_three_digits_country(x))

map_dic = pd.Series(
    electricity_demand_owid_mini.country.values,
    index=electricity_demand_owid_mini.iso_code_2,
).to_dict()
df["country"] = df["iso_code_2"].map(map_dic)
electricity_demand_pypsa = df
electricity_demand_pypsa.head(2)
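
The aggregation in the cell above sums the load time series over all snapshots and buses of a country and divides by 10**6. Summing power values in MW gives energy in MWh only if each snapshot represents one hour. The sketch below makes that assumption explicit on toy data. The frame `p_set` and the variable `hours_per_snapshot` are illustrative names, not part of the notebook. For networks with a coarser temporal resolution, the snapshot weightings of the network would have to be taken into account.

```python
import pandas as pd

# Toy load time series in MW (made-up values); columns are bus names
# assumed to start with the two-letter country code.
p_set = pd.DataFrame(
    {
        "NG0 0": [1000.0, 2000.0],
        "NG0 1": [3000.0, 4000.0],
        "ZA0 0": [5000.0, 5000.0],
    }
)

# Assumption: each snapshot represents this many hours (1.0 for hourly data).
hours_per_snapshot = 1.0

# Energy per bus in MWh = sum over snapshots of power [MW] * duration [h].
energy_mwh = (p_set * hours_per_snapshot).sum()

# Sum the buses of each country via the two-letter prefix, then convert MWh to TWh.
energy_twh = energy_mwh.groupby(energy_mwh.index.str[:2]).sum() / 1e6

print(energy_twh)
```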

Merge dataframes

In [9]:
h = config["load_options"]["prediction_year"]
electricity_demand_pypsa["source"] = f"PyPSA {h}"
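# assign() returns a new frame and avoids a SettingWithCopyWarning on the filtered slice
electricity_demand_owid_mini = electricity_demand_owid_mini.assign(
    source=f"Our World in Data {year_owid}"
)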
df_merge = pd.concat([electricity_demand_pypsa, electricity_demand_owid_mini])
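
Besides plotting, the long-format table can be pivoted to put both sources side by side and to compute a per-country deviation. The following is a minimal sketch on made-up numbers. The frame `df_merge_toy`, the shortened source labels and the column `rel_diff_pct` are assumptions for illustration and do not appear in the notebook.

```python
import pandas as pd

# Toy long-format table mimicking the structure of df_merge (made-up values in TWh).
df_merge_toy = pd.DataFrame(
    {
        "iso_code_2": ["NG", "ZA", "NG", "ZA"],
        "electricity_demand": [30.0, 220.0, 25.0, 200.0],
        "source": ["PyPSA", "PyPSA", "OWID", "OWID"],
    }
)

# One row per country, one column per source.
wide = df_merge_toy.pivot(
    index="iso_code_2", columns="source", values="electricity_demand"
)

# Relative deviation of the PyPSA demand from the OWID demand in percent.
wide["rel_diff_pct"] = (wide["PyPSA"] - wide["OWID"]) / wide["OWID"] * 100

print(wide)
```

Countries that are missing in one source (such as Western Sahara in 'Our World in Data') would show up as `NaN` in the pivoted table, which makes coverage gaps easy to spot.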

In [10]:
import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme(style="whitegrid", font_scale=1.5)

# shorten names
df_merge.loc[
    df_merge["country"] == "Democratic Republic of Congo", "country"
] = "Demo. Rep. of Congo"
df_merge.loc[
    df_merge["country"] == "Central African Republic", "country"
] = "Central African Rep."

# split the dataframe for plotting. An equal split is not possible because 'Our World in Data' has no entry for "EH" (Western Sahara)
df_merge1 = df_merge.sort_values("country").iloc[0:46]
df_merge2 = df_merge.sort_values("country").iloc[46:]

# Initialize the matplotlib figure
ax = sns.catplot(
    x="country",
    y="electricity_demand",
    hue="source",
    data=df_merge1,
    palette="Blues_d",
    kind="bar",
    height=5,
    aspect=5,
)
ax.set_xticklabels(rotation=90)
sns.move_legend(ax, "upper right", bbox_to_anchor=(0.85, 0.9), frameon=True, title=None)
sns.despine(left=True, bottom=True)
ax.set(xlabel=None, ylabel="Annual Electricity Demand [TWh]", ylim=(0, 500))
# ax.savefig("demand-validation-part1.pdf", bbox_inches='tight')

ax = sns.catplot(
    x="country",
    y="electricity_demand",
    hue="source",
    data=df_merge2,
    palette="Blues_d",
    kind="bar",
    height=5,
    aspect=5,
)
ax.set_xticklabels(rotation=90)
sns.move_legend(ax, "upper right", bbox_to_anchor=(0.85, 0.9), frameon=True, title=None)
sns.despine(left=True, bottom=True)
ax.set(xlabel=None, ylabel="Annual Electricity Demand [TWh]", ylim=(0, 500))
# ax.savefig("demand-validation-part2.pdf", bbox_inches='tight')

In [11]:
# sum only the demand column; summing string columns (country, iso_code) is not meaningful
african_total_consumption = (
    df_merge.groupby(by="source")["electricity_demand"].sum().reset_index()
)
extra = pd.DataFrame(
    data={
        "source": ["IRENA 2030", "Alova et al. 2030"],
        "electricity_demand": [1004 + 920, 1877],
    }
)
# https://www.irena.org/-/media/Files/IRENA/Agency/Publication/2015/IRENA_Africa_2030_REmap_2015_low-res.pdf
# https://www.nature.com/articles/s41560-020-00755-9

african_total_consumption = pd.concat([african_total_consumption, extra])
african_total_consumption.plot.scatter(
    x="source", y="electricity_demand", s="electricity_demand"
)
african_total_consumption