# What is the most sustainable way to generate electricity? A comparison of ecoinvent datasets.

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Introduction" data-toc-modified-id="Introduction-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Introduction</a></span></li><li><span><a href="#Calculation" data-toc-modified-id="Calculation-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Calculation</a></span><ul class="toc-item"><li><span><a href="#ILCD-scores" data-toc-modified-id="ILCD-scores-2.1"><span class="toc-item-num">2.1&nbsp;&nbsp;</span>ILCD scores</a></span></li><li><span><a href="#Sustainability-Index" data-toc-modified-id="Sustainability-Index-2.2"><span class="toc-item-num">2.2&nbsp;&nbsp;</span>Sustainability Index</a></span><ul class="toc-item"><li><span><a href="#Normalization" data-toc-modified-id="Normalization-2.2.1"><span class="toc-item-num">2.2.1&nbsp;&nbsp;</span>Normalization</a></span></li><li><span><a href="#Weighing-and-aggregation" data-toc-modified-id="Weighing-and-aggregation-2.2.2"><span class="toc-item-num">2.2.2&nbsp;&nbsp;</span>Weighing and aggregation</a></span></li></ul></li><li><span><a href="#Export" data-toc-modified-id="Export-2.3"><span class="toc-item-num">2.3&nbsp;&nbsp;</span>Export</a></span></li><li><span><a href="#Visualization" data-toc-modified-id="Visualization-2.4"><span class="toc-item-num">2.4&nbsp;&nbsp;</span>Visualization</a></span></li></ul></li><li><span><a href="#Discussion" data-toc-modified-id="Discussion-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Discussion</a></span></li></ul></div>

## Introduction

Which electricity generation technology is the most sustainable? Is it photovoltaic panels on house roofs? Offshore wind parks? Nuclear pressurized water reactors? The answer depends on several aspects.

First of all, it depends on the models used to describe the different generation technologies. Here, I will use models and parameters as supplied by ecoinvent 3.5, allocation at the point of substitution system model (https://www.ecoinvent.org/).

Secondly, it depends on how we define sustainability. For example, wind energy seems very sustainable from a climate change point of view (low greenhouse gas emissions). However, wind turbines need large amounts of minerals and metals in their construction, which makes them seem less sustainable from a resource point of view. This is one example of how different indicators yield different answers. For this study, I will use the 19 midpoint indicators recommended by the International Reference Life Cycle Data System (ILCD), version 2.0, as implemented in ecoinvent 3.5. I will present results for each indicator. Additionally, I will present normalized, equal-weighted aggregates. The latter is just *one* example of how to aggregate multiple indicators into one sustainability index. There are infinitely many ways to aggregate different indicators, and none of them is inherently preferable to the others. In the end, any sustainability measure is a subjective construct, because different stakeholders place different emphasis on different impact categories.

## Calculation

### ILCD scores
I use the brightway2 package for Python for impact calculations (https://brightwaylca.org/). Brightway allows me to:
- read the database (ecoinvent 3.5 APOS)
- query the database for activities (all electricity production activities in the database)
- calculate the impact score of activities according to different methods (ILCD 2.0)

First, imports.

In [1]:
import brightway2 as bw
import pandas as pd
import xlsxwriter

I'll skip the setup here. Please refer to the official brightway guide for details on how to import the ecoinvent 3.5 database etc.: https://nbviewer.jupyter.org/urls/bitbucket.org/cmutel/brightway2/raw/default/notebooks/Getting%20Started%20with%20Brightway2.ipynb

Let's start by getting all electricity production activities in the database.

In [2]:
# select the brightway project that contains the ecoinvent 3.5 APOS database
bw.projects.set_current("ecoinvent-import")

# querying 
lActivities = [a for a in bw.Database("ecoinvent 3.5 APOS") if "electricity production" in a["name"]]

len(lActivities)

1488

ecoinvent knows 1,488 different activities that produce electricity! Let's go ahead and calculate their impacts. 

**Note: Computing all values may take up to an hour! I reduced the number of activities to the first 5 in the list to make the notebook runnable. Feel free to delete the corresponding line to calculate all impacts on your system.**

In [3]:
###### delete line below to calculate ALL impacts ######
lActivities = lActivities[:5]
########################################################

# get ILCD 2.0 midpoint methods
ilcd = [m for m in bw.methods if "ILCD" in str(m) and "2018" in str(m) and "LT" not in str(m)]

# compute all ILCD scores for all activities
ldScores = []
for a in lActivities:
    oLCA = bw.LCA({a:1}, ilcd[0])
    oLCA.lci()
    oLCA.lcia()
    dScores = {ilcd[0]:oLCA.score}
    for oMethod in ilcd[1:]:
        oLCA.switch_method(oMethod)
        oLCA.lcia()
        dScores[oMethod] = oLCA.score
    ldScores.append(dScores)
    
# convert to dataframe
df = pd.DataFrame(ldScores)
df.head()

Unnamed: 0,"(ILCD 2.0 2018 midpoint, climate change, climate change biogenic)","(ILCD 2.0 2018 midpoint, climate change, climate change fossil)","(ILCD 2.0 2018 midpoint, climate change, climate change land use and land use change)","(ILCD 2.0 2018 midpoint, climate change, climate change total)","(ILCD 2.0 2018 midpoint, ecosystem quality, freshwater and terrestrial acidification)","(ILCD 2.0 2018 midpoint, ecosystem quality, freshwater ecotoxicity)","(ILCD 2.0 2018 midpoint, ecosystem quality, freshwater eutrophication)","(ILCD 2.0 2018 midpoint, ecosystem quality, marine eutrophication)","(ILCD 2.0 2018 midpoint, ecosystem quality, terrestrial eutrophication)","(ILCD 2.0 2018 midpoint, human health, carcinogenic effects)","(ILCD 2.0 2018 midpoint, human health, ionising radiation)","(ILCD 2.0 2018 midpoint, human health, non-carcinogenic effects)","(ILCD 2.0 2018 midpoint, human health, ozone layer depletion)","(ILCD 2.0 2018 midpoint, human health, photochemical ozone creation)","(ILCD 2.0 2018 midpoint, human health, respiratory effects, inorganics)","(ILCD 2.0 2018 midpoint, resources, dissipated water)","(ILCD 2.0 2018 midpoint, resources, fossils)","(ILCD 2.0 2018 midpoint, resources, land use)","(ILCD 2.0 2018 midpoint, resources, minerals and metals)"
0,5.3e-05,0.588143,1.5e-05,0.588211,0.000645,0.217072,8e-06,0.00018,0.001934,1.142199e-09,0.001369,1.202924e-08,5.0993e-08,0.000718,1.659245e-09,0.044559,9.281378,0.170848,1.040912e-07
1,0.000162,1.053454,7.8e-05,1.053694,0.010851,0.074229,0.000753,0.001528,0.015533,1.561707e-09,0.004844,4.135861e-08,6.808792e-09,0.00412,1.342487e-08,0.07718,15.350463,2.357585,1.142805e-07
2,2.6e-05,0.012671,3e-05,0.012726,7.4e-05,0.025228,9e-06,6e-05,0.000187,5.676048e-10,1.184924,3.716515e-09,5.734082e-08,5.1e-05,2.871032e-09,0.132819,14.255169,0.084286,5.122173e-08
3,0.000363,0.077957,0.000177,0.078496,0.00057,0.089036,7.2e-05,0.000101,0.000926,2.191182e-09,0.009163,2.70784e-08,8.781046e-09,0.000315,4.861033e-09,0.1121,1.200793,0.592213,3.434466e-06
4,0.000287,0.061599,0.00014,0.062025,0.00045,0.070353,5.7e-05,8e-05,0.000732,1.731384e-09,0.00724,2.139614e-08,6.938799e-09,0.000249,3.841037e-09,0.088576,0.948821,0.467936,2.713751e-06
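
To see what the method filter in the cell above keeps and drops, it can help to run the same substring test on a few hand-written method tuples. This is only a sketch: `candidate_methods` is a hypothetical stand-in for `bw.methods`, and the `no LT` and `IPCC 2013` entries are assumed names used for illustration, not taken from this notebook.

```python
# Hypothetical stand-in for bw.methods: method identifiers are tuples of strings.
candidate_methods = [
    ("ILCD 2.0 2018 midpoint", "climate change", "climate change total"),
    ("ILCD 2.0 2018 midpoint", "resources", "land use"),
    # assumed name of a variant that the "LT" test should drop
    ("ILCD 2.0 2018 midpoint no LT", "climate change", "climate change total"),
    # assumed name of an unrelated method family, dropped by the "ILCD" test
    ("IPCC 2013", "climate change", "GWP 100a"),
]


def is_ilcd_2018(method):
    """Same substring test as in the notebook cell, applied to one tuple."""
    text = str(method)
    return "ILCD" in text and "2018" in text and "LT" not in text


selected = [m for m in candidate_methods if is_ilcd_2018(m)]
for m in selected:
    print(m)
```

Because the test runs on the string form of the whole tuple, any method whose name contains the letters "LT" anywhere would be dropped as well, so it is worth checking `len(ilcd)` against the expected 19 indicators.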


Numbered indices are not very readable. Let's use metadata about the activities as the index instead:

In [4]:
# split activity names at the commas to make reading and manipulation easier
names = [a["name"].split(",") for a in lActivities]
df_names = pd.DataFrame(names).fillna(" ")
# add one metadata column per name part
col_names = [("name_"+str(c), " ", " ") for c in df_names.columns]
df[col_names] = df_names

# add units and locations
df[("unit"," "," ")] = [a["unit"] for a in lActivities]
df[("location"," "," ")] = [a["location"] for a in lActivities]

# set index
meta_data_cols = col_names + [("unit", " ", " "), ("location", " ", " ")]
df.set_index(meta_data_cols, inplace=True)

df.head()

Unnamed: 0_level_0,Unnamed: 1_level_0,Unnamed: 2_level_0,Unnamed: 3_level_0,Unnamed: 4_level_0,Unnamed: 5_level_0,Unnamed: 6_level_0,Unnamed: 7_level_0,"(ILCD 2.0 2018 midpoint, climate change, climate change biogenic)","(ILCD 2.0 2018 midpoint, climate change, climate change fossil)","(ILCD 2.0 2018 midpoint, climate change, climate change land use and land use change)","(ILCD 2.0 2018 midpoint, climate change, climate change total)","(ILCD 2.0 2018 midpoint, ecosystem quality, freshwater and terrestrial acidification)","(ILCD 2.0 2018 midpoint, ecosystem quality, freshwater ecotoxicity)","(ILCD 2.0 2018 midpoint, ecosystem quality, freshwater eutrophication)","(ILCD 2.0 2018 midpoint, ecosystem quality, marine eutrophication)","(ILCD 2.0 2018 midpoint, ecosystem quality, terrestrial eutrophication)","(ILCD 2.0 2018 midpoint, human health, carcinogenic effects)","(ILCD 2.0 2018 midpoint, human health, ionising radiation)","(ILCD 2.0 2018 midpoint, human health, non-carcinogenic effects)","(ILCD 2.0 2018 midpoint, human health, ozone layer depletion)","(ILCD 2.0 2018 midpoint, human health, photochemical ozone creation)","(ILCD 2.0 2018 midpoint, human health, respiratory effects, inorganics)","(ILCD 2.0 2018 midpoint, resources, dissipated water)","(ILCD 2.0 2018 midpoint, resources, fossils)","(ILCD 2.0 2018 midpoint, resources, land use)","(ILCD 2.0 2018 midpoint, resources, minerals and metals)"
"(name_0, , )","(name_1, , )","(name_2, , )","(name_3, , )","(name_4, , )","(name_5, , )","(unit, , )","(location, , )",Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1,Unnamed: 24_level_1,Unnamed: 25_level_1,Unnamed: 26_level_1
electricity production,natural gas,conventional power plant,,,,kilowatt hour,CN-JS,5.3e-05,0.588143,1.5e-05,0.588211,0.000645,0.217072,8e-06,0.00018,0.001934,1.142199e-09,0.001369,1.202924e-08,5.0993e-08,0.000718,1.659245e-09,0.044559,9.281378,0.170848,1.040912e-07
electricity production,hard coal,,,,,kilowatt hour,RoW,0.000162,1.053454,7.8e-05,1.053694,0.010851,0.074229,0.000753,0.001528,0.015533,1.561707e-09,0.004844,4.135861e-08,6.808792e-09,0.00412,1.342487e-08,0.07718,15.350463,2.357585,1.142805e-07
electricity production,nuclear,boiling water reactor,,,,kilowatt hour,US-NPCC,2.6e-05,0.012671,3e-05,0.012726,7.4e-05,0.025228,9e-06,6e-05,0.000187,5.676048e-10,1.184924,3.716515e-09,5.734082e-08,5.1e-05,2.871032e-09,0.132819,14.255169,0.084286,5.122173e-08
electricity production,photovoltaic,3kWp slanted-roof installation,multi-Si,panel,mounted,kilowatt hour,LV,0.000363,0.077957,0.000177,0.078496,0.00057,0.089036,7.2e-05,0.000101,0.000926,2.191182e-09,0.009163,2.70784e-08,8.781046e-09,0.000315,4.861033e-09,0.1121,1.200793,0.592213,3.434466e-06
electricity production,photovoltaic,3kWp slanted-roof installation,multi-Si,panel,mounted,kilowatt hour,IN-JH,0.000287,0.061599,0.00014,0.062025,0.00045,0.070353,5.7e-05,8e-05,0.000732,1.731384e-09,0.00724,2.139614e-08,6.938799e-09,0.000249,3.841037e-09,0.088576,0.948821,0.467936,2.713751e-06
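
One detail of the name handling is easy to miss: `split(",")` keeps the space that follows each comma, so every name part after the first starts with a blank. The sketch below shows this on activity names reconstructed from the table above, together with a stripped variant that the notebook does not use; the variable names are illustrative only.

```python
# Activity names reconstructed from the table above.
raw_names = [
    "electricity production, natural gas, conventional power plant",
    "electricity production, hard coal",
    "electricity production, photovoltaic, 3kWp slanted-roof installation, "
    "multi-Si, panel, mounted",
]

# A plain split keeps the space that follows each comma.
plain = [name.split(",") for name in raw_names]

# Variant (not used in the notebook): strip each part after splitting.
stripped = [[part.strip() for part in name.split(",")] for name in raw_names]

# Pad every row to the same number of parts, mirroring what
# pd.DataFrame(names).fillna(" ") does for the shorter names.
width = max(len(parts) for parts in stripped)
padded = [parts + [" "] * (width - len(parts)) for parts in stripped]

print(plain[1])   # ['electricity production', ' hard coal']
print(padded[1])
```

With the stripped variant, index values would read `'hard coal'` rather than `' hard coal'`, which makes later lookups by name less error-prone.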


That's it! Computed for the full list, these are the impact scores for all electricity production activities in ecoinvent 3.5 (here, only for the five activities selected above). I can use them to answer indicator-specific questions like: Which electricity generation activity has the lowest total global warming potential (GWP 100)?

In [5]:
df[("ILCD 2.0 2018 midpoint", "climate change", "climate change total")].idxmin()

('electricity production',
 ' nuclear',
 ' boiling water reactor',
 ' ',
 ' ',
 ' ',
 'kilowatt hour',
 'US-NPCC')

Or I can run statistical evaluations, such as: What are the mean and the standard deviation of the GWP 100 indicator across all electricity generation activities?

In [6]:
df[("ILCD 2.0 2018 midpoint", "climate change", "climate change total")].describe()

count    5.000000
mean     0.359031
std      0.453299
min      0.012726
25%      0.062025
50%      0.078496
75%      0.588211
max      1.053694
Name: (ILCD 2.0 2018 midpoint, climate change, climate change total), dtype: float64
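
As a cross-check of the summary above, the mean and standard deviation can be recomputed with the standard library. The values are copied from the rounded table output, so the results agree only up to rounding. Note that pandas reports the sample standard deviation (dividing by n - 1), which corresponds to `statistics.stdev`, not `statistics.pstdev`.

```python
import statistics

# "climate change total" scores of the five activities, copied from the table above
gwp_total = [0.588211, 1.053694, 0.012726, 0.078496, 0.062025]

mean = statistics.mean(gwp_total)
sample_std = statistics.stdev(gwp_total)       # divides by n - 1, like pandas
population_std = statistics.pstdev(gwp_total)  # divides by n

print(round(mean, 4), round(sample_std, 4), round(population_std, 4))
```

With only five activities the two standard deviations differ noticeably; for the full set of 1,488 activities the difference would be negligible.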

### Sustainability Index

Using the produced data we can rank the electricity generation datasets according to individual impact indicators. However, a single indicator does not give enough information to decide if a technology is sustainable or not. To get a bigger picture, I want to aggregate all indicators into one number. As mentioned in the introduction, there are infinitely many ways to do this. The one chosen here is not more right or wrong than any other way. Feel free to change this part according to your needs!

#### Normalization

For each indicator, I choose the minimum and the maximum value over all activities. I define the minimum as 0 and the maximum as 1. Then I use linear interpolation to project all other values into this [0, 1] range.
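
Written as a formula, with $x_{a,i}$ denoting the score of activity $a$ for indicator $i$ and $\tilde{x}_{a,i}$ its normalized value (these symbols are introduced here for illustration and do not appear elsewhere in the notebook):

$$
\tilde{x}_{a,i} = \frac{x_{a,i} - \min_{a'} x_{a',i}}{\max_{a'} x_{a',i} - \min_{a'} x_{a',i}}
$$

The minimum and maximum are taken over all activities $a'$ in the benchmark, separately for each indicator.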

In [7]:
df_normalized = df.copy()
for indicator in df.columns:
    max_value = df[indicator].max()
    min_value = df[indicator].min()
    df_normalized[indicator] = (df[indicator] - min_value) / (max_value - min_value)
    
df_normalized.head()

Unnamed: 0_level_0,Unnamed: 1_level_0,Unnamed: 2_level_0,Unnamed: 3_level_0,Unnamed: 4_level_0,Unnamed: 5_level_0,Unnamed: 6_level_0,Unnamed: 7_level_0,"(ILCD 2.0 2018 midpoint, climate change, climate change biogenic)","(ILCD 2.0 2018 midpoint, climate change, climate change fossil)","(ILCD 2.0 2018 midpoint, climate change, climate change land use and land use change)","(ILCD 2.0 2018 midpoint, climate change, climate change total)","(ILCD 2.0 2018 midpoint, ecosystem quality, freshwater and terrestrial acidification)","(ILCD 2.0 2018 midpoint, ecosystem quality, freshwater ecotoxicity)","(ILCD 2.0 2018 midpoint, ecosystem quality, freshwater eutrophication)","(ILCD 2.0 2018 midpoint, ecosystem quality, marine eutrophication)","(ILCD 2.0 2018 midpoint, ecosystem quality, terrestrial eutrophication)","(ILCD 2.0 2018 midpoint, human health, carcinogenic effects)","(ILCD 2.0 2018 midpoint, human health, ionising radiation)","(ILCD 2.0 2018 midpoint, human health, non-carcinogenic effects)","(ILCD 2.0 2018 midpoint, human health, ozone layer depletion)","(ILCD 2.0 2018 midpoint, human health, photochemical ozone creation)","(ILCD 2.0 2018 midpoint, human health, respiratory effects, inorganics)","(ILCD 2.0 2018 midpoint, resources, dissipated water)","(ILCD 2.0 2018 midpoint, resources, fossils)","(ILCD 2.0 2018 midpoint, resources, land use)","(ILCD 2.0 2018 midpoint, resources, minerals and metals)"
"(name_0, , )","(name_1, , )","(name_2, , )","(name_3, , )","(name_4, , )","(name_5, , )","(unit, , )","(location, , )",Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1,Unnamed: 24_level_1,Unnamed: 25_level_1,Unnamed: 26_level_1
electricity production,natural gas,conventional power plant,,,,kilowatt hour,CN-JS,0.07912,0.552922,0.0,0.552836,0.052945,1.0,0.0,0.081488,0.113848,0.353906,0.0,0.220836,0.87438,0.163834,0.0,0.0,0.578584,0.038077,0.015627
electricity production,hard coal,,,,,kilowatt hour,RoW,0.40255,1.0,0.388987,1.0,1.0,0.255419,1.0,1.0,1.0,0.612291,0.002936,1.0,0.0,1.0,1.0,0.369608,1.0,1.0,0.018639
electricity production,nuclear,boiling water reactor,,,,kilowatt hour,US-NPCC,0.0,0.0,0.087117,0.0,0.0,0.0,0.00129,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.102994,1.0,0.923947,0.0,0.0
electricity production,photovoltaic,3kWp slanted-roof installation,multi-Si,panel,mounted,kilowatt hour,LV,1.0,0.062728,1.0,0.063182,0.045976,0.332602,0.086409,0.027954,0.048144,1.0,0.006586,0.620632,0.03903,0.064746,0.272131,0.765254,0.017496,0.223432,1.0
electricity production,photovoltaic,3kWp slanted-roof installation,multi-Si,panel,mounted,kilowatt hour,IN-JH,0.773791,0.047011,0.770071,0.047359,0.034886,0.235214,0.066049,0.013513,0.035482,0.716799,0.004961,0.469677,0.002573,0.048516,0.185438,0.498725,0.0,0.168763,0.786975


The result is a table where all impact scores range between zero and one. Zero means lowest impact with reference to the benchmark (i.e. all ecoinvent 3.5 electricity generation activities). One means highest impact with reference to the benchmark.
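
The normalization can be checked by hand on a single indicator. The following is a minimal pure-Python sketch, not the notebook's pandas code: it rescales the "climate change total" scores from the absolute table and adds a guard for the case where all activities share the same score. In the pandas cell above, such a constant column would produce NaN values (0/0); returning zeros instead is an assumption made for this sketch.

```python
def min_max_normalize(values):
    """Linearly rescale values so the minimum maps to 0 and the maximum to 1."""
    lowest, highest = min(values), max(values)
    span = highest - lowest
    if span == 0:
        # Assumption for this sketch: a constant indicator carries no ranking
        # information, so every activity gets 0 instead of a division by zero.
        return [0.0 for _ in values]
    return [(v - lowest) / span for v in values]


# "climate change total" scores from the absolute table, in the original row order
gwp_total = [0.588211, 1.053694, 0.012726, 0.078496, 0.062025]
normalized = min_max_normalize(gwp_total)
print([round(v, 4) for v in normalized])
```

Up to rounding of the inputs, the result should match the "climate change total" column of the normalized table.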

#### Weighing and aggregation

We still have 19 numbers, each of which describes a small part of the big picture "sustainability". I will now boil them down to one number by simply adding them up. I call the resulting number "sustainability index". Let me stress this again: This index is not more right or wrong than any other one. It is *one* rather arbitrary way to aggregate the individual impact scores.

The lowest possible number for our index is zero. Zero indicates a technology which achieves the *lowest* possible (with reference to the benchmark) impact score in all nineteen impact categories. The highest possible index value is nineteen. It indicates a technology which has the *highest* possible (with reference to the benchmark) impact score in all nineteen impact categories.

Let's see how the ecoinvent activities score in this index:

In [8]:
# sum
df_normalized[("SUM"," "," ")] = df_normalized.sum(axis=1)

# sort ascending
df_normalized.sort_values(by=("SUM"," "," "), ascending=True, inplace=True)

df_normalized.head()

Unnamed: 0_level_0,Unnamed: 1_level_0,Unnamed: 2_level_0,Unnamed: 3_level_0,Unnamed: 4_level_0,Unnamed: 5_level_0,Unnamed: 6_level_0,Unnamed: 7_level_0,"(ILCD 2.0 2018 midpoint, climate change, climate change biogenic)","(ILCD 2.0 2018 midpoint, climate change, climate change fossil)","(ILCD 2.0 2018 midpoint, climate change, climate change land use and land use change)","(ILCD 2.0 2018 midpoint, climate change, climate change total)","(ILCD 2.0 2018 midpoint, ecosystem quality, freshwater and terrestrial acidification)","(ILCD 2.0 2018 midpoint, ecosystem quality, freshwater ecotoxicity)","(ILCD 2.0 2018 midpoint, ecosystem quality, freshwater eutrophication)","(ILCD 2.0 2018 midpoint, ecosystem quality, marine eutrophication)","(ILCD 2.0 2018 midpoint, ecosystem quality, terrestrial eutrophication)","(ILCD 2.0 2018 midpoint, human health, carcinogenic effects)","(ILCD 2.0 2018 midpoint, human health, ionising radiation)","(ILCD 2.0 2018 midpoint, human health, non-carcinogenic effects)","(ILCD 2.0 2018 midpoint, human health, ozone layer depletion)","(ILCD 2.0 2018 midpoint, human health, photochemical ozone creation)","(ILCD 2.0 2018 midpoint, human health, respiratory effects, inorganics)","(ILCD 2.0 2018 midpoint, resources, dissipated water)","(ILCD 2.0 2018 midpoint, resources, fossils)","(ILCD 2.0 2018 midpoint, resources, land use)","(ILCD 2.0 2018 midpoint, resources, minerals and metals)","(SUM, , )"
"(name_0, , )","(name_1, , )","(name_2, , )","(name_3, , )","(name_4, , )","(name_5, , )","(unit, , )","(location, , )",Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1,Unnamed: 24_level_1,Unnamed: 25_level_1,Unnamed: 26_level_1,Unnamed: 27_level_1
electricity production,nuclear,boiling water reactor,,,,kilowatt hour,US-NPCC,0.0,0.0,0.087117,0.0,0.0,0.0,0.00129,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.102994,1.0,0.923947,0.0,0.0,4.115347
electricity production,natural gas,conventional power plant,,,,kilowatt hour,CN-JS,0.07912,0.552922,0.0,0.552836,0.052945,1.0,0.0,0.081488,0.113848,0.353906,0.0,0.220836,0.87438,0.163834,0.0,0.0,0.578584,0.038077,0.015627,4.678403
electricity production,photovoltaic,3kWp slanted-roof installation,multi-Si,panel,mounted,kilowatt hour,IN-JH,0.773791,0.047011,0.770071,0.047359,0.034886,0.235214,0.066049,0.013513,0.035482,0.716799,0.004961,0.469677,0.002573,0.048516,0.185438,0.498725,0.0,0.168763,0.786975,4.905804
electricity production,photovoltaic,3kWp slanted-roof installation,multi-Si,panel,mounted,kilowatt hour,LV,1.0,0.062728,1.0,0.063182,0.045976,0.332602,0.086409,0.027954,0.048144,1.0,0.006586,0.620632,0.03903,0.064746,0.272131,0.765254,0.017496,0.223432,1.0,6.676299
electricity production,hard coal,,,,,kilowatt hour,RoW,0.40255,1.0,0.388987,1.0,1.0,0.255419,1.0,1.0,1.0,0.612291,0.002936,1.0,0.0,1.0,1.0,0.369608,1.0,1.0,0.018639,13.050429
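
Since equal weighting is only one choice among many, it may help to see the index as a weighted sum in which every weight happens to be 1. The sketch below uses the normalized scores of the nuclear activity from the table above. The function `sustainability_index` and the alternative weight vector are hypothetical and only show where other weightings would plug in.

```python
# Normalized scores of the nuclear activity (US-NPCC row of the table above),
# in the column order of the table.
nuclear = [0.0, 0.0, 0.087117, 0.0, 0.0, 0.0, 0.00129, 0.0, 0.0, 0.0,
           1.0, 0.0, 1.0, 0.0, 0.102994, 1.0, 0.923947, 0.0, 0.0]


def sustainability_index(scores, weights=None):
    """Weighted sum of normalized scores; equal weights of 1 by default."""
    if weights is None:
        weights = [1.0] * len(scores)
    if len(weights) != len(scores):
        raise ValueError("need exactly one weight per indicator")
    return sum(w * s for w, s in zip(weights, scores))


# Equal weights reproduce the SUM column (up to rounding of the inputs).
equal = sustainability_index(nuclear)

# Hypothetical alternative: double the weight of ionising radiation (position 10).
custom_weights = [1.0] * len(nuclear)
custom_weights[10] = 2.0
custom = sustainability_index(nuclear, custom_weights)

print(round(equal, 4), round(custom, 4))
```

With non-unit weights the upper bound of the index is the sum of the weights rather than nineteen, so indices computed with different weightings are not directly comparable.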


### Export

Let's export the absolute and the normalized results to an Excel file.

In [9]:
# transform index into individual columns for easier manipulation
df.reset_index(inplace=True)
df_normalized.reset_index(inplace=True)

# make multi-level headers for better readability
df.columns = pd.MultiIndex.from_tuples(df.columns)
df_normalized.columns = pd.MultiIndex.from_tuples(df_normalized.columns)

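# export to xlsx; the context manager saves and closes the file on exit
# (writer.save() was removed in pandas 2.0; the "output" directory must already exist)
with pd.ExcelWriter("output/ecoinvent_electricity_comparison.xlsx", engine='xlsxwriter') as writer:
    df.to_excel(writer, sheet_name='abs')
    df_normalized.to_excel(writer, sheet_name='norm')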

### Visualization

Let's draw a heat map showing all normalized impacts for all activities and coloring them according to their magnitude.

In [10]:
import bokeh.io
import bokeh.models
import bokeh.plotting
from bokeh.palettes import Reds9
import re
import numpy as np

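# construct list of activity names for display
# (taken from df_normalized, which is sorted by SUM, so that names[i] matches row i of the plotted scores)
names = df_normalized.loc[:,meta_data_cols].apply(lambda x: re.sub(' +', ' '," ".join(x[1:])).strip(), axis=1).to_list()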

# construct list of method names for display
methods = [", ".join(m[1:]) for m in ilcd]

# define tooltips to be displayed
TOOLTIPS = [
    ("activity", "@act"),
    ("impact category", "@cat"),
    ("normalized score", "@score"),
]

# make figure (note: Bokeh 3.x uses width/height instead of plot_width/plot_height)
f = bokeh.plotting.figure(
    x_axis_label="ILCD 2.0 midpoint indicator", y_axis_label='ecoinvent activity',
    plot_width=900, plot_height=800,   
    tooltips = TOOLTIPS,
    y_range=bokeh.models.FactorRange(*names),
    x_range=bokeh.models.FactorRange(*methods)
)

# define plot data
data = {
    "score": [df_normalized.loc[i,m] for i in df_normalized.index for m in ilcd],
    "act": [names[i] for i in df_normalized.index for m in ilcd],
    "cat": [m for i in df_normalized.index for m in methods],
}

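# define colormap; Reds9 runs from dark to light, so low and high are swapped
# to draw high normalized scores in dark red
mapper = bokeh.models.LinearColorMapper(palette=Reds9, low=1, high=0)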

# plot
f.rect(
    source=data, x="cat", y="act", width=1, height=1,
    fill_color={'field': 'score', 'transform': mapper},
    line_color=None,
)

# rotate x-axis ticks
f.xaxis.major_label_orientation = np.pi / 4
    
# show plot
bokeh.io.output_notebook()
bokeh.io.show(f)
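
The plot data above is a "long" table: three parallel lists in which position k describes one cell of the heat map. The lists only line up if all three are built with the same loop order and from the same row ordering. The following sketch illustrates this on a hypothetical 2 x 3 table; the activity labels and scores are made up for illustration.

```python
# Hypothetical 2 x 3 table of normalized scores (rows: activities, columns: indicators)
activities = ["nuclear", "hard coal"]
indicators = ["climate change total", "ionising radiation", "land use"]
scores = [
    [0.0, 1.0, 0.0],
    [1.0, 0.003, 1.0],
]

n_rows, n_cols = len(activities), len(indicators)

# Flatten row by row; all three lists use the same loop order (rows outer,
# columns inner) so that position k describes one cell of the heat map.
data = {
    "score": [scores[i][j] for i in range(n_rows) for j in range(n_cols)],
    "act": [activities[i] for i in range(n_rows) for _ in range(n_cols)],
    "cat": [indicators[j] for _ in range(n_rows) for j in range(n_cols)],
}

for act, cat, score in zip(data["act"], data["cat"], data["score"]):
    print(f"{act:10s} {cat:22s} {score}")
```

If the labels were taken from a table with a different row order than the scores, every cell would still be drawn, but under the wrong activity name.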

Save the figure to disk.

In [11]:
bokeh.io.output_file("output/heatmap.html")
path = bokeh.io.save(f)

## Discussion