# Tutorial on how to use mescal
*mescal* is a [Brightway](https://docs.brightway.dev/en/latest/)-powered Python package that helps you integrate Life-Cycle Assessment (LCA) into your energy system model (ESM).

## Set up

In [1]:
# Import the required libraries
from mescal import *
import pandas as pd
import bw2data as bd

In [2]:
# Set up your Brightway project
bd.projects.set_current('ei3.8-mescal') # put the name of your brightway project here

## Your data
For mescal to understand the structure of your energy system model, you need to provide it with a set of dataframes.

### Mandatory dataframes

In [3]:
mapping = pd.read_csv('../dev/energyscope_data/mapping.csv')
unit_conversion = pd.read_csv('../dev/energyscope_data/unit_conversion.csv')
mapping_esm_flows_to_CPC = pd.read_csv('../dev/energyscope_data/mapping_esm_flows_to_CPC.csv')
model = pd.read_csv('../dev/energyscope_data/model.csv')

A mapping between the energy technologies and resources of the energy system model and Life-Cycle Inventory (LCI) datasets from an LCI database (e.g., ecoinvent). The mapping should be provided in a dataframe with the following columns:
- `Name`: the name of the energy technology or resource in the energy model 
- `Type`: the type of the energy technology or resource (i.e., 'Construction', 'Operation', or 'Resource')  
- `Product`: the name of the product of the energy technology or resource in the LCI database
- `Activity`: the name of the activity of the energy technology or resource in the LCI database
- `Location`: the name of the region of the energy technology or resource in the LCI database
- `Unit`: (optional) the physical unit of the energy technology or resource in the LCI database
- `Database`: the name of the database in your brightway project

In [4]:
mapping.head()
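
Before going further, it can help to check that the mapping has the columns listed above. The sketch below is only an illustration: `missing_columns` and `REQUIRED_MAPPING_COLUMNS` are hypothetical names, not part of mescal, and the list assumes that every column not marked optional is required.

```python
# Hypothetical helper (not part of mescal): report missing mandatory columns.
# Assumption: all columns not marked "(optional)" in the list above are required.
REQUIRED_MAPPING_COLUMNS = ["Name", "Type", "Product", "Activity", "Location", "Database"]


def missing_columns(columns, required=REQUIRED_MAPPING_COLUMNS):
    """Return the required column names that are absent from `columns`, in order."""
    present = set(columns)
    return [col for col in required if col not in present]


# Illustrative header that lacks the `Location` column.
example_header = ["Name", "Type", "Product", "Activity", "Database"]
print(missing_columns(example_header))
```

With the dataframe loaded above, the same check would read `missing_columns(mapping.columns)`.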

A set of unit conversion factors between the energy system model and the LCI database. The conversion factors should be provided in a dataframe with the following columns:
- `Name`: the name of the energy technology or resource in the energy model 
- `Type`: the type of the energy technology or resource (i.e., 'Construction', 'Operation', or 'Resource')
- `Value`: the numerical value of the conversion factor
- `From`: the unit of the energy technology or resource to be converted
- `To`: the target unit of the energy technology or resource

In [5]:
unit_conversion.head()
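
To make the role of the columns concrete, here is a minimal sketch of how one row could be applied by hand. It assumes that a quantity expressed in the `From` unit is multiplied by `Value` to obtain the quantity in the `To` unit; how mescal applies the factors internally is not shown in this tutorial. The technology name `TECH_A` and the helper `convert` are made up for the example.

```python
# Hypothetical row in the shape of the unit_conversion dataframe.
# Assumption: amount_in_To_units = amount_in_From_units * Value.
conversion_rows = [
    {"Name": "TECH_A", "Type": "Operation", "Value": 3.6,
     "From": "kilowatt hour", "To": "megajoule"},
]


def convert(rows, name, type_, amount):
    """Convert `amount` with the first row matching (Name, Type)."""
    for row in rows:
        if row["Name"] == name and row["Type"] == type_:
            return amount * row["Value"]
    raise KeyError(f"no conversion factor for ({name!r}, {type_!r})")


# 2 kWh expressed in MJ (1 kWh = 3.6 MJ).
print(convert(conversion_rows, "TECH_A", "Operation", 2.0))
```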

A mapping between the energy model flows and CPC categories. The list of products should at least encompass the list of technologies without a CPC category within your mapping and their foreground inventory. The mapping should be provided in a dataframe with the following columns:
- `Product`: the name of the product in the LCI database
- `Description`: (optional) the description of the product
- `CPC`: the list of names of corresponding CPC categories

In [6]:
mapping_esm_flows_to_CPC.head()

The input and output flows of energy technologies. They should be provided in a dataframe with the following columns:
- `Name`: the name of the energy technology in the energy model
- `Flow`: the name of the input or output flow
- `Amount`: the numerical value of the flow (negative if input, positive if output)  

In [7]:
model.head()
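
The sign convention of `Amount` can be illustrated with a short sketch that separates the inputs and outputs of one technology. The rows and the helper `split_flows` are hypothetical and only mirror the column layout described above.

```python
# Hypothetical rows in the shape of the model dataframe (values are made up).
flows = [
    {"Name": "TECH_A", "Flow": "ELECTRICITY", "Amount": -1.2},
    {"Name": "TECH_A", "Flow": "HYDROGEN", "Amount": 1.0},
    {"Name": "TECH_B", "Flow": "ELECTRICITY", "Amount": 1.0},
]


def split_flows(rows, technology):
    """Return (inputs, outputs) of `technology` as dicts of positive amounts."""
    inputs, outputs = {}, {}
    for row in rows:
        if row["Name"] != technology:
            continue
        if row["Amount"] < 0:       # negative amount: the flow is consumed
            inputs[row["Flow"]] = -row["Amount"]
        elif row["Amount"] > 0:     # positive amount: the flow is produced
            outputs[row["Flow"]] = row["Amount"]
    return inputs, outputs


print(split_flows(flows, "TECH_A"))
```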

### Optional dataframes

In [4]:
technology_compositions = pd.read_csv('../dev/energyscope_data/technology_compositions.csv')
technology_specifics = pd.read_csv('../dev/energyscope_data/technology_specifics.csv') 
lifetime = pd.read_csv('../dev/energyscope_data/lifetime.csv')
mapping_product_to_CPC = pd.read_csv('../dev/data/mapping_product_to_CPC.csv')
impact_abbrev = pd.read_csv('../dev/IW+/impact_abbrev.csv')
technologies_to_remove_from_layers = pd.read_csv('../dev/energyscope_data/technologies_to_remove_from_layers.csv')
new_end_use_types = pd.read_csv('../dev/energyscope_data/new_end_use_types.csv')

A set of technology compositions, for cases where one technology or resource in the energy model should be represented by a combination of several LCI datasets. The compositions should be provided in a dataframe with the following columns:
- `Name`: the name of the main energy technology or resource in the energy model 
- `Components`: the list of names of subcomponents

In [9]:
technology_compositions.head()
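
As an illustration of what a composition means, the sketch below expands a technology name into its subcomponents and leaves any other name unchanged. The names and the helper `expand` are hypothetical; this is one reading of the table, not mescal's implementation.

```python
# Hypothetical composition table: main technology -> list of subcomponents.
compositions = {
    "TECH_MAIN": ["TECH_MAIN_PLANT", "TECH_MAIN_STACK"],
}


def expand(name, compositions):
    """Return the subcomponents of `name`, or [name] if it has no composition."""
    return list(compositions.get(name, [name]))


print(expand("TECH_MAIN", compositions))
print(expand("TECH_OTHER", compositions))
```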

A set of technologies with specific requirements. Examples include energy technologies without a construction phase, and mobility technologies or bioprocesses whose fuel in the LCI dataset does not match the one in the energy model. The requirements should be provided in a dataframe with the following columns:
- `Name`: the name of the energy technology in the energy model
- `Specifics`: the list of requirements  
- `Amount`: the numerical value of the requirement (if relevant)

In [10]:
technology_specifics.head()

The lifetimes of energy technologies in the ESM and in the LCI database. The lifetimes should be provided in a dataframe with the following columns:
- `Name`: the name of the energy technology in the energy model
- `ESM`: the numerical value of the lifetime in the energy system model (if relevant)
- `LCA`: the numerical value of the lifetime in the LCI database (if relevant)

In [11]:
lifetime.head()

If you have an LCI database without CPC categories (which are necessary for the double-counting check), you can provide a mapping between the products and activities in the LCI database and the CPC categories. The mapping should be provided in a dataframe with the following columns:
- `Name`: the (partial) name of the product or activity in the LCI database
- `CPC`: the number and name of the corresponding CPC category
- `Search type`: whether the `Name` entry is exactly the name to look for, or is contained in the full name
- `Where`: whether the `Name` entry is meant for products or activities

In [12]:
mapping_product_to_CPC.head()

An abbreviation scheme for the impact categories you aim to work with, to improve readability in the ESM. The abbreviations should be provided in a dataframe with the following columns:
- `Impact_category`: the name of the impact category, expressed as a tuple following the Brightway convention
- `Unit`: (optional) the unit of the impact category
- `Abbrev`: the abbreviation of the impact category
- `AoP`: the area of protection of the impact category

In [13]:
impact_abbrev.head()

If you want to remove some energy technologies from energy layers in the resulting LCI datasets that are added to the LCI database, you can provide a list of technologies to remove. The list should be provided in a dataframe with the following columns:
- `Layers`: the name of the layer(s). A layer is an energy vector that is an output of some energy technologies and an input to others.
- `Technologies`: the name of the energy technology to remove from the layer(s)
- `Comment`: (optional) a comment on the removal

In [5]:
technologies_to_remove_from_layers.head()

| Unnamed: 0 | Layers | Technologies | Comment |
|---|---|---|---|
| 0 | ['ELECTRICITY_EHV', 'ELECTRICITY_HV'] | ['TRAFO_HE','TRAFO_EH'] | The high and extra high voltage electricity ar... |
| 1 | ['ELECTRICITY_LV'] | ['STO_ELEC'] | The storage technologies should be removed (pr... |
| 2 | ['NG_HP', 'NG_EHP'] | ['NG_EXP_EH', 'NG_EXP_EH_COGEN', 'NG_COMP_HE',... | The high and extra high pressure natural gas a... |
| 3 | ['H2_LP', 'H2_MP', 'H2_HP', 'H2_EHP'] | ['H2_COMP_HE', 'H2_COMP_MH', 'H2_COMP_LM', 'H2... | All pressure levels for hydrogen are merged in... |
| 4 | ['HEAT_HIGH_T', 'HEAT_LOW_T_DHN'] | ['HT_LT'] | The high and low heat production at the DHN le... |
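
In the output above, the `Layers` and `Technologies` cells look like Python lists, but after `pd.read_csv` they are plain strings. If you need real lists for your own checks, one option is `ast.literal_eval` from the standard library, as sketched below. The helper `parse_list_cell` is hypothetical, and whether mescal expects strings or parsed lists is not stated in this tutorial.

```python
import ast


def parse_list_cell(cell):
    """Turn a cell such as "['A', 'B']" into the Python list ['A', 'B']."""
    value = ast.literal_eval(cell)  # only evaluates literals, never arbitrary code
    if not isinstance(value, list):
        raise ValueError(f"expected a list literal, got {cell!r}")
    return value


# Cell contents copied from row 1 of the output above.
layers = parse_list_cell("['ELECTRICITY_LV']")
technologies = parse_list_cell("['STO_ELEC']")
print(layers, technologies)
```

On the dataframe, the same helper could be applied column-wise, for instance `technologies_to_remove_from_layers['Layers'].apply(parse_list_cell)`.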


If you want to reformat your end-use types (e.g., output layer) to better fit the LCI database, you can provide a list of new end-use types. The list should be provided in a dataframe with the following columns:
- `Name`: the name of the technologies for which the end-use type should be changed
- `Search type`: whether the `Name` entry is exactly the name to look for (`equals`), is contained in the full name (`contains`), or is the beginning of the full name (`startswith`)
- `Old`: the current end-use type
- `New`: the new end-use type

In [6]:
new_end_use_types.head()

| Unnamed: 0 | Name | Search type | Old | New |
|---|---|---|---|---|
| 0 | BUS | startswith | MOB_PUBLIC | MOB_PUBLIC_BUS |
| 1 | SCHOOLBUS | startswith | MOB_PUBLIC | MOB_PUBLIC_SCHOOLBUS |
| 2 | COACH | startswith | MOB_PUBLIC | MOB_PUBLIC_COACH |
| 3 | TRAIN | startswith | MOB_PUBLIC | MOB_PUBLIC_TRAIN |
| 4 | CAR | startswith | MOB_PRIVATE | MOB_PRIVATE_CAR |
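
The three search types can be read as simple string tests. The sketch below applies two of the rules shown above to a technology name. It is an interpretation for illustration only: the helpers are hypothetical, and it assumes that a rule applies when the current end-use type equals `Old` and that the first matching rule wins, which may differ from what mescal does.

```python
# Two rules copied from the output above.
rules = [
    {"Name": "BUS", "Search type": "startswith", "Old": "MOB_PUBLIC", "New": "MOB_PUBLIC_BUS"},
    {"Name": "CAR", "Search type": "startswith", "Old": "MOB_PRIVATE", "New": "MOB_PRIVATE_CAR"},
]


def name_matches(technology, pattern, search_type):
    """Test `technology` against `pattern` according to the search type."""
    if search_type == "equals":
        return technology == pattern
    if search_type == "contains":
        return pattern in technology
    if search_type == "startswith":
        return technology.startswith(pattern)
    raise ValueError(f"unknown search type: {search_type!r}")


def new_end_use_type(technology, current_type, rules):
    """Return the new end-use type, or `current_type` if no rule applies."""
    for rule in rules:
        # Assumption: the first rule whose `Old` type and name test both match wins.
        if rule["Old"] == current_type and name_matches(
            technology, rule["Name"], rule["Search type"]
        ):
            return rule["New"]
    return current_type


# Assumes, for illustration, that BUS_DIESEL_SD currently has end-use type MOB_PUBLIC.
print(new_end_use_type("BUS_DIESEL_SD", "MOB_PUBLIC", rules))
```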


## Create a new database with additional CPC categories (optional)
If you are working with an LCI database without CPC categories, you can create a new database that includes them. The function `create_new_database_with_CPC_categories` takes as input the database with missing CPC categories, the name of the new database, and a mapping between the products and activities in the LCI database and the CPC categories. It creates a new database with the CPC categories. This step can take a few minutes depending on the size of the database.

In [6]:
name_premise_db = "ecoinvent_cutoff_3.8_remind_SSP2-Base_2020"
name_premise_with_CPC_db = name_premise_db + '_with_CPC'

In [6]:
premise_db = load_extract_db(name_premise_db)

In [8]:
create_new_database_with_CPC_categories(db=premise_db, new_db_name=name_premise_with_CPC_db, mapping_product_to_CPC=mapping_product_to_CPC)

## Operations on the mapping dataframe (optional)

In [7]:
premise_db_with_CPC = load_extract_db(name_premise_with_CPC_db)

### Relink the mapping dataframe with another database
In this example, we relink the current mapping, which is based on ecoinvent and some additional inventories, with a [premise](https://premise.readthedocs.io/en/latest/introduction.html) database. The function `create_complementary_database` takes as input the mapping dataframe, the database to link to, and the name of a newly created complementary database, which is needed in case some LCI datasets are missing from the LCI database we relink to. It returns a new mapping dataframe with the location column updated, and creates the complementary database in the Brightway project.

In [8]:
name_premise_comp_db = name_premise_db + "_comp_QC"
name_premise_comp_with_CPC_db = name_premise_comp_db + '_with_CPC'

In [9]:
mapping_linked_to_premise = create_complementary_database(mapping, premise_db_with_CPC, name_premise_comp_with_CPC_db)

No inventory in the premise database for ('EHP_H2_GRID', 'Construction')
No inventory in the premise database for ('HP_H2_GRID', 'Construction')
No inventory in the premise database for ('LCV_BIODIESEL_B100_MD', 'Construction')
No inventory in the premise database for ('LCV_BIODIESEL_B100_MD', 'Operation')
No inventory in the premise database for ('LCV_BIODIESEL_B100_SD', 'Operation')
No inventory in the premise database for ('LCV_BIODIESEL_B100_SD', 'Construction')
No inventory in the premise database for ('LCV_BIODIESEL_B20_MD', 'Construction')
No inventory in the premise database for ('LCV_BIODIESEL_B20_MD', 'Operation')
No inventory in the premise database for ('LCV_BIODIESEL_B20_SD', 'Construction')
No inventory in the premise database for ('LCV_BIODIESEL_B20_SD', 'Operation')
No inventory in the premise database for ('LCV_CNG_MD', 'Construction')
No inventory in the premise database for ('LCV_CNG_MD', 'Operation')
No inventory in the premise database for ('LCV_CNG_SD', 'Construct

In [10]:
premise_comp_db = load_extract_db(name_premise_comp_with_CPC_db)

In [11]:
create_new_database_with_CPC_categories(db=premise_comp_db, new_db_name=name_premise_comp_with_CPC_db, mapping_product_to_CPC=mapping_product_to_CPC)

In [12]:
premise_comp_db = [] # free memory

In [13]:
mapping_linked_to_premise.head()

| Unnamed: 0 | Name | Type | Product | Activity | Location | Database |
|---|---|---|---|---|---|---|
| 0 | AEC_OG | Operation | hydrogen, gaseous, 20 bar | hydrogen production, gaseous, 20 bar, from AEC... | CH | ecoinvent_cutoff_3.8_remind_SSP2-Base_2020_wit... |
| 1 | AEC_OG_PLANT | Construction | electrolyzer, 1MWe, AEC, Balance of Plant | electrolyzer production, 1MWe, AEC, Balance of... | RER | ecoinvent_cutoff_3.8_remind_SSP2-Base_2020_wit... |
| 2 | AEC_OG_PLANT_DECOM | Construction | used fuel cell balance of plant, 1MWe, AEC | treatment of fuel cell balance of plant, 1MWe,... | RER | ecoinvent_cutoff_3.8_remind_SSP2-Base_2020_wit... |
| 3 | AEC_OG_STACK | Construction | electrolyzer, 1MWe, AEC, Stack | electrolyzer production, 1MWe, AEC, Stack | RER | ecoinvent_cutoff_3.8_remind_SSP2-Base_2020_wit... |
| 4 | AEC_OG_STACK_DECOM | Construction | used fuel cell stack, 1MWe, AEC | treatment of fuel cell stack, 1MWe, AEC | RER | ecoinvent_cutoff_3.8_remind_SSP2-Base_2020_wit... |


### Add or replace the location column based on a user-defined ranking
Based on a user-defined ranking, the location column of the mapping dataframe can be updated. The function `change_location_mapping_file` takes as input the mapping dataframe, the user-defined ranking, and the base database. It returns the mapping dataframe with the location column updated.

In [16]:
# create a concatenated database of all databases in the mapping dataframe (including background requirements)
base_db = concatenate_databases(list(mapping_linked_to_premise.Database.unique()))

In [17]:
# Define the user-defined ranking
my_ranking = [
    'CA-QC', # Quebec
    'CA', # Canada
    'CA-ON', # Other Canadian provinces 
    'CA-AB',
    'CA-BC',
    'CA-MB',
    'CA-NB',
    'CA-NF',
    'CA-NS',
    'CA-NT',
    'CA-NU',
    'CA-PE',
    'CAZ', # Canada - Australia - New Zealand
    'RNA', # North America
    'US', # United States
    'USA', # United States
    'GLO', # Global average 
    'RoW', # Rest of the world
]
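
One way to read such a ranking is "take the first location in the list for which a dataset exists, and keep the current location otherwise". The sketch below shows that reading with a shortened copy of the ranking. The helper `pick_location` is hypothetical, and mescal's own selection logic may differ.

```python
def pick_location(available, ranking, current=None):
    """Return the highest-ranked location found in `available`, else `current`."""
    for location in ranking:
        if location in available:
            return location
    return current


# Shortened copy of the ranking defined above.
ranking_excerpt = ["CA-QC", "CA", "RNA", "GLO", "RoW"]

# Datasets exist for CH, CA and GLO: CA comes first in the ranking.
print(pick_location({"CH", "CA", "GLO"}, ranking_excerpt, current="CH"))
# No ranked location is available: the current location is kept.
print(pick_location({"CH", "RER"}, ranking_excerpt, current="CH"))
```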

In [18]:
# Update mapping dataframe
mapping_linked_to_premise = change_location_mapping_file(mapping_linked_to_premise, my_ranking, base_db, 'QC')

In [19]:
mapping_linked_to_premise.head()

| Unnamed: 0 | Name | Type | Product | Activity | Location | Database |
|---|---|---|---|---|---|---|
| 0 | AEC_OG | Operation | hydrogen, gaseous, 20 bar | hydrogen production, gaseous, 20 bar, from AEC... | CH | ecoinvent_cutoff_3.8_remind_SSP2-Base_2020_wit... |
| 1 | AEC_OG_PLANT | Construction | electrolyzer, 1MWe, AEC, Balance of Plant | electrolyzer production, 1MWe, AEC, Balance of... | RER | ecoinvent_cutoff_3.8_remind_SSP2-Base_2020_wit... |
| 2 | AEC_OG_PLANT_DECOM | Construction | used fuel cell balance of plant, 1MWe, AEC | treatment of fuel cell balance of plant, 1MWe,... | RER | ecoinvent_cutoff_3.8_remind_SSP2-Base_2020_wit... |
| 3 | AEC_OG_STACK | Construction | electrolyzer, 1MWe, AEC, Stack | electrolyzer production, 1MWe, AEC, Stack | RER | ecoinvent_cutoff_3.8_remind_SSP2-Base_2020_wit... |
| 4 | AEC_OG_STACK_DECOM | Construction | used fuel cell stack, 1MWe, AEC | treatment of fuel cell stack, 1MWe, AEC | RER | ecoinvent_cutoff_3.8_remind_SSP2-Base_2020_wit... |


In [20]:
# Save the relinked mapping file
mapping_linked_to_premise.to_csv('../dev/energyscope_data/mapping_linked.csv', index=False)

## Double-counting removal

In [27]:
# To skip the previous steps
mapping_linked_to_premise = pd.read_csv('../dev/energyscope_data/mapping_linked.csv')

In [21]:
# Drop the rows of type 'Flow' from the mapping
mapping_linked_to_premise = mapping_linked_to_premise[mapping_linked_to_premise['Type'] != 'Flow']

In [22]:
new_db_name = 'energyscope_QC_2020'

In [23]:
regionalize_foregrounds = False

In [24]:
accepted_locations = ['CA-QC', 'CA']  # sufficient match within ecoinvent

In [25]:
base_db = concatenate_databases(list(mapping_linked_to_premise.Database.unique()))

In [26]:
mapping_linked_to_premise_new_code = create_esm_database(
    mapping=mapping_linked_to_premise,
    model=model,
    tech_specifics=technology_specifics,
    technology_compositions=technology_compositions,
    mapping_esm_flows_to_CPC_cat=mapping_esm_flows_to_CPC,
    main_database=base_db,
    esm_db_name=new_db_name,
    regionalize_foregrounds=regionalize_foregrounds,
    accepted_locations=accepted_locations,
    target_region='CA-QC',
    locations_ranking=my_ranking
)

AEC_OG
ALKALINE_ELECTROLYSIS
AL_MAKING
AL_MAKING_HR
AN_DIG
AN_DIG_SI
ATR
ATR_CCS
BIOGAS_ATR
BIOGAS_ATR_CCS
BIOGAS_BIOMETHANE
BIOGAS_SMR
BIOGAS_SMR_CCS
BIOMASS_ETHANOL
BIOMASS_GAS_EF_H2
BIOMASS_GAS_EF_H2_CCS
BIOMASS_GAS_FB_H2
BIOMASS_GAS_FB_H2_CCS
BUS_BIODIESEL_B100_SD
BUS_BIODIESEL_B20_SD
BUS_CNG_SD
BUS_DIESEL_SD
BUS_EV_SD
BUS_FC_CH4_SD
BUS_FC_H2_SD
BUS_HY_DIESEL_SD
BUS_PROPANE_SD
CAR_BIODIESEL_B100_ELD
CAR_BIODIESEL_B100_LD
CAR_BIODIESEL_B100_MD
CAR_BIODIESEL_B100_SD
CAR_BIODIESEL_B20_ELD
CAR_BIODIESEL_B20_LD
CAR_BIODIESEL_B20_MD
CAR_BIODIESEL_B20_SD
CAR_CNG_ELD
CAR_CNG_LD
CAR_CNG_MD
CAR_CNG_SD
CAR_DIESEL_ELD
CAR_DIESEL_LD
CAR_DIESEL_MD
CAR_DIESEL_SD
CAR_ETOH_E10_ELD
CAR_ETOH_E10_LD
CAR_ETOH_E10_MD
CAR_ETOH_E10_SD
CAR_ETOH_E85_ELD
CAR_ETOH_E85_LD
CAR_ETOH_E85_MD
CAR_ETOH_E85_SD
CAR_EV_ELD
CAR_EV_LD
CAR_EV_MD
CAR_EV_SD
CAR_FC_CH4_ELD
CAR_FC_CH4_LD
CAR_FC_CH4_MD
CAR_FC_CH4_SD
CAR_FC_H2_ELD
CAR_FC_H2_LD
CAR_FC_H2_MD
CAR_FC_H2_SD
CAR_GASOLINE_ELD
CAR_GASOLINE_LD
CAR_GASOLINE_MD
CAR_GASOLI

## Computing the LCA metrics 

In [28]:
esm_db = load_extract_db(new_db_name)

In [29]:
R_long = compute_impact_scores(
    esm_db=esm_db,
    mapping=mapping_linked_to_premise_new_code,
    technology_compositions=technology_compositions,
    methods=['IMPACT World+ Midpoint 2.0.1', 'IMPACT World+ Damage 2.0.1', 'IMPACT World+ Footprint 2.0.1'],
    unit_conversion=unit_conversion,
    lifetime=lifetime
)

In [30]:
R_long.head()

In [31]:
R_long.to_csv('results/impact_scores.csv', index=False)

## Convert the results to AMPL format

In [5]:
# To skip the previous steps
# R_long = pd.read_csv('results/impact_scores.csv')

In [6]:
refactor = 1e-2
lcia_method = 'IMPACT World+ Damage 2.0.1 - Total only'

### Create the .dat file

In [7]:
normalize_lca_metrics(
    R=R_long,
    f_norm=1e6,
    mip_gap=1e-6,
    refactor=refactor,
    lcia_method=lcia_method,
    impact_abbrev=impact_abbrev
)

### Create the .mod file

In [8]:
gen_lcia_obj(
    lcia_method=lcia_method,
    refactor=refactor,
    impact_abbrev=impact_abbrev
)

## Integrate the ESM results back into the LCI database

In [None]:
annual_prod = pd.read_csv('../dev/energyscope_data/results_ES.csv') # put the path to your ESM results here

In [None]:
create_new_database_with_esm_results(
    mapping=mapping_linked_to_premise,
    model=model,
    esm_location='CA-QC',
    esm_results=annual_prod,
    unit_conversion=unit_conversion,
    db=base_db,
    locations_ranking=my_ranking,
    accepted_locations=['CA-QC', 'CA'],
    new_db_name='ecoinvent_cutoff_3.8_remind_SSP2-Base_2020_ESM',
    technology_compositions=technology_compositions,
    tech_specifics=technology_specifics,
    mapping_esm_flows_to_CPC_cat=mapping_esm_flows_to_CPC,
    tech_to_remove_layers=technologies_to_remove_from_layers,
    new_end_use_types=new_end_use_types,
)