# Tutorial on how to use mescal
*mescal* is a [Brightway](https://docs.brightway.dev/en/latest/)-powered Python package that helps you integrate Life-Cycle Assessment (LCA) into your energy system model.

## Set up

In [1]:
# Import the required libraries
from mescal import *
import pandas as pd
import bw2data as bd

In [2]:
# Set up your Brightway project
bd.projects.set_current('ei3.8-mescal') # put the name of your brightway project here

## Your data
For mescal to understand the structure of your energy system model, you need to provide it with a set of dataframes.

### Mandatory dataframes

In [3]:
mapping = pd.read_csv('../dev/energyscope_data/mapping.csv')
unit_conversion = pd.read_csv('../dev/energyscope_data/unit_conversion.csv')
mapping_esm_flows_to_CPC = pd.read_csv('../dev/energyscope_data/mapping_esm_flows_to_CPC.csv')
model = pd.read_csv('../dev/energyscope_data/model.csv')

A mapping between the energy technologies and resources of the energy system model and Life-Cycle Inventory (LCI) datasets from an LCI database (e.g., ecoinvent). The mapping should be provided in a dataframe with the following columns:
- `Name`: the name of the energy technology or resource in the energy model 
- `Type`: the type of the energy technology or resource (i.e., 'Construction', 'Operation', or 'Resource')  
- `Product`: the name of the product of the energy technology or resource in the LCI database
- `Activity`: the name of the activity of the energy technology or resource in the LCI database
- `Location`: (optional) the name of the region of the energy technology or resource in the LCI database
- `Database`: the name of the database in your brightway project

In [4]:
mapping.head()

Unnamed: 0,Name,Type,Product,Activity,Location,Unit,Database
0,AEC_OG,Operation,"hydrogen, gaseous, 20 bar","hydrogen production, gaseous, 20 bar, from AEC...",CH,/kg,h2_electrolysis
1,AEC_OG_PLANT,Construction,"electrolyzer, 1MWe, AEC, Balance of Plant","electrolyzer production, 1MWe, AEC, Balance of...",RER,/unit,h2_electrolysis
2,AEC_OG_PLANT_DECOM,Construction,"used fuel cell balance of plant, 1MWe, AEC","treatment of fuel cell balance of plant, 1MWe,...",RER,/unit,h2_electrolysis
3,AEC_OG_STACK,Construction,"electrolyzer, 1MWe, AEC, Stack","electrolyzer production, 1MWe, AEC, Stack",RER,/unit,h2_electrolysis
4,AEC_OG_STACK_DECOM,Construction,"used fuel cell stack, 1MWe, AEC","treatment of fuel cell stack, 1MWe, AEC",RER,/unit,h2_electrolysis
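
Before passing the mapping to mescal, it can help to check that the expected columns are present. The following is a minimal sketch and not part of mescal: the helper `check_mapping` and the toy rows are hypothetical, and only the column names and the three `Type` values come from the description above.

```python
import pandas as pd

# Column names taken from the list above; `Location` is optional and not checked.
REQUIRED_COLUMNS = ["Name", "Type", "Product", "Activity", "Database"]
ALLOWED_TYPES = {"Construction", "Operation", "Resource"}


def check_mapping(df: pd.DataFrame) -> list[str]:
    """Return a list of human-readable problems found in a mapping dataframe."""
    problems = []
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        problems.append(f"missing columns: {missing}")
    if "Type" in df.columns:
        bad_types = sorted(set(df["Type"].dropna()) - ALLOWED_TYPES)
        if bad_types:
            problems.append(f"unexpected Type values: {bad_types}")
    return problems


# Toy rows (hypothetical, for illustration only)
toy_mapping = pd.DataFrame(
    {
        "Name": ["TECH_A", "TECH_A", "RES_B"],
        "Type": ["Operation", "Construction", "Ressource"],  # typo on purpose
        "Product": ["p1", "p2", "p3"],
        "Activity": ["a1", "a2", "a3"],
        "Database": ["db", "db", "db"],
    }
)

print(check_mapping(toy_mapping))
```

An empty list means that no problem was detected by these two checks; it says nothing about whether the products and activities actually exist in the LCI database.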


A set of unit conversion factors between the energy system model and the LCI database. The conversion factors should be provided in a dataframe with the following columns:
- `Name`: the name of the energy technology or resource in the energy model 
- `Type`: the type of the energy technology or resource (i.e., 'Construction', 'Operation', or 'Resource')
- `Value`: the numerical value of the conversion factor

In [5]:
unit_conversion.head()

Unnamed: 0,Name,Type,Value
0,ACETIC_ACID,Resource,247422.6804
1,ACETONE,Resource,121654.5012
2,AEC_OG,Construction,1555.555556
3,AEC_OG,Operation,30030.03003
4,AEC_OG_PLANT,Construction,1.0


A mapping between the energy model flows and CPC categories. The list of flows should at least cover the technologies in your mapping that have no CPC category, together with their foreground inventory. The mapping should be provided in a dataframe with the following columns:
- `Flow`: the name of the flow in the energy model
- `Description`: (optional) the description of the flow
- `CPC`: the list of names of corresponding CPC categories

In [6]:
mapping_esm_flows_to_CPC.head()

Unnamed: 0,Flow,Description,CPC
0,BENZENE,Benzene,"['33100: Coke and semi-coke of coal, of lignit..."
1,BIO_DIESEL,Bio-diesel,['35491: Biodiesel']
2,CO2_A,Carbon dioxide (concentrated emissions),['34210b: Carbon dioxide and monoxide']
3,CO2_C,Carbon dioxide (captured),['34210b: Carbon dioxide and monoxide']
4,CO2_CS,Carbon dioxide (captured & stored),['34210b: Carbon dioxide and monoxide']
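
Because the dataframe is read from a CSV file, each `CPC` entry shown above is a string that looks like a Python list rather than an actual list. The sketch below shows one way to inspect such an entry with the standard library. The helper `split_cpc` is hypothetical, and whether mescal expects the raw string or a parsed list is not stated here, so treat this as an inspection aid only.

```python
import ast

# Value as it appears in the `CPC` column after pd.read_csv: a string, not a list.
raw_cpc = "['35491: Biodiesel']"

# ast.literal_eval evaluates the list literal without executing arbitrary code.
cpc_list = ast.literal_eval(raw_cpc)


def split_cpc(entry: str) -> tuple[str, str]:
    """Split a 'code: name' CPC entry into (code, name)."""
    code, name = entry.split(": ", 1)
    return code, name


cpc_pairs = [split_cpc(entry) for entry in cpc_list]
print(cpc_pairs)
```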


The input and output flows of energy technologies. They should be provided in a dataframe with the following columns:
- `Name`: the name of the energy technology in the energy model
- `Flow`: the name of the input or output flow
- `Amount`: the numerical value of the flow (negative if input, positive if output)  

In [7]:
model.head()

Unnamed: 0,Name,Flow,Amount
0,DIESEL_S,DIESEL_S,1.0
1,GASOLINE_S,GASOLINE_S,1.0
2,ELEC_S,ELEC_S,1.0
3,NG_S,NG_S,1.0
4,H2_S,H2_S,1.0
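
The sign convention on `Amount` makes it easy to separate what a technology consumes from what it produces. A minimal sketch on toy data follows; the technology name, flow names, and amounts are made up for illustration and do not come from the EnergyScope data.

```python
import pandas as pd

# Toy flows for one hypothetical technology (names and values are made up)
toy_model = pd.DataFrame(
    {
        "Name": ["TECH_A", "TECH_A", "TECH_A"],
        "Flow": ["ELECTRICITY", "H2", "HEAT"],
        "Amount": [-1.5, 1.0, 0.2],
    }
)

# Negative amounts are inputs, positive amounts are outputs.
inputs = toy_model[toy_model["Amount"] < 0]
outputs = toy_model[toy_model["Amount"] > 0]

print("inputs: ", inputs["Flow"].tolist())
print("outputs:", outputs["Flow"].tolist())
```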


### Optional dataframes

In [33]:
technology_compositions = pd.read_csv('../dev/energyscope_data/technology_compositions.csv')
technology_specifics = pd.read_csv('../dev/energyscope_data/technology_specifics.csv') 
lifetime = pd.read_csv('../dev/energyscope_data/lifetime.csv')
mapping_product_to_CPC = pd.read_csv('../dev/data/mapping_product_to_CPC.csv')
impact_abbrev = pd.read_csv('../dev/IW+/impact_abbrev.csv')

A set of technology compositions, used when one technology or resource in the energy model should be represented by a combination of LCI datasets. The compositions should be provided in a dataframe with the following columns:
- `Name`: the name of the main energy technology or resource in the energy model 
- `Components`: the list of names of subcomponents

In [9]:
technology_compositions.head()

Unnamed: 0,Name,Components
0,AEC_OG,"['AEC_OG_STACK', 'AEC_OG_PLANT', 'AEC_OG_STACK..."
1,ALKALINE_ELECTROLYSIS,"['ALKALINE_ELECTROLYSIS_STACK', 'ALKALINE_ELEC..."
2,AN_DIG_SI,"['AN_DIG_SI_PLANT', 'AN_DIG_SI_COGEN']"
3,ATR,"['ATR_PLANT', 'ATR_TANK']"
4,ATR_CCS,"['ATR_CCS_PLANT', 'ATR_CCS_TANK']"
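
To see which subcomponents belong to which technology, the `Components` column can be unfolded into one row per component. This is only a sketch for inspecting the data, using two of the rows shown above. It assumes the column holds list-like strings (as it does after `pd.read_csv`), and it does not describe how mescal processes compositions internally.

```python
import ast

import pandas as pd

# Two rows copied from the output above; `Components` is a string after read_csv.
compositions = pd.DataFrame(
    {
        "Name": ["AN_DIG_SI", "ATR"],
        "Components": [
            "['AN_DIG_SI_PLANT', 'AN_DIG_SI_COGEN']",
            "['ATR_PLANT', 'ATR_TANK']",
        ],
    }
)

# Turn each string into a real list, then create one row per component.
compositions["Components"] = compositions["Components"].apply(ast.literal_eval)
long_format = compositions.explode("Components", ignore_index=True)

print(long_format)
```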


A set of technologies with specific requirements, for instance energy technologies without a construction phase, or mobility technologies and bioprocesses whose fuel in the LCI dataset does not match the fuel in the energy model. The requirements should be provided in a dataframe with the following columns:
- `Name`: the name of the energy technology in the energy model
- `Specifics`: the list of requirements  
- `Amount`: the numerical value of the requirement (if relevant)

In [10]:
technology_specifics.head()

Unnamed: 0,Name,Specifics,Amount
0,DIRECT_USAGE,No construction,
1,DOGR,No construction,
2,EOR,No construction,
3,DEEP_SALINE,No construction,
4,MINES_STORAGE,No construction,


The lifetimes of energy technologies in the energy system model (ESM) and in the LCI database. The lifetimes should be provided in a dataframe with the following columns:
- `Name`: the name of the energy technology in the energy model
- `ESM`: the numerical value of the lifetime in the energy system model (if relevant)
- `LCA`: the numerical value of the lifetime in the LCI database (if relevant)

In [11]:
lifetime.head()

Unnamed: 0,Name,ESM,LCA
0,AEC_OG,20.0,
1,AEC_OG_PLANT,,20.0
2,AEC_OG_STACK,,7.5
3,AEC_OG_PLANT_DECOM,,20.0
4,AEC_OG_STACK_DECOM,,7.5


In case you have an LCI database without CPC categories (which are necessary for the double-counting check), you can provide a mapping between the products and activities in the LCI database and the CPC categories. The mapping should be provided in a dataframe with the following columns:
- `Name`: the (partial) name of the product or activity in the LCI database
- `CPC`: the number and name of the corresponding CPC category
- `Search type`: whether the `Name` entry is exactly the name to look for (`equals`), or is contained in the full name (`contains`)
- `Where`: whether the `Name` entry is meant for products or activities

In [23]:
mapping_product_to_CPC.head()

Unnamed: 0,Name,CPC,Search type,Where
0,amine-based silica,"35310: Organic surface active agents, except soap",equals,Product
1,biodiesel,35491: Biodiesel,contains,Product
2,"biomass, used as fuel",31230: Wood in chips or particles,equals,Product
3,"biomethane, from biogas upgrading, using amine...","12020: Natural gas, liquefied or in the gaseou...",equals,Product
4,"biomethane, high pressure","12020: Natural gas, liquefied or in the gaseou...",equals,Product
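
The difference between the two search types can be illustrated with a small matching function. This is a sketch of the idea only, not mescal's implementation: `find_cpc` is a hypothetical helper, matching is assumed to be case-sensitive, the first matching rule is assumed to win, and the product names passed to it are invented examples. The three rules are copied from the first rows above.

```python
from typing import Optional

# Rules copied from the first rows above: (Name, CPC, Search type, Where)
RULES = [
    ("amine-based silica", "35310: Organic surface active agents, except soap", "equals", "Product"),
    ("biodiesel", "35491: Biodiesel", "contains", "Product"),
    ("biomass, used as fuel", "31230: Wood in chips or particles", "equals", "Product"),
]


def find_cpc(text: str, where: str) -> Optional[str]:
    """Return the CPC category of the first rule matching `text`, else None."""
    for name, cpc, search_type, rule_where in RULES:
        if rule_where != where:
            continue  # rule targets the other field (product vs. activity)
        if search_type == "equals" and text == name:
            return cpc
        if search_type == "contains" and name in text:
            return cpc
    return None


# "contains": the rule name only needs to appear somewhere in the product name.
print(find_cpc("biodiesel, from rapeseed oil", "Product"))
# "equals": a longer name does not match, so no category is found.
print(find_cpc("amine-based silica production", "Product"))
```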


An abbreviation scheme for the impact categories you aim to work with, to improve readability in the ESM. The abbreviations should be provided in a dataframe with the following columns:
- `Impact_category`: the name of the impact category, expressed as a tuple following the Brightway convention
- `Unit`: (optional) the unit of the impact category
- `Abbrev`: the abbreviation of the impact category
- `AoP`: the area of protection of the impact category

In [34]:
impact_abbrev.head()

Unnamed: 0,Impact_category,Unit,Abbrev,AoP
0,"('IMPACT World+ Damage 2.0.1', 'Ecosystem qual...",PDF.m2.yr,CCEQL,EQ
1,"('IMPACT World+ Damage 2.0.1', 'Ecosystem qual...",PDF.m2.yr,CCEQS,EQ
2,"('IMPACT World+ Damage 2.0.1', 'Ecosystem qual...",PDF.m2.yr,CCEQLB,EQ
3,"('IMPACT World+ Damage 2.0.1', 'Ecosystem qual...",PDF.m2.yr,CCEQSB,EQ
4,"('IMPACT World+ Damage 2.0.1', 'Human health',...",DALY,CCHHL,HH


## Operations on the mapping dataframe (optional)

In [None]:
# create a concatenated database of all databases in the mapping dataframe (including background requirements)
base_db = concatenate_databases(list(mapping.Database.unique()), create_pickle=True)

### Add or replace the location column based on a user-defined ranking
Based on a user-defined ranking, the location column of the mapping dataframe can be updated. The function `change_location_mapping_file` takes as input the mapping dataframe, the user-defined ranking, and the base database. It returns the mapping dataframe with the location column updated.

In [14]:
# Define the user-defined ranking
my_ranking = [
    'CA-QC',
    'CA',
    'CA-ON',
    'CA-AB',
    'CA-BC',
    'CA-MB',
    'CA-NB',
    'CA-NF',
    'CA-NS',
    'CA-NT',
    'CA-NU',
    'CA-PE',
    'CAZ',
    'RNA',
    'US',
    'USA',
    'GLO',
    'RoW'
]

In [15]:
# Update mapping dataframe
mapping_with_locations = change_location_mapping_file(mapping, my_ranking, base_db, 'QC')

In [16]:
mapping_with_locations.head()

Unnamed: 0,Name,Type,Product,Activity,Location,Unit,Database
0,AEC_OG,Operation,"hydrogen, gaseous, 20 bar","hydrogen production, gaseous, 20 bar, from AEC...",CH,/kg,h2_electrolysis
1,AEC_OG_PLANT,Construction,"electrolyzer, 1MWe, AEC, Balance of Plant","electrolyzer production, 1MWe, AEC, Balance of...",RER,/unit,h2_electrolysis
2,AEC_OG_PLANT_DECOM,Construction,"used fuel cell balance of plant, 1MWe, AEC","treatment of fuel cell balance of plant, 1MWe,...",RER,/unit,h2_electrolysis
3,AEC_OG_STACK,Construction,"electrolyzer, 1MWe, AEC, Stack","electrolyzer production, 1MWe, AEC, Stack",RER,/unit,h2_electrolysis
4,AEC_OG_STACK_DECOM,Construction,"used fuel cell stack, 1MWe, AEC","treatment of fuel cell stack, 1MWe, AEC",RER,/unit,h2_electrolysis
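
The idea behind a location ranking can be sketched in a few lines: among the locations available for a dataset, take the one that appears first in the ranking. The helper `pick_location` below is hypothetical and is not the implementation of `change_location_mapping_file`. In particular, returning `None` when nothing matches is an assumption of this sketch, and the sets of available locations are invented.

```python
def pick_location(available, ranking):
    """Return the first location in `ranking` that is in `available`, else None."""
    for location in ranking:
        if location in available:
            return location
    return None


# Shortened version of the ranking defined above
ranking = ["CA-QC", "CA", "RNA", "GLO", "RoW"]

# 'CA' is ranked higher than 'RoW', so it is selected.
print(pick_location({"RoW", "CA", "CH"}, ranking))
# None of the available locations is in the ranking.
print(pick_location({"CH", "RER"}, ranking))
```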


### Relink the mapping dataframe with another database
In this example, we relink the current mapping, which is based on ecoinvent and some additional inventories, to a [premise](https://premise.readthedocs.io/en/latest/introduction.html) database. The function `create_complementary_database` takes as input the mapping dataframe, the database to link to, and the name of a new complementary database, which is created to hold any LCI datasets that are missing from the database we relink to. It returns a new mapping dataframe with the location column updated, and creates the complementary database in the Brightway project.

In [17]:
name_premise_db = "ecoinvent_cutoff_3.8_remind_SSP2-Base_2020"
name_premise_comp_db = name_premise_db + "_comp_QC"

In [18]:
premise_db = load_extract_db(name_premise_db)

In [19]:
mapping_linked_to_premise = create_complementary_database(mapping_with_locations, premise_db, name_premise_comp_db)

No inventory in the premise database for ('EHP_H2_GRID', 'Construction')
No inventory in the premise database for ('HP_H2_GRID', 'Construction')
No inventory in the premise database for ('LCV_BIODIESEL_B100_MD', 'Construction')
No inventory in the premise database for ('LCV_BIODIESEL_B100_MD', 'Operation')
No inventory in the premise database for ('LCV_BIODIESEL_B100_SD', 'Construction')
No inventory in the premise database for ('LCV_BIODIESEL_B100_SD', 'Operation')
No inventory in the premise database for ('LCV_BIODIESEL_B20_MD', 'Operation')
No inventory in the premise database for ('LCV_BIODIESEL_B20_MD', 'Construction')
No inventory in the premise database for ('LCV_BIODIESEL_B20_SD', 'Operation')
No inventory in the premise database for ('LCV_BIODIESEL_B20_SD', 'Construction')
No inventory in the premise database for ('LCV_CNG_MD', 'Construction')
No inventory in the premise database for ('LCV_CNG_MD', 'Operation')
No inventory in the premise database for ('LCV_CNG_SD', 'Construct

Writing activities to SQLite3 database:
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00



In [24]:
mapping_linked_to_premise.head()

Unnamed: 0,Name,Type,Product,Activity,Location,Database
0,AEC_OG,Operation,"hydrogen, gaseous, 20 bar","hydrogen production, gaseous, 20 bar, from AEC...",CH,ecoinvent_cutoff_3.8_remind_SSP2-Base_2020
1,AEC_OG_PLANT,Construction,"electrolyzer, 1MWe, AEC, Balance of Plant","electrolyzer production, 1MWe, AEC, Balance of...",RER,ecoinvent_cutoff_3.8_remind_SSP2-Base_2020
2,AEC_OG_PLANT_DECOM,Construction,"used fuel cell balance of plant, 1MWe, AEC","treatment of fuel cell balance of plant, 1MWe,...",RER,ecoinvent_cutoff_3.8_remind_SSP2-Base_2020
3,AEC_OG_STACK,Construction,"electrolyzer, 1MWe, AEC, Stack","electrolyzer production, 1MWe, AEC, Stack",RER,ecoinvent_cutoff_3.8_remind_SSP2-Base_2020
4,AEC_OG_STACK_DECOM,Construction,"used fuel cell stack, 1MWe, AEC","treatment of fuel cell stack, 1MWe, AEC",RER,ecoinvent_cutoff_3.8_remind_SSP2-Base_2020


## Create a new database with additional CPC categories (optional)
In case you are working with an LCI database without CPC categories, you can create a new database that includes them. The function `create_new_database_with_CPC_categories` takes as input the database with missing CPC categories, the name of the new database, and a mapping between the products and activities in the LCI database and the CPC categories. It creates a new database with the CPC categories. This step can take a few minutes depending on the size of the database.

In [25]:
name_premise_with_CPC_db = name_premise_db + '_with_CPC'
name_premise_comp_with_CPC_db = name_premise_comp_db + '_with_CPC'

In [26]:
premise_comp_db = load_extract_db(name_premise_comp_db)

Getting activity data


100%|██████████| 37/37 [00:00<00:00, 12145.98it/s]


Adding exchange data to activities


100%|██████████| 1170/1170 [00:00<00:00, 27295.43it/s]


Filling out exchange data


100%|██████████| 37/37 [00:00<00:00, 121.14it/s]

ecoinvent_cutoff_3.8_remind_SSP2-Base_2020_comp_QC.pickle created!





In [27]:
create_new_database_with_CPC_categories(db=premise_db, new_db_name=name_premise_with_CPC_db, mapping_product_to_CPC=mapping_product_to_CPC)

Vacuuming database 


Writing activities to SQLite3 database:
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:01:05



In [28]:
create_new_database_with_CPC_categories(db=premise_comp_db, new_db_name=name_premise_comp_with_CPC_db, mapping_product_to_CPC=mapping_product_to_CPC)

Writing activities to SQLite3 database:
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00



In [29]:
premise_comp_db_with_CPC = load_extract_db(name_premise_comp_with_CPC_db, create_pickle=False)

Getting activity data


100%|██████████| 37/37 [00:00<?, ?it/s]


Adding exchange data to activities


100%|██████████| 1170/1170 [00:00<00:00, 37312.47it/s]


Filling out exchange data


100%|██████████| 37/37 [00:00<00:00, 147.60it/s]


In [30]:
relink_database(premise_comp_db_with_CPC, name_premise_db, name_premise_with_CPC_db)

Writing activities to SQLite3 database:
0% [##############################] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00



In [31]:
# Change the mapping file accordingly
mapping_linked_to_premise['Database'] += '_with_CPC'
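
Note that `+=` on a string column appends the suffix to every row each time the cell runs, so re-running the cell would produce names ending in `_with_CPC_with_CPC`. The sketch below shows a guarded variant on a toy dataframe. The helper `add_suffix_once` is hypothetical and not part of mescal, while the two database names are the ones used in this tutorial.

```python
import pandas as pd

SUFFIX = "_with_CPC"


def add_suffix_once(df, column="Database", suffix=SUFFIX):
    """Return a copy of `df` where `suffix` is appended only if not already present."""
    out = df.copy()
    needs_suffix = ~out[column].str.endswith(suffix)
    out.loc[needs_suffix, column] = out.loc[needs_suffix, column] + suffix
    return out


toy = pd.DataFrame(
    {
        "Database": [
            "ecoinvent_cutoff_3.8_remind_SSP2-Base_2020",
            "ecoinvent_cutoff_3.8_remind_SSP2-Base_2020_comp_QC",
        ]
    }
)

once = add_suffix_once(toy)
twice = add_suffix_once(once)  # second call leaves the names unchanged

print(once["Database"].tolist())
```

The resulting names match `name_premise_with_CPC_db` and `name_premise_comp_with_CPC_db` defined earlier.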

## Double-counting removal

In [None]:
new_db_name = 'energyscope_QC_2020_reg'

In [None]:
mismatch_regions = ['GLO', 'RoW', 'RER', 'CH']  # insufficient match within ecoinvent

In [None]:
create_esm_database(
    mapping=mapping_linked_to_premise,
    model=model,
    tech_specifics=technology_specifics,
    technology_compositions=technology_compositions,
    mapping_esm_flows_to_CPC_cat=mapping_esm_flows_to_CPC,
    main_database=concatenate_databases(list(mapping_linked_to_premise.Database.unique()), create_pickle=False),
    esm_db_name=new_db_name,
    regionalize_foregrounds=True,
    results_path_file='results/',
    mismatch_regions=mismatch_regions,
    target_region='CA-QC',
    locations_ranking=my_ranking
)

## Computing the LCA metrics 

In [None]:
# Load the ESM database created in the previous step (its name is stored in `new_db_name`)
esm_db = load_extract_db(new_db_name)

In [None]:
R_long = compute_impact_scores(
    esm_db=esm_db,
    mapping=mapping,
    technology_compositions=technology_compositions,
    methods=['IMPACT World+ Midpoint 2.0.1', 'IMPACT World+ Damage 2.0.1', 'IMPACT World+ Footprint 2.0.1'],
    unit_conversion=unit_conversion,
    lifetime=lifetime
)

In [None]:
R_long.head()

## Convert the results to AMPL format

### Create the .dat file

In [None]:
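# `R_long` is the output of `compute_impact_scores` above.
# `lcia_method` and `refactor` are not defined in this tutorial: set them before running this cell.
normalize_lca_metrics(
    R=R_long,
    f_norm=1e6,
    mip_gap=1e-6,
    refactor=refactor,
    lcia_method=lcia_method,
    impact_abbrev=impact_abbrev
)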

### Create the .mod file

In [None]:
gen_lcia_obj(
    lcia_method=lcia_method,
    refactor=refactor,
    impact_abbrev=impact_abbrev
)