# Use examples of [premise](https://github.com/romainsacchi/premise)

Author: [romainsacchi](https://github.com/romainsacchi)

This notebook shows examples of how to use `premise` to adapt the life cycle inventory database [ecoinvent](https://www.ecoinvent.org/) for prospective environmental impact assessment.

This library extracts useful information from IAM output files (such as those of REMIND or IMAGE) and aligns inventories in the ecoinvent database accordingly.

With version `2.2.0`, the following transformations are available:

* `update("biomass")`: creates regional biomass markets, adjusting the share of residual vs. purpose-grown biomass for use in heat and power generation
* `update("battery")`: creates a global mix of stationary and mobile battery technologies.
* `update("electricity")`: creates regional electricity markets and adjusts the efficiency of power plants, including that of photovoltaic panels
* `update("cement")`: creates regional markets for clinker production and adjusts clinker production efficiency
* `update("steel")`: creates regional markets for steel and adjusts steel production efficiency and the supply of secondary steel
* `update("dac")`: creates region- and scenario-specific inventories for Direct Air Capture (DAC) and Carbon Storage (DACCS) systems.
* `update("fuels")`: creates regional markets for liquid and gaseous fuels
* `update("heat")`: regionalizes some heat and steam generation datasets (working on diesel, biomass, and natural gas)
* `update("emissions")`: adjusts emission of pollutants (PM, NOx, VOCs) for various activities based on GAINS model projections.
* `update("two_wheelers")`: imports two-wheelers (bicycles, motorbikes, etc.)
* `update("cars")`: produces fleet average cars and relinks to activities consuming passenger car transport
* `update("trucks")`: produces fleet average trucks and relinks to activities consuming lorry transport
* `update("buses")`: imports buses (urban and coach buses, single-deckers and double-deckers)
* `update("trains")`: imports trains
* `update("external")`: runs any external scenarios provided.

Alternatively, `update()` performs all the transformations mentioned above.
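
Since each transformation is selected by a plain string, a small guard can catch a mistyped sector name before a long run starts. The sketch below is only an illustration: `KNOWN_SECTORS` mirrors the list above, and `check_sectors` is a hypothetical helper that is not part of `premise`.

```python
# Sector names copied from the list above (premise 2.2.0).
KNOWN_SECTORS = (
    "biomass", "battery", "electricity", "cement", "steel", "dac", "fuels",
    "heat", "emissions", "two_wheelers", "cars", "trucks", "buses", "trains",
    "external",
)


def check_sectors(sectors):
    """Return the sector names as a list, or raise if one is not in the list."""
    unknown = [s for s in sectors if s not in KNOWN_SECTORS]
    if unknown:
        raise ValueError(f"Unknown sector(s): {', '.join(unknown)}")
    return list(sectors)


# Hypothetical usage with an existing `NewDatabase` instance called `ndb`:
# for sector in check_sectors(["electricity", "steel"]):
#     ndb.update(sector)
```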

Integration of user-defined scenarios (e.g., `update("external")`) is also possible, and we have a separate notebook for this.

Additional documentation on the methodology is available [here](https://premise.readthedocs.io/en/latest/introduction.html).

There's also a **publication** about `premise` [here](https://www.sciencedirect.com/science/article/pii/S136403212200226X?via%3Dihub).

## Requirements

* **Python 3.10 or 3.11 is highly recommended**
* a user license for ecoinvent v.3
* a **decryption key**, to be asked from [Romain Sacchi](mailto:romain.sacchi@psi.ch)
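
The Python requirement can be checked at the top of a notebook. This is a minimal sketch; `is_recommended_python` is a hypothetical helper, and the bounds are simply the versions named above.

```python
import sys

# Recommended range as stated in the requirements above.
MIN_VERSION = (3, 10)
MAX_VERSION = (3, 11)


def is_recommended_python(version=None):
    """Return True if (major, minor) falls within the recommended range."""
    major, minor = (version or sys.version_info)[:2]
    return MIN_VERSION <= (major, minor) <= MAX_VERSION


if not is_recommended_python():
    print("Warning: Python %d.%d is outside the recommended range." % sys.version_info[:2])
```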

# Use case with [brightway2](https://brightway.dev/)

`brightway2` is an open source LCA framework for Python.
To use `premise` with `brightway2`, you need an activated `brightway2` project that contains a `biosphere3` database as well as an [ecoinvent](https://ecoinvent.org) v.3 cut-off or consequential database. Please refer to the brightway [documentation](https://brightway.dev) if you do not know how to create a project and install ecoinvent.

In [3]:
from premise import *
import bw2data

### List of available scenarios

Some scenarios come installed with the library.
They are stored in `premise/data/iam_output_files`, relative to the root directory of the library.
They are defined across different SSPs. For example, SSP2 (nicknamed "middle of the road") describes a future world (in terms of GDP and demographic development, education, intergovernmental collaboration) very much in line with what has been observed historically.

They are offered in combination with different climate mitigation targets, called Representative Concentration Pathways (RCP).
Read more about SSPs and RCPs [here](https://www.carbonbrief.org/explainer-how-shared-socioeconomic-pathways-explore-future-climate-change).

With REMIND, we have the following SSP/RCP scenarios:
* "SSP1-Base"
* "SSP1-NPi"
* "SSP1-NDC"
* "SSP1-PkBudg1150"
* "SSP1-PkBudg500"
* "SSP2-Base"
* "SSP2-NPi"
* "SSP2-NDC"
* "SSP2-PkBudg1150"
* "SSP2-PkBudg500"
* "SSP5-Base"
* "SSP5-NPi"
* "SSP5-NDC"
* "SSP5-PkBudg1150"
* "SSP5-PkBudg500"

With IMAGE, we have the following SSP/RCP scenarios:
* "SSP1-Base"
* "SSP2-Base"
* "SSP2-RCP26"
* "SSP2-RCP19"

With TIAM-UCL, the following SSP/RCP scenarios are available:
* "SSP2-Base"
* "SSP2-RCP45"
* "SSP2-RCP26"
* "SSP2-RCP19"


Refer to [the documentation](https://premise.readthedocs.io/en/latest/extract.html#current-iam-scenarios) for a description of these scenarios, or have a look at our **[scenario explorer](https://premisedash-6f5a0259c487.herokuapp.com/)**.
Additionally, [this blog post](https://www.carbonbrief.org/explainer-how-shared-socioeconomic-pathways-explore-future-climate-change/) is good reading material for understanding SSPs and RCPs.
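
The lists above can be kept in a small lookup table to validate a scenario before passing it on. This is a sketch under assumptions: `make_scenario` is a hypothetical helper, and the `"tiam-ucl"` key is an assumed spelling of the model name (only `"remind"` and `"image"` appear in the examples of this notebook).

```python
# Pathways listed above, keyed by IAM model name.
SSPS = ("SSP1", "SSP2", "SSP5")
REMIND_TARGETS = ("Base", "NPi", "NDC", "PkBudg1150", "PkBudg500")

AVAILABLE_PATHWAYS = {
    "remind": [f"{ssp}-{target}" for ssp in SSPS for target in REMIND_TARGETS],
    "image": ["SSP1-Base", "SSP2-Base", "SSP2-RCP26", "SSP2-RCP19"],
    # Assumption: model key spelled "tiam-ucl".
    "tiam-ucl": ["SSP2-Base", "SSP2-RCP45", "SSP2-RCP26", "SSP2-RCP19"],
}


def make_scenario(model, pathway, year):
    """Build a scenario dictionary after checking it against the lists above."""
    if pathway not in AVAILABLE_PATHWAYS.get(model, []):
        raise ValueError(f"{pathway!r} is not listed for model {model!r}")
    return {"model": model, "pathway": pathway, "year": int(year)}


scenario = make_scenario("image", "SSP2-RCP19", 2050)
```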


### Database creation from default scenarios

To create databases using IMAGE's SSP2-RCP19 and REMIND's SSP2-PkBudg500 pathways, from ecoinvent 3.10 for the year 2050, one would execute the following. This leads to the extraction of the database, some cleanup, and the import of a few additional inventories.

First, activate the brightway2 project where you want to store the database.

In [None]:
# activate the bw project
bw2data.projects.set_current("premise")
# list currently installed databases
bw2data.databases

In [None]:
ndb = NewDatabase(
    scenarios=[
        {"model":"image", "pathway":"SSP2-RCP19", "year":2050},
        {"model":"remind", "pathway":"SSP2-PkBudg500", "year":2050},
    ],
    source_db="ecoinvent-3.10-cutoff", # <-- name of the database in the BW2 project. Must be a string.
    source_version="3.10", # <-- version of ecoinvent. Can be "3.8", "3.9" or "3.10". Must be a string.
    key='tUePmX_S5B8ieZkkM7WUU2CnO8SmShwmAeWK9x2rTFo=', # <-- decryption key
    # to be requested from the library maintainers if you want to use default scenarios included in `premise`
    keep_source_db_uncertainty=False, # False by default, set to True if you want to keep ecoinvent's uncertainty data
    keep_imports_uncertainty=False, # False by default, set to True if you want to keep the uncertainty data of the additional inventories
    use_absolute_efficiency=True, # False by default, set to True if you want to use the IAM's absolute efficiencies
)

Here is a list of all arguments that can be passed to `NewDatabase()`:
    
    scenarios: List[dict], # list of scenarios to process
    source_version: str = "3.10", # version of ecoinvent database. Can be "3.6", "3.7", "3.8", "3.9" or "3.10".
    source_type: str = "brightway", # type of source database. Can be "brightway" or "ecospold".
    key: bytes = None, # decryption key
    source_db: str = None, # name of database if source_type is "brightway"
    source_file_path: str = None, # path to ecospold files if source_type is "ecospold"
    additional_inventories: List[dict] = None, # list of additional inventories to import
    system_model: str = "cutoff", # system model. Can be "cutoff" or "consequential".
    system_args: dict = None, # arguments for the "consequential" system model
    use_cached_inventories: bool = True, # use cached inventories
    use_cached_database: bool = True, # use cached database
    quiet=False, # suppress output
    keep_imports_uncertainty=False, # keep uncertainty data of additional inventories
    keep_source_db_uncertainty=False, # keep uncertainty data of source database
    gains_scenario="CLE", # air pollution GAINS scenario
    use_absolute_efficiency=False, # use IAM's absolute efficiencies instead of efficiencies relative to 2020
    biosphere_name: str = "biosphere3", # name of biosphere database in brightway project if different from "biosphere3"

The first time you create a premise database, *premise* stores a copy of the ecoinvent database and external inventories, so that this time-consuming step can be skipped next time. If you wish to clear this cache (which is only encouraged when updating premise or when encountering issues with inventories), do:

In [None]:
clear_cache() # clears both ecoinvent and additional inventories cache
clear_inventory_cache() # clears only additional inventories cache

If you do not want to integrate the IAM projections in the database, but only wish to have the additional inventories, you can stop here and export the database back to Brightway or other destinations, by using the `write_db_to` methods, like so:

In [None]:
ndb.write_db_to_brightway()

However, if you wish first to proceed with the IAM integration, you need to use the `update()` method, like so for the electricity sector:

In [None]:
ndb.update("electricity")

In [None]:
ndb.write_db_to_brightway()

If you want to create multiple databases at once, just populate the `scenarios` list.

In [None]:
ndb = NewDatabase(
            scenarios=[
                {"model":"remind", "pathway":"SSP2-Base", "year":2020},
                {"model":"remind", "pathway":"SSP2-Base", "year":2030},
                {"model":"remind", "pathway":"SSP2-Base", "year":2040},
                {"model":"remind", "pathway":"SSP2-Base", "year":2050},
            ],
            source_db="ecoinvent 3.7 cutoff", # <-- name of the database. Must be a string.
            source_version="3.7.1", # <-- version of ecoinvent. Can be "3.5", "3.6", "3.7" or "3.7.1"
            key='xxxxxxxxxxxxxxxxxxxxxxxxx'
)
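
The `scenarios` list above can also be generated rather than typed out. A minimal sketch, where `scenarios_for_years` is a hypothetical helper and not part of `premise`:

```python
def scenarios_for_years(model, pathway, years):
    """Build one scenario dictionary per year, in the order given."""
    return [{"model": model, "pathway": pathway, "year": int(y)} for y in years]


# Same four scenarios as in the cell above: 2020, 2030, 2040 and 2050.
scenarios = scenarios_for_years("remind", "SSP2-Base", range(2020, 2051, 10))
```

The resulting list can then be passed as the `scenarios` argument.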



When the database is loaded and the additional inventories imported, you can apply a transformation function.
For example, here we adjust the efficiency of the power plants to the scenarios we have loaded.
We go into more detail later.

In [None]:
ndb.update("electricity")

Or you can proceed instead to doing all the sectoral transformations available, like so:

In [None]:
ndb.update() # <- updates all sectors

And then, we register these databases back into brightway2.

In [None]:
ndb.write_db_to_brightway()

### Consequential

`premise` can read in the consequential version of ecoinvent.
Based on the publication of Maes et al. 2023 (https://doi.org/10.1016/j.rser.2023.113830), `premise` builds marginal market mixes for electricity and fuels.
Passing a series of arguments to `NewDatabase()` can influence the identification of marginal suppliers.
Additionally, `premise` removes secondary steel technologies from steel markets.

In [None]:
from premise import *
from datapackage import Package
import brightway2 as bw
bw.projects.set_current("new4")

args = {"range time":2, "duration":False, "foresight":False, "lead time":True, "capital replacement rate":False, "measurement": 0, "weighted slope start": 0.75, "weighted slope end": 1.00}

ndb = NewDatabase(
            scenarios=[
                {"model":"remind", "pathway":"SSP2-Base", "year":2020},
                {"model":"remind", "pathway":"SSP2-Base", "year":2030},
                {"model":"remind", "pathway":"SSP2-Base", "year":2040},
                {"model":"remind", "pathway":"SSP2-Base", "year":2050},
            ],
            source_db="ecoinvent 3.8 consequential", # <-- Must point to the consequential database.
            source_version="3.8", # <-- Can only be 3.8.
            key='xxxxxxxxxxxxxxxxxxxxxxxxx',
            system_model="consequential", # <-- Must specify "consequential"
            system_args=args # <-- Optional. Arguments for the consequential system model.
)

### Database creation from non-default scenarios

If you have some specific IAM scenarios (one that is not included in `premise`) from which you would like to build a database, you can specify the directory for those.

**Important remark**: your scenario file must begin with "remind_" or "image_". When using a non-default scenario that you provide yourself, you do not have to provide a decryption key.

In [None]:
from premise import *
import bw2data

bw2data.projects.set_current("new")

In [None]:
ndb = NewDatabase(
    scenarios = [{"model":"newiam", "pathway":"path1-Base", "year":2028,
                  "filepath":"/Users/romain/Documents"}],        
    source_db="ecoinvent 3.8 cutoff", # <-- name of the database
    source_version="3.8", # <-- version of ecoinvent
 )
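
The naming rule in the remark above can be checked before calling `NewDatabase()`. This is only a sketch: `has_valid_prefix` and `list_scenario_files` are hypothetical helpers, and the file names in the comments are made up.

```python
from pathlib import Path

# Prefixes required by the remark above.
PREFIXES = ("remind_", "image_")


def has_valid_prefix(filename):
    """Return True if the file name starts with one of the accepted prefixes."""
    return Path(filename).name.startswith(PREFIXES)


def list_scenario_files(directory):
    """Return the names of files in `directory` that follow the naming rule."""
    return sorted(
        p.name for p in Path(directory).iterdir()
        if p.is_file() and has_valid_prefix(p)
    )


# Made-up file names, for illustration only:
# has_valid_prefix("remind_my_scenario.csv")  -> accepted
# has_valid_prefix("my_scenario.csv")         -> rejected
```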

### Adding inventories
When the database is extracted, you can also import your own Brightway2-compatible inventories, like so:

In [None]:
ndb = NewDatabase(
            scenarios=[
                {"model":"remind", "pathway":"SSP2-Base", "year":2030},
            ],
            source_db="ecoinvent 3.7 cutoff", 
            source_version="3.7.1",
            key='xxxxxxxxxxxxxxxxxxxxxxxxx'
            additional_inventories= [ # <-- this is NEW
                {"filepath": r"filepath\to\excel_file.xlsx", "ecoinvent version": "3.7"}, # <-- this is NEW
                {"filepath": r"filepath\to\another_excel_file.xlsx", "ecoinvent version": "3.7"}, # <-- this is NEW
            ] # <-- this is NEW
                 )
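
When several inventory files target the same ecoinvent version, the list of dictionaries can be built from the file paths. A minimal sketch, assuming the two keys shown in the cell above; `inventory_entries` is a hypothetical helper and the file names are placeholders.

```python
def inventory_entries(filepaths, ecoinvent_version):
    """Build the list expected by `additional_inventories`, one entry per file."""
    return [
        {"filepath": str(path), "ecoinvent version": ecoinvent_version}
        for path in filepaths
    ]


# Placeholder file names, for illustration only.
entries = inventory_entries(["lci-a.xlsx", "lci-b.xlsx"], "3.7")
```

Paths collected with `pathlib.Path(folder).glob("*.xlsx")` can be passed in the same way, since each path is converted to a string.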

# Use case with ecospold2

The source database does not have to be from a brightway2 project.
It can be extracted directly from the set of ecospold2 files one gets when downloading the database from the [ecoinvent website](https://ecoinvent.org).

For this, one needs to specify the arguments `source_type="ecospold"` and `source_file_path`, which is the directory containing the ecospold files.

For example, here we combine the use of a specific (non-default) IAM scenario file with the use of ecospold2 files as data source (ecoinvent 3.6 in this case).

In [None]:
ndb = NewDatabase(
        scenarios = [
            {"model":"remind", "pathway":"my_special_scenario", "year":2028,
             "filepath":r"C:\filepath\to\your\scenario\folder"}
        ],        
        source_type="ecospold", # <--- this is NEW
        source_file_path=r"C:\filepath\to\your\ecospold\folder\datasets", # <-- this is NEW
        source_version="3.6",
    )

# Transformation functions

These functions modify the extracted database:

* **update("biomass")**: create scenario- and region-specific markets for biomass used for power generation. Distinguish between purpose-grown and residual biomass.

* **update("electricity")**: alignment of regional electricity production mixes as well as efficiencies for several electricity production technologies, including Carbon Capture and Storage technologies and photovoltaic panels.

* **update("cement")**: adjustment of technologies for cement production (dry, semi-dry, wet, with pre-heater or not), fuel efficiency of kilns, fuel mix of kilns (including biomass and waste fuels).

* **update("steel")**: adjustment of process efficiency, fuel mix and share of secondary steel in steel markets.

* **update("dac")**: This function creates region- and scenario-specific inventories for DAC and DACCS systems and adjusts efficiency.

* **update("fuels")**: This method creates regional markets for liquid and gaseous fuels and relinks consuming activities to them.

* **update("heat")**: This function creates regionalized versions of heat and steam production datasets and relinks them to heat-consuming activities.

* **update("emissions")**: adjust emission of local air pollutants according to GAINS projections.

* **update("cars")**: creates updated inventories for fleet average passenger cars and links back to activities that consume transport.

* **update("trucks")**: creates updated inventories for fleet average lorry trucks and links back to activities that consume transport.

* **update("two_wheelers")**: creates updated inventories for fleet average two-wheelers and links back to activities that consume transport.

* **update("buses")**: creates updated inventories for fleet average buses and links back to activities that consume transport.

* **update("trains")**: creates updated inventories for fleet average trains and links back to activities that consume transport.

A look at the documentation is advised.


These functions can be applied *separately*, *consecutively* or *all together* (using **.update()** without arguments).

They will apply to all the scenario-specific databases listed in `scenarios`.

In [None]:
from premise import *
import bw2data
bw2data.projects.set_current("some project")

In [None]:
ndb = NewDatabase(
            scenarios=[
                {"model":"remind", "pathway":"SSP2-Base", "year":2020},
                {"model":"image", "pathway":"SSP2-Base", "year":2034},
            ],
            key='xxxxxxxxxxxxxxxxxxxxxxxxx',
            source_db="ecoinvent 3.7 cutoff",
            source_version="3.7", 
)

In [None]:
ndb.update()

In [None]:
# write back to brightway project
ndb.write_db_to_brightway()

You can also give your databases custom names.

In [None]:
ndb.write_db_to_brightway(name=["my_custom_name_1", "my_custom_name_2"])
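
The names can also be derived from the scenarios themselves. The sketch below assumes that one name per scenario is expected, in the same order as the `scenarios` list (as the two names for two scenarios above suggest); `database_names` and its naming pattern are hypothetical.

```python
def database_names(scenarios, prefix="ei"):
    """Return one database name per scenario, in the same order."""
    return [
        f"{prefix}_{s['model']}_{s['pathway']}_{s['year']}"
        for s in scenarios
    ]


names = database_names(
    [
        {"model": "remind", "pathway": "SSP2-Base", "year": 2020},
        {"model": "image", "pathway": "SSP2-Base", "year": 2034},
    ]
)
# Hypothetical usage: ndb.write_db_to_brightway(name=names)
```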

# Export options

### As a Brightway2 database

Export the modified database to brightway2.

In [None]:
ndb.write_db_to_brightway()

### As a SimaPro CSV file

In [None]:
ndb.write_db_to_simapro(filepath=r"C:/Users/sacchi_r/Downloads/exported_simapro_file")

### As a CSV file for OpenLCA

In [None]:
ndb.write_db_to_olca(filepath=r"C:/Users/sacchi_r/Downloads/exported_olca_file")

### As a Superstructure database
A superstructure database is a database that can accommodate several scenarios, as described [here](https://github.com/dgdekoning/brightway-superstructure), to be then used in [Activity-Browser](https://github.com/LCA-ActivityBrowser/activity-browser).
This function will export the superstructure database as well as produce a "scenario difference file". Hence, even though you create multiple scenarios, **you only need to write to disk one database**.

In [None]:
ndb.write_superstructure_db_to_brightway(name="my_db")

Doing so will automatically produce the LCA of your system for each scenario contained in the "scenario difference" file.

![Example superstructure](example_superDB.png)

### As a data package
Export a data package, which can be shared. Data packages can be read by [unfold](https://github.com/polca/unfold), and databases can be reproduced on other computers provided a local copy of ecoinvent is present. This way of sharing premise databases across users respects ecoinvent's EULA.

In [None]:
ndb.write_datapackage()

### As a sparse matrix representation

Or export it as a sparse matrix representation.

This will export four files:

* "A_matrix.csv": matrix coordinates and values of shape (index of activity; index of product; value) for the technosphere
* "A_matrix_index.csv": labels for indices for A matrix of shape (name of activity, reference product, unit, location, index)
* "B_matrix.csv": matrix coordinates and values of shape (index of activity; index of biosphere flow; value) for the biosphere
* "B_matrix_index.csv": labels for indices for B matrix of shape (name of biosphere flow, main compartment, sub-compartment, unit, index)

As a convenience, you can specify a directory in which to store the exported matrices.
If the directory does not exist, it will be created.
If you leave it unspecified, they will be stored in **data/matrices** in the root folder of the library.

In [None]:
ndb.write_db_to_matrices(filepath=r"C:/Users/sacchi_r/Downloads/exported_matrices")

The exported matrices have the following columns:

* "index of activity"
* "index of product"
* "value"
* "uncertainty type"
* "loc"
* "scale"
* "shape"
* "minimum"
* "maximum"
* "negative"
* "flip"

Each row is an exchange (a technosphere exchange in "A_matrix.csv", a biosphere exchange in "B_matrix.csv"). These matrices comply with the data structure expected by [bw_processing](https://pypi.org/project/bw-processing/) and can therefore be read directly by [bw2calc](https://pypi.org/project/bw2calc/) `2.0.x`.
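
To illustrate the column layout, the sketch below reads a made-up two-row excerpt with the standard `csv` module. The sample values are invented, and the `;` delimiter and single header row are assumptions taken from how the files are read elsewhere in this notebook; `read_exchanges` is a hypothetical helper.

```python
import csv
import io

# Made-up excerpt of "A_matrix.csv"; only the column names come from the list above.
SAMPLE = (
    "index of activity;index of product;value;uncertainty type;loc;scale;"
    "shape;minimum;maximum;negative;flip\n"
    "0;0;1.0;0;;;;;;;\n"
    "0;1;0.25;2;-1.386;0.1;;;;False;True\n"
)


def read_exchanges(handle):
    """Yield (activity index, product index, value) for each exchange row."""
    for row in csv.DictReader(handle, delimiter=";"):
        yield (
            int(row["index of activity"]),
            int(row["index of product"]),
            float(row["value"]),
        )


exchanges = list(read_exchanges(io.StringIO(SAMPLE)))
```

With a real export, the `io.StringIO(SAMPLE)` handle would be replaced by an opened file.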

Alternatively, you can do things manually. Here is an example of how to calculate GWP scores using the set of sparse matrices exported by `premise`.

In [None]:
from scipy import sparse
#from pypardiso import spsolve <-- use pypardiso if you use an Intel chip, it's much faster!
from scipy.sparse.linalg import spsolve
from pathlib import Path
from csv import reader
import numpy as np

In [None]:
# the directory to the set of files produced by premise
DIR = Path(r"/Users/romain/GitHub/premise/premise/data/export/remind/SSP2-PkBudg1150/2040") 

# creates dict of activities <--> indices in A matrix
A_inds = dict()
with open(DIR / "A_matrix_index.csv", 'r') as read_obj:
    csv_reader = reader(read_obj, delimiter=";")
    for row in csv_reader:
        A_inds[(row[0], row[1], row[2], row[3])] = row[4]

A_inds_rev = {int(v):k for k, v in A_inds.items()}

# creates dict of bio flow <--> indices in B matrix
B_inds = dict()
with open(DIR / "B_matrix_index.csv", 'r') as read_obj:
    csv_reader = reader(read_obj, delimiter=";")
    for row in csv_reader:
        B_inds[(row[0], row[1], row[2], row[3])] = row[4]
        
B_inds_rev = {int(v):k for k, v in B_inds.items()}

# create a sparse A matrix
A_coords = np.genfromtxt(DIR / "A_matrix.csv", delimiter=";", skip_header=1)
I = A_coords[:, 0].astype(int)
J = A_coords[:, 1].astype(int)
A = sparse.csr_matrix((A_coords[:,2], (J, I)))

# create a sparse B matrix
B_coords = np.genfromtxt(DIR / "B_matrix.csv", delimiter=";", skip_header=1)
I = B_coords[:, 0].astype(int)
J = B_coords[:, 1].astype(int)
B = sparse.csr_matrix((B_coords[:,2] * -1, (I, J)), shape=(A.shape[0], len(B_inds)))

# a vector with a few GWP CFs
gwp = np.zeros(B.shape[1])
gwp[[int(B_inds[x]) for x in B_inds if x[0]=="Carbon dioxide, non-fossil, resource correction"]] = -1
#gwp[[int(B_inds[x]) for x in B_inds if x[0]=="Hydrogen"]] = 5
gwp[[int(B_inds[x]) for x in B_inds if x[0]=="Carbon dioxide, in air"]] = -1
gwp[[int(B_inds[x]) for x in B_inds if x[0]=="Carbon dioxide, non-fossil"]] = 1
gwp[[int(B_inds[x]) for x in B_inds if x[0]=="Carbon dioxide, fossil"]] = 1
gwp[[int(B_inds[x]) for x in B_inds if x[0]=="Carbon dioxide, from soil or biomass stock"]] = 1
gwp[[int(B_inds[x]) for x in B_inds if x[0]=="Carbon dioxide, to soil or biomass stock"]] = -1
gwp[[int(B_inds[x]) for x in B_inds if x[0]=="Carbon monoxide, fossil"]] = 4.06
gwp[[int(B_inds[x]) for x in B_inds if x[0]=="Methane, fossil"]] = 29.6

l_res = []
#for v in range(0, A.shape[0]):
# let's limit this to the first 3 activities of the matrix
for v in range(0, 3):
    f = np.float64(np.zeros(A.shape[0]))
    f[v] = 1
    A_inv = spsolve(A, f) # <-- this is too slow
    C = A_inv * B
    l_res.append((C * gwp).sum())
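
In matrix notation, and under the conventions of the cell above (where `B` is stored with activities as rows and biosphere flows as columns, and the exported values are multiplied by -1), each pass of the loop computes:

$$
A\,x = f, \qquad g = B^{\top} x, \qquad s = c^{\top} g = \sum_k c_k\, g_k
$$

Here $f$ is a unit demand for product `v`, $x$ is the solution returned by `spsolve` (named `A_inv` in the code, although it is the supply vector rather than an inverse), $g$ is the inventory vector `C`, $c$ is the vector of characterization factors `gwp`, and $s$ is the score appended to `l_res`.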

Print the results together with the name of the activity.

In [None]:
# pair each score with the activity stored at the same matrix index
[(score, A_inds_rev[i]) for i, score in enumerate(l_res)]

# Incremental Database

You can use the class `IncrementalDatabase` to create a `brightway` database that lets you analyze the respective contribution of sectors in the environmental score of your LCA.

In [None]:
ndb = IncrementalDatabase(
    scenarios=[
        {"model":"image", "pathway":"SSP2-RCP26", "year":2040},
        {"model":"image", "pathway":"SSP2-RCP26", "year":2050},
    ],
    source_db="ecoinvent-3.10-cutoff", # <-- name of the database in the BW2 project. Must be a string.
    source_version="3.10", # <-- version of ecoinvent. Can be "3.8", "3.9" or "3.10". Must be a string.
    key="xxxxxxxxxxxxxxxxxxxxxxxxx",
    biosphere_name="ecoinvent-3.10-biosphere",
)

Then, you must choose the sectors to apply and the sequence of their implementation (and groupings if so desired).

In [None]:
# in this case, we wish to apply transformations relating to 
# the electricity sector, the steel sector, as well as the 
# cement, cars and fuel sectors. However, in this example, 
# these last three will be considered all together.

sectors = {
    "electricity": "electricity",
    "steel": "steel",
    "others": [
        "cement",
        "cars",
        "fuels"
    ]
}

ndb.update(sectors=sectors)
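
A value in `sectors` can be a single sector name or a list of names forming a group. Python dictionaries preserve insertion order, so the steps can be inspected in the order they were written. The hypothetical helper below only normalizes the mapping for inspection; how `premise` processes it internally is not shown here.

```python
def as_steps(sectors):
    """Return [(label, [sector, ...]), ...] in the order the keys were defined."""
    steps = []
    for label, members in sectors.items():
        if isinstance(members, str):
            members = [members]  # a single sector is a group of one
        steps.append((label, list(members)))
    return steps


steps = as_steps(
    {
        "electricity": "electricity",
        "steel": "steel",
        "others": ["cement", "cars", "fuels"],
    }
)
for position, (label, members) in enumerate(steps, start=1):
    print(position, label, members)
```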

In [None]:
# we export the database to brightway and open it in Activity Browser
# Just like a superstructure database, we load the scenario difference
# file premise has co-produced in the calculation setup window
# and we run an analysis. For more info, see how to conduct an analysis
# with a superstructure database.

ndb.write_increment_db_to_brightway(name="test increment", file_format="csv")

This lets us see the influence of each sector on the score of our LCA.

!["incremental_example"](incremental_example.png)

# Reports

## Scenario report

You can generate a spreadsheet report showing the main variables of the scenario you have selected to create your databases.
The report is saved in your working directory. Note that this report is generated automatically when exporting a database.

In [None]:
ndb.generate_scenario_report()

## Changes report

You can generate a spreadsheet report of the changes made to the original database.
It gives an overview of:

* the datasets created
* the datasets modified
* some performance indicators
* scaling factors used to scale certain exchanges

There is also a "Validation" tab that shows any datasets that contain values or efficiencies that may seem incorrect.

The report is saved in your working directory. Note that this report is generated automatically when exporting a database.

In [None]:
ndb.generate_change_report()