<div>
    <table>
        <tr>
            <td>
                <center>
                    <h1>Brightway (2.5) Introduction</h1>
                     <a href="https://www.psi.ch/en/ta/people/romain-sacchi">Romain Sacchi</a> (PSI)
                    <br><br>
                    Duration: 1 hour 30 minutes.
                </center>
            </td>
        </tr>
    </table>
</div>

# Brightway I/O: bw2io

<div class="alert alert-info">
Note: we will be using <a href="https://docs.brightway.dev/en/latest/content/installation/index.html">Brightway 2.5</a>, not <a href="https://docs.brightway.dev/en/legacy/">Brightway 2</a>. From the user's perspective, very little differs between the two. The code executed throughout this notebook works with both versions.
</div>


## Learning objectives  

Learn how to:

    - input LCI data to Brightway using Excel/CSV importers
    - fix linking issues using migration files
    - do a contribution analysis
    - export your foreground inventories

## Standard inputs and setup

In [1]:
import os
from pathlib import Path
import pandas as pd
import bw2io, bw2data, bw2calc

Let's list our projects:

In [2]:
list(bw2data.projects)

[Project: default,
 Project: new4,
 Project: ei39,
 Project: ei38,
 Project: simapro,
 Project: toronto,
 Project: ei310,
 Project: bw25,
 Project: coursePSI,
 Project: bw25_intro]

Let's set the current project:

In [2]:
bw2data.projects.set_current("bw25")

## Context

Performing an LCA generally requires:
  - Background LCI data (e.g., an LCI database such as [ecoinvent](https://ecoinvent.org/))  
  - Foreground LCI data (e.g., a bunch of datasets the LCA practitioner has spent time modeling)
  - Sets of characterization factors.
 
This section deals with how foreground LCI data is imported into Brightway.

Useful documentation about what a database is in Brightway can be found [here](https://github.com/brightway-lca/brightway2/blob/master/notebooks/Databases.ipynb) and [here](https://docs.brightway.dev/en/latest/content/gettingstarted/databases.html).

# Importing from CSV or Excel

Using `bw2io.ExcelImporter`, we import datasets from an Excel file.

In [107]:
imp = bw2io.ExcelImporter(Path(".") / "files" / "lci-carbon-fiber.xlsx")

Extracted 1 worksheets in 0.03 seconds


We then apply a number of data cleaning functions, called strategies (format numbers, set correct location, etc.):

In [108]:
imp.apply_strategies()

Applying strategy: csv_restore_tuples
Applying strategy: csv_restore_booleans
Applying strategy: csv_numerize
Applying strategy: csv_drop_unknown
Applying strategy: csv_add_missing_exchanges_section
Applying strategy: normalize_units
Applying strategy: normalize_biosphere_categories
Applying strategy: normalize_biosphere_names
Applying strategy: strip_biosphere_exc_locations
Applying strategy: set_code_by_activity_hash
Applying strategy: link_iterable_by_fields
Applying strategy: assign_only_product_as_production
Applying strategy: link_technosphere_by_activity_hash
Applying strategy: drop_falsey_uncertainty_fields_but_keep_zeros
Applying strategy: convert_uncertainty_types_to_integers
Applying strategy: convert_activity_parameters_to_list
Applied 16 strategies in 5.00 seconds
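
Each strategy is a plain function that receives the list of datasets and returns it, possibly modified. As a minimal sketch (the function `strip_exchange_names` and the toy data below are hypothetical and not part of `bw2io`), a custom cleaning step could look like this:

```python
def strip_exchange_names(data):
    """Hypothetical strategy: remove stray whitespace from exchange names."""
    for dataset in data:
        for exc in dataset.get("exchanges", []):
            if isinstance(exc.get("name"), str):
                exc["name"] = exc["name"].strip()
    return data


# Toy stand-in for `imp.data`: a list of dataset dictionaries.
toy_data = [
    {
        "name": "toy dataset",
        "exchanges": [{"name": " market for tap water ", "amount": 1.0}],
    }
]

cleaned = strip_exchange_names(toy_data)
print(cleaned[0]["exchanges"][0]["name"])
```

With a real importer, such a function should be usable through `imp.apply_strategy(strip_exchange_names)`, assuming the data follows the same list-of-dictionaries layout.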


Then, we use the `match_database()` method to link exchanges to their suppliers.

First, we link exchanges to suppliers that may also be contained in the data being imported.

In [109]:
# we match based on the name, reference product, unit and location
imp.match_database(fields=('name', 'reference product', 'unit', 'location')) 

Applying strategy: link_iterable_by_fields


<div class="alert alert-info">
Note: Why is it important to link both based on <b>name</b> and <b>reference product</b>?
</div>
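
One way to think about the question above: several datasets can share the same activity name while delivering different reference products. The sketch below is a toy linker written for illustration only (the function `link_by_fields`, the database name `toy db` and the codes `a1`/`a2` are hypothetical). It simply refuses ambiguous matches, whereas `bw2io` may handle duplicates differently.

```python
def link_by_fields(exchanges, candidates, fields):
    """Toy linker: set `input` only when exactly one candidate matches."""
    index = {}
    for cand in candidates:
        key = tuple(cand.get(f) for f in fields)
        index.setdefault(key, []).append(cand)

    linked = []
    for exc in exchanges:
        exc = dict(exc)  # work on a copy so the original stays untouched
        matches = index.get(tuple(exc.get(f) for f in fields), [])
        if len(matches) == 1:
            exc["input"] = (matches[0]["database"], matches[0]["code"])
        linked.append(exc)
    return linked


# Hypothetical suppliers: same activity name, different reference products.
candidates = [
    {"database": "toy db", "code": "a1", "name": "air separation, cryogenic",
     "reference product": "nitrogen, liquid", "location": "RER"},
    {"database": "toy db", "code": "a2", "name": "air separation, cryogenic",
     "reference product": "oxygen, liquid", "location": "RER"},
]
exchange = {"name": "air separation, cryogenic",
            "reference product": "nitrogen, liquid", "location": "RER"}

by_name = link_by_fields([exchange], candidates, ("name", "location"))
by_both = link_by_fields(
    [exchange], candidates, ("name", "reference product", "location")
)

print("input" in by_name[0])  # False: two candidates share this name
print(by_both[0]["input"])    # ('toy db', 'a1')
```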

Is that enough? Do we still have unlinked exchanges? Let's check.

In [110]:
imp.statistics()

10 datasets
109 exchanges
82 unlinked exchanges
  Type biosphere: 1 unique unlinked exchanges
  Type technosphere: 19 unique unlinked exchanges


(10, 109, 82)
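
The total counts every exchange that has no supplier yet, while the per-type lines count unique flows, so a flow used by several datasets is reported once. A toy illustration, assuming an exchange counts as linked once it carries an `input` key (the data below is made up for the example):

```python
# Toy stand-in for `imp.data`.
toy_data = [
    {"name": "A", "exchanges": [
        {"name": "market for tap water", "type": "technosphere",
         "input": ("toy db", "x")},
        {"name": "market for acrylic acid", "type": "technosphere"},
    ]},
    {"name": "B", "exchanges": [
        {"name": "market for acrylic acid", "type": "technosphere"},
        {"name": "Argon-40", "type": "biosphere"},
    ]},
]

all_exchanges = [exc for ds in toy_data for exc in ds["exchanges"]]
# An exchange without an `input` has no supplier assigned.
unlinked = [exc for exc in all_exchanges if not exc.get("input")]
# The same flow requested by two datasets is counted only once here.
unique_unlinked = {(exc["type"], exc["name"]) for exc in unlinked}

print(len(toy_data), len(all_exchanges), len(unlinked))  # 2 4 3
print(len(unique_unlinked))  # 2
```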

Let's check what those unlinked exchanges are:

In [111]:
for u in list(imp.unlinked):
    print(u["name"], u.get("location"), u.get("categories"))

market for heat, from steam, in chemical industry RER None
market for acrylonitrile GLO None
market group for electricity, low voltage RER None
market for methyl acrylate GLO None
market for acrylic acid RER None
market for water, deionised Europe without Switzerland None
market for compressed air, 1000 kPa gauge RER None
market for dimethyl sulfoxide GLO None
market for ethylene glycol GLO None
air separation, cryogenic RER None
market for steam, in chemical industry RER None
treatment of wastewater, average, wastewater treatment CH None
market for potassium permanganate GLO None
market for silicone product RER None
Argon-40 None ('air',)
market for natural gas, medium pressure, vehicle grade GLO None
market for NOx retained, by selective catalytic reduction GLO None
market for tap water Europe without Switzerland None
market for ammonium bicarbonate RER None
market for epoxy resin, liquid RER None


In [112]:
bw2data.databases

Databases dictionary with 2 object(s):
	ecoinvent-3.10-biosphere
	ecoinvent-3.10-cutoff

OK, some unlinked exchanges are clearly from ecoinvent. Let's try to link those.

In [113]:
imp.match_database("ecoinvent-3.10-cutoff", fields=('name', 'reference product', 'unit', 'location'))
imp.statistics()

Applying strategy: link_iterable_by_fields
10 datasets
109 exchanges
14 unlinked exchanges
  Type biosphere: 1 unique unlinked exchanges
  Type technosphere: 2 unique unlinked exchanges


(10, 109, 14)

Despite trying to link with ecoinvent, we still have two unmatched technosphere flows:

In [114]:
[u for u in imp.unlinked if u["type"] == "technosphere"]

[{'name': 'market for ethylene glycol',
  'amount': 2.4225526641883498e-06,
  'database': 'ecoinvent',
  'location': 'GLO',
  'unit': 'kilogram',
  'type': 'technosphere',
  'reference product': 'ethylene glycol'},
 {'name': 'air separation, cryogenic',
  'amount': 0.005396530359355638,
  'database': 'ecoinvent',
  'location': 'RER',
  'unit': 'kilogram',
  'type': 'technosphere',
  'reference product': 'nitrogen, liquid'}]

We also have one unlinked biosphere exchange left. Let's try to match that one.

In [115]:
imp.match_database("ecoinvent-3.10-biosphere", fields=('name', 'unit', 'categories'))
imp.statistics()

Applying strategy: link_iterable_by_fields
10 datasets
109 exchanges
14 unlinked exchanges
  Type biosphere: 1 unique unlinked exchanges
  Type technosphere: 2 unique unlinked exchanges


(10, 109, 14)

In [123]:
[u for u in imp.unlinked if u["type"] == "biosphere"]

[]

Nope. Why not? Maybe because `Argon-40` does not exist as such in the biosphere database?

In [116]:
[f for f in bw2data.Database("ecoinvent-3.10-biosphere") if "argon" in f["name"].lower()]

['Argon' (kilogram, None, ('air',)),
 'Argon-41' (kilo Becquerel, None, ('air', 'non-urban air or from high stacks')),
 'Argon-41' (kilo Becquerel, None, ('air',)),
 'Argon-41' (kilo Becquerel, None, ('air', 'urban air close to ground')),
 'Argon-41' (kilo Becquerel, None, ('air', 'low population density, long-term')),
 'Argon' (kilogram, None, ('natural resource', 'in air'))]

It is indeed now simply called `Argon` in ecoinvent 3.10.
We can:

1. manually fix this (i.e., modify the exchange name in the Excel file),
2. go over `imp.data` (a list), iterate through the exchanges, find `Argon-40` and replace it with `Argon`, or
3. create a `migration` file for translating ecoinvent 3.9 flows to 3.10.
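
The second option could be sketched as follows. The list `toy_data` is a made-up stand-in for `imp.data`; with the real importer, the same loop would run over `imp.data` directly.

```python
# Toy stand-in for `imp.data`: datasets with their exchanges.
toy_data = [
    {
        "name": "toy dataset",
        "exchanges": [
            {"name": "Argon-40", "type": "biosphere", "categories": ("air",)},
            {"name": "market for tap water", "type": "technosphere"},
        ],
    }
]

renamed = 0
for dataset in toy_data:
    for exc in dataset["exchanges"]:
        # Only touch biosphere exchanges carrying the outdated name.
        if exc["type"] == "biosphere" and exc["name"] == "Argon-40":
            exc["name"] = "Argon"
            renamed += 1

print(f"Renamed {renamed} exchange(s)")
```

This is quick, but the fix lives only in the notebook session, which is one reason a reusable migration can be preferable.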

### Migration file

We create a mapping dictionary, and use it to create a `Migration` object.

In [117]:
migration = {
    "fields": ["name", "reference product", "location", "categories"],
    "data": [
        (
            ("market for ethylene glycol", "ethylene glycol", "GLO", None),
            {"location": "RER",},
        ),
        (
            ("air separation, cryogenic", "nitrogen, liquid", "GLO", None),
            {
                "name": "industrial gases production, cryogenic air separation",
                "location": "RER",
            },
        ),
        (
            ("air separation, cryogenic", "nitrogen, liquid", "RER", None),
            {
                "name": "industrial gases production, cryogenic air separation",
                "location": "RER",
            },
        ),
        (
            ("Argon-40", None, None, ("air",)),
            {
                "name": "Argon",
            },
        )
    ],
}

In [118]:
bw2io.Migration(name="ei3.9-3.10").write(data=migration, description="ei 3.9 to 3.10")

In [119]:
"ei3.9-3.10" in bw2io.migrations

True

In [120]:
bw2io.Migration("ei3.9-3.10")

Brightway2 Migration: ei3.9-3.10

We apply the migration on our imported data.

In [121]:
imp.data = bw2io.strategies.migrate_exchanges(
    db=imp.data,
    migration="ei3.9-3.10"
)
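
Conceptually, a migration looks up each exchange by the values of the listed fields and, on a match, overwrites the fields given in the replacement dictionary. The sketch below is a simplified toy version written for illustration (the function `apply_mapping` is hypothetical and is not how `bw2io` implements it):

```python
def apply_mapping(exchanges, mapping):
    """Toy field-based migration; not bw2io's implementation."""
    fields = mapping["fields"]
    lookup = {tuple(old): new for old, new in mapping["data"]}
    for exc in exchanges:
        # Missing fields are read as None, as in the mapping keys below.
        key = tuple(exc.get(f) for f in fields)
        exc.update(lookup.get(key, {}))
    return exchanges


toy_mapping = {
    "fields": ["name", "reference product", "location", "categories"],
    "data": [
        (("market for ethylene glycol", "ethylene glycol", "GLO", None),
         {"location": "RER"}),
        (("Argon-40", None, None, ("air",)), {"name": "Argon"}),
    ],
}

toy_exchanges = [
    {"name": "market for ethylene glycol", "reference product": "ethylene glycol",
     "location": "GLO", "type": "technosphere"},
    {"name": "Argon-40", "categories": ("air",), "type": "biosphere"},
    {"name": "market for tap water", "reference product": "tap water",
     "location": "Europe without Switzerland", "type": "technosphere"},
]

migrated = apply_mapping(toy_exchanges, toy_mapping)
for exc in migrated:
    print(exc["name"], exc.get("location"))
```

Exchanges that match no entry, such as the tap water market here, are left unchanged.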

In [122]:
imp.match_database("ecoinvent-3.10-cutoff", fields=('name', 'reference product', 'unit', 'location'))
imp.match_database("ecoinvent-3.10-biosphere", fields=('name', 'unit', 'categories'))
imp.statistics()

Applying strategy: link_iterable_by_fields
Applying strategy: link_iterable_by_fields
10 datasets
109 exchanges
0 unlinked exchanges
  


(10, 109, 0)

We have zero unlinked exchanges, so we are ready to write the database.

In [125]:
if len(list(imp.unlinked)) == 0:
    imp.write_database()

Not able to determine geocollections for all datasets. This database is not ready for regionalization.


100%|█████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:00<00:00, 974.76it/s]

Vacuuming database 





Created database: carbon fiber


# Contribution analyses

## Process contribution

We have already seen how to obtain a contribution analysis in terms of contributing processes:

In [4]:
db = bw2data.Database("carbon fiber")

In [9]:
# let's list the datasets in our new database "carbon fiber"
[a["name"] for a in db]

['carbon fiber production, fiber drying and sizing',
 'carbon fiber production, fiber winding and unwinding',
 'carbon fiber production, exhaust gas treatment 1',
 'polyacrylonitrile production (PAN) by polymerisation',
 'carbon fiber production, weaved, at factory',
 'carbon fiber production, fiber coagulation, stretching, washing, sizing and drying',
 'carbon fiber production, fiber stabilization, carbonization, electrolysis and washing',
 'carbon fiber production, exhaust gas treatment 2',
 'carbon fiber production, fiber relaxation',
 'Dimethyl sulfoxide production (DMSO)']

In [5]:
activity = db.search('carbon fiber production, weaved, at factory')[0]
activity

'carbon fiber production, weaved, at factory' (kilogram, RER, None)

In [3]:
method = ('IPCC 2021', 'climate change', 'global warming potential (GWP100)')

In [7]:
lca = bw2calc.LCA({activity:1}, method)
lca.lci()
lca.lcia()
lca.score



75.11270245850397

In [8]:
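# reverse_dict() returns the reversed activity, product and biosphere dictionaries
rev_act, rev_prod, rev_bio = lca.reverse_dict()

In [9]:
# Sum over the rows (elementary flows) to get one score per column (activity)
results_by_activity = (lca.characterized_inventory.sum(axis=0)).A1

In [10]:
# Create a list of activity names, one per matrix column
list_of_names_in_columns = [
    bw2data.get_activity(rev_act[col])['name']
    for col in range(lca.characterized_inventory.shape[1])
]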

In [11]:
pd.Series(index=list_of_names_in_columns, data=results_by_activity).sort_values(ascending=False).head(10)

heat production, natural gas, at industrial furnace >100kW    8.068948
carbon fiber production, exhaust gas treatment 2              7.448623
natural gas venting from petroleum/natural gas production     4.314160
heat production, at hard coal industrial furnace 1-10MW       2.413090
ammonia production, partial oxidation, liquid                 2.203817
ammonia production, steam reforming, liquid                   2.094031
ammonia production, partial oxidation, liquid                 1.906205
heat production, at hard coal industrial furnace 1-10MW       1.796146
electricity production, lignite                               1.766702
sweet gas, burned in gas turbine                              1.513273
dtype: float64
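
In the characterized inventory, rows correspond to elementary flows and columns to activities, so summing down each column gives the score attributable to each activity. A toy illustration with plain lists (the numbers and activity names are made up):

```python
# Toy characterized inventory: rows are elementary flows, columns are activities.
characterized = [
    [5.0, 1.0, 0.0],   # flow 0
    [0.5, 2.0, 0.25],  # flow 1
]
activity_names = ["heat production", "electricity production", "transport"]

# Summing over rows (axis=0 in the notebook) gives one score per activity.
scores = [
    sum(row[col] for row in characterized)
    for col in range(len(activity_names))
]
ranking = sorted(zip(activity_names, scores), key=lambda pair: pair[1], reverse=True)
total = sum(scores)  # equals the overall LCA score

for name, score in ranking:
    print(f"{name}: {score} ({score / total:.1%})")
```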

But there is a simpler and more "official" way to obtain this.

In [12]:
import bw2analyzer as ba

In [13]:
pd.DataFrame(
    [(x, y, z["name"]) for x, y, z in ba.ContributionAnalysis().annotated_top_processes(lca=lca)],
    columns=["score", "quantity", "name"]
)

|   | score | quantity | name |
|---|---|---|---|
| 0 | 8.068948 | 136.671829 | heat production, natural gas, at industrial fu... |
| 1 | 7.448623 | 10.609957 | carbon fiber production, exhaust gas treatment 2 |
| 2 | 4.31416 | 0.247272 | natural gas venting from petroleum/natural gas... |
| 3 | 2.41309 | 20.967055 | heat production, at hard coal industrial furna... |
| 4 | 2.203817 | 0.787078 | ammonia production, partial oxidation, liquid |
| 5 | 2.094031 | 1.461715 | ammonia production, steam reforming, liquid |
| 6 | 1.906205 | 0.680788 | ammonia production, partial oxidation, liquid |
| 7 | 1.796146 | 15.606498 | heat production, at hard coal industrial furna... |
| 8 | 1.766702 | 1.490596 | electricity production, lignite |
| 9 | 1.513273 | 22.596866 | sweet gas, burned in gas turbine |


Same approach for elementary flows:

In [14]:
pd.DataFrame(
    [(x, y, z["name"]) for x, y, z in ba.ContributionAnalysis().annotated_top_emissions(lca=lca)],
    columns=["score", "quantity", "name"]
)

|   | score | quantity | name |
|---|---|---|---|
| 0 | 31.524717 | 31.52472 | Carbon dioxide, fossil |
| 1 | 22.449426 | 22.44943 | Carbon dioxide, fossil |
| 2 | 13.097015 | 13.09701 | Carbon dioxide, fossil |
| 3 | 7.31087 | 0.2453312 | Methane, fossil |
| 4 | 0.171 | 6.785712e-06 | Sulfur hexafluoride |
| 5 | 0.123087 | 0.0004508681 | Dinitrogen monoxide |
| 6 | 0.10604 | 0.0003884237 | Dinitrogen monoxide |
| 7 | 0.093928 | 0.0003440585 | Dinitrogen monoxide |
| 8 | 0.07081 | 0.002376177 | Methane, fossil |
| 9 | 0.062798 | 0.06279813 | Carbon dioxide, from soil or biomass stock |
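
The same flow name appears several times in the table above, presumably because identically named flows are tracked separately (for example per emission compartment). To read the result per substance, the rows can be summed by name. A small sketch using the ten scores shown above (it covers only these top ten rows, not the full inventory):

```python
from collections import defaultdict

# (score, name) pairs copied from the table of top emissions.
rows = [
    (31.524717, "Carbon dioxide, fossil"),
    (22.449426, "Carbon dioxide, fossil"),
    (13.097015, "Carbon dioxide, fossil"),
    (7.31087, "Methane, fossil"),
    (0.171, "Sulfur hexafluoride"),
    (0.123087, "Dinitrogen monoxide"),
    (0.10604, "Dinitrogen monoxide"),
    (0.093928, "Dinitrogen monoxide"),
    (0.07081, "Methane, fossil"),
    (0.062798, "Carbon dioxide, from soil or biomass stock"),
]

totals = defaultdict(float)
for score, name in rows:
    totals[name] += score

for name, score in sorted(totals.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.3f}")
```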


## Tree map

In [1]:
from polyviz import treemap

In [7]:
treemap(
    activity=activity,
    method=method
)

Calculating LCIA score...




'/Users/romain/GitHub/autumn-school-dds-psi-2024/tutorials/brightway/carbon fiber production weaved at factory kilogram RER IPCC 2021climate changeglobal warming potential GWP100 treemap.html'

## Supply chain traversal

In [9]:
from polyviz import sankey

GitHub repo: [link](https://github.com/romainsacchi/polyviz). There is no proper documentation yet, but there is a notebook with [examples](https://github.com/romainsacchi/polyviz/blob/main/examples/examples.ipynb).

In [13]:
_, df = sankey(
    activity=activity,
    level=4,
    cutoff=0.01,
    method=method,
    labels_swap={
        "carbon fiber": "cf.",
        "production": "prod."
    }
)

Calculating supply chain score...




## Violin plot

In [6]:
from polyviz import violin
import warnings
warnings.filterwarnings("ignore")

In [7]:
method = ('IPCC 2021', 'climate change', 'global warming potential (GWP100)')
violin(
    activities=[
        a for a in bw2data.Database("ecoinvent-3.10-cutoff") 
        if a["unit"] == "ton kilometer"
    ][:3],
    method=method,
    iterations=100
)



'/Users/romain/GitHub/autumn-school-dds-psi-2024/tutorials/brightway/market for GLO vs transport GLO vs market for RoW IPCC 2021climate changeglobal warming potential GWP100 violin.html'

# Exporting databases

## To Excel

We can export the entire database inventory to an Excel file.

In [221]:
bw2io.export.write_lci_excel(
    database_name="carbon fiber",
    dirpath=".",
)

'./lci-carbon-fiber.xlsx'

## As a bw2package file

We can also export the database as a Brightway package file.

In [235]:
bw2io.package.BW2Package().export_obj(
    obj=db,
)

'/Users/romain/Library/Application Support/Brightway3/bw25.a3a6b830/export/carbon-fiber54c41419.efe1809b.bw2package'

<div class="alert alert-info">
Note: This may not be ideal for sharing. For the import to succeed, the other user needs the databases that the exported database depends on (ecoinvent, biosphere) to be named exactly the same. Sharing is still possible otherwise, but the recipient will have to correct the database names upon import.
</div>

## As a project

We can export the entire project. This is the safest option, as all the databases that `carbon fiber` depends on are also exported. The drawback is that the file is bigger, and there may be licensing issues. But it is at least a good backup solution.

In [236]:
bw2data.projects.current

'bw25'

In [238]:
bw2io.backup.backup_project_directory(
    project='bw25',
    dir_backup='.' # here
)

Creating project backup archive - this could take a few minutes...
Saved to: brightway2-project-bw25-backup08-July-2024-07-26PM.tar.gz


PosixPath('brightway2-project-bw25-backup08-July-2024-07-26PM.tar.gz')

And we load it back up. Note that I give it another name so as not to overwrite the existing project.
Also, `overwrite_existing` is `False` by default, so it would need to be set to `True` to overwrite a project.

In [242]:
bw2io.backup.restore_project_directory(
    fp="brightway2-project-bw2-backup.tar.gz",
    project_name="carbon fiber 2",
    overwrite_existing=False
)

Restoring project backup archive - this could take a few minutes...
Restored project: carbon fiber 2


'carbon fiber 2'

Let's check:

In [243]:
"carbon fiber 2" in bw2data.projects

True