# ILCD Importer

* Author: Miguel Astudillo
* Kernel: `ilcdimporter`
* License: [CC-BY-SA-4.0](https://creativecommons.org/licenses/by-sa/4.0/)

In [None]:
import bw2calc as bc
import bw2data as bd
import bw2io
from pathlib import Path
from bw2io.importers.ilcd import ILCDImporter
import pandas as pd
from IPython.display import display,Image
import textwrap
import numpy as np

# Config

In [None]:
bw2io.remote.install_project('ecoinvent-3.9.1-biosphere', 'brightcon2023_ilcd', overwrite_existing=True)
bd.projects.set_current('brightcon2023_ilcd')

In [None]:
# delete database if present
if 'sdk' in bd.databases:
    sdk_db = bd.Database('sdk')
    sdk_db.deregister()
    del sdk_db
    

In [None]:
bd.__version__  # tested with version dev18

In [None]:
bc.__version__ # the notebook assumes versions > 2.0

In [None]:
assert bd.__version__[0] >= 4
assert bc.__version__[0] >= 2
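
The two assertions above index `__version__` directly, which works when it is a tuple starting with an integer. If a release exposes the version as a string instead, that comparison would fail. A minimal sketch of a more tolerant check, assuming only those two shapes occur; `major_version` is a hypothetical helper, not part of Brightway:

```python
def major_version(version):
    """Return the major version as an int.

    Accepts a tuple such as (4, 0, 'dev18') or a string such as '4.0.dev18'.
    """
    if isinstance(version, str):
        head = version.split(".")[0]
    else:
        head = version[0]
    return int(head)


# toy inputs standing in for bd.__version__ and bc.__version__
print(major_version((4, 0, "dev18")))
print(major_version("2.0.1"))
```

In the notebook this could be used as `assert major_version(bd.__version__) >= 4`.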

In [None]:
gwp100 = ('IPCC 2021', 'climate change', 'global warming potential (GWP100)')
assert gwp100 in bd.methods

# A Brightway importer for the ILCD data exchange format

This notebook gives a couple of examples of importing files in the ILCD format into Brightway.

Here are some sources of free ILCD files:
- [LCA Commons](https://www.lcacommons.gov/lca-collaboration/)
- [GLAD](https://www.globallcadataaccess.org/)

## 1) JRC test file 

For testing purposes, the JRC provides a sample file in the ILCD data exchange format ([available here](https://eplca.jrc.ec.europa.eu/LCDN/developerILCDDataFormat.xhtml)), which we use first. This test file includes some features that are not often available, such as a definition of the product system model.

In [None]:
path_to_example = Path('ILCD_sdk_211_simp.zip')
sdk = ILCDImporter(dirpath= path_to_example,dbname='sdk')
sdk.apply_strategies()
sdk.statistics()

In [None]:
# try to match to the biosphere3 database
sdk.match_database('biosphere3',fields=['database','code','unit'],
                   kind='biosphere')
sdk.statistics()

In [None]:
unlinked_df = pd.DataFrame(sdk.unlinked)

In [None]:
unlinked_df[['exchanges_name','exchanges_uuid','amount','type']]

From the file we can get an idea of what was meant to be imported:

In [None]:
Image(filename='sdk_product_system_model.png',width=800)

In [None]:
sdk.drop_unlinked(True)
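
Dropping unlinked exchanges makes the database writable, but it also removes those flows from the inventory. A toy sketch of the effect, assuming that linked exchanges are the ones carrying an `input` key (the same assumption the loop later in this notebook makes); `drop_unlinked_exchanges` and the toy data are illustrative, not the bw2io implementation:

```python
def drop_unlinked_exchanges(datasets):
    """Keep only exchanges that carry an 'input' key, i.e. that were linked."""
    for ds in datasets:
        ds["exchanges"] = [exc for exc in ds.get("exchanges", []) if exc.get("input")]
    return datasets


# toy dataset: one linked and one unlinked exchange (the code is a placeholder)
toy = [
    {
        "name": "toy process",
        "exchanges": [
            {"name": "Carbon dioxide, fossil", "amount": 1.0,
             "input": ("biosphere3", "hypothetical-code")},
            {"name": "cresol", "amount": 0.1},
        ],
    }
]

cleaned = drop_unlinked_exchanges(toy)
print(len(cleaned[0]["exchanges"]))
```

Any impact caused by the dropped flows is missing from later LCA scores, so results obtained this way are only suitable for testing.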

In [None]:
sdk.statistics()

In [None]:
sdk.write_database()

In [None]:
sdk_db = bd.Database('sdk')

In [None]:
plastic_injection = bd.get_node(name='Plastic injection moulding part (unspecific) - sample')

In [None]:
plastic_injection

In [None]:
lca = bc.LCA({plastic_injection:1},gwp100)
lca.lci()
lca.lcia()

In [None]:
A = pd.DataFrame(lca.technosphere_matrix.todense())

# label the columns (activities) and rows (products) of A
cols = pd.MultiIndex.from_tuples(
    [(bd.get_activity(key)['name'], bd.get_activity(key)['location'])
     for key in lca.activity_dict]
)

rows = pd.MultiIndex.from_tuples(
    [(bd.get_activity(key)['name'], bd.get_activity(key)['location'])
     for key in lca.product_dict]
)
A.columns = cols
A.index = rows

In [None]:
B = pd.DataFrame(lca.biosphere_matrix.todense())
# label the rows of B with the name and categories of each biosphere flow
B_rows = pd.MultiIndex.from_tuples(
    [(bd.get_activity(key)['name'], bd.get_activity(key)['categories'])
     for key in lca.biosphere_dict]
)
B.index = B_rows
B.columns = cols

In [None]:
def style_negative(v, props=''):
    return props if v < 0 else None

In [None]:
A.style.highlight_max(axis=0).map(style_negative, props='color:red;')

In [None]:
AB = pd.concat([A,B])

In [None]:
# AB.loc[:,[('Plastic injection moulding part (unspecific) - sample',
#            'DE')]].replace(0,np.nan).dropna()

Parameters are parsed, but the exchanges have not been parameterized.

In [None]:
pd.DataFrame(plastic_injection['parameters']).sample(axis=0).T

In [None]:
pd.DataFrame(plastic_injection.exchanges()).dropna(how='all',axis=1).T

## 2) A random process from GLAD

For illustrative purposes we've downloaded a file from GLAD. Feel free to pick a different one (https://www.globallcadataaccess.org/).

In [None]:
path_to_example = Path('plastics_europe_example.zip')
plastic_eu = ILCDImporter(dirpath= path_to_example,dbname='plastics_europe')
plastic_eu.apply_strategies()
plastic_eu.statistics()

One of the complications of importing ILCD files is that the elementary flows almost always follow a nomenclature different from that of ecoinvent, which is the default list used in Brightway (and by the default impact assessment methods). So we often need to map the nomenclature to the one used by Brightway. Luckily, there are established mappings to help us:

- EF 3.0 -> ecoinvent 3.7 [GLAD](https://github.com/UNEP-Economy-Division/GLAD-ElementaryFlowResources/tree/master/Mapping/Output/Mapped_files)
- FEDEFLv1.0.3 -> ecoinvent 3.7 [GLAD](https://github.com/UNEP-Economy-Division/GLAD-ElementaryFlowResources/tree/master/Mapping/Output/Mapped_files)
- EF 2.0 -> EF 3.0 [JRC](https://eplca.jrc.ec.europa.eu/permalink/EF_3.0_Complete.zip)
- EF 1.1 -> EF 2.0 [JRC](https://eplca.jrc.ec.europa.eu/permalink/EF_2.0_Complete.zip)
- ecoinvent 3.7 -> ecoinvent 3.8 [Happy family](https://github.com/Depart-de-Sentier/happy_family/tree/main/Elementary%20flow%20mapping/outputs) 
- ecoinvent 3.8 -> ecoinvent 3.9 [Happy family](https://github.com/Depart-de-Sentier/happy_family/tree/main/Elementary%20flow%20mapping/outputs)


If your dataset is defined using EF 1.1 (a very common case in the datasets reported in GLAD), its elementary flows need to be converted to whatever version of ecoinvent is used to define the biosphere3 database.

In the [documentation](https://plasticseurope.lca-data.com/datasetdetail/process.xhtml?uuid=b60b30bf-991d-41b8-a295-5d0b2e5f2bb5&version=09.00.000) we can see that this dataset uses ILCD 1.1.

To help with this exercise we have created a mapping between ILCD 1.1 and ecoinvent 3.7 using the resources cited above.
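
Building such a mapping amounts to chaining the individual tables listed above. A minimal sketch of that chaining with plain dictionaries, assuming each table has already been reduced to an `old id -> new id` dictionary; `compose_mappings` is a hypothetical helper and the identifiers are placeholders, not real flow UUIDs:

```python
def compose_mappings(*mappings):
    """Chain dicts a->b, b->c, ... into a single dict a->last.

    Flows that cannot be followed through every step are left out.
    """
    first, *rest = mappings
    composed = {}
    for source, target in first.items():
        for mapping in rest:
            target = mapping.get(target)
            if target is None:
                break  # the chain is broken for this flow
        else:
            composed[source] = target
    return composed


# placeholder identifiers standing in for flow UUIDs
ef11_to_ef20 = {"ef11-a": "ef20-a", "ef11-b": "ef20-b"}
ef20_to_ef30 = {"ef20-a": "ef30-a", "ef20-b": "ef30-b"}
ef30_to_ei37 = {"ef30-a": "ei37-a"}  # flow "b" has no ecoinvent equivalent

chained = compose_mappings(ef11_to_ef20, ef20_to_ef30, ef30_to_ei37)
print(chained)
```

Flows that drop out of the chain are the ones that later show up as unlinked exchanges.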

In [None]:
# drop rows with missing values, then map old ILCD UUIDs to ecoinvent 3.7 UUIDs
mapping_df = pd.read_csv('ilcd11_ecoinvent37.csv').dropna()
mapping_dict = mapping_df.set_index('UUID OLD (ILCD 2016)')['uuid_ecoinvent37'].to_dict()

There is an ILCD strategy that can help, provided that we have a dictionary mapping the UUIDs of the biosphere flows in our database to the equivalent UUIDs in biosphere3.

In [None]:
plastic_eu.data = bw2io.strategies.ilcd.alternative_map_to_biosphere3(
    plastic_eu.data, mapping_dict
)
# do the matching as always
plastic_eu.match_database('biosphere3', fields=['database', 'code', 'unit'])
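
To make the idea behind the call above concrete, here is a toy sketch of a UUID remapping over importer-style data. It is not the bw2io implementation: the field names `uuid`, `database` and `code`, as well as the helper `remap_flow_uuids`, are assumptions chosen so that a later match on `database` and `code` could succeed.

```python
def remap_flow_uuids(datasets, uuid_mapping):
    """Write the mapped biosphere3 code onto every exchange whose UUID is known."""
    for ds in datasets:
        for exc in ds.get("exchanges", []):
            new_code = uuid_mapping.get(exc.get("uuid"))
            if new_code is not None:
                exc["database"] = "biosphere3"
                exc["code"] = new_code
    return datasets


# toy data with placeholder identifiers
toy_data = [
    {"exchanges": [{"uuid": "old-1", "unit": "kg"}, {"uuid": "old-2", "unit": "kg"}]}
]
toy_mapping = {"old-1": "new-1"}

remapped = remap_flow_uuids(toy_data, toy_mapping)
print(remapped[0]["exchanges"])
```

Exchanges whose UUID is missing from the mapping are left unchanged and stay unlinked.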

In [None]:
# substantially lower number of unlinked exchanges
plastic_eu.statistics()

Some examples of unlinked exchanges:

In [None]:
pd.DataFrame(plastic_eu.unlinked)[['name','category_2','category_1','category_0',]].sample(20)

The "categories" / "context" definitions are different in ILCD and ecoinvent. To fix the unlinked exchanges, the mapping of categories requires assumptions. Some examples:

- `Emissions to fresh water` could correspond to an exchange of type `emission` and category `('water', 'surface water')`.
- `Emissions to air, unspecified` could correspond to an exchange of type `emission` and category `('air',)`.
- `Emissions to non-agricultural soil` could correspond to an exchange of type `emission` and category `('soil', 'industrial')` or `('soil', 'forestry')`.
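
One way to keep these assumptions visible is to store every candidate category and flag the ambiguous cases. A minimal sketch using the three examples above; the helper `pick_category` and the choice of taking the first candidate as the default are assumptions made for illustration:

```python
# candidate biosphere3 categories per ILCD category, taken from the examples above
candidate_categories = {
    "Emissions to fresh water": [("water", "surface water")],
    "Emissions to air, unspecified": [("air",)],
    "Emissions to non-agricultural soil": [("soil", "industrial"), ("soil", "forestry")],
}


def pick_category(ilcd_category, candidates=candidate_categories):
    """Return (chosen_category, is_ambiguous); chosen is None if the category is unknown."""
    options = candidates.get(ilcd_category, [])
    if not options:
        return None, False
    # assumption: the first candidate is used as the default choice
    return options[0], len(options) > 1


for name in candidate_categories:
    print(name, "->", pick_category(name))
```

Flagging ambiguity makes it easy to list which exchanges depend on an arbitrary choice.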

Some mappings are not trivial.

Furthermore, some substances do not have a straightforward equivalent in the biosphere3 database (e.g. `cresol`, `acid (as H+)`, `hexamethylene diamine`, `hard coal; 26.3 MJ/kg`). For others the mapping is easier:

In [None]:
category_ilcd11_ecoinvent_dict = {
'Emissions to air, unspecified':('air',),
'Emissions to sea water':('water', 'ocean'),
'Non-renewable material resources from ground':('natural resource', 'in ground'),
'Non-renewable energy resources from ground':('natural resource', 'in ground'),
'Renewable material resources from water':('natural resource', 'in water'),
'Renewable energy resources from water':('natural resource', 'in water'),
'Renewable energy resources from biosphere':('natural resource', 'biotic'),
}

In [None]:
for ds in plastic_eu.data:
    for exchange in ds['exchanges']:

        if 'input' in exchange:
            # leave exchanges that are already linked untouched
            continue

        # in biosphere3 names are capitalised
        try:
            exchange['name'] = exchange['name'].capitalize()
        except AttributeError:
            # the name is not a string (presumably a sequence of names): use its first element
            exchange['name'] = exchange['name'][0].capitalize()

        for key in category_ilcd11_ecoinvent_dict:

            if exchange['category_2'] == key:
                exchange['categories'] = category_ilcd11_ecoinvent_dict[key]

In [None]:
plastic_eu.match_database('biosphere3',fields=['categories','name','unit'])

In [None]:
plastic_eu.statistics()

A slight improvement.

In [None]:
# to test the activity we drop the remaining unlinked exchanges
plastic_eu.drop_unlinked(True)

In [None]:
plastic_eu.write_database()

In [None]:
plastics_eu_act = bd.Database('plastics_europe').random()

A quick tour of the information available:

In [None]:
print(textwrap.fill(plastics_eu_act['general_comment'],120))

In [None]:
print(textwrap.fill(plastics_eu_act['intended_application'],120))

In [None]:
pd.DataFrame(plastics_eu_act['contacts'])

In [None]:
pd.DataFrame(plastics_eu_act.exchanges()).sample(3).T

In [None]:
plastics_eu_act.as_dict().keys()

In [None]:
plastics_lca = plastics_eu_act.lca(gwp100)

In [None]:
plastics_lca.to_dataframe()