# Running pyLCAIO

Welcome to the pyLCAIO tutorial, which shows how to hybridize an LCA database and an IO database using default values.

For this tutorial, ecoinvent3.5 and exiobase3 were chosen as the LCA and IO databases, respectively. Make sure you have access to the ecospold format of ecoinvent3.5 (with unit processes) (https://www.ecoinvent.org/) and a pxp (product by product) file, for whichever reference year you want, from the monetary versions of exiobase3 (https://www.exiobase.eu/index.php/data-download/exiobase3mon).

As downloaded, these databases are not readable by Python (and therefore by pyLCAIO) and must first be converted to a Python-readable format. PyLCAIO relies on the pandas library, so ecoinvent and exiobase will be transformed into pandas dataframes.
To do so, we use two other Python libraries: ecospold2matrix (https://github.com/majeau-bettez/ecospold2matrix) and pymrio (https://github.com/konstantinstadler/pymrio). You need to download these two modules as well.

### Create the dataframe of ecoinvent

To create the dataframe of ecoinvent, follow this other tutorial (https://github.com/majeau-bettez/ecospold2matrix/blob/master/doc/ecospold2matrix_demo.ipynb).

We recommend setting the options "positive_waste=True" and "nan2null=True" and, of course, "fileformats='Pandas'".

After using ecospold2matrix, you should have in the "out_dir" you indicated a pickle file with a similar name to this: ecoinvent3.5.cutoffPandas_symmNorm.gz.pickle.

Note: pickle is a storage format, just like .docx or .pdf, except it's only readable by Python.
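To make the note above concrete, here is a minimal sketch of a gzip-compressed pickle round trip using only the standard library. The dictionary toy_data and the file name toy.gz.pickle are made up for illustration; they are stand-ins, not actual ecospold2matrix output.

```python
import gzip
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for the kind of dictionary stored in the pickle.
toy_data = {"A": [[1.0, 0.0], [0.2, 1.0]], "labels": ["steel", "electricity"]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "toy.gz.pickle"

    # Write: pickle serializes the object, gzip compresses the bytes.
    with gzip.open(path, "wb") as f:
        pickle.dump(toy_data, f)

    # Read: gzip decompresses the bytes, pickle rebuilds the Python object.
    with gzip.open(path, "rb") as f:
        restored = pickle.load(f)

print(restored == toy_data)
```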

We now have ecoinvent in dataframes, but it is in a pickle format. We need to unpickle it:

In [None]:
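import gzip
import pandas as pd

# The file written by ecospold2matrix is a gzip-compressed pickle:
# gzip.open decompresses it and pandas rebuilds the stored objects.
with gzip.open('my_path_to_the_ecoinvent_pickle', 'rb') as f:
    ecoinvent = pd.read_pickle(f)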

Now the ecoinvent variable contains a dictionary (a collection) of dataframes for the different matrices ecoinvent provides (technosphere, biosphere, metadata, characterization).
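A quick way to see what such a dictionary holds is to list its keys together with the type and shape of each entry. The helper below is a sketch: summarize and FakeTable are hypothetical names introduced here, and the toy dictionary only mimics the structure, so the keys you get from the real ecoinvent variable will differ.

```python
def summarize(database):
    """Return (key, type name, shape) for every entry of a dictionary."""
    rows = []
    for key, value in database.items():
        # Dataframes and arrays expose .shape; other objects get None.
        shape = getattr(value, "shape", None)
        rows.append((key, type(value).__name__, shape))
    return rows


class FakeTable:
    """Hypothetical stand-in for a dataframe: it only carries a shape."""

    def __init__(self, shape):
        self.shape = shape


toy = {"matrix": FakeTable((3, 3)), "comment": "not a table"}
for key, type_name, shape in summarize(toy):
    print(key, type_name, shape)

# With the real data, the call would be: summarize(ecoinvent)
```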

### Create the dataframe of exiobase

This step is much simpler, because pymrio already includes a parser. You just need to call the function parse_exiobase3 as follows:

In [None]:
import sys
sys.path.append('my_path_to_pymrio')
import pymrio
io = pymrio.parse_exiobase3('my_path_to_exiobase')
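The sys.path.append line is only needed when the module is not already installed in your Python environment. The sketch below adds the folder only in that case; ensure_importable is a hypothetical helper written for this tutorial, not part of pymrio or pylcaio, and it is meant for top-level module names only.

```python
import importlib
import importlib.util
import sys


def ensure_importable(module_name, fallback_dir):
    """Append fallback_dir to sys.path only if module_name cannot be found.

    Returns True if the module can be located afterwards.
    """
    if importlib.util.find_spec(module_name) is None:
        if fallback_dir not in sys.path:
            sys.path.append(fallback_dir)
        importlib.invalidate_caches()  # make sure the new folder is scanned
    return importlib.util.find_spec(module_name) is not None


# json ships with Python, so the fallback folder is never added here.
print(ensure_importable("json", "my_path_to_pymrio"))

# For this tutorial, the call would be:
# ensure_importable("pymrio", "my_path_to_pymrio")
```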

### Run pyLCAIO

We have now imported everything we need, except pylcaio itself.

In [None]:
import sys
sys.path.append('my_path_to_pylcaio')
import pylcaio

PyLCAIO is divided into three classes: DatabaseLoader, LCAIO and Analysis.

* DatabaseLoader is the class responsible for extracting all the information needed from the ecoinvent and exiobase dataframes we previously imported, and for modifying that information. It also loads the appropriate product concordance, geography concordance, filter, STAM filter matrices and STAM categories.
* LCAIO is the class responsible for the hybridization itself. Once everything has been extracted by DatabaseLoader, LCAIO takes in this data and uses it to create the hybrid database.
* Analysis is the class used to analyze the resulting hybrid databases, for example to calculate life cycle emissions or to perform contribution analyses.

In object-oriented programming, a class must be instantiated to create an object of that class (here we call it database_loader). To initialize DatabaseLoader, four arguments are needed: the unpickled dictionary of dataframes created through ecospold2matrix (previously named ecoinvent), the parsed exiobase file created with pymrio (previously named io), and the names and versions of the two databases to hybridize (here 'ecoinvent3.5' and 'exiobase3').

In [None]:
database_loader = pylcaio.DatabaseLoader(ecoinvent, io, 'ecoinvent3.5', 'exiobase3')

The object database_loader was successfully created. We can now run its ".combine_ecoinvent_exiobase()" method. Its output initializes a second object, of the class LCAIO, which we call lcaio_object.

The operation takes 1 or 2 minutes.

In [None]:
lcaio_object = database_loader.combine_ecoinvent_exiobase()
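To check the runtime quoted above on your own machine, you can wrap the call in a small timer. This is a sketch using only the standard library; timed is a hypothetical helper and the loop is a toy workload standing in for the real call.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label):
    """Print how long the enclosed block took, in seconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        print(f"{label}: {elapsed:.1f} s")


# Toy workload standing in for a long-running call.
with timed("toy step"):
    total = sum(i * i for i in range(100_000))

# In this tutorial, the same pattern would read:
# with timed("combine"):
#     lcaio_object = database_loader.combine_ecoinvent_exiobase()
```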

The lcaio_object has just been initialized, and the ecoinvent and exiobase dataframes, along with other parameters, were transferred to it. We can now hybridize both databases with the method ".hybridize()". This method requires one argument: the name of the method used to correct double counting. There are currently two choices available: 'STAM' or 'binary'. For details on these methods, and to know which one to use, refer to:
* Agez, Maxime, Guillaume Majeau-Bettez, Manuele Margni, Anders Hammer Strømman, and Réjean Samson. 2019. “Lifting the Veil on the Correction of Double Counting Incidents in Hybrid Life Cycle Assessment.” Journal of Industrial Ecology.
* Agez, Maxime, Richard Wood, Manuele Margni, Anders Hammer Strømman, Réjean Samson, and Guillaume Majeau-Bettez. 2019. “Hybridization of Complete LCA and MRIO Databases for a Comprehensive Product System Coverage.” Journal of Industrial Ecology.
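Because the hybridization is a long computation, it can be worth checking the method name before launching it. The helper below is a sketch: check_method is a hypothetical function, not part of pylcaio, and it assumes the two names are case-sensitive. Whether pylcaio performs a similar check itself is not stated here.

```python
VALID_METHODS = ("STAM", "binary")  # the two choices named in this tutorial


def check_method(method):
    """Return method unchanged if it is a listed choice, else raise ValueError."""
    if method not in VALID_METHODS:
        raise ValueError(
            f"Unknown double counting correction {method!r}; "
            f"expected one of {VALID_METHODS}"
        )
    return method


print(check_method("STAM"))

# In this tutorial, the call could then be written as:
# lcaio_object.hybridize(check_method('STAM'))
```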

In [None]:
lcaio_object.hybridize('STAM')

The operation should take around 15 minutes (we are inverting a 26000x26000 matrix here, so it takes a while).
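Some rough arithmetic shows why a matrix of this size is expensive. The sketch assumes the matrix is stored densely as 64-bit floats and that the inversion cost grows with the cube of the dimension, as it does for standard dense methods. How pylcaio stores and inverts the matrix internally is not described here, so treat the numbers as an order of magnitude only.

```python
n = 26_000            # matrix dimension quoted above
bytes_per_value = 8   # 64-bit floats, an assumption

entries = n * n
dense_bytes = entries * bytes_per_value

print(f"{entries:,} entries")               # 676,000,000 entries
print(f"{dense_bytes / 1e9:.2f} GB dense")  # 5.41 GB dense
print(f"about {n**3:.2e} operations")       # about 1.76e+13 operations
```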

We can now save the resulting hybrid database in a pickle, so that the hybridization process does not have to be rerun every time. By default, the created pickle will appear in the /src/Databases/hybrid_databases/ folder of your pylcaio package.

In [None]:
lcaio_object.save_system('pickle')
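The name of the saved file is not given here, so one way to find it is to look for the most recently modified file in the output folder. The sketch below uses only the standard library; newest_file is a hypothetical helper, and the temporary folder with two empty files only simulates the real one.

```python
import os
import tempfile
from pathlib import Path


def newest_file(folder):
    """Return the most recently modified file in folder, or None if it has none."""
    files = [p for p in Path(folder).iterdir() if p.is_file()]
    return max(files, key=lambda p: p.stat().st_mtime, default=None)


# Simulate an output folder holding an older and a newer file.
with tempfile.TemporaryDirectory() as tmp_dir:
    old = Path(tmp_dir) / "old.pickle"
    new = Path(tmp_dir) / "new.pickle"
    old.write_bytes(b"")
    new.write_bytes(b"")
    os.utime(old, (1_000, 1_000))  # force distinct modification times
    os.utime(new, (2_000, 2_000))
    found_name = newest_file(tmp_dir).name

print(found_name)  # new.pickle

# With the real package, the call would be:
# newest_file('my_path_to_pylcaio/src/Databases/hybrid_databases')
```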