# Analyze the hybrid database

This tutorial shows how to use the methods of the Analysis class of pyLCAIO to perform life cycle calculations and contribution analyses.

Before starting this tutorial, you need to have created and saved your hybrid database (see the "Running_pyLCAIO" notebook). You should therefore have a hybrid database stored in the src/Databases folder of pyLCAIO. The name of the file depends on the parameters used to create the hybrid database. For example, in the "Running_pyLCAIO" notebook, we used ecoinvent3.5 and exiobase3 and relied on the STAM method to correct double counting.

Begin with import statements:

In [None]:
import pandas as pd
import sys
sys.path.append('path_to_pylcaio/src/')
import pylcaio

Creating an object of the Analysis class requires three parameters: the name and version of each of the two databases, and the method used to correct double counting. The object then looks in the src/Databases/ folder and loads the hybrid system corresponding to these parameters. This hybrid system contains all the important matrices created during the hybridization process.

In [None]:
analysis_object = pylcaio.Analysis('ecoinvent3.5','exiobase3',method_double_counting='STAM')

The Analysis class has two main methods:
* calc_lifecycle(), performing life cycle calculations
* contribution_analysis(), performing contribution analyses

# Life cycle calculations

In [None]:
analysis_object.calc_lifecycle()

Calculations should take between 5 and 10 minutes, depending on the computing power of your machine. The results are stored in the d matrix.

In [None]:
analysis_object.d

To manipulate the results, you can either rely on pandas (and by extension on pyLCAIO) or export the matrix to Excel using the following command:

In [None]:
analysis_object.d.to_excel('put_the_path_where_you_want_the_excel_sheet_to_be/name_of_the_excel_sheet.xlsx')
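If you prefer to stay in pandas rather than export, the sketch below shows one common manipulation: ranking processes by their score for a single impact method. It runs on a small invented table, not on the real d matrix; the labels, values and the name `toy_d` are assumptions made for illustration only.

```python
import pandas as pd

# Toy stand-in for analysis_object.d: rows are impact methods, columns are processes.
# All labels and values are invented for illustration.
toy_d = pd.DataFrame(
    [[1.2, 0.4, 3.1], [0.02, 0.05, 0.01]],
    index=["toy climate change method", "toy acidification method"],
    columns=["process_a", "process_b", "process_c"],
)

# Select one impact method (a row) and rank the processes from highest to lowest score.
ranked = toy_d.loc["toy climate change method"].sort_values(ascending=False)

# Keep the two highest-scoring processes.
top_two = ranked.head(2)
print(top_two)
```

The same two calls should apply to the real matrix once you replace the toy row label with an actual impact method name.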

The following section describes the manipulation of the results using pyLCAIO. 

To navigate within the matrix, use the .loc indexer, where the first argument selects the rows (in this case impact methods) and the second argument selects the columns (in this case the UUIDs of ecoinvent processes).

In [None]:
analysis_object.d.loc['CML 2001; climate change; GWP 100a; kg CO2-Eq','a96cb241-a4a9-4980-a16a-ba4b6a80175e_aeaf5266-3f9c-4074-bd34-eba76a61760c']
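To make the row/column logic of .loc concrete, here is a minimal sketch on an invented table with the same layout (impact methods as rows, process identifiers as columns). The labels and numbers are placeholders, not real ecoinvent data.

```python
import pandas as pd

# Toy table with the same layout as the d matrix; labels and values are invented.
toy_d = pd.DataFrame(
    [[1.2, 0.4], [0.02, 0.05]],
    index=["toy climate change method", "toy acidification method"],
    columns=["uuid_activity_1_uuid_product_1", "uuid_activity_2_uuid_product_2"],
)

# One row label and one column label return a single value.
single_score = toy_d.loc["toy climate change method", "uuid_activity_1_uuid_product_1"]

# A row label with ':' returns the scores of every process for that impact method.
all_processes = toy_d.loc["toy climate change method", :]

# ':' with a column label returns every impact method for that process.
all_methods = toy_d.loc[:, "uuid_activity_2_uuid_product_2"]

print(single_score)
```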

__How do I scroll through the available impact methods to select the one I want?__

First, note that only CML2001 is implemented in exiobase. Adding CO2 characterized with ReCiPe (for the ecoinvent part) to CO2 characterized with CML2001 (for the exiobase part) will still work, but we recommend limiting the use of impact methods other than CML2001.

To get the exact name of the method, use the get_available_impact_methods() method. Say we want to see the names of the CML2001 methods for Global Warming Potential. The method returns two elements. The first one (i.e., the element indexed as 0 using Python logic) is the name of the method as used by __ecoinvent__ while the second (the element indexed as 1) is the name as used by __exiobase__.

In [None]:
#for ecoinvent
analysis_object.get_available_impact_methods('GWP')[0]

In [None]:
#for exiobase
analysis_object.get_available_impact_methods('GWP')[1]

__How do I match a UUID to the name and other characteristics of the process?__

The link between UUIDs and the metadata of processes is made in the PRO_f matrix. To make it easier to find the correct UUID, use the navigate_through_PRO_f() method. It takes three arguments (product, geography and activity); you can pass any one of them or a combination.

For example, to find every process that produces 'barley' and is located in Quebec:

In [None]:
analysis_object.navigate_through_PRO_f(product='barley',geography='CA-QC')

You then only need to copy and paste the index of the process you are interested in.
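The idea behind this kind of lookup can be sketched with plain pandas on an invented metadata table. This is not the implementation of navigate_through_PRO_f(): the column names, the helper `filter_metadata` and the matching rules (substring match for product and activity, exact match for geography) are all assumptions made for illustration.

```python
import pandas as pd

# Toy metadata table indexed by process identifier.
# Column names and contents are hypothetical, not the real PRO_f layout.
toy_metadata = pd.DataFrame(
    {
        "product": ["barley grain", "barley grain", "wheat grain"],
        "geography": ["CA-QC", "DE", "CA-QC"],
        "activity": ["barley production", "barley production", "wheat production"],
    },
    index=["uuid_1", "uuid_2", "uuid_3"],
)


def filter_metadata(table, product=None, geography=None, activity=None):
    """Keep the rows that match every criterion that was provided."""
    mask = pd.Series(True, index=table.index)
    if product is not None:
        mask &= table["product"].str.contains(product, regex=False)
    if geography is not None:
        mask &= table["geography"] == geography
    if activity is not None:
        mask &= table["activity"].str.contains(activity, regex=False)
    return table[mask]


matches = filter_metadata(toy_metadata, product="barley", geography="CA-QC")

# The index of the matching row is the identifier to reuse with .loc.
selected_uuid = matches.index[0]
print(selected_uuid)
```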

To determine, for example, the percentage increase in GWP due to the hybridization of the Quebec barley process, you only need basic operations now that you can access the data with .loc. Variables are used here to make the calculation clearer:

In [None]:
# UUID of the Quebec barley process (copied from navigate_through_PRO_f)
barley_uuid = 'a96cb241-a4a9-4980-a16a-ba4b6a80175e_aeaf5266-3f9c-4074-bd34-eba76a61760c'
GWP_traditional_ecoinvent = analysis_object.d.loc[analysis_object.get_available_impact_methods('GWP')[0], barley_uuid]
GWP_added_emissions_through_hybridization = analysis_object.d.loc[analysis_object.get_available_impact_methods('GWP')[1], barley_uuid][0]

GWP_hybrid_database = GWP_traditional_ecoinvent + GWP_added_emissions_through_hybridization

# Percentage increase relative to the original ecoinvent score
increase = GWP_added_emissions_through_hybridization / GWP_traditional_ecoinvent * 100
increase

# Contribution analyses

The contribution_analysis method has three arguments. 

type_of_analysis specifies the type of contribution analysis wanted. It can cover only the original ecoinvent inputs (origin), only the inputs added by the hybridization (added), or both (both).

UUID corresponds to the UUID of the process to analyze (use navigate_through_PRO_f() again to find it).

impact_category corresponds to the impact category to analyze (GWP100, Acidification, Eutrophication or Human toxicity).

In [None]:
df = analysis_object.contribution_analysis(type_of_analysis='both',
                                           UUID='a96cb241-a4a9-4980-a16a-ba4b6a80175e_aeaf5266-3f9c-4074-bd34-eba76a61760c',
                                           impact_category='GWP100')

The calculation takes around 10-15 minutes. You can then either manipulate the resulting dataframe (df) with pandas, if you are familiar with it, or export the results to an Excel sheet.

In [None]:
df.to_excel('put_the_path_where_you_want_the_excel_sheet_to_be/name_of_the_excel_sheet.xlsx')
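For the pandas route, one typical step is to turn absolute contributions into percentage shares and rank them. The sketch below does this on an invented series; the actual layout of the dataframe returned by contribution_analysis() may differ, so the labels, values and the name `toy_contributions` are assumptions for illustration.

```python
import pandas as pd

# Invented contributions to a single impact score; labels and values are placeholders.
toy_contributions = pd.Series(
    {"fertiliser input": 0.30, "diesel input": 0.15, "added service input": 0.05},
    name="toy GWP contribution",
)

# Convert each contribution to a percentage of the total, largest first.
shares = (toy_contributions / toy_contributions.sum() * 100).sort_values(ascending=False)

# Label of the input with the largest share.
largest_contributor = shares.idxmax()

print(shares.round(1))
```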