# Adding processes to the hybrid database

This notebook shows how to add primary data to the hybrid database. Primary data is introduced through an Excel template (which you can find in the pylcaio/src/ folder). This tutorial uses the Template_example.xlsx file in your pylcaio/doc/ folder.

The template only supports integration into ecoinvent3.5 for now.

The template __only__ enables the creation of unit processes, meaning that it does not support allocation. It contains 5 tabs:
* General info tab which gives general information about the template and its requirements
* Metadata_foreground tab in which you will enter all the metadata of the processes you wish to introduce into the database (i.e., name, geography, price, etc.)
* Unit_processes_exchanges tab where you will construct the input/output system of your created processes
* Technosphere_background tab in which the metadata of all ecoinvent3.5 processes is contained (notably their UUIDs which you will need)
* Biosphere_background tab in which the metadata of all ecoinvent3.5 emissions is contained (with their UUIDs again)

### First step: the metadata of your foreground processes

In the metadata_foreground tab there are 24 parameters to define an ecoinvent process, of which 9 are mandatory (in orange). Two others are mandatory only if the process is to be hybridized (in blue). The other parameters are not required by pyLCAIO and thus can be left empty (in yellow). 

PyLCAIO reads ecoinvent data directly from the ecospold files and therefore relies on ecoinvent UUIDs (Universally Unique IDentifiers) to select processes. Introducing new data into this structure thus requires UUIDs for the primary data you wish to add. More precisely, up to 3 UUIDs can be necessary for each process to introduce:
* productIds are unique identifiers to select products (e.g., the product "wheat" is identified through a unique ID regardless of the activity that produces it)
* activityIds are unique identifiers to select activities (e.g., the "petroleum refinery operation" activity is identified through a unique ID regardless of the different coproducts it generates, but the ID differentiates between the refinery in the US and the refinery in Canada)
* activityNameIds are unique identifiers to select the activity name precisely (e.g., the "petroleum refinery operation" activity is identified through a unique ID regardless of the different coproducts it generates and regardless of the geography)

To add a process you therefore need to provide a productId, an activityId and an activityNameId. Keep in mind that each can come from an already existing product/activity of ecoinvent, in which case there is no need to create a new ID. Use Ctrl+F to peek into the ecoinvent metadata in the technosphere_background tab.

If the product/activity to introduce is not already in ecoinvent, generate a UUID of your own. There are multiple ways to do this; we recommend simply using the website https://www.uuidgenerator.net/
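If you would rather stay inside the notebook, Python's standard `uuid` module can also generate identifiers. The snippet below is a minimal sketch of that alternative and is not part of pyLCAIO; the helper name `new_uuid` is hypothetical.

```python
import uuid


def new_uuid() -> str:
    """Return a random (version 4) UUID as a hyphenated string."""
    return str(uuid.uuid4())


# One fresh identifier for each ID that does not already exist in ecoinvent
product_id = new_uuid()
activity_id = new_uuid()
activity_name_id = new_uuid()

print(product_id)
print(activity_id)
print(activity_name_id)
```

Copy the printed values into the corresponding cells of the metadata_foreground tab.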

Once you enter the IDs, notice that a __main UUID__ is automatically built (combination of productId and activityId) which quickly identifies the process as a whole. Two processes *__cannot__* have the same main UUID.

<img src="images/prtsc3.png">
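Since two processes cannot share a main UUID, it can help to check for clashes before loading the template. The sketch below assumes each row of the metadata has been read into a dictionary with `productId` and `activityId` keys (hypothetical key names, as is the helper `find_duplicate_processes`). It compares the pair of IDs rather than the main UUID itself, so it makes no assumption about how the template combines them.

```python
from collections import Counter


def find_duplicate_processes(rows):
    """Return the (productId, activityId) pairs that appear more than once."""
    counts = Counter((row["productId"], row["activityId"]) for row in rows)
    return [pair for pair, n in counts.items() if n > 1]


# Placeholder IDs for illustration; real entries would be full UUIDs
rows = [
    {"productId": "p-1", "activityId": "a-1"},
    {"productId": "p-1", "activityId": "a-2"},  # same product, other activity: fine
    {"productId": "p-1", "activityId": "a-1"},  # clashes with the first row
]

duplicates = find_duplicate_processes(rows)
print(duplicates)
```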

The next step is to fill in the different names of the products/activities. If the productId/activityId of your foreground process already exists in ecoinvent, you need to use the __exact__ same productName/activityName (watch out for typos). The same goes for its price, geography and unit.

In hybrid LCA, prices are critical: the complement from IO is linearly proportional to the price, so the more precise this data is, the better. All prices must also be expressed in the same unit: €2005. To adjust for inflation you can use this website: http://www.in2013dollars.com/eu/inflation/2005?amount=100 and apply the appropriate yearly conversion rates. You do not have to worry about the year of the exiobase3 version you are using, as pyLCAIO already deals with it.
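As a rough illustration of the conversion, the sketch below deflates a price in current euros to €2005 by dividing by one plus the cumulative inflation since 2005. The function `to_eur2005` is hypothetical, and the 25 % figure is a made-up placeholder, not a real statistic: look up the actual cumulative rate for your reference year.

```python
def to_eur2005(price_eur: float, cumulative_inflation_since_2005: float) -> float:
    """Deflate a price in current euros to 2005 euros.

    cumulative_inflation_since_2005 is a fraction, e.g. 0.25 for 25 %.
    """
    if cumulative_inflation_since_2005 <= -1:
        raise ValueError("cumulative inflation must be greater than -100 %")
    return price_eur / (1.0 + cumulative_inflation_since_2005)


# Illustrative numbers only (placeholder inflation rate, not real data)
price_2005 = to_eur2005(50.0, 0.25)
print(price_2005)
```

If the original price is in another currency, it would first have to be converted to euros, a step this sketch leaves out.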

The 3 last columns are parameters introduced by pyLCAIO to perform the hybridization and are thus not in ecoinvent. They must therefore be completed by the user even for existing IDs of ecoinvent.

__io_geography__ corresponds to the geographic resolution with which the process will be hybridized. Most of the time, geography and io_geography will be equal. Yet, for special cases (e.g., provinces) they will differ. If the foreground process geography is CA-QC (for the province of Quebec in Canada), its io_geography cannot be CA-QC, because the Quebec province is not in exiobase. You would thus have to enter the closest geography from exiobase as a proxy, which would be CA (Canada as a whole). The available countries and regions are listed in pylcaio/src/Data/eco3.5_exio3/regions.txt and pylcaio/src/Data/eco3.5_exio3/countries.txt.
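The proxy choice for sub-national geographies can be sketched as follows. This is only an illustration of the CA-QC example, not pyLCAIO's own logic: the helper `io_geography_proxy` is hypothetical, and `EXAMPLE_REGIONS` is a small hand-written subset standing in for the full lists in the files mentioned above.

```python
def io_geography_proxy(geography: str, exiobase_regions: set) -> str:
    """Return the geography itself if exiobase has it, else its parent country."""
    if geography in exiobase_regions:
        return geography
    # Sub-national codes such as "CA-QC" start with the country code
    country = geography.split("-")[0]
    if country in exiobase_regions:
        return country
    raise ValueError(f"No exiobase proxy found for {geography!r}; choose one manually")


# Illustrative subset only; the real lists are much longer
EXAMPLE_REGIONS = {"CA", "US", "FR"}

print(io_geography_proxy("CA-QC", EXAMPLE_REGIONS))
print(io_geography_proxy("FR", EXAMPLE_REGIONS))
```

Geographies that fit neither case raise an error on purpose, since picking the closest proxy then remains a judgment call.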

Rest of the World (RoW) regions are accepted, but another non-RoW region must then be defined for the same activityId; otherwise the RoW will default to GLO.

__ProductTypeName__ corresponds to the sector of the economy with which the foreground process will be hybridized. For instance, if you create a process describing the production of a Tesla, then your process should be hybridized with the "Motor vehicles, trailers and semi-trailers" sector. To check the 200 available sectors of exiobase, refer to the "Available_sectors.txt" file in your pylcaio/doc/ folder.

__to_hybridize__ indicates whether you wish the process to be hybridized or not. Enter "yes" or "no". An empty cell is treated as "no". If this parameter is set to "no", io_geography and ProductTypeName are not required.

<img src="images/prtsc4.png">
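The rules for the three pyLCAIO-specific columns can be summarized in a small check. The sketch below is not pyLCAIO code: `check_hybridization_fields` is a hypothetical helper, and it assumes a metadata row is available as a dictionary keyed by the column names used above.

```python
def check_hybridization_fields(row: dict) -> list:
    """Return a list of problems found in the three pyLCAIO-specific columns."""
    problems = []
    # An empty or missing cell counts as "no"
    flag = (row.get("to_hybridize") or "").strip().lower()
    if flag not in ("yes", "no", ""):
        problems.append("to_hybridize must be 'yes', 'no' or empty")
    if flag == "yes":
        for column in ("io_geography", "ProductTypeName"):
            if not (row.get(column) or "").strip():
                problems.append(f"{column} is required when to_hybridize is 'yes'")
    return problems


incomplete = {"to_hybridize": "yes", "io_geography": "CA", "ProductTypeName": ""}
not_hybridized = {"to_hybridize": "", "io_geography": "", "ProductTypeName": ""}

print(check_hybridization_fields(incomplete))
print(check_hybridization_fields(not_hybridized))
```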

### Build the system structure

After entering the metadata of the foreground processes to add, we will now specify the inputs and outputs of these processes in the Unit_processes_exchanges tab. As a reminder, and as the name of the tab states, the template works with unit processes only.

In this tab, the green columns correspond to the destination activity, i.e., the process to be added. These columns are filled automatically once you enter the ID of the process in the ProcessId column. Since it is a unit process, an output of 1 is automatically produced for the processes' product as well (in the blue columns).

<img src="images/prtsc5.png">

To add inputs/outputs, enter their main ID and the value of the input/output. Previously created processes can be added as inputs/outputs, as can classic ecoinvent processes (technosphere_background). Only ecoinvent emissions can be added, through their EmissionId (biosphere_background).

This template uses the LCA convention, meaning that outputs have a positive sign while inputs have a negative sign.

<img src="images/prtsc6.png">
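The sign convention is easy to get wrong when copying values from another source, so a small helper can make it explicit. This is a sketch only; `signed_exchange` is a hypothetical function, not part of the template or of pyLCAIO.

```python
def signed_exchange(amount: float, direction: str) -> float:
    """Apply the template's sign convention: outputs positive, inputs negative."""
    magnitude = abs(amount)
    if direction == "output":
        return magnitude
    if direction == "input":
        return -magnitude
    raise ValueError("direction must be 'input' or 'output'")


# A process producing 1 unit of its product while consuming 2.5 units of an input
product_value = signed_exchange(1.0, "output")
input_value = signed_exchange(2.5, "input")

print(product_value, input_value)
```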

To track the background processes, use the navigate_through_PRO_f() method (see the Analyze the hybrid database notebook for how to use it) or use Ctrl+F.

Now that the processes are added, they will automatically be loaded when using the combine_ecoinvent_exiobase() method in a typical run of pyLCAIO.