## 0.0 setup

### 0.0.0 initial setup

1. [set up your Conda environment](https://conda.io/projects/conda/en/latest/user-guide/getting-started.html) using the provided environment file `conda_env_hybridization.yml`:

```
conda env create -f <path to environment yaml file>
```

2. clone the [`pylcaio` repository](https://github.com/OASES-project/pylcaio)

### 0.0.1 known error messages

🌐 see also the issues labelled __bug__ in the [`bw_hybrid` repository](https://github.com/michaelweinold/bw_hybrid/issues?q=is%3Aopen+is%3Aissue+label%3Abug)

1. [`AttributeError: 'IOSystem' object has no attribute 'Z' and then KeyError: 'PRO'`](https://github.com/michaelweinold/pylcaio_integration_with_brightway/issues/4) \
Caused by repeated execution of the `database_loader.combine_ecoinvent_exiobase()` and/or `lcaio_object.hybridize()` functions.

2. `brightway` import [currently breaks `%autoreload` magic](https://github.com/brightway-lca/brightway2/issues/49)
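
For the first error in the list above, one possible workaround is to run the mutating call on a deep copy, so the original object stays intact when a cell is re-executed. This is a minimal sketch, assuming the failure comes from the methods modifying their object in place. `FakeLoader` and `run_on_copy` are hypothetical names and not part of `pylcaio`:

```python
import copy


class FakeLoader:
    """Hypothetical stand-in for an object whose method consumes its own state."""

    def __init__(self) -> None:
        self.Z = [[1.0, 0.0], [0.0, 1.0]]

    def combine(self) -> list:
        # Mimics a method that removes an attribute, so a second call would fail.
        result = self.Z
        del self.Z
        return result


def run_on_copy(obj, method_name: str):
    """Call `method_name` on a deep copy of `obj`, leaving `obj` untouched."""
    working_copy = copy.deepcopy(obj)
    return getattr(working_copy, method_name)()


loader = FakeLoader()
first = run_on_copy(loader, "combine")
second = run_on_copy(loader, "combine")  # safe: the original still has `Z`
```

Deep copies of objects holding whole databases can be memory-hungry; reloading a pickled instance from disk is an alternative with the same effect.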

## 0.1. imports
### 0.1.1. regular imports

In [1]:
# i/o
import sys
import os
from pathlib import Path
import gzip
import pickle
# configuration
import yaml
# lca
import ecospold2matrix as e2m
import pymrio
import brightway2 as bw
# type hints
from ecospold2matrix import ecospold2matrix
from pymrio import IOSystem
# data science
import pandas as pd
# deep copy
import copy

### 0.1.2. local imports

set location of the cloned `pylcaio` repository

In [None]:
path_pylcaio: str = '~/github/pylcaio/'

append `pylcaio` location to system path to ensure it can be used by Python

In [4]:
# expand '~' to the home directory; joining it onto Path.home() would yield '<home>/~/github/pylcaio/'
sys.path.append(os.path.expanduser(path_pylcaio)) # required for local import of pylcaio
import pylcaio

## 0.2. file paths

set location of databases (Ecoinvent and Exiobase) for use by the appropriate Python packages

### 0.2.1. directories

In [5]:
%%capture
# home directory
print(path_dir_home := Path.home())
# input directory
print(path_dir_databases := os.path.join(path_dir_home, config['path_dir_databases']))
# output directories
print(path_dir_data := os.path.join(path_dir_home, config['path_dir_data']))
# path_dir_data already contains the home directory
print(path_dir_pylcaio := os.path.join(path_dir_data, config['path_dir_pylcaio']))
print(path_dir_pymrio := os.path.join(path_dir_data, config['path_dir_pymrio']))
print(path_dir_e2m := os.path.join(path_dir_data, config['path_dir_e2m']))
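
The directory cell relies on a `config` mapping that is defined outside the cells shown here (presumably loaded from a YAML file, given the `yaml` import). Below is a sketch of the same path construction with `pathlib`, assuming `config` is a plain `dict`. Only the keys come from the notebook; the values and the function name `build_directories` are hypothetical:

```python
from pathlib import Path

# Hypothetical values; only the keys appear in the notebook.
config = {
    "path_dir_databases": "databases",
    "path_dir_data": "data",
    "path_dir_pylcaio": "data_pylcaio",
    "path_dir_pymrio": "data_pymrio",
    "path_dir_e2m": "data_e2m",
}


def build_directories(config: dict, home: Path) -> dict:
    """Return the input and output directories as absolute `Path` objects."""
    path_dir_data = home / config["path_dir_data"]
    return {
        "databases": home / config["path_dir_databases"],
        "data": path_dir_data,
        "pylcaio": path_dir_data / config["path_dir_pylcaio"],
        "pymrio": path_dir_data / config["path_dir_pymrio"],
        "e2m": path_dir_data / config["path_dir_e2m"],
    }


directories = build_directories(config, Path.home())
for name, path in directories.items():
    print(f"{name}: {path}")
```

Collecting the paths in one function keeps the home directory in a single place, so the output directories cannot accidentally be joined onto it twice.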

### 0.2.2. files

In [6]:
%%capture
# databases
# path_dir_databases already contains the home directory
print(path_exiobase := os.path.join(path_dir_databases, config['exiobase']))
print(path_dir_ecoinvent := os.path.join(path_dir_databases, config['ecoinvent']))
# pylcaio output
print(path_pylcaio_database_loader_class_instance := os.path.join(path_dir_pylcaio, config['pylcaio_database_loader_class_instance']))
print(path_pylcaio_class_instance_before_hybrid := os.path.join(path_dir_pylcaio, config['pylcaio_class_instance_before_hybrid']))
print(path_pylcaio_class_instance_after_hybrid := os.path.join(path_dir_pylcaio, config['pylcaio_class_instance_after_hybrid']))
# pymrio output
print(path_pymrio_class_instance := os.path.join(path_dir_pymrio, config['pymrio_class_instance']))

## 1.1. read databases and create Pandas dataframes

### 1.1.1. read Exiobase database and save pickle to disk

❔ creates a `pymrio.IOSystem` class instance (a collection of `pd.DataFrame`s etc.) \
⏳ ~1 min if `load_from_disk == False`

In [10]:
path_pymrio_class_instance

'/Users/michaelweinold/data/data_pymrio/pymrio_class_instance.pkl'

In [9]:
%%time
if load_from_disk:
    # reuse the instance pickled by an earlier run
    with open(path_pymrio_class_instance, 'rb') as filestream:
        exiobase: pymrio.IOSystem = pd.read_pickle(filestream)
else:
    # parse the raw Exiobase files and pickle the result for later runs
    exiobase: pymrio.IOSystem = pymrio.parse_exiobase3(path_exiobase)
    with open(path_pymrio_class_instance, 'wb') as file_handle:
        pickle.dump(obj = exiobase, file = file_handle, protocol=pickle.HIGHEST_PROTOCOL)

ReadError: Given path does not exist
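
The load-or-parse logic in the cell above can be factored into a small helper that also creates the output directory and reports a missing cache file explicitly. This is a sketch using only the standard library. `load_or_build` and its `builder` argument are hypothetical names, with `builder` standing in for a call such as `pymrio.parse_exiobase3(path_exiobase)`:

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Callable


def load_or_build(
    path_pickle: Path,
    builder: Callable[[], Any],
    load_from_disk: bool,
) -> Any:
    """Load a pickled object, or build it and write the pickle for next time."""
    if load_from_disk:
        if not path_pickle.is_file():
            raise FileNotFoundError(f"No cached pickle at {path_pickle}")
        with open(path_pickle, "rb") as file_handle:
            return pickle.load(file_handle)
    obj = builder()
    # Make sure the output directory exists before writing the cache.
    path_pickle.parent.mkdir(parents=True, exist_ok=True)
    with open(path_pickle, "wb") as file_handle:
        pickle.dump(obj, file_handle, protocol=pickle.HIGHEST_PROTOCOL)
    return obj


# Demonstration with a plain dict standing in for the parsed database.
with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "cache" / "instance.pkl"
    built = load_or_build(cache, lambda: {"Z": [1, 2, 3]}, load_from_disk=False)
    loaded = load_or_build(cache, lambda: {}, load_from_disk=True)
```

Passing the parser as a callable means it only runs when the cache is not used, which is where the time cost noted above would apply.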

### 1.1.2. read ecoinvent

❔ creates an `e2m.Ecospold2Matrix` class instance and writes the dataframes to the defined output directory in `pickle` format. \
⏳ ~12 min if `load_from_disk == False`

In [17]:
%%capture
print(e2m_project_name := config['e2m_project_name'])
print(path_dir_e2m_logs := os.path.join(path_dir_e2m, e2m_project_name, config['path_dir_e2m_logs']))
print(path_file_e2m_pickle := os.path.join(path_dir_e2m, e2m_project_name + config['e2m_pickle_filename']))
print(pattern_e2m_characterization_db := '*.db')

ecoinvent_3_5_cutoff
/home/weinold/data/data_e2m/ecoinvent_3_5_cutoff/_log
/home/weinold/data/data_e2m/ecoinvent_3_5_cutoffPandas_symmNorm.gz.pickle
*.db


In [None]:
def delete_e2m_files(list_string: list) -> None:
    """Delete every path or glob pattern in the list (IPython shell escape)."""
    for i in list_string:
        !rm -rf $i

In [11]:
def copy_ecoinvent_from_shared_to_local(path: str) -> None:
    """Copy the shared ecoinvent files into the local directory `path`."""
    !mkdir -p $path && cp -ru /srv/data/ecoinvent-3.5-cutoff/* $path
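
The two helpers above shell out through IPython's `!` syntax, so they only work inside a notebook on a Unix-like system. Below is a portable sketch of the deletion step using the standard library. `delete_paths` is a hypothetical name. A bare pattern such as `*.db` is expanded relative to the current working directory, as the shell would do:

```python
import glob
import os
import shutil
import tempfile


def delete_paths(patterns: list) -> list:
    """Delete every file or directory matching the given paths or glob patterns.

    Missing paths are skipped silently, similar to `rm -rf`.
    Returns the list of paths that were actually removed.
    """
    removed = []
    for pattern in patterns:
        for match in glob.glob(pattern):
            if os.path.isdir(match) and not os.path.islink(match):
                shutil.rmtree(match)
            else:
                os.remove(match)
            removed.append(match)
    return removed


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "logs"))
    open(os.path.join(tmp, "characterisation.db"), "w").close()
    removed = delete_paths(
        [
            os.path.join(tmp, "logs"),
            os.path.join(tmp, "*.db"),
            os.path.join(tmp, "absent"),
        ]
    )
    remaining = os.listdir(tmp)
```

Returning the removed paths makes it easy to log what was deleted before a long parsing run starts.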

In [13]:
if not load_from_disk:
    # regenerate the ecoinvent matrices from the raw ecospold files
    delete_e2m_files(
        [
            path_dir_e2m,
            path_dir_e2m_logs,
            path_dir_ecoinvent,
            pattern_e2m_characterization_db
        ]
    )
    copy_ecoinvent_from_shared_to_local(path_dir_ecoinvent)
    parser = e2m.Ecospold2Matrix(
        sys_dir = path_dir_ecoinvent,
        project_name = e2m_project_name,
        out_dir = path_dir_e2m,
        #characterisation_file = path_e2m_char_file,
        positive_waste = False,
        nan2null = True)
    parser.ecospold_to_Leontief(
        fileformats = 'Pandas',
        with_absolute_flows=True)
# in both cases, read the gzip-compressed pickle written by ecospold2matrix
with gzip.open(path_file_e2m_pickle, 'rb') as filestream:
    ecoinvent: dict = pd.read_pickle(filestream)

2022-09-12 11:28:40,279 - ecoinvent_3_5_cutoff - INFO - Ecospold2Matrix Processing
2022-09-12 11:28:40,282 - ecoinvent_3_5_cutoff - INFO - Current git commit: e9d511bcbee84ffbbaee8d6b2bb8b8565815bff0
2022-09-12 11:28:40,282 - ecoinvent_3_5_cutoff - INFO - Project name: ecoinvent_3_5_cutoff
2022-09-12 11:28:40,283 - ecoinvent_3_5_cutoff - INFO - Unit process and Master data directory: /home/weinold/data_pylcaio_input/ecoinvent-3.5-cutoff
2022-09-12 11:28:40,283 - ecoinvent_3_5_cutoff - INFO - Data saved in: /home/weinold/data/data_e2m
2022-09-12 11:28:40,284 - ecoinvent_3_5_cutoff - INFO - Replace Not-a-Number instances with 0.0 in all matrices
2022-09-12 11:28:40,284 - ecoinvent_3_5_cutoff - INFO - Pickle intermediate results to files
2022-09-12 11:28:40,285 - ecoinvent_3_5_cutoff - INFO - Order processes based on: ISIC, activityName
2022-09-12 11:28:40,285 - ecoinvent_3_5_cutoff - INFO - Order elementary exchanges based on: comp, name, subcomp
rm: cannot remove 'ecoinvent_3_5_cutoff_c

FileNotFoundError: [Errno 2] No such file or directory: '/home/weinold/data/data_e2m/ecoinvent_3_5_cutoff/Pandas_symmNorm.gz.pickle'
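
The `FileNotFoundError` above indicates that the expected pickle file is not at the path the notebook looks for. Below is a sketch of a reader that lists the candidate files under a search directory when the expected one is missing. `read_gzip_pickle` and `search_root` are hypothetical names, and the `*.gz.pickle` suffix is taken from the file names in the output above:

```python
import gzip
import pickle
import tempfile
from pathlib import Path
from typing import Any


def read_gzip_pickle(path_file: Path, search_root: Path) -> Any:
    """Read a gzip-compressed pickle; if missing, list candidates under `search_root`."""
    if not path_file.is_file():
        candidates = sorted(str(p) for p in search_root.rglob("*.gz.pickle"))
        raise FileNotFoundError(
            f"{path_file} not found; candidates under {search_root}: {candidates}"
        )
    with gzip.open(path_file, "rb") as filestream:
        return pickle.load(filestream)


# Demonstration: the file exists under a slightly different name than expected.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    actual = root / "projectPandas_symmNorm.gz.pickle"
    with gzip.open(actual, "wb") as filestream:
        pickle.dump({"A": [1, 2]}, filestream)
    data = read_gzip_pickle(actual, search_root=root)
    try:
        read_gzip_pickle(root / "project" / "Pandas_symmNorm.gz.pickle", search_root=root)
    except FileNotFoundError as error:
        message = str(error)
```

The error message then names the files that do exist, which helps tell a path-construction mismatch apart from a parsing run that never wrote its output.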

## 2.1. main `pylcaio` functionality

In [19]:
database_loader: pylcaio.DatabaseLoader = pylcaio.DatabaseLoader(
    lca_database_processed = ecoinvent,
    io_database_processed = exiobase,
    lca_database_name_and_version = 'ecoinvent3.5',
    io_database_name_and_version = 'exiobase3')

In [20]:
with open(path_pylcaio_database_loader_class_instance, 'wb') as file_handle:
    pickle.dump(obj = database_loader, file = file_handle, protocol=pickle.HIGHEST_PROTOCOL)

In [21]:
lcaio_object: pylcaio.LCAIO = database_loader.combine_ecoinvent_exiobase(
    complete_extensions = False,
    impact_world = False,
    regionalized = False)

No path for the capital folder was provided. Capitals will not be endogenized


In [22]:
with open(path_pylcaio_class_instance_before_hybrid, 'wb') as file_handle:
    pickle.dump(obj = lcaio_object, file = file_handle, protocol=pickle.HIGHEST_PROTOCOL)

In [23]:
lcaio_object.hybridize(
    price_neutral_cut_off_matrix = 'STAM',
    capitals = False,
    priceless_scaling = True)

Indentifying Rest of World regions...
Updating electricity prices...
Calculating productions volumes...
Adjusting low production volume processes...
Extending inventory...
Building H matrix...


  self.H = self.H.append([self.H] * (self.number_of_countries_IO + self.number_of_RoW_IO - 1))


Building geography concordance...
Filter H matrix...
Build Cut-off matrix...
Add processes with 'priceless scaling' to Cut-off matrix...


In [24]:
with open(path_pylcaio_class_instance_after_hybrid, 'wb') as file_handle:
    pickle.dump(obj = lcaio_object, file = file_handle, protocol=pickle.HIGHEST_PROTOCOL)