# Interacting with openLCA using Python
### The NetlOlca class
<subtitle>Created: Wednesday, January 24, 2024</subtitle>

This notebook examines the NetlOlca Python class, developed for interacting with openLCA projects (either directly with the app or indirectly through exported JSON-LD zip files).
This work is funded by NETL under the SA contract for Advanced Systems and Markets Analysis (SA 01.02.24.04-1-3).

The code is written by Priyadarshini and Tyler W. Davis (2023–2024).
The requirements for executing this code are:

- Python 3.11 (or higher)
- Jupyter Lab 4.0 (or higher)
- olca-ipc 2.0 (or higher)
- PyYAML 6.0 (or higher)
- pandas 2.0 (or higher)

The `NetlOlca` Python class is defined in the NetlOlca.py module within the netlolca package. To import, type the following:

In [None]:
try:
    from netlolca.NetlOlca import NetlOlca
except ModuleNotFoundError:
    # Handle if this notebook is run from the "demo" directory.
    import os
    os.chdir("..")
    from netlolca.NetlOlca import NetlOlca

With the class definition imported, create an instance of the class (no parameters are required).

In [None]:
netl = NetlOlca()

Now you are ready to go!

There are two basic modes of interacting with openLCA databases:

1. Directly (via the inter-process communication service protocol)
2. Indirectly (via ZIP file input-output processes)

In this example, we define the relative path to an existing ZIP file, which represents the JSON-LD export of a product system from openLCA.

In [None]:
data_files = [
    "data/alkaline_electrolysis.zip",
    "data/aluminum_single_system.zip"
]
my_file = data_files[1]

You may use the Python standard library module `os` to check whether the file exists (or simply look it up in the File Browser in Jupyter Lab).

In [None]:
import os
os.path.isfile(my_file)
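
Beyond checking that the path exists, it can help to confirm that the file is a readable ZIP archive before opening it. The following is a minimal sketch using only the standard library; `check_jsonld_zip` is a hypothetical helper name and is not part of the netlolca package.

```python
import zipfile
from pathlib import Path


def check_jsonld_zip(path):
    """Return True if `path` is an existing file that is a valid ZIP archive.

    Hypothetical helper for illustration; not part of the netlolca package.
    """
    p = Path(path)
    # is_file() is False for missing paths and for directories;
    # is_zipfile() inspects the file contents for a ZIP signature.
    return p.is_file() and zipfile.is_zipfile(p)
```

Calling `check_jsonld_zip(my_file)` should return `True` for a JSON-LD export and `False` for a missing or corrupted file. It does not verify that the archive actually contains openLCA data.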

Connecting to an openLCA database (either directly or indirectly) is a two-step process.
The first step is to either `open` a JSON-LD ZIP file or `connect` to the openLCA app.
The second step is to `read` the database.

In this example, since we are connecting to a JSON-LD file, we use the `open` method and pass it the variable that stores the JSON-LD file path.

In [None]:
netl.open(my_file)
netl.read()
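
Because an opened file should eventually be closed, the open and read steps can be wrapped so that cleanup always happens. This is a minimal sketch, assuming only the `open`, `read`, and `close` methods used in this notebook; `opened_database` is a hypothetical helper, not part of `NetlOlca`, and whether `close` is safe to call after a failed `read` is an assumption.

```python
from contextlib import contextmanager


@contextmanager
def opened_database(client, json_ld_path):
    """Open and read a JSON-LD file, then always close it.

    `client` is assumed to expose `open`, `read`, and `close` methods
    (for example, a NetlOlca instance). Hypothetical helper.
    """
    client.open(json_ld_path)   # step 1: point the client at the ZIP file
    try:
        client.read()           # step 2: load the database contents
        yield client
    finally:
        client.close()          # runs even if read() or the with-body raises


# Example usage (commented out because it needs the netlolca package):
# with opened_database(netl, my_file) as db:
#     print(db.get_number_product_systems())
```

The rest of this notebook keeps the explicit `open`, `read`, and `close` calls so that each step stays visible.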

In Python, openLCA is represented in two ways:

1. The data schema (how the data are modeled, including formats, data types, and metadata)
2. The application (used to run processes and analyses and generate graphs)

For (1), GreenDelta provides olca-schema, a Python package with all the class definitions for "root entities."

**Root entities** are the main data components and are the folder names you see if you unzip a JSON-LD file (see the table below for the list of root entities).

| Root Entity Name | Description |
| ---------------- | ----------- |
| Actor | A person or organization |
| Currency | Costing currency |
| DQ System | Data quality system, a matrix of quality indicators |
| EPD | Environmental Product Declaration |
| Flow | Everything that can be an input/output of a process |
| Flow property | Quantity used to express amounts of flow |
| Impact category | Life cycle impact assessment category |
| Impact method | An impact assessment method |
| Location | A location (e.g., country, state, or city) |
| Parameter | Input or dependent global/process/impact parameter |
| Process | Systematic organization or series of actions |
| Product system | A product's supply chain (functional unit) |
| Project | An openLCA project |
| Result | A calculation result of a product system |
| Social indicator | An indicator for Social LCA |
| Source | A literature reference |
| Unit group | Group of units that can be inter-converted |

It is through these root entities that all the openLCA data may be accessed.
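
Since root entities appear as folders inside a JSON-LD export, the archive can be inspected without unzipping it. The sketch below uses only the standard library `zipfile` module; `list_top_level_folders` is a hypothetical helper name, and the exact folder names found depend on the export.

```python
import zipfile


def list_top_level_folders(zip_path):
    """Return the sorted, unique top-level folder names inside a ZIP file."""
    folders = set()
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Entries look like "folder/file.json"; skip files at the root.
            if "/" in name:
                folders.add(name.split("/", 1)[0])
    return sorted(folders)
```

For example, `list_top_level_folders(my_file)` would list the folders present in the export selected earlier.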

The basic interaction with an openLCA database is to query data.
The `NetlOlca` class provides both specialized and generic querying methods.
The following table provides a short summary of these methods.

| Method Name | Description |
| :---------- | :---------- |
| `get_actors` | Return a metadata dictionary of actors |
| `get_descriptors` | Return a list of Ref objects for a given entity type |
| `get_exchange_flows` | Return a list of all flow universally unique identifiers |
| `get_flows` | Return a dictionary of input and/or output flow data for a given process |
| `get_input_flows` | Return a dictionary of input exchange flow data for a given process |
| `get_output_flows` | Return a dictionary of output exchange flow data for a given process |
| `flow_is_tracked` | Return true for a product flow |
| `get_num_inputs` | Return a count of input flows for a given process |
| `get_number_product_systems` | Return the count of product systems in a database | 
| `get_process_doc` | Return a process's documentation text |
| `get_process_id` | Return the universally unique identifier for a given product system's reference process |
| `get_reviewer` | Return reviewer's name and ID for a given process |
| `get_reference_category` | Return the category name for a product system's reference process |
| `get_reference_description` | Return the description text for a product system's reference process |
| `get_reference_doc` | Return the documentation text for a product system's reference process |
| `find_reference_exchange` | Return an indexed flow exchange for a given process |
| `get_electricity_gen_processes` | Return a list of electricity generation processes and their IDs |
| `match_process_names` | Return a list of process names and IDs for a given pattern |
| `get_reference_flow` | Return the quantitative reference flow for a given product system |
| `get_reference_name` | Return the name of a product system's reference process |
| `get_reference_process` | Return a list of reference processes for a given product system |
| `get_reference_process_id` | Return a list of UUIDs of reference processes for a given product system |
| `get_spec_class` | Return a root entity class of a given name |
| `get_spec_ids` | Return a list of UUIDs associated with a given root entity |
| `print_descriptors` | Print a data property for a given root entity |
| `print_unit_groups` | Print a list of unit group names |
| `query` | Return the object for a given root entity and UUID |



A good first step is to see how many product systems are in your database.

In [None]:
netl.get_number_product_systems()

Okay, now we know there's a product system. What is its name?

In [None]:
netl.get_reference_name()

Other metadata associated with it include the category and the documentation text:

In [None]:
netl.get_reference_category()

In [None]:
netl.get_reference_doc()

You may want the UUID of this product system.

In [None]:
netl.get_descriptors(                       # returns Ref objects
    netl.get_spec_class("Product system")   # for product system entities
)[0].id                                     # the UUID of the first (and only) one
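
When a database holds more than one entity of a type, indexing with `[0]` is no longer enough, and a name-to-UUID lookup is more convenient. The following is a minimal sketch that assumes each descriptor has `name` and `id` attributes; `refs_to_lookup` is a hypothetical helper, and the stand-in objects below use placeholder values, not real openLCA data.

```python
from types import SimpleNamespace


def refs_to_lookup(refs):
    """Map each descriptor's name to its UUID.

    Assumes every item has `name` and `id` attributes (Ref-like objects).
    Hypothetical helper; an empty input simply yields an empty dict.
    """
    return {ref.name: ref.id for ref in refs}


# Stand-in descriptors with placeholder values (not real openLCA data).
demo_refs = [
    SimpleNamespace(name="System A", id="00000000-0000-0000-0000-000000000001"),
    SimpleNamespace(name="System B", id="00000000-0000-0000-0000-000000000002"),
]
lookup = refs_to_lookup(demo_refs)
```

With a real database, the list returned by `get_descriptors` could be passed in place of `demo_refs`. If two entities share a name, the later one overwrites the earlier one in the dictionary.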

And product systems have a functional unit. What is it for this one?

In [None]:
netl.get_reference_flow() # add UUID

Now, you may want to look into the properties of the reference process. First, what is the reference process's identifier?

In [None]:
netl.get_reference_process_id()

You may want to know how many input flows there are for this process.

In [None]:
process_id = netl.get_reference_process_id()[0]
netl.get_num_inputs(process_id)

Let's look closer at the input flows.

In [None]:
netl.get_input_flows(process_id)

The raw dictionary is hard to read. Try viewing it as a data frame!

In [None]:
import pandas as pd
pd.DataFrame(netl.get_input_flows(process_id))
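
Once the flows are in a data frame, the usual pandas filtering and sorting tools apply. The sketch below uses placeholder data: the keys `name`, `amount`, and `unit` and all values are hypothetical and may not match what `get_input_flows` actually returns.

```python
import pandas as pd

# Placeholder data: keys and values are hypothetical and only mimic
# a dictionary of equal-length lists.
demo_flows = {
    "name": ["Flow A", "Flow B", "Flow C"],
    "amount": [2.5, 0.1, 10.0],
    "unit": ["kg", "kg", "MJ"],
}

df = pd.DataFrame(demo_flows)

# Keep only the rows measured in kg, then sort by amount (largest first).
kg_flows = df[df["unit"] == "kg"].sort_values("amount", ascending=False)
```

Check the actual column names with `df.columns` before adapting the filter to real data.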

And try the same for output flows.

In [None]:
pd.DataFrame(netl.get_output_flows(process_id))

You can use the `match_process_names` method to search for processes by name and the `query` method to get all of a process's data.

In [None]:
import re
q = re.compile("Quicklime.*")
netl.match_process_names(q)
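
How a compiled pattern behaves depends on whether it is applied with `match` (anchored at the start of the string) or `search` (anywhere in the string). Which of these `match_process_names` uses internally is not stated here, so the sketch below only illustrates the difference with the standard library `re` module and hypothetical process names.

```python
import re

# Hypothetical process names, for illustration only.
names = [
    "Quicklime production",
    "Transport of Quicklime",
    "quicklime, milled",
    "Aluminum smelting",
]

pattern = re.compile("Quicklime.*")

# match() only succeeds when the pattern matches at the start of the string.
starts_with = [n for n in names if pattern.match(n)]

# search() succeeds when the pattern matches anywhere in the string.
contains = [n for n in names if pattern.search(n)]

# re.IGNORECASE makes the comparison case-insensitive.
any_case = [n for n in names if re.search("quicklime", n, re.IGNORECASE)]
```

If a search returns fewer processes than expected, trying a case-insensitive pattern or a leading `.*` is a reasonable first check.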

In [None]:
netl.get_process_doc() # add UUID

When you're done, you can close the file.

In [None]:
netl.close()

That's it!