# Brightway 2.5 - Linking with IAMs

The next generation of Brightway aims to make integration with integrated assessment models (IAMs) easier.

Brightway is matrix-based LCA software. This means that, in the end, whatever system we want to model has to be described as a set of linear equations and their coefficients. IAMs provide new data, i.e. new coefficients. The current pain point is getting these coefficients into the system.
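
In the usual matrix notation for LCA (the symbols below follow a common textbook convention and are not taken from Brightway's code), these linear equations can be written as:

$$
A s = f, \qquad g = B s, \qquad h = \mathbf{1}^{\top} C g
$$

Here $A$ is the technosphere matrix, $f$ the demand vector (the functional unit), $s$ the supply vector, $B$ the biosphere matrix, $g$ the inventory, $C$ the (diagonal) characterization matrix, and $h$ the score. New data from an IAM shows up as changed or added entries in these matrices.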

## The current approach

The new coefficients are used to create a *complete* new copy of ecoinvent+. 

* Slow! New data stored in a relational database, and then extracted again
* Takes a lot of disk space and memory
* Needs a complete installation of Brightway and an import of ecoinvent
    * Dispatching calculations to cloud almost impossible

# The new approach

* Only store deltas, not complete databases
    * Delta datasets can introduce new activities
* Storage format easy and fast 
    * Supports storage anywhere
* LCA calculations without Brightway fluff
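
To make the idea of a delta concrete, here is a minimal sketch in plain Python. It does not use any Brightway library; the dictionaries, the `apply_delta` helper and the integer ids are hypothetical, chosen only to show that a scenario can be stored as the handful of coefficients that differ from the base data.

```python
# Hypothetical base coefficients, keyed by (row id, column id).
base = {
    (100, 400): 1.0,   # production
    (101, 401): 1.0,   # production
    (100, 401): -2.0,  # input
}

# A delta stores only what changes: one modified coefficient and
# one new activity (column 402) with its own product (row 102).
delta = {
    (100, 401): -1.5,  # modified input
    (102, 402): 1.0,   # new production
    (101, 402): -4.0,  # new input
}


def apply_delta(base, delta):
    """Return a new mapping in which delta values replace base values."""
    merged = dict(base)
    merged.update(delta)
    return merged


scenario = apply_delta(base, delta)
print(len(base), len(delta), len(scenario))
```

The base mapping is left untouched, so many deltas can share one base without copying it.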

## Installation

* Install [conda](https://docs.conda.io/en/latest/) or miniconda. We use conda because it provides a [very fast sparse solver library](https://www.pardiso-project.org/).
* Create a new conda environment using conda or (better) [mamba](https://mamba-framework.readthedocs.io/en/latest/):

```
    mamba create -y -n bw25test -c conda-forge -c cmutel -c bsteubing -c haasad -c pascallesage pypardiso python=3.8 fs scipy numpy pandas stats_arrays appdirs pip
```

* Activate this environment following the instructions for your OS

* In your `bw25test` environment, install the new development libraries directly from GitHub:

```
    pip install https://github.com/brightway-lca/bw_processing/archive/master.zip
    pip install https://github.com/brightway-lca/matrix_utils/archive/main.zip
    pip install https://github.com/brightway-lca/brightway2-calc/archive/master.zip
```

## Notebook setup

Import the new libraries ([bw_processing](https://github.com/brightway-lca/bw_processing) and [matrix_utils](https://github.com/brightway-lca/matrix_utils)):

In [41]:
import numpy as np
import bw_processing as bwp
from bw2calc.lca import LCA

# Simple example with in-memory data packages

Create a "data package":

* Set of data resources used to construct matrices
    * With metadata following [Data Package standard](https://specs.frictionlessdata.io/data-package/).
* Use [PyFilesystem2](https://docs.pyfilesystem.org/en/latest/) for many logical or virtual filesystems

In [46]:
dp = bwp.create_datapackage()

In-memory storage is the default, but we could also make it explicit:

```
from fs.memoryfs import MemoryFS
dp = bwp.create_datapackage(MemoryFS())
```

We can also store data on network drives, via FTP, on cloud platforms, and so on. There is a lot of potential here!

Add data. Each matrix (in this case) is constructed from one simple resource. We have the ability, however, to support more complex resource types :)

BTW, in our number scheme, 1xx are products, 4xx are activities, and 2xx are elementary (biosphere) flows.

In [47]:
dp.add_persistent_vector(
    matrix="technosphere_matrix",
    indices_array=np.array(
        [(100, 400), (101, 401), (102, 402), (103, 403),  # Production
         (100, 401), (101, 402), (101, 403), (102, 403)], # Inputs
        dtype=bwp.INDICES_DTYPE  # Means first element is row, second is column
    ),
    flip_array=np.array([
        False, False, False, False, # Production
        True, True, True, True      # Inputs
    ]),
    data_array=np.array([
        1, 1, 1, 1,  # Production
        2, 4, 8, 16  # Inputs
    ]),
)
dp.add_persistent_vector(
    matrix="biosphere_matrix",
    indices_array=np.array(
        [(200, 400), (200, 401), (200, 402), (200, 403), (201, 400), (201, 402)], dtype=bwp.INDICES_DTYPE
    ),
    data_array=np.arange(6),
)
dp.add_persistent_vector(
    matrix="characterization_matrix",
    indices_array=np.array(
        [(200, 200), (201, 201)], dtype=bwp.INDICES_DTYPE
    ),
    data_array=np.array([1, 10]),
)
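
To show what the indices, flip and data arrays describe, here is a rough NumPy sketch that builds the same technosphere matrix by hand. It is only an illustration under simple assumptions (dense matrix, ids sorted in ascending order) and is not how matrix_utils implements the mapping; all variable names are made up for this example.

```python
import numpy as np

# Same values as the technosphere resource above.
rows = np.array([100, 101, 102, 103, 100, 101, 101, 102])
cols = np.array([400, 401, 402, 403, 401, 402, 403, 403])
flip = np.array([False] * 4 + [True] * 4)
data = np.array([1, 1, 1, 1, 2, 4, 8, 16], dtype=float)

# Map arbitrary integer ids to consecutive matrix positions.
row_ids, row_pos = np.unique(rows, return_inverse=True)
col_ids, col_pos = np.unique(cols, return_inverse=True)

# Flipped entries (inputs) get a negative sign.
signed = np.where(flip, -data, data)

matrix = np.zeros((len(row_ids), len(col_ids)))
matrix[row_pos, col_pos] = signed
print(matrix)
```

Product `100` ends up in row 0 and activity `400` in column 0, which is why the ids themselves never need to be consecutive or start at zero.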

Building an LCA is then as simple as specifying the functional unit and the data packages:

In [48]:
lca = LCA(demand={103: 1}, data_objs=[dp])
lca.lci()
lca.lcia()
lca.score

6667.0
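
As a cross-check, the same score can be reproduced with dense NumPy arrays. This is a sketch of the underlying linear algebra only, with the matrices typed in by hand from the data above; the real `LCA` class works with sparse matrices.

```python
import numpy as np

# Technosphere matrix: rows are products 100-103, columns are activities 400-403.
A = np.array([
    [1, -2, 0, 0],
    [0, 1, -4, -8],
    [0, 0, 1, -16],
    [0, 0, 0, 1],
], dtype=float)

# Biosphere matrix: rows are flows 200 and 201, columns are activities 400-403.
B = np.array([
    [0, 1, 2, 3],
    [4, 0, 5, 0],
], dtype=float)

c = np.array([1, 10], dtype=float)      # characterization factors for 200, 201
f = np.array([0, 0, 0, 1], dtype=float) # demand: one unit of product 103

supply = np.linalg.solve(A, f)  # how much each activity must produce
inventory = B @ supply          # total amount of each elementary flow
score = float(c @ inventory)
print(round(score, 6))
```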

# Using interfaces to generate data on demand

Let's increase the complexity. Sometimes we need to generate data on the fly. This we can do through "interfaces", Python code that generates data or wraps other data sources. These interfaces follow a very simple [bw_processing standard API](https://github.com/brightway-lca/bw_processing#persistent-versus-dynamic).

Interfaces can be classes:

In [49]:
class ExampleVectorInterface:
    def __next__(self):
        return np.array([1, 1, 1, 1, 2, 4, 8, 16], dtype=np.float64)

`__next__` just means we can call `next()` on the object:

In [50]:
next(ExampleVectorInterface())

array([ 1.,  1.,  1.,  1.,  2.,  4.,  8., 16.])

Of course, generators and other iterators also support `next()`, and can be used:

In [51]:
from itertools import repeat
vector_interface = repeat(np.array([1, 1, 1, 1, 2, 4, 8, 16], dtype=np.float64))
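
For comparison, the same behaviour written as an explicit generator function could look like the sketch below. The name `constant_vectors` is hypothetical; the only requirement taken from the text above is that `next()` returns a NumPy array of the right length.

```python
import numpy as np


def constant_vectors():
    """Yield the same technosphere values forever, one fresh array per call."""
    while True:
        yield np.array([1, 1, 1, 1, 2, 4, 8, 16], dtype=np.float64)


generator_interface = constant_vectors()
first = next(generator_interface)
second = next(generator_interface)
print(first)
```

Unlike `repeat`, which hands back the same array object every time, this version creates a new array on each call, so a consumer that modifies the result in place cannot affect later draws.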

Interfaces in action:

In [52]:
dp_vector = bwp.create_datapackage()
dp_vector.add_dynamic_vector(
    matrix="technosphere_matrix",
    interface=ExampleVectorInterface(),  # <- This is the part that changed
    indices_array=np.array(
        [(100, 100), (101, 101), (102, 102), (103, 103),  # Production
         (100, 101), (101, 102), (101, 103), (102, 103)], # Inputs
        dtype=bwp.INDICES_DTYPE  # Means first element is row, second is column
    ),
    flip_array=np.array([
        False, False, False, False, # Production
        True, True, True, True      # Inputs
    ]),
)
dp_vector.add_persistent_vector(
    matrix="biosphere_matrix",
    indices_array=np.array(
        [(200, 100), (200, 101), (200, 102), (200, 103), (201, 100), (201, 102)], dtype=bwp.INDICES_DTYPE
    ),
    data_array=np.arange(6),
)
dp_vector.add_persistent_vector(
    matrix="characterization_matrix",
    indices_array=np.array(
        [(200, 200), (201, 201)], dtype=bwp.INDICES_DTYPE
    ),
    data_array=np.array([1, 10]),
)

In [53]:
lca = LCA(demand={103: 1}, data_objs=[dp_vector])
lca.lci()
lca.lcia()
lca.score

6667.0

For future reference, here is the produced technosphere matrix:

In [None]:
lca.technosphere_matrix.todense()

# Interfaces with stochastic data

We can even treat the interface as a stochastic data source that overwrites existing values. We create an interface that will return some random data:

In [54]:
class RandomInterface:
    def __next__(self):
        # One uniform draw in [0, 1) scales both input amounts
        return np.random.random() * np.array([8, 16])

And now we create a data package that uses this interface to overwrite some of the static values.

Data package behaviour on conflicting data points is controlled by the `sum_intra_duplicates` and `sum_inter_duplicates` [policies](https://github.com/brightway-lca/bw_processing#policies).
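
The two possible outcomes for a conflicting data point, replacing the existing value or adding to it, can be sketched with plain NumPy. This only illustrates overwrite-versus-sum semantics on a made-up 2x2 matrix; it is not how bw_processing or matrix_utils implement their policies, so check the linked documentation for the exact rules and defaults.

```python
import numpy as np

# Made-up matrix that already holds a value of -8 at position (0, 1).
existing = np.array([
    [1.0, -8.0],
    [0.0, 1.0],
])

# A second resource provides a value for the same position.
rows = np.array([0])
cols = np.array([1])
values = np.array([-3.0])

# Outcome 1: the new value replaces the existing one.
overwritten = existing.copy()
overwritten[rows, cols] = values

# Outcome 2: the new value is added to the existing one.
summed = existing.copy()
np.add.at(summed, (rows, cols), values)

print(overwritten[0, 1], summed[0, 1])
```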

In [57]:
overwriter = bwp.create_datapackage()

In [58]:
overwriter.add_dynamic_vector(
    matrix="technosphere_matrix",
    interface=RandomInterface(),
    indices_array=np.array(
        [(101, 403), (102, 403)],  # Indices of the values that will be overwritten
        dtype=bwp.INDICES_DTYPE   
    ),
    flip_array=np.array([           
        True, True      # Inputs
    ]),
)

We can now iterate over the LCA class. It will draw new data from the stochastic resources each time.

In [59]:
lca = LCA(demand={103: 1}, data_objs=[dp, overwriter])
lca.lci()
lca.lcia()

for _ in range(10):
    next(lca)
    print(lca.score)

8877.184761730492
12886.406100229011
12491.248110841014
9647.550954112308
7744.917704310206
8423.230391439525
12951.146663399293
8627.053079543732
10214.5514927928
11899.761827211625
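
For this small system the score can be worked out by hand as a function of the two inputs to activity `403`, which gives a quick sanity check on the numbers. The `score` helper below is a hypothetical pure-Python sketch based on the matrices defined earlier, not part of any Brightway library.

```python
def score(input_101, input_102):
    """Score for one unit of product 103, given the amounts of
    products 101 and 102 consumed by activity 403."""
    s403 = 1.0
    s402 = input_102 * s403             # 402 supplies product 102
    s401 = 4 * s402 + input_101 * s403  # 401 supplies 101 to 402 and 403
    s400 = 2 * s401                     # 400 supplies 100 to 401
    flow_200 = 0 * s400 + 1 * s401 + 2 * s402 + 3 * s403
    flow_201 = 4 * s400 + 5 * s402
    return 1 * flow_200 + 10 * flow_201


print(score(8, 16))  # static values
for u in (0.0, 0.5, 0.999):
    replaced = score(8 * u, 16 * u)          # draw replaces the static value
    added = score(8 + 8 * u, 16 + 16 * u)    # draw is added to the static value
    print(u, replaced, added)
```

With a draw `u` in [0, 1), replacing the static inputs gives scores between 3 and 6667, while adding the draw to the static inputs gives scores between 6667 and 13331. The ten scores printed above all fall in the second range, which suggests that the library version that produced this output summed the conflicting values rather than replacing them.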


# Different system configurations

As a reminder, our base matrix looks like:

```
matrix([[  1.,  -2.,   0.,   0.],
        [  0.,   1.,  -4.,  -8.],
        [  0.,   0.,   1., -16.],
        [  0.,   0.,   0.,   1.]])
```

The rows (products) are `100, 101, 102, 103`, and the columns (activities) are `400, 401, 402, 403`.

Sometimes we want to imagine different system configurations or allocation strategies. Let's imagine that activity `403` could be split in two different ways, creating virtual activities `404` and `405`. To keep the matrix square, they also need virtual products `104` and `105`. We can then define an *array* of the different allocation results, with one column per configuration:

In [61]:
systems_array = np.array([
    [-10, -6],  # Amount of 101 needed by 404
    [-6, -10],  # Amount of 101 needed by 405
    [0, -20],   # Amount of 102 needed by 404
    [-20, 0],   # Amount of 102 needed by 405
    [1, 1],     # Production of 404
    [1, 1],     # Production of 405
])

systems_indices = np.array(
    [(101, 404), (101, 405), (102, 404), (102, 405), (104, 404), (105, 405)],
    dtype=bwp.INDICES_DTYPE   
)

One change from previous behaviour, e.g. in [presamples](https://github.com/PascalLesage/presamples/), is that we can directly specify new elements in the matrix.

In [62]:
allocations = bwp.create_datapackage(sequential=True)  # Force the different possibilities to be evaluated in order

In [63]:
allocations.add_persistent_array(
    matrix="technosphere_matrix",
    data_array=systems_array,
    indices_array=systems_indices,
)

In [64]:
lca = LCA(demand={104: 1}, data_objs=[dp, allocations], use_arrays=True)
lca.lci()
lca.lcia()
print(lca.technosphere_matrix.todense())
print(lca.score)

[[  1.  -2.   0.   0.   0.   0.]
 [  0.   1.  -4.  -8. -10.  -6.]
 [  0.   0.   1. -16.   0. -20.]
 [  0.   0.   0.   1.   0.   0.]
 [  0.   0.   0.   0.   1.   0.]
 [  0.   0.   0.   0.   0.   1.]]
810.0


In [65]:
next(lca)
print(lca.technosphere_matrix.todense())
print(lca.score)

[[  1.  -2.   0.   0.   0.   0.]
 [  0.   1.  -4.  -8.  -6. -10.]
 [  0.   0.   1. -16. -20.   0.]
 [  0.   0.   0.   1.   0.   0.]
 [  0.   0.   0.   0.   1.   0.]
 [  0.   0.   0.   0.   0.   1.]]
8006.0


Sequential indices will wrap around to zero, so they can be iterated over again:

In [66]:
for _ in range(4):
    next(lca)
    print(lca.score)

810.0
8006.0
810.0
8006.0
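
The wrap-around can be pictured as a modulo on the iteration counter. The sketch below only shows which column of `systems_array` would be picked at each step under that assumption; the actual indexing logic lives in matrix_utils and may be implemented differently.

```python
import numpy as np

systems_array = np.array([
    [-10, -6],
    [-6, -10],
    [0, -20],
    [-20, 0],
    [1, 1],
    [1, 1],
])
n_columns = systems_array.shape[1]

selected = []
for iteration in range(6):
    column = iteration % n_columns  # wraps back to 0 after the last column
    selected.append(column)
    print(iteration, column, systems_array[:, column])
```

The columns alternate in the same way as the scores above alternate between the two configurations.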


# Conclusions

The new approach will:

* Make Alois's and Romain's life easier
* Allow for fast LCA calculations to be integrated into REMIND
* Allow for calculations to be dispatched to other computers
* Allow for "streaming" REMIND results into LCA

Current status:

* Everything but `bw2data` working
* Needs tests and more documentation