# 2 - Building and using matrices in `bw2calc`

Before we dive into it, let's think about what we need to actually build a matrix. What specific data would you need? What don't you need?

## Exercise

Please think about the minimal set of information you would need to build a *sparse matrix* using [scipy.sparse.coo_matrix](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.coo_matrix.html) (sparse matrices store only non-zero values). Then, create this information as Numpy arrays and actually build a sparse matrix.

Here is the matrix you should build:

$$\begin{bmatrix} 0 & 1 \\ 2 & 3 \end{bmatrix}$$

## Hint

You will need three Numpy arrays: one for the data, one for the row indices, and one for the column indices.

## Solution

In [2]:
import numpy as np
from scipy import sparse

data = np.array([1, 2, 3])
rows = np.array([0, 1, 1])
cols = np.array([1, 0, 1])

matrix = sparse.coo_matrix((data, (rows, cols)), (2, 2))
matrix.toarray()

array([[0, 1],
       [2, 3]])
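
As a quick check on the claim that only non-zero values are stored, the sketch below rebuilds the same matrix and inspects what the COO format actually keeps. It uses only `numpy` and `scipy`; the variable names `stored` and `triples` are just for illustration.

```python
import numpy as np
from scipy import sparse

data = np.array([1, 2, 3])
rows = np.array([0, 1, 1])
cols = np.array([1, 0, 1])
matrix = sparse.coo_matrix((data, (rows, cols)), shape=(2, 2))

# Only the three non-zero entries are stored; the zero at (0, 0) is implicit
stored = matrix.nnz

# The COO format keeps the row, column, and data arrays as attributes
triples = list(zip(matrix.row.tolist(), matrix.col.tolist(), matrix.data.tolist()))
print(stored, triples)
```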

## `bw_processing`

We can run into difficulties when we want to store this data. The library `bw_processing` helps us create data packages, which can store this matrix-building data on a variety of file systems. You can read the [`bw_processing` README](https://github.com/brightway-lca/bw_processing) for more information, and see the [PyFilesystem2 docs](https://docs.pyfilesystem.org/en/latest/) for more on the filesystems that can be used.

Let's define this same matrix in `bw_processing`.

Matrices by definition are two-dimensional, so we know that to build matrices we will always need to specify the row and column indices of the data. We combine these two arrays into a single Numpy [structured array](https://numpy.org/doc/stable/user/basics.rec.html), which uses the labels `row` and `col`.

In [4]:
import bw_processing as bwp
import numpy as np

indices_array = np.array([(0, 1), (1, 0), (1, 1)], dtype=bwp.INDICES_DTYPE)
indices_array

array([(0, 1), (1, 0), (1, 1)], dtype=[('row', '<i4'), ('col', '<i4')])
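
If structured arrays are new to you, this small sketch shows how the `row` and `col` labels work. To keep it runnable without `bw_processing`, it assumes a hand-written dtype that mirrors the one printed above; in real code you would use `bwp.INDICES_DTYPE`, which may differ between versions.

```python
import numpy as np

# Assumption: mirrors the dtype printed above; use bwp.INDICES_DTYPE in real code
indices_dtype = np.dtype([("row", np.int32), ("col", np.int32)])
indices_array = np.array([(0, 1), (1, 0), (1, 1)], dtype=indices_dtype)

# Each field can be pulled out by its label as an ordinary 1-D array
row_indices = indices_array["row"]
col_indices = indices_array["col"]

# Integer indexing returns one (row, col) record
first_pair = indices_array[0]
print(row_indices, col_indices, first_pair)
```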

The data array is the same as before:

In [6]:
data_array = np.array([1, 2, 3])
data_array

array([1, 2, 3])

This is all we need to create a data package:

In [7]:
dp = bwp.create_datapackage()

dp.add_persistent_vector(
    matrix="some label",
    data_array=data_array,
    name="some name",
    indices_array=indices_array,
)

Why a vector?

Why is it persistent?

Matrix label and name

Didn't you say something about filesystems?

## `matrix_utils`

A datapackage is just a package... of data. Not a matrix. Let's build one using `matrix_utils`!

In [8]:
import matrix_utils as mu

In [11]:
mapped_matrix = mu.MappedMatrix(packages=[dp], matrix="some label")
mapped_matrix.matrix

<2x2 sparse matrix of type '<class 'numpy.float64'>'
	with 3 stored elements in Compressed Sparse Row format>

Why is this matrix mapped?

## Exercise

In a **new data package** (but using the same matrix label), recreate the same matrix, but with row indices starting at 100 and column indices starting at 200.

## Solution

In [12]:
indices_array = np.array([(100, 201), (101, 200), (101, 201)], dtype=bwp.INDICES_DTYPE)
indices_array

dp_offset = bwp.create_datapackage()

dp_offset.add_persistent_vector(
    matrix="some label",
    data_array=data_array,
    name="some name",
    indices_array=indices_array,
)
mapped_matrix = mu.MappedMatrix(packages=[dp_offset], matrix="some label")
mapped_matrix.matrix

<2x2 sparse matrix of type '<class 'numpy.float64'>'
	with 3 stored elements in Compressed Sparse Row format>

Let's see what happens when we combine these two data packages.

In [14]:
mapped_matrix = mu.MappedMatrix(packages=[dp, dp_offset], matrix="some label")
mapped_matrix.matrix.toarray()

array([[0., 1., 0., 0.],
       [2., 3., 0., 0.],
       [0., 0., 0., 1.],
       [0., 0., 2., 3.]])
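
One way to picture where this 4x4 result comes from: the indices in the data packages are treated as labels, and each distinct label gets its own consecutive row or column position. The sketch below imitates that idea with `np.unique`. It is an illustration under that assumption, not the actual `matrix_utils` implementation.

```python
import numpy as np
from scipy import sparse

# Row and column labels from both data packages, concatenated
rows = np.array([0, 1, 1, 100, 101, 101])
cols = np.array([1, 0, 1, 201, 200, 201])
data = np.array([1.0, 2.0, 3.0, 1.0, 2.0, 3.0])

# Map each distinct label to a consecutive position (0, 1, 2, ...)
row_labels, row_positions = np.unique(rows, return_inverse=True)
col_labels, col_positions = np.unique(cols, return_inverse=True)

shape = (len(row_labels), len(col_labels))
combined = sparse.coo_matrix((data, (row_positions, col_positions)), shape=shape)
print(combined.toarray())
```

Because the labels 100 and 101 sort after 0 and 1, the second package lands in the lower-right block.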

Resource groups

get_resource

filter_by_attribute

multiple matrices

## `bw2calc`

We can finally use the main Brightway library `bw2calc`. Based on what you know, build a datapackage that can populate the following matrices in `bw2calc` (use these exact labels for the matrices):

* technosphere_matrix
* biosphere_matrix

In [17]:
# Do some work here to create the datapackage `my_complete_datapackage`

If you did it right, you should be able to run the following code:

In [16]:
index_of_my_product = 1  # Change if necessary

In [None]:
import bw2calc as bc

lca = bc.LCA({index_of_my_product: 1}, data_objs=[my_complete_datapackage])
lca.lci()
lca.inventory.toarray()
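
To check your result by hand, it helps to know what the inventory calculation amounts to. The sketch below follows the usual matrix formulation: solve the technosphere system for the supply vector, then scale each biosphere column by that activity's supply. All numbers are hypothetical, and the code uses plain `scipy` rather than `bw2calc`, so treat it as an illustration of the arithmetic only.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

# Hypothetical system: activity 1 consumes 2 units of product 0
technosphere = sparse.csc_matrix(np.array([[1.0, -2.0], [0.0, 1.0]]))
# One biosphere flow, emitted by both activities
biosphere = sparse.csr_matrix(np.array([[3.0, 5.0]]))

# Functional unit: one unit of product 1
demand = np.array([0.0, 1.0])

# How much each activity must run to satisfy the demand
supply = spsolve(technosphere, demand)

# Scale each biosphere column by the corresponding supply amount
inventory = biosphere @ sparse.diags(supply)
inventory_dense = inventory.toarray()
print(supply, inventory_dense)
```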

## Brightway gotchas

It's time to start our list of things that might mess up your day using Brightway.

The first gotcha is the way that we build the characterization matrix $C$. In Brightway, $C$ is a diagonal matrix, as this preserves the dimensions of the inventory matrix, and allows for contribution analysis in the future.
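
A small sketch of why the diagonal form is convenient, using hypothetical numbers and plain `scipy`: multiplying by a diagonal $C$ keeps one entry per flow and activity, whereas multiplying by a row vector of characterization factors collapses the flow dimension.

```python
import numpy as np
from scipy import sparse

# Hypothetical inventory: 2 biosphere flows (rows) x 2 activities (columns)
inventory = sparse.csr_matrix(np.array([[6.0, 5.0], [0.0, 1.0]]))

# Hypothetical characterization factors, one per biosphere flow
cfs = np.array([10.0, 2.0])

# Diagonal C: the result has the same shape as the inventory
C = sparse.diags(cfs)
characterized = C @ inventory
score = characterized.sum()

# Row-vector alternative: flows are summed away, leaving one value per activity
collapsed = cfs @ inventory.toarray()
print(characterized.toarray(), score, collapsed)
```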

How do we define the row and column indices of a diagonal matrix? By definition, they are the same, so specifying both seems like a needless duplication of effort. Instead, Brightway allows you to use the column indices for regionalized characterization factors.

(some more text here)

There is also a Python gotcha here: Python is zero-indexed (i.e. the first value in an iterable has index 0), but 0 is also "false-y", so you can run into problems with code like this:

In [None]:
index_for_good_data = 0

if index_for_good_data:
    print("Doing some filtering")
else:
    print("No filtering, even though the first index value is 0. Oh well.")

A better way to do this is to check whether your value is `None`:

In [19]:
def filter_func(index=None):
    if index is not None:
        print("Doing some filtering")
    else:
        print("No filtering, because no index was given.")

In [None]:
filter_func(0)

Let's make things a bit more complicated. What if we wanted to include feedbacks from the biosphere to the technosphere (or indeed from the impact models as well)? Then, if we want to continue linearizing everything, we could combine the technosphere, biosphere, and characterization into one big matrix.