# Brightway2 seminar
Chris Mutel ([PSI](https://www.psi.ch/)), Pascal Lesage ([CIRAIG](http://www.ciraig.org/en/))

## Day 1, morning

### Learning objectives  
  - Learn about the general structure of Brightway and its most important abstractions: projects, databases, activities and exchanges  
  - Learn how to find objects (notably activities and exchanges), assign them to variables and work with them using their associated methods  
  - Learn about how simple LCA calculations are done (one product, one impact category), and specifically how the different matrices are built and used  
  - Learn how to extract information from the matrices (inputs or results) and translate them into nice, human-readable objects  
  - Learn different ways to carry out comparative LCAs  
  - Learn different ways to carry out LCAs with multiple impact categories

### Content  

#### 1) Getting started  
##### 1.1) Downloading, installing and accessing Brightway2  
  - Downloading and installing Brightway  
  - Accessing Brightway2 libraries  
  
##### 1.2) Projects  
  - Concept  
  - Creating a new project, or switching to an existing project  
  - Contents of a project  

  
##### 1.3) bw2setup(), biosphere3 database and LCIA methods  
  - bw2setup()  
  - biosphere3 database and a first look at database objects  
  - Getting activities from codes or keys  
  - Methods  
  - Looking up elementary flows (list comprehensions, search)  
  - Searching for methods  
  - Nice display of data in methods 

##### 1.4) LCI databases  
  - Importing (succinct)  
  - LCI activities  
  - Looking up activities  
  - LCI exchanges
  - Loaded LCI databases
  
#### 2) My first LCA - simplest case:  
##### 2.1) General syntax of LCA calculations  

##### 2.2) The `demand` attribute  

##### 2.3) Reminder of the system that needs to be solved in calculating an LCI  

##### 2.4) Building the matrices  

  - $A$ matrix  
  - $B$ matrix  
  - $f$ (demand array)  
  
##### 2.5) Solution to the inventory calculation  

  - Supply array  
  - Inventory matrix  
  
##### 2.6) Life Cycle Impact Assessment  

  - `.lcia()` method  
  - Simple contribution analysis  
  
#### 3) My second LCA - comparative LCA:
    
#### 4) My third LCA - Multiple impact categories
  
#### 5) My first and third LCAs revisited with MultiLCA

### 1) Getting started

#### 1.1) Downloading, installing and accessing Brightway2

##### Downloading and installing Brightway

Please [install miniconda](https://conda.io/docs/install/quick.html).

Then, in a command prompt or a terminal window (see [Brightway2 docs](https://docs.brightwaylca.org/installation.html#launching-and-using-a-command-shell) if this is something new), enter the following:

    conda create -n bw2 python=3.6

Then activate the bw2 environment using one of the following, depending on your operating system (the first form is typical on macOS/Linux, the second on Windows):

    source activate bw2
    activate bw2

Finally, install the needed libraries:

    conda install -q -y -c haasad -c conda-forge brightway2 jupyter jupyter_contrib_nbextensions jupyter_nbextensions_configurator

##### Accessing Brightway2 libraries

The different modules in Brightway2 are Python libraries. This means that, to use them, you can use any environment from which you normally use Python (Idle, command prompt, Spyder or, as is the case today, Jupyter Notebooks).  

We will favour Jupyter Notebooks in this seminar because it allows us to integrate code and text. It will also allow us to provide code snippets for you to complete.  

Note that the [Brightway2 installation package](https://docs.brightwaylca.org/installation.html) installs Brightway2 in a separate [Conda environment](https://conda.io/docs/using/using.html). This isolates Brightway2 from your other Python installations. It does, however, require you to activate the bw2 environment. You can do this the same way you [normally activate Conda environments](https://conda.io/docs/using/envs.html#change-environments-activate-deactivate), or by executing the `bw2-env.bat` batch file installed in your `bw2-python` directory (located at `C:\bw2-python` in Windows).  

The `bw2-python` directory also offers two other ways to run Brightway2: via IPython (run the `bw2-ipython.bat` file) or via Jupyter Notebooks (`bw2-notebook.bat`).  

For this course, you should run `bw2-notebook.bat` and open the Notebooks (such as this one), allowing you to directly run the code and get some hands-on experience. 

As with all other Python packages, you need to `import` the Brightway2 modules. We will import `brightway2` as `bw`. This means that, to access a method or function from the brightway2 modules, the prefix `bw.` needs to be added. 

In [None]:
import brightway2 as bw

We're also going to be using the following libraries:

In [None]:
import os               # to use "operating system dependent functionality"
import numpy as np      # "the fundamental package for scientific computing with Python"
import pandas as pd     # "high-performance, easy-to-use data structures and data analysis tools" for Python

#### 1.2) Projects

##### Concept

The top-level container in Brightway2 is the project (see [here](https://docs.brightwaylca.org/intro.html#projects) for a description and [here](https://docs.brightwaylca.org/technical/bw2data.html#projects) for the docs). A project contains LCI databases, LCIA methods and other less often used objects. Objects from one project do not interact with objects within other projects. By analogy, projects are like databases in openLCA and SimaPro.  
When you first launch Brightway2, you will be in the `default` project. You can check this using the `current` property of the `projects` object: 

In [None]:
bw.projects.current

##### Creating a new project, or switching to an existing project

Let's create a new project for this seminar, unsurprisingly called "bw2_seminar_2017". There are two ways of doing this:  
* `bw.projects.create_project('bw2_seminar_2017')` will create the project, but you will remain in your current project.
* `bw.projects.set_current('bw2_seminar_2017')` will switch you to the project passed as argument, and create it first if it doesn't exist.  Let's do the latter:

In [None]:
# The name of the project is entered as a string; 
# it doesn't really have any restrictions, so it can include spaces, 
# special characters, other languages, or even emoji.

bw.projects.set_current('bw2_seminar_2017') 

You can always see what projects you have on your computer by running `list(bw.projects)`. Unless you have worked with Brightway2 before on your computer, your list should contain two projects: 'default' and 'bw2_seminar_2017'.

_**Exercise**_: list the projects on your computer.

In [None]:
list(bw.projects)

As in all Python modules, you can get additional information on the `projects` object and associated properties and methods by typing `help(bw.projects)`. The [docs](https://docs.brightwaylca.org/technical/bw2data.html#projects) are of course more verbose.

##### Contents of a project

One property of `projects` is its location, given by `bw.projects.dir`:

In [None]:
bw.projects.dir

Looking at what is inside:  
<img src="images/project_folder_before_setup.JPG">

First things first: **do not panic**! You can use Brightway2 for years without ever opening this directory, but we will discuss some of these files later.

It is simply interesting to note that, for now, all the directories are empty except the `lci` directory, which contains an empty database.

All in all, the project takes up 4KB.  
It's now time to start populating the project.

#### 1.3) bw2setup(), biosphere3 database and LCIA methods

##### bw2setup()

The first thing you should do is add elementary flows and LCIA methods. This is done by running the `bw2setup` function:

In [None]:
bw.bw2setup()

The output tells us that `bw2setup` created some very useful things:  
  - Created a database called "biosphere3": this database contains elementary flows (called biosphere exchanges in Brightway2)  
  - 718 impact assessment methods  
  
It also created a `mapping` between the imported exchanges and integers: more on this later.  
The whole directory now takes up 125MB.

##### biosphere3 database and a first look at database objects

Looking at the contents of the `databases.db` database, we see we have imported 4029 exchanges:  
<img src="images/database_after_setup_data.JPG">

It also states that all the records are associated with the same database: `biosphere3`  

Trivia question: why "3"?

Note: You can always list the databases inside a project by simply typing `bw.databases`. This accesses the `databases.json` file in your `bw.projects.dir` (I learned the latter by typing `bw.databases?`; you should try it too).

In [None]:
bw.databases

While it is not impossible to interact with the data at this level, you probably never will unless you are developing some funky program. Instead, it is strongly recommended to learn to work with `abstractions`.  

To access a database in Brightway, you use the `Database` initialization method (again, you can type `bw.Database?` for more information - this is the last time I'll mention this).

In [None]:
bw.Database('biosphere3')

It doesn't actually return anything other than information about the Backend.  
However, there are many properties and functions associated with this database object.  These are found [here](https://docs.brightwaylca.org/technical/bw2data.html). We can also have a look through the autocomplete. Let's assign the database to a variable:

In [None]:
my_bio = bw.Database('biosphere3')

Let's check the my_bio `type`:

In [None]:
type(my_bio)

Let's check its length:

In [None]:
len(my_bio)

This is exactly the number of items we saw had been added to `databases.db`.  
Given this, what do you think is going on?

If you type `my_bio.` and press Tab, you should get a list of properties and methods associated with database objects. Try this now:

In [None]:
my_bio.        #Type my_bio. and click tab. Have a look at the different properties and objects

Some of the more basic ones we will be using now are :  
  - `random()` - returns a random activity in the database
  - `get(code)` - returns an activity, but you must know the activity code
  - `load()` - loads the whole database as a dictionary.
  - `make_searchable` - allows searching of the database (by default, it is already searchable)
  - `search` - search the database  
  
Let's start with `random`:

In [None]:
my_bio.random()

This returns a biosphere activity, but without assigning it to a variable, there is not much we can do with it directly.  

Note: It may seem counter-intuitive for elementary *flows* to be considered *activities* in Brightway, but it is no mistake. 
LCA models are made up of **nodes** (activities) that are linked by **edges** (exchanges). The biosphere activities are the nodes *outside* the technosphere. Emissions and resource extractions are modelled as exchanges between activities in the technosphere (part of the product system) and these biosphere activities.  

More on this later. 
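
To make the node/edge picture concrete, here is a minimal pure-Python sketch. It does not use Brightway at all: the keys, names and amounts are invented for illustration, and the dictionaries are not Brightway's actual storage format.

```python
# Toy graph: nodes are activities, edges are exchanges.
# All keys, names and amounts are hypothetical.
nodes = {
    ("toy_db", "steel"): {"name": "steel production", "kind": "technosphere"},
    ("toy_db", "power"): {"name": "electricity production", "kind": "technosphere"},
    ("toy_bio", "co2"): {"name": "Carbon dioxide, fossil", "kind": "biosphere"},
}

# Each edge links an input node to the activity (output node) that consumes or emits it.
edges = [
    {"input": ("toy_db", "power"), "output": ("toy_db", "steel"), "amount": 0.5},
    {"input": ("toy_bio", "co2"), "output": ("toy_db", "steel"), "amount": 1.8},
    {"input": ("toy_bio", "co2"), "output": ("toy_db", "power"), "amount": 0.9},
]


def biosphere_edges(activity_key):
    """Return the edges of an activity whose input node lies in the biosphere."""
    return [
        edge for edge in edges
        if edge["output"] == activity_key
        and nodes[edge["input"]]["kind"] == "biosphere"
    ]


steel_emissions = biosphere_edges(("toy_db", "steel"))
print(steel_emissions)
```

The point of the sketch is only that an emission is an edge between a technosphere node and a biosphere node, which is why elementary flows need to exist as nodes in the first place.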

For now, let's assign another random activity to a variable:

In [None]:
random_biosphere = my_bio.random()
random_biosphere

We can get the type of the object that was returned from the database:

In [None]:
type(random_biosphere)

The type is an **activity proxy**. Activity proxies allow us to interact with the content of the database. In the journey to and from the database, several translation layers are used:  
<img src="images/data_transition_layers.png">

In Brightway, we *almost* always work with `Activity` or `Exchange` objects. 

To see what the activity contains, we can convert it to a dictionary:

In [None]:
random_biosphere.as_dict()

##### Getting activities from codes or keys

We can see that the activities in the biosphere3 database have unique **codes**, which we can use with the `get` function:

In [None]:
my_bio.get(random_biosphere['code'])

Activities can also be "gotten" via `get_activity`, but the argument is the activity **key**, consisting of a tuple with two elements: the database name, and the activity code.

**Exercise:** Use `bw.get_activity()` to retrieve the random biosphere activity. 

In [None]:
database_name = 'biosphere3'
code = random_biosphere['code']
random_biosphere_key = (database_name, code)
random_biosphere_key

In [None]:
bw.get_activity(random_biosphere_key)

You can always find (or return) the key to an activity using the `.key` property.

In [None]:
random_biosphere.key

##### Searching for activities

If we are looking for a specific elementary flow, we can use the `search` method of the database (see [here](https://docs.brightwaylca.org/technical/bw2data.html#default-backend-databases-stored-in-a-sqlite-database) for more details on using search):

In [None]:
bw.Database('biosphere3').search('carbon dioxide')

It is also possible to use "filters" to narrow searches, e.g.

In [None]:
bw.Database('biosphere3').search('carbon dioxide', filter={'categories':'urban', 'name':'fossil'})

The database object is also iterable, allowing "home-made" searches through list comprehensions. This approach is more flexible because you can combine as many criteria as you want:

In [None]:
[act for act in my_bio if 'Carbon dioxide' in act['name'] 
                                            and 'fossil' in act['name']
                                            and 'non' not in act['name']
                                            and 'urban air close to ground' in str(act['categories'])
         ]

Activities returned by searches or list comprehensions can be assigned to variables, but to do so, one needs to select the activity by index. Based on the above, I can refine my filters to ensure the list comprehension only returns one activity, and then choose it without fear of choosing the wrong one.

In [None]:
[act for act in my_bio if 'Carbon dioxide' in act['name'] 
                                            and 'fossil' in act['name']
                                            and 'non' not in act['name']
                                            and 'urban air close to ground' in str(act['categories'])]

In [None]:
activity_I_want = [act for act in my_bio if 'Carbon dioxide' in act['name'] 
                                            and 'fossil' in act['name']
                                            and 'non' not in act['name']
                                            and 'urban air close to ground' in str(act['categories'])
         ][0]
activity_I_want
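
Subscripting with `[0]` silently picks the first match even when several remain. As a hedged alternative, the sketch below uses a small helper, `pick_one` (a hypothetical name, not part of Brightway), that fails loudly unless exactly one candidate is left. It is shown on plain dictionaries with invented values so that it runs without a project; the same function would accept a list comprehension over `my_bio`.

```python
def pick_one(candidates):
    """Return the only element of `candidates`; raise if there are zero or several."""
    candidates = list(candidates)
    if len(candidates) != 1:
        raise ValueError("Expected exactly one match, got {}".format(len(candidates)))
    return candidates[0]


# Hypothetical stand-ins for activities (plain dictionaries, not Brightway objects)
toy_flows = [
    {"name": "Carbon dioxide, fossil", "categories": ("air", "urban air close to ground")},
    {"name": "Carbon dioxide, fossil", "categories": ("air",)},
    {"name": "Carbon dioxide, non-fossil", "categories": ("air", "urban air close to ground")},
]

chosen = pick_one(
    flow for flow in toy_flows
    if "Carbon dioxide" in flow["name"]
    and "non" not in flow["name"]
    and "urban air close to ground" in str(flow["categories"])
)
print(chosen)
```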

**Exercise**: look for and assign to a variable an emission of nitrous oxide emitted to air in the "urban air" subcompartment.

In [None]:
# First inquiry:
[act for act in my_bio if 'nitrogen' in act['name']
                       and 'urban air' in str(act['categories'])
         ]

In [None]:
# Found what I need:
[act for act in my_bio if 'Dinitrogen monoxide' in act['name']
                       and 'urban air close to ground' in str(act['categories'])
         ][0]

Let's leave the biosphere database here for now.

##### Methods

`bw2setup()` also installed LCIA methods.

In [None]:
bw.methods

One can load a random method:

In [None]:
bw.methods.random()

In [None]:
type(bw.methods.random())

Here, the random method returns the tuple by which the method is identified. To get to an actual method, the following syntax is used:

In [None]:
bw.Method(bw.methods.random())

Of course, a random method is probably not useful except to play around. To find an actual method, one can again use list comprehensions. Let's say I am interested in using the IPCC2013 100 years method:

In [None]:
[m for m in bw.methods if 'IPCC' in str(m) and ('2013') in str(m) and '100' in str(m)]

I am interested in the second of these, and will assign it to a variable. I will refine my search until there is one element in my list and then choose it by subscripting.

In [None]:
[m for m in bw.methods if 'IPCC' in m[0]
                        and ('2013') in str(m)
                        and 'GWP 100' in str(m)
                        and 'no LT' not in str(m)]

In [None]:
# Good, now let's choose it:
ipcc2013 = [m for m in bw.methods if 'IPCC' in m[0]
                    and ('2013') in str(m)
                    and 'GWP 100' in str(m)
                    and 'no LT' not in str(m)][0]

Of course, if I know exactly the method I want, and I know the syntax, I can simply type it out: `('IPCC 2013', 'climate change', 'GWP 100a')`

In [None]:
type(ipcc2013)

In [None]:
ipcc_2013_method = bw.Method(ipcc2013)

In [None]:
type(ipcc_2013_method)

Again, there are a bunch of methods associated with a method object. You can access these by typing `ipcc_2013_method.` and pressing Tab.  
For example, metadata:

In [None]:
ipcc_2013_method.name

In [None]:
ipcc_2013_method.metadata

In [None]:
ipcc_2013_method.metadata['unit']

Question: where is this metadata?

Let's use the `load` method to see what is in the object:

In [None]:
ipcc_2013_method.load()

This contains tuples of the form (elementary flow key, characterization factor).

##### Nice display of data in methods 

**Exercise:** Create a dictionary with keys = elementary flow names and values = characterization factors for the ILCD "respiratory effects, inorganics" method (including long-term emissions).  
Bonus (optional): Generate a Pandas Series with the resulting dictionary. 

In [None]:
# First exploration
[m for m in bw.methods if 'ILCD' in str(m) 
                        and 'inorganics' in str(m)]

In [None]:
# Refine search and assign to a variable
ILCD_resp_effects_tuple = [m for m in bw.methods if 'ILCD' in str(m) 
                        and 'inorganics' in str(m)
                        and 'no LT' not in str(m)][0]
ILCD_resp_effects = bw.Method(ILCD_resp_effects_tuple)
ILCD_resp_effects

In [None]:
# Generate the dictionary using a comprehension:
ILCD_resp_effects_dict = {bw.get_activity(ef[0])['name']:ef[1] for ef in ILCD_resp_effects.load()}
ILCD_resp_effects_dict

In [None]:
# Bonus: put the whole thing in a neat Pandas series
pd.Series(ILCD_resp_effects_dict,
          name="{}, {}".format(ILCD_resp_effects.name, ILCD_resp_effects.metadata['unit']))

Enough said for now about methods.

#### 1.4) LCI databases

There is much information on the structure of LCI databases in Brightway2 [here](https://docs.brightwaylca.org/intro.html#inventory-databases), [here](http://nbviewer.jupyter.org/urls/bitbucket.org/cmutel/brightway2/raw/default/notebooks/Databases.ipynb) and [here](https://docs.brightwaylca.org/technical/bw2data.html#databases).  Probably the easiest way to learn about them, however, is to import one and have a look.  

Here is the code to import the ecoinvent v3.3 database. Don't do it though, not now: it takes too long, we're better off working with a smaller database and getting more done.

In [None]:
fpei33 = "enter your path to the datasets e.g. r'C:\FolderWith\LotsOf\ecoSpold2\Files'"

Next, execute the following code. It will import the ecoinvent database. We will look in detail in another Notebook at what this code actually does.

In [None]:
if 'ecoinvent 3.3 cutoff' in bw.databases:
    print("Database has already been imported")
else:
    ei33 = bw.SingleOutputEcospold2Importer(fpei33, 'ecoinvent 3.3 cutoff')
    ei33.apply_strategies()
    ei33.statistics()

The database needs to be written to disk:

In [None]:
if 'ecoinvent 3.3 cutoff' in bw.databases:
    print("Database has already been imported")
else:
    ei33.write_database()

Let's instead import **ecoinvent v2.2**:

In [None]:
fpei22 =  "enter your path to the datasets e.g. r'C:\FolderWith\LotsOf\ecoSpold1\Files'"

In [None]:
if 'ecoinvent 2.2' in bw.databases:
    print("Database has already been imported")
else:
    ei22 = bw.SingleOutputEcospold1Importer(fpei22, 'ecoinvent 2.2')
    ei22.apply_strategies()
    ei22.statistics()

In [None]:
if 'ecoinvent 2.2' in bw.databases:
    print("Database has already been imported")
else:
    ei22.write_database()

Code to import LCI databases in other formats is found [here](https://bitbucket.org/cmutel/brightway2-io/src/211f748e7b9987aef452a1ead1f483cc0b4bc25c/bw2io/importers/?at=default). We'll explore this later.

You can check that the database has actually been added to your project: 

In [None]:
bw.databases

Note: the `ei22` (or `ei33`) object created above is not the actual database, but an object used strictly for importing. 

In [None]:
# Only execute this code if you've just imported ecoinvent v2.2
type(ei22)

To access the actual database, you need to use `bw.Database`: 

In [None]:
bw.Database('ecoinvent 2.2')

This is a more advanced topic, but note that there are alternative backends. See [here](https://docs.brightwaylca.org/technical/bw2data.html#inventory-data-backends).

Let's assign the database to a variable and see what we can do:

In [None]:
eidb = bw.Database('ecoinvent 2.2')

In [None]:
# Check the length of the database:
len(eidb)

Again, we can get an idea of useful methods and attributes by typing `eidb.` and pressing Tab. Do this now.

In [None]:
eidb. #Press tab!

##### LCI activities

In the context of LCI databases, activities are the nodes "within the technosphere". They are therefore the columns in the technosphere matrix $A$.  
There are different ways to get access to an activity. Let's use the `random()` method for now to explore a random activity in the ecoinvent database.

In [None]:
random_act = eidb.random()

In [None]:
random_act

In [None]:
type(random_act)

To see what is stored in an activity object, let's convert our random activity into a dictionary: 

In [None]:
random_act.as_dict()

Notice one important thing: **no exchanges**!  
Indeed, the exchanges and the activities are stored in two different tables of the `databases.db` database.  
It is possible, however, to iterate through the exchanges of the activities.

##### Searching and getting LCI activities

Searching and getting LCI activities is done exactly the same way as for activities in the biosphere3 database:

In [None]:
random_act.as_dict()

In [None]:
# Using search
eidb.search('transport', filter={'name':'lorry'})

In [None]:
random_act['location']

In [None]:
# Using list comprehensions:
[act for act in eidb if 'lorry' in act['name']
                    and 'RER' in act['location']
                    and '>32t' in act['name']
                    and 'operation' not in act['name']
]

**Exercise:** Return an activity for electricity production, coal-fired power plants in Germany

In [None]:
[act for act in eidb if 'electricity' in act['name']
                        and 'coal' in act['name']
                        and act['location']=='DE'
              ][0]

##### LCI exchanges

**`Exchanges`** are the edges between nodes.

These can be:  
  - an edge between two activities within the technosphere (an element $a_{ij}$ of matrix $A$)  
  - edges between an activity in the technosphere and an activity in the "biosphere" (an element $b_{kj}$ of the biosphere matrix $B$).

One can iterate through **all** exchanges that have a given activity as `output`:

In [None]:
for exc in random_act.exchanges():
    print(exc)

One can also iterate through subsets of the exchanges:  
  - Technosphere exchanges: exchanges linking to other activities in the technosphere, `activity.technosphere()`  
  - Biosphere exchanges: AKA elementary flows, linking to activities in the biosphere database `activity.biosphere()`  
  - Production exchange: the reference flow of the activity, `activity.production()`  

Let's assign a **technosphere exchange** to a variable to learn more about it:

In [None]:
random_techno_exchange = [exc for exc in random_act.technosphere()][0]
random_techno_exchange

In [None]:
type(random_techno_exchange)

Again, the type is a proxy (refer to the diagram above about the different translation layers).

In [None]:
# Amount, or weight of the edge
random_techno_exchange.amount

In [None]:
# Activity the exchange stems from
random_techno_exchange.input

In [None]:
# Activity the exchange terminates in
random_techno_exchange.output

In [None]:
# Exchange as a dictionary
random_techno_exchange.as_dict()

Let's now look at a biosphere exchange.

**Exercise:** Assign a biosphere flow to a variable, and check the following:  
  - Is the output the same as for the technosphere exchange?  
  - What database does the input of the biosphere exchange come from?  
  - What is the amount of the exchange (i.e. the weight of the edge connecting the two activities)?
  
NOTE: If you get a `list index out of range` error when trying to subscript your list comprehension, it means your list comprehension is empty, i.e. that there are no biosphere flows associated with the activity.

In [None]:
# Assign the exchange to a variable:
random_bio_exchange = [exc for exc in random_act.biosphere()][0]
random_bio_exchange

In [None]:
# Output of biosphere exchange
random_bio_exchange.output

In [None]:
# Is it the same as the output of the technosphere exchange? It should be!
random_bio_exchange.output == random_techno_exchange.output

In [None]:
# Database of the random biosphere exchange input - `.input` directly returns the activity proxy!
random_bio_exchange.input.key[0]

In [None]:
# Amount of exchange
random_bio_exchange['amount']

##### Loaded LCI databases

It is possible to load the entire database into a dictionary.  
This greatly speeds up work if you need to iterate over all activities or exchanges. The resulting object is quite big, so you should do this only if the gain in efficiency is worth it.

In [None]:
# Let's not do this in the seminar, ok?
eidb_loaded = eidb.load()

In [None]:
eidb_loaded

## 2) My first LCA

Brightway has a so-called LCA object.  
It is instantiated using `LCA(args)`.  
The only required argument is a functional unit, described by a dictionary with keys = activities and values = amounts (more [here](https://docs.brightwaylca.org/lca.html#specifying-a-functional-unit)).  
A second argument that is often passed is an LCIA method, passed using the method tuple.  

### 2.1) General syntax of LCA calculations

Let's create our first LCA object using our random activity and our IPCC method.  

In [None]:
myFirstLCA_quick = bw.LCA({random_act:1}, ('IPCC 2013', 'climate change', 'GWP 100a'))

The steps to get to the impact score are as follows:

In [None]:
myFirstLCA_quick.lci()    # Builds matrices, solves the system, generates an LCI matrix.
myFirstLCA_quick.lcia()   # Characterization, i.e. the multiplication of the elements 
                          # of the LCI matrix with characterization factors from the chosen method
myFirstLCA_quick.score    # Returns the score, i.e. the sum of the characterized inventory
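
The arithmetic behind `.lcia()` and `.score` can be sketched without Brightway. The flow names, inventory amounts and characterization factors below are invented for illustration (they are not real IPCC values), and plain dictionaries stand in for the matrices Brightway actually uses.

```python
# Hypothetical inventory: amount of each elementary flow (kg) for the functional unit
inventory = {"Carbon dioxide, fossil": 2.0, "Methane, fossil": 0.01, "Water": 5.0}

# Hypothetical characterization factors (kg CO2-eq per kg); not real method data
cfs = {"Carbon dioxide, fossil": 1.0, "Methane, fossil": 30.0}

# Characterization: multiply each inventory amount by its factor.
# Flows without a factor in the chosen method contribute nothing.
characterized = {
    flow: amount * cfs.get(flow, 0.0) for flow, amount in inventory.items()
}

# The score is the sum of the characterized inventory.
score = sum(characterized.values())
print(characterized)
print(score)
```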

Let's now take a closer look at the LCA object and its methods/attributes. We'll do this by creating a new LCA object: 

In [None]:
myFirstLCA = bw.LCA({random_act:1}, ('IPCC 2013', 'climate change', 'GWP 100a'))

### 2.2) The `demand` attribute

In [None]:
myFirstLCA.demand

To access the actual activity from the demand, you would do this:

In [None]:
demanded_act = list(myFirstLCA.demand.keys())[0]
demanded_act

In [None]:
demanded_act == random_act

There are also other attributes that have simply not been built yet, such as the `demand_array` and the `score`. To generate them, we first need to actually build the matrices. This will be done when calling the `.lci()` method.

### 2.3) Reminder of the system that needs to be solved in calculating an LCI

Before actually running the `.lci()` method, here's a quick refresher of the actual calculation that Brightway will need to do to calculate the inventory:  

$g=BA^{-1}f$  

where:  

  - $A$ = the technosphere matrix  
  - $B$ = the biosphere matrix (matrix with elementary flows)  
  - $f$ = the final demand vector  
  - $g$ = the inventory  
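
As a worked example of $g=BA^{-1}f$, here is a toy system with two activities and one elementary flow, written in pure Python so that it runs anywhere. All numbers are invented, and `solve_2x2` is a hypothetical helper that only works for this toy size. The sketch solves $As=f$ for the supply array $s$ and then computes $g=Bs$, which gives the same result as forming $A^{-1}$ explicitly.

```python
def solve_2x2(A, f):
    """Solve A s = f for a 2x2 matrix using Cramer's rule (toy example only)."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("A is singular")
    return [(f[0] * d - b * f[1]) / det, (a * f[1] - c * f[0]) / det]


# Rows = products (steel, electricity); columns = activities producing them.
# Production is positive, consumption is negative. All values are hypothetical.
A = [
    [1.0, -0.1],   # steel: produced by activity 0, consumed by activity 1
    [-0.5, 1.0],   # electricity: consumed by activity 0, produced by activity 1
]
B = [[1.8, 0.9]]   # kg CO2 emitted per unit output of each activity
f = [1.0, 0.0]     # final demand: one unit of steel

s = solve_2x2(A, f)                                          # supply array
g = [sum(b_kj * s_j for b_kj, s_j in zip(row, s)) for row in B]  # inventory
print("supply:", s)
print("inventory:", g)
```

Because the two activities feed each other, the supply of steel ends up slightly above the one unit demanded; this loop is exactly why a linear system has to be solved rather than simply summing direct inputs.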

**Discussion:** Knowing what you do about the structure of Brightway (notably, activities and exchanges), what needs to happen to generate these matrices?

To consider:  
  - how should the order of the rows and columns be determined?  
  - how should we keep track of what is in each row and column?  
  - The parameters in the matrices are sometimes actually probability distribution functions - how should we consider this uncertainty information?  
  - The matrices are *sparse*, i.e. they are mostly made up of zeros. Should we consider this? Why? How?

### 2.4) Building the matrices

#### Structured arrays

LCI data imported in Brightway is stored in the `databases.db` database, discussed above.  
It is also stored in [numpy *structured arrays*](https://docs.scipy.org/doc/numpy/user/basics.rec.html). Here is a look at the structured array for my ecoinvent 2.2 import (put in a pandas DataFrame because it looks nicer):  
<img src="images/structured_array_ecoinvent22.JPG">

**Exercise** (not core): Load the structured array of the ecoinvent database you are working with now.

In [None]:
eidb.filepath_processed()

In [None]:
your_structured_array = np.load(eidb.filepath_processed())
pd.DataFrame(your_structured_array).head()

In this array:  
  - `input` and `output` columns are integers that map to an activity. This mapping is found in the `mapping.pickle` file in the `project` directory and it looks something like this:

In [None]:
pd.Series(bw.mapping).head()

  - `row` and `col` store *dummy* placeholder information about the location of the parameter in the matrices. 
  - the `type` indicates whether the exchange is a reference flow (`type`=0), technosphere exchange (`type`=1) or elementary flow (`type`=2).  
  - the other columns deal with uncertainty data. We'll cover that later, but one can always read about these columns in the [`stats_arrays` documentation](http://stats-arrays.readthedocs.io/en/latest/)
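
The role of the `type` column can be illustrated with a few invented rows. This is a sketch using plain dictionaries rather than a real structured array; the integer ids and amounts are hypothetical, and `TYPE_LABELS` is just a local lookup table for readability.

```python
# Toy rows mimicking the columns described above (all values invented).
# type: 0 = reference flow, 1 = technosphere exchange, 2 = elementary flow
rows = [
    {"input": 10, "output": 10, "type": 0, "amount": 1.0},
    {"input": 11, "output": 10, "type": 1, "amount": 0.5},
    {"input": 3, "output": 10, "type": 2, "amount": 1.8},
]

TYPE_LABELS = {0: "reference flow", 1: "technosphere exchange", 2: "elementary flow"}

# Reference flows and technosphere exchanges describe the technosphere matrix A;
# elementary flows describe the biosphere matrix B.
technosphere_rows = [row for row in rows if row["type"] in (0, 1)]
biosphere_rows = [row for row in rows if row["type"] == 2]

for row in rows:
    print(row["input"], "->", row["output"], TYPE_LABELS[row["type"]], row["amount"])
```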

When the `.lci()` method is called, the structured arrays are used to build matrices. The code responsible for this is in the [`MatrixBuilder` class methods](https://bitbucket.org/cmutel/brightway2-calc/src/105e24e2d803c96773651ed73c43d850f9c23548/bw2calc/matrices.py?at=default&fileviewer=file-view-default). 

The method `MatrixBuilder.build_dictionary` takes the input and output values, respectively, and figures out which rows and columns they correspond to. The actual code is succinct - only one line - but what it does is:
  - Get all unique values, as each value will appear multiple times
  - Sort these values
  - Give them integer indices, starting with zero

This information on row and column indices is sufficient to build matrices. These matrices are built in a [COOrdinate sparse matrix](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.coo_matrix.html) format, where, for each exchange, three values are required: row position, column position, and amount (the actual value). The sparse matrices are actually stored in [CSR format](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_matrix.html#scipy.sparse.csr_matrix), but this is a detail.  
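
The three steps (unique, sort, number from zero) and the resulting COO triplets can be sketched in a few lines of pure Python. This is not the `MatrixBuilder` code itself: `build_index` is a hypothetical helper, the integer ids are invented, and sign conventions for technosphere inputs are ignored to keep the focus on indexing.

```python
# Hypothetical integer ids, as they might appear in the `input` and `output`
# columns of a structured array (the numbers are invented).
exchanges = [
    {"input": 4021, "output": 4021, "amount": 1.0},   # reference flow
    {"input": 4077, "output": 4021, "amount": 0.5},   # technosphere input
    {"input": 4077, "output": 4077, "amount": 1.0},   # reference flow
]


def build_index(values):
    """Unique values, sorted, then numbered with integers starting at zero."""
    return {value: index for index, value in enumerate(sorted(set(values)))}


row_index = build_index(exc["input"] for exc in exchanges)
col_index = build_index(exc["output"] for exc in exchanges)

# COO-style triplets: (row position, column position, amount)
coo = [
    (row_index[exc["input"]], col_index[exc["output"]], exc["amount"])
    for exc in exchanges
]
print(row_index)
print(col_index)
print(coo)
```

Three lists of equal length (rows, columns, values) are all a COO sparse matrix needs, which is why this format is a natural fit for data that already lives as one record per exchange.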

Some more details are found [here](https://docs.brightwaylca.org/lca.html#building-matrices).  

Let's now finally run `.lci()`.

In [None]:
myFirstLCA.lci()

Here's what the structured arrays *now* look like:  

In [None]:
pd.DataFrame(myFirstLCA.bio_params).head(5) # Technosphere parameters are at myFirstLCA.tech_params

We see that the row and col numbers are no longer dummy variables, but that they actually have real matrix indices.

#### Dictionaries that map between indices and activities

One of the useful things that the `MatrixBuilder` produces are `dictionaries` that map row and column numbers to the keys of activities.  There are three such dictionaries, all directly accessible as attributes of the LCA object:
 - `activity_dict`: Columns in the technosphere matrix $A$ or biosphere matrix $B$
 - `product_dict` : Rows in the technosphere matrix $A$  
 - `biosphere_dict`: Rows in the biosphere matrix $B$

Here is what the `activity_dict` looks like:

In [None]:
myFirstLCA.activity_dict

So, if I know the key to my activity (which, again,  is a `tuple` consisting of the database name and the activity code), I can read the column index (from `activity_dict`) or row index (from `product_dict` or `biosphere_dict` for the $A$ or $B$ matrices, respectively). 

Let's find out what column is associated with the activity that is producing our final demand as reference flow.

In [None]:
# Getting the key from the `demand` attribute:
act_key = list(myFirstLCA.demand)[0].key
# Getting the column number from the activity_dict:
col_index = myFirstLCA.activity_dict[act_key]
print("The column index for activity {} is {}".format(act_key, col_index))

While this is useful, it is often more useful to determine what a row or column in the matrices actually refers to. In these cases, we need a dictionary that maps row or column indices to activity keys, and not the opposite.  
We can do this by reversing our dictionaries:

In [None]:
myFirstLCA_rev_activity_dict = {value:key for key, value in myFirstLCA.activity_dict.items()}
myFirstLCA_rev_activity_dict

As a convenience, Brightway offers a method that will generate the three reverse dictionaries simultaneously.  
`.reverse_dict()` returns three reverse dicts (reverse activity dict,  reverse product dict,  reverse biosphere dict) *in that order*. The syntax for creating and assigning these reverse dicts is therefore: 

In [None]:
myFirstLCA_rev_act_dict, myFirstLCA_rev_product_dict, myFirstLCA_rev_bio_dict = myFirstLCA.reverse_dict()

#### $A$ and $B$ Matrices

We can also access the matrices that were constructed. Let's look at the technosphere matrix ($A$).  
In the **$A$ matrix**, each element $a_{ij}$ provides information on the amount of product $i$ that is an input to or an output from activity $j$. When $i=j$, the element $a_{ij}$ is the *reference flow* for the activity described in the column.

In [None]:
myFirstLCA.technosphere_matrix

I am told that the dimensions of the matrix are $n \times n$, where $n$ is the number of activities in my product system, and that the number of elements actually stored is much smaller than $n^2$ (because the matrix is *sparse* and zero values are not stored).  

We can have an idea of what it stores by printing it out:

In [None]:
print(myFirstLCA.technosphere_matrix)

It therefore stores both the coordinates and the values (as expected).
We can slice this matrix using coordinates. For example, let's say we wanted a view of the exchanges associated with the unit process providing our functional unit.  
We already found the column number for that activity: 

In [None]:
print("As a reminder, the column index for  {} is  {}".format(act_key, col_index))

To return the whole column from the $A$ matrix, we therefore slice the $A$ matrix.  
Python notes:  
  - In Python, slicing is done using `[]`
  - We specify rows first, then columns  
  - `:` refers to "the whole row" or "the whole column" (depending on whether it is passed first or second in the `[]`) 

In [None]:
myColumn = myFirstLCA.technosphere_matrix[:, col_index]
myColumn

Printing this out gives:

In [None]:
print(myColumn)

Not too useful: it would be better to get the *names* of these exchanges.  
We need to do three things:  
  - Get the indices from the CSR matrix (we can do this by converting it to a sparse matrix in `COOrdinate` format first)  
  - Get the activity key for each row index (we can do this using the reverse of the `product_dict`)  
  - Use `get_activity` to access the actual names of the activities.  

1) Converting the CSR matrix to a COO matrix:  

In [None]:
myColumnCOO = myColumn.tocoo()
myColumnCOO

It is still a sparse matrix with the same number of elements, and it looks quite like the CSR version when we print it out:

In [None]:
print(myColumnCOO)

However, we can directly access the row and column indices using `row` and `col`:  

In [None]:
myColumnCOO.row

2) Get the activity code for each element using the **reverse** product dictionary we produced above:

In [None]:
# Using a list comprehension:
[myFirstLCA_rev_product_dict[i] for i in myColumnCOO.row]

It would be even nicer to get the names for these:

In [None]:
names_of_my_inputs = [bw.get_activity(myFirstLCA_rev_product_dict[i])['name'] for i in myColumnCOO.row]
names_of_my_inputs

We can put these in a neat Pandas Series, with actual names and amounts:

In [None]:
# First create a dict with the information I want:
myColumnAsDict = dict(zip(names_of_my_inputs,myColumnCOO.data))
# Create Pandas Series from dict
pd.Series(myColumnAsDict, name="Nice series with information on exchanges in my foreground process")
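The chain from COO indices to named amounts can also be sketched in plain Python, without Brightway or SciPy. Everything here is a toy stand-in: the keys, names and amounts are made up, and the `names` dictionary plays the role of `bw.get_activity(key)['name']`.

```python
# Toy stand-ins for the objects used above; every key, name and amount is made up.
product_dict = {("db", "steel"): 0, ("db", "electricity"): 1, ("db", "car"): 2}
names = {
    ("db", "steel"): "steel production",
    ("db", "electricity"): "electricity production",
    ("db", "car"): "car production",
}

# Non-zero entries of one technosphere column in COO form: parallel row/data lists.
coo_row = [0, 1, 2]
coo_data = [-800.0, -150.0, 1.0]

# Reverse the dictionary so that row indices map back to keys.
rev_product_dict = {index: key for key, index in product_dict.items()}

# Look up the name behind each key and pair it with the amount.
column_as_dict = {names[rev_product_dict[i]]: amount
                  for i, amount in zip(coo_row, coo_data)}
print(column_as_dict)
```

One caveat of keying the result by name: if two exchanges share the same name (for example the same product from two locations), the later one overwrites the earlier one in the dictionary. Keying by activity key, or by name and location together, avoids this.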

Alternative way to generate similar information without even looking at the matrices:

In [None]:
pd.Series({bw.get_activity(exc.input)['name']:exc.amount for exc in random_act.technosphere()}, 
          name="alternative way to generate exchanges")

Note the differences:  
  - The reference flow is not there (`activity.technosphere()` only returns technosphere exchanges where the input is not equal to the output)  
  - The values are positive, not negative (because the $A$ matrix is $I-Z$, where $Z$ contains the information on these inputs).
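As a small worked illustration of this sign convention (the numbers are invented, and it assumes every activity is scaled to a reference flow of exactly 1): if activity 2 needs 0.5 units of product 1 and activity 1 needs 0.1 units of product 2, then

$$
Z = \begin{pmatrix} 0 & 0.5 \\ 0.1 & 0 \end{pmatrix}, \qquad
A = I - Z = \begin{pmatrix} 1 & -0.5 \\ -0.1 & 1 \end{pmatrix}
$$

The inputs appear as positive amounts in the exchanges ($Z$) and as negative off-diagonal entries in $A$, while the reference flows sit on the diagonal.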

**Exercise**: Create a Pandas Series with the elementary flows of the activity supplying the reference flow for myFirstLCA.

In [None]:
# Solution
myBioColumn = myFirstLCA.biosphere_matrix[:, col_index]
myBioValues = myBioColumn.tocoo().data
myBioNames = [bw.get_activity(myFirstLCA_rev_bio_dict[row])['name'] for row in myBioColumn.tocoo().row]
pd.Series(dict(zip(myBioNames, myBioValues)))

#### Demand array $f$

The demand array is the $f$ in $As=f$. 
It is an attribute of the LCA object:

In [None]:
myFirstLCA.demand_array

Looks like it is all zeros, but not so:

In [None]:
myFirstLCA.demand_array.sum()

So where is the one? We can know this by using our `activity_dict`.

In [None]:
demand_database = list(myFirstLCA.demand.keys())[0]['database']
demand_code = list(myFirstLCA.demand.keys())[0]['code']
(demand_database, demand_code)

In [None]:
row_of_demand = myFirstLCA.activity_dict[(demand_database, demand_code)]
row_of_demand # Row number of our demand vector containing the functional unit.

In [None]:
myFirstLCA.demand_array[row_of_demand]

### 2.5) Solution to the inventory calculation

We saw above how `.lci()` produced the $A$ and $B$ matrices.  
`.lci()` also *solves* the equation $As=f$ and calculates the inventory by multiplying the solution to this equation by the biosphere matrix.  

#### Supply array

Vector containing the amount each activity will need to provide to meet the functional demand, i.e. $s=A^{-1}f$.

In [None]:
myFirstLCA.supply_array

In [None]:
myFirstLCA.supply_array.shape

**Inventory matrix**  
Contains the inventory *by activity* (i.e. not summed). In other words, we do not have $g=BA^{-1}f$, but rather $G=B \cdot diag(A^{-1}f)$

In [None]:
myFirstLCA.inventory

We can aggregate the LCI results along the columns (i.e. calculate the cradle-to-gate inventory):

In [None]:
LCI_cradle_to_gate = myFirstLCA.inventory.sum(axis=1)
LCI_cradle_to_gate.shape
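The relationship between the supply array, the inventory matrix and the aggregated inventory can be checked by hand on a toy system. The sketch below uses plain Python and a made-up 2x2 technosphere matrix; `solve_2x2` is a hypothetical helper using Cramer's rule and is not how Brightway solves the (much larger, sparse) system.

```python
# Toy 2x2 system; all numbers are made up.
# Column j of A describes activity j; activity 1 needs 0.5 units of product 0.
A = [[1.0, -0.5],
     [0.0, 1.0]]
f = [0.0, 1.0]      # demand: one unit of product 1
B = [[2.0, 4.0]]    # one elementary flow, emitted per unit of each activity


def solve_2x2(A, f):
    """Solve A s = f for a 2x2 system with Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    s0 = (f[0] * A[1][1] - A[0][1] * f[1]) / det
    s1 = (A[0][0] * f[1] - A[1][0] * f[0]) / det
    return [s0, s1]


s = solve_2x2(A, f)                                      # supply array
G = [[b * s_j for b, s_j in zip(row, s)] for row in B]   # G = B . diag(s)
g = [sum(row) for row in G]                              # row sums: total inventory

print(s, G, g)
```

Here `G` keeps the emission of each activity in its own column, whereas `g` is what remains after summing over the columns, as `inventory.sum(axis=1)` does above.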

**Exercise:** Get the total (cradle-to-gate) emissions of nitrous oxide (dinitrogen monoxide) emitted to air in the "urban air close to ground" subcompartment.

In [None]:
NOx_act = [act for act in my_bio if 'Dinitrogen monoxide' in act['name']
                       and 'urban air close to ground' in str(act['categories'])
         ][0]
NOx_act.key

In [None]:
NOx_row = myFirstLCA.biosphere_dict[NOx_act.key]  # biosphere_dict maps keys to row indices
NOx_row

In [None]:
myFirstLCA.inventory[NOx_row, :].sum()

### 2.7) LCIA

The LCIA calculation is done via the `.lcia()` method.

In [None]:
myFirstLCA.lcia()

A number of other matrices are now available:

In [None]:
# Matrix of characterization factors:
myFirstLCA.characterization_matrix

In [None]:
myFirstLCA.characterization_matrix.shape

In [None]:
# Matrix of characterized inventory flows
myFirstLCA.characterized_inventory

Question: would there be more, fewer or just as many elements in the inventory matrix as there are in the characterized inventory matrix?

The overall score is now an attribute of the LCA object: 

In [None]:
myFirstLCA.score

We also could have determined what this score was by summing the elements of our `characterized_inventory` matrix:

In [None]:
myFirstLCA.characterized_inventory.sum()

We could also have calculated it by multiplying the inventory and characterization factors ourselves:

In [None]:
(myFirstLCA.characterization_matrix * myFirstLCA.inventory).sum()

We could also calculate the score by elementary flow (summing across the columns for each row), irrespective of the unit process that produced it:

In [None]:
# axis=1 sums over the columns, leaving one value per row (i.e. per elementary flow)
elementary_flow_contribution = myFirstLCA.characterized_inventory.sum(axis=1)

In [None]:
elementary_flow_contribution.shape

Notice that it has **two** dimensions. The result is in fact a matrix with a single column, not a one-dimensional array:

In [None]:
type(elementary_flow_contribution)

To convert it to an array (probably more useful for many purposes), you can use any of the following approaches:

In [None]:
elementary_flow_contribution.A1 
#np.squeeze(np.asarray(elementary_flow_contribution))
#np.asarray(elementary_flow_contribution).reshape(-1)
#np.array(elementary_flow_contribution).flatten()
#np.array(elementary_flow_contribution).ravel()

**Exercise:** Create a Pandas series that has the scores per unit process, sorted by value (contribution analysis)

In [None]:
# Create array with the results per column (i.e. per activity)
results_by_activity = (myFirstLCA.characterized_inventory.sum(axis=0)).A1

In [None]:
# Create a list of names in columns
list_of_names_in_columns = [bw.get_activity(myFirstLCA_rev_act_dict[col])['name'] 
                            for col in range(myFirstLCA.characterized_inventory.shape[1])]

In [None]:
pd.Series(index=list_of_names_in_columns, data=results_by_activity).sort_values(ascending=False).head(10)
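The different ways of summing the characterized inventory can be summarized on a toy example. This is a plain-Python sketch with made-up flows, activities, characterization factors and amounts; nested lists stand in for the sparse matrices.

```python
# Toy data; flows, activities and numbers are all made up.
flows = ["CO2", "CH4"]
activities = ["electricity", "steel", "car"]
cf = [1.0, 30.0]              # characterization factor per elementary flow
G = [[10.0, 20.0, 5.0],       # inventory: rows = flows, columns = activities
     [0.5, 0.0, 0.25]]

# Characterized inventory: each row of G scaled by the factor of its flow.
characterized = [[cf_i * amount for amount in row] for cf_i, row in zip(cf, G)]

# Total score: sum over all elements.
score = sum(sum(row) for row in characterized)

# Per elementary flow: sum each row (what axis=1 does above).
by_flow = dict(zip(flows, (sum(row) for row in characterized)))

# Per activity: sum each column (what axis=0 does above).
by_activity = dict(zip(activities, (sum(col) for col in zip(*characterized))))

# Contribution analysis: activities sorted from largest to smallest contribution.
ranking = sorted(by_activity.items(), key=lambda item: item[1], reverse=True)
print(score, by_flow, ranking)
```

Whichever way the elements are grouped, the group totals add up to the same overall score.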

## 3) My second LCA: Comparative LCA

Let's choose two activities to compare, say Swiss electricity produced from a run-of-river hydropower plant and from a wind turbine.

**Exercise**: assign the two activities to variables `hydro` and `wind` respectively.

In [None]:
[act for act in eidb if "wind" in act['name'] and "electricity" in act['name'] and "CH" in act['location']]

In [None]:
wind = [act for act in eidb if "wind" in act['name']
        and "600kW" in act['name'] 
        and "kilowatt hour" in act['unit']][0]
wind

In [None]:
[act for act in eidb if "hydro" in act['name'] and "river" in act['name'] and "CH" in act['location']]

In [None]:
hydro = [act for act in eidb if "hydro" in act['name'] 
                     and "river" in act['name'] 
                     and "CH" in act['location']
                     and "kilowatt hour" in act['unit']
                     ][0]
hydro

Let's also compare these according to their carbon footprint as measured with the IPCC method we already selected above:

In [None]:
ipcc_2013_method

#### One at a time approach:

In [None]:
hydroLCA = bw.LCA({hydro:1}, ipcc_2013_method.name)
hydroLCA.lci()
hydroLCA.lcia()
hydroLCA.score

**Exercise:** Do the LCA for `wind`:

In [None]:
windLCA = bw.LCA({wind:1}, ipcc_2013_method.name)
windLCA.lci()
windLCA.lcia()
windLCA.score

In [None]:
#Compare results:
if windLCA.score>hydroLCA.score:
    print("Hydro is preferable")
elif windLCA.score<hydroLCA.score:
    print("Wind is preferable")
else:
    print("Both options have the same climate change indicator result")

Do one "delta" LCA:

In [None]:
deltaLCA = bw.LCA({wind:1, hydro:-1}, ipcc_2013_method.name)
deltaLCA.lci()
deltaLCA.lcia()
deltaLCA.score

In [None]:
if deltaLCA.score>0:
    print("Hydro is preferable")
elif deltaLCA.score<0:
    print("Wind is preferable")
else:
    print("Both options have the same climate change indicator result")
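The "delta" approach works because the score is linear in the demand. Using the notation from section 2 and writing $c$ for the vector of characterization factors (a symbol introduced here only for illustration), the score for a demand $f$ is $h(f) = c^{T} B A^{-1} f$, so that

$$
h(f_{wind} - f_{hydro}) = c^{T} B A^{-1} (f_{wind} - f_{hydro}) = h(f_{wind}) - h(f_{hydro})
$$

`deltaLCA.score` should therefore equal `windLCA.score - hydroLCA.score`, up to floating-point rounding.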

## 4) My third LCA - Multiple impact categories

Say we want to evaluate the indicator results for our `random_act` for all ILCD midpoint categories (with long-term emissions).

In [None]:
# Make a list of all impact method names (tuples):
ILCD = [method for method in bw.methods if "ILCD" in str(method) and "no LT" not in str(method)]
ILCD

Simplest way: a `for` loop, using `switch_method`

In [None]:
myThirdLCA = bw.LCA({random_act:1}, ILCD[0]) # Do LCA with one impact category
myThirdLCA.lci()
myThirdLCA.lcia()
for category in ILCD:
    myThirdLCA.switch_method(category)
    myThirdLCA.lcia()
    print("Score is {:f} {} for category {}".format(myThirdLCA.score, 
                                                 bw.Method(category).metadata['unit'],
                                                 bw.Method(category).name)
          )

In [None]:
myFirstLCA_unitProcessContribution = myFirstLCA.characterized_inventory.sum(axis=0).A1
myFirstLCA_unitProcessRelativeContribution = myFirstLCA_unitProcessContribution/myFirstLCA.score

## Revisiting my second and third LCA with `MultiLCA`

The `MultiLCA` class allows the calculation of LCA results for multiple functional units and impact categories.  
One simply needs to create a `calculation setup`, i.e. a named set of functional units and LCIA methods.

Calculation setups: dictionary with lists of functional units and methods.

In [None]:
list_functional_units = [{wind.key:1}, {hydro.key:1}]
list_methods = ILCD

In [None]:
bw.calculation_setups['wind_vs_hydro'] = {'inv':list_functional_units, 'ia':list_methods}

In [None]:
bw.calculation_setups['wind_vs_hydro']

In [None]:
myMultiLCA = bw.MultiLCA('wind_vs_hydro')

In [None]:
myMultiLCA.results.shape

In [None]:
myMultiLCA.results

In [None]:
pd.DataFrame(index=ILCD, columns=[wind['name'], hydro['name']], data=myMultiLCA.results.T)
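The transpose in the cell above is needed because the results array has one row per functional unit and one column per method, while the DataFrame is indexed by method. A plain-Python sketch of the same reshaping, with made-up method names and numbers:

```python
# Toy results table; names and numbers are made up.
functional_units = ["wind", "hydro"]
methods = ["climate change", "acidification", "land use"]

# Same layout as the results array above: one row per functional unit,
# one column per method.
results = [[1.0, 0.2, 3.0],
           [0.5, 0.1, 9.0]]

# Transposing gives one row per method, one column per functional unit.
transposed = [list(column) for column in zip(*results)]
table = {method: dict(zip(functional_units, row))
         for method, row in zip(methods, transposed)}
print(table["land use"])
```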

You can also create "fuller" DataFrames. Here is one way to do it, using code from [here](http://stackoverflow.com/questions/42984831/create-a-dataframe-from-multilca-results-in-brightway2): 

In [None]:
scores = pd.DataFrame(myMultiLCA.results, columns=myMultiLCA.methods)
as_activities = [
    (bw.get_activity(key), amount) 
    for dct in myMultiLCA.func_units 
    for key, amount in dct.items()
]
nicer_fu = pd.DataFrame(
    [
        (x['database'], x['code'], x['name'], x['location'], x['unit'], y) 
        for x, y in as_activities
    ], 
    columns=('Database', 'Code', 'Name', 'Location', 'Unit', 'Amount')
)
pd.concat([nicer_fu, scores], axis=1).T

You can even generate beautiful heatmaps like this in a relatively easy way, see example notebook [here](http://nbviewer.jupyter.org/urls/bitbucket.org/cmutel/brightway2/raw/default/notebooks/Using%20calculation%20setups.ipynb).

<img src="images/multiLCA_heatmap.JPG">

Done with deterministic LCA using only existing database items!