# Tutorial 1 - Getting Started

This is a tutorial for Brightway2, an open source framework for Life Cycle Assessment. It covers the basics of databases and activities, and takes a first look at LCI databases and LCIA methods.

You will get the most from this tutorial if you read it alongside the [Brightway2 manual](https://brightway2.readthedocs.org/en/latest/).

At the end of this tutorial, you will be able to:

* Import basic data like the biosphere database

If you finish the tutorial, you get a kitten.

This tutorial is written in a Jupyter notebook, an online scientific notebook which combines text, data, images, and programming. It is amazing, and could be a fantastic way to do and communicate advanced LCA work. See the [documentation](http://ipython.org/ipython-doc/dev/interactive/htmlnotebook.html) and a list of [awesome examples](https://github.com/ipython/ipython/wiki/A-gallery-of-interesting-IPython-Notebooks).

You should **download this notebook** and run it cell by cell - don't just read it on the web!

## Brightway2 tutorials

Please read the tutorials in order, as they build upon each other.

* [1 - Getting Started](http://nbviewer.ipython.org/urls/bitbucket.org/cmutel/brightway2/raw/default/docs/notebooks/Tutorial%201%20-%20Getting%20Started.ipynb)
* [2 - Working with data](http://nbviewer.ipython.org/urls/bitbucket.org/cmutel/brightway2/raw/default/docs/notebooks/Tutorial%202%20-%20Working%20with%20data.ipynb)
* [3 - Basic LCA Calculations](http://nbviewer.ipython.org/urls/bitbucket.org/cmutel/brightway2/raw/default/docs/notebooks/Tutorial%201%20-%20Getting%20Started.ipynb)
* [4 - Meta-analysis](http://nbviewer.ipython.org/urls/bitbucket.org/cmutel/brightway2/raw/default/docs/notebooks/Tutorial%204%20-%20Meta-analysis.ipynb)
* [5 - Defining A New Matrix](http://nbviewer.ipython.org/urls/bitbucket.org/cmutel/brightway2/raw/default/docs/notebooks/Tutorial%205%20-%20Defining%20A%20New%20Matrix.ipynb)

# Starting at the beginning

Import brightway2.

In [1]:
from brightway2 import *

## Python 2 and 3

Brightway2 is compatible with both Python 2 and 3. If you are using Python 2, I strongly recommend you execute the following cell, which makes all your text strings unicode without you having to type the letter ``"u"`` in front of each one. On Python 3, the following **doesn't do anything** - your text strings are unicode by default.

In [2]:
from __future__ import unicode_literals, print_function
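If you are not sure which behaviour you have, the following minimal sketch checks whether a plain text literal and a ``u``-prefixed literal end up as the same type. The variable names are only for illustration, and nothing here is specific to Brightway2.

```python
import sys

plain = "kilogram"      # no prefix
prefixed = u"kilogram"  # explicit unicode prefix

# True on Python 3, where every text literal is unicode.
# On Python 2 this is True only if unicode_literals has been imported.
same_type = type(plain) is type(prefixed)
print("Python", sys.version_info[0], "- literals share one type:", same_type)
```

If `same_type` is `False`, you are on Python 2 without `unicode_literals`, and the `u` prefix still matters.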

## Projects

In Brightway2, a project is a separate directory with its own copies of LCI databases, LCIA methods, and any other data you use. Each research project or article should probably be its own project, so that any changes you want to make will not interfere with your other work.

The default project is called ``default``:

In [3]:
projects.current

'default'

Each project is stored in a separate directory in a place in your filesystem reserved for application data. It varies depending on the operating system; on OS X, this is:

In [4]:
projects.dir

'/Users/cmutel/Library/Application Support/Brightway3/default.c21f969b5f03d33d43e04f8f136e7682'

However, you shouldn't really need to care about this. We can create a new project:

In [5]:
projects.current = "my new project"

And list the available projects:

In [6]:
list(projects)

[Project: default, Project: my new project]

## Getting basic data

Let's import some basic data - a database of elementary flows, some LCIA methods, and some metadata used for importing other databases:

In [7]:
bw2setup()

Importing some data...
Creating default biosphere
Applying strategy: drop_unspecified_subcategories

Writing activities to SQLite3 database:
0%                          100%
[##############################] | ETA[sec]: 0.000 
Total time elapsed: 0.654 sec



Title: Writing activities to SQLite3 database:
  Started: 05/14/2015 09:56:34
  Finished: 05/14/2015 09:56:35
  Total time elapsed: 0.654 sec
  CPU %: 91.500000
  Memory %: 1.036215
Created database: biosphere3
Creating default LCIA methods
Applying strategy: set_biosphere_type
Applying strategy: drop_unspecified_subcategories
Applying strategy: link_iterable_by_fields
Wrote 692 LCIA methods with 170915 characterization factors
Creating core data migrations


The IPython notebook prints all logged messages by default. On your machine, there might also be messages tracking how long it took to download and process the biosphere and methods packages.

## A biosphere dataset

The ``biosphere3`` database is installed. It is called the ``biosphere3`` database because elementary flow names are normalized to the ecoinvent 3 standard.

The metadata for this database shows us:

* What other databases it depends on (links into): an empty list here, although other databases will link to these elementary flows
* That the data is stored in an SQLite3 database
* When it was last modified and processed
* How many elementary flows are present
* The format the data was imported from
    
You don't really need to worry about any of this, but it is nice that someone is keeping track.

In [11]:
print("Installed databases:", list(databases))
print("Biosphere metadata:") 
Database("biosphere3").metadata

Installed databases: ['biosphere3']
Biosphere metadata:


{'backend': 'sqlite',
 'depends': [],
 'format': 'Ecoinvent XML',
 'modified': '2015-05-13T09:12:47.109096',
 'number': 3955,
 'processed': '2015-05-13T09:12:51.124811'}

OK, so what is going on here? First, we have one database installed: `biosphere3`. In Brightway2, the biosphere flows, also called environmental interventions, or emissions and resources, are by default stored in a database called `biosphere3`. This can be changed, of course - Brightway2 is all about flexibility - but you would need to make sure that the other inventory databases link correctly to the new name.

Next, there is biosphere metadata, which will look something like this:


    {
        'depends': [], 
        'number': 3913, 
        'version': 2, 
        'format': [u'Handmade', -1],
        'filename': 'biosphere.57b2c55dba95bfa2b6ba74418898df60.50'
    }

This metadata is stored in `databases`. `depends` lists the other databases that `biosphere` would link to, and is empty, as biosphere flows don't have any inputs themselves. `number` is just the number of flows present. `version` is a primitive versioning system - this number is incremented each time new data is written. `filename` is the name of the file where the data is written. Finally, `format` is a description of the format the data was in when it was imported - in this case, it was built by hand.

The only required key here is `depends`, as this is needed to load all the relevant data when making calculations.
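To see why `depends` matters, here is a minimal sketch of how a list of dependencies could be followed to find every database a calculation needs. This is not Brightway2's own code: the `metadata` dictionary, the database names other than `biosphere3`, and the `databases_needed` function are all hypothetical.

```python
# Hypothetical metadata: each database lists the databases it links into.
metadata = {
    "biosphere3": {"depends": []},
    "my inventory": {"depends": ["biosphere3"]},
    "my case study": {"depends": ["my inventory", "biosphere3"]},
}


def databases_needed(name, metadata):
    """Return every database reachable from `name` through `depends`."""
    needed = []
    stack = [name]
    while stack:
        current = stack.pop()
        if current in needed:
            continue  # already visited, avoids loops and duplicates
        needed.append(current)
        stack.extend(metadata[current]["depends"])
    return sorted(needed)


print(databases_needed("my case study", metadata))
```

A database with an empty `depends` list, like the biosphere, only needs itself.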

The method `Database.random()` returns a random key. In Python, dictionaries are data structures with keys and values, e.g.:

In [8]:
my_dict = {"a": 1, "b": 2}
my_dict["a"]

1

In [9]:
"b" in my_dict

True

Brightway2 uses keys to identify datasets. Each dataset is identified by a combination of its database and some unique code. The code can be anything - a number, a UUID, or just a name. All of the following would be valid keys:

    ("biosphere", "f66d00944691d54d6b072310b6f9de37")
    ("my new database", "building my dream house")
    ("skynet", 14832)
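Because keys are tuples, they can be used directly to look things up in a dictionary. The sketch below uses a plain dictionary as a stand-in for loaded database data; the dataset contents are made up for illustration and are not real Brightway2 data.

```python
# Hypothetical stand-in for loaded data:
# each key is a (database name, code) tuple, each value is a dataset.
data = {
    ("my new database", "building my dream house"): {
        "name": "Dream house construction",
        "unit": "unit",
        "exchanges": [],
    },
    ("skynet", 14832): {
        "name": "Machine learning",
        "unit": "hour",
        "exchanges": [],
    },
}

key = ("skynet", 14832)
database_name, code = key  # a key unpacks into its two parts
dataset = data[key]
print(database_name, code, dataset["name"])
```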

Now, let's look at the data stored for our random key. We have to manually load the data, as this takes a bit of time, so is not done automatically.

In [10]:
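random_flow = Database("biosphere3").random()  # pick a random dataset key
Database("biosphere3").load()[random_flow]     # load all data, then look up that key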

{u'categories': [u'soil', u'agricultural'],
 u'code': 4002,
 u'exchanges': [],
 u'name': u'Dichlobenil',
 u'type': u'emission',
 u'unit': u'kilogram'}

The specifics of the [Database data format](https://brightway2.readthedocs.org/en/latest/lci.html#lci-datasets-are-documents) are covered in the manual.

The data for each flow is another Python dictionary. There are no `exchanges`, as elementary flows by definition don't have any inputs.

## An LCIA method dataset

We just installed a large number of LCIA methods:

In [11]:
len(methods)

677

Because LCIA methods have many different impact categories, they are identified by a special kind of list called a `tuple`. Let's look at an example:

In [12]:
method_key = methods.random()
method_key

(u'EDIP2003 w/o LT', u'ecotoxicity w/o LT', u'chronic, in water w/o LT')

In this case, the LCIA method has three levels of specificity, from the general name (first level) to the specific impact category (last level). There is nothing magic about three levels - you could have one, or one thousand - but Brightway2 expects these nested LCIA method identifiers.
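One practical use of nested identifiers is grouping methods by their first level. The following is only a sketch: apart from the EDIP2003 key shown above, the method identifiers are hypothetical.

```python
from collections import defaultdict

# Only the first identifier comes from the example above; the rest are made up.
method_keys = [
    ("EDIP2003 w/o LT", "ecotoxicity w/o LT", "chronic, in water w/o LT"),
    ("EDIP2003 w/o LT", "ecotoxicity w/o LT", "acute, in water w/o LT"),
    ("my method", "climate change"),
    ("single level",),
]

# Group the remaining levels under the general name (first level).
by_family = defaultdict(list)
for key in method_keys:
    by_family[key[0]].append(key[1:])

for family, categories in sorted(by_family.items()):
    print(family, len(categories))
```

Identifiers with one, two, or three levels all work the same way here, as the number of levels is not special.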

A `tuple` is a special kind of list that uses `()` instead of `[]`. Unlike lists, tuples can be used as keys in dictionaries (see [more on the difference](http://lmgtfy.com/?q=python+lists+versus+tuples)). To create a tuple with only one element, you need to add a trailing comma, to distinguish it from an expression in parentheses:

In [13]:
print(1 + 2)
print((1,), type((1,)))

3
(1,) <type 'tuple'>
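The practical difference is easy to check yourself. This short sketch, with made-up key names, shows that a tuple works as a dictionary key while a list does not, and that parentheses alone do not make a tuple.

```python
cfs = {}

tuple_key = ("my method", "climate change")
cfs[tuple_key] = 1.0  # works: tuples are hashable

list_key = ["my method", "climate change"]
try:
    cfs[list_key] = 1.0  # fails: lists are mutable, so they are not hashable
    list_allowed = True
except TypeError:
    list_allowed = False

not_a_tuple = (1)  # just the integer 1 in parentheses
one_tuple = (1,)   # the trailing comma makes it a tuple
print(list_allowed, type(not_a_tuple).__name__, type(one_tuple).__name__)
```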


We can load the method data, which is just a list of characterization factors, and show some:

In [14]:
method_data = Method(method_key).load()
print("Number of CFs:", len(method_data))
method_data[:20]

Number of CFs: 478


[[(u'biosphere', u'cb9d45068d73136f124ccec077400dd8'), 13.793, u'GLO'],
 [(u'biosphere', u'6ef8a10f97b7deb14f99d7441da871b2'), 13.793, u'GLO'],
 [(u'biosphere', u'9f5eaef27fbd7e56f1326c15c47cf0a4'), 13.793, u'GLO'],
 [(u'biosphere', u'172388ab74ccfbe913b796bc8b5c4bd9'), 13.793, u'GLO'],
 [(u'biosphere', u'd99f68dc24466d743cb365b7cea20bca'), 50.633, u'GLO'],
 [(u'biosphere', u'0db3cc36a9a1b331bce25b58982be4a4'), 50.633, u'GLO'],
 [(u'biosphere', u'9d6ff6b9e46b3234393d3f22c2e22e63'), 50.633, u'GLO'],
 [(u'biosphere', u'8ea85c7197007a3233ff15955cbea050'), 50.633, u'GLO'],
 [(u'biosphere', u'9a17342d354c08fccb1797c07c63756e'), 80, u'GLO'],
 [(u'biosphere', u'072276520b1de1054aff938dbf587291'), 80, u'GLO'],
 [(u'biosphere', u'184cb2883ed124a02d5a3603810dbea5'), 80, u'GLO'],
 [(u'biosphere', u'c31309836cb53c6c087372ddb14777ce'), 80, u'GLO'],
 [(u'biosphere', u'8a6bbdb2aaf2406dda11a511c6a38e1f'), 1212.1, u'GLO'],
 [(u'biosphere', u'2fe885840cebfcc7d56b607b0acd9359'), 1212.1, u'GLO'],
 [(u'bio

The specifics of the [Method data format](https://brightway2.readthedocs.org/en/latest/ia.html#lcia-method-documents) are covered in the `bw2data` documentation, but the basic idea is a key for a biosphere flow, a numeric characterization factor, and a location. Actually, the method data format is pretty flexible, and the following are all acceptable:

    [('biosphere', 'CO2'), 1.0],
    [('biosphere', 'CO2'), 1.0, 'Australia, mate!'],
    [('biosphere', 'CO2'), {'amount': 1.0, 'uncertainty type': 0}]
   
In other words, the location code is optional, and characterization factors can be specified as either static values or uncertainty dictionaries.
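If you write your own code against method data, it can help to bring the three forms into one shape first. The helper below is a hypothetical sketch, not part of Brightway2, and the `"GLO"` default location is an assumption based on the location seen in the output above.

```python
def normalize_cf(row, default_location="GLO"):
    """Return (flow key, amount, location) for one characterization factor.

    Hypothetical helper; the default location is an assumption.
    """
    flow = tuple(row[0])
    value = row[1]
    # The value is either a plain number or an uncertainty dictionary.
    amount = value["amount"] if isinstance(value, dict) else value
    location = row[2] if len(row) > 2 else default_location
    return flow, float(amount), location


rows = [
    [("biosphere", "CO2"), 1.0],
    [("biosphere", "CO2"), 1.0, "Australia, mate!"],
    [("biosphere", "CO2"), {"amount": 1.0, "uncertainty type": 0}],
]
normalized = [normalize_cf(row) for row in rows]
for item in normalized:
    print(item)
```

Note that this sketch throws away the uncertainty information, so it is only suitable for static calculations.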

If you are wondering why we need to identify biosphere flows like `('biosphere', '2fe885840cebfcc7d56b607b0acd9359')`, this is a good question! The short answer is that there is no single field that uniquely identifies biosphere flows or activities. The longer answer [is in the manual](http://brightway2.readthedocs.org/en/latest/lci.html#uniquely-identifying-datasets).
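A small sketch makes the short answer concrete. The flows below are hypothetical, with made-up codes and categories: two of them share a name, so the name alone cannot tell them apart, while the (database, code) key can.

```python
from collections import defaultdict

# Hypothetical flows: names, codes, and categories are made up for illustration.
flows = {
    ("biosphere", "code-1"): {"name": "Carbon dioxide", "categories": ("air",)},
    ("biosphere", "code-2"): {"name": "Carbon dioxide", "categories": ("air", "urban")},
    ("biosphere", "code-3"): {"name": "Methane", "categories": ("air",)},
}

# Collect the keys that share each name.
keys_by_name = defaultdict(list)
for key, flow in flows.items():
    keys_by_name[flow["name"]].append(key)

ambiguous = sorted(name for name, keys in keys_by_name.items() if len(keys) > 1)
print(ambiguous)
```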

That's it! You will get your kitten when you execute this cell:

In [19]:
from IPython.display import Image
import random
dimensions = sorted((int(random.random() * 600 + 200), int(random.random() * 600 + 200)))
Image(url="http://placekitten.com/{}/{}/".format(*dimensions))