# Making Brightway2 faster

Brightway2 has already been optimized in several places. For example, key functions for constructing matrices were rewritten in [Cython](http://cython.org/) in the library [bw2speedups](https://pypi.python.org/pypi/bw2speedups/2.1) (see [the blog post](https://chris.mutel.org/fast-dont-lie.html)). However, Python is a comfortable language, not a fast one, and there will often be opportunities to optimize key steps or algorithms.

## Don't over-engineer things!

Optimization can be a fun engineering exercise, but please make sure it is worth it! If you have to do a single operation that takes an hour, maybe it is worth spending that hour reading a paper. Now, if you had to do that operation a thousand times...

## Timing

Before we start looking into specifics about what makes things fast or slow, you should know about the magic command `%timeit`. There is also a magic command `%time`; you can read more about [timeit](https://docs.python.org/3/library/timeit.html#timeit.Timer.timeit) and [magic functions](http://ipython.readthedocs.io/en/stable/interactive/magics.html).

In [1]:
import numpy as np

In [3]:
%timeit sum(np.random.random(size=100000))

100 loops, best of 3: 14.9 ms per loop


In [4]:
%timeit np.random.random(size=100000).sum()

1000 loops, best of 3: 1.67 ms per loop
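Outside IPython, a similar comparison can be made with the standard-library `timeit` module. The following is a minimal sketch, not part of the original notebook: it uses a plain Python list instead of NumPy so that it runs without extra dependencies, and the function names `loop_sum` and `builtin_sum` are made up for illustration.

```python
import timeit

# Toy data: 100,000 floats in a plain Python list
values = [0.5] * 100_000


def loop_sum():
    """Sum with an explicit Python-level loop."""
    total = 0.0
    for value in values:
        total += value
    return total


def builtin_sum():
    """Sum with the built-in function, which does its looping in C."""
    return sum(values)


# Three measurements of ten calls each; like %timeit, report the best one
loop_times = timeit.repeat(loop_sum, repeat=3, number=10)
builtin_times = timeit.repeat(builtin_sum, repeat=3, number=10)

print("loop:    {:.2f} ms per call".format(min(loop_times) / 10 * 1000))
print("builtin: {:.2f} ms per call".format(min(builtin_times) / 10 * 1000))
```

Taking the minimum of the repeats reduces the influence of other processes on the measurement; the absolute numbers will differ from machine to machine.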


## Profiling

The first step towards actually improving performance is to understand why things are slow. There are a number of Python profilers available:

* [pyflame (Linux only)](https://github.com/uber/pyflame)
* [line_profiler](https://github.com/rkern/line_profiler)
* [SnakeViz](https://jiffyclub.github.io/snakeviz/)
* [memory_profiler](https://github.com/fabianp/memory_profiler)

You can also find many tutorials by searching for "Python profiling" or "Python performance".

In [2]:
import brightway2 as bw2



In [3]:
config = {'demand': {bw2.Database('ecoinvent 2.2').random(): 1}, 'method': bw2.methods.random()}



Here is our profiling statement. With `%prun`, everything has to be on one line. This will pop up a results screen.

You can also run whole cells in the profiler with `%%prun`, e.g.

    %%prun
    import brightway2 as bw2
    config = {'demand': {bw2.Database('ecoinvent 2.2').random(): 1}, 'method': bw2.methods.random()}
    lca = bw2.LCA(**config)
    lca.lci()

In [4]:
%prun lca = bw2.LCA(**config); lca.lci()
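The standard-library `cProfile` and `pstats` modules give a similar report outside IPython. The sketch below profiles a made-up workload (`workload` and `slow_square_sum` are hypothetical stand-ins, not Brightway2 functions) and prints the entries sorted by cumulative time; for a real calculation you would call your own LCA code between `enable()` and `disable()`.

```python
import cProfile
import io
import pstats


def slow_square_sum(n):
    """Deliberately wasteful: builds a full list before summing it."""
    return sum([i * i for i in range(n)])


def workload():
    """Hypothetical stand-in for the code you actually want to profile."""
    return [slow_square_sum(20_000) for _ in range(20)]


# Collect profiling data only while the workload runs
profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Write the report to a string, sorted by cumulative time, top 10 entries
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```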

 



We can also get a graphical profiling result using a neat utility called snakeviz. Let's install it:

In [12]:
!pip install snakeviz

Collecting snakeviz
  Using cached snakeviz-0.4.1-py2.py3-none-any.whl
Installing collected packages: snakeviz
Successfully installed snakeviz-0.4.1




In [13]:
%load_ext snakeviz



In [4]:
%snakeviz lca = bw2.LCA(**config); lca.lci()

 
*** Profile stats marshalled to file '/var/folders/1r/qbs5ybm90j5b6443gqcczddm0000gn/T/tmppwlbmlyx'. 


The indexer takes the most time - basically nothing else matters. What is this indexer?

[Here is the source code](https://bitbucket.org/cmutel/brightway2-speedups/src/86e800c3fa5ba922e539df3e722faaa7656d305d/bw2speedups/_indexer.pyx?at=default&fileviewer=file-view-default). If you need some help, [here is where it is used](https://bitbucket.org/cmutel/brightway2-calc/src/105e24e2d803c96773651ed73c43d850f9c23548/bw2calc/matrices.py?at=default&fileviewer=file-view-default#matrices.py-41). Let's discuss what this is used for.

## Speeding up individual LCA calculation runs

What would be some strategies to speed this up? First, we need to decide if we do need to speed it up. Most of the time is spent in the initial startup, and any subsequent calculations will be quick:

In [7]:
%timeit [lca.redo_lci({bw2.Database('ecoinvent 2.2').random(): 1}) for _ in range(10)]

1 loop, best of 3: 2.3 s per loop


Hmm... that wasn't as fast as I thought it would be. Let's figure out what takes the time.

In [8]:
%prun [lca.redo_lci({bw2.Database('ecoinvent 2.2').random(): 1}) for _ in range(10)]

 

Half the time is spent on the database cursor. What if we move the database object creation out of the loop?

In [9]:
db = bw2.Database('ecoinvent 2.2')

In [10]:
%timeit [lca.redo_lci({db.random(): 1}) for _ in range(10)]

1 loop, best of 3: 2.28 s per loop


Maybe `random` in general is slow on my machine? What if we iterate through the database?

In [12]:
db = iter(bw2.Database('ecoinvent 2.2'))

In [14]:
%timeit [lca.redo_lci({next(db): 1}) for _ in range(10)]

1 loop, best of 3: 1.27 s per loop


## Speeding up matrix indexing

Back to our original question - is there a way to speed up indexing? We are already using Cython; we know that using Cython correctly can make things much better, but it is hard to see what could be changed in the code - we are basically doing a dictionary lookup, and Python dictionaries are pretty quick.
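As a rough illustration of the kind of work an indexer does, here is a pure-Python sketch. It is not the actual `bw2speedups` code: the function names and the ids are made up, and the real implementation works on arrays in Cython. The idea is simply to map arbitrary ids to consecutive row or column indices through a dictionary.

```python
def build_mapping(ids):
    """Map each distinct id to a consecutive index starting at zero."""
    return {key: index for index, key in enumerate(sorted(set(ids)))}


def index_with_mapping(ids, mapping):
    """Replace every id with its matrix index: one dictionary lookup per element."""
    return [mapping[key] for key in ids]


# Hypothetical ids of the activities that appear in the rows of some exchanges
row_ids = [1042, 17, 1042, 9000]

mapping = build_mapping(row_ids)
row_indices = index_with_mapping(row_ids, mapping)

print(mapping)      # {17: 0, 1042: 1, 9000: 2}
print(row_indices)  # [1, 0, 1, 2]
```

Each lookup is cheap, but it has to be repeated for every exchange in every matrix, which is one plausible reason why this step shows up so prominently in the profile.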

As we are using sparse matrices, what about just using the integer ids from `bw2data` directly, instead of trying to order everything to start from row or column zero? The sparse matrix bits would not care at all. However, we do have dense components in the demand and supply arrays; if we had a large number of elements in our project (say, 10 copies of ecoinvent), then we would lose time allocating and manipulating larger arrays, though this shouldn't be too much of a problem. We would also lose any real possibility of entering dense matrix land.
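To make the trade-off concrete, here is a toy sketch with made-up ids (real ids would come from the database). It compares the length of a dense demand array under the two schemes: remapping ids to consecutive indices, or using the ids directly as indices.

```python
# Hypothetical integer ids for three activities
activity_ids = [12, 40_000, 2_500_000]

# Remapped indices: the dense array only needs one slot per activity
compact_index = {key: position for position, key in enumerate(activity_ids)}
compact_demand = [0.0] * len(compact_index)
compact_demand[compact_index[40_000]] = 1.0

# Ids used directly: the dense array must span everything up to the largest id
direct_demand = [0.0] * (max(activity_ids) + 1)
direct_demand[40_000] = 1.0

print(len(compact_demand))  # 3
print(len(direct_demand))   # 2500001
```

The second array is almost entirely zeros, which is the extra allocation and manipulation cost mentioned above.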

However, actually implementing this is rather complicated, and so we leave it as an idea for the future.

## Speeding up multiple LCA calculations

When doing multiple LCA calculations, we can consider the setup step as a fixed cost, and instead focus on the time needed for each calculation. The library that `bw2calc` uses for matrix calculations already has a number of optimizations, including storing information on the factorization of the technosphere matrix. We won't be developing a new linear algebra library, but there is still room to make faster or slower choices, as we will see in a simple example.
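The benefit of storing a factorization can be sketched with a small dense example. This is an illustration only: it is a textbook LU factorization without pivoting, written in pure Python, and not what the sparse solver behind `bw2calc` does internally. The function names and the 2x2 matrix are made up. The point is that the expensive step (`lu_factor`) runs once, while each new demand vector needs only the cheap substitution step (`lu_solve`).

```python
def lu_factor(matrix):
    """Doolittle LU factorization without pivoting (toy code: assumes nonzero pivots)."""
    n = len(matrix)
    lower = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    upper = [[float(value) for value in row] for row in matrix]
    for k in range(n):
        for i in range(k + 1, n):
            factor = upper[i][k] / upper[k][k]
            lower[i][k] = factor
            for j in range(k, n):
                upper[i][j] -= factor * upper[k][j]
    return lower, upper


def lu_solve(lower, upper, rhs):
    """Solve (L U) x = rhs by forward and then backward substitution."""
    n = len(rhs)
    y = [0.0] * n
    for i in range(n):
        y[i] = rhs[i] - sum(lower[i][j] * y[j] for j in range(i))
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(upper[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - tail) / upper[i][i]
    return x


# Made-up 2x2 "technosphere" matrix and two demand vectors
technosphere = [[1.0, -0.5], [-0.2, 1.0]]
demands = [[1.0, 0.0], [0.0, 1.0]]

lower, upper = lu_factor(technosphere)  # expensive step, done once
supplies = [lu_solve(lower, upper, demand) for demand in demands]  # cheap, repeated
print(supplies)
```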

### Example of multiple calculations for multiple LCIA methods

In [5]:
db = iter(bw2.Database('ecoinvent 2.2'))
activities = [next(db) for _ in range(10)]
methods = [bw2.methods.random() for _ in range(10)]



A simple approach: create a new LCA object for each method.

In [8]:
def multiples_one():
    results = np.zeros((10, 10))

    for row, method in enumerate(methods):
        lca = bw2.LCA({activities[0]: 1}, method)
        lca.lci()
        lca.lcia()

        for col, act in enumerate(activities):
            lca.redo_lcia({act: 1})
            results[row, col] = lca.score

    return results



In [9]:
%timeit multiples_one()

1 loop, best of 3: 11.3 s per loop




In [14]:
%snakeviz multiples_one()

 
*** Profile stats marshalled to file '/var/folders/1r/qbs5ybm90j5b6443gqcczddm0000gn/T/tmpihv9hpop'. 




Our old friend the indexer is again eating up most of the time.

Let's try to keep the LCA object and use the `switch_method` call.

In [10]:
def multiples_two():
    results = np.zeros((10, 10))

    lca = bw2.LCA({activities[0]: 1}, methods[0])
    lca.lci()
    lca.lcia()

    for row, method in enumerate(methods):
        lca.switch_method(method)
        for col, act in enumerate(activities):
            lca.redo_lcia({act: 1})
            results[row, col] = lca.score

    return results



In [11]:
%timeit multiples_two()

1 loop, best of 3: 1.52 s per loop




## Choice of Monte Carlo solver

In [15]:
from bw2calc.monte_carlo import DirectSolvingMonteCarloLCA, MonteCarloLCA



In [16]:
def iterative_mc():
    lca = MonteCarloLCA({activities[0]: 1}, methods[0])
    lca.load_data()

    results = np.zeros((10, 10))

    for row, act in enumerate(activities):
        lca.build_demand_array({act: 1})
        for col in range(10):
            results[row, col] = next(lca)

    return results



In [17]:
%timeit iterative_mc()

1 loop, best of 3: 6.56 s per loop




In [18]:
def direct_mc():
    lca = DirectSolvingMonteCarloLCA({activities[0]: 1}, methods[0])
    lca.load_data()

    results = np.zeros((10, 10))

    for row, act in enumerate(activities):
        lca.build_demand_array({act: 1})
        for col in range(10):
            results[row, col] = next(lca)

    return results



In [19]:
%timeit direct_mc()

1 loop, best of 3: 11.5 s per loop

In these runs, `MonteCarloLCA` took about 6.6 seconds, against 11.5 seconds for `DirectSolvingMonteCarloLCA`.




## Interacting with the database