# Backcalculate process mixes

This notebook explores how easily the inputs of an aggregated process could be back-calculated, assuming that the background database is known. We use `ecoinvent 3.1 cutoff` as the background database and work with life cycle inventories only, without applying LCIA methods.

We first set up our project.

In [11]:
from brightway2 import *
from scipy.optimize import minimize
from time import time
import numpy as np
import pyprind

Create new project

In [3]:
projects.current = "backcalculate"

Basic setup: biosphere, LCIA methods, etc.

In [4]:
bw2setup()

Creating default biosphere

Applying strategy: normalize_units

Writing activities to SQLite3 database:
0%                          100%
[##############################] | ETA[sec]: 0.000 
Total time elapsed: 0.620 sec



Applying strategy: drop_unspecified_subcategories
Applied 2 strategies in 0.01 seconds
Title: Writing activities to SQLite3 database:
  Started: 10/09/2015 11:22:18
  Finished: 10/09/2015 11:22:18
  Total time elapsed: 0.620 sec
  CPU %: 91.100000
  Memory %: 1.296997
Created database: biosphere3
Creating default LCIA methods

Applying strategy: normalize_units
Applying strategy: set_biosphere_type
Applying strategy: drop_unspecified_subcategories
Applying strategy: link_iterable_by_fields
Applied 4 strategies in 1.84 seconds
Wrote 692 LCIA methods with 170915 characterization factors
Creating core data migrations



Import ecoinvent 3.1

In [5]:
ei31cutoff = SingleOutputEcospold2Importer(
    "/Users/cmutel/Documents/LCA Documents/Ecoinvent/3.1/cutoff/datasets",
    "ecoinvent 3.1 cutoff"
)
ei31cutoff.apply_strategies()
ei31cutoff.statistics()

Extracting ecospold2 files:
0%                          100%
[##############################] | ETA[sec]: 0.000 | Item ID: fff527b1-0fe4-4
Total time elapsed: 118.483 sec


Title: Extracting ecospold2 files:
  Started: 10/09/2015 11:23:33
  Finished: 10/09/2015 11:25:31
  Total time elapsed: 118.483 sec
  CPU %: 90.100000
  Memory %: 11.830235
Extracted 11301 datasets in 120.45 seconds
Applying strategy: normalize_units
Applying strategy: remove_zero_amount_coproducts
Applying strategy: remove_zero_amount_inputs_with_no_activity
Applying strategy: remove_unnamed_parameters
Applying strategy: es2_assign_only_product_with_amount_as_reference_product
Applying strategy: assign_single_product_as_activity
Applying strategy: create_composite_code
Applying strategy: drop_unspecified_subcategories
Applying strategy: link_biosphere_by_flow_uuid
Applying strategy: link_internal_technosphere_by_composite_code
Applying strategy: delete_exchanges_missing_activity
Applying strategy: delete_ghost_exchanges
Applied 12 strategies in 4.21 seconds
11301 datasets
521712 exchanges
0 unlinked exchanges
  


(11301, 521712, 0)

In [6]:
ei31cutoff.write_database()

Writing activities to SQLite3 database:
0%                          100%
[##############################] | ETA[sec]: 0.000 
Total time elapsed: 65.508 sec


Title: Writing activities to SQLite3 database:
  Started: 10/09/2015 11:25:52
  Finished: 10/09/2015 11:26:57
  Total time elapsed: 65.508 sec
  CPU %: 83.400000
  Memory %: 13.546133
Created database: ecoinvent 3.1 cutoff


Brightway2 SQLiteBackend: ecoinvent 3.1 cutoff

Clear import object to free memory

In [9]:
ei31cutoff = None

# Calculate all product inventory vectors

Next, we calculate the inventory vector of every process in the database, i.e. the total amount of each biosphere flow caused by one unit of that process.

In [7]:
db = Database("ecoinvent 3.1 cutoff")

In [12]:
lca = LCA({db.random(): 1})
lca.lci(factorize=True)



In [31]:
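mapping = {}
# One row per dataset, one column per biosphere flow
vectors = np.zeros((len(db), lca.inventory.shape[0]))

pbar = pyprind.ProgBar(len(db), title="Datasets:")

for index, ds in enumerate(db):
    lca.redo_lci({ds: 1})
    # Sum over processes to get the total amount of each biosphere flow
    vectors[index, :] = np.array(lca.inventory.sum(axis=1)).ravel()
    # Remember which row belongs to which dataset
    mapping[ds.key] = index
    pbar.update()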

Datasets:
0%                          100%
[##############################] | ETA[sec]: 0.000 
Total time elapsed: 206.648 sec
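
To make the structure of `vectors` concrete without needing ecoinvent, here is a minimal sketch of the same loop on a made-up system. The matrices `A` (technosphere) and `B` (biosphere) and the name `toy_vectors` are hypothetical and only serve as illustration; `scipy.sparse.linalg.factorized` stands in for what `lci(factorize=True)` followed by `redo_lci` does above, namely factorizing the technosphere matrix once and reusing it for every demand.

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import factorized

# Hypothetical toy system: 3 processes, 2 biosphere flows.
# Technosphere matrix A: rows are products, columns are processes.
A = csc_matrix(np.array([
    [1.0, -0.2, 0.0],
    [0.0, 1.0, -0.5],
    [-0.1, 0.0, 1.0],
]))
# Biosphere matrix B: rows are flows, columns are processes.
B = np.array([
    [2.0, 0.0, 1.0],
    [0.0, 3.0, 0.5],
])

# Factorize once, then reuse the solver for every unit demand
solve = factorized(A)

n_processes = A.shape[0]
toy_vectors = np.zeros((n_processes, B.shape[0]))
for index in range(n_processes):
    demand = np.zeros(n_processes)
    demand[index] = 1.0
    supply = solve(demand)                 # solves A * supply = demand
    toy_vectors[index, :] = B.dot(supply)  # inventory for one unit of this process
```

Because the inventory is linear in the demand, the inventory of any mix of processes is the mix vector times `toy_vectors`. The scoring function below relies on this property.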


# Get optimization function

In [83]:
class ScoringFunction(object):
    """Scoring function for inventory vector.

    Calculate the unweighted Manhattan distance in n-dimensional space between
    answer * vectors and guess * vectors, where n is the number of biosphere
    flows in the inventory. Zeros are included and weighted the same as other
    values.

    Returns a number when called with a vector guess."""
    def __init__(self, lca, vectors):
        assert isinstance(lca, LCA)
        assert isinstance(vectors, np.ndarray)
        assert hasattr(lca, "inventory")

        self.lca = lca
        self.vectors = vectors

    def sum_inventory_matrix(self, matrix):
        """Collapse a ``(flows, processes)`` matrix to a vector of shape ``(flows,)``"""
        return np.array(matrix.sum(axis=1)).ravel()

    def set_answer(self, answer):
        """Given functional unit ``answer``, create an inventory vector of shape ``(flows,)``"""
        self.lca.redo_lci(answer)
        # Use the instance's own LCA object, not the global ``lca``
        self.answer = self.sum_inventory_matrix(self.lca.inventory)

    def __call__(self, guess):
        """Evaluate a guess vector of shape ``(processes,)``"""
        assert guess.shape[0] == self.vectors.shape[0]
        assert len(guess.shape) == 1
        # Translate from (processes,) to (flows,)
        inventory = guess.dot(self.vectors)
        # Get Manhattan distance between the two inventory vectors
        return np.abs(inventory - self.answer).sum()
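
As a quick sanity check of the scoring logic, here is a small sketch of the same Manhattan distance on hand-made numbers. The names `toy_vectors`, `manhattan_score`, `true_mix`, and `target` are hypothetical and not part of the notebook; the arithmetic mirrors `ScoringFunction.__call__`.

```python
import numpy as np

# Hypothetical inventory vectors: 3 processes (rows) by 2 flows (columns)
toy_vectors = np.array([
    [2.0, 0.5],
    [0.0, 3.0],
    [1.0, 1.0],
])


def manhattan_score(guess, target, vectors):
    """Sum of absolute differences between the guessed and the target inventory."""
    return np.abs(guess.dot(vectors) - target).sum()


true_mix = np.array([1.0, 0.0, 2.0])
target = true_mix.dot(toy_vectors)  # inventory of the true mix

perfect_score = manhattan_score(true_mix, target, toy_vectors)
wrong_score = manhattan_score(np.ones(3), target, toy_vectors)
print(perfect_score, wrong_score)
```

The true mix scores zero, and any other mix scores the summed absolute deviation over all flows, so flows with large amounts dominate the score.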

# Create an artificial system

In [46]:
answer = {db.random().key: float(np.random.random() * 10) for _ in range(5)}

Set up and check our scoring function

In [84]:
scorer = ScoringFunction(lca, vectors)
scorer.set_answer(answer)



Construct a test guess that matches the answer exactly; its score should be zero, up to floating point error.

In [51]:
perfect = np.zeros((len(db),))
for key, value in answer.items():
    perfect[mapping[key]] = value

In [58]:
scorer(perfect)



5.4815357723086221e-12

In [85]:
start = time()
result = minimize(scorer, np.ones(perfect.shape), options={'maxiter': 50, 'disp': True})
print("Took {} minutes".format((time() - start)/60))

         Current function value: 156104120668565.406250
         Iterations: 0
         Function evaluations: 124345
         Gradient evaluations: 11
Took 20.973269168535868 minutes
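
The run above made no progress (zero iterations) despite more than 120,000 function evaluations. One alternative that might be worth trying, and which is not used in this notebook, is to exploit the fact that the problem is linear: non-negative least squares (`scipy.optimize.nnls`) looks for a non-negative mix directly. Note that this swaps the Manhattan distance for a Euclidean one. The sketch below uses made-up random data (`toy_vectors`, `true_mix`, and `target` are hypothetical), so it says nothing about how well this would work on ecoinvent.

```python
import numpy as np
from scipy.optimize import nnls

# Hypothetical toy data: 6 processes, 20 flows
rng = np.random.RandomState(42)
n_processes, n_flows = 6, 20
toy_vectors = rng.rand(n_processes, n_flows)

# A sparse "true" mix, like the artificial system above
true_mix = np.zeros(n_processes)
true_mix[1] = 3.5
true_mix[4] = 0.7
target = true_mix.dot(toy_vectors)

# nnls solves min ||A x - b||_2 subject to x >= 0,
# with A of shape (flows, processes) and b of shape (flows,)
recovered, residual = nnls(toy_vectors.T, target)
print(recovered, residual)
```

In this toy case there are more flows than processes, so the mix is uniquely determined. If the real `vectors` array has more processes than flows, the recovered mix is generally not unique, and extra assumptions (for example, preferring sparse mixes) would be needed.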


# Display how close we can get

In [None]:
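def how_close(guess):
    """Print the true and guessed amount for each process in ``answer``."""
    for key, value in answer.items():
        # Assumption: the truncated ``get_`` call was meant to be ``get_activity``
        print(get_activity(key), value, guess[mapping[key]])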

In [None]:
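ALGORITHMS = ('Nelder-Mead', 'Powell', 'CG', 'BFGS', 'L-BFGS-B', 'TNC', 'COBYLA', 'SLSQP')

for algo in ALGORITHMS:
    # Same scoring function, starting point, and iteration cap as above
    r = minimize(scorer, np.ones(perfect.shape), method=algo, options={'maxiter': 50})
    print(algo, r['fun'])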
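
Comparing algorithms on the full database is slow, so it can help to try the loop on a tiny problem first. This is a minimal sketch on made-up data: `toy_vectors`, `target`, `toy_score`, and `start` are hypothetical, and only three of the methods are included to keep it short.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical inventory vectors: 3 processes (rows) by 3 flows (columns)
toy_vectors = np.array([
    [2.0, 0.5, 1.0],
    [0.0, 3.0, 1.0],
    [1.0, 1.0, 4.0],
])
# Inventory of a known mix, used as the target
target = np.array([1.0, 2.0, 0.5]).dot(toy_vectors)


def toy_score(guess):
    """Manhattan distance between the guessed and the target inventory."""
    return np.abs(guess.dot(toy_vectors) - target).sum()


start = np.ones(3)
results = {}
for algo in ('Nelder-Mead', 'Powell', 'L-BFGS-B'):
    r = minimize(toy_score, start, method=algo)
    results[algo] = float(r.fun)
    print(algo, r.x, r.fun)
```

The Manhattan distance is not differentiable wherever a flow matches exactly, so gradient-based methods may stall near the optimum; derivative-free methods such as Nelder-Mead or Powell avoid this particular issue but scale poorly with the number of variables.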