# Reallocating combined heat and power datasets in Brightway

This is a somewhat messy notebook showing how we can use the data in allocated datasets to unallocate them, and then apply group allocation to get the inventories we want.

In [1]:
import bw2data as bd
import bw2calc as bc
import pandas as pd
import copy

In [2]:
bd.projects.set_current("ei 3.8 cutoff")

Get the CHP datasets we are working on:

In [3]:
CHP = [
    x 
    for x in bd.Database('ecoinvent 3.8 cutoff') 
    if x['name'] == 'heat and power co-generation, natural gas, combined cycle power plant, 400MW electrical' 
    and x['location'] == 'DE'
]
CHP

['heat and power co-generation, natural gas, combined cycle power plant, 400MW electrical' (kilowatt hour, DE, None),
 'heat and power co-generation, natural gas, combined cycle power plant, 400MW electrical' (megajoule, DE, None)]

# Get unallocated combined dataset

We can combine the allocated datasets together to get the unallocated dataset, but we need to know their relative production ratios. The absolute numbers are not necessary as our output value will scale up and down with the inputs.

First, let's avoid some unit confusion by expressing everything in the same units. We need to be a bit careful: a unit conversion doesn't change the amounts of the biosphere flows, only the way that we express the production exchange value (in MJ, not kWh) and the production volume:

In [4]:
def extract_and_rescale(act, factor=1):
    """Return the activity as a dict, with production exchanges rescaled by `factor`."""
    def rescale_exchange(exc, factor):
        # Only the production amount and volume depend on the product unit
        if exc['type'] == 'production':
            exc['amount'] *= factor
            exc['production volume'] *= factor
        return exc

    data_dict = act.as_dict()
    data_dict['exchanges'] = [rescale_exchange(exc.as_dict(), factor) for exc in act.exchanges()]
    return data_dict
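
As a quick illustration of what this rescaling does, here is a minimal, self-contained sketch on a made-up dataset. The exchange names and numbers are hypothetical, not taken from ecoinvent, and `rescale_production` is a stand-in for the inner function above that works on a copy. Only the production exchange changes when we move from kilowatt hours to megajoules; the biosphere amount stays the same.

```python
# Hypothetical exchanges for 1 kWh of electricity; values are illustrative only
toy_exchanges = [
    {'type': 'production', 'name': 'electricity', 'amount': 1.0, 'production volume': 100.0},
    {'type': 'biosphere', 'name': 'Carbon dioxide, fossil', 'amount': 0.4},
]

KWH_TO_MJ = 3.6


def rescale_production(exc, factor):
    # Work on a shallow copy so the input list is left untouched
    exc = dict(exc)
    if exc['type'] == 'production':
        exc['amount'] *= factor
        exc['production volume'] *= factor
    return exc


rescaled = [rescale_production(exc, KWH_TO_MJ) for exc in toy_exchanges]
for exc in rescaled:
    print(exc['type'], exc['amount'])
```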

Now we need to pick one allocated dataset to be the main one, and a unit to normalize to. As we know that the original data is normalized to one kilowatt hour of electricity, let's pick one megajoule of heat to prove we are not cheating.

In [5]:
heat = extract_and_rescale(bd.get_node(name='heat and power co-generation, natural gas, combined cycle power plant, 400MW electrical', location='DE', unit='megajoule', database='ecoinvent 3.8 cutoff'), 1)
elec = extract_and_rescale(bd.get_node(name='heat and power co-generation, natural gas, combined cycle power plant, 400MW electrical', location='DE', unit='kilowatt hour', database='ecoinvent 3.8 cutoff'), 3.6)

We don't yet know how much electricity is produced relative to one megajoule of heat. Moreover, we have our exergetic allocation to remember as well - heat has lower exergy than electricity. We have these numbers:

In [6]:
for activity in CHP:
    for exc in activity.production():
        print(exc['properties']['exergy']['comment'])
    break

allocation factor for electricity (=1) vs. heat, where heat is calculated as the termodynamic mean temperature Tm = (Tfeed-Treturn)/ln(Tfeed/Treturn), relative to the application temperature (Tu, typically = 293 K), i.e. (Tm-Tu)/Tm
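
To make the formula in that comment concrete, here is a small sketch that evaluates it. The feed and return temperatures are assumptions chosen for illustration (90 °C and 60 °C); they are not the values behind the ecoinvent number, and `exergy_factor` is a hypothetical helper name.

```python
import math


def exergy_factor(t_feed, t_return, t_ambient=293.0):
    """Exergy factor of heat, (Tm - Tu) / Tm, with all temperatures in kelvin."""
    # Thermodynamic (logarithmic) mean temperature of the heat carrier
    t_mean = (t_feed - t_return) / math.log(t_feed / t_return)
    return (t_mean - t_ambient) / t_mean


# Assumed temperatures: 90 °C feed and 60 °C return (illustrative only)
factor = exergy_factor(t_feed=363.15, t_return=333.15)
print(round(factor, 3))
```

The result is a number between 0 and 1, which is why one megajoule of heat counts for much less than one megajoule of electricity in an exergy allocation.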


The factor 3.6 below is just the conversion from kWh to MJ.

In [7]:
exergy = {
    activity['reference product']: exc['properties']['exergy']['amount']
    for activity in CHP
    for exc in activity.production()
}

exergy

{'electricity, high voltage': 3.6,
 'heat, district or industrial, natural gas': 0.184213553594}

We have already converted everything to megajoules, so the exergy of electricity should be 1 per megajoule, not 3.6 per kilowatt hour. Let's fix this dictionary:

In [8]:
exergy['electricity, high voltage'] = 1

Our allocation is done by exergy, so our relative allocation factors should be:

$$
allocation_{i} = \frac{exergy_{i} \cdot production\_volume_{i}}{\sum_{j} ( exergy_{j} \cdot production\_volume_{j})}
$$

We can then calculate our allocation factors. Heat should be much lower than electricity, as we know in the unallocated dataset there is more electricity being produced than heat, and heat has a much lower exergy value.

In [9]:
def get_allocation_factor(product, all_products):
    def exergy_volume(ds):
        # Exergy content times the production volume of the reference product
        volume = next(exc['production volume'] for exc in ds['exchanges'] if exc['type'] == 'production')
        return exergy[ds['reference product']] * volume

    return exergy_volume(product) / sum(exergy_volume(obj) for obj in all_products)

In [10]:
allocation_factors = {act['reference product']: get_allocation_factor(act, (heat, elec)) for act in (heat, elec)}
allocation_factors

{'heat, district or industrial, natural gas': 0.12921890832494007,
 'electricity, high voltage': 0.8707810916750599}

Let's be clear about what these numbers mean (you probably already know this, but I sometimes get confused :). These are the relative shares of the *environmental burdens* and *input goods and services* of the unallocated data assigned to each one of the multiple output products. We can go backwards, but need to be careful because our `elec` and `heat` data are relative, not absolute, so just scaling everything up or down isn't correct. Moreover, reversing the application of any single allocation factor should result in the **total** amount of environmental burdens and input goods and services, so we can't reverse multiple allocated datasets and add them together.
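
A toy example may help here. The numbers below are made up (a total burden of 10 and factors of 0.2 and 0.8), and the sketch ignores the normalization to one unit of product; it only shows why reversing a single factor already gives the total, and why adding reversed datasets together double counts.

```python
# Hypothetical unallocated burden and allocation factors (illustrative only)
total_burden = 10.0
factors = {'heat': 0.2, 'electricity': 0.8}

# Allocation splits the total burden between the products
allocated = {product: total_burden * f for product, f in factors.items()}

# Reversing any single allocated value recovers the full total...
recovered = {product: allocated[product] / factors[product] for product in factors}

# ...so adding the reversed values together would count the total twice
double_counted = sum(recovered.values())
print(recovered, double_counted)
```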

## Validation of allocation factors

So what can we do? First, we can make sure that these allocation factors are correct. Let's reverse the allocation for electricity and make sure we get some values we know are correct from the [unlinked dataset](https://v38.ecoquery.ecoinvent.org/Details/UPR/87f4e644-7d1a-4510-bf6c-98a197d3176f/8b738ea0-f89e-4627-8679-433616064e82):

* Carbon dioxide, fossil: 0.38448
* Mercury: 4.9536E-10
* natural gas, high pressure: 0.18462

There is a single allocation factor being applied, so a few checks should be enough. We also need to bear in mind that we have electricity in megajoules, but the unallocated values are for 1 kilowatt hour (i.e. 3.6 megajoules) of electricity production.

In [11]:
UNALLOCATED = {
    "Carbon dioxide, fossil": 0.38448,
    "Mercury": 4.9536E-10,
    "natural gas, high pressure": 0.18462
}

In [12]:
for key, value in UNALLOCATED.items():
    exc = next(exc for exc in elec['exchanges'] if exc['name'] == key)
    print(value, exc['amount'] / allocation_factors[elec['reference product']])

0.38448 0.38447999999999993
4.9536e-10 4.953600000000003e-10
0.18462 0.18461538461538501


To do the same for heat, we need to know the ratio of heat (in energy, not exergy) to electricity produced in the unallocated dataset. We can get this from the production volumes:

In [13]:
heat_fraction = next(exc['production volume'] for exc in heat['exchanges'] if exc['type'] == 'production') / sum(exc['production volume'] for obj in (heat, elec) for exc in obj['exchanges'] if exc['type'] == 'production')
heat_ratio = heat_fraction / (1 - heat_fraction)
heat_ratio

0.8055555555555561

In other words, the unallocated dataset was relative to 3.6 MJ of electricity production and $heat\_ratio \cdot 3.6$ MJ of heat production.
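
As a cross-check, the allocation factors can be reproduced by hand from this ratio and the exergy values alone, because only the relative production volumes matter. The sketch below copies the numbers from the outputs above; the variable names are mine.

```python
# Values copied from the outputs above
heat_ratio = 0.8055555555555561  # MJ of heat per MJ of electricity
exergy_heat = 0.184213553594     # exergy per MJ of heat
exergy_elec = 1.0                # exergy per MJ of electricity

# Exergy-weighted production, relative to 1 MJ of electricity
heat_share = exergy_heat * heat_ratio
elec_share = exergy_elec * 1.0
total = heat_share + elec_share

print(heat_share / total, elec_share / total)
```

These should agree with the `allocation_factors` dictionary up to floating point noise.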

In [14]:
for key, value in UNALLOCATED.items():
    exc = next(exc for exc in heat['exchanges'] if exc['name'] == key)
    print(value, exc['amount'] / allocation_factors[heat['reference product']] * (3.6 * heat_ratio))

0.38448 0.384479999999999
4.9536e-10 4.95359999999999e-10
0.18462 0.18461538461538488


We can do one last check - let's get the total score from an LCIA calculation that we can compare with later on:

In [15]:
lca = bc.LCA({
    (elec['database'], elec['code']): 1, # one kWh, 3.6 MJ,
    (heat['database'], heat['code']): 3.6 * heat_ratio,
}, method=('IPCC 2013', 'climate change', 'GWP 100a'))
lca.lci()
lca.lcia()
reference_score = lca.score

# Recombining datasets

Once we have validated the allocation factors, going back to the unallocated dataset is relatively simple :)

In [16]:
production_scaling = {
    'heat, district or industrial, natural gas': (3.6 * heat_ratio),
    'electricity, high voltage': 1, # Already at 3.6 due to unit conversion
}

In [17]:
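def unallocate(ds, factor, other_datasets, production_scaling):
    """Scale one allocated dataset back up to the combined, multi-output dataset."""
    ds = copy.deepcopy(ds)
    for exc in ds['exchanges']:
        if exc['type'] != 'production':
            # Inputs and emissions: undo the allocation
            exc['amount'] *= factor
        elif exc['name'] == ds['reference product']:
            exc['amount'] *= production_scaling[exc['name']]
    # Add the reference products of the other allocated datasets as co-products
    for other_ds in other_datasets:
        for exc in other_ds['exchanges']:
            if exc['type'] == 'production' and exc['name'] == other_ds['reference product']:
                # Copy so that later edits don't change the source dataset
                ds['exchanges'].append(copy.deepcopy(exc))
    return ds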

In [18]:
unallocated = unallocate(heat, (3.6 * heat_ratio) / allocation_factors[heat['reference product']], [elec], production_scaling)

# Allocating using grouped factors

Some people seem quite excited about doing allocation using different factors for different groups of exchanges. As an example, we can use different values for biosphere and technosphere exchanges. Of course, we can make this as complex or simple as we want.

Allocation factors within a group don't have to sum to one, but then they need to be normalized before use. The factors below already sum to one in each group, so we can skip the normalization step.

In [19]:
ALLOCATION_FACTORS = {
    'biosphere': {
        'heat, district or industrial, natural gas': 0.2,
        'electricity, high voltage': 0.8,
    },
    'technosphere': {
        'heat, district or industrial, natural gas': 0.5,
        'electricity, high voltage': 0.5,
    },    
}
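
If the factors in a group did not sum to one, a normalization step along these lines could be applied first. This is only a sketch: `normalize_factors` is a hypothetical helper, not part of Brightway, and the raw weights are made up.

```python
def normalize_factors(group_factors):
    """Rescale each group's factors so that they sum to one."""
    normalized = {}
    for group, factors in group_factors.items():
        total = sum(factors.values())
        if total <= 0:
            raise ValueError(f"Factors for group {group!r} must sum to a positive number")
        normalized[group] = {product: value / total for product, value in factors.items()}
    return normalized


# Hypothetical unnormalized weights (illustrative only)
raw = {
    'biosphere': {'heat': 1, 'electricity': 4},
    'technosphere': {'heat': 2, 'electricity': 2},
}
normalized = normalize_factors(raw)
print(normalized)
```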

In [20]:
def allocation_by_groups(unallocated, factors):
    allocated_data = []
    
    products = {exc['name'] for exc in unallocated['exchanges'] if exc['type'] == 'production'}
    for product in products:
        ds = copy.deepcopy(unallocated)
        ds['reference product'] = product
        for exc in ds['exchanges']:
            if exc['type'] == 'production' and exc['name'] != product:
                exc['amount'] = 0
            elif exc['type'] == 'production':
                code = exc['input'][1]
            elif exc['type'] in factors:
                exc['amount'] *= factors[exc['type']][product]
        ds['code'] = code
        allocated_data.append(ds)
    return allocated_data

In [21]:
new_datasets = allocation_by_groups(unallocated, ALLOCATION_FACTORS)
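
A property worth checking for any grouped allocation is conservation: summed over all products, each input or emission should add back up to its unallocated amount. Here is a self-contained sketch of that check on a toy dataset; the names, amounts and the simplified `split_by_groups` function are assumptions for illustration, not the notebook's real data.

```python
# Hypothetical unallocated dataset (names and amounts are illustrative only)
toy_unallocated = {
    'exchanges': [
        {'type': 'production', 'name': 'heat', 'amount': 2.9},
        {'type': 'production', 'name': 'electricity', 'amount': 3.6},
        {'type': 'biosphere', 'name': 'Carbon dioxide, fossil', 'amount': 0.4},
        {'type': 'technosphere', 'name': 'natural gas', 'amount': 0.2},
    ]
}
toy_factors = {
    'biosphere': {'heat': 0.2, 'electricity': 0.8},
    'technosphere': {'heat': 0.5, 'electricity': 0.5},
}


def split_by_groups(unallocated, factors):
    # For each product, scale every input and emission by its group's factor
    products = [e['name'] for e in unallocated['exchanges'] if e['type'] == 'production']
    return {
        product: {
            exc['name']: exc['amount'] * factors[exc['type']][product]
            for exc in unallocated['exchanges']
            if exc['type'] in factors
        }
        for product in products
    }


allocated = split_by_groups(toy_unallocated, toy_factors)

# Summing over products should give back the unallocated amounts
totals = {}
for amounts in allocated.values():
    for name, amount in amounts.items():
        totals[name] = totals.get(name, 0) + amount
print(totals)
```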

# Storing and using our new datasets

We can store this data in different ways; here I will save it as a new Brightway database.

In [22]:
def change_database_codes(datasets, db_name):
    for ds in datasets:
        ds['database'] = db_name
        for exc in ds['exchanges']:
            if exc['type'] == 'production':
                exc['input'] = (db_name, exc['input'][1])
                del exc['output'] # Filled in by BW
    return datasets

In [23]:
db_name = "reallocated NG in Germany"

In [24]:
new_datasets = change_database_codes(new_datasets, db_name)

In [25]:
new_db = bd.Database(db_name)
new_db.write({(o['database'], o['code']): o for o in new_datasets})

Writing activities to SQLite3 database:
0% [##] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Title: Writing activities to SQLite3 database:
  Started: 08/24/2022 14:19:39
  Finished: 08/24/2022 14:19:39
  Total time elapsed: 00:00:00
  CPU %: 268.20
  Memory %: 1.37


Check that our new allocated datasets give the same total LCA score as the original ones:

In [26]:
lca = bc.LCA({
    (db_name, elec['code']): 3.6, # Unit is now megajoule
    (db_name, heat['code']): 3.6 * heat_ratio,
}, method=('IPCC 2013', 'climate change', 'GWP 100a'))
lca.lci()
lca.lcia()
new_allocated_score = lca.score

In [27]:
reference_score, new_allocated_score

(0.4772085315130064, 0.47720852124955737)
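
The two totals are not bit-for-bit identical, so a tolerance-based comparison is more useful than eyeballing the digits. The sketch below copies the scores from the output above; the tolerance of `1e-6` is my own choice, not something prescribed by Brightway.

```python
import math

# Scores copied from the output above
reference_score = 0.4772085315130064
new_allocated_score = 0.47720852124955737

relative_difference = abs(reference_score - new_allocated_score) / reference_score
scores_match = math.isclose(reference_score, new_allocated_score, rel_tol=1e-6)
print(relative_difference, scores_match)
```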

The split between the two activities has changed, though:

In [28]:
FUNC_UNITS = [
    (('ecoinvent 3.8 cutoff', elec['code']), 1, "Ecoinvent electricity"),
    (('ecoinvent 3.8 cutoff', heat['code']), 2.9, "Ecoinvent heat"),
    ((db_name, elec['code']), 3.6, "New electricity"),
    ((db_name, heat['code']), 2.9, "New heat"),
]

In [29]:
for key, amount, label in FUNC_UNITS:
    lca.lcia({bd.get_activity(key).id: amount})
    print(label, lca.score)

Ecoinvent electricity 0.41554416740392963
Ecoinvent heat 0.06166436410907783
New electricity 0.3545886715411093
New heat 0.12261984970844801
