# Replace market groups
Background information on market groups: https://ecoinvent.org/the-ecoinvent-database/market-activities/
- market groups only exist to facilitate readability
- no information is lost by removing them from the system

The purpose of this notebook is to remove the market groups from the database
- this is achieved by substituting each technosphere input that represents a market group with all markets in that group
- this has to be done iteratively in several passes because some supply chains include market groups which import from other market groups
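
The iterative substitution can be illustrated on a toy system. This is a minimal sketch, not the notebook's implementation: the names `toy_groups`, `expand_once` and `consumer_inputs`, the activity labels, and all numbers are hypothetical, and each market group is reduced to a plain mapping from input to share.

```python
# Toy illustration with hypothetical names and numbers (not ecoinvent data).
# Each market group maps its inputs to shares; an input may be a market
# or another market group.
toy_groups = {
    'MG_global': {'MG_europe': 0.4, 'market_US': 0.6},
    'MG_europe': {'market_DE': 0.5, 'market_FR': 0.5},
}


def expand_once(inputs):
    """Replace every market-group input by that group's inputs, scaled by the amount."""
    expanded = {}
    for name, amount in inputs.items():
        if name in toy_groups:
            for sub_name, share in toy_groups[name].items():
                expanded[sub_name] = expanded.get(sub_name, 0.0) + amount * share
        else:
            expanded[name] = expanded.get(name, 0.0) + amount
    return expanded


consumer_inputs = {'MG_global': 2.0}
passes = 0
# Repeat until no market group is left among the inputs.
while any(name in toy_groups for name in consumer_inputs):
    consumer_inputs = expand_once(consumer_inputs)
    passes += 1

print(passes, consumer_inputs)
```

One pass is not enough here because `MG_global` draws on `MG_europe`, which only becomes visible as an input after the first pass. The total amount supplied to the consumer is unchanged by the substitution.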

In [1]:
import pickle
import random
import collections
import itertools
import copy
import pandas as pd
import numpy as np
import brightway2 as bw

### 1. Load LCI data 

In [2]:
# use "eidb" for checking dataset details in preliminary analysis and "datasets" for efficient calculation
bw.projects.set_current('regeco')
eidb = bw.Database('ecoinvent 3.7.1_cutoff_ecoSpold02')
with open('../../Data/lci_iot_imported/cutoff371_data.pickle', 'rb') as i:
    datasets = pickle.load(i)

regeco


19128

### 2. Preliminary analysis

In [5]:
market_groups = [d for d in eidb if d['activity type']=='market group']
market_group_codes = {d['code'] for d in market_groups}
len(market_groups)

115

#### 2.1 Make sure that no market group imports from itself (so replacement is not infinite)

In [8]:
for mg in market_groups:
    exc = [exc for exc in mg.exchanges()]
    inputs = [e['input'][1] for e in exc if e['type'] != 'production'] # "production" -> exchange showing output. e['input'][1] returns code of activity.
    assert all([i != mg['code'] for i in inputs])

#### 2.2 Assert that there are no biosphere exchanges (so removing market groups will not affect LCA results)

In [9]:
for mg in market_groups:
    exc = [exc for exc in mg.exchanges()]
    assert all([e['type']!='biosphere' for e in exc])

#### 2.3 Assert that there are no circular references (so replacement is not infinite)

In [11]:
code_d = {d['code']:d for d in datasets}

def market_group_exchanges(m):
    """Return the market-group datasets (dicts) that dataset `m` takes as non-production inputs."""
    return [code_d[e['input'][1]] for e in m['exchanges']
            if e['type'] != 'production'
            and code_d[e['input'][1]]['activity type'] == 'market group']

i_max = 0
for mg in market_groups:
    print(type(mg))
    # start from the dataset dict so every item on the list has an 'exchanges' key
    list_of_market_groups = [code_d[mg['code']]]
    i = 0
    while list_of_market_groups:
        list_of_market_groups.extend(market_group_exchanges(list_of_market_groups.pop()))
        i += 1
        if i > i_max:
            i_max = i
        if i == 20:  # give up after 20 steps; a circular reference would never terminate
            print('failed')
            break
print(i_max)

<class 'bw2data.backends.peewee.proxies.Activity'>
... (same line repeated for each market group; output truncated)
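
The check above caps the walk at a fixed number of steps. An alternative way to test for circular references is an explicit depth-first search that flags any node met again while it is still on the current path. The sketch below assumes a hypothetical toy graph (`toy_inputs`) in which each market group lists the market groups it imports from; `has_cycle` is an illustrative helper, not part of the notebook or of Brightway2.

```python
# Toy graph with hypothetical names: each market group lists the market groups
# it takes as inputs.
toy_inputs = {
    'MG_a': ['MG_b', 'MG_c'],
    'MG_b': ['MG_c'],
    'MG_c': [],
}


def has_cycle(graph):
    """Depth-first search; a node met again while still on the current path closes a cycle."""
    on_path, finished = set(), set()

    def visit(node):
        if node in finished:
            return False
        if node in on_path:
            return True
        on_path.add(node)
        found = any(visit(child) for child in graph.get(node, []))
        on_path.discard(node)
        finished.add(node)
        return found

    return any(visit(node) for node in graph)


print(has_cycle(toy_inputs))
print(has_cycle({'MG_a': ['MG_b'], 'MG_b': ['MG_a']}))
```

Unlike a step counter, this does not depend on choosing a threshold, and it distinguishes a deep but finite chain from a true cycle.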

#### 2.4 Check the algebraic sign of the amounts for the market groups themselves and their technosphere inputs

In [12]:
collections.Counter([e['amount']/abs(e['amount']) for e in 
                     itertools.chain.from_iterable([[exc for exc in d.exchanges()] for d in market_groups]) if e['type']!='production'])

Counter({1.0: 1041, -1.0: 444})

In [13]:
collections.Counter([e['amount']/abs(e['amount']) for e in 
                     itertools.chain.from_iterable([[exc for exc in d.exchanges()] for d in market_groups])])

Counter({1.0: 1132, -1.0: 468})

Note: the negative amounts belong to the 'market of waste' exchanges of 'market groups of waste'. Negative amounts also appear along the whole waste supply chain (market group of waste - consumer, treatment - market of waste). If a consumer's negative exchange with a market group were multiplied directly by the group's negative exchanges, the signs would flip to positive. During replacement, the amount of the market-group exchange is therefore taken as an absolute value, so that the substituted amounts are still negative after the multiplication.
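
The effect of taking the absolute value can be seen in a small arithmetic sketch. The amounts below are hypothetical and chosen only to show the sign behaviour; the variable names are illustrative.

```python
# Hypothetical amounts for a waste supply chain (not taken from ecoinvent).
consumer_to_group = -2.0           # consumer's exchange with the waste market group
group_to_markets = [-0.25, -0.75]  # the market group's exchanges with its waste markets

# Multiplying two negative amounts flips the sign to positive.
naive = [consumer_to_group * a for a in group_to_markets]

# Scaling by the absolute value keeps the amounts negative.
with_abs = [abs(consumer_to_group) * a for a in group_to_markets]

print(naive)
print(with_abs)
```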

### 3. Replace market groups by markets in the groups.

In [27]:
%%time
datasets_no_mg = copy.deepcopy([d for d in datasets if d['activity type']!='market group'])
print(len(datasets_no_mg))

19013
CPU times: user 44.8 s, sys: 2.69 s, total: 47.5 s
Wall time: 47.5 s


In [26]:
def get_input_dataset_by_code(exchange):
    return code_d[exchange['input'][1]]

In [28]:
def combine_double_exchanges(exchanges):  # there are no double exchanges yet, but replacement may create them
    """
    Merge technosphere exchanges that share the same input into a single exchange.
    
    """
    non_technosphere_ex = [e for e in exchanges if e['type'] != 'technosphere']
    technosphere_ex = [e for e in exchanges if e['type'] == 'technosphere']
    
    input_codes = {e['input'][1] for e in technosphere_ex}
    input_dict = {c:sorted([e for e in technosphere_ex if e['input'][1]==c], 
                           key=lambda x:x['amount'], reverse=True) for c in input_codes} # maps each input code to its exchanges; more than one entry means double exchanges
    combined_exchanges = []
    for code, exchange_list in input_dict.items():
        if len(exchange_list) == 1:
            combined_exchanges.append(exchange_list[0])
        else:
            new_ex = copy.deepcopy(exchange_list[0])
            new_ex['amount'] = sum([e['amount'] for e in exchange_list]) # use one of double exchanges as template, change its amount to sum of double exchanges
            combined_exchanges.append(new_ex)
    
    np.testing.assert_almost_equal(sum([e['amount'] for e in technosphere_ex]),
                                   sum([e['amount'] for e in combined_exchanges]),
                                   decimal=3, err_msg='merging error')
    
    combined_exchanges = non_technosphere_ex + combined_exchanges
    
    return combined_exchanges
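
The core of the merging step is a group-and-sum over the input code. The sketch below shows only that idea on hypothetical exchanges that use the same dict layout as the notebook (`'input'`, `'amount'`, `'type'`); the database name `'db'`, the codes, and the amounts are made up, and it is not a replacement for the function above.

```python
from collections import defaultdict

# Hypothetical exchanges in the dict layout used in this notebook.
toy_exchanges = [
    {'input': ('db', 'market_DE'), 'amount': 0.4, 'type': 'technosphere'},
    {'input': ('db', 'market_DE'), 'amount': 0.1, 'type': 'technosphere'},
    {'input': ('db', 'market_FR'), 'amount': 0.5, 'type': 'technosphere'},
]

# Sum the amounts of all exchanges that point to the same input code.
totals = defaultdict(float)
for exc in toy_exchanges:
    totals[exc['input'][1]] += exc['amount']

print(dict(totals))
```

The two `market_DE` exchanges collapse into one total, while the overall sum of amounts stays the same, which is the property the `assert_almost_equal` check in the function verifies.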

In [29]:
def replace_market_group(exchanges): 
    """
    Replace each market-group input by the inputs of that market group, scaled by the exchange amount.
    
    """
    normal_exchanges = [e for e in exchanges if not e['input'][1] in market_group_codes]
    market_group_exchanges = [e for e in exchanges if e['input'][1] in market_group_codes]
    assert len([e for e in market_group_exchanges if e['type']!='technosphere']) == 0, 'working on no_mg_datasets' 
    
    if not market_group_exchanges: 
        return exchanges
    
    exchanges_via_market_group = [] 
    for exchange in market_group_exchanges:
        amount = abs(exchange['amount'])  # absolute value, so negative (waste) amounts inside the market group keep their sign
        indirect_exchanges = copy.deepcopy(get_input_dataset_by_code(exchange)['exchanges'])
        indirect_exchanges = [e for e in indirect_exchanges if e['type']!='production'] 
        for indirect_exchange in indirect_exchanges:
            indirect_exchange['amount'] *= amount 
        exchanges_via_market_group.extend(indirect_exchanges)
    
    assert not [e for e in exchanges_via_market_group if e['type']=='production'], 'included production exchange'
    
    
    adapted_exchanges = normal_exchanges + exchanges_via_market_group
    
    if len(adapted_exchanges) != len({e['input'][1] for e in adapted_exchanges}):
        adapted_exchanges = combine_double_exchanges(adapted_exchanges)
        
    adapted_exchanges_technosphere = [e for e in adapted_exchanges if e['type']=='technosphere']
    assert len(adapted_exchanges_technosphere) == len({e['input'][1] for e in adapted_exchanges_technosphere}),\
    'double exchanges'
    
    
    return adapted_exchanges

In [32]:
%%time
replacement_loop = 0
while True:
    replacement_loop += 1
    # calculate total number of market group exchanges in non-market-group datasets
    number_of_market_group_references = len([True for e in 
                                             itertools.chain.from_iterable([d['exchanges'] for 
                                                                            d in datasets_no_mg]) if 
                                             e['input'][1] in market_group_codes])
        
    print('replacement loop: {}'.format(replacement_loop))
    print('number of market group references: {}'.format(number_of_market_group_references))
    total_n_exchanges = len([True for e in itertools.chain.from_iterable([d['exchanges'] for 
                                                                          d in datasets_no_mg])])
    print('total number of exchanges: {}'.format(total_n_exchanges))
    if not number_of_market_group_references:
        print('done')
        break
    for d in datasets_no_mg:
        d['exchanges'] = replace_market_group(d['exchanges'])
        
    print('-'*50)

replacement loop: 1
number of market group references: 17938
total number of exchanges: 620119
--------------------------------------------------
replacement loop: 2
number of market group references: 16637
total number of exchanges: 857220
--------------------------------------------------
replacement loop: 3
number of market group references: 4676
total number of exchanges: 1011019
--------------------------------------------------
replacement loop: 4
number of market group references: 440
total number of exchanges: 1054975
--------------------------------------------------
replacement loop: 5
number of market group references: 0
total number of exchanges: 1059815
done
CPU times: user 21.7 s, sys: 441 ms, total: 22.1 s
Wall time: 22.1 s


### 4. Save the new datasets

In [33]:
with open('../../Data/lci_iot_imported/cutoff371_no_mg.pickle', 'wb') as o:
    pickle.dump(datasets_no_mg, o)

### 5. Write no-market-group datasets to bw2_databases

In [None]:
bw.databases

In [None]:
# import the ecoinvent database as a template for the new database without market groups
my_path = r'../../Data/ecoinvent_database/ecoinvent 3.7.1_cutoff_ecoSpold02/datasets'
datasets = bw.SingleOutputEcospold2Importer(my_path, 'ecoinvent 3.7.1_cutoff_ecoSpold02')
datasets.apply_strategies()

In [None]:
with open('../../Data/lci_iot_imported/cutoff371_no_mg.pickle', 'rb') as i:
    data_no_mg = pickle.load(i)

# point every dataset and every exchange input to the new database name
for d in data_no_mg:
    assert d['database']=='ecoinvent 3.7.1_cutoff_ecoSpold02'
    d['database'] = 'ecoinvent 3.7.1_cutoff_no_marketgroups'
    for exc in d['exchanges']:
        if exc['input'][0] == 'ecoinvent 3.7.1_cutoff_ecoSpold02':
            exc['input'] = ('ecoinvent 3.7.1_cutoff_no_marketgroups', exc['input'][1])

In [None]:
datasets.db_name = 'ecoinvent 3.7.1_cutoff_no_marketgroups'
datasets.data = data_no_mg
datasets.write_database() 
bw.databases