# Sector matching

The matching between ISIC codes in ecoinvent and sectors in exiobase was done by Simon Roth and Jonas Mehr [1]. This notebook checks whether that earlier matching covers all ISIC codes in ecoinvent v3.7.1 and all sectors in the 2011 exiobase.



[1] Jonas Mehr and Simon Roth. Development of a tool refining the regional resolution in life cycle inventory. Project work report, ETH Zürich, November 2015.

In [1]:
import numpy as np
import pandas as pd
import pickle
import brightway2 as bw
import re

In [6]:
# local file
with open('../../Data/sector_matching/sector_matching_dicts.pickle', 'rb') as i:
    matching_dicts = pickle.load(i)
matching_dicts

{'exio_to_isic': {'i28': ['2511',
   '2512',
   '2513',
   '2520',
   '2591',
   '2592',
   '2593',
   '2599'],
  'i80': ['8510', '8521', '8522', '8530', '8541', '8542', '8549', '8550'],
  'i90.6.e': ['3811', '3812', '3821', '3822'],
  'i40.11.b': ['3510'],
  'i01.o': ['0150'],
  'i17': ['1311', '1312', '1313', '1391', '1392', '1393', '1394', '1399'],
  'i27.42': ['2420'],
  'i90.6.a': ['3811', '3812', '3821', '3822'],
  'i90.1.d': ['3811', '3812', '3821', '3822'],
  'i26.b': ['2393'],
  'i21.2': ['1702', '1709'],
  'i90.6.f': ['3811', '3812', '3821', '3822'],
  'i13.20.13': ['0729'],
  'i13.20.12': ['0729'],
  'i90.4.a': ['3811', '3812', '3821', '3822'],
  'i90.1.b': ['3811', '3812', '3821', '3822'],
  'i50.b': ['4730'],
  'i30': ['2610', '2620', '2817'],
  'i14.3': ['0891', '0892', '0893', '0899', '0990'],
  'i99': ['9900'],
  'i13.20.14': ['0729'],
  'i15.c': ['1010'],
  'i40.11.l': ['3510'],
  'i21.1': ['1701'],
  'i01.j': ['0145', '0150', '0162'],
  'i22': ['1811', '1812', '1820',

#### 1. Check whether the dictionary covers all sectors of the exiobase in 2011.

Check the sectors covered by the current dictionary.

In [8]:
exio_name_dict = matching_dicts['exio_name_dict']
exio_name_dict = dict(sorted(exio_name_dict.items(), key=lambda item: item[1]))
len(exio_name_dict)

163

In [9]:
exio_name = list(exio_name_dict.values())
exio_name

['Activities auxiliary to financial intermediation',
 'Activities of membership organisation n.e.c.',
 'Air transport',
 'Aluminium production',
 'Animal products nec',
 'Biogasification of food waste, incl. land application',
 'Biogasification of paper, incl. land application',
 'Biogasification of sewage slugde, incl. land application',
 'Casting of metals',
 'Cattle farming',
 'Chemicals nec',
 'Collection, purification and distribution of water',
 'Composting of food waste, incl. land application',
 'Composting of paper and wood, incl. land application',
 'Computer and related activities',
 'Construction',
 'Copper production',
 'Cultivation of cereal grains nec',
 'Cultivation of crops nec',
 'Cultivation of oil seeds',
 'Cultivation of paddy rice',
 'Cultivation of plant-based fibers',
 'Cultivation of sugar cane, sugar beet',
 'Cultivation of vegetables, fruit, nuts',
 'Cultivation of wheat',
 'Distribution and trade of electricity',
 'Education',
 'Extra-territorial organizatio

Check the sectors covered by the current exiobase.

In [7]:
with open('../../Data/lci_iot_imported/iot_flow.p', 'rb') as i:
    iot = pickle.load(i)

# Collect the unique sector names (second index level) as a sorted list
iot_sec = sorted({multi_index[1] for multi_index in iot.index})
iot_sec

['Activities auxiliary to financial intermediation (67)',
 'Activities of membership organisation n.e.c. (91)',
 'Air transport (62)',
 'Aluminium production',
 'Animal products nec',
 'Biogasification of food waste, incl. land application',
 'Biogasification of paper, incl. land application',
 'Biogasification of sewage slugde, incl. land application',
 'Casting of metals',
 'Cattle farming',
 'Chemicals nec',
 'Collection, purification and distribution of water (41)',
 'Composting of food waste, incl. land application',
 'Composting of paper and wood, incl. land application',
 'Computer and related activities (72)',
 'Construction (45)',
 'Copper production',
 'Cultivation of cereal grains nec',
 'Cultivation of crops nec',
 'Cultivation of oil seeds',
 'Cultivation of paddy rice',
 'Cultivation of plant-based fibers',
 'Cultivation of sugar cane, sugar beet',
 'Cultivation of vegetables, fruit, nuts',
 'Cultivation of wheat',
 'Distribution and trade of electricity',
 'Education (80

Some sector names in the current exiobase end with a numeric suffix in parentheses, such as `(67)`. Remove these suffixes before comparing the names.

In [10]:
# Strip every ' (' and ')' and all digits from the names.
# Note: this also removes parentheses that are not part of a numeric suffix.
iot_sec_rev = [re.sub(r'\d', '', name.replace(' (', '').replace(')', '')) for name in iot_sec]
iot_sec_rev.sort()
iot_sec_rev

['Activities auxiliary to financial intermediation',
 'Activities of membership organisation n.e.c.',
 'Air transport',
 'Aluminium production',
 'Animal products nec',
 'Biogasification of food waste, incl. land application',
 'Biogasification of paper, incl. land application',
 'Biogasification of sewage slugde, incl. land application',
 'Casting of metals',
 'Cattle farming',
 'Chemicals nec',
 'Collection, purification and distribution of water',
 'Composting of food waste, incl. land application',
 'Composting of paper and wood, incl. land application',
 'Computer and related activities',
 'Construction',
 'Copper production',
 'Cultivation of cereal grains nec',
 'Cultivation of crops nec',
 'Cultivation of oil seeds',
 'Cultivation of paddy rice',
 'Cultivation of plant-based fibers',
 'Cultivation of sugar cane, sugar beet',
 'Cultivation of vegetables, fruit, nuts',
 'Cultivation of wheat',
 'Distribution and trade of electricity',
 'Education',
 'Extra-territorial organizatio

Now, compare sectors in the dictionary and the exiobase!

In [11]:
intersection = set(exio_name) & set(iot_sec_rev)
print(set(exio_name)-intersection)
print(set(iot_sec_rev)-intersection)

{'Re-processing of secondary lead into new lead', 'Manure treatment (biogas), storage and land application', 'Manure treatment (conventional), storage and land application'}
{'Manure treatmentconventional, storage and land application', 'Manure treatmentbiogas, storage and land application', 'Re-processing of secondary lead into new lead, zinc and tin'}
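The two manure treatment mismatches in the output above arise because the cleaning step removes every parenthesis, not only the trailing numeric suffix. A minimal sketch of a narrower cleaning step follows. It assumes the suffix always has the form ` (digits)` at the very end of the name, and `strip_numeric_suffix` is a hypothetical helper, not part of the original notebook.

```python
import re

def strip_numeric_suffix(name):
    """Remove a trailing ' (digits)' suffix; leave other parentheses untouched."""
    return re.sub(r'\s*\(\d+\)$', '', name)

# Illustrative names taken from the outputs shown in this notebook
examples = [
    'Air transport (62)',
    'Manure treatment (biogas), storage and land application',
]
cleaned = [strip_numeric_suffix(n) for n in examples]
print(cleaned)
```

With this variant, only the sector whose name was actually extended (the lead re-processing sector) would remain in the difference.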


In [12]:
for i in range(len(iot_sec_rev)):
    if exio_name[i]!= iot_sec_rev[i]:
        print(iot_sec_rev[i])
        print(exio_name[i])

Manure treatmentbiogas, storage and land application
Manure treatment (biogas), storage and land application
Manure treatmentconventional, storage and land application
Manure treatment (conventional), storage and land application
Re-processing of secondary lead into new lead, zinc and tin
Re-processing of secondary lead into new lead


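These three sectors differ only in name. The two manure treatment entries differ because the cleaning step above also stripped their non-numeric parentheses, and the lead re-processing sector has an extended name in the current exiobase. Their meanings are unchanged, so the dictionary covers all 163 industries once the names are revised. Next, update the sector names in the dictionary.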

In [16]:
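# Pair each dictionary entry with the current exiobase name by position.
# This relies on exio_name_dict (sorted by name) and iot_sec (sorted) being in the same order.
for key, new_name in zip(list(exio_name_dict), iot_sec):
    exio_name_dict[key] = new_name
matching_dicts['exio_name_dict'] = exio_name_dict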
exio_name_dict

{'i67': 'Activities auxiliary to financial intermediation (67)',
 'i91': 'Activities of membership organisation n.e.c. (91)',
 'i62': 'Air transport (62)',
 'i27.42': 'Aluminium production',
 'i01.m': 'Animal products nec',
 'i90.3.a': 'Biogasification of food waste, incl. land application',
 'i90.3.b': 'Biogasification of paper, incl. land application',
 'i90.3.c': 'Biogasification of sewage slugde, incl. land application',
 'i27.5': 'Casting of metals',
 'i01.i': 'Cattle farming',
 'i24.4': 'Chemicals nec',
 'i41': 'Collection, purification and distribution of water (41)',
 'i90.4.a': 'Composting of food waste, incl. land application',
 'i90.4.b': 'Composting of paper and wood, incl. land application',
 'i72': 'Computer and related activities (72)',
 'i45': 'Construction (45)',
 'i27.44': 'Copper production',
 'i01.c': 'Cultivation of cereal grains nec',
 'i01.h': 'Cultivation of crops nec',
 'i01.e': 'Cultivation of oil seeds',
 'i01.a': 'Cultivation of paddy rice',
 'i01.g': 'Culti
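The update above pairs entries by position, which works only if both sorted sequences line up exactly. A minimal sketch of a name-based alternative follows, assuming the numeric suffix is the only systematic difference and that genuinely renamed sectors are listed by hand. The function `update_names_by_lookup`, the `renamed` argument, and the key `'code_lead'` are hypothetical and used only for illustration.

```python
import re

def update_names_by_lookup(code_to_old_name, current_names, renamed=None):
    """Map each sector code to its current name via the suffix-free name."""
    renamed = renamed or {}
    # Index current names by their name without a trailing ' (digits)' suffix
    lookup = {re.sub(r'\s*\(\d+\)$', '', name): name for name in current_names}
    # A KeyError here flags a sector that could not be matched
    return {code: lookup[renamed.get(old, old)]
            for code, old in code_to_old_name.items()}

old = {
    'i62': 'Air transport',
    'i27.42': 'Aluminium production',
    'code_lead': 'Re-processing of secondary lead into new lead',  # hypothetical key
}
current = [
    'Air transport (62)',
    'Aluminium production',
    'Re-processing of secondary lead into new lead, zinc and tin',
]
renamed = {
    'Re-processing of secondary lead into new lead':
        'Re-processing of secondary lead into new lead, zinc and tin',
}
updated = update_names_by_lookup(old, current, renamed)
print(updated)
```

Matching by name makes an unexpected mismatch fail loudly instead of silently assigning a neighbouring sector's name.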

#### 2. Check whether the ISIC codes in the dictionary cover ISIC codes in ecoinvent v3.7.1

Check ISIC codes covered by the dictionary.

In [17]:
ISIC_set = set(matching_dicts['isic_to_exio'].keys())
ISIC_set

{'0111',
 '0112',
 '0113',
 '0114',
 '0115',
 '0116',
 '0119',
 '0121',
 '0122',
 '0123',
 '0124',
 '0125',
 '0126',
 '0127',
 '0128',
 '0129',
 '0130',
 '0141',
 '0142',
 '0143',
 '0144',
 '0145',
 '0146',
 '0149',
 '0150',
 '0161',
 '0162',
 '0163',
 '0164',
 '0170',
 '0210',
 '0220',
 '0230',
 '0240',
 '0311',
 '0312',
 '0321',
 '0322',
 '0510',
 '0520',
 '0610',
 '0620',
 '0710',
 '0721',
 '0729',
 '0810',
 '0891',
 '0892',
 '0893',
 '0899',
 '0910',
 '0990',
 '1010',
 '1020',
 '1030',
 '1040',
 '1050',
 '1061',
 '1062',
 '1071',
 '1072',
 '1073',
 '1074',
 '1075',
 '1079',
 '1080',
 '1101',
 '1102',
 '1103',
 '1104',
 '1200',
 '1311',
 '1312',
 '1313',
 '1391',
 '1392',
 '1393',
 '1394',
 '1399',
 '1410',
 '1420',
 '1430',
 '1511',
 '1512',
 '1520',
 '1610',
 '1621',
 '1622',
 '1623',
 '1629',
 '1701',
 '1702',
 '1709',
 '1811',
 '1812',
 '1820',
 '1910',
 '1920',
 '2011',
 '2012',
 '2013',
 '2021',
 '2022',
 '2023',
 '2029',
 '2030',
 '2100',
 '2211',
 '2219',
 '2220',
 '2310',
 

Next, collect the ISIC codes used in ecoinvent v3.7.1.

In [18]:
with open('../../Data/lci_iot_imported/cutoff371_data_no_mg.pickle', 'rb') as i:
    datasets = pickle.load(i)
datasets = [d for d in datasets if d['activity type']!='production mix'] 

Burden-free Recycled Content cut-off datasets have no ISIC code, so:
- we assign them ISIC code 3900 (Remediation activities and other waste management services);
- this affects the origin shares but not the environmental impacts, because these datasets are burden-free.

In [None]:
for d in datasets:
    if 'Recycled Content' in d['name']:
        d['classifications'].append(('ISIC rev.4 ecoinvent', '3900:Remediation activities and other waste management services'))

for d in datasets:
    if 'Recycled Content' in d['name']:
        print(d['classifications'])
        print(len(d['exchanges']))
        print('------------')

In [21]:
def extract_isic_code(ds):
    # Example of ds['classifications']:
    # [('ISIC rev.4 ecoinvent', '1050:Manufacture of dairy products'),
    #  ('CPC', '22230: Yoghurt and other fermented or acidified milk and cream')]
    cl = ds['classifications']
    # Only production mixes have no ISIC entry; they were removed above
    isic_info = [c[1] for c in cl if c[0].startswith('ISIC')][0]  # e.g. '1050:Manufacture of dairy products'
    isic_code = isic_info.split(':')[0]  # e.g. '1050'
    isic_code = re.sub(r'\D', '', isic_code)  # remove any non-digit characters
    return isic_code

ISIC_set_new = set(extract_isic_code(ds) for ds in datasets)

In [22]:
# ISIC codes used in ecoinvent that are missing from the dictionary
ISIC_set_not4 = sorted(ISIC_set_new - ISIC_set)
print(len(ISIC_set_not4))
ISIC_set_not4

26


['012',
 '014',
 '03',
 '032',
 '051',
 '072',
 '09',
 '108',
 '131',
 '170',
 '19',
 '20',
 '23',
 '239',
 '242',
 '243',
 '25',
 '259',
 '27',
 '28',
 '35',
 '38',
 '381',
 '382',
 '49',
 '68']

Ecoinvent also uses 2-digit and 3-digit ISIC codes when an activity cannot be classified more precisely; all 26 codes missing from the dictionary are of this kind. <br>
In these cases, the 4-digit ISIC codes that start with the same digits will be used in further calculations.
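One possible reading of this rule is a simple prefix match against the 4-digit codes in the dictionary. The sketch below illustrates it under that assumption. `expand_isic_code` is a hypothetical helper, and `known` is a small illustrative subset of the codes listed above, not the full dictionary.

```python
def expand_isic_code(code, known_codes):
    """Return the codes in known_codes that start with the given code."""
    if code in known_codes:
        return [code]  # already a code the dictionary knows
    return sorted(c for c in known_codes if c.startswith(code))

# Illustrative subset of the 4-digit ISIC codes in the dictionary
known = {'0121', '0122', '0141', '0311', '0312', '0321', '0322'}

print(expand_isic_code('012', known))   # ['0121', '0122']
print(expand_isic_code('03', known))    # ['0311', '0312', '0321', '0322']
print(expand_isic_code('0141', known))  # ['0141']
```

An empty result would indicate a short code with no matching 4-digit code, which would need to be handled separately.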

#### 3. Save new matching dictionary

In [23]:
with open("../../Data/sector_matching/matching_dicts_new.p", 'wb') as o:
    pickle.dump(matching_dicts, o)

#### 4. Match ISIC code to sector names in exiobase

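The current dictionary maps an ISIC code to exiobase sector codes, and each exiobase sector code to a sector name. Combine the two steps into a direct mapping from ISIC code to exiobase sector names.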

In [24]:
isic_to_exio = matching_dicts['isic_to_exio']
exio_name_dict = matching_dicts['exio_name_dict']
isic_to_exio_name = {isic:[exio_name_dict[c] for c in exiocode] for isic,exiocode in isic_to_exio.items()}
isic_to_exio_name

{'2100': ['Chemicals nec'],
 '2021': ['P- and other fertiliser'],
 '2823': ['Manufacture of machinery and equipment n.e.c. (29)'],
 '1061': ['Processing of Food products nec', 'Processed rice'],
 '1080': ['Processing of Food products nec'],
 '8110': ['Real estate activities (70)'],
 '6391': ['Recreational, cultural and sporting activities (92)'],
 '2220': ['Manufacture of rubber and plastic products (25)'],
 '9511': ['Computer and related activities (72)'],
 '3290': ['Manufacture of furniture; manufacturing n.e.c. (36)'],
 '0321': ['Fishing, operating of fish hatcheries and fish farms; service activities incidental to fishing (05)'],
 '9420': ['Activities of membership organisation n.e.c. (91)'],
 '3250': ['Manufacture of furniture; manufacturing n.e.c. (36)'],
 '0114': ['Cultivation of sugar cane, sugar beet'],
 '9492': ['Activities of membership organisation n.e.c. (91)'],
 '0210': ['Forestry, logging and related service activities (02)'],
 '8710': ['Health and social work (85)'],
 '

In [25]:
with open("../../Data/sector_matching/isic_TO_exio_name.p", 'wb') as o:
    pickle.dump(isic_to_exio_name, o)