# Characterization of stressors

Stressor characterization allows the calculation of environmental and social impacts of economic activities. It transforms raw stressor data into meaningful impact indicators.

Pymrio implements an innovative string-matching approach for characterization.
This method matches stressors in the characterization table (provided in long format) with available stressors in the MRIO system. This brings the following benefits:

- Ensures stressor correspondence across the MRIO system and characterization table
- Performs automatic unit verification
- Works regardless of entry order in the characterization table
- Handles characterization tables that include factors for stressors not present in the satellite account
- Efficiently manages region and sector-specific characterization factors
- Enables characterization across different extensions

This contrasts with traditional approaches that rely on matrix multiplication between stressor and characterization matrices, requiring strict 1:1 correspondence between matrix dimensions and precise ordering of entries.
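To make the contrast concrete, here is a minimal plain-Python sketch of key-based matching for a single region/sector column. It is not Pymrio's actual implementation, and the helper name `characterize_by_key` is hypothetical. The numbers are those of reg1/food in the test MRIO and the test characterization table used in the examples below.

```python
# Illustrative sketch only: key-based matching of long-format factors.
# `characterize_by_key` is a hypothetical helper, not part of Pymrio.

# Stressor values for one column (reg1 / food in the test MRIO), in kg
stressors = {
    ("emission_type1", "air"): 1848064.8,
    ("emission_type2", "water"): 139250.47,
}

# Long-format characterization rows: (stressor, compartment, impact, factor)
factors = [
    ("emission_type3", "land", "total emissions", 1.0),  # not in the account
    ("emission_type2", "water", "air water impact", 0.001),
    ("emission_type1", "air", "total emissions", 1.0),
    ("emission_type1", "air", "air water impact", 0.002),
    ("emission_type2", "water", "total emissions", 1.0),
]


def characterize_by_key(stressors, factors):
    """Sum factor * stressor per impact; unmatched stressors count as 0."""
    impacts = {}
    for stressor, compartment, impact, factor in factors:
        value = stressors.get((stressor, compartment), 0.0)
        impacts[impact] = impacts.get(impact, 0.0) + factor * value
    return impacts


result = characterize_by_key(stressors, factors)
# Row order does not matter: reversing the table gives the same impacts
# (up to floating-point rounding).
result_reversed = characterize_by_key(stressors, list(reversed(factors)))
```

Because every factor is looked up by its key, the table can list rows in any order and can contain stressors the account does not have, which a fixed-shape matrix product would not tolerate.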

The characterization functionality is available both as an extension object method and as a top-level function accepting complete MRIO objects or extension collections.

The following examples show how to use both, starting with a simple case and then advancing to more complex cases with region-specific factors.

## Basic Example

For this example we use the test MRIO included in Pymrio. We also need
the Pandas library for loading the characterization table and pathlib for some folder manipulation.

In [None]:
from pathlib import Path

In [None]:
import pandas as pd

In [1]:
import pymrio
from pymrio.core.constants import PYMRIO_PATH  # noqa

To load the test MRIO we use:

In [2]:
io = pymrio.load_test()

and the characterization table with some dummy factors can be loaded with

In [3]:
charact_table = pd.read_csv(
    (PYMRIO_PATH["test_mrio"] / Path("concordance") / "emissions_charact.tsv"),
    sep="\t",
)
charact_table

,stressor,compartment,impact,factor,impact_unit,stressor_unit
0,emission_type1,air,air water impact,0.002,t,kg
1,emission_type2,water,air water impact,0.001,t,kg
2,emission_type1,air,total emissions,1.0,kg,kg
3,emission_type2,water,total emissions,1.0,kg,kg
4,emission_type3,land,total emissions,1.0,kg,kg
5,emission_type1,air,total air emissions,0.001,t,kg


This table contains the columns 'stressor' and 'compartment' which correspond
to the index names of the test_mrio emission satellite accounts:

In [4]:
io.emissions.F

,region,reg1,reg1,reg1,reg1,reg1,reg1,reg1,reg1,reg2,reg2,...,reg5,reg5,reg6,reg6,reg6,reg6,reg6,reg6,reg6,reg6
,sector,food,mining,manufactoring,electricity,construction,trade,transport,other,food,mining,...,transport,other,food,mining,manufactoring,electricity,construction,trade,transport,other
stressor,compartment
emission_type1,air,1848064.8,986448.09,23613787.0,28139100.0,2584141.8,4132656.3,21766987.0,7842090.6,1697937.3,347378.15,...,42299319,10773826.0,15777996.0,6420955.5,113172450.0,56022534.0,4861838.5,18195621,47046542.0,21632868
emission_type2,water,139250.47,22343.295,763569.18,273981.55,317396.51,1254477.8,1012999.1,2449178.0,204835.44,29463.944,...,4199841,7191006.3,4826108.1,1865625.1,12700193.0,753213.7,2699288.3,13892313,8765784.3,16782553


These index names and column names need to be identical in order to match
characterization factors to the stressors.

The names of the other columns can be passed to the characterization method. By default the method assumes the following column names:

- impact: name of the characterization/impact
- factor: the numerical (float) multiplication value for a specific stressor to derive the impact/characterized account
- impact_unit: the unit of the calculated characterization/impact
- stressor_unit: the unit of the stressor in the extension

Alternative names can be passed through the parameters
*characterized_name_column*, *characterization_factors_column*, *characterized_unit_column* and *orig_unit_column*.
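As an alternative to passing these parameters, a table with different headers can be renamed to the default column names before characterization. The sketch below uses only pandas. The custom headers (`category`, `cf`, `category_unit`, `unit`) are made up for illustration.

```python
import pandas as pd

# Hypothetical table whose headers differ from the default column names
custom_table = pd.DataFrame(
    {
        "stressor": ["emission_type1", "emission_type2"],
        "compartment": ["air", "water"],
        "category": ["total emissions", "total emissions"],
        "cf": [1.0, 1.0],
        "category_unit": ["kg", "kg"],
        "unit": ["kg", "kg"],
    }
)

# Map the (made-up) custom headers onto the default names listed above
default_names = {
    "category": "impact",
    "cf": "factor",
    "category_unit": "impact_unit",
    "unit": "stressor_unit",
}
renamed_table = custom_table.rename(columns=default_names)
```

Renaming keeps the call to the characterization method free of extra arguments, whereas passing the column parameters leaves the original table untouched. Either route should work.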


To calculate the characterization we use

In [5]:
char_emis = io.emissions.characterize(charact_table, name="impacts")

The parameter *name* is optional. If omitted, the name is set to
extension_name + _characterized. If the passed name starts with an
underscore, the returned name is the name of the original extension concatenated with the passed name.
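The naming rule can be summarized in a few lines of plain Python. This is only a sketch of the rule as described here, with a hypothetical helper `resolve_name` and a made-up extension name. It is not code taken from Pymrio.

```python
def resolve_name(extension_name, name=None):
    """Sketch of the naming rule for a characterized extension."""
    if name is None:
        # No name passed: append the default suffix
        return extension_name + "_characterized"
    if name.startswith("_"):
        # Leading underscore: treat the passed name as a suffix
        return extension_name + name
    # Otherwise the passed name is used as is
    return name


default_name = resolve_name("emissions")
suffixed_name = resolve_name("emissions", "_impacts")
explicit_name = resolve_name("emissions", "impacts")
```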

The return value is a named tuple with *validation* and *extension* as attributes.

In [6]:
print(char_emis.extension)

Extension impacts with parameters: name, F, F_Y, unit


In [7]:
char_emis.validation

,stressor,compartment,impact,factor,impact_unit,stressor_unit,error_unit_impact,error_unit_stressor,error_missing_stressor
0,emission_type1,air,air water impact,0.002,t,kg,False,False,False
1,emission_type1,air,total emissions,1.0,kg,kg,False,False,False
2,emission_type1,air,total air emissions,0.001,t,kg,False,False,False
3,emission_type2,water,air water impact,0.001,t,kg,False,False,False
4,emission_type2,water,total emissions,1.0,kg,kg,False,False,False
5,emission_type3,land,total emissions,1.0,kg,kg,False,False,True


Checking the validation table is a recommended step that ensures accuracy and completeness before impact calculations. The validation process helps identify potential issues such as:

- Missing characterization factors for specific region/sector/stressor combinations
- Spelling mistakes or inconsistencies in stressor, sector, or region names
- Unit mismatches between the MRIO system and characterization factors
- Incomplete coverage that could affect impact assessment results

By systematically checking these elements, users can avoid calculation errors and ensure their impact assessment captures all relevant environmental and social dimensions with the proper characterization factors.
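For larger tables it can help to show only the rows with at least one error flag. The following pandas sketch assumes that all flag columns start with `error_`, as in the validation table above. The small DataFrame is a stand-in built by hand, not output produced by Pymrio.

```python
import pandas as pd

# Hand-built stand-in for a validation table (subset of the columns above)
validation = pd.DataFrame(
    {
        "stressor": ["emission_type1", "emission_type2", "emission_type3"],
        "compartment": ["air", "water", "land"],
        "impact": ["total emissions"] * 3,
        "error_unit_impact": [False, False, False],
        "error_unit_stressor": [False, False, False],
        "error_missing_stressor": [False, False, True],
    }
)

# Collect all flag columns and keep the rows where any flag is set
error_columns = [col for col in validation.columns if col.startswith("error_")]
flagged = validation[validation[error_columns].any(axis=1)]
```

An empty `flagged` table means that no row raised a flag. A non-empty one points directly at the entries to inspect.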


In the current case, the *charact_table* contains a characterization called 'total
emissions', for which the calculation requires a stressor not present in the
satellite account. This is indicated in the validation table in the *error_missing_stressor* column.
The calculation can proceed, but the missing stressor is assumed to be 0 in all impacts that use it.

It is also possible to run only the validation, before doing any calculation, with

In [8]:
only_val = io.emissions.characterize(charact_table, only_validation=True)
only_val.validation

,stressor,compartment,impact,factor,impact_unit,stressor_unit,error_unit_impact,error_unit_stressor,error_missing_stressor
0,emission_type1,air,air water impact,0.002,t,kg,False,False,False
1,emission_type1,air,total emissions,1.0,kg,kg,False,False,False
2,emission_type1,air,total air emissions,0.001,t,kg,False,False,False
3,emission_type2,water,air water impact,0.001,t,kg,False,False,False
4,emission_type2,water,total emissions,1.0,kg,kg,False,False,False
5,emission_type3,land,total emissions,1.0,kg,kg,False,False,True


In that case the extension attribute is set to None.
The same applies if a characterization needs to be aborted due to unit inconsistencies.

If everything works as expected, the extension can be attached to the MRIO object

In [9]:
io.impacts = char_emis.extension

and used for subsequent calculations:

In [10]:
io.calc_all()
io.impacts.D_cba

region,reg1,reg1,reg1,reg1,reg1,reg1,reg1,reg1,reg2,reg2,...,reg5,reg5,reg6,reg6,reg6,reg6,reg6,reg6,reg6,reg6
sector,food,mining,manufactoring,electricity,construction,trade,transport,other,food,mining,...,transport,other,food,mining,manufactoring,electricity,construction,trade,transport,other
impact
air water impact,4354.677,384.125264,211698.4,23912.31,7032.641,8548.388,22000.5,40123.55,3800.328,42.024811,...,88433.84,30080.44,34765.28,3227.857,153582.9,74236.16,4580.343,139321.8,104944.7,118394.9
total air emissions,2056.183,179.423536,97493.0,11887.59,3342.906,3885.884,10750.27,15821.52,1793.338,19.145605,...,42095.05,11386.61,15172.35,1345.318,71450.75,36831.67,1836.696,42415.68,48054.09,36022.98
total emissions,2298494.0,204701.727979,114205400.0,12024720.0,3689735.0,4662504.0,11250230.0,24302030.0,2006991.0,22879.206385,...,46338790.0,18693820.0,19592930.0,1882540.0,82132190.0,37404490.0,2743647.0,96906130.0,56890570.0,82371960.0


Note that units are checked against the unit specification of the extension.
Thus, any mismatch of units will abort the calculation. The validation table
helps to identify the issue.

In [11]:
charact_table.loc[charact_table.stressor == "emission_type1", "stressor_unit"] = "t"

In [12]:
ret_error = io.emissions.characterize(charact_table)



In [13]:
ret_error.extension

In [14]:
ret_error.validation

,stressor,compartment,impact,factor,impact_unit,stressor_unit,error_unit_impact,error_unit_stressor,error_missing_stressor
0,emission_type1,air,air water impact,0.002,t,t,False,True,False
1,emission_type1,air,total emissions,1.0,kg,t,False,True,False
2,emission_type1,air,total air emissions,0.001,t,t,False,True,False
3,emission_type2,water,air water impact,0.001,t,kg,False,False,False
4,emission_type2,water,total emissions,1.0,kg,kg,False,False,False
5,emission_type3,land,total emissions,1.0,kg,kg,False,False,True


The *error_unit_stressor* column indicates the stressors with the unit mismatch.
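The logic behind these flags can be sketched in plain Python. The sketch assumes that units are compared as plain strings without any conversion. The helper `check_stressor_units` is hypothetical and only mimics the flags shown in the validation table.

```python
# Units declared by the extension, keyed by (stressor, compartment)
extension_units = {
    ("emission_type1", "air"): "kg",
    ("emission_type2", "water"): "kg",
}

# (stressor, compartment, stressor_unit) rows from a characterization table
table_rows = [
    ("emission_type1", "air", "t"),  # deliberately wrong unit
    ("emission_type2", "water", "kg"),
    ("emission_type3", "land", "kg"),  # stressor missing in the extension
]


def check_stressor_units(rows, units):
    """Return one report entry per row with missing-stressor and unit flags."""
    report = []
    for stressor, compartment, unit in rows:
        key = (stressor, compartment)
        missing = key not in units
        report.append(
            {
                "stressor": stressor,
                "error_missing_stressor": missing,
                # A unit can only be compared for stressors that exist
                "error_unit_stressor": (not missing) and units[key] != unit,
            }
        )
    return report


report = check_stressor_units(table_rows, extension_units)
# Any unit error blocks the calculation; a missing stressor does not
can_calculate = not any(row["error_unit_stressor"] for row in report)
```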

## Regional specific characterization factors

Here we use a table of region-specific characterization factors.
The factors it contains are the same as in the basic example, and we
will modify them after loading.
We will also investigate cases with missing data or conflicting units.
The same principles can be used for sector-specific characterization factors.

We use the same test MRIO system as before:

In [15]:
io = pymrio.load_test()

with the region-specific characterization factors loaded from

In [16]:
charact_table_reg = pd.read_csv(
    (PYMRIO_PATH["test_mrio"] / Path("concordance") / "emissions_charact_reg_spec.tsv"),
    sep="\t",
)
charact_table_reg

,region,stressor,compartment,impact,factor,impact_unit,stressor_unit
0,reg1,emission_type1,air,air water impact,0.002,t,kg
1,reg1,emission_type2,water,air water impact,0.001,t,kg
2,reg1,emission_type1,air,total emissions,1.0,kg,kg
3,reg1,emission_type2,water,total emissions,1.0,kg,kg
4,reg1,emission_type3,land,total emissions,1.0,kg,kg
5,reg1,emission_type1,air,total air emissions,0.001,t,kg
6,reg2,emission_type1,air,air water impact,0.002,t,kg
7,reg2,emission_type2,water,air water impact,0.001,t,kg
8,reg2,emission_type1,air,total emissions,1.0,kg,kg
9,reg2,emission_type2,water,total emissions,1.0,kg,kg


Compared with the previous table (charact_table), this table contains an additional
column *region*, which makes the factors region-specific.
At this point the factors are still the same as before. We characterize the emissions with

In [17]:
char_reg = io.emissions.characterize(charact_table_reg)

For region-specific characterization, the validation table contains information per region:

In [18]:
char_reg.validation

,stressor,compartment,region,impact,factor,impact_unit,stressor_unit,error_unit_impact,error_unit_stressor,error_missing_stressor,error_missing_region
0,emission_type1,air,reg1,air water impact,0.002,t,kg,False,False,False,False
1,emission_type1,air,reg1,total emissions,1.0,kg,kg,False,False,False,False
2,emission_type1,air,reg1,total air emissions,0.001,t,kg,False,False,False,False
3,emission_type1,air,reg2,air water impact,0.002,t,kg,False,False,False,False
4,emission_type1,air,reg2,total emissions,1.0,kg,kg,False,False,False,False
5,emission_type1,air,reg2,total air emissions,0.001,t,kg,False,False,False,False
6,emission_type1,air,reg3,air water impact,0.002,t,kg,False,False,False,False
7,emission_type1,air,reg3,total emissions,1.0,kg,kg,False,False,False,False
8,emission_type1,air,reg3,total air emissions,0.001,t,kg,False,False,False,False
9,emission_type1,air,reg4,air water impact,0.002,t,kg,False,False,False,False


The extension is again available in the extension attribute:

In [19]:
char_reg.extension.F

region,reg1,reg1,reg1,reg1,reg1,reg1,reg1,reg1,reg2,reg2,...,reg5,reg5,reg6,reg6,reg6,reg6,reg6,reg6,reg6,reg6
sector,food,mining,manufactoring,electricity,construction,trade,transport,other,food,mining,...,transport,other,food,mining,manufactoring,electricity,construction,trade,transport,other
impact
air water impact,3835.38,1995.239,47991.14,56552.18,5485.68,9519.79,44546.97,18133.36,3600.71,724.220244,...,88798.48,28738.66,36382.1,14707.54,239045.1,112798.3,12422.97,50283.56,102858.9,60048.29
total air emissions,1848.065,986.4481,23613.79,28139.1,2584.142,4132.656,21766.99,7842.091,1697.937,347.37815,...,42299.32,10773.83,15778.0,6420.956,113172.4,56022.53,4861.838,18195.62,47046.54,21632.87
total emissions,1987315.0,1008791.0,24377360.0,28413080.0,2901538.0,5387134.0,22779990.0,10291270.0,1902773.0,376842.094,...,46499160.0,17964830.0,20604100.0,8286581.0,125872600.0,56775750.0,7561127.0,32087930.0,55812330.0,38415420.0


Since the factors are unchanged, this gives the same result as a characterization with the non-regional table. To highlight regional
specificity, we double the total emission factors of region 3.

In [20]:
# Select the 'total emissions' factors of region 3 and double them in place
reg3_total = (charact_table_reg.region == "reg3") & (
    charact_table_reg.impact == "total emissions"
)
charact_table_reg.loc[reg3_total, "factor"] *= 2

and calculate the new impacts

In [21]:
char_reg_dbl = io.emissions.characterize(charact_table_reg).extension
char_reg_dbl.F.loc["total emissions"]

region  sector       
reg1    food             1.987315e+06
        mining           1.008791e+06
        manufactoring    2.437736e+07
        electricity      2.841308e+07
        construction     2.901538e+06
        trade            5.387134e+06
        transport        2.277999e+07
        other            1.029127e+07
reg2    food             1.902773e+06
        mining           3.768421e+05
        manufactoring    1.598022e+07
        electricity      1.660779e+07
        construction     1.868660e+06
        trade            3.511220e+06
        transport        6.836824e+06
        other            6.185187e+06
reg3    food             1.100035e+07
        mining           9.531717e+06
        manufactoring    2.150874e+08
        electricity      1.503010e+08
        construction     3.996900e+07
        trade            1.213563e+08
        transport        1.301629e+08
        other            3.714520e+08
reg4    food             6.479508e+06
        mining           9.5

compared to

In [22]:
char_reg.extension.F.loc["total emissions"]

region  sector       
reg1    food             1.987315e+06
        mining           1.008791e+06
        manufactoring    2.437736e+07
        electricity      2.841308e+07
        construction     2.901538e+06
        trade            5.387134e+06
        transport        2.277999e+07
        other            1.029127e+07
reg2    food             1.902773e+06
        mining           3.768421e+05
        manufactoring    1.598022e+07
        electricity      1.660779e+07
        construction     1.868660e+06
        trade            3.511220e+06
        transport        6.836824e+06
        other            6.185187e+06
reg3    food             5.500174e+06
        mining           4.765858e+06
        manufactoring    1.075437e+08
        electricity      7.515049e+07
        construction     1.998450e+07
        trade            6.067817e+07
        transport        6.508145e+07
        other            1.857260e+08
reg4    food             6.479508e+06
        mining           9.5

## Some more notes on validation

We can introduce some more inconsistencies into the table to showcase the validation process.
First, a unit error in the stressors:

In [23]:
charact_table_reg.loc[
    (charact_table_reg.region == "reg4")
    & (charact_table_reg.stressor == "emission_type1"),
    "stressor_unit",
] = "s"

Then some inconsistent impact units:

In [24]:
charact_table_reg.loc[
    (charact_table_reg.region == "reg2")
    & (charact_table_reg.impact == "total emissions"),
    "impact_unit",
] = "kt"

A spelling mistake in the region name for one stressor in region 2:

In [25]:
charact_table_reg.loc[
    (charact_table_reg.region == "reg2")
    & (charact_table_reg.stressor == "emission_type2"),
    "region",
] = "reg22"

And data for an additional region that is not available in the extension:

In [26]:
# Copy the first row so that the assignment does not act on a slice of the original
new_data = charact_table_reg.iloc[[0]].copy()
new_data.loc[:, "region"] = "reg_additional"
charact_table_reg = charact_table_reg.merge(new_data, how="outer")

In [27]:
report = io.emissions.characterize(charact_table_reg, only_validation=True).validation

The unit errors are reported for each row, and the additional region not present in the extension is reported under *error_missing_region*.
The column *error_unit_impact* indicates the impacts with inconsistent units:

In [28]:
report[report.stressor == "emission_type1"]

,stressor,compartment,region,impact,factor,impact_unit,stressor_unit,error_unit_impact,error_unit_stressor,error_missing_stressor,error_missing_region
0,emission_type1,air,reg1,air water impact,0.002,t,kg,False,False,False,False
1,emission_type1,air,reg1,total air emissions,0.001,t,kg,False,False,False,False
2,emission_type1,air,reg1,total emissions,1.0,kg,kg,True,False,False,False
3,emission_type1,air,reg2,air water impact,0.002,t,kg,False,False,False,False
4,emission_type1,air,reg2,total air emissions,0.001,t,kg,False,False,False,False
5,emission_type1,air,reg2,total emissions,1.0,kt,kg,True,False,False,False
6,emission_type1,air,reg3,air water impact,0.002,t,kg,False,False,False,False
7,emission_type1,air,reg3,total air emissions,0.001,t,kg,False,False,False,False
8,emission_type1,air,reg3,total emissions,2.0,kg,kg,True,False,False,False
9,emission_type1,air,reg4,air water impact,0.002,t,s,False,True,False,False


For emission_type2, *error_missing_region* is True for the whole stressor: because of the spelling mistake, reg2 is no longer present in the factor table,
so not all regions are covered by the specification.
Again, the column *error_unit_impact* indicates the impacts with inconsistent units:

In [29]:
report[report.stressor == "emission_type2"]

,stressor,compartment,region,impact,factor,impact_unit,stressor_unit,error_unit_impact,error_unit_stressor,error_missing_stressor,error_missing_region
19,emission_type2,water,reg1,air water impact,0.001,t,kg,False,False,False,True
20,emission_type2,water,reg1,total emissions,1.0,kg,kg,True,False,False,True
21,emission_type2,water,reg22,air water impact,0.001,t,kg,False,False,False,True
22,emission_type2,water,reg22,total emissions,1.0,kt,kg,True,False,False,True
23,emission_type2,water,reg3,air water impact,0.001,t,kg,False,False,False,True
24,emission_type2,water,reg3,total emissions,2.0,kg,kg,True,False,False,True
25,emission_type2,water,reg4,air water impact,0.001,t,kg,False,False,False,True
26,emission_type2,water,reg4,total emissions,1.0,kg,kg,True,False,False,True
27,emission_type2,water,reg5,air water impact,0.001,t,kg,False,False,False,True
28,emission_type2,water,reg5,total emissions,1.0,kg,kg,True,False,False,True
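When *error_missing_region* is set for a whole stressor, comparing the region names on both sides usually reveals the cause. The following plain-Python sketch uses the six regions of the test MRIO and the misspelled table from above. The variable names are chosen for illustration only.

```python
# Regions available in the test MRIO
mrio_regions = {"reg1", "reg2", "reg3", "reg4", "reg5", "reg6"}

# Regions listed for emission_type2 in the table, after the spelling mistake
table_regions = {"reg1", "reg22", "reg3", "reg4", "reg5", "reg6"}

# Names in the table that the MRIO does not know (likely typos or extra data)
unknown_in_table = sorted(table_regions - mrio_regions)

# MRIO regions for which the table provides no factors
not_covered = sorted(mrio_regions - table_regions)
```

Here `unknown_in_table` contains `reg22` and `not_covered` contains `reg2`, which points at the misspelled entry.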


## Characterization across multiple extensions

In addition to characterizing a single extension, Pymrio also offers functionality
to apply characterization across multiple extensions simultaneously. This is useful
when your impacts depend on stressors that are distributed across different satellite accounts.

Let's demonstrate this using our test MRIO system:

In [30]:
io = pymrio.load_test()

First, let's create multiple extensions from our emissions data to better showcase this functionality:

In [31]:
# Create copies of the emissions extension with different names and data subsets
io.water = io.emissions.copy("water")
io.air = io.emissions.copy("air")

In [32]:
# Keep only water emissions in the water extension
io.water.F = io.water.F.loc[[("emission_type2", "water")], :]
io.water.F_Y = io.water.F_Y.loc[[("emission_type2", "water")], :]

In [33]:
# Keep only air emissions in the air extension
io.air.F = io.air.F.loc[[("emission_type1", "air")], :]
io.air.F_Y = io.air.F_Y.loc[[("emission_type1", "air")], :]

Examining the extensions:

In [34]:
io.air.F

,region,reg1,reg1,reg1,reg1,reg1,reg1,reg1,reg1,reg2,reg2,...,reg5,reg5,reg6,reg6,reg6,reg6,reg6,reg6,reg6,reg6
,sector,food,mining,manufactoring,electricity,construction,trade,transport,other,food,mining,...,transport,other,food,mining,manufactoring,electricity,construction,trade,transport,other
stressor,compartment
emission_type1,air,1848064.8,986448.09,23613787.0,28139100.0,2584141.8,4132656.3,21766987.0,7842090.6,1697937.3,347378.15,...,42299319,10773826.0,15777996.0,6420955.5,113172450.0,56022534.0,4861838.5,18195621,47046542.0,21632868


In [35]:
io.water.F

,region,reg1,reg1,reg1,reg1,reg1,reg1,reg1,reg1,reg2,reg2,...,reg5,reg5,reg6,reg6,reg6,reg6,reg6,reg6,reg6,reg6
,sector,food,mining,manufactoring,electricity,construction,trade,transport,other,food,mining,...,transport,other,food,mining,manufactoring,electricity,construction,trade,transport,other
stressor,compartment
emission_type2,water,139250.47,22343.295,763569.18,273981.55,317396.51,1254477.8,1012999.1,2449178.0,204835.44,29463.944,...,4199841,7191006.3,4826108.1,1865625.1,12700193.0,753213.7,2699288.3,13892313,8765784.3,16782553


To characterize across multiple extensions, we need a characterization table that includes
an 'extension' column specifying which extension each stressor belongs to:

In [36]:
# Start with our regional characterization table
factors_reg_spec = pd.read_csv(
    (PYMRIO_PATH["test_mrio"] / Path("concordance") / "emissions_charact_reg_spec.tsv"),
    sep="\t",
)

In [37]:
# Create a copy and add an extension column based on compartment
factors_reg_ext = factors_reg_spec.copy()
factors_reg_ext.loc[:, "extension"] = factors_reg_ext.loc[:, "compartment"]

In [38]:
# Filter out any entries that don't correspond to our extensions
factors_reg_ext = factors_reg_ext[factors_reg_ext.compartment.isin(["air", "water"])]

In [39]:
# Examine our multi-extension characterization table:
factors_reg_ext.head(10)

,region,stressor,compartment,impact,factor,impact_unit,stressor_unit,extension
0,reg1,emission_type1,air,air water impact,0.002,t,kg,air
1,reg1,emission_type2,water,air water impact,0.001,t,kg,water
2,reg1,emission_type1,air,total emissions,1.0,kg,kg,air
3,reg1,emission_type2,water,total emissions,1.0,kg,kg,water
5,reg1,emission_type1,air,total air emissions,0.001,t,kg,air
6,reg2,emission_type1,air,air water impact,0.002,t,kg,air
7,reg2,emission_type2,water,air water impact,0.001,t,kg,water
8,reg2,emission_type1,air,total emissions,1.0,kg,kg,air
9,reg2,emission_type2,water,total emissions,1.0,kg,kg,water
11,reg2,emission_type1,air,total air emissions,0.001,t,kg,air


There are two ways to characterize across multiple extensions:

In [40]:
# 1. Using the top-level function with specific extensions:
ex_reg_multi = pymrio.extension_characterize(
    io.air,
    io.water,  # List the extensions you want to include
    factors=factors_reg_ext,
    new_extension_name="multi_top_level",
).extension

In [41]:
# 2. Using the MRIO object's method which automatically includes all available extensions:
ex_reg_mrio = io.extension_characterize(
    factors=factors_reg_ext, new_extension_name="multi_mrio_method"
).extension

In [42]:
# Both approaches produce the same result when the same extensions are involved:
print("Are the characterized F matrices equal?", ex_reg_multi.F.equals(ex_reg_mrio.F))

Are the characterized F matrices equal? True


In [43]:
# Add the extension to our MRIO and calculate results:
io.multi = ex_reg_multi
io.calc_all()
io.multi.D_cba

region,reg1,reg1,reg1,reg1,reg1,reg1,reg1,reg1,reg2,reg2,...,reg5,reg5,reg6,reg6,reg6,reg6,reg6,reg6,reg6,reg6
sector,food,mining,manufactoring,electricity,construction,trade,transport,other,food,mining,...,transport,other,food,mining,manufactoring,electricity,construction,trade,transport,other
impact
air water impact,4354.677,384.125264,211698.4,23912.31,7032.641,8548.388,22000.5,40123.55,3800.328,42.024811,...,88433.84,30080.44,34765.28,3227.857,153582.9,74236.16,4580.343,139321.8,104944.7,118394.9
total air emissions,2056.183,179.423536,97493.0,11887.59,3342.906,3885.884,10750.27,15821.52,1793.338,19.145605,...,42095.05,11386.61,15172.35,1345.318,71450.75,36831.67,1836.696,42415.68,48054.09,36022.98
total emissions,2298494.0,204701.727979,114205400.0,12024720.0,3689735.0,4662504.0,11250230.0,24302030.0,2006991.0,22879.206385,...,46338790.0,18693820.0,19592930.0,1882540.0,82132190.0,37404490.0,2743647.0,96906130.0,56890570.0,82371960.0


In [44]:
# As with single extension characterization, validation is crucial:
validation_report = pymrio.extension_characterize(
    io.air, io.water, factors=factors_reg_ext, only_validation=True
).validation

In [45]:
print("Validation report:")
validation_report

Validation report:


,stressor,compartment,region,impact,factor,impact_unit,stressor_unit,error_unit_impact,error_unit_stressor,error_missing_stressor,error_missing_region
0,emission_type1,air,reg1,air water impact,0.002,t,kg,False,False,False,False
1,emission_type1,air,reg1,total emissions,1.0,kg,kg,False,False,False,False
2,emission_type1,air,reg1,total air emissions,0.001,t,kg,False,False,False,False
3,emission_type1,air,reg2,air water impact,0.002,t,kg,False,False,False,False
4,emission_type1,air,reg2,total emissions,1.0,kg,kg,False,False,False,False
5,emission_type1,air,reg2,total air emissions,0.001,t,kg,False,False,False,False
6,emission_type1,air,reg3,air water impact,0.002,t,kg,False,False,False,False
7,emission_type1,air,reg3,total emissions,1.0,kg,kg,False,False,False,False
8,emission_type1,air,reg3,total air emissions,0.001,t,kg,False,False,False,False
9,emission_type1,air,reg4,air water impact,0.002,t,kg,False,False,False,False


The validation process helps identify issues such as:

- Missing stressors or extensions
- Unit inconsistencies
- Missing regions or sectors
- Extension name mismatches

Important considerations for multi-extension characterization:

1. The 'extension' column in your characterization table must match the extension names in your MRIO
2. All extensions must have compatible region and sector classifications
3. Units must be consistent across extensions and characterization factors
4. If a characterization table references an extension that doesn't exist,
   it will be noted in the validation report
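The role of the 'extension' column can be illustrated with a small plain-Python sketch for one region/sector column (reg1/food of the test MRIO). It is a conceptual illustration, not Pymrio's implementation. In particular, skipping rows with an unknown extension is an assumption made for this sketch, and the extension name `soil` is made up.

```python
# Stressor values for reg1 / food, split over two extensions (kg)
extensions = {
    "air": {("emission_type1", "air"): 1848064.8},
    "water": {("emission_type2", "water"): 139250.47},
}

# Long-format rows: (extension, stressor, compartment, impact, factor)
factors = [
    ("air", "emission_type1", "air", "total emissions", 1.0),
    ("water", "emission_type2", "water", "total emissions", 1.0),
    ("soil", "emission_type3", "land", "total emissions", 1.0),  # unknown extension
]

impacts = {}
unknown_extensions = set()
for ext_name, stressor, compartment, impact, factor in factors:
    if ext_name not in extensions:
        # Assumption for this sketch: record the name and skip the row
        unknown_extensions.add(ext_name)
        continue
    # Route the lookup to the extension named in the row
    value = extensions[ext_name].get((stressor, compartment), 0.0)
    impacts[impact] = impacts.get(impact, 0.0) + factor * value
```

Checking the collected `unknown_extensions` before running a full characterization is a cheap way to catch naming mismatches early.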