# Using the aggregation functionality of pymrio

Pymrio offers several ways to aggregate an existing MRIO system.
This section presents each of them in turn, using the test MRIO system included in pymrio.
The same approach applies to real-life MRIOs.

Some of the examples rely on the [country converter coco](https://github.com/konstantinstadler/country_converter). The minimum required version is coco >= 0.6; install the latest version with
```
pip install country_converter --upgrade
```
Coco can also be installed from the Anaconda Cloud; see the coco readme for further information.

## Loading the test MRIO

First, we load and explore the test MRIO included in pymrio:

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import numpy as np
import pymrio

In [3]:
io = pymrio.load_test()
io.calc_all()

<pymrio.core.mriosystem.IOSystem at 0x7fe122c91278>

In [4]:
print("Sectors: {sec},\nRegions: {reg}".format(sec=io.get_sectors().tolist(), reg=io.get_regions().tolist()))

Sectors: ['food', 'mining', 'manufactoring', 'electricity', 'construction', 'trade', 'transport', 'other'],
Regions: ['reg1', 'reg2', 'reg3', 'reg4', 'reg5', 'reg6']


## TODO Pre- or post-calculation aggregation

## Aggregation using a numerical concordance matrix

This is the standard way to aggregate MRIOs when you work in Matlab.
To do so, we need to set up a concordance matrix in which the columns correspond to the original classification and the rows to the aggregated one.

In [5]:
sec_agg_matrix = np.array([
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 1]
    ])

reg_agg_matrix = np.array([
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1]
    ])

In [6]:
io.aggregate(region_agg=reg_agg_matrix, sector_agg=sec_agg_matrix)

<pymrio.core.mriosystem.IOSystem at 0x7fe122c91278>

In [7]:
print("Sectors: {sec},\nRegions: {reg}".format(sec=io.get_sectors().tolist(), reg=io.get_regions().tolist()))

Sectors: ['sec0', 'sec1', 'sec2'],
Regions: ['reg0', 'reg1']


In [8]:
io.calc_all()

<pymrio.core.mriosystem.IOSystem at 0x7fe122c91278>

In [9]:
io.emissions.D_fp

| stressor | compartment | reg0, sec0 | reg0, sec1 | reg0, sec2 | reg1, sec0 | reg1, sec1 | reg1, sec2 |
|---|---|---|---|---|---|---|---|
| emission_type1 | air | 9041149.0 | 301879100.0 | 152323600.0 | 24694650.0 | 346874200.0 | 245411700.0 |
| emission_type2 | water | 2123543.0 | 48845090.0 | 98897570.0 | 6000239.0 | 45945300.0 | 189273100.0 |
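
To make the role of the concordance matrix more concrete, the following sketch applies *sec_agg_matrix* to made-up numbers with plain numpy. It only illustrates the underlying matrix algebra for a single region; it is not meant to describe how pymrio implements *aggregate* internally, and the values in `x` and `Z` are invented for the example.

```python
import numpy as np

# Concordance matrix: rows = aggregated sectors, columns = original sectors
sec_agg_matrix = np.array([
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 1],
])

# Each original sector must belong to exactly one aggregated sector,
# so every column has to sum to 1
assert (sec_agg_matrix.sum(axis=0) == 1).all()

# Made-up per-sector totals (illustration only, not taken from the test MRIO)
x = np.array([10.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])

# Aggregating a vector sums the original entries within each group
x_agg = sec_agg_matrix @ x
print(x_agg)  # [10. 10. 18.]

# Made-up square flow matrix between the 8 original sectors
Z = np.arange(64, dtype=float).reshape(8, 8)

# Aggregating rows and columns at once: C Z C^T
Z_agg = sec_agg_matrix @ Z @ sec_agg_matrix.T
print(Z_agg.shape)  # (3, 3)
```

Because every column of the concordance matrix sums to one, the grand total of the flows is preserved by the aggregation (`Z_agg.sum()` equals `Z.sum()`).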


To use custom names for the aggregated sectors or regions, pass a list of names in the order of the rows of the concordance matrix:

In [10]:
io = pymrio.load_test().calc_all().aggregate(region_agg=reg_agg_matrix, 
                                             region_names=['World Region A', 'World Region B'], 
                                             inplace=False)

In [11]:
io.get_regions()

Index(['World Region A', 'World Region B'], dtype='object', name='region')

## Aggregation using a numerical vector

Pymrio also accepts the aggregation information as a numerical or string vector.
In this case, each entry in the vector assigns the corresponding sector/region to an aggregation group.
Thus the two aggregation matrices from above (*sec_agg_matrix* and *reg_agg_matrix*) can also be represented as numerical or string vectors/lists:

In [12]:
sec_agg_vec = np.array([0,1,1,1,1,2,2,2])
reg_agg_vec = ['R1', 'R1', 'R1', 'R2', 'R2', 'R2']

These vectors can be passed to *aggregate* in place of the matrices:

In [13]:
io_vec_agg = pymrio.load_test().calc_all().aggregate(region_agg=reg_agg_vec, 
                                                     sector_agg=sec_agg_vec, 
                                                     inplace=False)

In [14]:
print("Sectors: {sec},\nRegions: {reg}".format(sec=io_vec_agg.get_sectors().tolist(), 
                                               reg=io_vec_agg.get_regions().tolist()))

Sectors: ['sec0', 'sec1', 'sec2'],
Regions: ['R1', 'R2']


In [15]:
io_vec_agg.emissions.D_fp_reg

| stressor | compartment | R1 | R2 |
|---|---|---|---|
| emission_type1 | air | 669019200.0 | 1686954000.0 |
| emission_type2 | water | 533768200.0 | 590208100.0 |
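
The correspondence between the vector and the matrix representation can be sketched with a few lines of numpy. The helper `vec_to_concordance` is hypothetical (it is not part of pymrio), and ordering the groups by first appearance is an assumption of this sketch; pymrio may order or name the groups differently.

```python
import numpy as np

def vec_to_concordance(agg_vec):
    """Hypothetical helper: build a 0/1 concordance matrix from a group vector."""
    # Unique group labels, kept in order of first appearance (assumption)
    groups = list(dict.fromkeys(agg_vec))
    # Rows = aggregated groups, columns = original sectors/regions
    matrix = np.zeros((len(groups), len(agg_vec)), dtype=int)
    for col, label in enumerate(agg_vec):
        matrix[groups.index(label), col] = 1
    return groups, matrix

sec_groups, sec_matrix = vec_to_concordance([0, 1, 1, 1, 1, 2, 2, 2])
reg_groups, reg_matrix = vec_to_concordance(['R1', 'R1', 'R1', 'R2', 'R2', 'R2'])

print(sec_matrix)
print(reg_groups)
print(reg_matrix)
```

For the two vectors used here, the resulting matrices have the same entries as *sec_agg_matrix* and *reg_agg_matrix* from the previous section.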


## Regional aggregation using the country converter coco

The previous examples are best suited if you want to reuse existing aggregation information.
For new/ad hoc aggregation, the most user-friendly solution is to build the concordance with the [country converter coco](https://github.com/konstantinstadler/country_converter).

In [16]:
import country_converter as coco
io = pymrio.load_test().calc_all()

In [17]:
reg_agg_coco = coco.agg_conc(original_countries=io.get_regions(), 
                             aggregates={'reg1': 'World Region A',
                                         'reg2': 'World Region A',
                                         'reg3': 'World Region A',},
                             missing_countries='World Region B')                               

In [22]:
io.aggregate(region_agg=reg_agg_coco)

<pymrio.core.mriosystem.IOSystem at 0x7fe11a41e9b0>

In [23]:
print("Sectors: {sec},\nRegions: {reg}".format(sec=io.get_sectors().tolist(), 
                                               reg=io.get_regions().tolist()))

Sectors: ['food', 'mining', 'manufactoring', 'electricity', 'construction', 'trade', 'transport', 'other'],
Regions: ['Unspecified region']


Since the concordance was passed directly to pymrio, the aggregated accounts can be inspected as usual:

In [24]:
io.emissions.D_fp_reg

| stressor | compartment | Unspecified region |
|---|---|---|
| emission_type1 | air | 2355973000.0 |
| emission_type2 | water | 1123976000.0 |
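
If coco is not available, a comparable name-based grouping can be sketched in plain Python by building a string vector of the kind described in the vector section above. This is only an illustration under the assumption that the region list matches the test MRIO; the variable names are made up for this example, and the sketch does not reproduce the output format of *coco.agg_conc*.

```python
# Regions of the test MRIO, as printed at the top of this section
original_regions = ['reg1', 'reg2', 'reg3', 'reg4', 'reg5', 'reg6']

# Explicit assignments; every region not listed falls into the default group
aggregates = {
    'reg1': 'World Region A',
    'reg2': 'World Region A',
    'reg3': 'World Region A',
}
default_group = 'World Region B'

# One group label per original region, in the original order
reg_agg_names = [aggregates.get(reg, default_group) for reg in original_regions]
print(reg_agg_names)
```

A list like `reg_agg_names` has the same form as *reg_agg_vec* from the vector example and could be passed as *region_agg* in the same way.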


A pandas DataFrame corresponding to the output from *coco* can also be passed to *sector_agg* for aggregation.
A sector aggregation package similar to the country converter is planned.

## Aggregation to one total sector / region

Both *region_agg* and *sector_agg* also accept a single string as argument. In that case, the full IO system is aggregated to one total region or sector, which is named after the string.

In [25]:
pymrio.load_test().calc_all().aggregate(region_agg='global', sector_agg='total').emissions.D_fp

| stressor | compartment | global, total |
|---|---|---|
| emission_type1 | air | 1080224000.0 |
| emission_type2 | water | 391084800.0 |
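
As a plausibility check, aggregating to one region and one sector should simply sum over all region/sector columns. The sketch below takes the values of the two-region, three-sector *D_fp* table shown earlier in this section (copied by hand, so they carry the display rounding) and compares their sums with the totals above.

```python
import math

# D_fp values of the 2-region / 3-sector aggregation, copied from the
# table earlier in this section (rounded as displayed there)
d_fp_air = [9041149.0, 301879100.0, 152323600.0,
            24694650.0, 346874200.0, 245411700.0]
d_fp_water = [2123543.0, 48845090.0, 98897570.0,
              6000239.0, 45945300.0, 189273100.0]

# Totals reported for the 'global' / 'total' aggregation
reported_air = 1080224000.0
reported_water = 391084800.0

total_air = sum(d_fp_air)
total_water = sum(d_fp_water)

# Allow for the display rounding of the copied values
print(math.isclose(total_air, reported_air, rel_tol=1e-6))
print(math.isclose(total_water, reported_water, rel_tol=1e-6))
```

Both comparisons agree within the display rounding, which is consistent with the single-string aggregation being a plain sum over all regions and sectors.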
