# Regionalization

This notebook requires additional dependencies. Please install the following:

* fiona 
* folium
* rasterio 
* geopandas
* pandarus
* bw2regional
* bw2-lcimpact

With this command:

    conda install -y -q -c cmutel fiona rasterio geopandas bw2regional pandarus bw2-lcimpact folium
    
In this notebook, we will go into more technical detail on how regionalized data and matrices are stored and used.

## Before-class preparation

Please run the following cells before this class session.

Start by importing some libraries and setting up a new database for sugarcane in Brazil -> ethanol in cars in Europe.

In [1]:
import brightway2 as bw
import bw2regional as reg
import os

import geopandas as gpd
import pandas as pd
import folium
import numpy as np



In [2]:
bw.projects.set_current('bw2_seminar_2017')



In [6]:
imp = bw.ExcelImporter("data/ethanol-inventory.xlsx")
imp.apply_strategies()
imp.match_database("ecoinvent 2.2", fields=('name', 'unit', 'location'))
imp.statistics()

Extracted 1 worksheets in 0.00 seconds
Applying strategy: csv_restore_tuples
Applying strategy: csv_restore_booleans
Applying strategy: csv_numerize
Applying strategy: csv_drop_unknown
Applying strategy: normalize_units
Applying strategy: normalize_biosphere_categories
Applying strategy: normalize_biosphere_names
Applying strategy: strip_biosphere_exc_locations
Applying strategy: set_code_by_activity_hash
Applying strategy: link_iterable_by_fields
Applying strategy: assign_only_product_as_production
Applying strategy: link_technosphere_by_activity_hash
Applying strategy: drop_falsey_uncertainty_fields_but_keep_zeros
Applying strategy: convert_uncertainty_types_to_integers
Applied 14 strategies in 0.15 seconds
Applying strategy: link_iterable_by_fields
2 datasets
9 exchanges
0 unlinked exchanges
  


(2, 9, 0)



In [7]:
imp.write_database()

Writing activities to SQLite3 database:
0%  100%
[##] | ETA: 00:00:00
Total time elapsed: 00:00:00


Title: Writing activities to SQLite3 database:
  Started: 03/24/2017 14:04:43
  Finished: 03/24/2017 14:04:43
  Total time elapsed: 00:00:00
  CPU %: 113.70
  Memory %: 0.43
Created database: Sugarcane




We already know that regionalization involves matching different maps of the world. We start by telling the program which maps are used by ecoinvent and our case study database:

In [10]:
bw.databases['ecoinvent 2.2']['geocollections'] = ['world', 'ecoinvent 2.2']
bw.databases['Sugarcane']['geocollections'] = ['world', 'ecoinvent 2.2']
bw.databases['biosphere3']['geocollections'] = []
bw.databases.flush()



Next, we import some basic regionalization information (just like `bw2setup`). This function defines all the spatial units used in ecoinvent (all countries in the world, but also agglomerations like "Northern America" or "ENTSO-E").

In [10]:
reg.bw2regionalsetup()

Downloading and creating world geocollections
Adding world topology
Adding ecoinvent-specific topology




Next, we import a partial implementation of the LC IMPACT regionalized LCIA methods. The following function will also define new maps for each impact category, such as watersheds, ecoregions, or others.

In [3]:
from bw2_lcimpact import import_regionalized_lcimpact



In [None]:
import_regionalized_lcimpact()

We now define a third spatial scale: detailed maps of where sugarcane is grown, combined with its land- and water-use intensity. In this case, we need to *name* these maps, and give the computer information on *where* the spatial data sources are located.

In the case of `weighted-pop-density`, we don't have the map file locally, but we do have the SHA-256 hash of its contents, which is enough to get the relevant GIS data from the Pandarus web service.

In [11]:
reg.geocollections['sugarcane_landuse_intensity'] = {
    'filepath': os.path.abspath("data/sugarcane_landuse_intensity.tif"),
    'band': 1
}
reg.geocollections['sugarcane_water_intensity'] = {
    'filepath': os.path.abspath("data/sugarcane_water_intensity.tif"),
    'band': 1
}
reg.geocollections['weighted-pop-density'] = {
    'band': 1,
    'kind': 'raster',
    'sha256': '11ec180aaa8d1f68629c06a9c2e6eb185f8e1e4c0d6713bab7f9219f1d160644'
}
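The `sha256` entry above is a hex digest of the raster file's contents. As a minimal sketch of how such a digest can be computed locally with the standard library (the helper name `file_sha256` and the throwaway demo file are hypothetical, and whether the Pandarus web service hashes files in exactly this way is an assumption):

```python
import hashlib
import os
import tempfile


def file_sha256(filepath, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(filepath, "rb") as f:
        # Read fixed-size chunks so large rasters never sit fully in memory
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration on a throwaway file that stands in for a real raster
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"not a real raster")
demo_digest = file_sha256(tmp.name)
os.remove(tmp.name)
print(demo_digest)
```

A digest computed this way is always 64 hexadecimal characters, the same length as the value stored under `'sha256'`.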



After defining our third spatial scale, we need to do some GIS calculations on how it interacts with the other two scales. Luckily, all the hard work has already been done; we just need to download the results.

In [16]:
inters = [
    'world-topo-watersheds-hh',
    'world-topo-watersheds-eq-sw-core',
    'world-topo-watersheds-eq-sw-extended',
    'world-topo-particulate-matter',
    'world-topo-ecoregions',
]

crop_rasters = [
    'sugarcane_landuse_intensity',
    'sugarcane_water_intensity',
    'weighted-pop-density',
]

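# `remote` is not imported above; we assume it is the Pandarus web
# service client that bw2regional exposes as `reg.remote`.
for x in inters:
    for y in crop_rasters:
        reg.remote.rasterstats_as_xt(x, y, x + "-" + y)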
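The nested loop above requests one extension table per (vector map, raster) pair and names it by joining the two identifiers with a hyphen. The following sketch only reproduces that naming scheme, without contacting any web service, so you can see which names exist before passing one as `xtable=`. The helper `xtable_name` is a hypothetical name introduced here for illustration.

```python
from itertools import product

vector_maps = [
    "world-topo-watersheds-hh",
    "world-topo-watersheds-eq-sw-core",
    "world-topo-watersheds-eq-sw-extended",
    "world-topo-particulate-matter",
    "world-topo-ecoregions",
]
rasters = [
    "sugarcane_landuse_intensity",
    "sugarcane_water_intensity",
    "weighted-pop-density",
]


def xtable_name(vector, raster):
    """Build the extension table name used in the loop: '<vector>-<raster>'."""
    return vector + "-" + raster


# One name per combination, in the same order as the nested loop
xtable_names = [xtable_name(v, r) for v, r in product(vector_maps, rasters)]

for name in xtable_names:
    print(name)
```

Five vector maps combined with three rasters give fifteen extension tables.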



## End of before-class preparation

Before starting to make more calculations, we will define the various matrices together.

In [7]:
crops = [x for x in bw.Database("ecoinvent 2.2") if 'sugarcane' in x['name']]



In [11]:
# Do agricultural activities with the sugarcane intensity map, 
# all others with the weighted pop density map
xt_ag = reg.ExtensionTablesLCA(
    {('Sugarcane', 'driving'): 1},
    ('LC-IMPACT', 'Land Use', 'Occupation', 'Marginal', 'Core'),
    xtable='world-topo-ecoregions-sugarcane_landuse_intensity',
    limitations={
        'activities': crops,
    }
)
xt_ag.lci()
xt_ag.lcia()

xt_others = reg.ExtensionTablesLCA(
    {('Sugarcane', 'driving'): 1},
    ('LC-IMPACT', 'Land Use', 'Occupation', 'Marginal', 'Core'),
    xtable='world-topo-ecoregions-weighted-pop-density',
    limitations={
        'activities': crops,
        'activities mode': 'exclude'
    }
)
xt_others.lci()
xt_others.lcia()

xt_ag.score + xt_others.score

5.972089447120165e-14
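Adding the two scores is valid because the two calculations split the activities into complementary groups: the first keeps only the crop activities, and the second, with `'activities mode': 'exclude'`, keeps everything else. The toy sketch below illustrates that partition with made-up activity names and made-up per-activity scores. It does not use bw2regional, and `restricted_score` is a hypothetical helper.

```python
# Hypothetical activities; only the first one is a crop activity
activities = ["sugarcane, at farm", "ethanol, at plant", "transport, lorry", "driving"]
crops = {a for a in activities if "sugarcane" in a}

# Made-up scores each activity would receive under each spatial allocation
score_with_crop_map = {
    "sugarcane, at farm": 4.0,
    "ethanol, at plant": 2.0,
    "transport, lorry": 1.0,
    "driving": 0.5,
}
score_with_pop_map = {
    "sugarcane, at farm": 3.0,
    "ethanol, at plant": 1.5,
    "transport, lorry": 0.25,
    "driving": 0.5,
}


def restricted_score(scores, selected, mode="include"):
    """Sum scores over the selected activities ('include') or all others ('exclude')."""
    if mode == "include":
        kept = [a for a in scores if a in selected]
    else:
        kept = [a for a in scores if a not in selected]
    return sum(scores[a] for a in kept)


ag_score = restricted_score(score_with_crop_map, crops)
others_score = restricted_score(score_with_pop_map, crops, mode="exclude")
total_score = ag_score + others_score
print(ag_score, others_score, total_score)
```

Each activity is counted exactly once, using the map that suits it, so no impact is double counted or dropped.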



## Interpreting regionalized results

In [12]:
xt_ag.fix_spatial_dictionaries()



In [20]:
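def iterate_results_spatial_labels(matrix, axis, spatial_dict, cutoff=1e-4):
    # Spatial keys are either plain labels or (geocollection, label) tuples
    _ = lambda x: x[1] if isinstance(x, tuple) else x

    # Reverse the dictionary passed in: matrix index -> spatial label
    rsd = {y: _(x) for x, y in spatial_dict.items()}

    total = matrix.sum()
    summed = np.array(matrix.sum(axis=axis)).ravel()
    # Order indices by absolute contribution, largest first
    sorting = np.argsort(np.abs(summed))[::-1]

    summed = summed[sorting]
    mask = summed > cutoff * summed.max()

    # Mask both arrays so each value stays paired with its own index
    for x, y in zip(summed[mask], sorting[mask]):
        yield x, x * 100 / total, rsd[y]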
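To make the ranking logic easier to follow, here is a dependency-free sketch of the same idea on a small list of made-up numbers: order the contributions by magnitude, drop those below a relative cutoff, and report each one with its share of the total. The names `plain_label` and `rank_contributions`, the labels, and the values are all hypothetical, and the sketch works on plain lists rather than on the sparse matrices used above.

```python
def plain_label(key):
    """Reduce a (geocollection, label) tuple to its label; pass plain labels through."""
    return key[1] if isinstance(key, tuple) else key


def rank_contributions(values, keys, cutoff=1e-4):
    """Yield (value, percent of total, label), largest magnitude first."""
    total = sum(values)
    threshold = cutoff * max(values)
    # Sort value/key pairs together so they can never get out of step
    ranked = sorted(zip(values, keys), key=lambda pair: abs(pair[0]), reverse=True)
    for value, key in ranked:
        if value > threshold:
            yield value, value * 100 / total, plain_label(key)


# Made-up contributions per spatial unit
keys = [("ecoregions", "A"), ("ecoregions", "B"), ("ecoregions", "C"), "GLO"]
values = [1.0, 6.0, 0.0, 3.0]

ranked_rows = list(rank_contributions(values, keys))
for row in ranked_rows:
    print(row)
```

Sorting the pairs together avoids having to keep a separate index array aligned with a filtered value array.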



In [23]:
def to_geopandas(result_iter, geocollection):
    source = gpd.read_file(reg.geocollections[geocollection]['filepath'])
    merged = source.merge(pd.DataFrame(
        list(result_iter), 
        columns=['lcia_weight', 'lcia_weight_normalized', reg.geocollections[geocollection]['field']]
    ))
    return merged



In [24]:
df = to_geopandas(
    iterate_results_spatial_labels(
        (xt_ag.results_ia_spatial_scale() + xt_others.results_ia_spatial_scale()),
        0,
        xt_ag.ia_spatial_dict,
    ),
    'ecoregions'
)



In [25]:
m = folium.Map(location=[0, 0], zoom_start=2, 
               tiles="cartodbpositron")

df['geoid'] = df.index.astype(str)
geo_str = df.to_json()

m.choropleth(geo_str=geo_str,
             data=df, columns=['geoid', 'lcia_weight_normalized'],
             key_on='feature.id',
             fill_color='YlGn', fill_opacity=0.4, line_opacity=0.2)
m

