# Disaggregation processes with `gemsgrid`

## Table of Contents
* [Overview](#Overview)
* [Prerequisites](#Prerequisites)
* [Examples](#Examples)  


## Overview

**Raster disaggregation** resamples GEMS raster datasets from a coarser level to a finer one (for example, from level 1 to level 2). A "parent" is a pixel (or grid point) of the coarser data, and all pixels of the finer level contained within a given parent are its "children". The number of children is constant across space and depends only on the refinement ratio between the source and target GEMS levels.
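The parent/child relationship for rasters can be illustrated with plain NumPy. This is a minimal sketch, not the `gemsgrid` implementation: the function name `disaggregate_raster` and the refinement ratio of 3 are assumptions chosen for illustration, and the actual ratio depends on the GEMS levels involved.

```python
import numpy as np

def disaggregate_raster(parent, ratio, operation):
    """Resample a coarse array so each parent gets ratio x ratio children (hypothetical helper)."""
    # Repeat every parent value along both axes so it covers a ratio x ratio block.
    children = np.repeat(np.repeat(parent, ratio, axis=0), ratio, axis=1).astype(float)
    if operation == "divide":
        # Every parent has the same number of children, so one constant divisor is enough.
        children /= ratio * ratio
    elif operation != "repeat":
        raise ValueError(f"unsupported operation: {operation}")
    return children

# Toy 2 x 2 parent grid; the ratio of 3 is an assumed value, not a GEMS constant.
parent = np.array([[9.0, 18.0],
                   [27.0, 36.0]])
repeated = disaggregate_raster(parent, ratio=3, operation="repeat")
divided = disaggregate_raster(parent, ratio=3, operation="divide")
```

With "repeat" every child carries the parent value, which suits intensive quantities such as yield per hectare. With "divide" the children of a parent sum back to the parent value, which suits totals.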

**Vector disaggregation** rasterizes vector datasets to a specified GEMS level. In this case, a "parent" is a polygon, and its "children" are the GEMS pixels inside the polygon. The number of children is NOT constant across space; it varies with the size of the polygon and the target GEMS level.
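For vector inputs the divisor differs from polygon to polygon. The sketch below assumes the polygons have already been rasterized into an array of polygon ids. The arrays `polygon_ids` and `parent_values` are made-up data, and the logic is an illustration of the idea rather than the code `gemsgrid` runs.

```python
import numpy as np

# Hypothetical rasterized polygons: each cell holds the id of the polygon covering it (0 = none).
polygon_ids = np.array([[1, 1, 2],
                        [1, 2, 2],
                        [0, 2, 2]])
# Hypothetical attribute value per polygon id (index 0 is a placeholder for "no polygon").
parent_values = np.array([0.0, 30.0, 100.0])

# Count the children of each polygon; unlike the raster case, the counts differ between parents.
children_count = np.bincount(polygon_ids.ravel(), minlength=parent_values.size)

nodata = -9999  # same value as the documented default for vectornodata
divided = np.full(polygon_ids.shape, nodata, dtype=float)
inside = polygon_ids > 0
# Each child receives its parent's value divided by that parent's own child count.
divided[inside] = parent_values[polygon_ids[inside]] / children_count[polygon_ids[inside]]
```

Polygon 1 covers three pixels and polygon 2 covers five, so the same "divide" operation uses a different divisor for each parent.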

The following 6 disaggregation options are supported for both raster and vector data:

- option 1 - repeat parent value to all children
- option 2 - divide parent value by the number of all children
- option 3 - repeat parent value to valid children
- option 4 - divide parent value by the number of valid children
- option 5 - divide parent value by the number of valid children corresponding to multiple categories with different weights
- option 6 - distribute parent value based on proportions from a “suitability” mask

![Figure](disaggregation_options_6.png)
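The masked options (3 to 6) can be compared on a single parent with a few lines of NumPy. This is a sketch under stated assumptions: the class codes, the 75/25 weights, the `suitability` layer, and the helper `allocate` are all invented for illustration. The way option 5 splits each category's weighted share among that category's children is one reading of the option description, not confirmed `gemsgrid` behavior.

```python
import numpy as np

parent_value = 120.0
# Hypothetical mask classes of the 3 x 3 children of one parent.
mask = np.array([[2, 2, 3],
                 [2, 0, 3],
                 [0, 0, 2]])
nodata = -9999.0

def allocate(valid, value_per_child):
    """Write value_per_child into valid children and nodata elsewhere (hypothetical helper)."""
    return np.where(valid, value_per_child, nodata)

valid = mask == 2                                       # one category with 100% weight
option3 = allocate(valid, parent_value)                 # repeat parent value to valid children
option4 = allocate(valid, parent_value / valid.sum())   # divide parent value among valid children

# Option 5 (assumed semantics): each category's weighted share is split among its own children.
option5 = np.full(mask.shape, nodata)
for condition, weight in [(mask == 2, 75), (mask == 3, 25)]:
    option5[condition] = parent_value * weight / 100 / condition.sum()

# Option 6: each child receives a share proportional to its value in a suitability layer.
suitability = np.array([[1.0, 3.0, 0.0],
                        [2.0, 0.0, 0.0],
                        [0.0, 0.0, 4.0]])
option6 = parent_value * suitability / suitability.sum()
```

In options 4, 5, and 6 the allocated children sum back to the parent value of 120; in option 3 every valid child simply carries the value 120.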

## Prerequisites
**Software:** Prepare your Python environment by installing the `gemsgrid` package; see the setup guide at https://github.umn.edu/IAA/gems_grid/blob/master/README.md. Other required packages are
`geopandas`, `rasterio`, `rasterstats`, `shapely`, `numpy`, and `scipy`.  

**Data:**
- input vector data: must be a shapefile (.shp);
- input raster data: must be a GeoTIFF that is already projected to EASE and aligned with the GEMS grid. Use `gemsgrid.warp_raster_to_gems` first if you need to align your data with the GEMS grid.
- input raster mask data: must be a GeoTIFF that is already projected to EASE and aligned with the GEMS grid. The GEMS level of the mask must be the same as the desired target level.

**Required arguments:**

| I want to use … | I start with … | Required arguments |
| --------------- | --------------- | -------------------|
| Option 1  | Vector data | <span style="color:blue">input_type="vector_unmasked"</span>; outrasterpath; <span style="color:red">operation="repeat"</span>; invectorpath; var; target_level; globalextent; (vectornodata) | 
| Option 1  | Raster data | <span style="color:blue">input_type="raster_unmasked"</span>; outrasterpath; <span style="color:red">operation="repeat"</span>; inrasterpath; source_level |
| Option 2 | Vector data | <span style="color:blue">input_type="vector_unmasked"</span>; outrasterpath; <span style="color:red">operation="divide"</span>; invectorpath; var; target_level; globalextent; (vectornodata) | 
| Option 2 | Raster data | <span style="color:blue">input_type="raster_unmasked"</span>; outrasterpath; <span style="color:red">operation="divide"</span>; inrasterpath; source_level | 
| Option 3 | Vector data | <span style="color:blue">input_type="vector_masked"</span>; outrasterpath; <span style="color:red">operation="repeat"</span>; invectorpath; var; inmaskpath;  clip; <span style="color:green">categories_dict (one category with 100% weight)</span>; (vectornodata) | 
| Option 3 | Raster data | <span style="color:blue">input_type="raster_masked"</span>; outrasterpath; <span style="color:red">operation="repeat"</span>; inrasterpath; source_level; inmaskpath; <span style="color:green">categories_dict (one category with 100% weight)</span> | 
| Option 4 | Vector data | <span style="color:blue">input_type="vector_masked"</span>; outrasterpath; <span style="color:red">operation="divide"</span>; invectorpath; var; inmaskpath;  clip; <span style="color:green">categories_dict (one category with 100% weight)</span>; (vectornodata) | 
| Option 4 | Raster data | <span style="color:blue">input_type="raster_masked"</span>; outrasterpath; <span style="color:red">operation="divide"</span>; inrasterpath; source_level; inmaskpath; <span style="color:green">categories_dict (one category with 100% weight)</span> |
| Option 5 | Vector data | <span style="color:blue">input_type="vector_masked"</span>; outrasterpath; <span style="color:red">operation="divide"</span>; invectorpath; var; inmaskpath; clip; <span style="color:green">categories_dict (multiple categories with corresponding weights)</span>; (vectornodata)| 
| Option 5 | Raster data | <span style="color:blue">input_type="raster_masked"</span>; outrasterpath; <span style="color:red">operation="divide"</span>; inrasterpath; source_level; inmaskpath; <span style="color:green">categories_dict (multiple categories with corresponding weights)</span> |
| Option 6 | Vector data | <span style="color:blue">input_type="vector_masked"</span>; outrasterpath; <span style="color:red">operation="distribute"</span>; invectorpath; var; inmaskpath; clip; (vectornodata) | 
| Option 6 | Raster data | <span style="color:blue">input_type="raster_masked"</span>; outrasterpath; <span style="color:red">operation="distribute"</span>; inrasterpath; source_level; inmaskpath |
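Before calling `disaggregate`, it can help to check an arguments dictionary against the table above. The helper below is a hypothetical convenience, not part of `gemsgrid`: the `REQUIRED` mapping is transcribed from the table, the optional `vectornodata` is left out, and the file names in the usage lines are placeholders.

```python
# Required keys per input_type, transcribed from the table above (vectornodata is optional).
REQUIRED = {
    "vector_unmasked": {"outrasterpath", "operation", "invectorpath", "var",
                        "target_level", "globalextent"},
    "raster_unmasked": {"outrasterpath", "operation", "inrasterpath", "source_level"},
    "vector_masked": {"outrasterpath", "operation", "invectorpath", "var",
                      "inmaskpath", "clip"},
    "raster_masked": {"outrasterpath", "operation", "inrasterpath", "source_level",
                      "inmaskpath"},
}

def missing_arguments(arguments):
    """Return the sorted required keys that are absent from an arguments dict (hypothetical helper)."""
    input_type = arguments.get("input_type")
    if input_type not in REQUIRED:
        raise ValueError(f"unknown input_type: {input_type!r}")
    required = set(REQUIRED[input_type])
    # Masked "repeat" and "divide" also need categories_dict; "distribute" does not.
    is_masked = input_type in ("vector_masked", "raster_masked")
    if is_masked and arguments.get("operation") in ("repeat", "divide"):
        required.add("categories_dict")
    return sorted(required - set(arguments))

# Placeholder paths; this dictionary deliberately omits the mask arguments.
incomplete = {
    "input_type": "raster_masked",
    "outrasterpath": "out.tif",
    "operation": "divide",
    "inrasterpath": "in.tif",
    "source_level": 1,
}
print(missing_arguments(incomplete))  # ['categories_dict', 'inmaskpath']
```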

**How to interpret arguments:**  

| Argument | Type | Meaning |
| --------- | ----- | ------ |
| input_type |  str | Describes the type of input data ("vector_unmasked", "vector_masked", "raster_unmasked", "raster_masked") | 
| outrasterpath | str | Describes the path of the output raster file |
| operation  | str | Defines the allocation rule: "repeat" repeats the parent value to children; "divide" divides the parent value by the number of children; "distribute" distributes the parent value based on proportions from a suitability mask |
| invectorpath | str | Describes the path of the input shapefile | 
| var | str | Column name from the attribute table that needs to be disaggregated |
| target_level | int | Valid GEMS level, options are: 0, 1, 2, 3, 4, 5, 6 |
| globalextent  | boolean | Specifies whether the vector data are global (globalextent=True; otherwise globalextent=False) |
| inmaskpath  | str | Describes the path of the input mask file |
| categories_dict |  dictionary | Describes the masking conditions and the weight assigned to each category; required for "divide" and "repeat" operations on masked inputs | 
| clip  | boolean | Specifies if mask raster has substantially larger extent compared to the input vector data and needs to be clipped (clip=True, otherwise clip=False) | 
| inrasterpath | str | Describes the path of the input raster file |
| source_level | int | Valid GEMS level of the original data, options are: 0, 1, 2, 3, 4, 5, 6 | 
| vectornodata | float or int | Value to store "absence" of data for rasters disaggregated from vector data; optional, defaults to -9999 | 

**How to create mask conditions:**
- For options 3 and 4 use a dictionary with only 1 category with weight 100%.  
Example: `categories_dict = {1:["mask==1",100]}`  
Multiple classes within the same category are also supported:  
`categories_dict = {1:["(mask==1)|(mask==5)",100]}`  
- For option 5 use a dictionary with 2 or more categories and their corresponding weights:  
Example: `categories_dict={1:["mask==1",60],2:["mask==5",40]}`  
NOTE: the rules defined by the categories must not overlap, meaning each category should describe a unique set of pixels in the mask layer. For example, `mask==3` does not overlap with `mask>=5` but does overlap with `mask>=2`.
- Use the `|` operator to match either condition. `"(mask==1)|(mask==5)"` selects pixels equal to 1 or 5.
- Use the `&` operator to require all conditions. `"(mask>=24)&(mask<=67)"` selects pixels with values from 24 to 67, inclusive.
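The condition strings read like NumPy boolean expressions over an array named `mask`. The sketch below evaluates them that way on a toy array and checks the non-overlap rule. The helper `category_masks` is hypothetical, how `gemsgrid` parses the strings internally is not documented here, and the check that weights sum to 100 is an assumption based on the examples in this guide.

```python
import numpy as np

def category_masks(mask, categories_dict):
    """Evaluate each condition string on `mask` and check weights and overlaps (hypothetical helper)."""
    selections = {}
    for key, (condition, weight) in categories_dict.items():
        # Evaluate the expression with `mask` as the only visible name (illustration only).
        selections[key] = eval(condition, {"__builtins__": {}}, {"mask": mask})
    total_weight = sum(weight for _, weight in categories_dict.values())
    if total_weight != 100:  # assumed rule: weights are percentages that sum to 100
        raise ValueError(f"weights sum to {total_weight}, expected 100")
    # A pixel selected by more than one category means the rules overlap.
    hits = sum(selection.astype(int) for selection in selections.values())
    if (hits > 1).any():
        raise ValueError("category conditions overlap")
    return selections

toy_mask = np.array([1, 5, 3, 24, 67, 70])
selected = category_masks(
    toy_mask,
    {1: ["(mask==1)|(mask==5)", 60], 2: ["(mask>=24)&(mask<=67)", 40]},
)
```

Here category 1 selects the pixels with values 1 and 5, and category 2 selects 24 and 67 because both bounds are inclusive. A pair such as `mask==3` and `mask>=2` would raise the overlap error.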

## Examples

### Option 1 with input vector data

```python
from gemsgrid.processing_tools.disaggregate import disaggregate

arguments = {
    "input_type": "vector_unmasked",
    "outrasterpath": "D:/data/disaggregation_data/from_vector/mn_county_ag_value_unmasked_repeat.tif",
    "operation": "repeat",
    "invectorpath": "D:/data/disaggregation_data/from_vector/mn_county_ag_value.shp",
    "var": "valueperac",
    "target_level": 4,
    "globalextent": False
}

disaggregate(**arguments)
```

```
STEP 1 of 4: Open input file
STEP 2 of 4: Generate geoproperties
STEP 3 of 4: Produce disaggregated array
STEP 4 of 4: Save disaggregated array to raster
```



### Option 2 with input vector data

```python
from gemsgrid.processing_tools.disaggregate import disaggregate

arguments = {
    "input_type": "vector_unmasked",
    "outrasterpath": "D:/data/disaggregation_data/from_vector/mn_county_ag_value_unmasked_divide.tif",
    "operation": "divide",
    "invectorpath": "D:/data/disaggregation_data/from_vector/mn_county_ag_value.shp",
    "var": "totalvalue",
    "target_level": 4,
    "globalextent": False
}

disaggregate(**arguments)
```

```
STEP 1 of 4: Open input file
STEP 2 of 4: Generate geoproperties
STEP 3 of 4: Produce disaggregated array
STEP 4 of 4: Save disaggregated array to raster
```


### Option 3 with input vector data

```python
from gemsgrid.processing_tools.disaggregate import disaggregate

arguments = {
    "input_type": "vector_masked",
    "outrasterpath": "D:/data/disaggregation_data/from_vector/mn_county_ag_value_masked_repeat.tif",
    "operation": "repeat",
    "invectorpath": "D:/data/disaggregation_data/from_vector/mn_county_ag_value.shp",
    "var": "valueperac",
    "inmaskpath": "D:/data/disaggregation_data/from_vector/LCMAP_CU_2017_V12_LCPRI_l4.tif",
    "categories_dict": {1: ["mask==2", 100]},
    "clip": True
}

disaggregate(**arguments)
```

```
STEP 1 of 4: Open input file
STEP 2 of 4: Generate geoproperties
STEP 3 of 4: Produce disaggregated array
STEP 4 of 4: Save disaggregated array to raster
```


### Option 4 with input vector data

```python
from gemsgrid.processing_tools.disaggregate import disaggregate

arguments = {
    "input_type": "vector_masked",
    "outrasterpath": "D:/data/disaggregation_data/from_vector/mn_county_ag_value_masked_divide_1categ.tif",
    "operation": "divide",
    "invectorpath": "D:/data/disaggregation_data/from_vector/mn_county_ag_value.shp",
    "var": "totalvalue",
    "inmaskpath": "D:/data/disaggregation_data/from_vector/LCMAP_CU_2017_V12_LCPRI_l4.tif",
    "clip": True,
    "categories_dict": {1: ["mask==2", 100]}
}

disaggregate(**arguments)
```

```
STEP 1 of 4: Open input file
STEP 2 of 4: Generate geoproperties
STEP 3 of 4: Produce disaggregated array
STEP 4 of 4: Save disaggregated array to raster
```



### Option 5 with input vector data

```python
from gemsgrid.processing_tools.disaggregate import disaggregate

arguments = {
    "input_type": "vector_masked",
    "outrasterpath": "D:/data/disaggregation_data/from_vector/mn_county_ag_value_masked_divide_2categ.tif",
    "operation": "divide",
    "invectorpath": "D:/data/disaggregation_data/from_vector/mn_county_ag_value.shp",
    "var": "totalvalue",
    "inmaskpath": "D:/data/disaggregation_data/from_vector/LCMAP_CU_2017_V12_LCPRI_l4.tif",
    "clip": True,
    "categories_dict": {1: ["mask == 2", 97], 2: ["mask == 3", 3]}
}

disaggregate(**arguments)
```

```
STEP 1 of 4: Open input file
STEP 2 of 4: Generate geoproperties
STEP 3 of 4: Produce disaggregated array
STEP 4 of 4: Save disaggregated array to raster
```


### Option 6 with input vector data

```python
from gemsgrid.processing_tools.disaggregate import disaggregate

arguments = {
    "input_type": "vector_masked",
    "outrasterpath": "D:/data/disaggregation_data/from_vector/mn_county_ag_value_distributed.tif",
    "operation": "distribute",
    "invectorpath": "D:/data/disaggregation_data/from_vector/mn_county_ag_value.shp",
    "var": "totalvalue",
    "inmaskpath": "D:/data/disaggregation_data/from_vector/mn_crop_productivity.tif",
    "clip": True
}

disaggregate(**arguments)
```

```
STEP 1 of 4: Open input file
STEP 2 of 4: Generate geoproperties
STEP 3 of 4: Produce disaggregated array
STEP 4 of 4: Save disaggregated array to raster
```


### Option 1 with input raster data

```python
from gemsgrid.processing_tools.disaggregate import disaggregate

arguments = {
    "input_type": "raster_unmasked",
    "outrasterpath": "D:/data/disaggregation_data/from_raster/asia_production_perhectare_l2_unmasked_repeat.tif",
    "operation": "repeat",
    "inrasterpath": "D:/data/disaggregation_data/from_raster/asia_production_perhectare_l1.tif",
    "source_level": 1
}

disaggregate(**arguments)
```

```
STEP 1 of 4: Open input file
STEP 2 of 4: Generate geoproperties
STEP 3 of 4: Produce disaggregated array
STEP 4 of 4: Save disaggregated array to raster
```


### Option 2 with input raster data

```python
from gemsgrid.processing_tools.disaggregate import disaggregate

arguments = {
    "input_type": "raster_unmasked",
    "outrasterpath": "D:/data/disaggregation_data/from_raster/asia_production_total_l2_unmasked_divide.tif",
    "operation": "divide",
    "inrasterpath": "D:/data/disaggregation_data/from_raster/asia_production_total_l1.tif",
    "source_level": 1
}

disaggregate(**arguments)
```

```
STEP 1 of 4: Open input file
STEP 2 of 4: Generate geoproperties
STEP 3 of 4: Produce disaggregated array
STEP 4 of 4: Save disaggregated array to raster
```


### Option 3 with input raster data

```python
from gemsgrid.processing_tools.disaggregate import disaggregate

arguments = {
    "input_type": "raster_masked",
    "outrasterpath": "D:/data/disaggregation_data/from_raster/asia_production_perhectare_l2_masked_repeat.tif",
    "operation": "repeat",
    "inrasterpath": "D:/data/disaggregation_data/from_raster/asia_production_perhectare_l1.tif",
    "source_level": 1,
    "inmaskpath": "D:/data/disaggregation_data/from_raster/GFSAD1KCM.2010.001.2016348142550_l2.tif",
    "categories_dict": {1: ["(mask == 1) | (mask == 2) | (mask == 3) | (mask == 4) | (mask == 5)", 100]}
}

disaggregate(**arguments)
```

```
STEP 1 of 4: Open input file
STEP 2 of 4: Generate geoproperties
STEP 3 of 4: Produce disaggregated array
STEP 4 of 4: Save disaggregated array to raster
```


### Option 4 with input raster data

```python
from gemsgrid.processing_tools.disaggregate import disaggregate

arguments = {
    "input_type": "raster_masked",
    "outrasterpath": "D:/data/disaggregation_data/from_raster/asia_production_total_l2_masked_divide_1categ.tif",
    "operation": "divide",
    "inrasterpath": "D:/data/disaggregation_data/from_raster/asia_production_total_l1.tif",
    "source_level": 1,
    "inmaskpath": "D:/data/disaggregation_data/from_raster/GFSAD1KCM.2010.001.2016348142550_l2.tif",
    "categories_dict": {1: ["(mask == 1) | (mask == 2) | (mask == 3) | (mask == 4) | (mask == 5)", 100]}
}

disaggregate(**arguments)
```

```
STEP 1 of 4: Open input file
STEP 2 of 4: Generate geoproperties
STEP 3 of 4: Produce disaggregated array
STEP 4 of 4: Save disaggregated array to raster
```


### Option 5 with input raster data

```python
from gemsgrid.processing_tools.disaggregate import disaggregate

arguments = {
    "input_type": "raster_masked",
    "outrasterpath": "D:/data/disaggregation_data/from_raster/asia_production_total_l2_masked_divide_2categ.tif",
    "operation": "divide",
    "inrasterpath": "D:/data/disaggregation_data/from_raster/asia_production_total_l1.tif",
    "source_level": 1,
    "inmaskpath": "D:/data/disaggregation_data/from_raster/GFSAD1KCM.2010.001.2016348142550_l2.tif",
    "categories_dict": {1: ["(mask == 1) | (mask == 2)", 40], 2: ["(mask == 3) | (mask == 4) | (mask == 5)", 60]}
}

disaggregate(**arguments)
```

```
STEP 1 of 4: Open input file
STEP 2 of 4: Generate geoproperties
STEP 3 of 4: Produce disaggregated array
STEP 4 of 4: Save disaggregated array to raster
```


### Option 6 with input raster data

```python
from gemsgrid.processing_tools.disaggregate import disaggregate

arguments = {
    "input_type": "raster_masked",
    "outrasterpath": "D:/data/disaggregation_data/from_raster/asia_production_total_l2_distributed.tif",
    "operation": "distribute",
    "inrasterpath": "D:/data/disaggregation_data/from_raster/asia_production_total_l1.tif",
    "source_level": 1,
    "inmaskpath": "D:/data/disaggregation_data/from_raster/AWCh2_M_sl6_250m_ll_l2.tif",
}

disaggregate(**arguments)
```

```
STEP 1 of 4: Open input file
STEP 2 of 4: Generate geoproperties
STEP 3 of 4: Produce disaggregated array
STEP 4 of 4: Save disaggregated array to raster
```
