# Attributing vector features

**Original code:** [Alexandros Korkovelos](https://github.com/akorkovelos) <br />

This notebook uses a set of functions that extract values from raster layers and attribute them to the vector layer generated by the `"Creating spatial index for ascii clusters.ipynb"` notebook.

You will need the following layers:
* Land Cover (raster -- .tif)
* Water Table Depth (raster -- .tif)
* Clusters (vector -- .shp or .gpkg)

**Note!** More layers can be added depending on the needs of the analysis.

## Importing necessary packages | defining functions

In [3]:
# Spatial
import geopandas as gpd
import rasterio
import rasterio.fill
from rasterstats import zonal_stats
from geojson import Feature, Point, FeatureCollection
import json


# System or other
import os
from IPython.display import display
import ipywidgets as widgets
import tkinter as tk
from tkinter import filedialog, messagebox
import datetime

import warnings
warnings.filterwarnings('ignore')

root = tk.Tk()
root.withdraw()
root.attributes("-topmost", True)


In [14]:
# Directories
ROOT_DIR = os.path.abspath(os.curdir)
in_path = os.path.join(ROOT_DIR, 'sample_input')
out_path = os.path.join(ROOT_DIR, 'sample_output')

# Names of layers
vect_nm = "Ethiopia_vector_clusters.shp"               # name of the layer containing the cluster/vector data
out_nm = "Ethiopia_vector_clusters_with_attributes"    # name of the output layer
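Because the later cells read from and write to these folders, it can help to verify them up front. The following is a minimal sketch, assuming the same folder layout as above; the helper name `check_workspace` is hypothetical, and the demonstration uses a temporary directory rather than the real project folders.

```python
import os
import tempfile

def check_workspace(out_path, vect_nm):
    """Create the output folder if needed and report whether the vector layer is there."""
    os.makedirs(out_path, exist_ok=True)
    vector_file = os.path.join(out_path, vect_nm)
    return vector_file, os.path.isfile(vector_file)

# Demonstration in a throwaway directory (stand-in for ROOT_DIR)
with tempfile.TemporaryDirectory() as root_dir:
    out_path = os.path.join(root_dir, "sample_output")
    vector_file, found = check_workspace(out_path, "Ethiopia_vector_clusters.shp")
    print(os.path.isdir(out_path), found)
```

If `found` is `False` in a real run, the cluster layer from the previous notebook is probably missing or stored under a different name.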

In [5]:
# Processing Continuous/Numerical Rasters
def processing_raster(name, method, clusters):
    """
    This function calculates stats for numerical rasters and attributes them to the given vector features. 
    
    INPUT: 
    name: string used as prefix when assigning features to the vectors
    method: statistical method to be used (check documentation)
    clusters: the vector layer containing the clusters
    
    OUTPUT:
    geojson file of the vector features including the new attributes
    """

    messagebox.showinfo('CLEWs', 'Select the ' + name + ' map')
    raster_path = filedialog.askopenfilename(filetypes=(("rasters", "*.tif"), ("all files", "*.*")))

    # Pass the file path directly; zonal_stats opens and closes the raster itself
    clusters = zonal_stats(
        clusters,
        raster_path,
        stats=[method],
        prefix=name, geojson_out=True, all_touched=True)
    
    print(datetime.datetime.now())
    return clusters

In [6]:
## Processing Categorical Rasters
def processing_raster_cat(name, clusters):
    """
    This function calculates stats for categorical rasters and attributes them to the given vector features. 
    
    INPUT: 
    name: string used as prefix when assigning features to the vectors
    clusters: the vector layer containing the clusters
    
    OUTPUT:
    geojson file of the vector features including the new attributes
    """    
    messagebox.showinfo('CLEWs', 'Select the ' + name + ' map')
    raster_path = filedialog.askopenfilename(filetypes=(("rasters", "*.tif"), ("all files", "*.*")))

    # Pass the file path directly; zonal_stats opens and closes the raster itself
    clusters = zonal_stats(
        clusters,
        raster_path,
        categorical=True,
        prefix=name, geojson_out=True, all_touched=True)
    
    print(datetime.datetime.now())
    return clusters
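For orientation, the sketch below mimics in plain Python what the two functions above add to each feature, without reading any raster. It assumes that attribute names are the prefix followed by the statistic name or the class value (for example `wtdmean` or `LC12`), which is how `rasterstats` is understood to apply `prefix`. The helper names and pixel values are hypothetical.

```python
from collections import Counter

def categorical_attributes(pixels, prefix):
    """Count pixels per class, as categorical=True does conceptually."""
    counts = Counter(p for p in pixels if p is not None)  # None stands in for nodata
    return {f"{prefix}{cls}": n for cls, n in sorted(counts.items())}

def mean_attribute(pixels, prefix):
    """Mean of the valid pixels, as stats=['mean'] does conceptually."""
    valid = [p for p in pixels if p is not None]
    return {f"{prefix}mean": sum(valid) / len(valid) if valid else None}

# Hypothetical pixel values falling inside one cluster polygon
land_cover_pixels = [12, 12, 10, None, 12, 10, 16]
wtd_pixels = [4.0, 6.0, None, 8.0]

properties = {"id": 1}
properties.update(categorical_attributes(land_cover_pixels, "LC"))
properties.update(mean_attribute(wtd_pixels, "wtd"))
print(properties)
# {'id': 1, 'LC10': 2, 'LC12': 3, 'LC16': 1, 'wtdmean': 6.0}
```

The land cover columns hold pixel counts per class, not shares, so a class that does not occur in a cluster simply has no entry for that feature.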

In [7]:
## Converting geojson to geodataframe
def geojson_to_gdf(workspace, geojson_file):
    """
    This function returns a geodataframe for a given geojson file
    
    INPUT: 
    workspace: working directory
    geojson_file: geojson layer to be converted
    
    OUTPUT:
    geodataframe
    """
    output = os.path.join(workspace, 'placeholder.geojson')
    with open(output, "w") as dst:
        collection = {
            "type": "FeatureCollection",
            "features": list(geojson_file)}
        dst.write(json.dumps(collection))
  
    clusters = gpd.read_file(output)
    os.remove(output)
    
    print(datetime.datetime.now())
    return clusters

## Import vector features as geodataframes       

In [8]:
# Create a new geo-dataframe for vector features
clusters = gpd.read_file(os.path.join(out_path, vect_nm))

## Extract raster values 

#### Land Cover

Note that the layer used follows the International Geosphere-Biosphere Programme (IGBP) classification (see [here](https://smap.jpl.nasa.gov/system/internal_resources/details/original/284_042_landcover.pdf) for more info).

In [10]:
clusters = processing_raster_cat("LC",clusters)

2020-11-20 12:03:57.915538


#### Water table depth

In [11]:
clusters = processing_raster("wtd","mean",clusters)

2020-11-20 12:09:30.205534


## Converting the geojson file to geodataframe

**NOTE** If you get a driver error when reading the geojson file into a geodataframe, the cause may be an "inf" or "-inf" value in one of the attributes. Python's `json` module writes such values as `Infinity` or `-Infinity`, which are not valid JSON (see a fix [here](https://stackoverflow.com/questions/17503981/is-there-a-way-to-override-pythons-json-handler)). An "easy" workaround is to import the geojson into QGIS and replace the erroneous value(s) manually. This is not ideal, but it does the job. In that case, save the updated geojson file and read it directly into a geodataframe with `gpd.read_file` instead of running the cell below.
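An alternative to the manual fix is to replace non-finite values before the features are written to disk. The following is a minimal sketch, assuming the features are the GeoJSON-like dictionaries returned by `zonal_stats` with `geojson_out=True`. The helper name `sanitize_features` and the sample feature are hypothetical and not part of the original notebook.

```python
import json
import math

def sanitize_features(features):
    """Return copies of the features with inf, -inf and nan properties set to None."""
    cleaned = []
    for feature in features:
        props = {
            key: (None if isinstance(value, float) and not math.isfinite(value) else value)
            for key, value in feature.get("properties", {}).items()
        }
        cleaned.append({**feature, "properties": props})
    return cleaned

# Hypothetical feature with a non-finite attribute
features = [{
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [38.7, 9.0]},
    "properties": {"id": 1, "wtdmean": float("-inf"), "LC12": 3},
}]

collection = {"type": "FeatureCollection", "features": sanitize_features(features)}
# allow_nan=False makes json.dumps raise if any non-finite value slipped through
text = json.dumps(collection, allow_nan=False)
print(text)
```

In this notebook, the equivalent step would be calling `clusters = sanitize_features(clusters)` before passing `clusters` to `geojson_to_gdf`. The affected attributes then arrive as missing values rather than breaking the read.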

In [12]:
clusters = geojson_to_gdf(out_path, clusters)

2020-11-20 12:20:22.207764


## Exporting the geodataframe as vector layer

In [13]:
# Export as shapefile 
clusters.to_file(os.path.join(out_path,"{c}.shp".format(c=out_nm)))