# Post-processing: Smoothing and Reclassifying Classes

## Description
In this notebook we use external layers that contain reliable information on certain classes and/or have a higher spatial resolution to reclassify pixels that may have been misclassified by the random forest classifier. These external layers have been prepared and uploaded to the 'Data/' folder. We'll also apply a majority (modal) filter to reduce the 'salt and pepper' effect that results from pixel-based classification. This notebook demonstrates these steps and visualises the maps before and after post-processing.

To run this analysis, run all the cells in the notebook, starting with the "Load packages" cell.

### Load Packages

In [None]:
%matplotlib inline
import os
import datacube
import warnings
import numpy as np
import geopandas as gpd
import pandas as pd
import xarray as xr
import rioxarray
from rasterio.enums import Resampling
from datacube.utils.cog import write_cog
from deafrica_tools.spatial import xr_rasterize
from deafrica_tools.plotting import rgb, display_map
from skimage.morphology import binary_dilation,disk
from skimage.filters.rank import modal
from odc.algo import xr_reproject
import matplotlib.pyplot as plt

## Analysis parameters
* `prediction_maps_path`: A list of file paths of the classification maps produced in the previous notebook.
* `dict_map`: A dictionary map of class names corresponding to pixel values.
* `output_crs`: Coordinate reference system for output raster files.

In [None]:
prediction_maps_path=['Results/Rwanda_land_cover_prediction_2021_location_0.tif',
'Results/Rwanda_land_cover_prediction_2021_location_1.tif',
'Results/Rwanda_land_cover_prediction_2021_location_2.tif'] # list of prediction map files
dict_map={1:'Forest',5:'Grassland',7:'Shrubland',9:'Perennial Cropland',10:'Annual Cropland',11:'Wetland',12:'Water Body',13:'Urban Settlement'}
output_crs='epsg:32735' # WGS84/UTM Zone 35S
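
Because `dict_map` maps pixel values to class names, any rule that refers to a class by name needs the reverse lookup. A minimal sketch, assuming the `dict_map` defined above; `class_values` is a hypothetical helper name that does not appear in the original notebook:

```python
# dict_map as defined in the analysis parameters: pixel value -> class name
dict_map = {1: 'Forest', 5: 'Grassland', 7: 'Shrubland', 9: 'Perennial Cropland',
            10: 'Annual Cropland', 11: 'Wetland', 12: 'Water Body', 13: 'Urban Settlement'}

# hypothetical helper: invert the mapping so a pixel value can be looked up by class name
class_values = {name: value for value, name in dict_map.items()}

print(class_values['Water Body'])        # pixel value used for water
print(class_values['Urban Settlement'])  # pixel value used for built-up areas
```

Looking up a name that is not in `dict_map` raises a `KeyError`, which makes mismatched class names easy to spot early.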

## External Layers
A few external layers were sourced and prepared in the 'Data/' folder. They provide information on specific classes, e.g. Urban Settlement and Water Body, and include:
* `hand_raster`: Hydrologically adjusted elevations, i.e. Height Above the Nearest Drainage (HAND), derived from the [MERIT Hydro dataset](https://developers.google.com/earth-engine/datasets/catalog/MERIT_Hydro_v1_0_1#description).
* `river_network_shp`: OSM river network shapefile. The OSM layers were sourced from the [Humanitarian OpenStreetMap Team (HOT)](https://data.humdata.org/organization/hot) website.
* `road_network_shp`: OSM road network shapefile.
* `google_building_raster`: A rasterised layer of [Google Open Building polygons](https://developers.google.com/earth-engine/datasets/catalog/GOOGLE_Research_open-buildings_v2_polygons), which consist of outlines of buildings derived from high-resolution 50 cm satellite imagery. As there are many polygons in the original vector layer, we rasterised the layer to reduce disk storage and memory required for processing.
* `wsf2019_raster`: 2019 [World Settlement Footprint (WSF) layer](https://gee-community-catalog.org/projects/wsf/), a 10m resolution binary mask outlining the extent of human settlements globally derived by means of 2019 multitemporal Sentinel-1 and Sentinel-2 imagery.  

> Note: The data for this notebook have been prepared so that you can run through the demonstration. To apply the workflow to your own project, you may need to source the datasets yourself and pre-process them, e.g. clipping to your study area, filtering, rasterisation or vectorisation. Alternatively, you can revise this notebook to suit your data format.

In [None]:
river_network_shp='Data/hotosm_rwa_waterways_lines_filtered.shp' # OSM river network data
road_network_shp='Data/hotosm_rwa_roads_lines_filtered.shp' # OSM road network data
google_building_raster='Data/GoogleBuildingLayer_Rwanda_rasterised.tif' # rasterised google building layer
hand_raster='Data/hand_Rwanda.tif' # Hydrologically adjusted elevations, i.e. height above the nearest drainage (hand)
wsf2019_raster='Data/WSF2019_v1_Rwanda_clipped.tif' # 2019 World Settlement Footprint layer

## Load layers
First let's load the land cover maps generated from the previous notebook:

In [None]:
# import land cover maps of 2021
prediction_maps=[]
for path in prediction_maps_path:
    # append each map; assigning by index would raise an IndexError on an empty list
    prediction_maps.append(rioxarray.open_rasterio(path).astype(np.uint8).squeeze())

We then load the other layers. The OSM road network layer contains multi-lines with various surface attributes. We'll select roads by their surface type and buffer them by 10 metres:

In [None]:
# import OSM road network data and reproject
road_network=gpd.read_file(road_network_shp).to_crs(output_crs) 
road_network=road_network.loc[road_network['surface'].isin(['asphalt', 'paved', 'compacted', 'cobblestone', 
                                                             'concrete', 'metal', 'paving_stones', 
                                                             'paving_stones:30'])] # select road network by attributes
road_network.geometry=road_network.geometry.buffer(10) # buffer the road network by 10m

Similarly, we'll select major waterways from the OSM river network layer:

In [None]:
river_network=gpd.read_file(river_network_shp).to_crs(output_crs) # import OSM river network data and reproject
river_network=river_network.loc[river_network['waterway'].isin(['canal','river'])] # select river network by attribute

We now load the Google buildings, WSF 2019 and 'hand' rasters:

In [None]:
google_buildings=xr.open_dataset(google_building_raster,engine="rasterio").squeeze() # import google building layer
hand=xr.open_dataset(hand_raster,engine="rasterio").squeeze() # import hand layer
wsf2019=xr.open_dataset(wsf2019_raster,engine="rasterio").astype(np.uint8).squeeze() # import 2019 wsf layer

In [None]:
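# reproject hand layer to the grid of the land cover tile
# (ds_geobox is the geobox of the current tile, defined in the tile loop at the end of this notebook)
hand=xr_reproject(hand, ds_geobox, resampling="average")
np_hand=hand.to_array().squeeze().to_numpy()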


## Morphological filtering
To reduce salt-and-pepper noise in the classification map, we can apply a majority (modal) filter. In this example we use a disk-shaped footprint with a radius of 2.5 pixels. You should adjust the size depending on your data and study area.

In [None]:
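# mode filtering for a smoother classification map; pixels with value 0 are excluded by the mask
# the footprint is passed positionally because newer scikit-image releases renamed the keyword from selem to footprint
np_landcover2021_post=modal(np_landcover2021,disk(2.5),mask=np_landcover2021!=0)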
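
To illustrate what a majority filter does, here is a small NumPy-only sketch on a toy array. It uses a square window rather than the disk footprint of `skimage.filters.rank.modal`, and the function name `majority_filter` is hypothetical, so treat it as an illustration of the idea rather than a replacement for the cell above:

```python
import numpy as np

def majority_filter(labels, radius=1, nodata=0):
    """Replace each valid pixel with the most frequent valid class in a square window.

    Ties are resolved in favour of the lowest class value.
    """
    out = labels.copy()
    rows, cols = labels.shape
    for r in range(rows):
        for c in range(cols):
            if labels[r, c] == nodata:
                continue  # leave nodata pixels untouched
            window = labels[max(r - radius, 0):r + radius + 1,
                            max(c - radius, 0):c + radius + 1]
            valid = window[window != nodata]  # ignore nodata neighbours
            values, counts = np.unique(valid, return_counts=True)
            out[r, c] = values[np.argmax(counts)]
    return out

# toy map: one isolated 'Urban Settlement' pixel (13) inside 'Forest' (1), above 'Annual Cropland' (10)
toy = np.array([[1, 1, 1, 1],
                [1, 13, 1, 1],
                [1, 1, 1, 1],
                [10, 10, 10, 10],
                [10, 10, 10, 10]], dtype=np.uint8)

smoothed = majority_filter(toy)
print(smoothed)
```

The isolated pixel is absorbed by its surrounding class, while the boundary between the two larger regions is preserved.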

We can plot the map before and after filtering:

In [None]:
fig, axes = plt.subplots(1, 2, figsize=(24, 8))

# the maps are NumPy arrays here, so they are plotted with imshow
# Plot classified image before filtering
im_before = axes[0].imshow(np_landcover2021, interpolation='nearest')
fig.colorbar(im_before, ax=axes[0])

# Plot classified image after filtering
im_after = axes[1].imshow(np_landcover2021_post, interpolation='nearest')
fig.colorbar(im_after, ax=axes[1])

# Add plot titles
axes[0].set_title('Classified Image Before Majority Filtering')
axes[1].set_title('Classified Image After Majority Filtering')

## Apply Rules using other layers
We'll use the loaded layers and a set of rules to reclassify some classes. First, we assign the water class to pixels that were classified as water and lie at the bottom of watersheds (low HAND values), and to pixels that fall within the OSM river network:

In [None]:
# make a copy of the map before applying the rules
np_landcover_copy=np_landcover2021_post.copy()
river_network_mask=xr_rasterize(gdf=river_network,
                                  da=landcover2021_tile.squeeze(),
                                  transform=ds_geobox.transform,
                                  crs=output_crs)
np_river_network_mask=river_network_mask.to_numpy()
# dict_map is keyed by pixel value, so use the value of 'Water Body' (12) directly
water_value=12
np_landcover2021_post[((np_landcover2021==water_value)&(np_hand<=45))|(np_river_network_mask==1)]=water_value
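
The water rule combines three boolean conditions into a single mask. A minimal sketch on toy 2x2 arrays, where all input values are made up for illustration and `12` is the pixel value of 'Water Body' in `dict_map`:

```python
import numpy as np

water_value = 12  # pixel value of 'Water Body' in dict_map

# hypothetical toy inputs, for illustration only
original = np.array([[12, 10], [12, 11]], dtype=np.uint8)  # map before filtering
filtered = np.array([[10, 10], [12, 11]], dtype=np.uint8)  # map after filtering
hand = np.array([[3.0, 60.0], [80.0, 5.0]])                # height above nearest drainage
river = np.array([[0, 1], [0, 0]], dtype=np.uint8)         # rasterised river network

# water in the original map at low HAND, or any pixel on the river network
rule = ((original == water_value) & (hand <= 45)) | (river == 1)

result = filtered.copy()
result[rule] = water_value
print(result)
```

Note that the rule only adds water pixels. A pixel that is still water after filtering keeps its class even where HAND exceeds the threshold, as in the bottom-left pixel of this example.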

We then assign pixels overlapping the Google building layer or the WSF mask to the built-up (Urban Settlement) class:

In [None]:
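# the Google building layer was loaded as a raster, so it is reprojected to the tile grid rather than rasterised
# (building pixels are assumed to have the value 1, as in the comparison below)
google_buildings_tile=xr_reproject(google_buildings, ds_geobox, resampling="nearest")
np_google_buildings_mask=google_buildings_tile.to_array().squeeze().to_numpy()
wsf2019=xr_reproject(wsf2019, ds_geobox, resampling="nearest")
np_wsf2019=wsf2019.to_array().squeeze().to_numpy()
# dict_map is keyed by pixel value, so use the value of 'Urban Settlement' (13) directly
urban_value=13
np_landcover2021_post[(np_google_buildings_mask==1)|(np_wsf2019==1)]=urban_value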

We assume that wetlands should not be too close (e.g. within 50 m) to built-up areas, and reclassify wetland pixels within this distance as cropland instead:

In [None]:
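# dict_map is keyed by pixel value: 13 is 'Urban Settlement' and 11 is 'Wetland'
# the footprint is passed positionally because newer scikit-image releases renamed the keyword from selem to footprint
urban_buffered=binary_dilation(np_landcover2021==13,disk(5))
# dict_map has no plain 'Cropland' class; 'Annual Cropland' (10) is assumed here
np_landcover2021_post[(urban_buffered==1)&(np_landcover2021==11)]=10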
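
The footprint radius passed to `disk` is expressed in pixels, so a buffer distance in metres has to be converted using the pixel size. A minimal sketch, assuming a 10 m pixel size (consistent with 50 m corresponding to `disk(5)`, but not stated explicitly in this notebook); `buffer_radius_pixels` is a hypothetical helper:

```python
import math

def buffer_radius_pixels(distance_m, pixel_size_m):
    """Convert a buffer distance in metres to a footprint radius in whole pixels."""
    if pixel_size_m <= 0:
        raise ValueError('pixel_size_m must be positive')
    # round up so the buffer covers at least the requested distance
    return math.ceil(distance_m / pixel_size_m)

# assumed 10 m pixels: a 50 m buffer needs a radius of 5 pixels
radius = buffer_radius_pixels(50, 10)
print(radius)
```

If your maps use a different resolution, recompute the radius before calling `binary_dilation` so that the buffer still represents the intended ground distance.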

In addition, we assign pixels overlapping the OSM road network to the built-up class:

In [None]:
road_network_mask=xr_rasterize(gdf=road_network,
                              da=landcover2021_tile.squeeze(),
                              transform=ds_geobox.transform,
                              crs=output_crs)
np_road_network_mask=road_network_mask.to_numpy()
np_landcover2021_post[np_road_network_mask==1]=13 # pixel value of 'Urban Settlement' in dict_map

We can plot the maps to see a comparison before and after applying the rules:

In [None]:
fig, axes = plt.subplots(1, 2, figsize=(24, 8))

# the maps are NumPy arrays here, so they are plotted with imshow
# Plot classified image before applying the rules
im_before = axes[0].imshow(np_landcover_copy, interpolation='nearest')
fig.colorbar(im_before, ax=axes[0])

# Plot classified image after applying the rules
im_after = axes[1].imshow(np_landcover2021_post, interpolation='nearest')
fig.colorbar(im_after, ax=axes[1])

# Add plot titles
axes[0].set_title('Classified Image Before Applying Rules')
axes[1].set_title('Classified Image After Applying Rules')
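
Alongside the plots, the change can be summarised numerically by counting pixels per class before and after the rules. A minimal sketch on toy arrays, with made-up values; `class_counts` is a hypothetical helper, and `dict_map` is the dictionary from the analysis parameters:

```python
import numpy as np

dict_map = {1: 'Forest', 5: 'Grassland', 7: 'Shrubland', 9: 'Perennial Cropland',
            10: 'Annual Cropland', 11: 'Wetland', 12: 'Water Body', 13: 'Urban Settlement'}

def class_counts(labels):
    """Return a {class name: pixel count} dictionary for a classified array."""
    values, counts = np.unique(labels, return_counts=True)
    return {dict_map.get(int(v), 'Unknown ({})'.format(int(v))): int(c)
            for v, c in zip(values, counts)}

# hypothetical toy maps before and after applying the rules
before = np.array([[11, 11, 10], [10, 10, 13]], dtype=np.uint8)
after = np.array([[10, 11, 10], [10, 13, 13]], dtype=np.uint8)

counts_before = class_counts(before)
counts_after = class_counts(after)
for name in sorted(set(counts_before) | set(counts_after)):
    print(name, counts_before.get(name, 0), '->', counts_after.get(name, 0))
```

In the notebook, the same helper could be applied to `np_landcover_copy` and `np_landcover2021_post` to see which classes gained or lost pixels.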

## Save as GeoTIFF
We can now export our predictions to the Sandbox disk as Cloud-Optimised GeoTIFFs:

In [None]:
# convert result back to DataArray
landcover2021_tile_post=xr.DataArray(data=np_landcover2021_post,dims=['y','x'],
                                     coords={'y':landcover2021_tile.y.to_numpy(), 'x':landcover2021_tile.x.to_numpy()})
landcover2021_tile_post.rio.write_crs(output_crs, inplace=True)
# export as geotiff
write_cog(landcover2021_tile_post, 'Results/Land_cover2021_postproc_add_all_others_tile_'+str(i)+'.tif', overwrite=True)

In [4]:
for i in range(len(tile_bboxes)):
# for i in range(0,3):
    x_min,y_min,x_max,y_max=tile_bboxes.iloc[i]
    print('Processing tile ',i,'with bbox of ',x_min,y_min,x_max,y_max)
    # clip land cover maps to tile boundary
    landcover2021_tile=landcover2021.rio.clip_box(minx=x_min,miny=y_min,maxx=x_max,maxy=y_max)
    ds_geobox=landcover2021_tile.geobox
    np_landcover2021=landcover2021_tile.squeeze().to_numpy()
    np_landcover2021_post=np_landcover2021.copy()


Processing tile  0 with bbox of  706710.6451945585 9734601.615224078 736778.9456614424 9764669.916827785
Processing tile  1 with bbox of  706571.951193417 9704669.970329456 736654.9786463858 9734752.998859746
Processing tile  2 with bbox of  706418.4618659335 9674738.4230786 736516.2033688526 9704836.165599158
Processing tile  3 with bbox of  736962.4177453369 9824310.06328279 767001.3429964046 9854348.989904791
Processing tile  4 with bbox of  736868.0657618204 9794363.364797598 766921.7626612171 9824417.063008774
Processing tile  5 with bbox of  736758.9000742742 9764416.736721337 766827.3563398367 9794485.194239335
Processing tile  6 with bbox of  736634.9232347194 9734470.189283589 766718.1262457089 9764553.393487478
Processing tile  7 with bbox of  736496.13814132 9704523.732704137 766594.0749397537 9734621.670635745
Processing tile  8 with bbox of  736342.5480382751 9674577.37719173 766455.2053302713 9704690.035556989
Processing tile  9 with bbox of  766901.72682136 9824223.02929