# Updating cadastral maps using aerial images and deep learning

This notebook shows how to segment houseboats in aerial images and save the resulting polygons in GeoJSON format. Training the Mask R-CNN model is not part of this notebook.

## Save the LUFO and TOPO image tiles locally
Download the aerial and topographic image tiles from the City of Amsterdam objectstore or generate the tiles using [WMTS](https://map.data.amsterdam.nl/service?REQUEST=GetCapabilities&SERVICE=WMTS) with the same tiling scheme. Place the tiles locally in the `datasets/` folder. Cached topographic images can be found at the [City of Amsterdam server](https://t1.data.amsterdam.nl/topo_rd/13/).

In [None]:
# Example data files are available in these folders to run the notebook.
in_folder_lufo = "datasets/2020/lufo/13/"
in_folder_topo = "datasets/2020/topo/13/"
out_folder_lufo = "datasets/2020/lufo_water_only/13/"

## Overlay LUFO images with a water-only mask
To simplify the detection of houseboats, we overlay the aerial (LUFO) images with a non-water mask derived from the topographic tiles. The mask prevents houses on land from being detected. The result is saved to `out_folder_lufo`.

In [None]:
from src.mask_utils import create_water_only_tiles

create_water_only_tiles(in_folder_lufo, in_folder_topo, out_folder_lufo)
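
The masking idea can be illustrated with a minimal NumPy sketch. This is not the implementation of `create_water_only_tiles`; the function name `keep_water_only`, the water colour `WATER_RGB` and the `tolerance` parameter are assumptions made for illustration, and the real water colour depends on the style of the topographic tiles.

```python
import numpy as np

# Hypothetical RGB colour of water in the topographic tiles (an assumption;
# check the actual map style before using a value like this).
WATER_RGB = (170, 211, 223)


def keep_water_only(aerial, topo, water_rgb=WATER_RGB, tolerance=10):
    """Return a copy of `aerial` in which every pixel whose topo pixel
    is not close to `water_rgb` is set to black."""
    # Cast to a signed type so the subtraction cannot wrap around.
    diff = np.abs(topo.astype(np.int16) - np.array(water_rgb, dtype=np.int16))
    is_water = (diff <= tolerance).all(axis=-1)  # boolean mask of shape (H, W)
    result = aerial.copy()
    result[~is_water] = 0  # black out everything that is not water
    return result


# Tiny 2x2 example: only the top-left topo pixel has the water colour.
aerial = np.full((2, 2, 3), 200, dtype=np.uint8)
topo = np.zeros((2, 2, 3), dtype=np.uint8)
topo[0, 0] = WATER_RGB
water_only = keep_water_only(aerial, topo)
```

In this example only the top-left pixel of `water_only` keeps its aerial value; the other three pixels become black, and the input `aerial` is left unchanged.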

## Instance segmentation of houseboats in aerial images
[Mask R-CNN](https://arxiv.org/abs/1703.06870), a mask region-based convolutional neural network, is used for instance segmentation of the aerial images. The algorithm produces precise masks that can be converted to polygons outlining the houseboats.

**NOTE**: Running Mask R-CNN on a machine without a GPU will be very slow. We therefore suggest running the instance segmentation on, for example, [Google Colab](https://colab.research.google.com/). Perform the following steps to run the instance segmentation on Google Colab:
- Copy the [Mask R-CNN test notebook](models/Mask_R_CNN_Houseboats.ipynb) to a Google Drive folder. 
- Compress the `out_folder_lufo` folder (the overlaid LUFO images) to a zip file and copy it to the Google Drive folder. We use a compressed folder because Google Drive sometimes has issues with file indexing.
- Download the [pretrained model](TODO) and copy it to the Google Drive folder.
- Open the notebook in Google Colab and enable GPU (Runtime / Change runtime type).

In [None]:
# Compress the overlaid LUFO images to a zip file.
# `&&` ensures zip only runs if the cd succeeded.
!(cd datasets/2020/lufo_water_only/13/ && zip -r "../../lufo_water_only.zip" . -x ".*" -x "__MACOSX")
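
On systems without the `zip` command-line tool, the same step can be sketched with Python's standard library. The helper name `zip_tiles` is hypothetical, and its filtering only roughly mirrors the `-x` patterns of the shell command: it skips any path that contains a hidden component or a `__MACOSX` component.

```python
import zipfile
from pathlib import Path


def zip_tiles(src_folder, zip_path):
    """Zip all regular files under `src_folder`, skipping hidden files and
    `__MACOSX` entries. Returns the number of files written.

    `zip_path` should live outside `src_folder`, otherwise the archive
    would try to include itself.
    """
    src = Path(src_folder)
    count = 0
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(src.rglob("*")):
            if not path.is_file():
                continue
            rel = path.relative_to(src)
            if any(part.startswith(".") or part == "__MACOSX" for part in rel.parts):
                continue
            # Store paths relative to the tile folder, as `zip -r ... .` does.
            archive.write(path, rel.as_posix())
            count += 1
    return count
```

Calling `zip_tiles(out_folder_lufo, "datasets/2020/lufo_water_only.zip")` would write the archive to the same location as the shell command above.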

In [None]:
# Copy the Mask R-CNN output (a CSV file) from Google Drive to this local path.
out_folder_mask_r_cnn = "output/mask_r_cnn_predicted_houseboats.csv"

## Instance segmentation to GeoJSON
The trained Mask R-CNN model produces pixel-level segmentation masks for the houseboats in the aerial images, and these results are saved to a `*.csv` file. The next cell performs the following tasks:
- A minimum bounding rectangle algorithm is used to calculate the width and length of a polygon. The width and length of the rectangle are converted from pixels to metres, the unit of the Rijksdriehoek (RD) coordinate system.
- The center of a polygon is calculated in pixels and converted to Rijksdriehoek coordinates.
- The `*.csv` format is converted to GeoJSON format.
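
The first two steps can be sketched in plain Python. This is an illustration under stated assumptions, not the code behind `segmentations_to_coordinates`: the function names are hypothetical, the center returned is that of the bounding rectangle (the notebook's own definition of the polygon center may differ), and the tile origin and `metres_per_pixel` values are made up for the example.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns the hull vertices in order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def min_bounding_rectangle(points):
    """Return (width, length, center) of the minimum-area bounding rectangle.

    Uses the fact that one side of the minimum-area rectangle lies along
    an edge of the convex hull, so each hull edge is tried in turn.
    """
    hull = convex_hull(points)
    if len(hull) < 3:
        raise ValueError("need at least three non-collinear points")
    best = None
    for i in range(len(hull)):
        (x0, y0), (x1, y1) = hull[i], hull[(i + 1) % len(hull)]
        edge_len = math.hypot(x1 - x0, y1 - y0)
        ux, uy = (x1 - x0) / edge_len, (y1 - y0) / edge_len  # unit vector along the edge
        along = [x * ux + y * uy for x, y in hull]    # projection on the edge direction
        across = [-x * uy + y * ux for x, y in hull]  # projection on its normal
        side_a = max(along) - min(along)
        side_b = max(across) - min(across)
        area = side_a * side_b
        if best is None or area < best[0]:
            ca = (max(along) + min(along)) / 2
            cb = (max(across) + min(across)) / 2
            # Rotate the rectangle center back to the original axes.
            center = (ca * ux - cb * uy, ca * uy + cb * ux)
            best = (area, min(side_a, side_b), max(side_a, side_b), center)
    _, width, length, center = best
    return width, length, center


def pixel_to_rd(px, py, tile_x_min, tile_y_max, metres_per_pixel):
    """Convert a pixel position to RD coordinates, assuming a north-up tile
    whose top-left corner is (tile_x_min, tile_y_max) in RD metres."""
    # Pixel rows grow downwards, RD y grows northwards, hence the minus sign.
    return tile_x_min + px * metres_per_pixel, tile_y_max - py * metres_per_pixel


# Hypothetical polygon (in pixels) and hypothetical tile georeference.
corners_px = [(0, 0), (10, 0), (10, 4), (0, 4), (5, 2)]
width_px, length_px, center_px = min_bounding_rectangle(corners_px)
center_rd = pixel_to_rd(*center_px, tile_x_min=120000.0, tile_y_max=488000.0,
                        metres_per_pixel=0.5)
```

Multiplying `width_px` and `length_px` by the same `metres_per_pixel` factor gives the dimensions in metres.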

In [None]:
from src.geospatial_utils import segmentations_to_coordinates

out_file = "output/houseboat_polygon_data.csv"
segmentations_to_coordinates(out_folder_mask_r_cnn, out_file)
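
A CSV-to-GeoJSON conversion can be sketched with the standard library alone. The column names (`x_rd`, `y_rd`, `width_m`, `length_m`) and the helper `rows_to_geojson` are hypothetical; the actual columns written by `segmentations_to_coordinates` may differ. Note that RFC 7946 expects WGS 84 coordinates, so a file holding RD coordinates relies on its consumers knowing the coordinate system.

```python
import csv
import io
import json


def rows_to_geojson(csv_text):
    """Turn CSV rows into a GeoJSON FeatureCollection of Point features."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [float(row["x_rd"]), float(row["y_rd"])],
            },
            "properties": {
                "width_m": float(row["width_m"]),
                "length_m": float(row["length_m"]),
            },
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up example row, not real model output.
example_csv = "x_rd,y_rd,width_m,length_m\n121000.5,487000.25,4.0,15.5\n"
collection = rows_to_geojson(example_csv)
geojson_text = json.dumps(collection, indent=2)
```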

## [Optional] Draw binary polygon masks of the segmentations
Draw binary masks to visually validate the quality of the Mask R-CNN predictions. This part can also be used to generate a labelled dataset. See the repo's README for more information.

In [None]:
from src.visualize import draw_binary_mask

out_folder_masks = "output/masks"
draw_binary_mask(out_folder_mask_r_cnn, out_folder_masks)
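
To show what a binary polygon mask is, the sketch below rasterizes a single polygon with an even-odd scanline fill in plain Python. It is a stand-in for illustration only; `polygon_to_mask` is a hypothetical helper and `draw_binary_mask` may well use an imaging library instead.

```python
def polygon_to_mask(polygon, width, height):
    """Return a height x width grid of 0/1 values; a pixel is 1 when its
    center lies inside `polygon` (a list of (x, y) vertices in pixels)."""
    mask = [[0] * width for _ in range(height)]
    n = len(polygon)
    for row in range(height):
        y = row + 0.5  # sample at the pixel center
        crossings = []
        for i in range(n):
            (x0, y0), (x1, y1) = polygon[i], polygon[(i + 1) % n]
            # Half-open test so a vertex on the scanline is counted once.
            if (y0 <= y < y1) or (y1 <= y < y0):
                crossings.append(x0 + (y - y0) * (x1 - x0) / (y1 - y0))
        crossings.sort()
        # Fill between successive pairs of crossings (even-odd rule).
        for left, right in zip(crossings[0::2], crossings[1::2]):
            for col in range(width):
                if left <= col + 0.5 < right:
                    mask[row][col] = 1
    return mask


# A 2x2 pixel square inside a 4x4 image.
square_mask = polygon_to_mask([(1, 1), (3, 1), (3, 3), (1, 3)], width=4, height=4)
```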