*Copyright (c) 2021 Centre National d'Etudes Spatiales (CNES).  
 This file is part of Bulldozer.  
 All rights reserved.*

# Bulldozer pre-process

This notebook presents the tools available in the pre-processing module of Bulldozer.

In [None]:
import matplotlib.pyplot as plt
import rasterio
from bulldozer.core.dsm_preprocess import PreProcess

## Build inner nodata mask
This function creates a mask of all the nodata points inside the input *Digital Surface Model* (DSM) **without** the border nodata.  
*For example, if the input DSM is skewed in the TIF file and the corners are nodata, these pixels will not be present in the inner nodata mask.*  
These inner nodata points mainly come from correlation or occlusion issues during the DSM computation.  
Setup:

In [None]:
dsm_path = '../tests/data/postprocess/dsm_test.tif'

⚠️ You have to provide the DSM as a raster array, so you first need to open your DSM TIF file: 

In [None]:
with rasterio.open(dsm_path) as dsm_dataset:
    preprocess = PreProcess()
    dsm = dsm_dataset.read(1)
    mask = preprocess.build_inner_nodata_mask(dsm)

✅ **Done!**  
We can now observe the results:

In [None]:
fig, axarr = plt.subplots(1, 2, figsize=(10, 6))
fig.suptitle('Bulldozer inner nodata mask generation', fontsize=16)

axarr[0].imshow(dsm)
axarr[0].set_title('Input DSM')

axarr[1].imshow(mask)
axarr[1].set_title('Output inner nodata mask')
fig.tight_layout()
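
To build intuition for what an inner nodata mask contains, here is a minimal NumPy sketch on a toy array. It is not Bulldozer's implementation: the function name `inner_nodata_mask_sketch`, the nodata value `-32768.0` and the flood-fill approach are assumptions made for illustration only. Nodata pixels connected to the raster border are treated as border nodata, and all other nodata pixels are flagged as inner nodata.

```python
from collections import deque

import numpy as np


def inner_nodata_mask_sketch(dsm, nodata=-32768.0):
    """Hypothetical helper: flag nodata pixels not connected to the raster border."""
    is_nodata = np.isnan(dsm) if np.isnan(nodata) else (dsm == nodata)
    rows, cols = dsm.shape
    border_connected = np.zeros(dsm.shape, dtype=bool)

    # Seed the flood fill with every nodata pixel lying on the raster edge.
    queue = deque()
    for r in range(rows):
        for c in range(cols):
            on_edge = r in (0, rows - 1) or c in (0, cols - 1)
            if on_edge and is_nodata[r, c]:
                border_connected[r, c] = True
                queue.append((r, c))

    # Spread through horizontally and vertically adjacent nodata pixels.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                if is_nodata[nr, nc] and not border_connected[nr, nc]:
                    border_connected[nr, nc] = True
                    queue.append((nr, nc))

    # Inner nodata = nodata that the border flood fill never reached.
    return is_nodata & ~border_connected


# Toy DSM: a nodata corner (border nodata) and one isolated nodata pixel.
toy_dsm = np.array([
    [-32768.0, -32768.0, 10.0, 10.0],
    [-32768.0, 10.0, 10.0, 10.0],
    [10.0, 10.0, -32768.0, 10.0],
    [10.0, 10.0, 10.0, 10.0],
])
toy_mask = inner_nodata_mask_sketch(toy_dsm)
print(toy_mask)
```

In this toy case only the isolated pixel at row 2, column 2 is flagged, while the nodata corner is left out of the mask.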

## Build disturbance mask
This method generates a mask that matches all heavily disturbed areas in the input DSM.  
These areas often correspond to correlation errors from the DSM computation (e.g. water areas).  
Setup:

In [None]:
# Required parameters
dsm_path = '../tests/data/postprocess/dsm_test.tif'
nb_max_worker = 16

# Optional parameters
slope_threshold = 2.0
is_four_connexity = True

The slope threshold is the maximum slope allowed between two consecutive pixels before they are considered disturbed.  
The boolean is_four_connexity sets which axes are explored: horizontal and vertical only (True, the default), or horizontal, vertical and diagonal (False).
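
The effect of these two parameters can be illustrated with a small NumPy sketch. This is one simple reading of the description above, not Bulldozer's actual algorithm: the function `disturbance_mask_sketch` is hypothetical, and flagging both pixels of each steep pair is an assumption.

```python
import numpy as np


def disturbance_mask_sketch(dsm, slope_threshold=2.0, is_four_connexity=True):
    """Hypothetical helper: flag pixels whose height jump to a neighbour exceeds the threshold."""
    dsm = np.asarray(dsm, dtype=float)
    mask = np.zeros(dsm.shape, dtype=bool)

    # Horizontal axis: compare each pixel with its right-hand neighbour.
    steep = np.abs(dsm[:, 1:] - dsm[:, :-1]) > slope_threshold
    mask[:, 1:] |= steep
    mask[:, :-1] |= steep

    # Vertical axis: compare each pixel with the one below it.
    steep = np.abs(dsm[1:, :] - dsm[:-1, :]) > slope_threshold
    mask[1:, :] |= steep
    mask[:-1, :] |= steep

    if not is_four_connexity:
        # Main diagonal: pixel (r, c) against pixel (r + 1, c + 1).
        steep = np.abs(dsm[1:, 1:] - dsm[:-1, :-1]) > slope_threshold
        mask[1:, 1:] |= steep
        mask[:-1, :-1] |= steep

        # Anti-diagonal: pixel (r, c + 1) against pixel (r + 1, c).
        steep = np.abs(dsm[1:, :-1] - dsm[:-1, 1:]) > slope_threshold
        mask[1:, :-1] |= steep
        mask[:-1, 1:] |= steep

    return mask


# Toy DSM: flat ground with a single 5 m spike in the bottom-right corner.
toy_dsm = np.array([
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 5.0],
])
four_mask = disturbance_mask_sketch(toy_dsm, 2.0, True)
eight_mask = disturbance_mask_sketch(toy_dsm, 2.0, False)
print(four_mask)
print(eight_mask)
```

With horizontal and vertical exploration only, the spike and its two direct neighbours are flagged. Adding the diagonals also flags the centre pixel, which touches the spike only diagonally.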

In [None]:
preprocess = PreProcess()
mask = preprocess.build_disturbance_mask(dsm_path, nb_max_worker, slope_threshold, is_four_connexity)

✅ **Done!**  
We can now observe the results:

In [None]:
fig, axarr = plt.subplots(1, 2, figsize=(10, 6))
fig.suptitle('Bulldozer disturbance mask computation', fontsize=16)

axarr[0].imshow(dsm)
axarr[0].set_title('Input DSM')

axarr[1].imshow(mask)
axarr[1].set_title('Output disturbance mask')
fig.tight_layout()

## Bulldozer full pre-process pipeline
The full pre-process pipeline is designed to be run before the Bulldozer DTM extraction.  
⚠️ It should not be used on its own, because the pre-processed DSM it produces is only meant to be used with the Bulldozer DTM extraction.  
This part of the tutorial is for the case where you want to run the Bulldozer pipeline step by step (for example, to split it into separate jobs and submit them to a cluster).  
Setup:


In [None]:
dsm_path = '../tests/data/postprocess/dsm_test.tif'
output_dir = '../tests/data/preprocess/'
nb_max_worker = 16

*In this tutorial we do not use the optional parameters. If you want more information about them, check the documentation.*

In [None]:
preprocess = PreProcess()
preprocess.run(dsm_path, output_dir, nb_max_worker)

✅ **Done!**  