<img style="float: left; margin:0px 15px 15px 0px; width:120px" src="https://www.orfeo-toolbox.org/wp-content/uploads/2016/03/logo-orfeo-toolbox.png">

# OTB Guided Tour - Virtual Workshop
## Emmanuelle SARRAZIN, Yannick TANGUY and David YOUSSEFI (CNES, French Space Agency)

<br>

Orfeo ToolBox (OTB) is an open-source library for remote sensing image processing. It was initiated and funded by CNES to promote the use and exploitation of satellite images. Orfeo ToolBox aims to enable state-of-the-art processing of large images even on laptops with limited resources. It ships with a set of extensible, ready-to-use tools for classical remote sensing tasks, as well as a fully integrated, end-user-oriented application called Monteverdi. OTB is also accessible through the QGIS Processing module.

This tutorial presents Orfeo ToolBox and showcases the applications available for processing and manipulating satellite imagery.

<b> Press <span style="color:black;background:yellow">SHIFT+ENTER</span> to execute the notebook interactively cell by cell </b></div>



## Mount Google Drive and OTB Installation (optional step)

The following cells are needed to run this notebook on Google Colab. The first cell mounts a Google Drive folder, downloads the OTB binaries, installs them in the virtual environment, and compiles the Python bindings so that we can later run "import otbApplication". The next cells configure the OTB environment variables and install third-party libraries.

*If you run this notebook on your own computer, you can jump to cell 5 ("Sentinel 2 Dataset").*

In [None]:
from google.colab import drive
drive.mount('/content/gdrive/')
# *******************************************************************************************************
# 
# At this step, google will ask you to login and authorize access to your google drive from this notebook
#
# *******************************************************************************************************

import sys

# TODO : fix folder path
FOLDER = 'gdrive/My Drive/material/01-otb-guided-tour/'
sys.path.append(FOLDER)

# This will download Orfeo ToolBox
!wget https://www.orfeo-toolbox.org/packages/OTB-7.2.0-Linux64.run
!apt-get install file

# This configures OTB (source environment and compile Python bindings)
!chmod +x OTB-7.2.0-Linux64.run && ./OTB-7.2.0-Linux64.run && cd OTB-7.2.0-Linux64 && ctest -S share/otb/swig/build_wrapping.cmake -VV

In [None]:
# *******************************************************************************************************
# Configure OTB environment variables
# *******************************************************************************************************
import os, sys
os.environ["CMAKE_PREFIX_PATH"] = "/content/OTB-7.2.0-Linux64"
os.environ["OTB_APPLICATION_PATH"] = "/content/OTB-7.2.0-Linux64/lib/otb/applications"
os.environ["PATH"] = "/content/OTB-7.2.0-Linux64/bin" + os.pathsep + os.environ["PATH"]
sys.path.insert(0, "/content/OTB-7.2.0-Linux64/lib/python")
os.environ["LC_NUMERIC"] = "C"
os.environ["GDAL_DATA"] = "/content/OTB-7.2.0-Linux64/share/gdal"
os.environ["PROJ_LIB"] = "/content/OTB-7.2.0-Linux64/share/proj"
os.environ["GDAL_DRIVER_PATH"] = "disable"
os.environ["OTB_MAX_RAM_HINT"] = "1000"

In [None]:
# Installation of third-party libraries
!pip install rasterio

## Display the Sentinel 2 Dataset

### Data / Output directory

* fix links to the dataset
* different dates available
* display images: ipyleaflet doesn't work on Google Colab -> use rasterio

In [1]:
#import ipywidgets

# Data directory
DATA_DIR = "/gdrive/MyDrive/OTB_training/data"



# Output directory
OUTPUT_DIR = "/gdrive/MyDrive/OTB_training/output"


In [15]:
DATA_DIR = "/home/yannick/Dev/dataset_OTB_virtual_workshop/"
# Local execution
OUTPUT_DIR = "/home/yannick/tmp"


### Choose your dataset (by date)

In [3]:
import os
from glob import glob


In [13]:
#import display_api
image = DATA_DIR+"/xt_20180701_RVB_NIR.tif"
# displays a raster on a ipyleaflet map (
#    rasters_list: rasters to display (rasterio image),
#    out_dir: path to the output directory (preview writing)
#    overlay_names_list: name of the overlays for the map)


#m  = display_api.rasters_on_map([raster], OUTPUT_DIR, [DATE])



In [5]:

import datetime
import os

import folium
from rasterio.warp import transform_bounds


def rasters_on_map_with_folium(rasters_list, out_dir, overlay_names_list, geojson_data=None):
    """
    displays rasters on a folium map
    :param rasters_list: rasters to display (rasterio image)
    :param out_dir: path to the output directory (preview writing)
    :param overlay_names_list: name of the overlays for the map
    :param geojson_data: optional GeoJSON data drawn on top of the rasters
    """
    # - get bounding box (west, south, east, north) in WGS84
    raster = rasters_list[0]
    epsg4326 = {'init': 'EPSG:4326'}
    bounds = transform_bounds(raster.crs, epsg4326, *raster.bounds)
    center = [(bounds[0]+bounds[2])/2, (bounds[1]+bounds[3])/2]

    # - get centered map (folium expects (latitude, longitude))
    m = folium.Map(location=(center[-1], center[0]), zoom_start=10)

    # - plot quicklook
    for raster, overlay_name in zip(rasters_list, overlay_names_list):
        bounds = transform_bounds(raster.crs, epsg4326, *raster.bounds)
        quicklook_url = os.path.join(out_dir, "PREVIEW_{}.JPG".format(datetime.datetime.now()))
        # NOTE: write_quicklook is not defined in this notebook; it is assumed
        # to be provided by the workshop's display_api helper module
        write_quicklook(raster, quicklook_url)
        quicklook = folium.raster_layers.ImageOverlay(
            quicklook_url,
            ((bounds[1], bounds[0]), (bounds[3], bounds[2])),
            name=overlay_name
        )
        m.add_child(quicklook)

    # - add geojson data (folium takes the style through a function)
    if geojson_data is not None:
        style = {'color': 'green', 'opacity': 1, 'weight': 1.9, 'dashArray': '9', 'fillOpacity': 0.1}
        geo_json = folium.GeoJson(geojson_data, style_function=lambda feature: style)
        m.add_child(geo_json)

    return m

In [16]:
import rasterio
from rasterio.warp import transform_bounds
from rasterio.warp import transform



im = rasterio.open(DATA_DIR+"/xt_SENTINEL2B_20180621_RGB_NIR.tif")
m  = rasters_on_map_with_folium([im], OUTPUT_DIR, ["bblabal"])


## 1) How to compute vegetation index with OTB

Here we create an application with otbApplication.Registry.CreateApplication("BandMath").

BandMath takes a list of images as input, so the "il" parameter must be given a Python list: [image], or [image1, image2, ..., imageN]. The main parameter is the mathematical expression "exp".

Here, we compute NDVI:
$$NDVI = \frac{X_{nir} - X_{red}}{X_{nir} + X_{red}}$$

The corresponding bands for NIR and Red are respectively the 4th and the 1st bands (b4, b1) of the first image (im1)
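
To make the formula concrete before running it through BandMath, here is a minimal pure-Python sketch that evaluates it on a few single pixels. The reflectance values and the function name `ndvi` are illustrative assumptions, not values read from the workshop dataset, and returning 0.0 for an empty pixel is a choice made only for this example.

```python
def ndvi(nir, red):
    """Return the NDVI of one pixel from its NIR and red reflectances."""
    total = nir + red
    if total == 0:
        # Assumption for this sketch: an empty pixel gets a neutral value
        # instead of raising a division-by-zero error.
        return 0.0
    return (nir - red) / total


# Hypothetical reflectances, chosen only to illustrate typical behaviours
vegetation = ndvi(nir=0.5, red=0.1)   # strong NIR response, high NDVI
bare_soil = ndvi(nir=0.3, red=0.2)    # similar NIR and red, NDVI close to 0
water = ndvi(nir=0.02, red=0.06)      # NIR absorbed by water, negative NDVI

print(vegetation, bare_soil, water)
```

The index always lies between -1 and 1, which is why the results can be compared across dates.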

In [None]:
import otbApplication as otb

In [None]:
def compute_ndvi(im, ndvi):
    app = otb.Registry.CreateApplication("BandMath")
    app.SetParameterStringList("il",[im])
    app.SetParameterString("out", ndvi)
    app.SetParameterString("exp", "(im1b4-im1b1)/(im1b4+im1b1)")
    exit_code = app.ExecuteAndWriteOutput()

In [None]:
image1 = DATA_DIR+"/xt_SENTINEL2B_20180621_RGB_NIR.tif"
image2 = DATA_DIR+"/xt_SENTINEL2B_20180701_RGB_NIR.tif"
image3 = DATA_DIR+"/xt_SENTINEL2B_20180711_RGB_NIR.tif"

In [None]:
ndvi1 = OUTPUT_DIR+"/ndvi1.tif"
ndvi2 = OUTPUT_DIR+"/ndvi2.tif"
ndvi3 = OUTPUT_DIR+"/ndvi3.tif"

In [None]:
# Choose an image and compute the NDVI
#compute_ndvi(image1, ndvi1)
compute_ndvi(...., ....)

In [None]:
# display the result
display_on_map(ndvi1)

## 2) Compute Normalized Difference Water Index from Green and Near InfraRed bands

NDWI2 is computed from green and nir bands (defined by McFeeters, 1996):

$$NDWI2 = \frac{X_{green} - X_{nir}}{X_{green} + X_{nir}}$$

For this second variant of the NDWI, a threshold can also be found in https://www.mdpi.com/2072-4292/5/7/3544/htm (McFeeters, 2013):

* < 0.3 - Non-water
* >= 0.3 - Water
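
The formula and the 0.3 threshold can be checked by hand on a couple of pixels. The sketch below does so in plain Python; the reflectance values and the helper names `ndwi2` and `is_water` are assumptions introduced for illustration only.

```python
WATER_THRESHOLD = 0.3  # threshold quoted above (McFeeters, 2013)


def ndwi2(green, nir):
    """NDWI2 of one pixel; assumes green + nir is not zero."""
    return (green - nir) / (green + nir)


def is_water(green, nir, threshold=WATER_THRESHOLD):
    """True when the pixel falls in the 'Water' class (NDWI2 >= threshold)."""
    return ndwi2(green, nir) >= threshold


# Hypothetical (green, nir) reflectances for two kinds of surfaces
samples = {"open water": (0.08, 0.02), "vegetation": (0.08, 0.40)}
labels = {
    name: "Water" if is_water(green, nir) else "Non-water"
    for name, (green, nir) in samples.items()
}
print(labels)
```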


In [None]:
def compute_ndwi(im, ndwi):
    app = otb.Registry.CreateApplication("BandMath")
    app.SetParameterStringList("il",[im])
    app.SetParameterString("out", ndwi)
    app.SetParameterString("exp", "your expression")
    exit_code = app.ExecuteAndWriteOutput()

In [None]:
def compute_ndwi(im, ndwi):
    app = otb.Registry.CreateApplication("BandMath")
    app.SetParameterStringList("il",[im])
    app.SetParameterString("out", ndwi)
    app.SetParameterString("exp", "(im1b2-im1b4)/(im1b2+im1b4)")
    exit_code = app.ExecuteAndWriteOutput()

In [None]:
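# Compute NDWI on your images (the three dates are needed later for the watermask)
ndwi1 = OUTPUT_DIR+"/ndwi1.tif"
ndwi2 = OUTPUT_DIR+"/ndwi2.tif"
ndwi3 = OUTPUT_DIR+"/ndwi3.tif"
compute_ndwi(image1, ndwi1)
compute_ndwi(image2, ndwi2)
compute_ndwi(image3, ndwi3)
display_on_map(ndwi1)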

## 3) Compute Watermask

The aim of this exercise is to combine NDWI2 values to create a water mask.

As we have seen, the NDWI2 images differ from one date to another, mainly because the tide level changes and because clouds may hide some regions of the image.

We have to find a function that combines the information from the different NDWI images into a single watermask.

### Create a simple watermask (NDWI threshold)

OTB BandMath supports the following operators and functions:

    binary operators:
        ‘+’ addition, ‘-’ subtraction, ‘*’ multiplication, ‘/’ division
        ‘^’ raise x to the power of y
        ‘<’ less than, ‘>’ greater than, ‘<=’ less or equal, ‘>=’ greater or equal
        ‘==’ equal, ‘!=’ not equal
        ‘||’ logical or, ‘&&’ logical and
    functions: exp(), log(), sin(), cos(), min(), max(), ...
    if-then-else: "(<expression> ? <value if true> : <value if false>)"

https://www.orfeo-toolbox.org/CookBook/Applications/app_BandMath.html
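
The if-then-else form is evaluated independently for every pixel. As a rough illustration on a case unrelated to the exercise, the sketch below pairs a BandMath expression that replaces negative values with 0 with its plain-Python equivalent. The expression, the sample pixel values and the name `clamp_negative` are assumptions chosen for this example.

```python
# BandMath expression (string only, not executed here):
# "if the pixel of band 1 is negative then 0 else keep the pixel value"
BANDMATH_EXPR = "(im1b1 < 0 ? 0 : im1b1)"


def clamp_negative(pixel):
    """Python equivalent of BANDMATH_EXPR for a single pixel value."""
    return 0 if pixel < 0 else pixel


# Hypothetical pixel values; BandMath would apply the rule to every pixel
pixels = [-0.4, 0.0, 0.25, 0.8]
clamped = [clamp_negative(p) for p in pixels]
print(clamped)
```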

Try to write an expression to create a basic watermask :

*if ndwi < 0.3 then return 0 else return 1*

In [None]:
def threshold_ndwi(ndwi, mask):
    # Fill the threshold_ndwi function
    pass

mask1 = OUTPUT_DIR+"/mask1.tif"
threshold_ndwi(ndwi1, mask1)

In [None]:
def threshold_ndwi(ndwi, mask):
    app = otb.Registry.CreateApplication("BandMath")
    app.SetParameterStringList("il",[ndwi])
    app.SetParameterString("out", mask)
    app.SetParameterString("exp", "(im1b1 < 0.3 ? 0 : 1) ")
    exit_code = app.ExecuteAndWriteOutput()

### Create a watermask with the different NDWI images

We now want to use the three dates to obtain a better mask, one that identifies the presence of water more reliably. To do so, we keep the largest water extent (high tide) found across the different watermasks.

Tip: OTB BandMath can take a list of images as input (im1, im2, ...) and produce a single result.
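
In a multi-image expression, each variable names one band of one input, following the imXbY pattern already used above (im1b4 is band 4 of the first image in the "il" list). The small sketch below builds such an expression for the per-pixel mean of N single-band images. The helper names `band_variable` and `mean_expression` are hypothetical, and the mean is used only as a neutral example of combining several inputs.

```python
def band_variable(image_index, band_index=1):
    """BandMath variable name; image and band indices both start at 1."""
    return "im{}b{}".format(image_index, band_index)


def mean_expression(n_images, band_index=1):
    """Expression averaging the same band over the first n_images inputs."""
    terms = [band_variable(i, band_index) for i in range(1, n_images + 1)]
    return "({}) / {}".format(" + ".join(terms), n_images)


expr = mean_expression(3)
print(expr)
```

The resulting string can be passed to the "exp" parameter, provided the "il" list contains at least as many images as the expression refers to.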

In [None]:
def create_water_mask(ndwi1, ndwi2, ndwi3, mask):
    pass


In [None]:
def create_water_mask(ndwi1, ndwi2, ndwi3, mask):
    app = otb.Registry.CreateApplication("BandMath")
    app.SetParameterStringList("il",[ndwi1, ndwi2, ndwi3])
    app.SetParameterString("out", mask)
    # each NDWI image has a single band: use band 1 of images 1, 2 and 3
    app.SetParameterString("exp", "(max(im1b1, im2b1, im3b1) < 0.3 ? 0 : 1) ")
    exit_code = app.ExecuteAndWriteOutput()

In [None]:
final_mask = OUTPUT_DIR+"/mask.tif"
create_water_mask(ndwi1, ndwi2, ndwi3, final_mask)

In [None]:
display_on_map(final_mask)

## 4) Polygonize watermask and filter features to count islands

In this step, we polygonize our binary mask, which yields a lot of polygons! Some of these features have to be filtered out (mainland, ocean) in order to count the islands in the Gulf of Morbihan.
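
The filtering in the next cell keeps polygons whose area lies between 10 000 and 5 000 000 m². To give a feel for these thresholds, here is a minimal pure-Python sketch that computes a polygon area with the shoelace formula and applies the same bounds. It assumes coordinates expressed in metres (a projected CRS); the function names and the sample squares are hypothetical, and the notebook itself relies on shapely for the area.

```python
def shoelace_area(ring):
    """Area of a simple polygon given as a list of (x, y) vertices.

    The ring may be open or closed (first vertex repeated at the end).
    """
    area = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0


def keep_as_island(area_m2, min_area=10000.0, max_area=5000000.0):
    """Same bounds as the notebook filter: drop tiny and huge polygons."""
    return min_area < area_m2 < max_area


# Hypothetical squares, with coordinates in metres
rock = [(0, 0), (50, 0), (50, 50), (0, 50)]            # 50 m x 50 m
islet = [(0, 0), (150, 0), (150, 150), (0, 150)]       # 150 m x 150 m
mainland = [(0, 0), (3000, 0), (3000, 3000), (0, 3000)]  # 3 km x 3 km

for name, ring in [("rock", rock), ("islet", islet), ("mainland", mainland)]:
    area = shoelace_area(ring)
    print(name, area, keep_as_island(area))
```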

In [None]:
import numpy as np

import rasterio
from rasterio import features
from rasterio import warp

from shapely.geometry import Polygon

# Output directory
OUTPUT_DIR = "output"

# Morbihan gulf
morbihan = {'type': 'FeatureCollection', 'features': [{'type': 'Feature', 'properties': {}, 'geometry': {'type': 'Polygon', 'coordinates': [[[-2.953968, 47.603544], [-2.958085, 47.589653], [-2.929956, 47.563713], [-2.883303, 47.561397], [-2.86478, 47.556763], [-2.840081, 47.547958], [-2.826638, 47.552744], [-2.808114, 47.55761], [-2.774497, 47.5495], [-2.749113, 47.546951], [-2.730932, 47.564329], [-2.728188, 47.5861], [-2.740537, 47.595825], [-2.747055, 47.605317], [-2.780672, 47.609252], [-2.786846, 47.613649], [-2.802348, 47.61882], [-2.831848, 47.616969], [-2.857233, 47.620209], [-2.862721, 47.609563], [-2.865466, 47.600303], [-2.880559, 47.59984], [-2.891536, 47.597988], [-2.896339, 47.582243], [-2.938875, 47.589653], [-2.953968, 47.603544]]]}}]}
morbihan_as_polygon = Polygon(morbihan['features'][0]['geometry']['coordinates'][0])

# Convert Watermask to geojson collection
with rasterio.open(final_mask) as src:
    image = src.read(1).astype(np.uint8)
    # keep the CRS and the transform while the dataset is still open
    src_crs = src.crs
    try:
        transform = src.affine
    # depend on rasterio version
    except AttributeError:
        transform = src.transform

# one GeoJSON-like feature per connected region of the mask
results = ({'type': 'Feature', 'properties': {}, 'geometry': s}
           for s, __ in features.shapes(image, mask=image, transform=transform))

# Filter geojson
collection = {'type': 'FeatureCollection', 'features': list()}
for res in results:
    # area in m^2 (computed in the raster CRS, before reprojection)
    island_area = Polygon(res['geometry']['coordinates'][0]).area

    # convert geom to EPSG:4326 (WGS84)
    geom_for_geojson = warp.transform_geom(src_crs, 'EPSG:4326', res['geometry'])
    island_as_polygon = Polygon(geom_for_geojson['coordinates'][0])

    # Filter the smallest areas and the biggest (main land) and
    # Crop the "watermask" with envelope shape morbihan (~ Morbihan gulf)
    if 10000.0 < island_area < 5000000.0 and island_as_polygon.intersects(morbihan_as_polygon):
        feature = dict(res)
        feature['geometry'] = geom_for_geojson
        collection['features'].append(feature)

### Visualize the islands (and count them :-)

In [None]:
import rasterio
from glob import glob
import display_api
import json

# Data directory
DATA_DIR = "data"

DATE = "20180711"

print ("Nb islands: {}".format(len(collection['features'])))
raster = rasterio.open(glob(os.path.join(DATA_DIR, "*{}*.tif".format(DATE)))[0])
m, dc = display_api.rasters_on_map([raster], OUTPUT_DIR, [DATE], geojson_data=collection)
m

## Extra steps

We could optimize the previous code by using an OTB pipeline: instead of computing 3 watermasks and then combining them, we could simplify the processing chain and compute the final watermask directly. This saves I/O and therefore computation time (especially if the chain is complex or uses a lot of images).

Since OTB 5.8, it is possible to connect an output image parameter of one application to the input image parameter of the next application. This wires the internal ITK/OTB pipelines together and allows image streaming between the applications. Consequently, there is no need to write temporary images, which improves performance. Only the last application of the processing chain is responsible for writing the final result images.

<b> Please rewrite the  <span style="color:black;background:yellow"> code below </span> in order to only write the watermask file </b>

**Tips:** Only call Execute() to set up the pipeline, not ExecuteAndWriteOutput(), which would run it and write the output image. Use these functions to connect OTB applications:
- ```GetParameterOutputImage```: get a pointer to an image object [instead of reading from file]
- ```AddImageToParameterInputImageList```: add an image to an InputImageList parameter as a pointer to an image object [instead of reading from file] (use ```SetParameterInputImage``` for a single InputImage parameter)
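
As an illustration of these tips on a different chain than the watermask exercise, the sketch below wires an NDVI computation into a threshold without writing the NDVI image to disk. The function name `threshold_ndvi_in_memory`, the 0.5 threshold and the choice of passing the `otbApplication` module as the `otb` argument are assumptions made for this example; it relies only on the OTB calls named in the tips and used earlier in this notebook.

```python
def threshold_ndvi_in_memory(otb, image_path, mask_path, threshold=0.5):
    """Chain two BandMath applications without an intermediate file.

    `otb` is the imported otbApplication module. `threshold` is an arbitrary
    illustrative value, not one taken from this tutorial.
    """
    # First application: NDVI. No "out" path is set, and Execute() only
    # sets up the pipeline instead of writing an image.
    ndvi_app = otb.Registry.CreateApplication("BandMath")
    ndvi_app.SetParameterStringList("il", [image_path])
    ndvi_app.SetParameterString("exp", "(im1b4-im1b1)/(im1b4+im1b1)")
    ndvi_app.Execute()

    # Second application: its input is the in-memory output of the first one.
    mask_app = otb.Registry.CreateApplication("BandMath")
    mask_app.AddImageToParameterInputImageList(
        "il", ndvi_app.GetParameterOutputImage("out"))
    mask_app.SetParameterString("out", mask_path)
    mask_app.SetParameterString(
        "exp", "(im1b1 < {} ? 0 : 1)".format(threshold))

    # Only the last application writes to disk, which runs the whole chain.
    return mask_app.ExecuteAndWriteOutput()
```

With OTB installed, it could be called as `threshold_ndvi_in_memory(otb, image1, OUTPUT_DIR + "/veg_mask.tif")` after `import otbApplication as otb`. The same wiring idea applies to the watermask chain of the exercise.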