# Welcome to this tutorial!
            
## There are 6 steps:   
>- 1) Import libraries and define parameters.  
>    *Variables followed by **#@param** are parameters that you can change.*
>- 2) Load and process your dataset.
>- 3) Set/find the loss, metrics and optimal hyper-parameters.
>- 4) Train and evaluate your model.  
>    *If you want to load a pretrained model, you can skip these steps and go to the next section.*
>- 5) Predict your image.
>- 6) Visualise your predictions.

# 1) Libraries
### First create a new **virtual environment**, then install all requirements by running the following:

In [None]:
!pip install -r requirements.txt

### Create all the folders you will need

In [None]:
from utils import create_folders
create_folders()

## Your directory should look as follows:
Check that the folders (the ones **in bold**) are in your directory.
- **Main folder**
    >- **models**
    >    >* .joblib files (sklearn models)
    >    >* .sav files (mappers such as pca and umap)
    >    >* folders (tensorflow models)
    >- **results**
    >    >* .png images (confusion matrices)
    >    >* .log files (tensorflow training curves)
    >- **data**
    >    >- **train**
    >    >    * train*.tfrecord.gz files (training dataset)
    >    >- **eval**
    >    >    * traineval*.tfrecord.gz files (evaluation dataset)
    >    >- **inference**
    >    >   * .tfrecord.gz files (inference dataset)
    >    >   * *-mixer.json files (needed for georeferencing, if you want to add the prediction to Earth Engine Editor)
    >    >- **predictions**
    >    >    - **colored_pipes**
    >    >        * .kml files (colored-pipe nets corresponding to labels)
    >    >    - **kml**
    >    >        * .kml files and corresponding .png images (mask-prediction images)
    >    >    - **tfrecords**
    >    >        * .TFRecord files (needed if you want to add the prediction to Earth Engine Editor)
    >    >    * .csv files
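
The `create_folders` helper from `utils` is not shown in this tutorial, so here is a minimal sketch of what creating and checking this layout could look like. The names `FOLDERS`, `make_folders` and `missing_folders` are hypothetical and only illustrate the tree above; the real helper may behave differently.

```python
import os
import tempfile

# Folder layout copied from the tree above, relative to the main folder.
FOLDERS = [
    "models",
    "results",
    "data/train",
    "data/eval",
    "data/inference",
    "data/predictions/colored_pipes",
    "data/predictions/kml",
    "data/predictions/tfrecords",
]


def make_folders(root="."):
    """Create every expected folder under `root` (hypothetical helper)."""
    created = []
    for folder in FOLDERS:
        path = os.path.join(root, *folder.split("/"))
        os.makedirs(path, exist_ok=True)  # no error if the folder already exists
        created.append(path)
    return created


def missing_folders(root="."):
    """Return the expected folders that do not exist under `root`."""
    return [
        folder for folder in FOLDERS
        if not os.path.isdir(os.path.join(root, *folder.split("/")))
    ]


if __name__ == "__main__":
    # Try it in a throwaway directory so nothing is written to your project.
    with tempfile.TemporaryDirectory() as tmp:
        make_folders(tmp)
        print("missing:", missing_folders(tmp))
```

Calling `missing_folders()` from your main folder is a quick way to do the check described above without browsing the directory by hand.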

Import, authenticate and initialize the Earth Engine library. If you have a Gmail account, authenticate with your own; if not, you can use this one:  
Gmail address : [mounierseb93@gmail.com]  
Code : [mounse$15]

In [1]:
import ee
ee.Authenticate()
ee.Initialize()

Enter verification code: 4/1AY0e-g7vT-onSm_sAAoOl1tekiw3amNXLvQ9bgQKlP2jHryH_GfodIDOzws

Successfully saved authorization token.


In [None]:
import tensorflow as tf
import numpy as np 
from tensorflow.keras import backend as K
import os

In [None]:
tf.keras.backend.clear_session()
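# Enable memory growth on every visible GPU; the loop does nothing on a CPU-only machine.
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)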

In [2]:
from dataset_construction import TFDatasetConstruction
from dataset_loader import TFDatasetProcessing
from unet import DLModel, ModelEvaluation
from losses_and_metrics import Loss, Metric, get_class_weights
from learning_rate import  LRFinder, CyclicLR, step_decay_schedule
from inference import Inference, download_kml, download_tif, download_kml_from_tif
from utils import predict_pipes, color_pipes, get_statistics

In [None]:
# Specify inputs (Landsat and Sentinel bands) to the model and the response variable.
LANDSAT  = ['B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B7', 'B10', 'B11']
SENTINEL = ['VV','VH','VV_1','VH_1']
BANDS    = LANDSAT + SENTINEL
RESPONSE = 'landcover'
FEATURES = BANDS+[RESPONSE]

# Specify the size and shape of patches expected by the model.
KERNEL_SIZE   = 128 #@param {type:"integer"}
KERNEL_SHAPE  = [KERNEL_SIZE, KERNEL_SIZE]
COLUMNS       = [tf.io.FixedLenFeature(shape=KERNEL_SHAPE, dtype=tf.float32) for k in FEATURES]
FEATURES_DICT = dict(zip(FEATURES, COLUMNS))
NUM_FEATURES  = len(BANDS)
NUM_CLASSES   = 4 #@param {type:"integer"}

# 2) Dataset Loading and Processing
If you don't have access to the training dataset, download it from the Google Drive using the address and password below (and make sure to put it in the right folder, with the same name as in the Drive):    
Gmail address : [mounierseb93@gmail.com]  
Code : [mounse$15]  
If it's not there for some reason, or if you want to construct your own dataset, go to tuto_dataset_construction.ipynb and run it.


In [None]:
# Specify training parameters
BATCH_SIZE = 4 #@param {type:"integer"}
TRAIN_SIZE = 5000 #@param {type:"integer"}
EVAL_SIZE  = 3000 #@param {type:"integer"}
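
The cells further down compute the number of steps per epoch as `int(TRAIN_SIZE / BATCH_SIZE)`. As a small worked example, assuming `TRAIN_SIZE` and `EVAL_SIZE` count patches, the values above translate as follows (`steps_per_epoch` is a hypothetical helper, not part of the tutorial modules):

```python
BATCH_SIZE = 4
TRAIN_SIZE = 5000
EVAL_SIZE = 3000


def steps_per_epoch(num_patches, batch_size):
    """Number of full batches needed to see `num_patches` patches once."""
    return num_patches // batch_size


train_steps = steps_per_epoch(TRAIN_SIZE, BATCH_SIZE)
eval_steps = steps_per_epoch(EVAL_SIZE, BATCH_SIZE)
print(train_steps, eval_steps)  # 1250 750
```

If you change `BATCH_SIZE`, these numbers change with it, so one epoch still covers roughly the same number of patches.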

In [None]:
# Load and process training and evaluation tf.Datasets
tfdataloader = TFDatasetProcessing(FEATURES_DICT,FEATURES,BANDS,NUM_FEATURES)
training     = tfdataloader.get_training_dataset()
evaluation   = tfdataloader.get_eval_dataset()
NUM_FEATURES = tfdataloader.num_features

In [None]:
print(next(iter(evaluation.take(1))))

# 3) Training parameters

In [None]:
# Specify training parameters
MODEL_NAME = 'unet' #@param {type:"string"}
LOSS_STR = 'weighted_scc' #@param ["scc", "weighted_scc","dice","jaccard"]
OPTIMIZER = 'sgd' #@param ["sgd","adam","rmsprop"]
HISTORY_FILE = 'results/'+MODEL_NAME+'.log'
CHECKPOINT_FILE = 'models/'+MODEL_NAME

#Callbacks
MONITOR_EARLY_STOP = 'loss' #@param ["loss","val_loss"]
MONITOR_CHECKPOINT = 'loss' #@param ["loss","val_loss"]
csv_logger = tf.keras.callbacks.CSVLogger(HISTORY_FILE, separator=',', append=False)
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor=MONITOR_EARLY_STOP, mode='min', verbose=2, min_delta=0.001, patience=4)
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    CHECKPOINT_FILE, monitor=MONITOR_CHECKPOINT, verbose=0, save_best_only=True,
    save_weights_only=False, mode='min', save_freq=int(TRAIN_SIZE / BATCH_SIZE))
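
To get a feeling for what `min_delta=0.001` and `patience=4` mean in the `EarlyStopping` callback above, here is a simplified pure-Python model of the rule in `'min'` mode. It is only a sketch of the idea (an epoch counts as an improvement when the loss drops by more than `min_delta`), not the Keras implementation, and `early_stop_epoch` is a hypothetical name.

```python
def early_stop_epoch(losses, min_delta=0.001, patience=4):
    """Return the 0-based epoch at which training would stop, or None.

    Simplified model of early stopping in 'min' mode: training stops after
    `patience` consecutive epochs without an improvement of at least `min_delta`.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(losses):
        if loss < best - min_delta:
            best = loss   # real improvement: remember it and reset the counter
            wait = 0
        else:
            wait += 1     # no significant improvement this epoch
            if wait >= patience:
                return epoch
    return None


# The loss plateaus after epoch 1: the changes are smaller than min_delta.
plateau = [1.0, 0.5, 0.4999, 0.4998, 0.4997, 0.4996]
print(early_stop_epoch(plateau))
```

A larger `patience` tolerates longer plateaus, while a larger `min_delta` makes the callback stricter about what counts as progress.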

In [None]:
# Use the tf.keras registry so the custom objects are visible to the tf.keras model.
from tensorflow.keras.utils import get_custom_objects
class_weights = get_class_weights(training,TRAIN_SIZE,KERNEL_SIZE,NUM_CLASSES)
losses        = Loss(class_weights,NUM_CLASSES)
metrics       = Metric(NUM_CLASSES)

LOSS = losses.get_loss(LOSS_STR)
get_custom_objects().update({"loss": LOSS})

METRICS = ['sparse_categorical_accuracy',metrics.f1_score]
get_custom_objects().update({"f1_score": metrics.f1_score})

In [None]:
modelconstructor = DLModel(training,NUM_FEATURES,NUM_CLASSES,BATCH_SIZE,OPTIMIZER,LOSS,METRICS,CHECKPOINT_FILE)

## Optimal learning rate 

In [None]:
import gc
gc.collect()

In [None]:
MIN_LR    = 1e-2 #@param {type:"number"} 
MAX_LR    = 1e-1 #@param {type:"number"} 
EPOCHS    = 2 #@param {type:"integer"}
lr_finder = LRFinder(min_lr=MIN_LR, 
                      max_lr=MAX_LR, 
                      steps_per_epoch=int(TRAIN_SIZE / BATCH_SIZE), 
                      epochs=EPOCHS)

FROM_CHECKPOINT = True #@param{type:"boolean"}

m = modelconstructor.init_model(from_checkpoint=FROM_CHECKPOINT)
m.fit(
    x=training, 
    epochs=EPOCHS, 
    batch_size=BATCH_SIZE,
    steps_per_epoch=int(TRAIN_SIZE / BATCH_SIZE), 
    callbacks=[lr_finder,checkpoint])
lr_finder.plot_loss()

ind_max = np.argmax(lr_finder.history['sparse_categorical_accuracy'])
print('The lr corresponding to the maximal metric = ',lr_finder.history['lr'][ind_max])
ind_min = np.argmin(lr_finder.history['loss'])
print('The lr corresponding to the minimal loss = ',lr_finder.history['lr'][ind_min])
import gc
gc.collect()
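
The idea behind a learning-rate finder is to increase the learning rate a little at every step between `MIN_LR` and `MAX_LR`, record the loss, and then read off the rate where the loss was lowest. Below is a minimal sketch of that idea, assuming an exponential sweep. The actual `LRFinder` class in `learning_rate` may sweep differently, and `lr_range` and `lr_at_min_loss` are hypothetical names.

```python
def lr_range(min_lr, max_lr, total_steps):
    """Learning rates growing exponentially from min_lr to max_lr."""
    if total_steps < 2:
        return [min_lr]
    ratio = max_lr / min_lr
    return [min_lr * ratio ** (i / (total_steps - 1)) for i in range(total_steps)]


def lr_at_min_loss(lrs, losses):
    """Learning rate recorded at the step where the loss was lowest."""
    best_step = min(range(len(losses)), key=losses.__getitem__)
    return lrs[best_step]


# Toy sweep over 5 steps with made-up losses (illustration only).
lrs = lr_range(1e-2, 1e-1, total_steps=5)
toy_losses = [1.0, 0.8, 0.5, 0.7, 2.0]
print("lr at minimal loss:", lr_at_min_loss(lrs, toy_losses))
```

In practice many people pick a rate slightly below the one at the minimum, since the loss often starts diverging right after it.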

## Staircase learning rate

In [None]:
# Choose the initial learning rate as the optimal learning rate found just above
INITIAL_LR   = 0.08 #@param {type:"number"}
DECAY_FACTOR = 0.9 #@param {type:"number"}
STEP_SIZE    = 5 #@param {type:"integer"}
lr_step      = step_decay_schedule(initial_lr=INITIAL_LR, decay_factor=DECAY_FACTOR, step_size=STEP_SIZE)
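
A staircase schedule keeps the learning rate constant for `STEP_SIZE` epochs and then multiplies it by `DECAY_FACTOR`. The sketch below shows the usual formula with the values above. It assumes `step_decay_schedule` follows this common form, which is not confirmed by the tutorial, and `step_decay` is a hypothetical name.

```python
import math


def step_decay(epoch, initial_lr=0.08, decay_factor=0.9, step_size=5):
    """Learning rate for a 0-based epoch under a staircase schedule."""
    return initial_lr * decay_factor ** math.floor(epoch / step_size)


for epoch in (0, 4, 5, 10):
    print(epoch, step_decay(epoch))
```

Epochs 0 to 4 share the initial rate, epochs 5 to 9 use the rate multiplied once by the decay factor, and so on.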

## Cyclical learning rate

In [None]:
BASE_LR   =  1e-8 #@param {type:"number"}
MAX_LR    =  5e-8 #@param {type:"number"}
STEP_SIZE =  1 #@param {type:"number"}
MODE      = 'triangular2' #@param ["triangular","triangular2","exp_range"]
clr       = CyclicLR(base_lr=BASE_LR, max_lr=MAX_LR,step_size=STEP_SIZE*int(TRAIN_SIZE / BATCH_SIZE), mode=MODE)
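
With a cyclical schedule the learning rate climbs from `BASE_LR` to `MAX_LR` over `step_size` iterations, then comes back down, and in `triangular2` mode the amplitude is halved after every full cycle. Here is a sketch of the standard triangular2 formula. It assumes the `CyclicLR` class in `learning_rate` follows that usual definition, and `triangular2_lr` is a hypothetical name.

```python
import math


def triangular2_lr(iteration, base_lr, max_lr, step_size):
    """Learning rate at a given iteration for the 'triangular2' policy."""
    cycle = math.floor(1 + iteration / (2 * step_size))
    x = abs(iteration / step_size - 2 * cycle + 1)
    scale = 1 / (2 ** (cycle - 1))  # amplitude is halved every cycle
    return base_lr + (max_lr - base_lr) * max(0.0, 1 - x) * scale


# Same values as the cell above: one half-cycle lasts 1250 iterations.
step = 1 * int(5000 / 4)
for it in (0, step, 2 * step, 3 * step):
    print(it, triangular2_lr(it, base_lr=1e-8, max_lr=5e-8, step_size=step))
```

The first peak reaches `MAX_LR`; the second peak only reaches halfway between `BASE_LR` and `MAX_LR`.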

## Callbacks

In [None]:
CALLBACKS = [csv_logger,checkpoint]

lr_sched = 'exponential' #@param ["exponential","cyclical","constant"]

if lr_sched =='constant' :
  # Use the learning rate found by the LR finder (not its index) on the model `m`.
  K.set_value(m.optimizer.lr, lr_finder.history['lr'][ind_max])
elif lr_sched == 'exponential' :
  CALLBACKS.append(lr_step)
elif lr_sched == 'cyclical' :
  CALLBACKS.append(clr)
else : 
  raise NotImplementedError('Unrecognised schedule')

# 4) Training and Evaluation of Model

## Training

In [None]:
FROM_CHECKPOINT = True #@param{type:"boolean"}
#m = modelconstructor.init_model(from_checkpoint=FROM_CHECKPOINT)

EPOCHS =  100 #@param {type:"integer"}

m.fit(
    x=training, 
    epochs=EPOCHS, 
    batch_size=BATCH_SIZE,
    steps_per_epoch=int(TRAIN_SIZE / BATCH_SIZE), 
    #validation_data=evaluation,
    #validation_steps=EVAL_SIZE,
    callbacks=CALLBACKS)

m.optimizer.lr
import gc
gc.collect()

## Evaluation

In [None]:
LABEL_NAMES  = [0,1,2,3]
TARGET_NAMES = ['field','forest','urbain','water']

In [None]:
evaluator = ModelEvaluation(MODEL_NAME,EVAL_SIZE,TARGET_NAMES,LABEL_NAMES)
evaluator.evaluate(m,evaluation)
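
The evaluation saves confusion matrices in the `results` folder, and the training metrics include an F1 score. If you want to see how these two relate, here is a small pure-Python sketch on toy labels. It is not the code of `ModelEvaluation`, which is not shown in this tutorial, and both function names are hypothetical.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """matrix[t][p] counts samples of true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def f1_per_class(matrix):
    """F1 = 2*TP / (2*TP + FP + FN) for each class; 0.0 when undefined."""
    n = len(matrix)
    scores = []
    for c in range(n):
        tp = matrix[c][c]
        fp = sum(matrix[r][c] for r in range(n)) - tp  # column minus diagonal
        fn = sum(matrix[c]) - tp                       # row minus diagonal
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


# Toy labels using the same class indices as LABEL_NAMES.
y_true = [0, 0, 1, 1, 2, 3]
y_pred = [0, 1, 1, 1, 2, 2]
cm = confusion_matrix(y_true, y_pred, num_classes=4)
f1 = f1_per_class(cm)
for row in cm:
    print(row)
print([round(score, 3) for score in f1])
```

Rows are true classes and columns are predicted classes, so a perfect model only fills the diagonal.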

# 5) Inference

You have three options for creating your test image:  
>**1)** Export an image from a square window of a given center [[Lon,Lat]] and [radius] (in meters)   
>**2)** Export an image from a window given a bounding box [[Lon1, Lat1, Lon2, Lat2]]  
>**3)** Export the whole area of a network* (for the brave who want to use Earth Engine's Editor)
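
To see how the first two options relate, here is a rough sketch that turns a center and a radius in meters into a bounding box in degrees. It uses a flat-Earth approximation (about 111,320 m per degree of latitude) and is only an illustration. The tutorial's `TFDatasetConstruction` is not shown here and most likely relies on Earth Engine geometry instead. `square_window` is a hypothetical name.

```python
import math

METERS_PER_DEGREE_LAT = 111_320.0  # approximate value, good enough for small windows


def square_window(lon, lat, radius_m):
    """Approximate [minLng, minLat, maxLng, maxLat] around a center point."""
    d_lat = radius_m / METERS_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    d_lon = radius_m / (METERS_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return [lon - d_lon, lat - d_lat, lon + d_lon, lat + d_lat]


# Illustrative coordinates only: a 1 km window somewhere in central France.
print(square_window(lon=3.38, lat=45.29, radius_m=1000))
```

The approximation breaks down near the poles and for very large radii, so treat the result as a starting point for the second option rather than an exact footprint.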

*CAREFUL*! If you want to **export the whole area of a network** (the third option):

If your network is not already uploaded to your **Google Earth Editor Assets** (I have already added sieccao, saur (zone1) and brioude), either provide a bounding box covering the whole area of the network (follow the second option) OR follow these steps:
>
>- If your file is not a **shp** file, for example a **kml**, convert it to ".shp" using QGIS :
>    * Drag your kml to the QGIS window.
>    * Right-click on your layer and choose "Export" then "Export Feature As"
>    * Set "Format" as "ESRI Shapefile" and the "CRS" as "EPSG:3857 / Pseudo-Mercator"
>    * Set "File name" to the name of the file. Careful, you must include the directory, for example "C:\....pipe.shp"
>    * Click on "OK" and wait, this could take a moment. Now you have created several files, please keep them all.
>- Upload your files to your Google Earth Engine Editor :
>    * Go to https://code.earthengine.google.com/
>    * Click on "Assets", then "NEW", then below "Table Upload", click on "Shape files". Select all the files you just created with QGIS.
>    * Set "Asset ID" to the name of your network, e.g. "brioude"
>    * Click on "UPLOAD"

In [None]:
# Specify inference parameters
start_date = "2020-01-01"
end_date   = "2020-12-31"
image_name   = 'test' # Name your image as you want

#FILL ONLY ONE OF THE FOLLOWING :

# 1) if you want to export an image from a square window of a given center and radius (meters) :
lon    = None #@param {type:'number'}
lat    = None #@param {type:'number'}
point  = [lon,lat]
radius = None #@param {type:'number'} (in meters)

# 2) if you want to export an image from a window given a bounding box
minLng    = None #@param {type:'number'}
minLat    = None #@param {type:'number'}
maxLng    = None #@param {type:'number'}
maxLat    = None #@param {type:'number'}
rectangle = [minLng, minLat, maxLng, maxLat]

# 3) if you want to export the whole area of a network
# can be 'brioude','sieccao','saur' or the name of your network you just created following the tutorial above
network_name = None #@param {type:'string'}
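
Since only one of the three options should be filled, a small check before building the dataset can save you a failed export. The sketch below is an optional guard, assuming the variable names of the cell above; `chosen_option` is a hypothetical helper and is not part of the tutorial modules.

```python
def chosen_option(point=None, radius=None, rectangle=None, network_name=None):
    """Return 'point', 'rectangle' or 'network'; raise unless exactly one is filled."""
    filled = []
    if point is not None and None not in point and radius is not None:
        filled.append("point")
    if rectangle is not None and None not in rectangle:
        filled.append("rectangle")
    if network_name:
        filled.append("network")
    if len(filled) != 1:
        raise ValueError(f"Fill exactly one option, got {len(filled)}: {filled}")
    return filled[0]


# Example: only the bounding box is filled (coordinates are illustrative).
print(chosen_option(point=[None, None], radius=None,
                    rectangle=[3.3, 45.2, 3.4, 45.3], network_name=None))
```

Calling it with your own `point`, `radius`, `rectangle` and `network_name` raises a `ValueError` if you left everything at `None` or filled several options at once.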

In [None]:
# Construct inference dataset
tfdataconstructor = TFDatasetConstruction(LANDSAT,SENTINEL,RESPONSE,KERNEL_SIZE)
# NOTE: `length` is not defined anywhere in this notebook; set it before running this cell.
corners = tfdataconstructor.test_dataset_construction(start_date,end_date,image_name,patch_size=KERNEL_SHAPE,radius=radius,point=point,length=length,rectangle=rectangle)

In [None]:
# Load and predict inference dataset and write predictions
tfdataloader = TFDatasetProcessing(FEATURES_DICT,FEATURES,BANDS,NUM_FEATURES)
testdataset  = tfdataloader.get_inference_dataset(image_name)
NUM_FEATURES = tfdataloader.num_features

inference = Inference(NUM_CLASSES,MODEL_NAME)
predictions = inference.doDLPrediction(testdataset,image_name)

# 6) Visualization
On **GEEMap** OR on **Google Earth Pro**

*Note: replace **filename** with the name of your inference image in the steps below.*
  
You can visualize predictions directly on GEEMap OR on Google Earth Pro by creating a kml file.
*But* you first need to georeference your predictions by uploading the files you just created, **filename.TFRecord** and **filename-mixer.json**, to the Earth Engine Editor. To do so, follow these steps:
- Go to https://code.earthengine.google.com/
- Click on "Assets", then "NEW", then below "Image Upload", click on "GeoTIFF".
- In "Source files", select the **filename.TFRecord** file from the 'tfrecords' folder under 'predictions' (i.e. data/predictions/tfrecords/filename.TFRecord)
- Add to "Source files" the **filename-mixer.json** file from your 'inference' folder (i.e. data/inference/filename-mixer.json)
- Set "AssetId" to "filename_pred"
- Click on "UPLOAD"
- On the right corner of your screen, click on "Tasks". Check the status of your export.
- If you get the error "cannot read mixer file", retry the steps above, swapping the order in which you add the mixer file and the TFRecord file. It may take several attempts, as the upload is sometimes unreliable.

## Visualise on GEEMap

In [None]:
# geemap's folium backend lives in the `foliumap` module.
import geemap.foliumap as gmap
Map = gmap.Map()
predictions = ee.Image('users/lealm/'+image_name+'_pred')
Map.addLayer(predictions,{'min':0,'max':3,'palette':['lime','darkgreen','yellow','blue']},'predictions')
Map  # display the map as the cell output
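
With `min` set to 0, `max` set to 3 and four palette colors, each predicted label gets one color. The sketch below builds the matching legend, assuming the prediction image stores the label indices from `LABEL_NAMES` in the same order as `TARGET_NAMES`. `build_legend` is a hypothetical helper.

```python
LABEL_NAMES = [0, 1, 2, 3]
TARGET_NAMES = ['field', 'forest', 'urbain', 'water']
PALETTE = ['lime', 'darkgreen', 'yellow', 'blue']


def build_legend(labels, names, palette):
    """Map each label index to its (class name, color) pair."""
    if not (len(labels) == len(names) == len(palette)):
        raise ValueError("labels, names and palette must have the same length")
    return {label: (name, color)
            for label, name, color in zip(labels, names, palette)}


legend = build_legend(LABEL_NAMES, TARGET_NAMES, PALETTE)
for label, (name, color) in legend.items():
    print(label, name, color)
```

If you change `NUM_CLASSES`, remember to adjust both `max` and the palette so that every label still has its own color.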

## Export a tif and kml

In [3]:
ASSETID = image_name + '_pred'
# Since Earth Engine allows export only in tif format, we export the tif first and then we convert it to kml
download_tif(ASSETID)

Image export completed
Download image test_image_pred from drive (directory data/predictions) if you work on your local computer


In [4]:
download_kml_from_tif('data/predictions/'+ASSETID+'.tif')

Kml saved at "data/predictions/kml"
