# Welcome to this inference tutorial
            
### There are 3 steps
>- 1) Install and import libraries, create folders and define parameters.  
>    *Variables followed by **#@param** are parameters that you can change.*
>- 2) Create a mask image of labels.
>- 3) Assign a class to each pipe, calculate statistics and create colored pipes.  

# 1) Libraries
### First create a new **virtual environment**, then install all requirements by running the following:

In [1]:
!pip install -r requirements.txt

b->-r requirements.txt (line 4)) (7.2.0)
Collecting modin[ray]>=0.8.1.1
  Using cached modin-0.8.3-py3-none-win_amd64.whl (564 kB)
ERROR: After October 2020 you may experience errors when installing or updating packages. This is because pip will change the way that it resolves dependency conflicts.

We recommend you use --use-feature=2020-resolver to test your packages with the new resolver before it becomes the default.

modin 0.8.3 requires pandas==1.1.5, but you'll have pandas 1.0.5 which is incompatible.
Collecting pyarrow==1.0; extra == "ray"
  Downloading pyarrow-1.0.0-cp38-cp38-win_amd64.whl (10.5 MB)
Collecting ray>=1.0.0; extra == "ray"
  Downloading ray-1.1.0-cp38-cp38-win_amd64.whl (15.3 MB)
Collecting py-spy>=0.2.0
  Using cached py_spy-0.3.4-py2.py3-none-win_amd64.whl (1.4 MB)
Collecting aioredis
  Using cached aioredis-1.3.1-py3-none-any.whl (65 kB)
Collecting opencensus
  Using cached opencensus-0.7.12-py2.py3-none-any.whl (127 kB)
Collecting gpustat
  Using cached gpust

### Create all folders you will need

In [2]:
from utils import create_folders
create_folders()

### Your directory should look like this:
Check that the folders (the ones **in bold**) are in your directory.
- **Main folder**
    >- **models**
    >    >* .joblib files (sklearn models)
    >    >* .sav files (mappers such as pca and umap)
    >    >* folders (tensorflow models)
    >- **results**
    >    >* .png images (confusion matrices)
    >    >* .log files (tensorflow training curves)
    >- **data**
    >    >- **train**
    >    >    * train*.tfrecord.gz files (training dataset)
    >    >- **eval**
    >    >    * traineval*.tfrecord.gz files (evaluation dataset)
    >    >- **inference**
    >    >   * .tfrecord.gz files (inference dataset)
    >    >   * *-mixer.json files (needed for georeferencing, if you want to add the prediction to Earth Engine Editor)
    >    >- **predictions**
    >    >    - **colored_pipes**
    >    >        * .kml files (colored-pipe nets corresponding to labels)
    >    >    - **kml**
    >    >        * .kml files and corresponding .png images (mask-prediction images)
    >    >    - **tfrecords**
    >    >        * .TFRecord files (needed if you want to add the prediction to Earth Engine Editor)
    >    >    * .csv files
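
To check the tree above without opening a file browser, a small script can list the folders that are missing. This is only a sketch: `missing_folders` is a hypothetical helper, not part of `utils`, and it assumes the bold folder names shown above, relative to the main folder.

```python
from pathlib import Path

# Folder names copied from the tree above, relative to the main folder.
EXPECTED_FOLDERS = [
    "models",
    "results",
    "data/train",
    "data/eval",
    "data/inference",
    "data/predictions/colored_pipes",
    "data/predictions/kml",
    "data/predictions/tfrecords",
]


def missing_folders(root="."):
    """Return the expected folders that do not exist under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]


missing = missing_folders()
print("All folders present" if not missing else f"Missing folders: {missing}")
```

An empty list means every folder is in place.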

### Import, authenticate and initialize the Earth Engine library.  
If you have a Gmail account, use your own; if not, you can use this one:  
Gmail address : [mounierseb93@gmail.com]    
Code : [mounse$15]

In [4]:
import ee
ee.Authenticate()
ee.Initialize()


Successfully saved authorization token.


In [1]:
import tensorflow as tf
from dataset_construction import TFDatasetConstruction
from dataset_loader import TFDatasetProcessing, NPDatasetProcessing, undersample
from models import ModelTrainingAndEvaluation
from inference import Inference, download_kml
from utils import predict_pipes, predict_pipes_from_csv, clean_predictions, color_pipes, get_statistics

In [3]:
# Specify inputs of your project
MODEL_NAME = 'rf' #@param # name of the model (without the extension); it must match the name of the file in your "models" folder

LANDSAT  = ['B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B7', 'B10', 'B11']
SENTINEL = ['VV','VH','VV_1','VH_1']
BANDS    = LANDSAT + SENTINEL
RESPONSE = 'landcover'
FEATURES = BANDS+[RESPONSE]
KERNEL_SIZE   = 128
KERNEL_SHAPE  = [KERNEL_SIZE, KERNEL_SIZE]
COLUMNS       = [tf.io.FixedLenFeature(shape=KERNEL_SHAPE, dtype=tf.float32) for k in FEATURES]
FEATURES_DICT = dict(zip(FEATURES, COLUMNS))
NUM_FEATURES  = len(BANDS)
NUM_CLASSES   = 4 

# 2) Creating a label image
### a) First connect to Google Drive. If you have a Gmail account, use your own; if not, you can use this one:  
Gmail address : [mounierseb93@gmail.com]    
Code : [mounse$15]

**Every time you run a cell, if you receive a message like "Please download file from Drive from folder ...", go to the mentioned folder on Google Drive and download the file into the same folder on your computer.**


**b) You have three options for creating your test image:**
>**1)** Export an image from a square window of a given center [[Lon,Lat]] and [radius] (in meters)   
>**2)** Export an image from a window given a bounding box [[Lon1, Lat1, Lon2, Lat2]]  
>**3)** Export the whole area of a network* (for the brave who want to use Earth Engine's Editor)
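
Options 1 and 2 describe the same kind of window in two ways. The sketch below shows how a center and radius could be turned into a bounding box. It is an illustration, not the project's implementation: `square_window` is a hypothetical helper, `radius` is assumed to be half the side of the square, and the Earth is approximated as a sphere.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius (spherical approximation)


def square_window(lon, lat, radius):
    """Approximate [minLng, minLat, maxLng, maxLat] of a square centered on (lon, lat).

    `radius` is half the side length of the square, in meters (assumption).
    """
    # Angular size of `radius` along a meridian
    dlat = math.degrees(radius / EARTH_RADIUS_M)
    # Parallels get shorter away from the equator, hence the cos(lat) factor
    dlon = math.degrees(radius / (EARTH_RADIUS_M * math.cos(math.radians(lat))))
    return [lon - dlon, lat - dlat, lon + dlon, lat + dlat]


# Example values chosen for illustration only
print(square_window(lon=2.35, lat=48.85, radius=1000))
```

The approximation is reasonable for windows of a few kilometers away from the poles; it breaks down as `lat` approaches ±90.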



*CAREFUL*! If you want to **export the whole area of a network** (the 3rd option):

If your network is not already uploaded to your **Google Earth Engine Editor Assets** (I have already added sieccao, saur (zone 1) and brioude), either provide a bounding box covering the whole area of the network (follow the second option) or follow these steps:
>
>- If your file is not a **shp** file, for example a **kml**, convert it to ".shp" using QGIS :
>    * Drag your kml to the QGIS window.
>    * Right-click on your layer and choose "Export" then "Export Feature As"
>    * Set "Format" as "ESRI Shapefile" and the "CRS" as "EPSG:3857 / Pseudo-Mercator"
>    * Fill "File name" with the name of the file. Careful, you should provide the full path, for example "C:\....pipe.shp"
>    * Click on "OK" and wait, this could take a moment. Now you have created several files, please keep them all.
>- Upload your files to your Google Earth Engine Editor :
>    * Go to https://code.earthengine.google.com/
>    * Click on "Assets", then "NEW", then below "Table Upload", click on "Shape files". Select all the files you just created with QGIS.
>    * Set "AssetId" to the name of your network, e.g. "brioude"
>    * Click on "UPLOAD"


In [7]:
# Specify inference parameters
start_date = "2020-01-01"
end_date   = "2020-12-31"
image_name = 'test' # Name your image as you want

#FILL ONLY ONE OF THE FOLLOWING :

# 1) if you want to export an image from a square window of a given center and radius (meters) :
lon    = None #@param {type:'number'}
lat    = None #@param {type:'number'}
point  = [lon,lat]
radius = None #@param {type:'number'} (in meters)

# 2) if you want to export an image from a window given a bounding box
minLng    = None #@param {type:'number'}
minLat    = None #@param {type:'number'}
maxLng    = None #@param {type:'number'}
maxLat    = None #@param {type:'number'}
rectangle = [minLng, minLat, maxLng, maxLat]

# 3) if you want to export the whole area of a network
# can be 'brioude','sieccao','saur' or the name of your network you just created following the tutorial above
network_name = None #@param {type:'string'}
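
Since only one of the three options should be filled, a quick check before exporting can catch a forgotten `None`. The helper below is a hypothetical sketch (it is not part of `dataset_construction`), and it assumes that an unused option is left entirely at `None`, as in the cell above.

```python
def chosen_option(point=None, radius=None, rectangle=None, network_name=None):
    """Return which export option is filled: 'point', 'rectangle' or 'network'."""

    def filled(values):
        # True only if every coordinate of the option has a value
        return all(v is not None for v in values)

    options = {
        "point": point is not None and radius is not None and filled(point),
        "rectangle": rectangle is not None and filled(rectangle),
        "network": network_name is not None,
    }
    selected = [name for name, ok in options.items() if ok]
    if len(selected) != 1:
        raise ValueError(f"Fill exactly one option, found: {selected or 'none'}")
    return selected[0]


# With the defaults of the cell above and only a network name set:
print(chosen_option(point=[None, None], radius=None,
                    rectangle=[None] * 4, network_name="brioude"))
```

Calling it with the values of `point`, `radius`, `rectangle` and `network_name` raises a `ValueError` when none or several options are filled.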

In [8]:
# Construct inference dataset
tfdataconstructor = TFDatasetConstruction(LANDSAT,SENTINEL,RESPONSE,KERNEL_SIZE)
corners = tfdataconstructor.test_dataset_construction(start_date,end_date,image_name,network_name=network_name,point=point,radius=radius,rectangle=rectangle)

Running export...
Image and Mixer export completed
Please download all files starting with "test" from drive (directory data/inference) if you work on your local computer


In [9]:
# Whether to apply Conditional Random Fields (CRF) to your predictions
PERFORM_CRF = False #@param

if PERFORM_CRF:
    !pip install --upgrade cython
    !pip install --upgrade pydensecrf
    
# If you are on Windows and have trouble installing pydensecrf:
# with Anaconda, run: conda install -c conda-forge pydensecrf
# otherwise, or if the install still fails, check https://github.com/lucasb-eyer/pydensecrf

In [13]:
# Load and predict inference dataset and write predictions
tfdataloader = TFDatasetProcessing(FEATURES_DICT,FEATURES,BANDS,NUM_FEATURES,None)
testdataset  = tfdataloader.get_inference_dataset(image_name)
NUM_FEATURES = tfdataloader.num_features

inference = Inference(NUM_CLASSES,MODEL_NAME)
predictions = inference.doMLPrediction(testdataset,image_name,NUM_FEATURES,perform_crf=PERFORM_CRF)

# This function downloads the label image as KML 
download_kml(predictions,image_name,*corners)

Looking for TFRecord files...
files found :  ['data\\inference\\test-00000.tfrecord.gz', 'data\\inference\\test-mixer.json']
Running predictions...
Writing predictions...
Kml saved at "data/predictions/kml"
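
The "files found" line above lists one TFRecord file and one mixer file. If that list comes back empty, the files were probably not downloaded from Drive into `data/inference`. A minimal way to check this by hand is sketched below; `find_inference_files` is a hypothetical helper, and the file suffixes are taken from the output above.

```python
from pathlib import Path


def find_inference_files(image_name, folder="data/inference"):
    """Split the files starting with `image_name` into TFRecord and mixer files."""
    files = sorted(Path(folder).glob(f"{image_name}*"))
    tfrecords = [f for f in files if f.name.endswith(".tfrecord.gz")]
    mixers = [f for f in files if f.name.endswith("mixer.json")]
    return tfrecords, mixers


tfrecords, mixers = find_inference_files("test")
print(f"{len(tfrecords)} TFRecord file(s), {len(mixers)} mixer file(s)")
```

Note that the pattern matches every file whose name starts with `image_name`, so an image called `test` would also pick up files of an image called `test2`.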


# 3) Assigning a class to pipes, calculating statistics and creating colored pipes
There are two options: an easy but slow one that predicts each pipe from scratch, and a tricky but fast one that reuses your earlier predictions.

## 1) Easy but slow option (no need to run the 2nd section "Creating a label image")
Using the Earth Engine Editor can be tricky for a beginner, so I made the following functions, which let you assign a class to each pipe and create KML files of colored pipes without using the editor.
These functions are **time consuming** (4 hours for Sieccao, for example) but easy to run.   
*Note*: you should provide a CSV of your network (you should have a function in sql_connector.py that does that) with 5 columns: 
- [Name] for the pipe id
- [lon1, lon2, lat1, lat2] for the coordinates of each pipe
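
Because the prediction step takes hours, it may be worth checking the CSV header first. The sketch below uses only the standard library; `check_pipe_csv` is a hypothetical helper, and the column names are the five listed above.

```python
import csv

REQUIRED_COLUMNS = {"Name", "lon1", "lon2", "lat1", "lat2"}


def check_pipe_csv(path):
    """Raise ValueError if a required column is missing; return the number of pipes."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"Missing columns: {sorted(missing)}")
        # Each remaining row describes one pipe
        return sum(1 for _ in reader)
```

For example, `check_pipe_csv('data/predictions/coords.csv')` returns the number of pipes when the header is complete.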

In [2]:
# Enter the name of your csv file :
coordsfilename = 'data/predictions/coords.csv' #@param 

# This function assigns a class to each pipe, it takes time to run ! 
predict_pipes_from_csv(coordsfilename,MODEL_NAME, BANDS, "2020-01-01","2020-12-31")

# This function calculates the statistics of the network provided (proportion of each class)
get_statistics(coordsfilename)
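
The statistics are the proportion of pipes in each class. As a rough sketch of that computation (not the code of `get_statistics`), the helper below counts the values of a class column in the CSV. It assumes the class is stored in a column named `landcover`; `class_proportions` is a hypothetical name.

```python
import csv
from collections import Counter


def class_proportions(path, column="landcover"):
    """Return {class label: share of pipes}, with labels kept as strings."""
    with open(path, newline="") as f:
        counts = Counter(row[column] for row in csv.DictReader(f))
    total = sum(counts.values())
    if total == 0:
        return {}  # empty file: nothing to report
    return {label: n / total for label, n in counts.items()}
```

The shares sum to 1, so a network with three pipes in class 0 and one in class 2 gives 0.75 and 0.25.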

In [None]:
# This function creates a KML of the network provided, where each pipe has a color corresponding to its class
# If it takes too long, run the multi-threaded version of color_pipes, named mp_color_pipes
color_pipes(coordsfilename) 

def mp_color_pipes(filename) :
    import os
    import simplekml
    import pandas as pd
    import matplotlib.colors
    import numpy as np
    from multiprocessing.pool import ThreadPool as Pool

    ds_test = pd.read_csv(filename)
    lines = (ds_test['Name'], ds_test['lon1'], ds_test['lat1'], ds_test['lon2'], ds_test['lat2'], ds_test['landcover'])
    kml = simplekml.Kml()
    ids, lons1, lats1, lons2, lats2, classes = lines
    
    def color(id, lon1, lat1, lon2, lat2, classe):
        # Add one line per pipe and color it according to its class
        line = kml.newlinestring(name=str(id), coords=[(lon1,lat1), (lon2,lat2)])
        if classe == 0:
            r,g,b = np.multiply(255,matplotlib.colors.to_rgb('lime')).astype(int)
        elif classe == 1:
            r,g,b = np.multiply(255,matplotlib.colors.to_rgb('darkgreen')).astype(int)
        elif classe == 2:
            r,g,b = np.multiply(255,matplotlib.colors.to_rgb('yellow')).astype(int)
        else:
            r,g,b = np.multiply(255,matplotlib.colors.to_rgb('blue')).astype(int)
        line.style.linestyle.color = simplekml.Color.rgb(r,g,b)

    def color_wrapper(args):
        color(*args)
        
    inputs = zip(ids, lons1, lats1, lons2, lats2, classes)
    # Color the pipes with a pool of 5 threads
    with Pool(5) as p:
        p.map(color_wrapper,inputs)
    name = os.path.splitext(os.path.basename(filename))[0].split('_classification')[0]
    kml.save('data/predictions/colored_pipes/'+name+'_colored.kml')

In [None]:
mp_color_pipes(coordsfilename)
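
KML stores colors as eight hex digits in the order alpha, blue, green, red, which is why the RGB values are passed through a conversion before being assigned to a line style. The standard-library sketch below shows that ordering for the four colors used here. The RGB triples are the CSS values of the named colors, and `kml_color` is a hypothetical helper, not part of `simplekml`.

```python
# CSS values of the named colors, as (r, g, b) in 0-255
CLASS_COLORS = {
    0: (0, 255, 0),    # lime
    1: (0, 100, 0),    # darkgreen
    2: (255, 255, 0),  # yellow
}
DEFAULT_COLOR = (0, 0, 255)  # blue, for any other class


def kml_color(classe, alpha=255):
    """Return the KML color string (aabbggrr) for a class."""
    r, g, b = CLASS_COLORS.get(classe, DEFAULT_COLOR)
    return f"{alpha:02x}{b:02x}{g:02x}{r:02x}"


print(kml_color(2))  # yellow -> ff00ffff
print(kml_color(3))  # blue   -> ffff0000
```

Reading a `.kml` file by hand, a pipe colored `ffff0000` is therefore blue, not red.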

## 2) Tricky but fast option (use Earth Engine's Editor)
The following functions are VERY fast, but to run them you must first add your predictions (calculated in the 2nd section "Creating a label image") to your Google Earth Engine Editor Assets. The two files listed below were created during inference.  
*(in the steps below, replace filename with the name of your inference image)*

- First, if not already done, add your network to your Google Earth Engine Editor Assets following the tutorial in the previous section.
- Go to https://code.earthengine.google.com/ .
- Click on "Assets", then "NEW", then below "Image Upload", click on "GeoTIFF".
- In "Sources files" select the **filename.TFRecord** file in your folder 'tfrecords' under 'predictions' (i.e. select predictions/tfrecords/filename.TFRecord)
- Add to "Sources files" the **filename-mixer.json** file in your folder 'inference' (i.e. select inference/filename-mixer.json)
- Set "AssetId" to "filename_pred"
- Click on "UPLOAD"
- On the right corner of your screen, click on "Tasks". Check the status of your export.
- If you get the error "cannot read mixer file", retry the steps above, adding the mixer file before the TFRecord file or vice versa, until the upload succeeds. The system is sometimes buggy.
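
Before uploading, it can save a few retries to confirm that both source files exist locally and that the mixer file is readable JSON. This is a sketch under assumptions: `upload_sources` is a hypothetical helper, and the paths follow the folder layout described at the top of this tutorial. It does not explain every "cannot read mixer file" error, only the ones caused by a missing or corrupted local file.

```python
import json
from pathlib import Path


def upload_sources(filename, root="data"):
    """Return the (tfrecord, mixer) paths for `filename` after checking both."""
    tfrecord = Path(root) / "predictions" / "tfrecords" / f"{filename}.TFRecord"
    mixer = Path(root) / "inference" / f"{filename}-mixer.json"
    missing = [str(p) for p in (tfrecord, mixer) if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"Missing source files: {missing}")
    # Raises json.JSONDecodeError (a ValueError) if the mixer file is corrupted
    json.loads(mixer.read_text())
    return tfrecord, mixer
```

For the image created earlier, `upload_sources('test')` returns the two paths to select in "Sources files".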

In [None]:
network_name = 'saur' #@param can be 'sieccao', 'brioude', 'saur' or the name of your network you just added to Assets

In [None]:
#Produce a csv file with pipe names, coordinates and classes
predict_pipes(network_name,image_name) #image_name is the name of the image you created in the 2nd section "creating a label image"

#Formats the csv so that it has the right columns: Name, lon1, lon2, lat1, lat2
clean_predictions(image_name)
filename = 'data/predictions/'+image_name+'_classification.csv'

#Calculates statistics of your network (the proportions of each class)
get_statistics(filename)

In [None]:
# This function creates a KML of the network provided, where each pipe has a color corresponding to its class
# If it takes too long, run the multi-threaded version of color_pipes, named mp_color_pipes
color_pipes(filename) 

def mp_color_pipes(filename) :
    import os
    import simplekml
    import pandas as pd
    import matplotlib.colors
    import numpy as np
    from multiprocessing.pool import ThreadPool as Pool

    ds_test = pd.read_csv(filename)
    lines = (ds_test['Name'], ds_test['lon1'], ds_test['lat1'], ds_test['lon2'], ds_test['lat2'], ds_test['landcover'])
    kml = simplekml.Kml()
    ids, lons1, lats1, lons2, lats2, classes = lines
    
    def color(id, lon1, lat1, lon2, lat2, classe):
        # Add one line per pipe and color it according to its class
        line = kml.newlinestring(name=str(id), coords=[(lon1,lat1), (lon2,lat2)])
        if classe == 0:
            r,g,b = np.multiply(255,matplotlib.colors.to_rgb('lime')).astype(int)
        elif classe == 1:
            r,g,b = np.multiply(255,matplotlib.colors.to_rgb('darkgreen')).astype(int)
        elif classe == 2:
            r,g,b = np.multiply(255,matplotlib.colors.to_rgb('yellow')).astype(int)
        else:
            r,g,b = np.multiply(255,matplotlib.colors.to_rgb('blue')).astype(int)
        line.style.linestyle.color = simplekml.Color.rgb(r,g,b)

    def color_wrapper(args):
        color(*args)
        
    inputs = zip(ids, lons1, lats1, lons2, lats2, classes)
    # Color the pipes with a pool of 5 threads
    with Pool(5) as p:
        p.map(color_wrapper,inputs)
    name = os.path.splitext(os.path.basename(filename))[0].split('_classification')[0]
    kml.save('data/predictions/colored_pipes/'+name+'_colored.kml')

In [None]:
mp_color_pipes(filename)