# SCRIPT 03 - Create Prediction

In this script, a trained model is applied to the entire study area, tile by tile, separating annual from perennial agriculture within each tile.

The script uses the following libraries:

+ `rasterio`
    + Used to open raster files, necessary to access the time series data.
+ `numpy`
    + Used to store data as arrays and to perform array algebra, which simplifies data processing.
+ `matplotlib.pyplot`
    + Used to plot results for error inspection.
+ `tensorflow`
    + Handles all deep learning in the script: it loads the model and runs the prediction through Keras.
+ `tqdm`
    + Displays a progress bar to help estimate processing times.
+ `glob`
    + Necessary for gathering files according to path and name patterns.
+ `os`
    + Basic system operations, such as concatenating paths and creating folders.

In [None]:
import rasterio as r
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tqdm import tqdm
import glob
import os

The following parameters define how the prediction runs. Change them with care, because most must match previously defined parameters and previously generated data.

+ `tiles` (`array` of `int`)
    + The list of tile ids to run the prediction on.
+ `path_to_model_folder` (`string`)
    + The complete path to the folder that contains the model used for the prediction. For previously trained models, this path must include the folder named after the model id. The script loads the `.h5` file in this folder that comes last in sorted order.
+ `path_to_limits` (`string`)
    + The path to the file with the limits used for scaling the training samples. It must come from the same group of samples used to train the model.
+ `path_to_predictions_folder` (`string`)
    + A path to the folder where the resulting tiles with predicted values should be stored.
+ `path_to_folder_with_agriculture_mask_tiles` (`string`)
    + The path to the folder with the agriculture masks produced during processing but before merging, i.e., still separated by tile. Mask files should follow the pattern 'result_id{tile:03}.tif'.
+ `path_to_monthly_time_series_folder` (`string`)
    + A path to the folder containing the monthly time series separated into tiles, as used in script 01. Time series files should follow the pattern 'Reduction_Optical_Months_id{tile:03}_B2.tif', with one file per band (B2, B3, B4, B8, B11, B12).
+ `batch_size` (`int`)
    + The batch size used when predicting. Using the same value as during training is recommended.
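Because the prediction loop opens one mask file and six band files per tile, it can help to check that they all exist before starting a long run. The following is a minimal sketch under that assumption; `expected_inputs` and `missing_inputs` are hypothetical helpers, not part of the original script, and the file name patterns are taken from the prediction loop.

```python
import os

# Bands read by the prediction loop for every tile.
BANDS = ['B2', 'B3', 'B4', 'B8', 'B11', 'B12']

def expected_inputs(tile, mask_folder, series_folder):
    """Hypothetical helper: list the input files the loop opens for one tile."""
    paths = [os.path.join(mask_folder, f'result_id{tile:03}.tif')]
    paths += [
        os.path.join(series_folder, f'Reduction_Optical_Months_id{tile:03}_{band}.tif')
        for band in BANDS
    ]
    return paths

def missing_inputs(tiles, mask_folder, series_folder):
    """Hypothetical helper: map each tile id to the input files that do not exist."""
    report = {}
    for tile in tiles:
        missing = [p for p in expected_inputs(tile, mask_folder, series_folder)
                   if not os.path.exists(p)]
        if missing:
            report[tile] = missing
    return report
```

Calling `missing_inputs(tiles, path_to_folder_with_agriculture_mask_tiles, path_to_monthly_time_series_folder)` after the parameters are set returns an empty dictionary when every input is in place.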

In [None]:
tiles = [  9, 10, 11, 13, 14, 15, 23, 24, 25, 26, 
           27, 29, 30, 31, 32, 38, 39, 40, 41, 42, 
           43, 44, 45, 46, 47, 48, 51, 52, 53, 54, 
           55, 56, 57, 58, 59, 60, 61, 62, 63, 65, 
           66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 
           76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 
           86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 
           96, 97, 98, 99,100,101,102,103,104,105,
          106,107,108,109,110,111,113,114,115,116,
          117,118,119,120,121,122,123,124,125,126,
          127,130,131,132,133,134,135,136,137,138,
          139,147,148,149,150,151,152,153,154,163,
          164,165,166,167,168,169,179,180,181,182,
          183]
path_to_model_folder = '/path/to/model/folder'
path_to_limits = '/path/to/(samples_id)_limits.npy'
path_to_predictions_folder = '/path/to/predictions/folder'
path_to_folder_with_agriculture_mask_tiles = '/path/to/folder/with/agriculture/mask/tiles'
path_to_monthly_time_series_folder = '/path/to/time/series/tiles/folder'
batch_size = 2048

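Here the data is loaded, scaled, and passed to the model for prediction. The result is then saved in the specified folder. Tiles that already have a prediction file are skipped, and only pixels where the agriculture mask equals 1 are classified; all other pixels are written as 0.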
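The array layout and the scaling step can be sketched on synthetic data. This is an illustration only: the sizes are made up, and the limits are assumed to hold one `[min, max]` row per band, which is how the script indexes them (`limits[i,0]` and `limits[i,1]`).

```python
import numpy as np

n_bands, n_months, n_pixels = 6, 12, 5  # illustrative sizes only
rng = np.random.default_rng(0)

# One (months, pixels) array per band, like read()[:, mask_mask] in the script.
per_band = [rng.uniform(0, 10000, size=(n_months, n_pixels)) for _ in range(n_bands)]

# Stacking gives (bands, months, pixels); .T reverses the axes
# to (pixels, months, bands), the layout passed to the model.
data = np.asarray(per_band, dtype=np.float32).T
print(data.shape)

# Assumed layout of the limits file: one [min, max] row per band.
limits = np.stack([np.zeros(n_bands), np.full(n_bands, 10000.0)], axis=1).astype(np.float32)

# Min-max scaling band by band, as in the script.
scaled_loop = data.copy()
for i in range(scaled_loop.shape[-1]):
    scaled_loop[:, :, i] = (scaled_loop[:, :, i] - limits[i, 0]) / (limits[i, 1] - limits[i, 0])

# Equivalent form using broadcasting over the last (band) axis.
scaled = (data - limits[:, 0]) / (limits[:, 1] - limits[:, 0])
```

Both forms give the same values; the loop is the one the script uses.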

In [None]:
limits = np.load(path_to_limits)

models_paths = glob.glob(os.path.join(path_to_model_folder, '*.h5'))
models_paths.sort()
model = tf.keras.models.load_model(models_paths[-1])
print('Model loaded from:', models_paths[-1])

for tile in tqdm(tiles):
    if not os.path.exists(os.path.join(path_to_predictions_folder, f'{tile:03}.tif')):
        # read the agriculture mask and close the file afterwards
        with r.open(os.path.join(path_to_folder_with_agriculture_mask_tiles, f'result_id{tile:03}.tif')) as src:
            mask = src.read(1)
        mask_mask = mask==1
        
        # in case there are pixels to classify
        if mask_mask.any():
            # obtaining data
            # one (months, pixels) array per band; each file is closed after reading
            data = []
            for band in ['B2', 'B3', 'B4', 'B8', 'B11', 'B12']:
                band_path = os.path.join(path_to_monthly_time_series_folder, f'Reduction_Optical_Months_id{tile:03}_{band}.tif')
                with r.open(band_path) as src:
                    data.append(src.read()[:, mask_mask])
            # stack to (bands, months, pixels) and transpose to (pixels, months, bands)
            data = np.asarray(data, dtype=np.float32).T

            # scaling data
            for i in range(data.shape[-1]):
                data[:,:,i] = (data[:,:,i]-limits[i,0])/(limits[i,1]-limits[i,0])

            prediction = np.argmax(model.predict(data, batch_size=batch_size), axis=-1)+1

            result = np.zeros(mask.shape, dtype=np.byte)
            result[mask_mask] = prediction

            with r.Env():

                # Write an array as a raster band to a new 8-bit file. For
                # the new file's profile, we start with the profile of the source
                with r.open(os.path.join(path_to_folder_with_agriculture_mask_tiles, f'result_id{tile:03}.tif')) as src:
                    profile = src.profile

                # And then change the band count to 1, set the
                # dtype to uint8, and specify PackBits compression.
                profile.update(
                    dtype=r.uint8,
                    count=1,
                    compress='packbits')

                with r.open(os.path.join(path_to_predictions_folder, f'{tile:03}.tif'), 'w', **profile) as dst:
                    dst.write(result.astype(r.uint8), 1)
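How the predicted classes end up in the output raster can be seen on a tiny example. This is a sketch with made-up values: `scores` stands in for the output of `model.predict`, and two output classes are assumed because the model separates annual from perennial agriculture (which index corresponds to which class depends on how the model was trained).

```python
import numpy as np

# Toy agriculture mask: 1 marks pixels to classify.
mask = np.array([[0, 1, 1],
                 [1, 0, 0]], dtype=np.uint8)
mask_mask = mask == 1

# Stand-in for model.predict: one row of class scores per masked pixel,
# in the same row-major order that boolean indexing produces.
scores = np.array([[0.9, 0.1],
                   [0.2, 0.8],
                   [0.6, 0.4]], dtype=np.float32)

# argmax gives class indices 0 or 1; adding 1 keeps 0 free for unclassified pixels.
prediction = np.argmax(scores, axis=-1) + 1

# Write the predictions back into a raster-shaped array.
result = np.zeros(mask.shape, dtype=np.uint8)
result[mask_mask] = prediction
print(result)
# [[0 1 2]
#  [1 0 0]]
```

In the written tiles, the value 0 therefore means "outside the agriculture mask", while 1 and 2 are the two predicted classes.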