<p align="center">
  <img src="https://user-images.githubusercontent.com/90031508/183531098-494a5819-7714-4f72-8ff8-d038982eb5f0.png" alt="Water Oracle logo"/>
</p>



This work is adapted from 'TensorFlow example workflows',
https://developers.google.com/earth-engine/guides/tf_examples.
Copyright 2020 Google LLC. https://www.apache.org/licenses/LICENSE-2.0

Please run this notebook on Google Colab (Pro+).

<table class="ee-notebook-buttons" align="left"><td>
<a target="_blank"  href="https://colab.research.google.com/drive/1VHBIUorm3GaDxb_GQFhRb-WpSfokexAO?usp=sharing">
    <img src="https://www.tensorflow.org/images/colab_logo_32px.png" /> Run in Google Colab</a>
</td><td>
<a target="_blank"  href="https://github.com/ese-msc-2021/irp-kl121"><img width=32px src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" /> View source on GitHub</a></td></table>

# Introduction


## Prerequisites
- Google account and login
- Google Colab subscription (Pro or Pro+) is optional but helps with long runtimes
- Google Cloud Platform account, needed to use a Google Cloud bucket. (Note that you need sufficient funds to store a large number of models and a large amount of training data.)
- Wandb.ai account, which is free of charge

## What is this notebook?

The main purpose of this notebook is:
- <b>Exporting training data from Thailand to a Google Cloud bucket</b>: use the Earth Engine package to export multiple training patches that will be used to train models in the other notebook (TrainModels.ipynb)

The notebook is broken down into 4 objectives:

1. Experimental Setup
2. Sampling Data for training and testing
3. Sampling Data for Southern Thailand Flood event
4. Sampling Data for Cloud Cover vs Model performance

## Creating Packages

This section creates the tools package that is used throughout the notebook. The package includes:
- metrics_.py
- config.py
- preprocessing.py
- sampling.py

In [1]:
PACKAGE_PATH = 'tools'

!ls -l
!mkdir {PACKAGE_PATH}
!touch {PACKAGE_PATH}/__init__.py
!ls -l {PACKAGE_PATH}

total 4
drwxr-xr-x 1 root root 4096 Aug 15 13:44 sample_data
total 0
-rw-r--r-- 1 root root 0 Aug 24 16:01 __init__.py


In [2]:
%%writefile {PACKAGE_PATH}/metrics_.py

from keras import backend as K
import tqdm.notebook as tq
import numpy as np
import tensorflow as tf
from sklearn.metrics import f1_score
from sklearn.metrics import precision_score
from sklearn.metrics import recall_score
from sklearn.metrics import accuracy_score

CONFIG = None

__all__ = ["f1", "custom_accuracy", "MetricCalculator",
           "MetricCalculator_multiview_2", "MetricCalculator_multiview_3",
           "MetricCalculator_NDWI", "ndwi_threashold"]


def f1(y_true, y_pred):
    """
    The function is used as tensorflow metrics when training.
    It takes in the ground truth and the model predicted result
    and evaluate the F1 score. This is an experimental function
    and should not be used as further model training metric.

    Parameters
    ----------
    y_true : tf.tensor
    y_pred : tf.tensor

    Returns
    ----------
    F1 score in keras backend

    Notes
    -----
    This function is flawed because Keras calculates metrics batchwise,
    which is why the F1 metric was removed from Keras. To calculate the
    F1 score properly, we can use a callback or calculate the F1 score
    manually after the model has finished training. The latter is chosen,
    as can be seen in MetricCalculator, MetricCalculator_multiview_2
    and MetricCalculator_multiview_3.

    This function is kept because the models were initially trained
    with these metrics and stored in the Google Cloud bucket. To load
    those models, the same metrics must be passed in. Since the model
    is optimized on the loss rather than the metrics, the incorrect
    metric does not affect the model training process.
    The code is obtained/modified from:

    https://stackoverflow.com/questions/43547402/how-to-calculate-f1-macro-in-keras

    https://neptune.ai/blog/implementing-the-macro-f1-score-in-keras
    """
    def recall(y_true, y_pred):
        """
        Recall metric.

        Only computes a batch-wise average of recall.

        Computes the recall, a metric for multi-label classification of
        how many relevant items are selected.
        """
        true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
        possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
        recall = true_positives / (possible_positives + K.epsilon())
        return recall

    def precision(y_true, y_pred):
        """
        Precision metric.

        Only computes a batch-wise average of precision.

        Computes the precision, a metric for multi-label classification of
        how many selected items are relevant.
        """
        true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
        predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
        precision = true_positives / (predicted_positives + K.epsilon())
        return precision
    precision = precision(y_true, y_pred)
    recall = recall(y_true, y_pred)
    return 2 * ((precision * recall) / (precision + recall + K.epsilon()))


def custom_accuracy(y_true, y_pred):
    """
    The function is used as tensorflow metrics when training.
    It takes in the ground truth and the model predicted result
    and evaluate the accuracy score. This is an experimental function
    and should not be used as further model training metric.

    Parameters
    ----------
    y_true : tf.tensor
    y_pred : tf.tensor

    Returns
    ----------
    accuracy score in keras backend

    Notes
    -----
    This function is modified from the F1 metric above to fit
    the definition of accuracy. However, TensorFlow's
    "categorical_accuracy" is used instead. The accuracy metric
    is also recalculated in MetricCalculator,
    MetricCalculator_multiview_2 and MetricCalculator_multiview_3.

    This function is kept because the models were initially
    trained with these metrics and stored in the Google Cloud
    bucket. To load those models, the same metrics must be
    passed in. Since the model is optimized on the loss rather
    than the metrics, the incorrect metric does not affect
    the model training process. The code is obtained/modified from:

    https://stackoverflow.com/questions/43547402/how-to-calculate-f1-macro-in-keras

    https://neptune.ai/blog/implementing-the-macro-f1-score-in-keras
    """
    # total_data = K.int_shape(y_true) + K.int_shape(y_pred)
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    # True negatives: both the label and the prediction are 0.
    true_negatives = K.sum(
        K.round(K.clip((1 - y_true) * (1 - y_pred), 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    total_data = - true_positives + true_negatives + \
        possible_positives + predicted_positives
    return (true_positives + true_negatives) / (total_data + K.epsilon())


def MetricCalculator(model, test_data, total_steps):
    """
    This function takes in the feature stack model loaded
    from google cloud bucket, the test_data which is the
    tensor object and the number of steps and returns
    the metrics including accuracy, recall, precision and F1

    Parameters
    ----------
    model : keras.engine.functional.Functional
    test_data : RepeatDataset with tf.float32
    total_steps : int/float

    Returns
    ----------
    Returns the precision, recall, f1, accuracy
    metric based on the model performance.

    Notes
    -----
    This function should be used instead of the F1, custom_accuracy
    written above. The code is obtained/modified from:

    https://stackoverflow.com/questions/43547402/how-to-calculate-f1-macro-in-keras

    https://neptune.ai/blog/implementing-the-macro-f1-score-in-keras
    """
    pred = []
    true = []
    pbar = tq.tqdm(total=total_steps)
    for steps, data in enumerate(test_data):
        pbar.update(1)
        if steps == total_steps:
            break
        input = data[0]
        y_true = data[1]
        y_pred = np.rint(model.predict(input))
        y_true = np.reshape(y_true, (256 * 256, 2))
        y_pred = np.reshape(y_pred, (256 * 256, 2))
        pred.append(y_pred)
        true.append(y_true)
    f1_macro = f1_score(np.reshape(true, (total_steps * 65536, 2)),
                        np.reshape(pred, (total_steps * 65536, 2)),
                        average="macro")
    recall_macro = recall_score(np.reshape(true, (total_steps * 65536, 2)),
                                np.reshape(pred, (total_steps * 65536, 2)),
                                average="macro")
    precision_macro = precision_score(np.reshape(true,
                                      (total_steps * 65536, 2)),
                                      np.reshape(pred,
                                      (total_steps * 65536, 2)),
                                      average="macro")
    accuracy = accuracy_score(np.reshape(true, (total_steps * 65536, 2)),
                              np.reshape(pred, (total_steps * 65536, 2)))

    print("precision_macro: ", precision_macro)
    print("recall_macro: ", recall_macro)
    print("F1_macro_Score: ", f1_macro)
    print("Accuracy: ", accuracy)

    return precision_macro, recall_macro, f1_macro, accuracy


def MetricCalculator_multiview_2(model, test_data, total_steps):
    """
    This function takes in the multiview-2 model loaded
    from google cloud bucket, the test_data which is the
    tensor object and the number of steps and returns
    the metrics including accuracy, recall, precision and F1

    Parameters
    ----------
    model : keras.engine.functional.Functional
    test_data : RepeatDataset with tf.float32
    total_steps : int/float

    Returns
    ----------
    Returns the precision, recall, f1, accuracy metric
    based on the model performance.

    Notes
    -----
    This function should be used instead of the F1,
    custom_accuracy written above. The code is obtained/modified from:

    https://stackoverflow.com/questions/43547402/how-to-calculate-f1-macro-in-keras

    https://neptune.ai/blog/implementing-the-macro-f1-score-in-keras
    """
    pbar = tq.tqdm(total=total_steps)
    pred = []
    true = []
    for steps, data in enumerate(test_data):
        pbar.update(1)
        if steps >= total_steps:
            break
        input = data[0]
        x1, x2 = tf.split(input, [len(CONFIG.BANDS1), len(CONFIG.BANDS2)], 3)
        y_true = data[1]
        y_pred = np.rint(model.predict([x1, x2]))
        y_true = np.reshape(y_true, (256 * 256, 2))
        y_pred = np.reshape(y_pred, (256 * 256, 2))
        pred.append(y_pred)
        true.append(y_true)
    f1_macro = f1_score(np.reshape(true, (total_steps * 65536, 2)),
                        np.reshape(pred, (total_steps * 65536, 2)),
                        average="macro")
    recall_macro = recall_score(np.reshape(true, (total_steps * 65536, 2)),
                                np.reshape(pred, (total_steps * 65536, 2)),
                                average="macro")
    precision_macro = precision_score(np.reshape(true,
                                      (total_steps * 65536, 2)),
                                      np.reshape(pred,
                                      (total_steps * 65536, 2)),
                                      average="macro")
    accuracy = accuracy_score(np.reshape(true, (total_steps * 65536, 2)),
                              np.reshape(pred, (total_steps * 65536, 2)))

    print("precision_macro: ", precision_macro)
    print("recall_macro: ", recall_macro)
    print("F1_macro_Score: ", f1_macro)
    print("Accuracy: ", accuracy)

    return precision_macro, recall_macro, f1_macro, accuracy


def MetricCalculator_multiview_3(model, test_data, total_steps):
    """
    This function takes in the multiview-3 model loaded from
    google cloud bucket, the test_data which is the tensor object
    and the number of steps and returns the metrics including
    accuracy, recall, precision and F1

    Parameters
    ----------
    model : keras.engine.functional.Functional
    test_data : RepeatDataset with tf.float32
    total_steps : int/float

    Returns
    ----------
    Returns the precision, recall, f1,
    accuracy metric based on the model performance.

    Notes
    -----
    This function should be used instead of the F1,
    custom_accuracy written above. The code is obtained/modified from:

    https://stackoverflow.com/questions/43547402/how-to-calculate-f1-macro-in-keras

    https://neptune.ai/blog/implementing-the-macro-f1-score-in-keras
    """
    pbar = tq.tqdm(total=total_steps)
    pred = []
    true = []
    for steps, data in enumerate(test_data):
        pbar.update(1)
        if steps >= total_steps:
            break
        input = data[0]
        x1, x2, x3 = tf.split(input,
                              [len(CONFIG.BANDS1),
                               len(CONFIG.BANDS2),
                               len(CONFIG.BANDS3)],
                              3)
        y_true = data[1]
        y_pred = np.rint(model.predict([x1, x2, x3]))
        y_true = np.reshape(y_true, (256 * 256, 2))
        y_pred = np.reshape(y_pred, (256 * 256, 2))
        pred.append(y_pred)
        true.append(y_true)
    f1_macro = f1_score(np.reshape(true, (total_steps * 65536, 2)),
                        np.reshape(pred, (total_steps * 65536, 2)),
                        average="macro")
    recall_macro = recall_score(np.reshape(true, (total_steps * 65536, 2)),
                                np.reshape(pred, (total_steps * 65536, 2)),
                                average="macro")
    precision_macro = precision_score(np.reshape(true,
                                      (total_steps * 65536, 2)),
                                      np.reshape(pred,
                                      (total_steps * 65536, 2)),
                                      average="macro")
    accuracy = accuracy_score(np.reshape(true, (total_steps * 65536, 2)),
                              np.reshape(pred, (total_steps * 65536, 2)))

    print("precision_macro: ", precision_macro)
    print("recall_macro: ", recall_macro)
    print("F1_macro_Score: ", f1_macro)
    print("Accuracy: ", accuracy)

    return precision_macro, recall_macro, f1_macro, accuracy


def ndwi_threashold(B3, B5):
    """
    This function takes in band 3 and band 5 from the Landsat
    imagery and returns a tuple predicting whether water is
    present or not. The threshold is set at 0.

    Parameters
    ----------
    B3 : float
        Band 3 (green) value of a pixel.
    B5 : float
        Band 5 (near infrared) value of a pixel.

    Returns
    ----------
    (0, 1) if NDWI > 0 (water), otherwise (1, 0)
    """
    ndwi = (B3 - B5) / (B3 + B5)
    if ndwi > 0:
        return 0, 1
    else:
        return 1, 0


def MetricCalculator_NDWI(test_data, total_steps):
    """
    This function takes in the test_data which is the tensor object and
    the number of steps and returns the metrics including accuracy,
    recall, precision and F1 for NDWI performance.

    Parameters
    ----------
    test_data : RepeatDataset with tf.float32
    total_steps : int/float

    Returns
    ----------
    Returns the precision, recall, f1, accuracy metric
    based on the NDWI performance
    """
    pred = []
    true = []
    pbar = tq.tqdm(total=total_steps)
    for steps, data in enumerate(test_data):
        # print(f'Number of steps: {steps}', end = "\r")
        pbar.update(1)
        if steps == total_steps:
            break
        input = data[0]
        y_true = data[1]
        input = np.reshape(input, (256 * 256, 2))
        y_pred = []
        for i in range(256 * 256):
            B3, B5 = input[i]
            first, second = ndwi_threashold(B3, B5)
            y_pred.append([first, second])
        y_true = np.reshape(y_true, (256 * 256, 2))
        y_pred = np.reshape(y_pred, (256 * 256, 2))
        pred.append(y_pred)
        true.append(y_true)
    f1_macro = f1_score(np.reshape(true, (total_steps * 65536, 2)),
                        np.reshape(pred, (total_steps * 65536, 2)),
                        average="macro")
    recall_macro = recall_score(np.reshape(true, (total_steps * 65536, 2)),
                                np.reshape(pred, (total_steps * 65536, 2)),
                                average="macro")
    precision_macro = precision_score(np.reshape(true,
                                                 (total_steps * 65536, 2)),
                                      np.reshape(pred,
                                                 (total_steps * 65536, 2)),
                                      average="macro")
    accuracy = accuracy_score(np.reshape(true, (total_steps * 65536, 2)),
                              np.reshape(pred, (total_steps * 65536, 2)))

    print("precision_macro: ", precision_macro)
    print("recall_macro: ", recall_macro)
    print("F1_macro_Score: ", f1_macro)
    print("Accuracy: ", accuracy)

    return precision_macro, recall_macro, f1_macro, accuracy


Writing tools/metrics_.py
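
The note in the f1 docstring about batchwise calculation can be illustrated with a small sketch in plain Python. The helper names (counts, f1_from_counts) and the two toy batches are made up for this illustration and are not part of the tools package. The sketch suggests why the MetricCalculator functions recompute the scores over all collected predictions: the mean of per-batch F1 scores is generally not the same as the F1 score over all samples.

```python
def counts(y_true, y_pred):
    """Count true positives, false positives and false negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, fn


def f1_from_counts(tp, fp, fn):
    """F1 score from raw counts, returning 0.0 when undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Two hypothetical batches of (labels, predictions).
batches = [
    ([1, 1, 1, 0], [1, 1, 1, 0]),  # every pixel correct
    ([1, 0, 0, 0], [0, 1, 1, 0]),  # no true positives at all
]

# What a batchwise metric reports: the mean of per-batch scores.
batchwise_mean = sum(
    f1_from_counts(*counts(t, p)) for t, p in batches) / len(batches)

# What a post-training calculation reports: one score over all samples.
all_true = [v for t, _ in batches for v in t]
all_pred = [v for _, p in batches for v in p]
global_f1 = f1_from_counts(*counts(all_true, all_pred))

print("batchwise mean F1:", batchwise_mean)
print("global F1:", global_f1)
```

With these toy batches the two values differ (0.5 against roughly 0.67), even though both are computed from the same predictions.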


In [3]:
%%writefile {PACKAGE_PATH}/config.py

import tensorflow as tf
from . import metrics_

__all__ = ["configuration"]


class configuration:
    """
    Each experiment trains the neural network on a different
    combination of satellite bands and with a different approach:
    feature stack, or multiview learning with two or three
    perceptrons. As each experiment has different settings, it is
    important to store them and reuse them throughout the project.
    This class lets the user store and reuse those settings.
    """
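    def __init__(self, PROJECT_TITLE, BANDS1, TRAIN_SIZE, EVAL_SIZE,
                 BANDS2=None, BANDS3=None, country="TH", image=None,
                 sam_arr=None, type_=1, LOSS="categorical_crossentropy",
                 EPOCHS=10, BATCH_SIZE=16, dropout_prob=0.3):
        """
        Initialise and store the parameters for later use.

        Parameters
        ----------
        PROJECT_TITLE : string
        BANDS1 : list
        TRAIN_SIZE : int/float
        EVAL_SIZE : int/float
        BANDS2 : list, optional
        BANDS3 : list, optional
        country : string
        image : ee.image.Image
        sam_arr : ee.image.Image
        type_ : int
            1 for feature stack, 2 or 3 for multiview learning
            with two or three inputs
        LOSS : string
        EPOCHS : int
        BATCH_SIZE : int
        dropout_prob : float
        """
        # Use None defaults to avoid sharing mutable lists
        # between instances.
        BANDS2 = [] if BANDS2 is None else BANDS2
        BANDS3 = [] if BANDS3 is None else BANDS3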
        if type_ == 1:
            self.type_ = "fs"
        elif type_ == 2:
            self.type_ = "m2"
        elif type_ == 3:
            self.type_ = "m3"
        else:
            self.type_ = None
        self.country = country
        self.PROJECT_TITLE = PROJECT_TITLE
        self.BANDS1 = BANDS1
        self.BANDS2 = BANDS2
        self.BANDS3 = BANDS3
        self.BUCKET = "geebucketwater"
        self.FOLDER = f'{self.type_}_{self.country}_Cnn_{self.PROJECT_TITLE}'
        self.TRAIN_SIZE = TRAIN_SIZE
        self.EVAL_SIZE = EVAL_SIZE
        self.TRAINING_BASE = 'training_patches'
        self.EVAL_BASE = 'eval_patches'
        self.TEST_BASE = 'test_patches'
        self.RESPONSE = 'water'
        self.BANDS = BANDS1 + BANDS2 + BANDS3
        self.FEATURES = BANDS1 + BANDS2 + BANDS3 + [self.RESPONSE]
        # Specify the size and shape of patches expected by the model.
        self.KERNEL_SIZE = 256
        self.KERNEL_SHAPE = [self.KERNEL_SIZE, self.KERNEL_SIZE]
        self.COLUMNS = [
            tf.io.FixedLenFeature(shape=self.KERNEL_SHAPE, dtype=tf.float32)
            for k in self.FEATURES
        ]
        self.FEATURES_DICT = dict(zip(self.FEATURES, self.COLUMNS))
        # Specify model training parameters.
        self.BATCH_SIZE = BATCH_SIZE
        self.EPOCHS = EPOCHS
        self.BUFFER_SIZE = 2000
        self.OPTIMIZER = 'adam'
        self.LOSS = LOSS
        self.dropout_prob = dropout_prob
        self.METRICS = ['AUC', "categorical_accuracy", metrics_.f1]
        self.image = image
        self.sam_arr = sam_arr


Writing tools/config.py
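
As a rough sketch of how configuration derives the bucket folder name from its arguments, the mapping can be reproduced without TensorFlow or Earth Engine. The function folder_name and the title "example_title" are hypothetical and exist only for this illustration; the type codes and the name pattern are taken from the class above.

```python
def folder_name(project_title, type_=1, country="TH"):
    """Mirror how configuration builds self.FOLDER (illustration only)."""
    # 1 = feature stack, 2 and 3 = multiview learning with 2 or 3 inputs.
    type_codes = {1: "fs", 2: "m2", 3: "m3"}
    # Unknown types map to None, as in the class.
    code = type_codes.get(type_)
    return f"{code}_{country}_Cnn_{project_title}"


print(folder_name("example_title"))
print(folder_name("example_title", type_=2))
print(folder_name("example_title", type_=3, country="XX"))
```

Because the folder name encodes the model type, country and project title, each experiment writes its patches to a separate folder in the bucket.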


In [4]:
%%writefile {PACKAGE_PATH}/preprocessing.py

import tensorflow as tf
import ee

__all__ = ["Preprocessor", "maskL8sr", "EnsureTwodigit",
           "GenSeasonalDatesMonthly", "getQABits", "cloud_shadows",
           "clouds", "maskClouds", "applyScaleFactors", "changeNames"]


class Preprocessor:
    """
    Class that preprocesses and returns the training,
    evaluation and testing data from the Google Cloud bucket
    """
    def __init__(self, config):
        self.config = config

    def parse_tfrecord(self, example_proto):
        """
        The parsing function. Reads a serialized example
        into the structure defined by FEATURES_DICT.

        Parameters
        ----------
        example_proto: a serialized Example

        Returns
        ----------
        A dictionary of tensors, keyed by feature name.

        Notes
        -----
        The code is obtained/modified from:

        https://github.com/google/earthengine-api/blob/master/python/examples/ipynb/UNET_regression_demo.ipynb
        """
        return tf.io.parse_single_example(example_proto,
                                          self.config.FEATURES_DICT)

    def to_tuple(self, inputs):
        """
        Function to convert a dictionary of tensors to a
        tuple of (inputs, outputs). Turn the tensors returned
        by parse_tfrecord into a stack in HWC shape.

        Parameters
        ----------
        inputs: A dictionary of tensors, keyed by feature name.

        Returns
        ----------
        A tuple of (inputs, outputs).

        Notes
        -----
        The code is obtained/modified from:

        https://github.com/google/earthengine-api/blob/master/python/examples/ipynb/UNET_regression_demo.ipynb
        """
        inputsList = [inputs.get(key) for key in self.config.FEATURES]
        stacked = tf.stack(inputsList, axis=0)
        # Convert from CHW to HWC
        stacked = tf.transpose(stacked, [1, 2, 0])
        return stacked[:, :, :len(self.config.BANDS)], \
            tf.reshape(tf.one_hot(
                tf.cast(stacked[:, :, len(self.config.BANDS):],
                        tf.int32),
                depth=2), [256, 256, 2])

    def get_dataset(self, pattern):
        """
        Function to read, parse and format to tuple a
        set of input tfrecord files. Get all the files
        matching the pattern, parse and convert to tuple.

        Parameters
        ----------
        pattern: A file pattern to match in a Cloud Storage bucket.

        Returns
        ----------
        A tf.data.Dataset

        Notes
        -----
        The code is obtained/modified from:

        https://github.com/google/earthengine-api/blob/master/python/examples/ipynb/UNET_regression_demo.ipynb
        """
        try:
            glob = tf.io.gfile.glob(pattern)
        except: # noqa
            return "the bucket you specified doesn't exist"
        if glob == []:
            return "the path you specified doesn't have the data"
        dataset = tf.data.TFRecordDataset(glob, compression_type='GZIP')
        dataset = dataset.map(self.parse_tfrecord, num_parallel_calls=5)
        dataset = dataset.map(self.to_tuple, num_parallel_calls=5)
        return dataset

    def get_training_dataset(self, location):
        """
        Get the preprocessed training dataset.

        Parameters
        ----------
        location: string

        Returns
        ----------
        A tf.data.Dataset of training data.

        Notes
        -----
        The code is obtained/modified from:

        https://github.com/google/earthengine-api/blob/master/python/examples/ipynb/UNET_regression_demo.ipynb
        """
        glob = 'gs://' + self.config.BUCKET + \
            '/' + location + "training_patches_" + '*'
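        dataset = self.get_dataset(glob)
        # get_dataset returns an error message (str) when the bucket
        # or path cannot be found; pass it through unchanged.
        if isinstance(dataset, str):
            return dataset
        dataset = dataset.shuffle(self.config.BUFFER_SIZE).\
            batch(self.config.BATCH_SIZE).\
            repeat()
        return dataset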

    def get_training_dataset_for_testing(self, location):
        """
        Get the preprocessed training dataset for testing.

        Parameters
        ----------
        location: string

        Returns
        ----------
        A tf.data.Dataset of training data.

        Notes
        -----
        The code is obtained/modified from:

        https://github.com/google/earthengine-api/blob/master/python/examples/ipynb/UNET_regression_demo.ipynb
        """
        glob = 'gs://' + self.config.BUCKET + \
               '/' + location + "training_patches_" + '*'
        dataset = self.get_dataset(glob)
        if type(dataset) == str:
            return dataset
        dataset = dataset.batch(1).repeat()
        return dataset

    def get_eval_dataset(self, location):
        """
        Get the preprocessed evaluation dataset.

        Parameters
        ----------
        location: string

        Returns
        ----------
        A tf.data.Dataset of evaluation data.

        Notes
        -----
        The code is obtained/modified from:

        https://github.com/google/earthengine-api/blob/master/python/examples/ipynb/UNET_regression_demo.ipynb
        """
        glob = 'gs://' + self.config.BUCKET + \
               '/' + location + "eval_patches_" + '*'
        dataset = self.get_dataset(glob)
        if type(dataset) == str:
            return dataset
        dataset = dataset.batch(1).repeat()
        return dataset

    # print(iter(evaluation.take(1)).next())

    def get_test_dataset(self, location, test_base):
        """
        Get the preprocessed testing dataset.

        Parameters
        ----------
        location: string
        test_base: string
            File name prefix of the test patches inside location.

        Returns
        ----------
        A tf.data.Dataset of testing data.

        Notes
        -----
        The code is obtained/modified from:

        https://github.com/google/earthengine-api/blob/master/python/examples/ipynb/UNET_regression_demo.ipynb
        """
        glob = 'gs://' + self.config.BUCKET + \
               '/' + location + test_base + '*'
        dataset = self.get_dataset(glob)
        if type(dataset) == str:
            return dataset
        dataset = dataset.batch(1).repeat()
        return dataset


def maskL8sr(image):
    """
    Take a Landsat-8 image and return a cloud-masked image.

    Parameters
    ----------
    image: ee.image.Image

    Returns
    ----------
    A masked Landsat-8 ee.image.Image

    Notes
    -----
    The code is obtained/modified from:

    https://github.com/google/earthengine-api/blob/master/python/examples/ipynb/UNET_regression_demo.ipynb
    """
    BANDS = ['B2', 'B3', 'B4', 'B5', 'B6', 'B7']
    cloudShadowBitMask = ee.Number(2).pow(3).int()
    cloudsBitMask = ee.Number(2).pow(5).int()
    qa = image.select('pixel_qa')
    mask = qa.bitwiseAnd(cloudShadowBitMask).eq(0).And(
        qa.bitwiseAnd(cloudsBitMask).eq(0))
    return image.updateMask(mask).select(BANDS).divide(10000)


def EnsureTwodigit(number):
    """
    Transform the input month into a two-digit string in the
    correct format for dates. Values above 12 are clamped to "12".

    Parameters
    ----------
    number: int

    Returns
    ----------
    months in string.

    """
    if number > 12:
        return str(12)
    if number < 10:
        return "0" + str(number)
    else:
        return str(number)


def GenSeasonalDatesMonthly(start, end, month_frequency=3):
    """
    Given two dictionaries containing the keys "month" and "year",
    return two lists holding the start and end dates of each
    interval of month_frequency months between start and end.
    Both dates must fall in the same year.

    Parameters
    ----------
    start: dict
    end: dict
    month_frequency: int

    Returns
    ----------
    Two lists of date strings ("YYYY-MM-01"): interval starts and ends

    """
    diff_year = end["year"] - start["year"]
    diff_month = end["month"] - start["month"]
    starts = []
    ends = []
    first_data = str(start["year"]) + "-" + \
        EnsureTwodigit(start["month"]) + "-01"
    if diff_year > 0:
        return "please insert the same year"
    else:
        for i in range(round(diff_month / month_frequency)):
            first_data = str(start["year"]) + "-" + \
                EnsureTwodigit(start["month"] + month_frequency * i) + "-01"
            second_data = str(start["year"]) + "-" + \
                EnsureTwodigit(start["month"] +
                               month_frequency *
                               i +
                               month_frequency) + "-01"
            starts.append(first_data)
            ends.append(second_data)
    return starts, ends


# As collection 1 of Landsat-8 ceased at
# December 2021, collection 2 must be used instead


def getQABits(image, start, end, newName):
    """
    Extract the bits from start to end (inclusive) of a QA band.

    Parameters
    ----------
    image: ee.image.Image
    start: int
    end: int
    newName: string

    Returns
    ----------
    Return a single band image of the extracted QA bits
    with a new name

    Notes
    ----------
    Code is modified from
    https://gis.stackexchange.com/questions/277059/cloud-mask-for-landsat8-on-google-earth-engine
    """
    pattern = 0
    for i in range(start, end + 1):
        pattern += 2**i
    return image.select([0], [newName])\
                .bitwiseAnd(pattern)\
                .rightShift(start)


def cloud_shadows(image):
    """
    Return a mask that is 1 where a pixel is free of cloud shadow,
    based on the QA_PIXEL band.

    Parameters
    ----------
    image: ee.image.Image

    Returns
    ----------
    Return an image masking out cloud shadow areas.

    Notes
    -----
    Code is modified from
    https://gis.stackexchange.com/questions/277059/cloud-mask-for-landsat8-on-google-earth-engine
    """
    # Select the QA band.
    QA = image.select(['QA_PIXEL'])
    # In Landsat Collection 2 QA_PIXEL, bit 4 is the cloud shadow flag
    # (bit 3 was cloud shadow in the Collection 1 pixel_qa band).
    return getQABits(QA, 4, 4, 'cloud_shadows').eq(0)


def clouds(image):
    """
    Return a mask that is 1 where a pixel is free of cloud,
    based on the QA_PIXEL band.

    Parameters
    ----------
    image: ee.image.Image

    Returns
    ----------
    Return an image masking out cloudy areas.

    Notes
    -----
    Code is modified from
    https://gis.stackexchange.com/questions/277059/cloud-mask-for-landsat8-on-google-earth-engine
    """
    # Select the QA band.
    QA = image.select(['QA_PIXEL'])
    # In Landsat Collection 2 QA_PIXEL, bit 3 is the cloud flag
    # (bit 5 is snow; bit 5 was cloud in the Collection 1 pixel_qa band).
    return getQABits(QA, 3, 3, 'Cloud').eq(0)


def maskClouds(image):
    """
    Combine the functions above to mask out clouds and
    cloud shadows.

    Parameters
    ----------
    image: ee.image.Image

    Returns
    ----------
    Return an image masking out cloudy and shadow area.

    Notes
    -----
    Code is modified from
    https://gis.stackexchange.com/questions/277059/cloud-mask-for-landsat8-on-google-earth-engine
    """
    cs = cloud_shadows(image)
    c = clouds(image)
    image = image.updateMask(cs)
    return image.updateMask(c)


def applyScaleFactors(image):
    """
    Apply the Landsat Collection 2 scale factors to the optical
    and thermal bands.

    Parameters
    ----------
    image: ee.image.Image

    Returns
    ----------
    Adjusted image with correct scale factor

    Notes
    -----
    Code is modified from
    https://gis.stackexchange.com/questions/277059/cloud-mask-for-landsat8-on-google-earth-engine
    """
    opticalBands = image.select('SR_B.').multiply(0.0000275).add(-0.2)
    thermalBands = image.select('ST_B.*').multiply(0.00341802).add(149.0)
    return image.addBands(opticalBands, None, True)\
                .addBands(thermalBands, None, True)


def changeNames(image):
    """
    Rename the Collection 2 bands to match the Collection 1 band names.

    Parameters
    ----------
    image: ee.image.Image

    Returns
    ----------
    ee.image.Image with adjusted bandNames
    """
    return image.select(['SR_B1', 'SR_B2', 'SR_B3',
                         'SR_B4', 'SR_B5', 'SR_B6',
                         'SR_B7', 'SR_QA_AEROSOL',
                         'ST_B10', 'ST_ATRAN', 'ST_CDIST',
                         'ST_DRAD', 'ST_EMIS', 'ST_EMSD',
                         'ST_QA', 'ST_TRAD', 'ST_URAD',
                         'QA_PIXEL', 'QA_RADSAT'],
                        ['B1', 'B2', 'B3', 'B4', 'B5',
                         'B6', 'B7', 'SR_QA_AEROSOL',
                         'ST_B10', 'ST_ATRAN', 'ST_CDIST',
                         'ST_DRAD', 'ST_EMIS', 'ST_EMSD',
                         'ST_QA', 'ST_TRAD', 'ST_URAD',
                         'QA_PIXEL', 'QA_RADSAT'])


Writing tools/preprocessing.py
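
The bit test in `clouds` and the arithmetic in `applyScaleFactors` can be checked on plain numbers without Earth Engine. The sketch below is illustrative only: `get_qa_bits`, `scale_optical` and `scale_thermal` are hypothetical helper names, and it assumes that `getQABits(image, start, end, name)` extracts the inclusive bit range `start..end`, as in the Stack Exchange answer cited in the docstrings.

```python
def get_qa_bits(value, start, end):
    """Return the integer stored in bits start..end (inclusive) of value.

    Hypothetical pure-Python counterpart of the per-pixel operation
    that getQABits is assumed to perform on a QA band.
    """
    width = end - start + 1
    return (value >> start) & ((1 << width) - 1)


def scale_optical(dn):
    """Apply the optical scale and offset used in applyScaleFactors."""
    return dn * 0.0000275 - 0.2


def scale_thermal(dn):
    """Apply the thermal scale and offset used in applyScaleFactors."""
    return dn * 0.00341802 + 149.0


# A pixel is kept by clouds() when the extracted bit equals 0.
flagged = 0b100000      # bit 5 set
unflagged = 0b011111    # bit 5 unset
print(get_qa_bits(flagged, 5, 5) == 0)
print(get_qa_bits(unflagged, 5, 5) == 0)
print(scale_optical(10000), scale_thermal(40000))
```

A single-bit range (`start == end`) returns 0 or 1, so comparing the result with 0 gives the "keep this pixel" mask.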


In [5]:
%%writefile {PACKAGE_PATH}/sampling.py

import ee

__all__ = ["Training_task", "Eval_task", "Testing_task"]


def Training_task(trainingPolys, n, N, arrays, setting, foldername):
    """
    Export training data to a Google Cloud Storage bucket.

    Parameters
    ----------
    trainingPolys : ee.featurecollection.FeatureCollection
    n : int
        Number of shards sampled in each polygon
    N : int/float
        Total sample size in each polygon
    arrays: ee.image.Image
    setting: tools.config.configuration
    foldername : string

    Returns
    ----------
    None. One export task per polygon is started.

    Notes
    -----
    The code is obtained/modified from:

    https://github.com/google/earthengine-api/blob/master/python/examples/ipynb/UNET_regression_demo.ipynb
    """
    trainingPolysList = trainingPolys.toList(trainingPolys.size())
    # Export all the training data (in many pieces), with one task
    # per geometry.
    for g in range(trainingPolys.size().getInfo()):
        geomSample = ee.FeatureCollection([])
        for i in range(n):
            sample = arrays.sample(
                region=ee.Feature(trainingPolysList.get(g)).geometry(),
                scale=30,
                numPixels=N / n,
                seed=i,
                tileScale=8
            )
            # Merge inside the loop so that every shard is kept,
            # not only the last one.
            geomSample = geomSample.merge(sample)

        desc = setting.TRAINING_BASE + '_g' + str(g)
        task = ee.batch.Export.table.toCloudStorage(
            collection=geomSample,
            description=desc,
            bucket=setting.BUCKET,
            fileNamePrefix=foldername + '/' + desc,
            fileFormat='TFRecord',
            selectors=setting.BANDS + [setting.RESPONSE]
        )
        task.start()


def Eval_task(evalPolys, n, N, arrays, setting, foldername):
    """
    Export evaluation data to a Google Cloud Storage bucket.

    Parameters
    ----------
    evalPolys : ee.featurecollection.FeatureCollection
    n : int
        Number of shards sampled in each polygon
    N : int/float
        Total sample size in each polygon
    arrays: ee.image.Image
    setting: tools.config.configuration
    foldername : string

    Returns
    ----------
    None. One export task per polygon is started.

    Notes
    -----
    The code is obtained/modified from:

    https://github.com/google/earthengine-api/blob/master/python/examples/ipynb/UNET_regression_demo.ipynb
    """
    evalPolysList = evalPolys.toList(evalPolys.size())
    # Export all the evaluation data.
    for g in range(evalPolys.size().getInfo()):
        geomSample = ee.FeatureCollection([])
        for i in range(n):
            sample = arrays.sample(
                region=ee.Feature(evalPolysList.get(g)).geometry(),
                scale=30,
                numPixels=N / n,
                seed=i,
                tileScale=8
            )
            geomSample = geomSample.merge(sample)

        desc = setting.EVAL_BASE + '_g' + str(g)
        task = ee.batch.Export.table.toCloudStorage(
            collection=geomSample,
            description=desc,
            bucket=setting.BUCKET,
            fileNamePrefix=foldername + '/' + desc,
            fileFormat='TFRecord',
            selectors=setting.BANDS + [setting.RESPONSE]
        )
        task.start()


def Testing_task(testPolys, n, N, arrays, setting, foldername, Test_base):
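    """
    Export testing data to a Google Cloud Storage bucket.

    Parameters
    ----------
    testPolys : ee.featurecollection.FeatureCollection
    n : int
        Number of shards sampled in each polygon
    N : int/float
        Total sample size in each polygon
    arrays: ee.image.Image
    setting: tools.config.configuration
    foldername : string
    Test_base : string
        Base name used for the exported test files

    Returns
    ----------
    None. One export task per polygon is started.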

    Notes
    -----
    The code is obtained/modified from:

    https://github.com/google/earthengine-api/blob/master/python/examples/ipynb/UNET_regression_demo.ipynb
    """
    # Export all the test data.
    testPolysList = testPolys.toList(testPolys.size())
    for g in range(testPolys.size().getInfo()):
        geomSample = ee.FeatureCollection([])
        for i in range(n):
            sample = arrays.sample(
                region=ee.Feature(testPolysList.get(g)).geometry(),
                scale=30,
                numPixels=N / n,
                seed=i,
                tileScale=8
            )
            geomSample = geomSample.merge(sample)

        desc = Test_base + '_g' + str(g)
        task = ee.batch.Export.table.toCloudStorage(
            collection=geomSample,
            description=desc,
            bucket=setting.BUCKET,
            fileNamePrefix=foldername + '/' + desc,
            fileFormat='TFRecord',
            selectors=setting.BANDS + [setting.RESPONSE]
        )
        task.start()


Writing tools/sampling.py
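
To see what the three export functions request without contacting Earth Engine, the loop structure can be replayed in plain Python. This is only a sketch: `plan_exports` is a hypothetical helper, not part of `tools/sampling.py`, and it simply mirrors how the description, the file prefix and the per-shard `seed` and `numPixels` arguments are derived above.

```python
def plan_exports(num_polys, n, N, base, foldername):
    """List the export jobs the sampling loops would start.

    One job per polygon; each job records the task description, the
    Cloud Storage file prefix and a (seed, numPixels) pair per shard.
    """
    jobs = []
    for g in range(num_polys):
        desc = base + '_g' + str(g)
        shards = [(seed, N / n) for seed in range(n)]
        jobs.append({
            'description': desc,
            'fileNamePrefix': foldername + '/' + desc,
            'shards': shards,
        })
    return jobs


# Example values are made up for illustration.
for job in plan_exports(2, 4, 200, 'training_patches', 'my_folder'):
    print(job['fileNamePrefix'], len(job['shards']))
```

Each polygon is sampled `n` times with a different seed, and every shard asks for `N / n` pixels, so a polygon contributes roughly `N` samples in total.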


## Authentication

Authentication with Google Colab, the Earth Engine API and Google Cloud Storage is required before proceeding.

In [6]:
# Cloud authentication.
from google.colab import auth
auth.authenticate_user()

# Import, authenticate and initialize the Earth Engine library.
import ee
ee.Authenticate()
ee.Initialize()

# Google cloud bucket configuration
project_id = 'coastal-cell-299117'
!gcloud config set project {project_id}

Successfully saved authorization token.
Updated property [core/project].


Import the other required libraries.

In [7]:
import tensorflow as tf
import folium
from pprint import pprint
from tools import preprocessing, config, metrics_, sampling
# from importlib import reload
# reload(preprocessing) # Uncomment this line to rerun the modified packages

## Experimental Setup

### Exporting data to Google Cloud bucket

This section is dedicated entirely to exporting training, evaluation and testing data to a Google Cloud Storage bucket.

First, the required data is imported from the Earth Engine data catalog. The data used includes:

- Sentinel-1 data (10m) https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S1_GRD?hl=en
- USGS Landsat-8 Collection 1 Tier 1 surface reflectance (30m), with clouds masked https://developers.google.com/earth-engine/guides/landsat
- NASADEM: NASA NASADEM Digital Elevation (30m) https://developers.google.com/earth-engine/datasets/catalog/NASA_NASADEM_HGT_001
- JRC Monthly Water History (30m), with "no data" regions masked out https://developers.google.com/earth-engine/datasets/catalog/JRC_GSW1_3_MonthlyHistory

Date range used for training: 2018-01-01 to 2018-02-01 (the end date is excluded by `filterDate`)

In [8]:
# Sentinel-1 Data (10m)
S1 = ee.ImageCollection('COPERNICUS/S1_GRD'). \
    filter(ee.Filter.listContains('transmitterReceiverPolarisation', 'VV')).\
    filterDate('2018-01-01', '2018-02-01')

# S1A keeps the 'angle' band alongside VV and VH; S1 keeps VV and VH only.
S1A = S1.median()
S1 = S1.select('VV', 'VH').median()

# USGS Landsat-8 Collection 1 Tier 1 surface reflectance (30m)
l8sr = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR').\
    filterDate('2018-01-01', '2018-02-01')

# Mask the clouds in each image, then take the median composite.
L8SR = l8sr.map(preprocessing.maskL8sr).median()

# NASADEM: NASA NASADEM Digital Elevation (30m)
elevation = ee.Image('NASA/NASADEM_HGT/001').select('elevation')
slope = ee.Terrain.slope(elevation)
aspect = ee.Terrain.aspect(elevation)

# JRC-Monthly Water history (30m)
waterdata = ee.ImageCollection('JRC/GSW1_3/MonthlyHistory').\
    filterDate('2018-01-01', '2018-02-01').median()
watermask = waterdata.select("water")
# masking out "no data" region
mask = watermask.gt(0)
# Shifting the labels to make it binary
maskedComposite = waterdata.updateMask(mask).subtract(1)
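
The label shift in the last two lines can be traced on single pixel values. The sketch below assumes the value coding documented for the JRC monthly `water` band (0 = no data, 1 = not water, 2 = water); `to_binary_label` is a hypothetical helper, and `None` stands in for a masked pixel.

```python
def to_binary_label(water_value):
    """Map a JRC monthly 'water' value to a binary label.

    0 (no data)   -> None (pixel is masked out by watermask.gt(0))
    1 (not water) -> 0
    2 (water)     -> 1
    """
    if water_value <= 0:
        return None
    return water_value - 1


print([to_binary_label(v) for v in (0, 1, 2)])
```

Masking first and subtracting afterwards means that "no data" pixels never reach the model as a third class.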

### Setting up Config - to train in Thailand

In total there are:
- 32 feature stack deep learning experiments
- 37 multiview experiments with 2 inputs
- 8 multiview experiments with 3 inputs

Each experiment has its own configuration, for example a different experiment name and different bands for each input layer, so a configuration object is necessary. The configuration for training globally is found in the Preprocessing_and_export_global.ipynb notebook.
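
The experiment names follow a regular pattern: each underscore-separated token stands for a group of bands, in order. As a rough sketch of that convention (not the code used in this notebook), the band lists could be derived from the name. `TOKEN_BANDS`, `bands_from_name` and `split_two_inputs` are hypothetical names, and the trailing `3` used in the 3-input experiment names is not handled.

```python
# Hypothetical token table, read off the experiment definitions.
TOKEN_BANDS = {
    'L8SR': ['B2', 'B3', 'B4', 'B5', 'B6', 'B7'],
    'S1': ['VV', 'VH'],
    'S1A': ['VV', 'VH', 'angle'],
    'el': ['elevation'],
    'sl': ['slope'],
    'as': ['aspect'],
}


def bands_from_name(name):
    """Return the stacked band list for a feature stack experiment name."""
    bands = []
    for token in name.split('_'):
        bands.extend(TOKEN_BANDS[token])
    return bands


def split_two_inputs(name):
    """Return (first input bands, second input bands) for a 2-input name.

    The first token feeds the first input; all remaining tokens feed
    the second input.
    """
    first, *rest = name.split('_')
    second = [band for token in rest for band in TOKEN_BANDS[token]]
    return list(TOKEN_BANDS[first]), second


print(bands_from_name('S1A_el_sl_as'))
print(split_two_inputs('L8SR_S1A'))
```

Deriving the lists from the name keeps the experiment name and its bands from drifting apart when experiments are added or renamed.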

In [9]:
configs_fs = {}
train_size = 240 * 3
eval_size = 240 * 2

# Feature stack's 32 experiments

configs_fs["S1A_el_sl_as"] = \
    config.configuration("S1A_el_sl_as",
                         ["VV", "VH", "angle", "elevation",
                          "slope", "aspect"],
                         train_size,
                         eval_size)
configs_fs["S1A_el"] = \
    config.configuration("S1A_el",
                         ["VV", "VH", "angle", "elevation"],
                         train_size,
                         eval_size)
configs_fs["S1A_sl"] = \
    config.configuration("S1A_sl",
                         ["VV", "VH", "angle", "slope"],
                         train_size,
                         eval_size)
configs_fs["S1A_as"] = \
    config.configuration("S1A_as",
                         ["VV", "VH", "angle", "aspect"],
                         train_size,
                         eval_size)
configs_fs["S1A_sl_as"] = \
    config.configuration("S1A_sl_as",
                         ["VV", "VH", "angle", "slope", "aspect"],
                         train_size,
                         eval_size)
configs_fs["S1A_el_sl"] = \
    config.configuration("S1A_el_sl",
                         ["VV", "VH", "angle", "elevation", "slope"],
                         train_size,
                         eval_size)
configs_fs["S1A_el_as"] = \
    config.configuration("S1A_el_as",
                         ["VV", "VH", "angle", "elevation", "aspect"],
                         train_size,
                         eval_size)

configs_fs["S1_el_sl_as"] = \
    config.configuration("S1_el_sl_as",
                         ["VV", "VH", "elevation", "slope", "aspect"],
                         train_size,
                         eval_size)
configs_fs["S1_el"] = \
    config.configuration("S1_el",
                         ["VV", "VH", "elevation"],
                         train_size,
                         eval_size)
configs_fs["S1_sl"] = \
    config.configuration("S1_sl",
                         ["VV", "VH", "slope"],
                         train_size,
                         eval_size)
configs_fs["S1_as"] = \
    config.configuration("S1_as",
                         ["VV", "VH", "aspect"],
                         train_size,
                         eval_size)
configs_fs["S1_sl_as"] = \
    config.configuration("S1_sl_as",
                         ["VV", "VH", "slope", "aspect"],
                         train_size,
                         eval_size)
configs_fs["S1_el_sl"] = \
    config.configuration("S1_el_sl",
                         ["VV", "VH", "elevation", "slope"],
                         train_size,
                         eval_size)
configs_fs["S1_el_as"] = \
    config.configuration("S1_el_as",
                         ["VV", "VH", "elevation", "aspect"],
                         train_size,
                         eval_size)

configs_fs["L8SR_el_sl_as"] = \
    config.configuration("L8SR_el_sl_as",
                         ["B2", "B3", "B4", "B5",
                          "B6", "B7", "elevation",
                          "slope", "aspect"],
                         train_size,
                         eval_size)
configs_fs["L8SR_el"] = \
    config.configuration("L8SR_el",
                         ["B2", "B3", "B4", "B5",
                          "B6", "B7", "elevation"],
                         train_size,
                         eval_size)
configs_fs["L8SR_sl"] = \
    config.configuration("L8SR_sl",
                         ["B2", "B3", "B4", "B5",
                          "B6", "B7", "slope"],
                         train_size,
                         eval_size)
configs_fs["L8SR_as"] = \
    config.configuration("L8SR_as",
                         ["B2", "B3", "B4", "B5",
                          "B6", "B7", "aspect"],
                         train_size,
                         eval_size)
configs_fs["L8SR_sl_as"] = \
    config.configuration("L8SR_sl_as",
                         ["B2", "B3", "B4", "B5", "B6",
                          "B7", "slope", "aspect"],
                         train_size,
                         eval_size)
configs_fs["L8SR_el_sl"] = \
    config.configuration("L8SR_el_sl",
                         ["B2", "B3", "B4", "B5", "B6",
                          "B7", "elevation", "slope"],
                         train_size,
                         eval_size)
configs_fs["L8SR_el_as"] = \
    config.configuration("L8SR_el_as",
                         ["B2", "B3", "B4", "B5", "B6",
                          "B7", "elevation", "aspect"],
                         train_size,
                         eval_size)

configs_fs["L8SR_S1_el"] = \
    config.configuration("L8SR_S1_el",
                         ["B2", "B3", "B4", "B5", "B6",
                          "B7", "VV", "VH", "elevation"],
                         train_size,
                         eval_size)
configs_fs["L8SR_S1_sl"] = \
    config.configuration("L8SR_S1_sl",
                         ["B2", "B3", "B4", "B5", "B6",
                          "B7", "VV", "VH", "slope"],
                         train_size,
                         eval_size)
configs_fs["L8SR_S1_sl_el_as"] = \
    config.configuration("L8SR_S1_sl_el_as",
                         ["B2", "B3", "B4", "B5", "B6",
                          "B7", "VV", "VH", "slope", "elevation", "aspect"],
                         train_size,
                         eval_size)

configs_fs["L8SR_S1A_el"] = \
    config.configuration("L8SR_S1A_el",
                         ["B2", "B3", "B4", "B5", "B6", "B7",
                          "VV", "VH", "angle", "elevation"],
                         train_size,
                         eval_size)
configs_fs["L8SR_S1A_sl"] = \
    config.configuration("L8SR_S1A_sl",
                         ["B2", "B3", "B4", "B5", "B6", "B7",
                          "VV", "VH", "angle", "slope"],
                         train_size,
                         eval_size)
configs_fs["L8SR_S1A_sl_el_as"] = \
    config.configuration("L8SR_S1A_sl_el_as",
                         ["B2", "B3", "B4", "B5", "B6", "B7",
                          "VV", "VH", "angle", "slope",
                          "elevation", "aspect"],
                         train_size,
                         eval_size)

configs_fs["L8SR"] = \
    config.configuration("L8SR",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size)
configs_fs["S1"] = \
    config.configuration("S1",
                         ["VV", "VH"],
                         train_size,
                         eval_size)
configs_fs["S1A"] = \
    config.configuration("S1A",
                         ["VV", "VH", "angle"],
                         train_size,
                         eval_size)
configs_fs["L8SR_S1"] = \
    config.configuration("L8SR_S1",
                         ["B2", "B3", "B4", "B5", "B6",
                          "B7", "VV", "VH"],
                         train_size,
                         eval_size)
configs_fs["L8SR_S1A"] = \
    config.configuration("L8SR_S1A",
                         ["B2", "B3", "B4", "B5", "B6",
                          "B7", "VV", "VH", "angle"],
                         train_size,
                         eval_size)

# Multi-view with 2 inputs learning's 37 experiments

configs_multi = {}

configs_multi["S1A_el_sl_as"] = \
    config.configuration("S1A_el_sl_as",
                         ["VV", "VH", "angle"],
                         train_size,
                         eval_size,
                         ["elevation", "slope", "aspect"], type_=2)
configs_multi["S1A_el"] = \
    config.configuration("S1A_el",
                         ["VV", "VH", "angle"],
                         train_size,
                         eval_size,
                         ["elevation"], type_=2)
configs_multi["S1A_sl"] = \
    config.configuration("S1A_sl",
                         ["VV", "VH", "angle"],
                         train_size,
                         eval_size,
                         ["slope"], type_=2)
configs_multi["S1A_as"] = \
    config.configuration("S1A_as",
                         ["VV", "VH", "angle"],
                         train_size,
                         eval_size,
                         ["aspect"], type_=2)
configs_multi["S1A_sl_as"] = \
    config.configuration("S1A_sl_as",
                         ["VV", "VH", "angle"],
                         train_size,
                         eval_size,
                         ["slope", "aspect"], type_=2)
configs_multi["S1A_el_sl"] = \
    config.configuration("S1A_el_sl",
                         ["VV", "VH", "angle"],
                         train_size,
                         eval_size,
                         ["elevation", "slope"], type_=2)
configs_multi["S1A_el_as"] = \
    config.configuration("S1A_el_as",
                         ["VV", "VH", "angle"],
                         train_size,
                         eval_size,
                         ["elevation", "aspect"], type_=2)

configs_multi["S1_el_sl_as"] = \
    config.configuration("S1_el_sl_as",
                         ["VV", "VH"],
                         train_size,
                         eval_size,
                         ["elevation", "slope", "aspect"], type_=2)
configs_multi["S1_el"] = \
    config.configuration("S1_el",
                         ["VV", "VH"],
                         train_size,
                         eval_size,
                         ["elevation"], type_=2)
configs_multi["S1_sl"] = \
    config.configuration("S1_sl",
                         ["VV", "VH"],
                         train_size,
                         eval_size,
                         ["slope"], type_=2)
configs_multi["S1_as"] = \
    config.configuration("S1_as",
                         ["VV", "VH"],
                         train_size,
                         eval_size,
                         ["aspect"], type_=2)
configs_multi["S1_sl_as"] = \
    config.configuration("S1_sl_as",
                         ["VV", "VH"],
                         train_size,
                         eval_size,
                         ["slope", "aspect"], type_=2)
configs_multi["S1_el_sl"] = \
    config.configuration("S1_el_sl",
                         ["VV", "VH"],
                         train_size,
                         eval_size,
                         ["elevation", "slope"], type_=2)
configs_multi["S1_el_as"] = \
    config.configuration("S1_el_as",
                         ["VV", "VH"],
                         train_size,
                         eval_size,
                         ["elevation", "aspect"], type_=2)

configs_multi["L8SR_el_sl_as"] = \
    config.configuration("L8SR_el_sl_as",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["elevation", "slope", "aspect"], type_=2)
configs_multi["L8SR_el"] = \
    config.configuration("L8SR_el",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["elevation"], type_=2)
configs_multi["L8SR_sl"] = \
    config.configuration("L8SR_sl",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["slope"], type_=2)
configs_multi["L8SR_as"] = \
    config.configuration("L8SR_as",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["aspect"], type_=2)
configs_multi["L8SR_sl_as"] = \
    config.configuration("L8SR_sl_as",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["slope", "aspect"], type_=2)
configs_multi["L8SR_el_sl"] = \
    config.configuration("L8SR_el_sl",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["elevation", "slope"], type_=2)
configs_multi["L8SR_el_as"] = \
    config.configuration("L8SR_el_as",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["elevation", "aspect"], type_=2)


configs_multi["L8SR_S1_as"] = \
    config.configuration("L8SR_S1_as",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["VV", "VH", "aspect"], type_=2)
configs_multi["L8SR_S1_el"] = \
    config.configuration("L8SR_S1_el",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["VV", "VH", "elevation"], type_=2)
configs_multi["L8SR_S1_sl"] = \
    config.configuration("L8SR_S1_sl",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["VV", "VH", "slope"], type_=2)

configs_multi["L8SR_S1_sl_as"] = \
    config.configuration("L8SR_S1_sl_as",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["VV", "VH", "slope", "aspect"], type_=2)
configs_multi["L8SR_S1_el_sl"] = \
    config.configuration("L8SR_S1_el_sl",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["VV", "VH", "elevation", "slope"], type_=2)
configs_multi["L8SR_S1_el_as"] = \
    config.configuration("L8SR_S1_el_as",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["VV", "VH", "elevation", "aspect"], type_=2)
configs_multi["L8SR_S1_sl_el_as"] = \
    config.configuration("L8SR_S1_sl_el_as",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["VV", "VH", "slope", "elevation", "aspect"], type_=2)

configs_multi["L8SR_S1A_as"] = \
    config.configuration("L8SR_S1A_as",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["VV", "VH", "angle", "aspect"], type_=2)
configs_multi["L8SR_S1A_el"] = \
    config.configuration("L8SR_S1A_el",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["VV", "VH", "angle", "elevation"], type_=2)
configs_multi["L8SR_S1A_sl"] = \
    config.configuration("L8SR_S1A_sl",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["VV", "VH", "angle", "slope"], type_=2)

configs_multi["L8SR_S1A_sl_as"] = \
    config.configuration("L8SR_S1A_sl_as",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["VV", "VH", "angle", "slope", "aspect"], type_=2)
configs_multi["L8SR_S1A_el_sl"] = \
    config.configuration("L8SR_S1A_el_sl",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["VV", "VH", "angle", "elevation", "slope"], type_=2)
configs_multi["L8SR_S1A_el_as"] = \
    config.configuration("L8SR_S1A_el_as",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["VV", "VH", "angle", "elevation", "aspect"], type_=2)
configs_multi["L8SR_S1A_sl_el_as"] = \
    config.configuration("L8SR_S1A_sl_el_as",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["VV", "VH", "angle", "slope",
                          "elevation", "aspect"],
                         type_=2)

configs_multi["L8SR_S1"] = \
    config.configuration("L8SR_S1",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["VV", "VH"], type_=2)
configs_multi["L8SR_S1A"] = \
    config.configuration("L8SR_S1A",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["VV", "VH", "angle"], type_=2)

# Multi-view with 3 inputs learning's 8 experiments

configs_multi_3 = {}

configs_multi_3["L8SR_S1_as3"] = \
    config.configuration("L8SR_S1_as3",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["VV", "VH"], ["aspect"], type_=3)
configs_multi_3["L8SR_S1_el3"] = \
    config.configuration("L8SR_S1_el3",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["VV", "VH"], ["elevation"], type_=3)
configs_multi_3["L8SR_S1_sl3"] = \
    config.configuration("L8SR_S1_sl3",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["VV", "VH"], ["slope"], type_=3)
configs_multi_3["L8SR_S1_sl_el_as3"] = \
    config.configuration("L8SR_S1_sl_el_as3",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["VV", "VH"],
                         ["slope", "elevation", "aspect"],
                         type_=3)

configs_multi_3["L8SR_S1A_as3"] = \
    config.configuration("L8SR_S1A_as3",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["VV", "VH", "angle"], ["aspect"], type_=3)
configs_multi_3["L8SR_S1A_el3"] = \
    config.configuration("L8SR_S1A_el3",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["VV", "VH", "angle"], ["elevation"], type_=3)
configs_multi_3["L8SR_S1A_sl3"] = \
    config.configuration("L8SR_S1A_sl3",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["VV", "VH", "angle"], ["slope"], type_=3)
configs_multi_3["L8SR_S1A_sl_el_as3"] = \
    config.configuration("L8SR_S1A_sl_el_as3",
                         ["B2", "B3", "B4", "B5", "B6", "B7"],
                         train_size,
                         eval_size,
                         ["VV", "VH", "angle"],
                         ["slope", "elevation", "aspect"],
                         type_=3)

In [10]:
print(configs_multi["L8SR_S1A"].FOLDER)
print(configs_multi["L8SR_S1A"].BANDS1)
print(configs_multi["L8SR_S1A"].BANDS2)
print(configs_multi["L8SR_S1A"].BANDS)
print(configs_multi["L8SR_S1A"].TRAINING_BASE)
print(configs_multi["L8SR_S1A"].EVAL_BASE)
print(configs_multi["L8SR_S1A"].TEST_BASE)
print(configs_multi["L8SR_S1A"].country)
print(configs_multi["L8SR_S1A"].type_)

m2_TH_Cnn_L8SR_S1A
['B2', 'B3', 'B4', 'B5', 'B6', 'B7']
['VV', 'VH', 'angle']
['B2', 'B3', 'B4', 'B5', 'B6', 'B7', 'VV', 'VH', 'angle']
training_patches
eval_patches
test_patches
TH
m2


### Saving images to each experiment

We also need to attach the images to each configuration. This will be necessary when we want to export images to Google Earth Engine assets later on.
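
Because every image is assembled by hand, it can help to confirm that each one contains all the bands its configuration expects. The sketch below is one possible check, not part of the notebook's tools: `missing_bands` is a hypothetical helper that works on plain lists of band names.

```python
def missing_bands(expected, available):
    """Return the expected band names that are absent from available.

    The order of the expected list is preserved in the result.
    """
    available = set(available)
    return [band for band in expected if band not in available]


# Made-up example: the image lacks the two radar bands.
expected = ['B2', 'B3', 'VV', 'VH', 'elevation', 'aspect']
available = ['B2', 'B3', 'slope', 'elevation', 'aspect']
print(missing_bands(expected, available))
```

With Earth Engine, the `available` list could be fetched with `image.bandNames().getInfo()` and compared against the `BANDS` attribute of the matching configuration; an empty result means nothing is missing.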

In [11]:
# Feature Stack

configs_fs["S1A_el_sl_as"].image = \
    ee.Image.cat([S1A, elevation, slope, aspect]).float()
configs_fs["S1A_el"].image = \
    ee.Image.cat([S1A, elevation]).float()
configs_fs["S1A_sl"].image = \
    ee.Image.cat([S1A, slope]).float()
configs_fs["S1A_as"].image = \
    ee.Image.cat([S1A, aspect]).float()
configs_fs["S1A_sl_as"].image = \
    ee.Image.cat([S1A, slope, aspect]).float()
configs_fs["S1A_el_sl"].image = \
    ee.Image.cat([S1A, elevation, slope]).float()
configs_fs["S1A_el_as"].image = \
    ee.Image.cat([S1A, elevation, aspect]).float()

configs_fs["S1_el_sl_as"].image = \
    ee.Image.cat([S1, elevation, slope, aspect]).float()
configs_fs["S1_el"].image = \
    ee.Image.cat([S1, elevation]).float()
configs_fs["S1_sl"].image = \
    ee.Image.cat([S1, slope]).float()
configs_fs["S1_as"].image = \
    ee.Image.cat([S1, aspect]).float()
configs_fs["S1_sl_as"].image = \
    ee.Image.cat([S1, slope, aspect]).float()
configs_fs["S1_el_sl"].image = \
    ee.Image.cat([S1, elevation, slope]).float()
configs_fs["S1_el_as"].image = \
    ee.Image.cat([S1, elevation, aspect]).float()

configs_fs["L8SR_el_sl_as"].image = \
    ee.Image.cat([L8SR, elevation, slope, aspect]).float()
configs_fs["L8SR_el"].image = \
    ee.Image.cat([L8SR, elevation]).float()
configs_fs["L8SR_sl"].image = \
    ee.Image.cat([L8SR, slope]).float()
configs_fs["L8SR_as"].image = \
    ee.Image.cat([L8SR, aspect]).float()
configs_fs["L8SR_sl_as"].image = \
    ee.Image.cat([L8SR, slope, aspect]).float()
configs_fs["L8SR_el_sl"].image = \
    ee.Image.cat([L8SR, elevation, slope]).float()
configs_fs["L8SR_el_as"].image = \
    ee.Image.cat([L8SR, elevation, aspect]).float()

configs_fs["L8SR_S1_el"].image = \
    ee.Image.cat([L8SR, S1, elevation]).float()
configs_fs["L8SR_S1_sl"].image = \
    ee.Image.cat([L8SR, S1, slope]).float()
configs_fs["L8SR_S1_sl_el_as"].image = \
    ee.Image.cat([L8SR, S1, slope, elevation, aspect]).float()

configs_fs["L8SR_S1A_el"].image = \
    ee.Image.cat([L8SR, S1A, elevation]).float()
configs_fs["L8SR_S1A_sl"].image = \
    ee.Image.cat([L8SR, S1A, slope]).float()
configs_fs["L8SR_S1A_sl_el_as"].image = \
    ee.Image.cat([L8SR, S1A, slope, elevation, aspect]).float()

configs_fs["L8SR"].image = \
    L8SR.float()
configs_fs["S1"].image = \
    S1.float()
configs_fs["S1A"].image = \
    S1A.float()
configs_fs["L8SR_S1"].image = \
    ee.Image.cat([L8SR, S1]).float()
configs_fs["L8SR_S1A"].image = \
    ee.Image.cat([L8SR, S1A]).float()

# Multiview

configs_multi["S1A_el_sl_as"].image = \
    ee.Image.cat([S1A, elevation, slope, aspect]).float()
configs_multi["S1A_el"].image = \
    ee.Image.cat([S1A, elevation]).float()
configs_multi["S1A_sl"].image = \
    ee.Image.cat([S1A, slope]).float()
configs_multi["S1A_as"].image = \
    ee.Image.cat([S1A, aspect]).float()
configs_multi["S1A_sl_as"].image = \
    ee.Image.cat([S1A, slope, aspect]).float()
configs_multi["S1A_el_sl"].image = \
    ee.Image.cat([S1A, elevation, slope]).float()
configs_multi["S1A_el_as"].image = \
    ee.Image.cat([S1A, elevation, aspect]).float()

configs_multi["S1_el_sl_as"].image = \
    ee.Image.cat([S1, elevation, slope, aspect]).float()
configs_multi["S1_el"].image = \
    ee.Image.cat([S1, elevation]).float()
configs_multi["S1_sl"].image = \
    ee.Image.cat([S1, slope]).float()
configs_multi["S1_as"].image = \
    ee.Image.cat([S1, aspect]).float()
configs_multi["S1_sl_as"].image = \
    ee.Image.cat([S1, slope, aspect]).float()
configs_multi["S1_el_sl"].image = \
    ee.Image.cat([S1, elevation, slope]).float()
configs_multi["S1_el_as"].image = \
    ee.Image.cat([S1, elevation, aspect]).float()

configs_multi["L8SR_el_sl_as"].image = \
    ee.Image.cat([L8SR, elevation, slope, aspect]).float()
configs_multi["L8SR_el"].image = \
    ee.Image.cat([L8SR, elevation]).float()
configs_multi["L8SR_sl"].image = \
    ee.Image.cat([L8SR, slope]).float()
configs_multi["L8SR_as"].image = \
    ee.Image.cat([L8SR, aspect]).float()
configs_multi["L8SR_sl_as"].image = \
    ee.Image.cat([L8SR, slope, aspect]).float()
configs_multi["L8SR_el_sl"].image = \
    ee.Image.cat([L8SR, elevation, slope]).float()
configs_multi["L8SR_el_as"].image = \
    ee.Image.cat([L8SR, elevation, aspect]).float()


configs_multi["L8SR_S1_as"].image = \
    ee.Image.cat([L8SR, S1, aspect]).float()
configs_multi["L8SR_S1_el"].image = \
    ee.Image.cat([L8SR, S1, elevation]).float()
configs_multi["L8SR_S1_sl"].image = \
    ee.Image.cat([L8SR, S1, slope]).float()

configs_multi["L8SR_S1_sl_as"].image = \
    ee.Image.cat([L8SR, S1, slope, aspect]).float()
configs_multi["L8SR_S1_el_sl"].image = \
    ee.Image.cat([L8SR, S1, elevation, slope]).float()
configs_multi["L8SR_S1_el_as"].image = \
    ee.Image.cat([L8SR, S1, elevation, aspect]).float()
configs_multi["L8SR_S1_sl_el_as"].image = \
    ee.Image.cat([L8SR, S1, slope, elevation, aspect]).float()

configs_multi["L8SR_S1A_as"].image = \
    ee.Image.cat([L8SR, S1A, aspect]).float()
configs_multi["L8SR_S1A_el"].image = \
    ee.Image.cat([L8SR, S1A, elevation]).float()
configs_multi["L8SR_S1A_sl"].image = \
    ee.Image.cat([L8SR, S1A, slope]).float()

configs_multi["L8SR_S1A_sl_as"].image = \
    ee.Image.cat([L8SR, S1A, slope, aspect]).float()
configs_multi["L8SR_S1A_el_sl"].image = \
    ee.Image.cat([L8SR, S1A, elevation, slope]).float()
configs_multi["L8SR_S1A_el_as"].image = \
    ee.Image.cat([L8SR, S1A, elevation, aspect]).float()
configs_multi["L8SR_S1A_sl_el_as"].image = \
    ee.Image.cat([L8SR, S1A, slope, elevation, aspect]).float()

configs_multi["L8SR_S1"].image = \
    ee.Image.cat([L8SR, S1]).float()
configs_multi["L8SR_S1A"].image = \
    ee.Image.cat([L8SR, S1A]).float()

# Multiview-3


configs_multi_3["L8SR_S1_as3"].image = \
    ee.Image.cat([L8SR, S1, aspect]).float()
configs_multi_3["L8SR_S1_el3"].image = \
    ee.Image.cat([L8SR, S1, elevation]).float()
configs_multi_3["L8SR_S1_sl3"].image = \
    ee.Image.cat([L8SR, S1, slope]).float()
configs_multi_3["L8SR_S1_sl_el_as3"].image = \
    ee.Image.cat([L8SR, S1, slope, elevation, aspect]).float()

configs_multi_3["L8SR_S1A_as3"].image = \
    ee.Image.cat([L8SR, S1A, aspect]).float()
configs_multi_3["L8SR_S1A_el3"].image = \
    ee.Image.cat([L8SR, S1A, elevation]).float()
configs_multi_3["L8SR_S1A_sl3"].image = \
    ee.Image.cat([L8SR, S1A, slope]).float()
configs_multi_3["L8SR_S1A_sl_el_as3"].image = \
    ee.Image.cat([L8SR, S1A, slope, elevation, aspect]).float()
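The config keys above follow a naming convention (`el` for elevation, `sl` for slope, `as` for aspect, and a trailing `3` for the Multiview-3 variants), so the list of layers could in principle be derived from the key instead of being typed by hand. The following is a minimal sketch of that idea; `TOKEN_TO_LAYER` and `layers_from_key` are hypothetical names that are not part of the `tools` package.

```python
# Hypothetical mapping from the short tokens used in the config keys
# to the names of the Earth Engine images defined earlier in the notebook.
TOKEN_TO_LAYER = {
    "L8SR": "L8SR",
    "S1": "S1",
    "S1A": "S1A",
    "el": "elevation",
    "sl": "slope",
    "as": "aspect",
}


def layers_from_key(key):
    """Return the layer names implied by a key such as 'L8SR_S1_el_as'."""
    layers = []
    for token in key.split("_"):
        # Multiview-3 keys end in '3' (e.g. 'as3'); strip that suffix.
        if token not in TOKEN_TO_LAYER and token.endswith("3"):
            token = token[:-1]
        layers.append(TOKEN_TO_LAYER[token])
    return layers


print(layers_from_key("L8SR_S1_el_as"))
print(layers_from_key("L8SR_S1A_sl_el_as3"))
```

With the images collected in a dictionary keyed by these layer names, each assignment could then be written as `ee.Image.cat([...]).float()` over the looked-up images, which makes a mismatch between a key and its band list harder to introduce.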

We stack the 2D predictor bands of each configuration with the JRC water label to create a single image from which samples can be taken. `neighborhoodToArray()` then converts this multi-band image into an array image in which each pixel stores a 256x256 patch of neighboring pixels for each band. Sampling the array image at points later yields the training patches.

In [12]:
for key in list(configs_fs):
    settings = configs_fs[key]
    featureStack = ee.Image.cat([
        settings.image.select(settings.BANDS),
        maskedComposite.select(settings.RESPONSE)
    ]).float()
    list_ = ee.List.repeat(1, settings.KERNEL_SIZE)
    lists = ee.List.repeat(list_, settings.KERNEL_SIZE)
    kernel = ee.Kernel.fixed(settings.KERNEL_SIZE, settings.KERNEL_SIZE, lists)
    arrays = featureStack.neighborhoodToArray(kernel)
    configs_fs[key].sam_arr = arrays
    print(key, settings.sam_arr.getInfo())

S1A_el_sl_as {'type': 'Image', 'bands': [{'id': 'VV', 'data_type': {'type': 'PixelType', 'precision': 'float', 'dimensions': 2}, 'crs': 'EPSG:4326', 'crs_transform': [1, 0, 0, 0, 1, 0]}, {'id': 'VH', 'data_type': {'type': 'PixelType', 'precision': 'float', 'dimensions': 2}, 'crs': 'EPSG:4326', 'crs_transform': [1, 0, 0, 0, 1, 0]}, {'id': 'angle', 'data_type': {'type': 'PixelType', 'precision': 'float', 'dimensions': 2}, 'crs': 'EPSG:4326', 'crs_transform': [1, 0, 0, 0, 1, 0]}, {'id': 'elevation', 'data_type': {'type': 'PixelType', 'precision': 'float', 'dimensions': 2}, 'dimensions': [1288801, 421201], 'crs': 'EPSG:4326', 'crs_transform': [0.0002777777777777778, 0, -179.0001388888889, 0, -0.0002777777777777778, 61.00013888888889]}, {'id': 'slope', 'data_type': {'type': 'PixelType', 'precision': 'float', 'dimensions': 2}, 'crs': 'EPSG:4326', 'crs_transform': [0.0002777777777777778, 0, -179.0001388888889, 0, -0.0002777777777777778, 61.00013888888889]}, {'id': 'aspect', 'data_type': {'typ
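To make the array-image idea concrete, here is a toy sketch in plain Python (no Earth Engine calls) of what "each pixel stores a patch of its neighbors" means for a single band. It uses a 3x3 kernel on a 5x5 grid instead of 256x256, ignores edge handling, and `patch_at` is a hypothetical helper written only for illustration.

```python
def patch_at(band, row, col, kernel_size):
    """Return the kernel_size x kernel_size block of `band` centered on (row, col).

    Assumes the center is at least kernel_size // 2 pixels away from every edge.
    """
    half = kernel_size // 2
    return [
        band_row[col - half: col - half + kernel_size]
        for band_row in band[row - half: row - half + kernel_size]
    ]


# A 5x5 band whose pixel values are simply 0..24 in row-major order.
band = [[r * 5 + c for c in range(5)] for r in range(5)]

# The "array pixel" at the center of the band holds its 3x3 neighborhood.
print(patch_at(band, 2, 2, 3))
```

In the notebook the same thing happens per band and per pixel on the server side, with the all-ones kernel only defining the shape of the neighborhood.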

## Sampling Data for training and testing

We create geometries to sample the stack at strategic locations, taking 256x256 samples at each point. The sampling polygons are displayed on a map: red for training, blue for evaluation, green for global testing and purple for Thailand testing.

In [13]:
# Training data
trainingPolys = \
    ee.FeatureCollection('users/mewchayutaphong/thailandTraining')
first = \
    ee.Geometry.BBox(101.78381817548382, 14.052178100305664,
                     102.27820294110882, 14.361037359593043)
second = \
    ee.Geometry.BBox(102.16833965985882, 16.426385350573945,
                     102.83850567548382, 16.921030330473783)
evalPolys = \
    ee.FeatureCollection(first).merge(second)

# Global Test
thai_test = \
    ee.Geometry.BBox(100.30632852425321, 17.709225431372587,
                     100.74128946175321, 18.20417872756825)
tibet_test = \
    ee.Geometry.BBox(83.7866460908476, 31.02991423438545,
                     84.4782964814726, 31.623526673040716)
ghana_test = \
    ee.Geometry.BBox(-1.9983132238272127, 5.925449378444892,
                     -1.8517086095694002, 6.081004042776062)
brazil_test = \
    ee.Geometry.BBox(-63.02141262744682, -28.962951804200575,
                     -55.59465481494682, -21.415208673846603)
mexico_test = \
    ee.Geometry.BBox(-93.47672355917602, 15.8775670491606535,
                     -90.70816887167602, 18.27183442641013)
pakistan_test = \
    ee.Geometry.BBox(68.55305064706133, 27.98976293885938,
                     70.93708385018633, 29.42545415991563)
egypt_test = \
    ee.Geometry.BBox(24.8505863665, 20.337905952546933,
                     33.1123051165, 26.856885967831754)
cambodia_test = \
    ee.Geometry.BBox(102.74910463260488, 11.950119301574833,
                     104.88045228885488, 13.600069066335008)
India_test = \
    ee.Geometry.BBox(80.42186850663656, 22.601971814608234,
                     83.05858725663656, 24.635163539106795)
Bangladesh_test = \
    ee.Geometry.BBox(89.25478625914253, 24.205326578074743,
                     90.15566516539253, 24.765228419516816)

# Thailand Test
ChiangMai = \
    ee.Geometry.BBox(98.51510160404028, 17.512566492221968,
                     100.53658597904028, 19.30542553443047)
"""
SinakarinLake = \
  ee.Geometry.BBox(98.54675583495609, 14.450091372437415,
                   99.53552536620609, 15.235944288209435)
LopBuri = \
  ee.Geometry.BBox(100.45837692870609, 14.471367913324434,
                   101.66687302245609, 15.532538680175776)
"""
SinakarinLake = \
    ee.Geometry.BBox(98.54675583495609, 13.450091372437415,
                     99.53552536620609, 14.235944288209435)
LopBuri = \
    ee.Geometry.BBox(100.45837692870609, 13.8211367913324434,
                     101.66687302245609, 14.532538680175776)
KhonKaen = \
    ee.Geometry.BBox(102.21618942870609, 15.997751855373792,
                     104.08386520995609, 17.135000807537583)
Phichit = \
    ee.Geometry.BBox(99.68933395995609, 15.680676541367129,
                     101.20544724120609, 16.630386794127517)
NearPattaya = \
    ee.Geometry.BBox(101.22741989745609, 12.870163640080621,
                     102.21618942870609, 13.810860683288954)
BuriRam = \
    ee.Geometry.BBox(102.38140047088262, 14.509321717259951,
                     103.89751375213262, 15.358532843588616)
Ratchaprapha = \
    ee.Geometry.BBox(98.44829500213262, 8.678959801438905,
                     98.95366609588262, 9.33001213894668)
Phatthalung = \
    ee.Geometry.BBox(99.86915624064652, 7.228284369969466,
                     100.39649999064652, 7.925265984506029)
Tanintharyi = \
    ee.Geometry.BBox(98.9144774757664, 11.597108186209079,
                     99.4857665382664, 12.499620763004494)


testPolys = ee.FeatureCollection(thai_test).\
    merge(tibet_test).\
    merge(ghana_test).\
    merge(brazil_test).\
    merge(mexico_test).\
    merge(pakistan_test).\
    merge(egypt_test).\
    merge(cambodia_test).\
    merge(India_test).\
    merge(Bangladesh_test)

testPolys_TH = ee.FeatureCollection(ChiangMai).\
    merge(SinakarinLake).\
    merge(LopBuri).merge(KhonKaen).\
    merge(Phichit).\
    merge(NearPattaya).\
    merge(BuriRam).\
    merge(Ratchaprapha).\
    merge(Phatthalung).\
    merge(Tanintharyi)

polyImage = ee.Image(0).\
    byte().\
    paint(trainingPolys, 1).\
    paint(evalPolys, 2).\
    paint(testPolys, 3).\
    paint(testPolys_TH, 4)
polyImage = polyImage.updateMask(polyImage)

mapid = polyImage.getMapId(
    {
        'min': 1,
        'max': 4,
        'palette': ['red', 'blue', "green", "purple"]
    }
)
map = \
    folium.Map(location=[16.426385350573945, 102.16833965985882],
               zoom_start=5)
folium.TileLayer(
    tiles=mapid['tile_fetcher'].url_format,
    attr='Map Data &copy; <a href="https://earthengine.google.com/">Google Earth Engine</a>',  # noqa
    overlay=True,
    name='training polygons',
).add_to(map)
map.add_child(folium.LayerControl())
map
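`ee.Geometry.BBox` takes its corners in the order west, south, east, north, and with this many hand-typed rectangles a swapped or mistyped coordinate is easy to miss. The sketch below is a plain-Python sanity check that could be run on the numbers before building the geometries; `check_bbox` is a hypothetical helper and not part of Earth Engine or the `tools` package.

```python
def check_bbox(west, south, east, north):
    """Validate corner coordinates and return (width, height) in degrees."""
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitude out of range")
    if not (-90 <= south <= 90 and -90 <= north <= 90):
        raise ValueError("latitude out of range")
    if south >= north:
        raise ValueError("south must be smaller than north")
    # A box crossing the antimeridian would legitimately have west > east;
    # none of the regions in this notebook do, so it is treated as an error.
    if west >= east:
        raise ValueError("west must be smaller than east")
    return east - west, north - south


# Corners of thai_test as defined above.
width, height = check_bbox(100.30632852425321, 17.709225431372587,
                           100.74128946175321, 18.20417872756825)
print(width, height)
```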



`sampling.py` helps with sampling by merging the smaller samples taken within each geometry into a single export, which avoids the 'computed value too large' error, as illustrated by Google. The functions sample the array image at points to get all the pixels in a 256x256 neighborhood around each point. Each rectangle is exported as a *.tfrecord.gz file.
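The contents of `sampling.py` are not shown in this section, so the following is only a sketch of the sharding idea, modelled on the Google example workflow this notebook is adapted from: each polygon's `N` points are drawn in `n` smaller shards with different seeds, and the shards are merged before export. `shard_plan` is a hypothetical helper, and the count of 3 training polygons is an assumption inferred from the 720-point total.

```python
def shard_plan(num_polygons, n_shards, total_per_polygon):
    """List the (polygon, seed, num_pixels) requests needed to sample every polygon."""
    per_shard = total_per_polygon // n_shards
    return [
        {"polygon": g, "seed": i, "num_pixels": per_shard}
        for g in range(num_polygons)
        for i in range(n_shards)
    ]


# Training setup: n = 24 shards and N = 240 points per polygon.
plan = shard_plan(num_polygons=3, n_shards=24, total_per_polygon=240)
print(len(plan), "sampling calls,", plan[0]["num_pixels"], "points each")
print(sum(shard["num_pixels"] for shard in plan), "points in total")
```

Keeping each sampling call small is what avoids the memory error; the number of points per polygon is unchanged.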

Below we export the data for:

- training - 3 polygons in Thailand with 240 sample points each
- testing_global - 10 global test polygons with 72 sample points each
- testing_local - 10 Thailand test polygons with 72 sample points each


Training is done in Thailand only, so fewer rectangles are used than in `Preprocessing_and_export_global.ipynb`, but more points are sampled in each rectangle to keep the experiment fair (720 training points in total).


In [14]:
n = 24  # Number of shards in each polygon.
N = 240  # Total sample size in each polygon.
settings = configs_fs["L8SR_S1A_sl_el_as"]
sampling.Training_task(trainingPolys,
                       n,
                       N,
                       settings.sam_arr,
                       settings,
                       "Train_in_Thailand")

n = 6  # Number of shards in each polygon.
N = 72  # Total sample size in each polygon.
sampling.Testing_task(testPolys,
                      n,
                      N,
                      settings.sam_arr,
                      settings,
                      "Train_in_Thailand",
                      settings.TEST_BASE)
sampling.Testing_task(testPolys_TH,
                      n,
                      N,
                      settings.sam_arr,
                      settings,
                      "Train_in_Thailand_final",
                      settings.TEST_BASE)

## Sampling data for Model Application section: Southern Thailand Flood event accuracy assessment

This section exports the data for the Thailand flooding event from Dec 2016 to April 2017, which is used in the model application section of the report. Only test data is exported here; the saved model is evaluated on it in the `metrics_assessment.ipynb` notebook.

In [None]:
n = 6  # Number of shards in each polygon.
N = 72  # Total sample size in each polygon.
flood_region = ee.Geometry.BBox(99.89924501744167,
                                7.634771858293311,
                                100.22265505162136,
                                7.876981296784481)

start_dates = ["2016-11-01", "2016-12-01",
               "2017-01-01", "2017-02-01", "2017-03-01"]
end_dates = ["2016-12-01", "2017-01-01",
             "2017-02-01", "2017-03-01", "2017-04-01"]

# NASADEM: NASA NASADEM Digital Elevation (30m)
elevation = ee.Image('NASA/NASADEM_HGT/001').select('elevation')
slope = ee.Terrain.slope(elevation)
aspect = ee.Terrain.aspect(elevation)

for i in range(0, len(start_dates)):
    print("startdates: ", start_dates[i], "\n enddates: ", end_dates[i])
    S1 = ee.ImageCollection('COPERNICUS/S1_GRD'). \
        filter(ee.Filter.listContains('transmitterReceiverPolarisation',
                                      'VV')). \
        filterDate(start_dates[i], end_dates[i])

    S1A = S1.median()
    l8sr = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR').\
        filterDate(start_dates[i], end_dates[i])
    L8SR = l8sr.map(preprocessing.maskL8sr).median()
    # JRC-Monthly Water history (30m)
    waterdata = ee.ImageCollection('JRC/GSW1_3/MonthlyHistory').\
        filterDate(start_dates[i], end_dates[i]).median()
    watermask = waterdata.select("water")
    # masking out "no data" region
    mask = watermask.gt(0)
    # Shifting the labels to make it binary
    maskedComposite = waterdata.updateMask(mask).subtract(1)
    configs_fs["L8SR_S1A_sl_el_as"].image = \
        ee.Image.cat([L8SR, S1A, slope, elevation, aspect]).float()
    settings = configs_fs["L8SR_S1A_sl_el_as"]
    featureStack = ee.Image.cat([
        settings.image.select(settings.BANDS),
        maskedComposite.select(settings.RESPONSE)
    ]).float()
    list_ = ee.List.repeat(1, settings.KERNEL_SIZE)
    lists = ee.List.repeat(list_, settings.KERNEL_SIZE)
    kernel = ee.Kernel.fixed(settings.KERNEL_SIZE, settings.KERNEL_SIZE, lists)
    arrays = featureStack.neighborhoodToArray(kernel)
    configs_fs["L8SR_S1A_sl_el_as"].sam_arr = arrays
    print(settings.TEST_BASE + "_" + str(start_dates[i]))
    sampling.Testing_task(ee.FeatureCollection(flood_region),
                          n,
                          N,
                          settings.sam_arr,
                          settings,
                          "Flood_data",
                          settings.TEST_BASE + "_" + str(start_dates[i]))

startdates:  2016-11-01 
 enddates:  2016-12-01
test_patches_2016-11-01
startdates:  2016-12-01 
 enddates:  2017-01-01
test_patches_2016-12-01
startdates:  2017-01-01 
 enddates:  2017-02-01
test_patches_2017-01-01
startdates:  2017-02-01 
 enddates:  2017-03-01
test_patches_2017-02-01
startdates:  2017-03-01 
 enddates:  2017-04-01
test_patches_2017-03-01
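The `start_dates` and `end_dates` lists in this notebook are typed by hand, one entry per month. As a sketch of an alternative, the monthly windows could be generated with the standard library; `month_windows` is a hypothetical helper that is not part of the `tools` package.

```python
from datetime import date


def month_windows(first_month, count):
    """Return (start_dates, end_dates) as ISO strings for consecutive months.

    first_month is a (year, month) tuple; each window runs from the first
    day of a month to the first day of the following month.
    """
    year, month = first_month
    starts, ends = [], []
    for _ in range(count):
        start = date(year, month, 1)
        # Roll over to January of the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
        end = date(year, month, 1)
        starts.append(start.isoformat())
        ends.append(end.isoformat())
    return starts, ends


# Reproduces the five windows used for the flood event above.
start_dates, end_dates = month_windows((2016, 11), 5)
print(start_dates)
print(end_dates)
```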


## Cloud cover Experiment

In [20]:
start_dates = ['2018-03-01', '2018-04-01',
               '2018-05-01', '2018-06-01',
               '2018-07-01', '2018-08-01',
               '2018-09-01', '2018-10-01',
               '2018-11-01', '2018-12-01',
               '2019-01-01', '2019-02-01']
end_dates = ['2018-04-01', '2018-05-01',
             '2018-06-01', '2018-07-01',
             '2018-08-01', '2018-09-01',
             '2018-10-01', '2018-11-01',
             '2018-12-01', '2019-01-01',
             '2019-02-01', '2019-03-01']

n = 6  # Number of shards in each polygon.
N = 72  # Total sample size in each polygon.
flood_region = ee.Geometry.BBox(99.89924501744167,
                                7.634771858293311,
                                100.22265505162136,
                                7.876981296784481)

# NASADEM: NASA NASADEM Digital Elevation (30m)
elevation = ee.Image('NASA/NASADEM_HGT/001').select('elevation')
slope = ee.Terrain.slope(elevation)
aspect = ee.Terrain.aspect(elevation)

for i in range(0, len(start_dates)):
    print("startdates: ", start_dates[i], "\n enddates: ", end_dates[i])
    S1 = ee.ImageCollection('COPERNICUS/S1_GRD') \
        .filter(ee.Filter.listContains('transmitterReceiverPolarisation',
                                       'VV')) \
        .filterDate(start_dates[i], end_dates[i])

    S1A = S1.median()
    l8sr = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR').\
        filterDate(start_dates[i], end_dates[i])
    L8SR = l8sr.map(preprocessing.maskL8sr).median()
    # JRC-Monthly Water history (30m)
    waterdata = ee.ImageCollection('JRC/GSW1_3/MonthlyHistory').\
        filterDate(start_dates[i], end_dates[i]).median()
    watermask = waterdata.select("water")
    # masking out "no data" region
    mask = watermask.gt(0)
    # Shifting the labels to make it binary
    maskedComposite = waterdata.updateMask(mask).subtract(1)
    configs_fs["L8SR_S1A_sl_el_as"].image = \
        ee.Image.cat([L8SR, S1A, slope, elevation, aspect]).float()
    settings = configs_fs["L8SR_S1A_sl_el_as"]
    featureStack = ee.Image.cat([
        settings.image.select(settings.BANDS),
        maskedComposite.select(settings.RESPONSE)
    ]).float()
    list_ = ee.List.repeat(1, settings.KERNEL_SIZE)
    lists = ee.List.repeat(list_, settings.KERNEL_SIZE)
    kernel = ee.Kernel.fixed(settings.KERNEL_SIZE, settings.KERNEL_SIZE, lists)
    arrays = featureStack.neighborhoodToArray(kernel)
    configs_fs["L8SR_S1A_sl_el_as"].sam_arr = arrays
    print(settings.TEST_BASE + "_" + str(start_dates[i]))
    sampling.Testing_task(ee.FeatureCollection(flood_region),
                          n,
                          N,
                          settings.sam_arr,
                          settings,
                          "Flood_data",
                          settings.TEST_BASE + "_" + str(start_dates[i]))


startdates:  2018-03-01 
 enddates:  2018-04-01
test_patches_2018-03-01
startdates:  2018-04-01 
 enddates:  2018-05-01
test_patches_2018-04-01
startdates:  2018-05-01 
 enddates:  2018-06-01
test_patches_2018-05-01
startdates:  2018-06-01 
 enddates:  2018-07-01
test_patches_2018-06-01
startdates:  2018-07-01 
 enddates:  2018-08-01
test_patches_2018-07-01
startdates:  2018-08-01 
 enddates:  2018-09-01
test_patches_2018-08-01
startdates:  2018-09-01 
 enddates:  2018-10-01
test_patches_2018-09-01
startdates:  2018-10-01 
 enddates:  2018-11-01
test_patches_2018-10-01
startdates:  2018-11-01 
 enddates:  2018-12-01
test_patches_2018-11-01
startdates:  2018-12-01 
 enddates:  2019-01-01
test_patches_2018-12-01
startdates:  2019-01-01 
 enddates:  2019-02-01
test_patches_2019-01-01
startdates:  2019-02-01 
 enddates:  2019-03-01
test_patches_2019-02-01




Smaller testing areas are selected for prediction in the app to reduce Google Cloud bucket storage costs and to speed up computation.

## Other apps and experiments

In [19]:
settings = configs_fs["L8SR_S1A_sl_el_as"]
n = 6  # Number of shards in each polygon.
N = 72  # Total sample size in each polygon.

thai_test_app = \
    ee.Geometry.BBox(100.30632852425321, 17.709225431372587,
                     100.74128946175321, 18.20417872756825)
tibet_test_app = \
    ee.Geometry.BBox(83.7866460908476, 31.02991423438545,
                     84.4782964814726, 31.623526673040716)
ghana_test_app = \
    ee.Geometry.BBox(-1.9983132238272127, 5.925449378444892,
                     -1.8517086095694002, 6.081004042776062)
brazil_test_app = \
    ee.Geometry.BBox(-56.914136208542024, -27.726731964448247,
                     -55.734106911667024, -27.158518840235534)
mexico_test_app = \
    ee.Geometry.BBox(-92.70169550108471, 17.506387173382844,
                     -91.57559686827221, 18.248746226429706)
pakistan_test_app = \
    ee.Geometry.BBox(70.62737898109093, 29.031658472574723,
                     70.86907819984093, 29.247564938104578)
egypt_test_app = \
    ee.Geometry.BBox(32.25680416139795, 22.776830239482934,
                     33.28951900514795, 23.267354217706654)
cambodia_test_app = \
    ee.Geometry.BBox(103.7086527335668, 12.395622357749016,
                     104.5710794913793, 13.2151653909455)
India_test_app = \
    ee.Geometry.BBox(80.89907945596022, 23.91896359428029,
                     81.32754625283522, 24.24995005933742)
Bangladesh_test_app = \
    ee.Geometry.BBox(89.25478625914253, 24.205326578074743,
                     90.15566516539253, 24.765228419516816)

testPolys_app = \
    ee.FeatureCollection(thai_test_app).\
    merge(tibet_test_app).\
    merge(ghana_test_app).\
    merge(brazil_test_app).\
    merge(mexico_test_app).\
    merge(pakistan_test_app).\
    merge(egypt_test_app).\
    merge(cambodia_test_app).\
    merge(India_test_app).\
    merge(Bangladesh_test_app)

sampling.Testing_task(testPolys_app,
                      n,
                      N,
                      settings.sam_arr,
                      settings,
                      "hyperparameter_test",
                      settings.TEST_BASE)

The following code tracks the progress of the export tasks, and cancels them if the wrong data is being exported.

In [16]:
# pprint(ee.batch.Task.list())
# Fetch the task list once and print at most the first 15 tasks.
for task in ee.batch.Task.list()[:15]:
    pprint(task)

<Task BJXKEQPNDLDJE34H2LQVZ4VK EXPORT_FEATURES: test_patches_g9 (READY)>
<Task YAFPGMXE3VE5YREOEH3SMEAB EXPORT_FEATURES: test_patches_g8 (READY)>
<Task R5EZW35GR5NTDLOYV4BZRW4K EXPORT_FEATURES: test_patches_g7 (READY)>
<Task CRAGLQN3NHUUSOIVTOULEJBF EXPORT_FEATURES: test_patches_g6 (READY)>
<Task WIGXEO73FGH57I57FX6PLRNL EXPORT_FEATURES: test_patches_g5 (READY)>
<Task GFJUTK4QSRRNMPIDF4IRH6NA EXPORT_FEATURES: test_patches_g4 (READY)>
<Task AJ3HW7QPYWTONACGD4X7HXON EXPORT_FEATURES: test_patches_g3 (READY)>
<Task NKZL6TLBXFBYBUN5TGNCTCD5 EXPORT_FEATURES: test_patches_g2 (READY)>
<Task KMUHPS5FNBEM3F3YI4EN4PLZ EXPORT_FEATURES: test_patches_g1 (READY)>
<Task VINKJBXKCJW4H2LX45MFNRGX EXPORT_FEATURES: test_patches_g0 (READY)>
<Task 7GF74KI7APQO2W5EBFRR4OYM EXPORT_FEATURES: test_patches_g9 (READY)>
<Task 6TKGJVYZU74T3ZGKZCWTVTWL EXPORT_FEATURES: test_patches_g8 (READY)>
<Task OJKPFGB3YYDH6GQCSMLVZWYH EXPORT_FEATURES: test_patches_g7 (READY)>
<Task H7QHZODHNZ7LV3IHCLZ5LRNY EXPORT_FEATURES: tes

In [18]:
!earthengine task cancel all

Canceling task "BJXKEQPNDLDJE34H2LQVZ4VK"
Canceling task "YAFPGMXE3VE5YREOEH3SMEAB"
Canceling task "R5EZW35GR5NTDLOYV4BZRW4K"
Canceling task "CRAGLQN3NHUUSOIVTOULEJBF"
Canceling task "WIGXEO73FGH57I57FX6PLRNL"
Canceling task "GFJUTK4QSRRNMPIDF4IRH6NA"
Canceling task "AJ3HW7QPYWTONACGD4X7HXON"
Canceling task "NKZL6TLBXFBYBUN5TGNCTCD5"
Canceling task "KMUHPS5FNBEM3F3YI4EN4PLZ"
Canceling task "VINKJBXKCJW4H2LX45MFNRGX"
Canceling task "7GF74KI7APQO2W5EBFRR4OYM"
Canceling task "6TKGJVYZU74T3ZGKZCWTVTWL"
Canceling task "OJKPFGB3YYDH6GQCSMLVZWYH"
Canceling task "H7QHZODHNZ7LV3IHCLZ5LRNY"
Canceling task "QUB3YGTV3FMABSEHMKDRSYFS"
Canceling task "INV3XY6KUUKVNUGX7UO3VLYJ"
Canceling task "PLMJRDBVJQIEK3N544RQZ5IZ"
Canceling task "HZ4BUM36BOEX4KJXC6F2KV6I"
Canceling task "UBLRWPKWIBUEH44DGJMRER33"
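When many export tasks are queued, a count per state can be easier to read than the full task list. The sketch below works on plain status dictionaries with `description` and `state` keys, which is the shape assumed here for what `task.status()` returns for each task in `ee.batch.Task.list()`. The sample data is made up for illustration, and `summarize_tasks` is a hypothetical helper.

```python
from collections import Counter


def summarize_tasks(statuses, prefix=""):
    """Count task states among status dicts whose description starts with prefix."""
    return Counter(
        status["state"]
        for status in statuses
        if status.get("description", "").startswith(prefix)
    )


# Made-up status dicts standing in for real Earth Engine task statuses.
sample_statuses = [
    {"description": "test_patches_g0", "state": "READY"},
    {"description": "test_patches_g1", "state": "RUNNING"},
    {"description": "test_patches_g2", "state": "READY"},
    {"description": "training_patches_g0", "state": "COMPLETED"},
]

summary = summarize_tasks(sample_statuses, prefix="test_patches")
for state, count in sorted(summary.items()):
    print(state, count)
```

Filtering by description prefix also gives a way to check which tasks belong to a given export before deciding whether cancelling everything is really necessary.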
