<a href="https://colab.research.google.com/github/edsml-kl121/geeimperial/blob/master/Preprocessing_and_export.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
#@title Copyright 2020 Google LLC. { display-mode: "form" }
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

<table class="ee-notebook-buttons" align="left"><td>
<a target="_blank"  href="http://colab.research.google.com/github/google/earthengine-api/blob/master/python/examples/ipynb/UNET_regression_demo.ipynb">
    <img src="https://www.tensorflow.org/images/colab_logo_32px.png" /> Run in Google Colab</a>
</td><td>
<a target="_blank"  href="https://github.com/google/earthengine-api/blob/master/python/examples/ipynb/UNET_regression_demo.ipynb"><img width=32px src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" /> View source on GitHub</a></td></table>

# Introduction

This is an Earth Engine <> TensorFlow demonstration notebook.  Suppose you want to predict a continuous output (regression) from a stack of continuous inputs.  In this example, the output is impervious surface area from [NLCD](https://www.mrlc.gov/data) and the input is a Landsat 8 composite.  The model is a [fully convolutional neural network (FCNN)](https://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Long_Fully_Convolutional_Networks_2015_CVPR_paper.pdf), specifically [U-net](https://arxiv.org/abs/1505.04597). This notebook shows:

1.   Exporting training/testing patches from Earth Engine, suitable for training an FCNN model.
2.   Preprocessing.
3.   Training and validating an FCNN model.
4.   Making predictions with the trained model and importing them to Earth Engine.

# Variables

Declare the variables that will be in use throughout the notebook.

In [11]:
PACKAGE_PATH = 'Water_classification_package'

!ls -l
!mkdir {PACKAGE_PATH}
!touch {PACKAGE_PATH}/__init__.py
!ls -l {PACKAGE_PATH}

total 9504
-rw-r--r--  1 kandanai  staff   256035 Jul 23 10:41 Preprocessing_and_export.ipynb
-rw-r--r--  1 kandanai  staff   103359 Jul 17 16:31 S1+DEM+CNNfromScratch.ipynb
-rw-r--r--  1 kandanai  staff   246279 Jul 17 16:31 Test_accuracy_assessment.ipynb
-rw-r--r--@ 1 kandanai  staff  4094106 Jul 23 10:31 Test_accuracy_assessment_global.ipynb
drwxr-xr-x  6 kandanai  staff      192 Jul 23 10:38 [1m[36mWater_classification_package[m[m
-rw-r--r--  1 kandanai  staff   155531 Jul 22 19:22 results.ipynb
drwxr-xr-x  7 kandanai  staff      224 Jul 17 16:37 [1m[36mtools[m[m
mkdir: Water_classification_package: File exists
total 32
-rw-r--r--  1 kandanai  staff     0 Jul 23 10:42 __init__.py
-rw-r--r--  1 kandanai  staff  1733 Jul 23 10:38 config.py
-rw-r--r--  1 kandanai  staff  4163 Jul 23 10:38 metrics_.py
-rw-r--r--  1 kandanai  staff  2863 Jul 23 10:38 preprocessing.py


In [5]:
%%writefile {PACKAGE_PATH}/metrics_.py

from keras import backend as K
import tqdm.notebook as tq
import numpy as np
import tensorflow as tf

def f1(y_true, y_pred):
    def recall(y_true, y_pred):
        """Recall metric.

        Only computes a batch-wise average of recall.

        Computes the recall, a metric for multi-label classification of
        how many relevant items are selected.
        """
        true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
        possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
        recall = true_positives / (possible_positives + K.epsilon())
        return recall

    def precision(y_true, y_pred):
        """Precision metric.

        Only computes a batch-wise average of precision.

        Computes the precision, a metric for multi-label classification of
        how many selected items are relevant.
        """
        true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
        predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
        precision = true_positives / (predicted_positives + K.epsilon())
        return precision
    precision = precision(y_true, y_pred)
    recall = recall(y_true, y_pred)
    return 2*((precision*recall)/(precision+recall+K.epsilon()))

# https://stackoverflow.com/questions/43547402/how-to-calculate-f1-macro-in-keras

# Acc = TP + TN / (TP + TN + FP + FN)
# possible_pos = TP + FN
# predicted_pos = TP + FP
# Missing TN
# TN = total - possible_pos - predicted_pos + TP
# TP + TN + FP + FN = possible_pos + predicted_pos - TP + TN

def custom_accuracy(y_true, y_pred):
    # total_data = K.int_shape(y_true) + K.int_shape(y_pred)
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    true_negatives = K.sum(K.round(K.clip(1 - y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    total_data = - true_positives + true_negatives + possible_positives + predicted_positives
  
    # total_data = tf.cast(total_data, tf.float32|tf.int32)
    # true_positives = tf.cast(true_positives, tf.float32|tf.int32)
    # possible_positives = tf.cast(possible_positives, tf.float32|tf.int32)
    # predicted_positives = tf.cast(predicted_positives, tf.float32|tf.int32)
    # print(K.int_shape(y_true), K.int_shape(y_pred))
    # print(K.int_shape(y_pred)[0], K.int_shape(y_pred)[1], K.int_shape(y_pred)[2])
    # print(total_data)
    # print(possible_positives)
    # (true_positives) / (total_data + K.epsilon())
    return (true_positives + true_negatives) / (total_data + K.epsilon())



def MetricCalculator(model, test_data, total_steps):
  TP = 0
  TN = 0
  FP = 0
  FN = 0
  # total_steps = 2000
  test_acc_metric = tf.keras.metrics.Accuracy()
  # test_F1_metric = tfa.metrics.F1Score(num_classes=2, threshold=0.5)
  pbar = tq.tqdm(total=total_steps)
  for steps, data in enumerate(test_data):
    # print(f'Number of steps: {steps}', end = "\r")
    pbar.update(1)
    if steps == total_steps:
      break
    input = data[0]
    y_true = data[1]
    y_pred = np.rint(model.predict(input))
    y_true = np.reshape(y_true, (256*256,2))
    y_pred = np.reshape(y_pred, (256*256,2))
    # print(y_pred[0][1] == 1, y_pred[0][1] == 1)
    for j in range(y_pred.shape[0]):
      if y_true[j][1] == 1 and y_pred[j][1] == 1:
        TP += 1
      if y_true[j][1] == 0 and y_pred[j][1] == 0:
        TN += 1
      if y_true[j][1] == 1 and y_pred[j][1] == 0:
        FN += 1
      if y_true[j][1] == 0 and y_pred[j][1] == 1:
        FP += 1
    test_acc_metric.update_state(y_true, y_pred)
    # # recall_metric.update_state(data[1], test_logits)
    # # precision_metric.update_state(data[1], test_logits)
  print("TP: ", TP)
  print("TN: ", TN)
  print("FP: ", FP)
  print("FN: ", FN)
  if TP != 0:
    precision = TP/(TP + FP)
    recall = TP/(TP + FN)
    F1 = 2*(recall*precision)/(recall + precision)
  else:
    precision = None
    recall = None
    F1 = None

  print("precision: ", precision)
  print("recall: ", recall)
  print("F1_Score: : ", F1)
  print("Accuracy: ", test_acc_metric.result().numpy())
  return F1, test_acc_metric.result().numpy()

Overwriting Water_classification_package/metrics_.py


In [7]:
%%writefile {PACKAGE_PATH}/config.py
import tensorflow as tf
from . import metrics_
class configuration:
  def __init__(self, PROJECT_TITLE, BANDS1, TRAIN_SIZE, EVAL_SIZE, BANDS2=[], BANDS3=[], country="TH", image=None, sam_arr=None, type_=1):
    if type_ == 1:
      self.type_ = "fs"
    elif type_ == 2:
      self.type_ = "m2"
    elif type_ == 3:
      self.type_ = "m3"
    else:
      self.type_ = None
    self.country = country
    self.PROJECT_TITLE = PROJECT_TITLE
    self.BANDS1 = BANDS1
    self.BANDS2 = BANDS2
    self.BANDS3 = BANDS3
    self.BUCKET = "geebucketwater"
    self.FOLDER = f'{self.type_}_{self.country}_Cnn_{self.PROJECT_TITLE}'
    self.TRAIN_SIZE = TRAIN_SIZE
    self.EVAL_SIZE = EVAL_SIZE
    self.BUCKET = "geebucketwater"
    self.TRAINING_BASE = f'training_patches_{PROJECT_TITLE}'
    self.EVAL_BASE = f'eval_patches_{PROJECT_TITLE}'
    self.TEST_BASE_1 = f'test_patches_{PROJECT_TITLE}_1'
    self.TEST_BASE_2 = f'test_patches_{PROJECT_TITLE}_2'
    self.RESPONSE = 'water'
    self.BANDS = BANDS1 + BANDS2 + BANDS3 
    self.FEATURES = BANDS1 + BANDS2 + BANDS3 + [self.RESPONSE]
    # Specify the size and shape of patches expected by the model.
    self.KERNEL_SIZE = 256
    self.KERNEL_SHAPE = [self.KERNEL_SIZE, self.KERNEL_SIZE]
    self.COLUMNS = [
      tf.io.FixedLenFeature(shape=self.KERNEL_SHAPE, dtype=tf.float32) for k in self.FEATURES
    ]
    self.FEATURES_DICT = dict(zip(self.FEATURES, self.COLUMNS))
    # Specify model training parameters.
    self.BATCH_SIZE = 16
    self.EPOCHS = 5
    self.BUFFER_SIZE = 2000
    self.OPTIMIZER = 'adam'
    self.LOSS = 'categorical_crossentropy'
    self.METRICS = ['AUC', metrics_.f1, metrics_.custom_accuracy]
    self.image = image
    self.sam_arr = sam_arr
    




Writing Water_classification_package/config.py


In [6]:
%%writefile {PACKAGE_PATH}/preprocessing.py

import tensorflow as tf

class Preprocessor:
  def __init__(self, config):
    self.config = config

  def parse_tfrecord(self, example_proto):
    """The parsing function.
    Read a serialized example into the structure defined by FEATURES_DICT.
    Args:
      example_proto: a serialized Example.
    Returns:
      A dictionary of tensors, keyed by feature name.
    """
    return tf.io.parse_single_example(example_proto, self.config.FEATURES_DICT)


  def to_tuple(self, inputs):
    """Function to convert a dictionary of tensors to a tuple of (inputs, outputs).
    Turn the tensors returned by parse_tfrecord into a stack in HWC shape.
    Args:
      inputs: A dictionary of tensors, keyed by feature name.
    Returns:
      A tuple of (inputs, outputs).
    """
    inputsList = [inputs.get(key) for key in self.config.FEATURES]
    stacked = tf.stack(inputsList, axis=0)
    # Convert from CHW to HWC
    stacked = tf.transpose(stacked, [1, 2, 0])
    return stacked[:,:,:len(self.config.BANDS)], tf.reshape(tf.one_hot(tf.cast(stacked[:,:,len(self.config.BANDS):], tf.int32), depth=2),[256,256,2])


  def get_dataset(self, pattern):
    """Function to read, parse and format to tuple a set of input tfrecord files.
    Get all the files matching the pattern, parse and convert to tuple.
    Args:
      pattern: A file pattern to match in a Cloud Storage bucket.
    Returns:
      A tf.data.Dataset
    """
    glob = tf.io.gfile.glob(pattern)
    dataset = tf.data.TFRecordDataset(glob, compression_type='GZIP')
    dataset = dataset.map(self.parse_tfrecord, num_parallel_calls=5)
    dataset = dataset.map(self.to_tuple, num_parallel_calls=5)
    return dataset

  def get_training_dataset(self):
    """Get the preprocessed training dataset
    Returns: 
      A tf.data.Dataset of training data.
    """
    glob = 'gs://' + self.config.BUCKET + '/' + self.config.FOLDER + '/' + self.config.TRAINING_BASE + '*'
    print(glob)
    dataset = self.get_dataset(glob)
    dataset = dataset.shuffle(self.config.BUFFER_SIZE).batch(self.config.BATCH_SIZE).repeat()
    return dataset



  def get_eval_dataset(self):
    """Get the preprocessed evaluation dataset
    Returns: 
      A tf.data.Dataset of evaluation data.
    """
    glob = 'gs://' + self.config.BUCKET + '/' + self.config.FOLDER + '/' + self.config.EVAL_BASE + '*'
    print(glob)
    dataset = self.get_dataset(glob)
    dataset = dataset.batch(1).repeat()
    return dataset

  # print(iter(evaluation.take(1)).next())

  def get_test_dataset(self, test_base):
    """Get the preprocessed evaluation dataset
    Returns: 
      A tf.data.Dataset of evaluation data.
    """
    glob = 'gs://' + self.config.BUCKET + '/' + self.config.FOLDER + '/' + test_base + '*'
    print(glob)
    dataset = self.get_dataset(glob)
    dataset = dataset.batch(1).repeat()
    return dataset



Overwriting Water_classification_package/preprocessing.py


# Setup software libraries

Authenticate and import as necessary.

In [2]:
# Cloud authentication.
# from google.colab import auth
# auth.authenticate_user()

# Import, authenticate and initialize the Earth Engine library.
import ee
ee.Authenticate()
ee.Initialize()

# project_id = 'coastal-cell-299117'
# !gcloud config set project {project_id}

# Tensorflow setup.
import tensorflow as tf
print(tf.__version__)

Enter verification code: 4/1AdQt8qgmCmTU7hy2J1p2M9isTYD1dFm0_8dbMOsu43jpNrcqWdoKEpDpBGE

Successfully saved authorization token.
2.0.0


In [9]:
# from Ipython.display import IFrame
# Iframe

%reset_selective [-f]
project_id = 'coastal-cell-299117'
!gcloud config set project {project_id}

Once deleted, variables cannot be recovered. Proceed (y/[n])?  n
Nothing done.
Are you sure you wish to set property [core/project] to coastal-cell-299117?

Do you want to continue (Y/n)?  ^C


Command killed by keyboard interrupt



In [14]:
!pip install keras

Collecting keras
  Downloading keras-2.9.0-py2.py3-none-any.whl (1.6 MB)
[K     |████████████████████████████████| 1.6 MB 6.0 MB/s eta 0:00:01
[?25hInstalling collected packages: keras
Successfully installed keras-2.9.0


In [3]:
from importlib import reload
# reload(config)
from Water_classification_package import preprocessing, config, metrics_

ImportError: cannot import name 'get_config' from 'tensorflow.python.eager.context' (/Users/kandanai/opt/anaconda3/envs/bigdata/lib/python3.7/site-packages/tensorflow_core/python/eager/context.py)

# Imagery

Load data

In [4]:
# Use Landsat 8 surface reflectance data.
S1 = ee.ImageCollection('COPERNICUS/S1_GRD') \
        .filter(ee.Filter.listContains('transmitterReceiverPolarisation', 'VV')) \
        .filterDate('2018-01-01','2018-02-01') \

S1A = S1.median()
S1 = S1.select('VV', 'VH').median()

elevation = ee.Image('NASA/NASADEM_HGT/001').select('elevation');

slope = ee.Terrain.slope(elevation);

aspect = ee.Terrain.aspect(elevation);

# Use Landsat 8 surface reflectance data.
l8sr = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR').filterDate('2018-01-01','2018-02-01')

# Cloud masking function.
def maskL8sr(image):
  BANDS = ['B2', 'B3', 'B4', 'B5', 'B6', 'B7']
  cloudShadowBitMask = ee.Number(2).pow(3).int()
  cloudsBitMask = ee.Number(2).pow(5).int()
  qa = image.select('pixel_qa')
  mask = qa.bitwiseAnd(cloudShadowBitMask).eq(0).And(
    qa.bitwiseAnd(cloudsBitMask).eq(0))
  return image.updateMask(mask).select(BANDS).divide(10000)

L8SR = l8sr.map(maskL8sr).median()


waterdata = ee.ImageCollection('JRC/GSW1_3/MonthlyHistory').filterDate('2018-01-01', '2018-02-01').median()
watermask = waterdata.select("water")
mask = watermask.gt(0)
maskedComposite = waterdata.updateMask(mask).subtract(1)

In [31]:
configs_fs = {}
train_size = 240*3
eval_size = 240*2

#### Feature stack experiment

configs_fs["S1A_el_sl_as"] = config.configuration("S1A_el_sl_as", ["VV", "VH", "angle", "elevation", "slope", "aspect"], train_size, eval_size)
configs_fs["S1A_el"] = config.configuration("S1A_el", ["VV", "VH", "angle", "elevation"], train_size, eval_size)
configs_fs["S1A_sl"] = config.configuration("S1A_sl", ["VV", "VH", "angle", "slope"], train_size, eval_size)
configs_fs["S1A_as"] = config.configuration("S1A_as", ["VV", "VH", "angle", "aspect"], train_size, eval_size)
configs_fs["S1A_sl_as"] = config.configuration("S1A_sl_as", ["VV", "VH", "angle", "slope", "aspect"], train_size, eval_size)
configs_fs["S1A_el_sl"] = config.configuration("S1A_el_sl", ["VV", "VH", "angle", "elevation", "slope"], train_size, eval_size)
configs_fs["S1A_el_as"] = config.configuration("S1A_el_as", ["VV", "VH", "angle", "elevation", "aspect"], train_size, eval_size)

configs_fs["S1_el_sl_as"] = config.configuration("S1_el_sl_as", ["VV", "VH", "elevation", "slope", "aspect"], train_size, eval_size)
configs_fs["S1_el"] = config.configuration("S1_el", ["VV", "VH", "elevation"], train_size, eval_size)
configs_fs["S1_sl"] = config.configuration("S1_sl", ["VV", "VH", "slope"], train_size, eval_size)
configs_fs["S1_as"] = config.configuration("S1_as", ["VV", "VH", "aspect"], train_size, eval_size)
configs_fs["S1_sl_as"] = config.configuration("S1_sl_as", ["VV", "VH", "slope", "aspect"], train_size, eval_size)
configs_fs["S1_el_sl"] = config.configuration("S1_el_sl", ["VV", "VH", "elevation", "slope"], train_size, eval_size)
configs_fs["S1_el_as"] = config.configuration("S1_el_as", ["VV", "VH", "elevation", "aspect"], train_size, eval_size)

configs_fs["L8SR_el_sl_as"] = config.configuration("L8SR_el_sl_as", ["B2", "B3", "B4", "B5", "B6", "B7", "elevation", "slope", "aspect"], train_size, eval_size)
configs_fs["L8SR_el"] = config.configuration("L8SR_el", ["B2", "B3", "B4", "B5", "B6", "B7", "elevation"], train_size, eval_size)
configs_fs["L8SR_sl"] = config.configuration("L8SR_sl", ["B2", "B3", "B4", "B5", "B6", "B7", "slope"], train_size, eval_size)
configs_fs["L8SR_as"] = config.configuration("L8SR_as", ["B2", "B3", "B4", "B5", "B6", "B7", "aspect"], train_size, eval_size)
configs_fs["L8SR_sl_as"] = config.configuration("L8SR_sl_as", ["B2", "B3", "B4", "B5", "B6", "B7", "slope", "aspect"], train_size, eval_size)
configs_fs["L8SR_el_sl"] = config.configuration("L8SR_el_sl", ["B2", "B3", "B4", "B5", "B6", "B7", "elevation", "slope"], train_size, eval_size)
configs_fs["L8SR_el_as"] = config.configuration("L8SR_el_as", ["B2", "B3", "B4", "B5", "B6", "B7", "elevation", "aspect"], train_size, eval_size)

configs_fs["L8SR_S1_el"] = config.configuration("L8SR_S1_el", ["B2", "B3", "B4", "B5", "B6", "B7", "VV", "VH", "elevation"], train_size, eval_size)
configs_fs["L8SR_S1_sl"] = config.configuration("L8SR_S1_sl", ["B2", "B3", "B4", "B5", "B6", "B7", "VV", "VH", "slope"], train_size, eval_size)
configs_fs["L8SR_S1_sl_el_as"] = config.configuration("L8SR_S1_sl_el_as", ["B2", "B3", "B4", "B5", "B6", "B7", "VV", "VH", "slope","elevation", "aspect"], train_size, eval_size)

configs_fs["L8SR_S1A_el"] = config.configuration("L8SR_S1A_el", ["B2", "B3", "B4", "B5", "B6", "B7", "VV", "VH", "angle", "elevation"], train_size, eval_size)
configs_fs["L8SR_S1A_sl"] = config.configuration("L8SR_S1A_sl", ["B2", "B3", "B4", "B5", "B6", "B7", "VV", "VH", "angle", "slope"], train_size, eval_size)
configs_fs["L8SR_S1A_sl_el_as"] = config.configuration("L8SR_S1A_sl_el_as", ["B2", "B3", "B4", "B5", "B6", "B7", "VV", "VH", "angle", "slope","elevation", "aspect"], train_size, eval_size)

configs_fs["L8SR"] = config.configuration("L8SR", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size)
configs_fs["S1"] = config.configuration("S1", ["VV", "VH"], train_size, eval_size)
configs_fs["S1A"] = config.configuration("S1A", ["VV", "VH", "angle"], train_size, eval_size)
configs_fs["L8SR_S1"] = config.configuration("L8SR_S1", ["B2", "B3", "B4", "B5", "B6", "B7", "VV", "VH"], train_size, eval_size)
configs_fs["L8SR_S1A"] = config.configuration("L8SR_S1A", ["B2", "B3", "B4", "B5", "B6", "B7", "VV", "VH", "angle"], train_size, eval_size)

###### Multiexperiment

configs_multi = {}

configs_multi["S1A_el_sl_as"] = config.configuration("S1A_el_sl_as", ["VV", "VH", "angle"], train_size, eval_size, ["elevation", "slope", "aspect"], type_=2)
configs_multi["S1A_el"] = config.configuration("S1A_el", ["VV", "VH", "angle"], train_size, eval_size, ["elevation"], type_=2)
configs_multi["S1A_sl"] = config.configuration("S1A_sl", ["VV", "VH", "angle"], train_size, eval_size, ["slope"], type_=2)
configs_multi["S1A_as"] = config.configuration("S1A_as", ["VV", "VH", "angle"], train_size, eval_size, ["aspect"], type_=2)
configs_multi["S1A_sl_as"] = config.configuration("S1A_sl_as", ["VV", "VH", "angle"], train_size, eval_size, ["slope", "aspect"], type_=2)
configs_multi["S1A_el_sl"] = config.configuration("S1A_el_sl", ["VV", "VH", "angle"], train_size, eval_size, ["elevation", "slope"], type_=2)
configs_multi["S1A_el_as"] = config.configuration("S1A_el_as", ["VV", "VH", "angle"], train_size, eval_size, ["elevation", "aspect"], type_=2)

configs_multi["S1_el_sl_as"] = config.configuration("S1_el_sl_as", ["VV", "VH"], train_size, eval_size, ["elevation", "slope", "aspect"], type_=2)
configs_multi["S1_el"] = config.configuration("S1_el", ["VV", "VH"], train_size, eval_size, ["elevation"], type_=2)
configs_multi["S1_sl"] = config.configuration("S1_sl", ["VV", "VH"], train_size, eval_size, ["slope"], type_=2)
configs_multi["S1_as"] = config.configuration("S1_as", ["VV", "VH"], train_size, eval_size, ["aspect"], type_=2)
configs_multi["S1_sl_as"] = config.configuration("S1_sl_as", ["VV", "VH"], train_size, eval_size, ["slope", "aspect"], type_=2)
configs_multi["S1_el_sl"] = config.configuration("S1_el_sl", ["VV", "VH"], train_size, eval_size, ["elevation", "slope"], type_=2)
configs_multi["S1_el_as"] = config.configuration("S1_el_as", ["VV", "VH"], train_size, eval_size, ["elevation", "aspect"], type_=2)

configs_multi["L8SR_el_sl_as"] = config.configuration("L8SR_el_sl_as", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["elevation", "slope", "aspect"], type_=2)
configs_multi["L8SR_el"] = config.configuration("L8SR_el", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["elevation"], type_=2)
configs_multi["L8SR_sl"] = config.configuration("L8SR_sl", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["slope"], type_=2)
configs_multi["L8SR_as"] = config.configuration("L8SR_as", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["aspect"], type_=2)
configs_multi["L8SR_sl_as"] = config.configuration("L8SR_sl_as", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["slope", "aspect"], type_=2)
configs_multi["L8SR_el_sl"] = config.configuration("L8SR_el_sl", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["elevation", "slope"], type_=2)
configs_multi["L8SR_el_as"] = config.configuration("L8SR_el_as", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["elevation", "aspect"], type_=2)


configs_multi["L8SR_S1_as"] = config.configuration("L8SR_S1_as", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["VV", "VH", "aspect"], type_=2)
configs_multi["L8SR_S1_el"] = config.configuration("L8SR_S1_el", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["VV", "VH", "elevation"], type_=2)
configs_multi["L8SR_S1_sl"] = config.configuration("L8SR_S1_sl", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["VV", "VH", "slope"], type_=2)

configs_multi["L8SR_S1_sl_as"] = config.configuration("L8SR_S1_sl_as", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["VV", "VH", "slope", "aspect"], type_=2)
configs_multi["L8SR_S1_el_sl"] = config.configuration("L8SR_S1_el_sl", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["VV", "VH", "elevation", "slope"], type_=2)
configs_multi["L8SR_S1_el_as"] = config.configuration("L8SR_S1_el_as", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["VV", "VH", "elevation", "aspect"], type_=2)
configs_multi["L8SR_S1_sl_el_as"] = config.configuration("L8SR_S1_sl_el_as", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["VV", "VH", "slope","elevation", "aspect"], type_=2)

configs_multi["L8SR_S1A_as"] = config.configuration("L8SR_S1A_as", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["VV", "VH", "angle", "aspect"], type_=2)
configs_multi["L8SR_S1A_el"] = config.configuration("L8SR_S1A_el", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["VV", "VH", "angle", "elevation"], type_=2)
configs_multi["L8SR_S1A_sl"] = config.configuration("L8SR_S1A_sl", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["VV", "VH", "angle", "slope"], type_=2)

configs_multi["L8SR_S1A_sl_as"] = config.configuration("L8SR_S1A_sl_as", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["VV", "VH", "angle", "slope", "aspect"], type_=2)
configs_multi["L8SR_S1A_el_sl"] = config.configuration("L8SR_S1A_el_sl", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["VV", "VH", "angle", "elevation", "slope"], type_=2)
configs_multi["L8SR_S1A_el_as"] = config.configuration("L8SR_S1A_el_as", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["VV", "VH", "angle", "elevation", "aspect"], type_=2)
configs_multi["L8SR_S1A_sl_el_as"] = config.configuration("L8SR_S1A_sl_el_as", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["VV", "VH", "angle", "slope","elevation", "aspect"], type_=2)

configs_multi["L8SR_S1"] = config.configuration("L8SR_S1", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["VV", "VH"], type_=2)
configs_multi["L8SR_S1A"] = config.configuration("L8SR_S1A", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["VV", "VH", "angle"], type_=2)

configs_multi["L8SR_S1_as3"] = config.configuration("L8SR_S1_as3", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["VV", "VH"], ["aspect"], type_=3)
configs_multi["L8SR_S1_el3"] = config.configuration("L8SR_S1_el3", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["VV", "VH"], ["elevation"], type_=3)
configs_multi["L8SR_S1_sl3"] = config.configuration("L8SR_S1_sl3", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["VV", "VH"], ["slope"], type_=3)
configs_multi["L8SR_S1_sl_el_as3"] = config.configuration("L8SR_S1_sl_el_as3", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["VV", "VH"], ["slope","elevation", "aspect"], type_=3)

configs_multi["L8SR_S1A_as3"] = config.configuration("L8SR_S1A_as3", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["VV", "VH", "angle"], ["aspect"], type_=3)
configs_multi["L8SR_S1A_el3"] = config.configuration("L8SR_S1A_el3", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["VV", "VH", "angle"], ["elevation"], type_=3)
configs_multi["L8SR_S1A_sl3"] = config.configuration("L8SR_S1A_sl3", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["VV", "VH", "angle"], ["slope"], type_=3)
configs_multi["L8SR_S1A_sl_el_as3"] = config.configuration("L8SR_S1A_sl_el_as3", ["B2", "B3", "B4", "B5", "B6", "B7"], train_size, eval_size, ["VV", "VH", "angle"], ["slope","elevation", "aspect"], type_=3)

In [32]:
print(configs_multi["L8SR_S1A"].FOLDER)
print(configs_multi["L8SR_S1A"].BANDS1)
print(configs_multi["L8SR_S1A"].BANDS2)
print(configs_multi["L8SR_S1A"].BANDS)
print(configs_multi["L8SR_S1A"].TRAINING_BASE)
print(configs_multi["L8SR_S1A"].EVAL_BASE)
print(configs_multi["L8SR_S1A"].TEST_BASE_1)
print(configs_multi["L8SR_S1A"].TEST_BASE_2)
print(configs_multi["L8SR_S1A"].country)
print(configs_multi["L8SR_S1A"].type_)

m2_TH_Cnn_L8SR_S1A
['B2', 'B3', 'B4', 'B5', 'B6', 'B7']
['VV', 'VH', 'angle']
['B2', 'B3', 'B4', 'B5', 'B6', 'B7', 'VV', 'VH', 'angle']
training_patches_L8SR_S1A
eval_patches_L8SR_S1A
test_patches_L8SR_S1A_1
test_patches_L8SR_S1A_2
TH
m2


In [12]:
# Feature Stack

configs_fs["S1A_el_sl_as"].image = ee.Image.cat([S1A, elevation, slope, aspect]).float()
configs_fs["S1A_el"].image = ee.Image.cat([S1A, elevation]).float()
configs_fs["S1A_sl"].image = ee.Image.cat([S1A, slope]).float()
configs_fs["S1A_as"].image = ee.Image.cat([S1A, aspect]).float()
configs_fs["S1A_sl_as"].image = ee.Image.cat([S1A, slope, aspect]).float()
configs_fs["S1A_el_sl"].image = ee.Image.cat([S1A, elevation, slope]).float()
configs_fs["S1A_el_as"].image = ee.Image.cat([S1A, elevation, aspect]).float()

configs_fs["S1_el_sl_as"].image = ee.Image.cat([S1, elevation, slope, aspect]).float()
configs_fs["S1_el"].image = ee.Image.cat([S1, elevation]).float()
configs_fs["S1_sl"].image = ee.Image.cat([S1, slope]).float()
configs_fs["S1_as"].image = ee.Image.cat([S1, aspect]).float()
configs_fs["S1_sl_as"].image = ee.Image.cat([S1, slope, aspect]).float()
configs_fs["S1_el_sl"].image = ee.Image.cat([S1, elevation, slope]).float()
configs_fs["S1_el_as"].image = ee.Image.cat([S1, elevation, aspect]).float()

configs_fs["L8SR_el_sl_as"].image = ee.Image.cat([L8SR, elevation, slope, aspect]).float()
configs_fs["L8SR_el"].image = ee.Image.cat([L8SR, elevation]).float()
configs_fs["L8SR_sl"].image = ee.Image.cat([L8SR, slope]).float()
configs_fs["L8SR_as"].image = ee.Image.cat([L8SR, aspect]).float()
configs_fs["L8SR_sl_as"].image = ee.Image.cat([L8SR, slope, aspect]).float()
configs_fs["L8SR_el_sl"].image = ee.Image.cat([L8SR, elevation, slope]).float()
configs_fs["L8SR_el_as"].image = ee.Image.cat([L8SR, elevation, aspect]).float()

configs_fs["L8SR_S1_el"].image = ee.Image.cat([L8SR, S1, elevation]).float()
configs_fs["L8SR_S1_sl"].image = ee.Image.cat([L8SR, S1, slope]).float()
configs_fs["L8SR_S1_sl_el_as"].image = ee.Image.cat([L8SR, S1, slope, elevation, aspect]).float()

configs_fs["L8SR_S1A_el"].image = ee.Image.cat([L8SR, S1A, elevation]).float()
configs_fs["L8SR_S1A_sl"].image = ee.Image.cat([L8SR, S1A, slope]).float()
configs_fs["L8SR_S1A_sl_el_as"].image = ee.Image.cat([L8SR, S1A, slope, elevation, aspect]).float()

configs_fs["L8SR"].image = L8SR.float()
configs_fs["S1"].image = S1.float()
configs_fs["S1A"].image = S1A.float()
configs_fs["L8SR_S1"].image = ee.Image.cat([L8SR, S1]).float()
configs_fs["L8SR_S1A"].image = ee.Image.cat([L8SR, S1A]).float()


Stack the 2D images (Landsat composite and NLCD impervious surface) to create a single image from which samples can be taken.  Convert the image into an array image in which each pixel stores 256x256 patches of pixels for each band.  This is a key step that bears emphasis: to export training patches, convert a multi-band image to [an array image](https://developers.google.com/earth-engine/arrays_array_images#array-images) using [`neighborhoodToArray()`](https://developers.google.com/earth-engine/api_docs#eeimageneighborhoodtoarray), then sample the image at points.

In [13]:
for key in list(configs_fs):
  settings = configs_fs[key]
  featureStack = ee.Image.cat([
  settings.image.select(settings.BANDS),
  maskedComposite.select(settings.RESPONSE)
  ]).float()
  list_ = ee.List.repeat(1, settings.KERNEL_SIZE)
  lists = ee.List.repeat(list_, settings.KERNEL_SIZE)
  kernel = ee.Kernel.fixed(settings.KERNEL_SIZE, settings.KERNEL_SIZE, lists)
  arrays = featureStack.neighborhoodToArray(kernel)
  configs_fs[key].sam_arr = arrays
  print(key, settings.sam_arr.getInfo())
  

S1A_el_sl_as {'type': 'Image', 'bands': [{'id': 'VV', 'data_type': {'type': 'PixelType', 'precision': 'float', 'dimensions': 2}, 'crs': 'EPSG:4326', 'crs_transform': [1, 0, 0, 0, 1, 0]}, {'id': 'VH', 'data_type': {'type': 'PixelType', 'precision': 'float', 'dimensions': 2}, 'crs': 'EPSG:4326', 'crs_transform': [1, 0, 0, 0, 1, 0]}, {'id': 'angle', 'data_type': {'type': 'PixelType', 'precision': 'float', 'dimensions': 2}, 'crs': 'EPSG:4326', 'crs_transform': [1, 0, 0, 0, 1, 0]}, {'id': 'elevation', 'data_type': {'type': 'PixelType', 'precision': 'float', 'dimensions': 2}, 'dimensions': [1288801, 421201], 'crs': 'EPSG:4326', 'crs_transform': [0.0002777777777777778, 0, -179.0001388888889, 0, -0.0002777777777777778, 61.00013888888889]}, {'id': 'slope', 'data_type': {'type': 'PixelType', 'precision': 'float', 'dimensions': 2}, 'crs': 'EPSG:4326', 'crs_transform': [0.0002777777777777778, 0, -179.0001388888889, 0, -0.0002777777777777778, 61.00013888888889]}, {'id': 'aspect', 'data_type': {'typ

Use some pre-made geometries to sample the stack in strategic locations.  Specifically, these are hand-made polygons in which to take the 256x256 samples.  Display the sampling polygons on a map, red for training polygons, blue for evaluation.

In [23]:
import folium
# trainingPolys = ee.FeatureCollection('projects/google/DemoTrainingGeometries')
trainingPolys = ee.FeatureCollection('users/mewchayutaphong/thailandTraining')
# evalPolys = ee.FeatureCollection('projects/google/DemoEvalGeometries')
first = ee.Geometry.BBox(101.78381817548382, 14.052178100305664, 102.27820294110882, 14.361037359593043);
second = ee.Geometry.BBox(102.16833965985882, 16.426385350573945, 102.83850567548382, 16.921030330473783);
evalPolys = ee.FeatureCollection(first).merge(second)
testPolys1 = ee.FeatureCollection(ee.Geometry.BBox(100.30632852425321, 17.709225431372587, 100.74128946175321, 18.20417872756825))
testPolys2 = ee.FeatureCollection(ee.Geometry.BBox(83.7866460908476, 31.02991423438545, 84.4782964814726, 31.623526673040716))

polyImage = ee.Image(0).byte().paint(trainingPolys, 1).paint(evalPolys, 2).paint(testPolys1, 3).paint(testPolys2, 4)
polyImage = polyImage.updateMask(polyImage)

mapid = polyImage.getMapId({'min': 1, 'max': 3, 'palette': ['red', 'blue', "green", "purple"]})
map = folium.Map(location=[16.426385350573945, 102.16833965985882], zoom_start=5)
folium.TileLayer(
    tiles=mapid['tile_fetcher'].url_format,
    attr='Map Data &copy; <a href="https://earthengine.google.com/">Google Earth Engine</a>',
    overlay=True,
    name='training polygons',
  ).add_to(map)
map.add_child(folium.LayerControl())
map

# Sampling

The mapped data look reasonable so take a sample from each polygon and merge the results into a single export.  The key step is sampling the array image at points, to get all the pixels in a 256x256 neighborhood at each point.  It's worth noting that to build the training and testing data for the FCNN, you export a single TFRecord file that contains patches of pixel values in each record.  You do NOT need to export each training/testing patch to a different image.  Since each record potentially contains a lot of data (especially with big patches or many input bands), some manual sharding of the computation is necessary to avoid the `computed value too large` error.  Specifically, the following code takes multiple (smaller) samples within each geometry, merging the results to get a single export.

In [18]:
# Convert the feature collections to lists for iteration.
trainingPolysList = trainingPolys.toList(trainingPolys.size())
evalPolysList = evalPolys.toList(evalPolys.size())

def Training_task(trainingPolys, n, N, arrays, setting):
  trainingPolysList = trainingPolys.toList(trainingPolys.size())
  # Export all the training data (in many pieces), ith one task 
  # per geometry.
  for g in range(trainingPolys.size().getInfo()):
    geomSample = ee.FeatureCollection([])
    for i in range(n):
      sample = arrays.sample(
        region = ee.Feature(trainingPolysList.get(g)).geometry(), 
        scale = 30,
        numPixels = N / n, # Size of the shard.
        seed = i,
        tileScale = 8
      )
      geomSample = geomSample.merge(sample)
    
    desc = setting.TRAINING_BASE + '_g' + str(g)
    task = ee.batch.Export.table.toCloudStorage(
      collection = geomSample,
      description = desc,
      bucket = setting.BUCKET,
      fileNamePrefix = setting.FOLDER + '/' + desc,
      fileFormat = 'TFRecord',
      selectors = setting.BANDS + [setting.RESPONSE]
    )
    task.start()


def Eval_task(evalPolys, n, N, arrays, setting):
  evalPolysList = evalPolys.toList(evalPolys.size())
  # Export all the evaluation data.
  for g in range(evalPolys.size().getInfo()):
    geomSample = ee.FeatureCollection([])
    for i in range(n):
      sample = arrays.sample(
        region = ee.Feature(evalPolysList.get(g)).geometry(), 
        scale = 30,
        numPixels = N / n,
        seed = i,
        tileScale = 8
      )
      # geomSample = geomSample.filter(ee.Filter.greaterThan('water', -1))
      geomSample = geomSample.merge(sample)

    desc = setting.EVAL_BASE + '_g' + str(g)
    task = ee.batch.Export.table.toCloudStorage(
      collection = geomSample,
      description = desc,
      bucket = setting.BUCKET,
      fileNamePrefix = setting.FOLDER + '/' + desc,
      fileFormat = 'TFRecord',
      selectors = setting.BANDS + [setting.RESPONSE]
    )
    task.start()

In [24]:
testPolys1List = testPolys1.toList(testPolys1.size())
testPolys2List = testPolys2.toList(testPolys2.size())

def Testing_task1(testPolys1, n, N, arrays, setting):
  # Export all the test data.
  testPolys1List = testPolys1.toList(testPolys1.size())
  for g in range(testPolys1.size().getInfo()):
    geomSample = ee.FeatureCollection([])
    for i in range(n):
      sample = arrays.sample(
        region = ee.Feature(testPolys1List.get(g)).geometry(), 
        scale = 30,
        numPixels = N / n,
        seed = i,
        tileScale = 8
      )
      # geomSample = geomSample.filter(ee.Filter.greaterThan('water', -1))
      geomSample = geomSample.merge(sample)

    desc = setting.TEST_BASE_1 + '_g' + str(g)
    task = ee.batch.Export.table.toCloudStorage(
      collection = geomSample,
      description = desc,
      bucket = setting.BUCKET,
      fileNamePrefix = setting.FOLDER + '/' + desc,
      fileFormat = 'TFRecord',
      selectors = setting.BANDS + [setting.RESPONSE]
    )
    task.start()

def Testing_task2(testPolys2, n, N, arrays, setting):
  testPolys2List = testPolys2.toList(testPolys2.size())
  # Export all the test2 data.
  for g in range(testPolys2.size().getInfo()):
    geomSample = ee.FeatureCollection([])
    for i in range(n):
      sample = arrays.sample(
        region = ee.Feature(testPolys2List.get(g)).geometry(), 
        scale = 30,
        numPixels = N / n,
        seed = i,
        tileScale = 8
      )
      # geomSample = geomSample.filter(ee.Filter.greaterThan('water', -1))
      geomSample = geomSample.merge(sample)

    desc = setting.TEST_BASE_2 + '_g' + str(g)
    task = ee.batch.Export.table.toCloudStorage(
      collection = geomSample,
      description = desc,
      bucket = setting.BUCKET,
      fileNamePrefix = setting.FOLDER + '/' + desc,
      fileFormat = 'TFRecord',
      selectors = setting.BANDS + [setting.RESPONSE]
    )
    task.start()

In [25]:
n = 24 # Number of shards in each polygon.
N = 240 # Total sample size in each polygon.
import tqdm.notebook as tq

pbar = tq.tqdm(total=len(list(configs_fs)))
for key in list(configs_fs):
  pbar.update(1)
  settings = configs_fs[key]
  print(settings.TEST_BASE_1)
  print(settings.TEST_BASE_2)
  # Training_task(trainingPolys, n, N, settings.sam_arr, settings)
  # Eval_task(evalPolys, n, N, settings.sam_arr, settings)
  Testing_task1(testPolys1, n, N, settings.sam_arr, settings)
  Testing_task2(testPolys2, n, N, settings.sam_arr, settings)

  0%|          | 0/32 [00:00<?, ?it/s]

test_patches_S1A_el_sl_as_1
test_patches_S1A_el_sl_as_2
test_patches_S1A_el_1
test_patches_S1A_el_2
test_patches_S1A_sl_1
test_patches_S1A_sl_2
test_patches_S1A_as_1
test_patches_S1A_as_2
test_patches_S1A_sl_as_1
test_patches_S1A_sl_as_2
test_patches_S1A_el_sl_1
test_patches_S1A_el_sl_2
test_patches_S1A_el_as_1
test_patches_S1A_el_as_2
test_patches_S1_el_sl_as_1
test_patches_S1_el_sl_as_2
test_patches_S1_el_1
test_patches_S1_el_2
test_patches_S1_sl_1
test_patches_S1_sl_2
test_patches_S1_as_1
test_patches_S1_as_2
test_patches_S1_sl_as_1
test_patches_S1_sl_as_2
test_patches_S1_el_sl_1
test_patches_S1_el_sl_2
test_patches_S1_el_as_1
test_patches_S1_el_as_2
test_patches_L8SR_el_sl_as_1
test_patches_L8SR_el_sl_as_2
test_patches_L8SR_el_1
test_patches_L8SR_el_2
test_patches_L8SR_sl_1
test_patches_L8SR_sl_2
test_patches_L8SR_as_1
test_patches_L8SR_as_2
test_patches_L8SR_sl_as_1
test_patches_L8SR_sl_as_2
test_patches_L8SR_el_sl_1
test_patches_L8SR_el_sl_2
test_patches_L8SR_el_as_1
test_patches

In [27]:
# Print all tasks.
from pprint import pprint
pprint(ee.batch.Task.list())
# for i in range(2):
  # pprint(ee.batch.Task.list()[i])

[<Task L3YPQ2AJKJHYEHHZW4N3TH6I EXPORT_FEATURES: test_patches_L8SR_S1A_2_g0 (READY)>,
 <Task 2CNJQQWJLARQTQ6NNXTOXCHN EXPORT_FEATURES: test_patches_L8SR_S1A_1_g0 (READY)>,
 <Task XZOFZH7PWB65ODHZUJMCWXUD EXPORT_FEATURES: test_patches_L8SR_S1_2_g0 (READY)>,
 <Task IGEGJ6XORKF6B2Q65SEQYAMF EXPORT_FEATURES: test_patches_L8SR_S1_1_g0 (READY)>,
 <Task DWDFU4A2W3OO6F75TL7W27ES EXPORT_FEATURES: test_patches_S1A_2_g0 (READY)>,
 <Task OBNJGGYQYOSKUQG4BOH3UGWX EXPORT_FEATURES: test_patches_S1A_1_g0 (READY)>,
 <Task JK37KHONSZQPAZAP54BMNN5T EXPORT_FEATURES: test_patches_S1_2_g0 (RUNNING)>,
 <Task WYPFDOCZDU2CWQOCHCTQKCWU EXPORT_FEATURES: test_patches_S1_1_g0 (RUNNING)>,
 <Task XCOXBF5ANUD4HXLDKKZCOJPY EXPORT_FEATURES: test_patches_L8SR_2_g0 (COMPLETED)>,
 <Task 6XQ3GIQ6HSBX7KVK7XE6Z4NG EXPORT_FEATURES: test_patches_L8SR_1_g0 (COMPLETED)>,
 <Task GK3R56BEKDQN45RTJJGX2TLC EXPORT_FEATURES: test_patches_L8SR_S1A_sl_el_as_2_g0 (RUNNING)>,
 <Task 4XGSRJH5H3WCML2ETF4PE6VU EXPORT_FEATURES: test_patches_L

In [24]:
# !earthengine task cancel all

Canceling task "L5ZJN2IZ6YKWXLD5Q6HUEHOJ"
Canceling task "CKQZODRNXSSOLMPQAMGLR7TW"
Canceling task "KL2B7QTZSZHUDEE73HROSIGS"
Canceling task "43HMONZIVDVGYVQX32LIUSV2"
Canceling task "ISZBI4CHWZGG22O5JNIEHJCZ"
Canceling task "7IOL6STFSKJXBKPAXFTFDEZY"
Canceling task "4NOWTSSAXGPOTN7AA4N2WIDS"
Canceling task "TSD4VGQ5TQ2KL7WBRA2BVWAA"
Canceling task "3VNLRTAVH47YAE3M652SD2SN"
Canceling task "WZRXSTM76TEIVYONKINE4RSW"
Canceling task "6JRDYC4DTPPOZGD4UXY7MDWT"
Canceling task "PWDQER5WCU3YOBVCFVI52TRK"


# Training data

Load the data exported from Earth Engine into a `tf.data.Dataset`.  The following are helper functions for that.

In [86]:
for _, key in enumerate(list(configs_fs)):
  conf = configs_fs[key]
  preproc = preprocessing.Preprocessor(conf)
  training = preproc.get_training_dataset()
  evaluation = preproc.get_eval_dataset()
  test_1 = preproc.get_test_dataset(config.TEST_BASE_1)
  test_2 = preproc.get_test_dataset(config.TEST_BASE_2)

gs://geebucketwater/fs_TH_Cnn_S1A_el_sl_as/training_patches_S1A_el_sl_as*
gs://geebucketwater/fs_TH_Cnn_S1A_el/training_patches_S1A_el*
gs://geebucketwater/fs_TH_Cnn_S1A_sl/training_patches_S1A_sl*
gs://geebucketwater/fs_TH_Cnn_S1A_as/training_patches_S1A_as*
gs://geebucketwater/fs_TH_Cnn_S1A_sl_as/training_patches_S1A_sl_as*
gs://geebucketwater/fs_TH_Cnn_S1A_el_sl/training_patches_S1A_el_sl*
gs://geebucketwater/fs_TH_Cnn_S1A_el_as/training_patches_S1A_el_as*
gs://geebucketwater/fs_TH_Cnn_S1_el_sl_as/training_patches_S1_el_sl_as*
gs://geebucketwater/fs_TH_Cnn_S1_el/training_patches_S1_el*
gs://geebucketwater/fs_TH_Cnn_S1_sl/training_patches_S1_sl*
gs://geebucketwater/fs_TH_Cnn_S1_as/training_patches_S1_as*
gs://geebucketwater/fs_TH_Cnn_S1_sl_as/training_patches_S1_sl_as*
gs://geebucketwater/fs_TH_Cnn_S1_el_sl/training_patches_S1_el_sl*
gs://geebucketwater/fs_TH_Cnn_S1_el_as/training_patches_S1_el_as*
gs://geebucketwater/fs_TH_Cnn_L8SR_el_sl_as/training_patches_L8SR_el_sl_as*
gs://geebu

# Export Images

In [14]:
conf = configs_fs["L8SR_sl_as"]
preproc = preprocessing.Preprocessor(conf)
MODEL_DIR = 'gs://' + conf.BUCKET + "/" + conf.FOLDER + "/Models/" + conf.PROJECT_TITLE + "_EPOCHS_" + str(10)
model_custom = tf.keras.models.load_model(MODEL_DIR, custom_objects={'f1':metrics_.f1, "custom_accuracy": metrics_.custom_accuracy})



In [17]:
def doExport(out_image_base, kernel_buffer, region, setting):
  """Run the image export task.  Block until complete.
  """
  task = ee.batch.Export.image.toCloudStorage(
    image = setting.image.select(setting.BANDS),
    description = out_image_base,
    bucket = setting.BUCKET,
    fileNamePrefix = setting.FOLDER + '/' + out_image_base,
    region = region.getInfo()['coordinates'],
    scale = 30,
    fileFormat = 'TFRecord',
    maxPixels = 1e10,
    formatOptions = {
      'patchDimensions': setting.KERNEL_SHAPE,
      'kernelSize': kernel_buffer,
      'compressed': True,
      'maxFileSize': 104857600
    }
  )
  task.start()

  # Block until the task completes.
  print('Running image export to Cloud Storage...')
  import time
  while task.active():
    time.sleep(30)

  # Error condition
  if task.status()['state'] != 'COMPLETED':
    print('Error with image export.')
  else:
    print('Image export completed.')

In [24]:
import json
from pprint import pprint
import numpy as np
import tqdm.notebook as tq

def LoadImage(out_image_base, user_folder, kernel_buffer, setting):
  print('Looking for TFRecord files...')

  # Get a list of all the files in the output bucket.
  filesList = !gsutil ls 'gs://'{setting.BUCKET}'/'{setting.FOLDER}

  # Get only the files generated by the image export.
  exportFilesList = [s for s in filesList if out_image_base in s]

  # Get the list of image files and the JSON mixer file.
  imageFilesList = []
  jsonFile = None
  for f in exportFilesList:
    if f.endswith('.tfrecord.gz'):
      imageFilesList.append(f)
    elif f.endswith('.json'):
      jsonFile = f

  # Make sure the files are in the right order.
  imageFilesList.sort()

  pprint(imageFilesList)
  print(jsonFile)

  # Load the contents of the mixer file to a JSON object.
  jsonText = !gsutil cat {jsonFile}
  # Get a single string w/ newlines from the IPython.utils.text.SList
  mixer = json.loads(jsonText.nlstr)
  pprint(mixer)
  patches = mixer['totalPatches']

  # Get set up for prediction.
  x_buffer = int(kernel_buffer[0] / 2)
  y_buffer = int(kernel_buffer[1] / 2)

  buffered_shape = [
      setting.KERNEL_SHAPE[0] + kernel_buffer[0],
      setting.KERNEL_SHAPE[1] + kernel_buffer[1]]

  imageColumns = [
    tf.io.FixedLenFeature(shape=buffered_shape, dtype=tf.float32) 
      for k in setting.BANDS
  ]

  imageFeaturesDict = dict(zip(setting.BANDS, imageColumns))

  def parse_image(example_proto):
    return tf.io.parse_single_example(example_proto, imageFeaturesDict)

  def toTupleImage(inputs):
    inputsList = [inputs.get(key) for key in setting.BANDS]
    stacked = tf.stack(inputsList, axis=0)
    stacked = tf.transpose(stacked, [1, 2, 0])
    return stacked

   # Create a dataset from the TFRecord file(s) in Cloud Storage.
  imageDataset = tf.data.TFRecordDataset(imageFilesList, compression_type='GZIP')
  imageDataset = imageDataset.map(parse_image, num_parallel_calls=5)
  imageDataset = imageDataset.map(toTupleImage).batch(1)
  return imageDataset, patches, x_buffer, y_buffer, jsonFile

def predictionSingleinput(model, imageDataset, patches):
  print('Running predictions...')
  predictions = model.predict(imageDataset, steps=patches, verbose=1)
  return predictions

def predictionMultipleinput(model, imageDataset, patches):
  print('Running predictions...')
  predictions = []
  pbar = tq.tqdm(total=patches)
  for data in imageDataset:
    pbar.update(1)
    x1, x2 = tf.split(data, [6,3], 3)
    predictions.append(model_custom.predict([x1, x2], verbose=0))
  return predictions

def uploadToGEEAsset(x_buffer, y_buffer, predictions, out_image_base, jsonFile, suffix, setting, multiview=False):
  print('Writing predictions...')
  out_image_file = 'gs://' + setting.BUCKET + '/' + setting.FOLDER + '/' + out_image_base + '.TFRecord'
  writer = tf.io.TFRecordWriter(out_image_file)
  patches = 0
  for predictionPatch in predictions:
    if multiview == True:
      predictionPatch = predictionPatch[0]
    print('Writing patch ' + str(patches) + '...')
    predictionPatch = predictionPatch[
        x_buffer:x_buffer+setting.KERNEL_SIZE, y_buffer:y_buffer+setting.KERNEL_SIZE]
    predictionPatch = np.argmax(predictionPatch, -1)
    example = tf.train.Example(
      features=tf.train.Features(
        feature={
          'prediction': tf.train.Feature(
              float_list=tf.train.FloatList(
                  value=predictionPatch.flatten()))
        }
      )
    )
    # Write the example.
    writer.write(example.SerializeToString())
    patches += 1

  writer.close()

  # Start the upload.
  out_image_asset = user_folder + '/' + out_image_base + suffix
  !earthengine upload image --asset_id={out_image_asset} {out_image_file} {jsonFile}

def doPrediction_featurestack(out_image_base, user_folder, kernel_buffer, model, suffix, setting):
  """Perform inference on exported imagery, upload to Earth Engine.
  """
  imageDataset, patches, x_buffer, y_buffer, jsonFile = LoadImage(out_image_base, user_folder, kernel_buffer, setting)
  predictions = predictionSingleinput(model, imageDataset, patches)
  uploadToGEEAsset(x_buffer, y_buffer, predictions, out_image_base, jsonFile, suffix, setting, False)
  return

In [21]:
# Output assets folder: YOUR FOLDER
user_folder = 'users/mewchayutaphong' # INSERT YOUR FOLDER HERE.

# Half this will extend on the sides of each patch.
kernel_buffer = [128, 128]

th_image_base = f'Thai_with_{conf.PROJECT_TITLE}'
th_region = ee.Geometry.BBox(100.30632852425321, 17.709225431372587, 100.74128946175321, 18.20417872756825)

tb_image_base = f'Tibet_with_{conf.PROJECT_TITLE}'
tb_region = ee.Geometry.BBox(83.7866460908476, 31.02991423438545, 84.4782964814726, 31.623526673040716)


In [23]:
# Run the export.
doExport(th_image_base, kernel_buffer, th_region, conf)

Running image export to Cloud Storage...
Image export completed.


In [27]:
doExport(tb_image_base, kernel_buffer, tb_region, conf)

Running image export to Cloud Storage...
Image export completed.


In [26]:
doPrediction_featurestack(th_image_base, user_folder, kernel_buffer, model_custom, "_epochs_10_", conf)

Looking for TFRecord files...
['gs://geebucketwater/fs_TH_Cnn_L8SR_sl_as/Thai_with_L8SR_sl_as00000.tfrecord.gz',
 'gs://geebucketwater/fs_TH_Cnn_L8SR_sl_as/Thai_with_L8SR_sl_as00001.tfrecord.gz']
gs://geebucketwater/fs_TH_Cnn_L8SR_sl_as/Thai_with_L8SR_sl_asmixer.json
{'patchDimensions': [256, 256],
 'patchesPerRow': 6,
 'projection': {'affine': {'doubleMatrix': [0.00026949458523585647,
                                            0.0,
                                            100.30615411937102,
                                            0.0,
                                            -0.00026949458523585647,
                                            18.204359232682105]},
                'crs': 'EPSG:4326'},
 'totalPatches': 42}
Running predictions...
Writing predictions...
Writing patch 0...
Writing patch 1...
Writing patch 2...
Writing patch 3...
Writing patch 4...
Writing patch 5...
Writing patch 6...
Writing patch 7...
Writing patch 8...
Writing patch 9...
Writing patch 10...


In [30]:
doPrediction_featurestack(tb_image_base, user_folder, kernel_buffer, model_custom, "_epochs_10_", conf)

Looking for TFRecord files...
['gs://geebucketwater/fs_TH_Cnn_L8SR_sl_as/Tibet_with_L8SR_sl_as00000.tfrecord.gz',
 'gs://geebucketwater/fs_TH_Cnn_L8SR_sl_as/Tibet_with_L8SR_sl_as00001.tfrecord.gz',
 'gs://geebucketwater/fs_TH_Cnn_L8SR_sl_as/Tibet_with_L8SR_sl_as00002.tfrecord.gz',
 'gs://geebucketwater/fs_TH_Cnn_L8SR_sl_as/Tibet_with_L8SR_sl_as00003.tfrecord.gz']
gs://geebucketwater/fs_TH_Cnn_L8SR_sl_as/Tibet_with_L8SR_sl_asmixer.json
{'patchDimensions': [256, 256],
 'patchesPerRow': 10,
 'projection': {'affine': {'doubleMatrix': [0.00026949458523585647,
                                            0.0,
                                            83.78640553899825,
                                            0.0,
                                            -0.00026949458523585647,
                                            31.624111599086813]},
                'crs': 'EPSG:4326'},
 'totalPatches': 80}
Running predictions...
Writing predictions...
Writing patch 0...
Writing patch 1...
W