<a href="https://colab.research.google.com/github/Allen123321/DEMO-DL/blob/master/GEE_demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This is an Earth Engine <> TensorFlow demonstration notebook. Specifically, this notebook shows:

- Exporting training/testing data from Earth Engine in TFRecord format.
- Preparing the data for use in a TensorFlow model.
- Training and validating a simple model (Keras Sequential neural network) in TensorFlow.
- Making predictions on image data exported from Earth Engine in TFRecord format.
- Ingesting classified image data to Earth Engine in TFRecord format.

Import software libraries and/or authenticate as necessary.

In [1]:
from google.colab import auth
auth.authenticate_user()

In [2]:
import ee
ee.Authenticate()
ee.Initialize()

To authorize access needed by Earth Engine, open the following URL in a web browser and follow the instructions. If the web browser does not start automatically, please manually browse the URL below.

    https://accounts.google.com/o/oauth2/auth?client_id=517222506229-vsmmajv00ul0bs7p89v5m89qs8eb9359.apps.googleusercontent.com&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fearthengine+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdevstorage.full_control&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&response_type=code&code_challenge=LBsmhhCtUpwuLT0DS2xuaMDPquA0Iq13z0L2EivPRmY&code_challenge_method=S256

The authorization workflow will generate a code, which you should paste in the box below. 
Enter verification code: 4/1AY0e-g7NroXfz8K1Nm3fQVAaEy9lGCyJzQYOkzVVTuZ3hKTdWxSKPtlz1v4

Successfully saved authorization token.


In [3]:
import tensorflow as tf
print(tf.__version__)

2.4.0


In [4]:
import folium
print(folium.__version__)

0.8.3


## Define variables

In [15]:
# Your Earth Engine username.  This is used to import a classified image
# into your Earth Engine assets folder.
USER_NAME = 'jiashuxu1'

# Cloud Storage bucket into which training, testing and prediction 
# datasets will be written.  You must be able to write into this bucket.
OUTPUT_BUCKET = 'geetest-allen'

# Use Landsat 8 surface reflectance data for predictors.
L8SR = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR')
# Use these bands for prediction.
BANDS = ['B2', 'B3', 'B4', 'B5', 'B6', 'B7']

# This is a training/testing dataset of points with known land cover labels.
LABEL_DATA = ee.FeatureCollection('projects/google/demo_landcover_labels')
# The labels, consecutive integer indices starting from zero, are stored in
# this property, set on each point.
LABEL = 'landcover'
# Number of label values, i.e. number of classes in the classification.
N_CLASSES = 3

# These names are used to specify properties in the export of
# training/testing data and to define the mapping between names and data
# when reading into TensorFlow datasets.
FEATURE_NAMES = list(BANDS)
FEATURE_NAMES.append(LABEL)

# File names for the training and testing datasets.  These TFRecord files
# will be exported from Earth Engine into the Cloud Storage bucket.
TRAIN_FILE_PREFIX = 'Training_demo'
TEST_FILE_PREFIX = 'Testing_demo'
file_extension = '.tfrecord.gz'
TRAIN_FILE_PATH = 'gs://' + OUTPUT_BUCKET + '/' + TRAIN_FILE_PREFIX + file_extension
TEST_FILE_PATH = 'gs://' + OUTPUT_BUCKET + '/' + TEST_FILE_PREFIX + file_extension

# File name for the prediction (image) dataset.  The trained model will read
# this dataset and make predictions in each pixel.
IMAGE_FILE_PREFIX = 'Image_pixel_demo_'

# The output path for the classified image (i.e. predictions) TFRecord file.
OUTPUT_IMAGE_FILE = 'gs://' + OUTPUT_BUCKET + '/Classified_pixel_demo.TFRecord'
# Export imagery in this region.
EXPORT_REGION = ee.Geometry.Rectangle([-122.7, 37.3, -121.8, 38.00])
# The name of the Earth Engine asset to be created by importing
# the classified image from the TFRecord file in Cloud Storage.
OUTPUT_ASSET_ID = 'users/' + USER_NAME + '/Classified_pixel_demo'

## Get Training and Testing data from Earth Engine
To get data for a classification model of three classes (bare, vegetation, water), we need labels and the value of predictor variables for each labeled example. We've already generated some labels in Earth Engine. Specifically, these are visually interpreted points labeled "bare," "vegetation," or "water" for a very simple classification demo (example script). For predictor variables, we'll use Landsat 8 surface reflectance imagery, bands 2-7.

### Prepare Landsat 8 imagery
First, make a cloud-masked median composite of Landsat 8 surface reflectance imagery from 2018. Check the composite by visualizing with folium.

In [16]:
# Cloud masking function.
def maskL8sr(image):
  cloudShadowBitMask = ee.Number(2).pow(3).int()
  cloudsBitMask = ee.Number(2).pow(5).int()
  qa = image.select('pixel_qa')
  mask = qa.bitwiseAnd(cloudShadowBitMask).eq(0).And(
    qa.bitwiseAnd(cloudsBitMask).eq(0))
  return image.updateMask(mask).select(BANDS).divide(10000)

# The image input data is a 2018 cloud-masked median composite.
image = L8SR.filterDate('2018-01-01', '2018-12-31').map(maskL8sr).median()

# Use folium to visualize the imagery.
mapid = image.getMapId({'bands': ['B4', 'B3', 'B2'], 'min': 0, 'max': 0.3})
map = folium.Map(location=[38., -122.5])

folium.TileLayer(
    tiles=mapid['tile_fetcher'].url_format,
    attr='Map Data &copy; <a href="https://earthengine.google.com/">Google Earth Engine</a>',
    overlay=True,
    name='median composite',
  ).add_to(map)
map.add_child(folium.LayerControl())
map
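
To make the bit tests in `maskL8sr` concrete, here is a minimal plain-Python sketch of the same logic. It assumes, as the function above does, that bit 3 of `pixel_qa` flags cloud shadow and bit 5 flags cloud. The QA values are hypothetical and chosen only to exercise each case.

```python
# Bit masks equivalent to ee.Number(2).pow(3).int() and ee.Number(2).pow(5).int().
CLOUD_SHADOW_BIT_MASK = 1 << 3  # 8
CLOUDS_BIT_MASK = 1 << 5        # 32


def is_clear(qa_value):
    """Return True when neither the cloud-shadow bit nor the cloud bit is set."""
    no_shadow = (qa_value & CLOUD_SHADOW_BIT_MASK) == 0
    no_cloud = (qa_value & CLOUDS_BIT_MASK) == 0
    return no_shadow and no_cloud


# Hypothetical QA values (not taken from a real scene).
for qa in (0b000010, 0b001000, 0b100000, 0b101000):
    print(format(qa, '06b'), 'clear' if is_clear(qa) else 'masked')
```

Only the first value passes, because every other value has bit 3, bit 5, or both set.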

### Add pixel values of the composite to labeled points
Some training labels have already been collected for you. Load the labeled points from an existing Earth Engine asset. Each point in this table has a property called landcover that stores the label, encoded as an integer. Here we overlay the points on imagery to get predictor variables along with labels.

In [17]:
# Sample the image at the points and add a random column.
sample = image.sampleRegions(
  collection=LABEL_DATA, properties=[LABEL], scale=30).randomColumn()

# Partition the sample approximately 70-30.
training = sample.filter(ee.Filter.lt('random', 0.7))
testing = sample.filter(ee.Filter.gte('random', 0.7))

from pprint import pprint

# Print the first couple points to verify.
pprint({'training': training.first().getInfo()})
pprint({'testing': testing.first().getInfo()})

{'training': {'geometry': None,
              'id': '00009f65e3c9ae02b84e_0',
              'properties': {'B2': 0.05220000073313713,
                             'B3': 0.062049999833106995,
                             'B4': 0.03660000115633011,
                             'B5': 0.01140000019222498,
                             'B6': 0.006800000090152025,
                             'B7': 0.005249999929219484,
                             'landcover': 2,
                             'random': 0.39322176058184644},
              'type': 'Feature'}}
{'testing': {'geometry': None,
             'id': '0000642db5938d967908_0',
             'properties': {'B2': 0.05829999968409538,
                            'B3': 0.08560000360012054,
                            'B4': 0.11620000004768372,
                            'B5': 0.24390000104904175,
                            'B6': 0.29600000381469727,
                            'B7': 0.19820000231266022,
                            'landcove
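
The 70-30 partition above relies on `randomColumn()` attaching a uniform random number in [0, 1) to each feature and then filtering on that value. A rough local analogue in plain Python might look like the sketch below. The rows, the seed and the helper name `split_by_random_column` are assumptions made for illustration.

```python
import random


def split_by_random_column(rows, threshold=0.7, seed=0):
    """Attach a 'random' value to each row and split on a threshold."""
    rng = random.Random(seed)
    tagged = [dict(row, random=rng.random()) for row in rows]
    training = [r for r in tagged if r['random'] < threshold]
    testing = [r for r in tagged if r['random'] >= threshold]
    return training, testing


# Hypothetical labeled rows; real rows would also carry the band values.
rows = [{'landcover': i % 3} for i in range(100)]
training, testing = split_by_random_column(rows)
print(len(training), len(testing))
```

Because the split is driven by random values, the proportions are only approximately 70-30, which matches the wording of the comment in the cell above.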

### Export the training and testing data
Now that there's training and testing data in Earth Engine and you've inspected a couple examples to ensure that the information you need is present, it's time to materialize the datasets in a place where the TensorFlow model has access to them. You can do that by exporting the training and testing datasets to tables in TFRecord format (learn more about TFRecord format) in your Cloud Storage bucket.

In [18]:
# Make sure you can see the output bucket.  You must have write access.
print('Found Cloud Storage bucket.' if tf.io.gfile.exists('gs://' + OUTPUT_BUCKET)
    else 'Cannot find output Cloud Storage bucket.')

Found Cloud Storage bucket.


In [25]:
# Create the tasks.
training_task = ee.batch.Export.table.toCloudStorage(
  collection=training,
  description='Training Export',
  fileNamePrefix=TRAIN_FILE_PREFIX,
  bucket=OUTPUT_BUCKET,
  fileFormat='TFRecord',
  selectors=FEATURE_NAMES)

testing_task = ee.batch.Export.table.toCloudStorage(
  collection=testing,
  description='Testing Export',
  fileNamePrefix=TEST_FILE_PREFIX,
  bucket=OUTPUT_BUCKET,
  fileFormat='TFRecord',
  selectors=FEATURE_NAMES)

In [26]:
# Start the tasks.
training_task.start()
testing_task.start()

## Monitor task progress
You can see all your Earth Engine tasks by listing them. Make sure the training and testing tasks are completed before continuing.

In [27]:
# Print all tasks.
pprint(ee.batch.Task.list())

[<Task EXPORT_FEATURES: Testing Export (READY)>,
 <Task EXPORT_FEATURES: Training Export (READY)>,
 <Task EXPORT_FEATURES: Testing Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Training Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Testing Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Training Export (COMPLETED)>,
 <Task EXPORT_IMAGE: Image Export (RUNNING)>,
 <Task EXPORT_FEATURES: Testing Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Training Export (COMPLETED)>,
 <Task EXPORT_IMAGE: Image Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Testing Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Training Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Testing Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Training Export (COMPLETED)>]


In [28]:
print('Found training file.' if tf.io.gfile.exists(TRAIN_FILE_PATH) 
    else 'No training file found.')
print('Found testing file.' if tf.io.gfile.exists(TEST_FILE_PATH) 
    else 'No testing file found.')

Found training file.
Found testing file.


## Export the imagery
You can also export imagery using TFRecord format. Specifically, export whatever imagery you want to be classified by the trained model into the output Cloud Storage bucket.

In [29]:
# Specify patch and file dimensions.
image_export_options = {
  'patchDimensions': [256, 256],
  'maxFileSize': 104857600,
  'compressed': True
}

# Setup the task.
image_task = ee.batch.Export.image.toCloudStorage(
  image=image,
  description='Image Export',
  fileNamePrefix=IMAGE_FILE_PREFIX,
  bucket=OUTPUT_BUCKET,
  scale=30,
  fileFormat='TFRecord',
  region=EXPORT_REGION.toGeoJSON()['coordinates'],
  formatOptions=image_export_options,
)

In [30]:
# Start the task.
image_task.start()

In [31]:
# Print all tasks.
pprint(ee.batch.Task.list())

[<Task EXPORT_IMAGE: Image Export (READY)>,
 <Task EXPORT_FEATURES: Testing Export (READY)>,
 <Task EXPORT_FEATURES: Training Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Testing Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Training Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Testing Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Training Export (COMPLETED)>,
 <Task EXPORT_IMAGE: Image Export (RUNNING)>,
 <Task EXPORT_FEATURES: Testing Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Training Export (COMPLETED)>,
 <Task EXPORT_IMAGE: Image Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Testing Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Training Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Testing Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Training Export (COMPLETED)>]


It's also possible to monitor an individual task. Here we poll the task until it's done. If you do this, please put a sleep() in the loop to avoid making too many requests. Note that this will block until complete (you can always halt the execution of this cell).

In [32]:
import time

while image_task.active():
  print('Polling for task (id: {}).'.format(image_task.id))
  time.sleep(30)
print('Done with image export.')

Polling for task (id: 7R2M7DYX2OYJBK5W7N73BU76).
Polling for task (id: 7R2M7DYX2OYJBK5W7N73BU76).
Polling for task (id: 7R2M7DYX2OYJBK5W7N73BU76).
Polling for task (id: 7R2M7DYX2OYJBK5W7N73BU76).
Polling for task (id: 7R2M7DYX2OYJBK5W7N73BU76).
Polling for task (id: 7R2M7DYX2OYJBK5W7N73BU76).
Polling for task (id: 7R2M7DYX2OYJBK5W7N73BU76).
Polling for task (id: 7R2M7DYX2OYJBK5W7N73BU76).
Polling for task (id: 7R2M7DYX2OYJBK5W7N73BU76).
Polling for task (id: 7R2M7DYX2OYJBK5W7N73BU76).
Polling for task (id: 7R2M7DYX2OYJBK5W7N73BU76).
Polling for task (id: 7R2M7DYX2OYJBK5W7N73BU76).
Polling for task (id: 7R2M7DYX2OYJBK5W7N73BU76).
Polling for task (id: 7R2M7DYX2OYJBK5W7N73BU76).
Done with image export.
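
The loop above blocks indefinitely if a task never leaves the active state. One possible variation, sketched here under the assumption that each task object exposes the same `active()` method used above, adds a timeout and accepts several tasks at once. `wait_for_tasks` and `FakeTask` are hypothetical names introduced for this example; `FakeTask` only stands in for a real Earth Engine task so the sketch can run on its own.

```python
import time


def wait_for_tasks(tasks, poll_seconds=30, timeout_seconds=3600, sleep=time.sleep):
    """Block until no task reports active(), or until the timeout elapses.

    Returns True if all tasks finished, False on timeout.
    """
    waited = 0
    while any(task.active() for task in tasks):
        if waited >= timeout_seconds:
            return False
        sleep(poll_seconds)
        waited += poll_seconds
    return True


class FakeTask:
    """Hypothetical stand-in that stays active for a fixed number of polls."""

    def __init__(self, polls_until_done):
        self.remaining = polls_until_done

    def active(self):
        self.remaining -= 1
        return self.remaining >= 0


# Demonstration with fake tasks and a no-op sleep so it returns immediately.
finished = wait_for_tasks([FakeTask(2), FakeTask(3)], sleep=lambda s: None)
print('All tasks finished.' if finished else 'Timed out.')
```

With the notebook's own objects this could be called as `wait_for_tasks([training_task, testing_task, image_task])`.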


### Data preparation and pre-processing
Read data from the TFRecord file into a tf.data.Dataset. Pre-process the dataset to get it into a suitable format for input to the model.

#### Read into a tf.data.Dataset
Here we read a file in Cloud Storage into a tf.data.Dataset (these TensorFlow docs explain more about reading data into a Dataset). Check that you can read examples from the file. The purpose is only to confirm that the file can be read without an error; the raw content is not necessarily human readable.

In [33]:
# Create a dataset from the TFRecord file in Cloud Storage.
train_dataset = tf.data.TFRecordDataset(TRAIN_FILE_PATH, compression_type='GZIP')
# Print the first record to check.
print(next(iter(train_dataset)))

tf.Tensor(b'\nw\n\x0e\n\x02B2\x12\x08\x12\x06\n\x04\xab\xcfU=\n\x0e\n\x02B3\x12\x08\x12\x06\n\x04$(~=\n\x0e\n\x02B4\x12\x08\x12\x06\n\x04\xe2\xe9\x15=\n\x0e\n\x02B5\x12\x08\x12\x06\n\x04\x11\xc7:<\n\x0e\n\x02B6\x12\x08\x12\x06\n\x04\x89\xd2\xde;\n\x0e\n\x02B7\x12\x08\x12\x06\n\x041\x08\xac;\n\x15\n\tlandcover\x12\x08\x12\x06\n\x04\x00\x00\x00@', shape=(), dtype=string)
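
The printed record is a serialized `tf.train.Example`. As a sanity check, the values can be decoded with the standard library alone. This sketch assumes that the four bytes ending each feature entry are a little-endian float32, which is how protocol buffers store floats. The two byte strings are copied from the output above.

```python
import struct

# Byte strings copied from the record printed above.
b2_bytes = b'\xab\xcfU='
landcover_bytes = b'\x00\x00\x00@'

# '<f' means a little-endian 32-bit float.
b2_value = struct.unpack('<f', b2_bytes)[0]
landcover_value = struct.unpack('<f', landcover_bytes)[0]

print(round(b2_value, 4))  # 0.0522
print(landcover_value)     # 2.0
```

The label comes back as the float `2.0` rather than an integer, which is consistent with numeric data being exported from Earth Engine as float32.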


### Define the structure of your data
For parsing the exported TFRecord files, features_dict is a mapping between feature names (recall that FEATURE_NAMES contains the band and label names) and float32 tf.io.FixedLenFeature objects. This mapping tells TensorFlow how to read data in a TFRecord file into tensors. All numeric data exported from Earth Engine is exported as float32, which is why every entry uses that type.

(Note: features in the TensorFlow context (i.e. tf.train.Feature) are not to be confused with Earth Engine features (i.e. ee.Feature), where the former is a protocol message type for serialized data input to the model and the latter is a geometry-based geographic data structure.)

In [34]:
# List of fixed-length features, all of which are float32.
columns = [
  tf.io.FixedLenFeature(shape=[1], dtype=tf.float32) for k in FEATURE_NAMES
]

# Dictionary with names as keys, features as values.
features_dict = dict(zip(FEATURE_NAMES, columns))

pprint(features_dict)

{'B2': FixedLenFeature(shape=[1], dtype=tf.float32, default_value=None),
 'B3': FixedLenFeature(shape=[1], dtype=tf.float32, default_value=None),
 'B4': FixedLenFeature(shape=[1], dtype=tf.float32, default_value=None),
 'B5': FixedLenFeature(shape=[1], dtype=tf.float32, default_value=None),
 'B6': FixedLenFeature(shape=[1], dtype=tf.float32, default_value=None),
 'B7': FixedLenFeature(shape=[1], dtype=tf.float32, default_value=None),
 'landcover': FixedLenFeature(shape=[1], dtype=tf.float32, default_value=None)}


### Parse the dataset
Now we need a parsing function for the data in the TFRecord files. Each record holds the predictor values followed by the class label. The parsing function reads a serialized Example proto into a dictionary in which the keys are the feature names and the values are the tensors storing that example's feature values, then splits off the label so that the predictors and the label are returned separately. (These TensorFlow docs explain more about reading Example protos from TFRecord files).

In [35]:
def parse_tfrecord(example_proto):
  """The parsing function.

  Read a serialized example into the structure defined by features_dict.

  Args:
    example_proto: a serialized Example.

  Returns:
    A tuple of the predictors dictionary and the label, cast to an `int32`.
  """
  parsed_features = tf.io.parse_single_example(example_proto, features_dict)
  labels = parsed_features.pop(LABEL)
  return parsed_features, tf.cast(labels, tf.int32)

# Map the function over the dataset.
parsed_dataset = train_dataset.map(parse_tfrecord, num_parallel_calls=5)

# Print the first parsed record to check.
pprint(next(iter(parsed_dataset)))

({'B2': <tf.Tensor: shape=(1,), dtype=float32, numpy=array([0.0522], dtype=float32)>,
  'B3': <tf.Tensor: shape=(1,), dtype=float32, numpy=array([0.06205], dtype=float32)>,
  'B4': <tf.Tensor: shape=(1,), dtype=float32, numpy=array([0.0366], dtype=float32)>,
  'B5': <tf.Tensor: shape=(1,), dtype=float32, numpy=array([0.0114], dtype=float32)>,
  'B6': <tf.Tensor: shape=(1,), dtype=float32, numpy=array([0.0068], dtype=float32)>,
  'B7': <tf.Tensor: shape=(1,), dtype=float32, numpy=array([0.00525], dtype=float32)>},
 <tf.Tensor: shape=(1,), dtype=int32, numpy=array([2], dtype=int32)>)


### Create additional features
Another thing we might want to do as part of the input process is to create new features, for example NDVI, a vegetation index computed from reflectance in two spectral bands. Here are some helper functions for that.

In [36]:
def normalized_difference(a, b):
  """Compute normalized difference of two inputs.

  Compute (a - b) / (a + b).  If the denominator is zero, add a small delta.

  Args:
    a: an input tensor with shape=[1]
    b: an input tensor with shape=[1]

  Returns:
    The normalized difference as a tensor.
  """
  nd = (a - b) / (a + b)
  nd_inf = (a - b) / (a + b + 0.000001)
  return tf.where(tf.math.is_finite(nd), nd, nd_inf)

def add_NDVI(features, label):
  """Add NDVI to the dataset.
  Args:
    features: a dictionary of input tensors keyed by feature name.
    label: the target label

  Returns:
    A tuple of the input dictionary with an NDVI tensor added and the label.
  """
  features['NDVI'] = normalized_difference(features['B5'], features['B4'])
  return features, label
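
As a worked example of the formula, the following plain-Python mirror of `normalized_difference` computes NDVI for single values. The inputs are the rounded B5 (near infrared) and B4 (red) values of the two sample points printed earlier. The function name `normalized_difference_scalar` is introduced only for this sketch.

```python
def normalized_difference_scalar(a, b, delta=0.000001):
    """Plain-Python mirror of normalized_difference for single values."""
    denominator = a + b
    if denominator == 0:
        denominator += delta  # avoid division by zero, as the tensor version does
    return (a - b) / denominator


# B5 and B4 from the two sample points printed earlier (rounded).
print(normalized_difference_scalar(0.0114, 0.0366))  # training sample, landcover 2
print(normalized_difference_scalar(0.2439, 0.1162))  # testing sample
print(normalized_difference_scalar(0.0, 0.0))        # zero denominator
```

The first point gives roughly -0.53 and the second roughly 0.35. A negative NDVI is what one would expect for water, which fits label 2 if the classes are indexed in the order bare, vegetation, water.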

In [37]:
from tensorflow import keras

# Add NDVI.
input_dataset = parsed_dataset.map(add_NDVI)

# Keras requires inputs as a tuple.  Note that the inputs must be in the
# right shape.  Also note that to use the categorical_crossentropy loss,
# the label needs to be turned into a one-hot vector.
def to_tuple(inputs, label):
  return (tf.transpose(list(inputs.values())),
          tf.one_hot(indices=label, depth=N_CLASSES))

# Map the to_tuple function and batch (no shuffling is applied here).
input_dataset = input_dataset.map(to_tuple).batch(8)

# Define the layers in the model.
model = tf.keras.models.Sequential([
  tf.keras.layers.Dense(64, activation=tf.nn.relu),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(N_CLASSES, activation=tf.nn.softmax)
])

# Compile the model with the specified loss function.
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Fit the model to the training data.
model.fit(x=input_dataset, epochs=10)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7fbc401edef0>
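
The comment in `to_tuple` notes that the `categorical_crossentropy` loss needs one-hot labels. A minimal plain-Python sketch of what that means for a single example is shown below. It ignores the clipping Keras applies for numerical stability, and the predicted probabilities are hypothetical.

```python
import math

N_CLASSES = 3


def one_hot(label, depth=N_CLASSES):
    """Return a list with 1.0 at index `label` and 0.0 elsewhere."""
    return [1.0 if i == label else 0.0 for i in range(depth)]


def categorical_crossentropy(target, predicted):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)


target = one_hot(2)
# Hypothetical softmax output for one example.
predicted = [0.1, 0.2, 0.7]
print(target)
print(categorical_crossentropy(target, predicted))
```

With a one-hot target the loss reduces to the negative log of the probability assigned to the true class, so a confident correct prediction gives a loss near zero.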

In [38]:
test_dataset = (
  tf.data.TFRecordDataset(TEST_FILE_PATH, compression_type='GZIP')
    .map(parse_tfrecord, num_parallel_calls=5)
    .map(add_NDVI)
    .map(to_tuple)
    .batch(1))

model.evaluate(test_dataset)



[0.6407201886177063, 0.9583333134651184]

In [39]:
# Get a list of all the files in the output bucket.
files_list = !gsutil ls 'gs://'{OUTPUT_BUCKET}
# Get only the files generated by the image export.
exported_files_list = [s for s in files_list if IMAGE_FILE_PREFIX in s]

# Get the list of image files and the JSON mixer file.
image_files_list = []
json_file = None
for f in exported_files_list:
  if f.endswith('.tfrecord.gz'):
    image_files_list.append(f)
  elif f.endswith('.json'):
    json_file = f

# Make sure the files are in the right order.
image_files_list.sort()

pprint(image_files_list)
print(json_file)

['gs://geetest-allen/Image_pixel_demo_00000.tfrecord.gz',
 'gs://geetest-allen/Image_pixel_demo_00001.tfrecord.gz']
gs://geetest-allen/Image_pixel_demo_mixer.json


In [40]:
import json

# Load the contents of the mixer file to a JSON object.
json_text = !gsutil cat {json_file}
# Get a single string w/ newlines from the IPython.utils.text.SList
mixer = json.loads(json_text.nlstr)
pprint(mixer)

{'patchDimensions': [256, 256],
 'patchesPerRow': 13,
 'projection': {'affine': {'doubleMatrix': [0.00026949458523585647,
                                            0.0,
                                            -122.70007617412975,
                                            0.0,
                                            -0.00026949458523585647,
                                            38.00089247493765]},
                'crs': 'EPSG:4326'},
 'totalPatches': 130}
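
The mixer values are enough to work out how the exported patches tile the region. The sketch below copies the values from the output above and makes two assumptions: that patches are stored in row-major order, and that the affine `doubleMatrix` is ordered as x scale, x shear, x translation, y shear, y scale, y translation. The names `mixer_values` and `patch_origin` are introduced only for this example.

```python
# Values copied from the mixer output above.
mixer_values = {
    'patchDimensions': [256, 256],
    'patchesPerRow': 13,
    'totalPatches': 130,
    'projection': {
        'affine': {'doubleMatrix': [0.00026949458523585647, 0.0,
                                    -122.70007617412975, 0.0,
                                    -0.00026949458523585647,
                                    38.00089247493765]},
        'crs': 'EPSG:4326',
    },
}

patch_width, patch_height = mixer_values['patchDimensions']
patch_rows = mixer_values['totalPatches'] // mixer_values['patchesPerRow']
total_pixels = mixer_values['totalPatches'] * patch_width * patch_height


def patch_origin(index):
    """Upper-left (lon, lat) of patch `index` under the stated assumptions."""
    matrix = mixer_values['projection']['affine']['doubleMatrix']
    x_scale, _, x0, _, y_scale, y0 = matrix
    row, col = divmod(index, mixer_values['patchesPerRow'])
    return (x0 + col * patch_width * x_scale, y0 + row * patch_height * y_scale)


print(patch_rows, total_pixels)
print(patch_origin(0))
print(patch_origin(13))
```

This gives 10 rows of 13 patches and 8,519,680 pixels in total, which is the number of predictions the model will have to produce.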


### Read the image files into a dataset
You can feed the list of files (image_files_list) directly to the TFRecordDataset constructor to make a combined dataset on which to perform inference. The input needs to be preprocessed differently from the training and testing data. This is mainly because the pixels are written into records as patches: we need to read each patch in as one big tensor (one patch for each band), then flatten it into lots of little tensors.

In [41]:
# Get relevant info from the JSON mixer file.
patch_width = mixer['patchDimensions'][0]
patch_height = mixer['patchDimensions'][1]
patches = mixer['totalPatches']
patch_dimensions_flat = [patch_width * patch_height, 1]

# Note that the tensors are in the shape of a patch, one patch for each band.
image_columns = [
  tf.io.FixedLenFeature(shape=patch_dimensions_flat, dtype=tf.float32) 
    for k in BANDS
]

# Parsing dictionary.
image_features_dict = dict(zip(BANDS, image_columns))

# Note that you can make one dataset from many files by specifying a list.
image_dataset = tf.data.TFRecordDataset(image_files_list, compression_type='GZIP')

# Parsing function.
def parse_image(example_proto):
  return tf.io.parse_single_example(example_proto, image_features_dict)

# Parse the data into tensors, one long tensor per patch.
image_dataset = image_dataset.map(parse_image, num_parallel_calls=5)

# Break our long tensors into many little ones.
image_dataset = image_dataset.flat_map(
  lambda features: tf.data.Dataset.from_tensor_slices(features)
)

# Add additional features (NDVI).
image_dataset = image_dataset.map(
  # Add NDVI to a feature that doesn't have a label.
  lambda features: add_NDVI(features, None)[0]
)

# Turn the dictionary in each record into a tuple without a label.
image_dataset = image_dataset.map(
  lambda data_dict: (tf.transpose(list(data_dict.values())), )
)

# Turn each patch into a batch.
image_dataset = image_dataset.batch(patch_width * patch_height)
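
The flattening step is the least obvious part of this pipeline. As a rough illustration, the plain-Python sketch below mimics what `from_tensor_slices` does to a dictionary of equally long tensors, using a hypothetical 2x2 patch with two bands. The helper name `slice_patch` is an assumption made for this example.

```python
def slice_patch(patch):
    """Turn {band: [v0, v1, ...]} into [{band: v0}, {band: v1}, ...].

    This mimics slicing every dictionary entry along its first axis.
    """
    bands = list(patch)
    n_pixels = len(patch[bands[0]])
    return [{band: patch[band][i] for band in bands} for i in range(n_pixels)]


# Hypothetical 2x2 patch (4 pixels) with two bands, already flattened.
patch = {'B4': [0.10, 0.11, 0.12, 0.13], 'B5': [0.30, 0.31, 0.32, 0.33]}
pixels = slice_patch(patch)
print(len(pixels))
print(pixels[0])
```

In the notebook each patch holds 256 x 256 values per band, so one record becomes 65,536 per-pixel dictionaries, which the final `batch` call groups back into one batch per patch.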

### Generate predictions for the image pixels
To get predictions in each pixel, run the image dataset through the trained model using model.predict(). Print the first prediction to see that the output is a list of the three class probabilities for each pixel. Running all predictions might take a while.

In [42]:
# Run prediction in batches, with as many steps as there are patches.
predictions = model.predict(image_dataset, steps=patches, verbose=1)

# Note that the predictions come as a numpy array.  Check the first one.
print(predictions[0])

[[0.30747983 0.5304895  0.16203062]]
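
To read a prediction like the one printed above, take the index of the largest probability. The short sketch below does this for the printed values. The class order (bare, vegetation, water) is an assumption based on the feature names used when the predictions are written later in the notebook.

```python
# Class order assumed from the feature names written later in the notebook
# (bareProb, vegProb, waterProb).
CLASS_NAMES = ['bare', 'vegetation', 'water']

# Probabilities copied from the first prediction printed above.
probabilities = [0.30747983, 0.5304895, 0.16203062]

# Index of the maximum probability, i.e. the predicted class id.
class_id = max(range(len(probabilities)), key=probabilities.__getitem__)
print(class_id, CLASS_NAMES[class_id])
```

Under that assumed ordering, the first pixel is classified as vegetation with a probability of about 0.53.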


In [43]:
print('Writing to file ' + OUTPUT_IMAGE_FILE)

Writing to file gs://geetest-allen/Classified_pixel_demo.TFRecord


### Write the predictions to a TFRecord file
Now that there's a list of class probabilities in predictions, it's time to write them back into a file, optionally including a class label which is simply the index of the maximum probability. We'll write directly from TensorFlow to a file in the output Cloud Storage bucket.

Iterate over the list, compute class label and write the class and the probabilities in patches. Specifically, we need to write the pixels into the file as patches in the same order they came out. The records are written as serialized tf.train.Example protos. This might take a while.

In [44]:
# Instantiate the writer.
writer = tf.io.TFRecordWriter(OUTPUT_IMAGE_FILE)

# Every patch-worth of predictions we'll dump an example into the output
# file with a single feature that holds our predictions. Since our predictions
# are already in the order of the exported data, the patches we create here
# will also be in the right order.
patch = [[], [], [], []]
cur_patch = 1
for prediction in predictions:
  patch[0].append(tf.argmax(prediction, 1))
  patch[1].append(prediction[0][0])
  patch[2].append(prediction[0][1])
  patch[3].append(prediction[0][2])
  # Once we've seen a patches-worth of class_ids...
  if (len(patch[0]) == patch_width * patch_height):
    print('Done with patch ' + str(cur_patch) + ' of ' + str(patches) + '...')
    # Create an example
    example = tf.train.Example(
      features=tf.train.Features(
        feature={
          'prediction': tf.train.Feature(
              int64_list=tf.train.Int64List(
                  value=patch[0])),
          'bareProb': tf.train.Feature(
              float_list=tf.train.FloatList(
                  value=patch[1])),
          'vegProb': tf.train.Feature(
              float_list=tf.train.FloatList(
                  value=patch[2])),
          'waterProb': tf.train.Feature(
              float_list=tf.train.FloatList(
                  value=patch[3])),
        }
      )
    )
    # Write the example to the file and clear our patch array so it's ready for
    # another batch of class ids
    writer.write(example.SerializeToString())
    patch = [[], [], [], []]
    cur_patch += 1

writer.close()

Done with patch 1 of 130...
Done with patch 2 of 130...
Done with patch 3 of 130...
Done with patch 4 of 130...
Done with patch 5 of 130...
Done with patch 6 of 130...
Done with patch 7 of 130...
Done with patch 8 of 130...
Done with patch 9 of 130...
Done with patch 10 of 130...
Done with patch 11 of 130...
Done with patch 12 of 130...
Done with patch 13 of 130...
Done with patch 14 of 130...
Done with patch 15 of 130...
Done with patch 16 of 130...
Done with patch 17 of 130...
Done with patch 18 of 130...
Done with patch 19 of 130...
Done with patch 20 of 130...
Done with patch 21 of 130...
Done with patch 22 of 130...
Done with patch 23 of 130...
Done with patch 24 of 130...
Done with patch 25 of 130...
Done with patch 26 of 130...
Done with patch 27 of 130...
Done with patch 28 of 130...
Done with patch 29 of 130...
Done with patch 30 of 130...
Done with patch 31 of 130...
Done with patch 32 of 130...
Done with patch 33 of 130...
Done with patch 34 of 130...
Done with patch 35 of 1
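
The writing loop above accumulates values until a full patch has been collected and then emits one record. The same grouping idea, isolated from TensorFlow, can be sketched as a small generator. The function name `group_into_patches` and the class ids are hypothetical.

```python
def group_into_patches(values, patch_size):
    """Yield consecutive lists of `patch_size` values, preserving order."""
    patch = []
    for value in values:
        patch.append(value)
        if len(patch) == patch_size:
            yield patch
            patch = []
    # Leftover values (fewer than patch_size) are dropped, mirroring a loop
    # that only writes a record once a patch is full.


# Hypothetical class ids for two tiny 2x2 patches plus one stray value.
class_ids = [0, 1, 1, 2, 2, 2, 0, 1, 1]
patches_out = list(group_into_patches(class_ids, patch_size=4))
print(patches_out)
```

Preserving the input order matters here, because Earth Engine reassembles the image from the patches in the order they appear in the file.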

### Upload the classifications to an Earth Engine asset
#### Verify the existence of the predictions file
At this stage, there should be a predictions TFRecord file sitting in the output Cloud Storage bucket. Use the gsutil command to verify that the predictions image (and associated mixer JSON) exist and have non-zero size.

In [52]:
!gsutil ls -l {OUTPUT_IMAGE_FILE}

 110772220  2021-01-26T09:24:53Z  gs://geetest-allen/Classified_pixel_demo.TFRecord
TOTAL: 1 objects, 110772220 bytes (105.64 MiB)


### Upload the classified image to Earth Engine
Upload the image to Earth Engine directly from the Cloud Storage bucket with the earthengine command. Provide both the image TFRecord file and the JSON file as arguments to earthengine upload.

In [53]:
print('Uploading to ' + OUTPUT_ASSET_ID)

Uploading to users/jiashuxu1/Classified_pixel_demo


In [82]:
# Start the upload.
!earthengine upload image --asset_id={OUTPUT_ASSET_ID} --pyramiding_policy=mode {OUTPUT_IMAGE_FILE} {json_file}

Instructions for updating:
non-resource variables are not supported in the long term
Running command using Cloud API.  Set --no-use_cloud_api to go back to using the API

I0126 11:09:29.171093 140040762943360 discovery.py:275] URL being requested: GET https://earthengine.googleapis.com/$discovery/rest?version=v1alpha&prettyPrint=false
I0126 11:09:30.594670 140040762943360 discovery.py:275] URL being requested: GET https://earthengine.googleapis.com/$discovery/rest?version=v1alpha&prettyPrint=false
I0126 11:09:31.765893 140040762943360 discovery.py:894] URL being requested: GET https://earthengine.googleapis.com/v1alpha/projects/earthengine-legacy/algorithms?prettyPrint=false&alt=json
I0126 11:09:33.240528 140040762943360 discovery.py:894] URL being requested: POST https://earthengine.googleapis.com/v1alpha/projects/earthengine-legacy/image:import?alt=json
Started upload task with ID: RL5EGZBGFUKM3CQNH3WUMJFD


### Check the status of the asset ingestion
You can also use the Earth Engine API to check the status of your asset upload. It might take a while. The upload of the image is an asset ingestion task.

In [86]:
ee.batch.Task.list()

[<Task INGEST_IMAGE: Ingest image: "projects/earthengine-legacy/assets/users/jiashuxu1/Classified_pixel_demo" (FAILED)>,
 <Task INGEST_IMAGE: Ingest image: "projects/earthengine-legacy/assets/users/jiashuxu1/Classified_pixel_demo" (FAILED)>,
 <Task INGEST_IMAGE: Ingest image: "projects/earthengine-legacy/assets/users/jiashuxu1/Classified_pixel_demo" (FAILED)>,
 <Task INGEST_IMAGE: Ingest image: "projects/earthengine-legacy/assets/users/jiashuxu1/Classified_pixel_demo" (COMPLETED)>,
 <Task EXPORT_IMAGE: Image Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Testing Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Training Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Testing Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Training Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Testing Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Training Export (COMPLETED)>,
 <Task EXPORT_IMAGE: Image Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Testing Export (COMPLETED)>,
 <Task EXPORT_FEATURES: Training Export (CO

### View the ingested asset
Display the vector of class probabilities as an RGB image with colors corresponding to the probability of bare, vegetation, water in a pixel. Also display the winning class using the same color palette.

In [87]:
predictions_image = ee.Image(OUTPUT_ASSET_ID)

prediction_vis = {
  'bands': 'prediction',
  'min': 0,
  'max': 2,
  'palette': ['red', 'green', 'blue']
}
probability_vis = {'bands': ['bareProb', 'vegProb', 'waterProb'], 'max': 0.5}

prediction_map_id = predictions_image.getMapId(prediction_vis)
probability_map_id = predictions_image.getMapId(probability_vis)

map = folium.Map(location=[37.6413, -122.2582])
folium.TileLayer(
  tiles=prediction_map_id['tile_fetcher'].url_format,
  attr='Map Data &copy; <a href="https://earthengine.google.com/">Google Earth Engine</a>',
  overlay=True,
  name='prediction',
).add_to(map)
folium.TileLayer(
  tiles=probability_map_id['tile_fetcher'].url_format,
  attr='Map Data &copy; <a href="https://earthengine.google.com/">Google Earth Engine</a>',
  overlay=True,
  name='probability',
).add_to(map)
map.add_child(folium.LayerControl())
map