# Machine learning in Earth Engine

Earth Engine supports Machine Learning (ML) through API methods in the `ee.Classifier`, `ee.Clusterer`, and `ee.Reducer` packages, which handle both training and inference within Earth Engine.

These built-in methods are useful for smaller workloads, roughly fewer than 400 images.

For larger workloads, TensorFlow is the better option.

TensorFlow models are developed and trained outside of Earth Engine, but Earth Engine can import and export data in [TFRecord](https://www.tensorflow.org/tutorials/load_data/tfrecord#tfrecords_format_details) format. This way, you can generate training datasets in Earth Engine and use them in TensorFlow.

Due to time limitations, we are going to focus on Classifiers, but you should explore further in your own time!

## Supervised classification algorithms

The `Classifier` package handles supervised classification by traditional Machine Learning (ML) algorithms running in Earth Engine. 

These classifiers include Classification and Regression Trees ([CART](https://towardsdatascience.com/https-medium-com-lorrli-classification-and-regression-analysis-with-decision-trees-c43cdbc58054)), [RandomForest](https://towardsdatascience.com/understanding-random-forest-58381e0602d2), [NaiveBayes](https://towardsdatascience.com/all-about-naive-bayes-8e13cef044cf) and Support Vector Machine ([SVM](https://towardsdatascience.com/support-vector-machines-svm-c9ef22815589)).

The general workflow for classification is:

*   Collect training data. Assemble features which have a property that stores the known class label and properties storing numeric values for the predictors.
*   Instantiate a classifier. Set its parameters if necessary.
*   Train the classifier using the training data.
*   Classify an image or feature collection.
*   Estimate classification error with independent validation data.
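
The five steps above can be tried out without Earth Engine. The following is a minimal sketch in plain Python, assuming synthetic two-band samples and a nearest-centroid classifier standing in for the Earth Engine classifiers. The helper names (`make_samples`, `train`, `classify`) are hypothetical and are not part of the Earth Engine API.

```python
import random

random.seed(0)

def make_samples(center, label, n):
    """Return n synthetic two-band samples scattered around a class centre."""
    return [([random.gauss(c, 0.05) for c in center], label) for _ in range(n)]

# Step 1: collect training data as (predictor values, class label) pairs.
# The values are made-up stand-ins for band values, not real imagery.
samples = make_samples([0.2, 0.6], 0, 50) + make_samples([0.7, 0.3], 1, 50)
random.shuffle(samples)

# Hold out 30% of the samples as independent validation data.
split = int(0.7 * len(samples))
training, validation = samples[:split], samples[split:]

# Steps 2 and 3: "instantiate" and train a nearest-centroid classifier.
def train(data):
    """Return the mean predictor vector (centroid) of each class."""
    sums, counts = {}, {}
    for values, label in data:
        acc = sums.setdefault(label, [0.0] * len(values))
        for i, v in enumerate(values):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def classify(centroids, values):
    """Return the label of the centroid closest to the given values."""
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(values, centroids[label]))
    return min(centroids, key=sq_dist)

centroids = train(training)

# Step 4: classify the held-out samples.
predicted = [classify(centroids, values) for values, _ in validation]

# Step 5: estimate accuracy on data the classifier has not seen.
correct = sum(p == label for p, (_, label) in zip(predicted, validation))
accuracy = correct / len(validation)
print(f"validation accuracy: {accuracy:.2f}")
```

The important design point is that the validation samples are split off before training, so the accuracy estimate is not inflated by samples the classifier has already seen.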



In [1]:
# Import earthengine API
import ee
# Authenticate and initialise 
ee.Authenticate()
ee.Initialize()



In [2]:
# Make a cloud-free Landsat 8 TOA composite (from raw imagery)
l8 = ee.ImageCollection('LANDSAT/LC08/C01/T1')

image = ee.Algorithms.Landsat.simpleComposite(l8.filterDate('2018-01-01', '2018-12-31'))

# Use these bands for prediction.
bands = ['B2', 'B3', 'B4', 'B5', 'B6', 'B7', 'B10', 'B11']

# Load training points. The numeric property 'landcover' stores known labels.
points = ee.FeatureCollection('GOOGLE/EE/DEMOS/demo_landcover_labels')

# This property stores the land cover labels as consecutive
# integers starting from zero.
label = 'landcover'

# Overlay the points on the imagery to get training data.
training = image.select(bands).sampleRegions(points, properties=[label], scale=30)

# Train a CART classifier with default parameters.
trained = ee.Classifier.smileCart().train(training, label, bands)

# Classify the image with the same bands used for training.
classified = image.select(bands).classify(trained)


In [6]:
# Plot the result

!pip install geehydro  # Adds Earth Engine helpers such as addLayer() to folium maps
import folium
import geehydro

# Use folium to visualize the imagery.
map = folium.Map(location=[37.820452055421086, -122.27096557617189], zoom_start=11)

map.addLayer(image, {'min':0, 'max':100, 'bands': ['B4', 'B3', 'B2']}, 'image')
map.addLayer(classified, {'min':0, 'max':2, 'palette': ['red', 'green', 'blue']}, 'classification')
folium.LayerControl().add_to(map)
map




Note that the training property (`'landcover'`) stores consecutive integers starting at 0. If your class labels are not in that form, use `remap()` on your table to turn them into consecutive integers starting at zero.
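
To make the relabelling concrete, here is a minimal sketch in plain Python of building the two lookup lists that a remap needs. The raw class codes are hypothetical, and the variable names `lookup_in` and `lookup_out` are illustrative; check the `remap()` reference documentation for the exact arguments before passing such lists to Earth Engine.

```python
# Hypothetical raw class codes, as they might appear in a training table.
raw_labels = [11, 42, 42, 81, 11, 90]

# The sorted unique codes are the "from" values of the lookup.
lookup_in = sorted(set(raw_labels))

# Their positions are the "to" values: consecutive integers starting at zero.
lookup_out = list(range(len(lookup_in)))

# Apply the lookup locally to see the effect on the labels.
mapping = dict(zip(lookup_in, lookup_out))
remapped = [mapping[value] for value in raw_labels]

print(lookup_in)   # [11, 42, 81, 90]
print(lookup_out)  # [0, 1, 2, 3]
print(remapped)    # [0, 1, 1, 2, 0, 3]
```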

If the training data are polygons representing homogeneous regions, every pixel in each polygon is a training point. You can use polygons to train as illustrated in the following example:

In [7]:
# Make a cloud-free Landsat 8 TOA composite (from raw imagery).
l8 = ee.ImageCollection('LANDSAT/LC08/C01/T1')

image = ee.Algorithms.Landsat.simpleComposite(l8.filterDate('2018-01-01', '2018-12-31'))

# Use these bands for prediction.
bands = ['B2', 'B3', 'B4', 'B5', 'B6', 'B7', 'B10', 'B11']

# Manually created polygons.
forest1 = ee.Geometry.Rectangle(-63.0187, -9.3958, -62.9793, -9.3443)
forest2 = ee.Geometry.Rectangle(-62.8145, -9.206, -62.7688, -9.1735)
nonForest1 = ee.Geometry.Rectangle(-62.8161, -9.5001, -62.7921, -9.4486)
nonForest2 = ee.Geometry.Rectangle(-62.6788, -9.044, -62.6459, -8.9986)

# Make a FeatureCollection from the hand-made geometries.
polygons = ee.FeatureCollection([
  ee.Feature(nonForest1, {'class': 0}),
  ee.Feature(nonForest2, {'class': 0}),
  ee.Feature(forest1, {'class': 1}),
  ee.Feature(forest2, {'class': 1}),
])

# Get the values for all pixels in each polygon to build the training data.
# Keep the 'class' property from the polygons, and set the scale to 30 m
# to get Landsat pixels in the polygons.
training = image.sampleRegions(polygons, properties=['class'], scale=30)

# Create an SVM classifier with custom parameters.
# RBF = Radial Basis Function kernel
classifier = ee.Classifier.libsvm(kernelType='RBF', gamma=0.5, cost=10)

# Train the classifier.
trained = classifier.train(training, 'class', bands)

# Classify one image.
classified = image.classify(trained)

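# Reduce the region so it can be plotted without issues.
# Rectangle coordinates are ordered [xMin, yMin, xMax, yMax]. The last two
# values appeared to be swapped, so the order below is an assumed correction.
roi = ee.Geometry.Rectangle([-62.836, -9.2399, -61, -8])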
classified_reduced = classified.clip(roi)


In [8]:
# Plot the result

map = folium.Map(location=[-9.2399,-62.836],zoom_start=9)

map.addLayer(image, {'bands': ['B4', 'B3', 'B2']}, 'image')
map.addLayer(polygons, {}, 'training polygons')
map.addLayer(classified_reduced, {'min': 0, 'max': 1, 'palette': ['red', 'green']}, 'deforestation')  # Large regions may be too slow to plot!
folium.LayerControl().add_to(map)
map


## Unsupervised classification

The `ee.Clusterer` package handles unsupervised classification (or clustering) in Earth Engine. More details about each Clusterer are available in the [reference docs in the Code Editor](https://code.earthengine.google.com/#workspace).

Clusterers are used in the same manner as classifiers in Earth Engine. The general workflow for clustering is:

*   Assemble features with numeric properties in which to find clusters.
*   Instantiate a clusterer. Set its parameters if necessary.
*   Train the clusterer using the training data.
*   Apply the clusterer to an image or feature collection.
*   Label the clusters.


The training data is a `FeatureCollection` with properties that will be input to the clusterer. 

Unlike classifiers, there is no input class value for a Clusterer. 

Like classifiers, the data for the train and apply steps are expected to have the same number of values. When a trained clusterer is applied to an image or table, it assigns an integer cluster ID to each pixel or feature.

These algorithms are currently based on the algorithms with the same name in [Weka](https://www.cs.waikato.ac.nz/ml/weka/).
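
To see what "assigning an integer cluster ID" means in practice, the following is a minimal sketch of plain k-means (Lloyd's algorithm) in pure Python on synthetic two-band samples. It is not the Weka implementation that Earth Engine uses, and the helper names (`nearest`, `kmeans`) are hypothetical.

```python
import random

def nearest(centroids, point):
    """Return the index (cluster ID) of the centroid closest to point."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(point, centroids[j])),
    )

def kmeans(points, k, iterations=20, seed=0):
    """Train k centroids with plain Lloyd iterations and return them."""
    rng = random.Random(seed)
    # Start from k randomly chosen samples.
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        # Assignment step: give each sample the ID of its nearest centroid.
        ids = [nearest(centroids, p) for p in points]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, i in zip(points, ids) if i == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

# Synthetic unlabeled samples: two groups of made-up band values.
random.seed(1)
group_a = [[random.gauss(0.1, 0.03), random.gauss(0.1, 0.03)] for _ in range(40)]
group_b = [[random.gauss(0.8, 0.03), random.gauss(0.8, 0.03)] for _ in range(40)]
points = group_a + group_b

# Train, then apply: every sample receives an integer cluster ID.
centroids = kmeans(points, k=2)
cluster_ids = [nearest(centroids, p) for p in points]
print(cluster_ids[:3], cluster_ids[-3:])
```

Note that the IDs themselves carry no meaning: which group ends up as cluster 0 depends on the random start, which is why the workflow ends with a separate labelling step.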

Here is a simple example of building and using an `ee.Clusterer`:

In [41]:
# Test clustering on a Borneo image: the least cloudy Sentinel-2 scene of 2020.
inputB = (ee.ImageCollection('COPERNICUS/S2_SR')
          .filterBounds(ee.Geometry.Point(117.5, 0))
          .filterDate('2020-01-01', '2020-12-31')
          .sort('CLOUDY_PIXEL_PERCENTAGE')
          .first())
# Planet NICFI basemap at the same point (only used by the commented-out layer below).
inputC = (ee.ImageCollection('projects/planet-nicfi/assets/basemaps/asia')
          .filterBounds(ee.Geometry.Point(117.5, 0))
          .filterDate('2020-01-01', '2020-12-31')
          .first())

regionB = ee.Geometry.Rectangle(117.5, 0.6, 117.8, 0.9)

mapB = folium.Map(location = [0.6, 117.5], zoom_start=10)
mapB.addLayer(inputB, {'min': 0, 'max': 2000, 'bands':['B4','B3','B2'],}, 'Borneo')
# mapB.addLayer(inputC, {'min': 0, 'max': 2000, 'bands':['R','G','B'],}, 'Borneo')
mapB.addLayer(ee.Image().paint(regionB, 0, 2), {}, 'region')
mapB

In [42]:
# Make the training dataset.
trainingB = inputB.sample(region=regionB, scale=4.77, numPixels=5000)

# Instantiate the clusterer and train it.
clustererB = ee.Clusterer.wekaKMeans(10).train(trainingB)

# Cluster the input using the trained clusterer.
resultB = inputB.cluster(clustererB)

# Display the clusters with random colors.

mapB.addLayer(resultB.randomVisualizer(), {}, 'clusters')
folium.LayerControl().add_to(mapB)
mapB


In [12]:
# Load a pre-computed Landsat composite for input.
input = ee.Image('LANDSAT/LE7_TOA_1YEAR/2001')

# Define a region in which to generate a sample of the input.
region = ee.Geometry.Rectangle(29.7, 30, 32.5, 31.7)

# Display the sample region.

map = folium.Map(location=[31,31.5],zoom_start=8)

map.addLayer(ee.Image().paint(region, 0, 2), {}, 'region')
map

In [15]:
# Make the training dataset.
training = input.sample(region=region, scale=30, numPixels=5000)

# Instantiate the clusterer and train it.
clusterer = ee.Clusterer.wekaKMeans(15).train(training)

# Cluster the input using the trained clusterer.
result = input.cluster(clusterer)

# Display the clusters with random colors.

map.addLayer(result.randomVisualizer(), {}, 'clusters')
folium.LayerControl().add_to(map)
map
