# Landcover Classification Example

#### Sources: 
- https://blog.gishub.org/earth-engine-tutorial-32-machine-learning-with-earth-engine-supervised-classification
- https://geohackweek.github.io/GoogleEarthEngine/05-classify-imagery/
- https://ceholden.github.io/open-geo-tutorial/python/chapter_5_classification.html
- GEE Documentation

#### Steps:
1. Collect training data. Assemble features which have a property that stores the known class label and properties storing numeric values for the predictors.
2. Instantiate a classifier. Set its parameters if necessary.
3. Train the classifier using the training data.
4. Classify an image or feature collection.
5. Estimate classification error with independent validation data.

The training data is a `FeatureCollection` with a property storing the class label and properties storing predictor variables. Class labels should be consecutive integers starting from 0. If necessary, use `remap()` to convert class values to consecutive integers. The predictors should be numeric.
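
As a minimal sketch of what that remapping involves, the snippet below builds the `from`/`to` lists that a remap needs out of a handful of sparse class codes. The sample codes and the helper name `consecutive_label_map` are illustrative assumptions, not part of this notebook.

```python
def consecutive_label_map(labels):
    """Map each distinct class code to a consecutive integer starting at 0."""
    return {code: index for index, code in enumerate(sorted(set(labels)))}


# Hypothetical sparse class codes (NLCD-style), not real samples.
sampled_codes = [31, 11, 21, 31, 42, 21]

mapping = consecutive_label_map(sampled_codes)
from_values = list(mapping.keys())    # original class codes
to_values = list(mapping.values())    # consecutive labels 0..n-1
remapped = [mapping[code] for code in sampled_codes]

print(from_values, to_values, remapped)
```

In Earth Engine the two lists would then be passed to `image.remap(from_values, to_values)`.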

### Import libraries

In [None]:
# !pip install geemap
import json
import os

import ee
import geemap
import sklearn
from geemap import geojson_to_ee, ee_to_geojson
from ipyleaflet import GeoJSON


## Data Preparation

### Create an interactive map

In [None]:
Map = geemap.Map()
Map

### Add region data to the map

In [None]:
point = ee.Geometry.Point(-122.4439, 37.7538)

# Select the least cloudy Landsat 8 Surface Reflectance scene from 2016 that covers the point
image = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR') \
    .filterBounds(point) \
    .filterDate('2016-01-01', '2016-12-31') \
    .sort('CLOUD_COVER') \
    .first() \
    .select('B[1-7]')

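# Optional: mask any remaining cloud (bit 5) and cloud shadow (bit 3) using pixel_qa.
# pixel_qa must be selected before the image is reduced to B1-B7 above,
# and `bounds` (the area to clip to) is not defined in this notebook.
# qa = image.select('pixel_qa')
# cloudMask = qa.bitwiseAnd(1 << 5).eq(0).And(qa.bitwiseAnd(1 << 3).eq(0))
# masked = image.updateMask(cloudMask).clip(bounds)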

vis_params = {
    'min': 0,
    'max': 3000,
    'bands': ['B5', 'B4', 'B3']
}

Map.centerObject(point, 8)
Map.addLayer(image, vis_params, "Landsat-8")
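
The commented-out cloud mask in the cell above tests two bits of the `pixel_qa` band: bit 3 (cloud shadow) and bit 5 (cloud). The plain-Python sketch below applies the same bit test to single integers so the logic is easy to follow. The function name `is_clear` and the three QA values are illustrative assumptions.

```python
CLOUD_SHADOW_BIT = 1 << 3  # bit 3 of pixel_qa
CLOUD_BIT = 1 << 5         # bit 5 of pixel_qa


def is_clear(pixel_qa: int) -> bool:
    """True when neither the cloud bit nor the cloud-shadow bit is set."""
    return (pixel_qa & CLOUD_BIT) == 0 and (pixel_qa & CLOUD_SHADOW_BIT) == 0


# Illustrative QA values: no flagged bits, cloud bit set, shadow bit set.
for qa_value in (322, 352, 328):
    print(qa_value, bin(qa_value), is_clear(qa_value))
```

On an `ee.Image`, `bitwiseAnd(...).eq(0)` performs the same test for every pixel at once.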

### Check image properties

In [None]:
ee.Date(image.get('system:time_start')).format('YYYY-MM-dd').getInfo()

In [None]:
image.get('CLOUD_COVER').getInfo()

### Creating the training dataset

There are several ways you can create a region for generating the training dataset.

- Draw a shape (e.g., rectangle) on the map and then use `region = Map.user_roi`
- Define a geometry, such as `region = ee.Geometry.Rectangle([-122.6003, 37.4831, -121.8036, 37.8288])`
- Create a buffer zone around a point, such as `region = ee.Geometry.Point([-122.4439, 37.7538]).buffer(10000)`
- If you don't define a region, it will use the image footprint by default

In [None]:
# region = Map.user_roi
# region = ee.Geometry.Rectangle([-122.6003, 37.4831, -121.8036, 37.8288])
region = ee.Geometry.Point([-122.4439, 37.7538]).buffer(10000)

The [USGS National Land Cover Database (NLCD)](https://developers.google.com/earth-engine/datasets/catalog/USGS_NLCD) will be used to create the label dataset for training.


![](https://i.imgur.com/7QoRXxu.png)

In [None]:
# Load the NLCD 2016 land cover band (an Earth Engine image asset) and clip it to the Landsat scene.
nlcd = ee.Image('USGS/NLCD/NLCD2016').select('landcover').clip(image.geometry())
Map.addLayer(nlcd, {}, 'NLCD')
Map

In [None]:
# Make the training dataset.
points = nlcd.sample(**{
    'region': image.geometry(),  # The region to sample from. If unspecified, uses the image's whole footprint.
    'scale': 30,                 # A nominal scale in meters of the projection to sample in.
    'numPixels': 5000,           # The approximate number of pixels to sample.
    'seed': 0,                   # A randomization seed to use for subsampling.
    # If True, adds the center of the sampled pixel as the geometry property of the
    # output feature. Set this to False to omit geometries (saving memory).
    'geometries': True
})

Map.addLayer(points, {}, 'Training', False)

In [None]:
print(points.size().getInfo())

In [None]:
print(points.first().getInfo()) #Returns the first entry from a given collection.

In [None]:
# Export the training data to Google Drive as CSV (uncomment to run).
# task = ee.batch.Export.table.toDrive(
#     collection=points,
#     description='lc_training_data',
#     folder='MRes',
#     fileFormat='CSV'
# )
# task.start()

### Train the classifier

Here we complete supervised classification using the RandomForest ensemble decision tree algorithm by [Leo Breiman and Adele Cutler](https://link.springer.com/article/10.1023/A:1010933404324).

The RandomForest algorithm is popular in the field of remote sensing and is quite fast compared to some other machine learning approaches (e.g., SVM can be quite computationally intensive). It isn't necessarily the best, but it provides a great first step into the world of machine learning for classification and regression.
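
For classification, a random forest combines its trees by letting each one vote for a class and taking the most common answer. The sketch below shows only that voting step in plain Python. The per-tree predictions are made up for illustration, and `majority_vote` is a hypothetical helper rather than an Earth Engine function.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class predicted by the most trees (the first one seen wins a tie)."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical predictions from 10 trees for a single pixel.
votes = [31, 42, 31, 11, 31, 42, 31, 21, 31, 42]
print(majority_vote(votes))
```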

In [None]:
# Use these bands for prediction.
bands = ['B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B7']


# This property of the table stores the land cover labels.
label = 'landcover'

# Overlay the points on the imagery to get the training data.
training = image.select(bands).sampleRegions(**{
    'collection': points,
    'properties': [label],
    'scale': 30  # A nominal scale in meters of the projection to sample in. The spatial resolution of Landsat is 30 m.
})

# Train a Random Forest classifier with 10 decision trees (hyperparameter testing will be done outside of the notebook).
classifier = ee.Classifier.smileRandomForest(10).train(training, label, bands)


In [15]:
print(training.first().getInfo())

{'type': 'Feature', 'geometry': None, 'id': '0_0', 'properties': {'B1': 575, 'B2': 814, 'B3': 1312, 'B4': 1638, 'B5': 1980, 'B6': 2091, 'B7': 1967, 'landcover': 31}}
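
Each training feature therefore carries one value per band plus the label. As a small sketch, the feature printed above can be split into a predictor vector and a class label like this (`to_xy` is a hypothetical helper name):

```python
# The feature printed above, copied as a plain Python dict.
feature = {
    'type': 'Feature', 'geometry': None, 'id': '0_0',
    'properties': {'B1': 575, 'B2': 814, 'B3': 1312, 'B4': 1638,
                   'B5': 1980, 'B6': 2091, 'B7': 1967, 'landcover': 31},
}

bands = ['B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B7']
label = 'landcover'


def to_xy(feature, bands, label):
    """Return (predictor values in band order, class label) for one feature."""
    props = feature['properties']
    return [props[band] for band in bands], props[label]


x, y = to_xy(feature, bands, label)
print(x, y)
```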


### Classify the image

In [18]:
# Classify the image with the same bands used for training.
classified = image.select(bands).classify(classifier)

# Display the classes with random colors.
Map.addLayer(classified.randomVisualizer(), {}, 'classified')
Map

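Note: using the deprecated `ee.Classifier.randomForest` instead of `ee.Classifier.smileRandomForest` raises the following error:

EEException: Classifier.randomForest: This classifier has been replaced.  For more information see: http://goo.gle/deprecated-classifiers.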

### Render categorical map

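To render a categorical map, we can set two image properties named after the band: `<band>_class_values` and `<band>_class_palette`. The NLCD image stores them as `landcover_class_values` and `landcover_class_palette`; the classified image has a single band called `classification`, so its properties are `classification_class_values` and `classification_class_palette`. Reusing the NLCD style makes the two maps easy to compare.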

In [None]:
class_values = nlcd.get('landcover_class_values').getInfo()
class_values

In [None]:
class_palette = nlcd.get('landcover_class_palette').getInfo()
class_palette
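
The two lists are parallel: the i-th palette entry is the color for the i-th class value. A minimal sketch of that pairing, using a short made-up subset rather than the full lists returned above (the variable names and the four value/color pairs are illustrative assumptions):

```python
# Illustrative subset only; the real lists come from the NLCD image properties.
example_values = [11, 21, 31, 42]
example_palette = ['466b9f', 'dec5c5', 'b3ac9f', '1c5f2c']

# Pair each class value with its hex color.
color_by_class = dict(zip(example_values, example_palette))

for value, color in color_by_class.items():
    print(f'class {value}: #{color}')
```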

In [None]:
landcover = classified.set('classification_class_values', class_values)
landcover = landcover.set('classification_class_palette', class_palette)

In [None]:
Map.addLayer(landcover, {}, 'Land cover')
Map

### Visualize the result

In [None]:
print('Change layer opacity:')
cluster_layer = Map.layers[-1]
cluster_layer.interact(opacity=(0, 1, 0.1))

### Add a legend to the map

In [None]:
Map.add_legend(builtin_legend='NLCD')
Map

### Assess The Accuracy

In [None]:
# Get a confusion matrix representing resubstitution accuracy.
trainAccuracy = classifier.confusionMatrix()
# print('RF error matrix: ', trainAccuracy.getInfo())
print('RF overall accuracy: ', trainAccuracy.accuracy().getInfo())
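
Resubstitution accuracy is measured on the same samples the classifier was trained on, so it tends to be optimistic. One common alternative is to hold out part of the samples before training. The sketch below shows a reproducible 70/30 split in plain Python. The split fraction, the seed, and the helper name `split_samples` are assumptions for illustration and are not used elsewhere in this notebook.

```python
import random


def split_samples(samples, train_fraction=0.7, seed=0):
    """Shuffle with a fixed seed, then cut into training and held-out parts."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical sample identifiers standing in for training features.
sample_ids = list(range(10))
train_ids, test_ids = split_samples(sample_ids)
print(len(train_ids), len(test_ids))
```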

### Validation on NLCD Data

In [None]:
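# Sample validation pixels from the Landsat bands plus the NLCD label band,
# using a different seed than the training sample.
sampleSize = 5000  # assumed value; choose a size that suits your region

validation = image.addBands(nlcd).sample(**{
    'scale': 30,
    'numPixels': sampleSize,
    'seed': 1
}).filter(ee.Filter.notNull(['B1']))

validated = validation.classify(classifier)

testAccuracy = validated.errorMatrix('landcover', 'classification')
print('Validation error matrix: ', testAccuracy.getInfo())
print('Validation overall accuracy: ', testAccuracy.accuracy().getInfo())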
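
To make the two printed numbers concrete, here is a small plain-Python sketch of how an error matrix and its overall accuracy are built from pairs of actual and predicted labels. Rows are actual classes and columns are predicted classes. The labels are made up, and both function names are hypothetical.

```python
def error_matrix(actual, predicted, classes):
    """Count (actual, predicted) pairs; rows are actual, columns are predicted."""
    index = {cls: i for i, cls in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def overall_accuracy(matrix):
    """Share of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical validation labels.
classes = [11, 21, 31, 42]
actual = [11, 11, 21, 31, 31, 42]
predicted = [11, 21, 21, 31, 31, 31]

matrix = error_matrix(actual, predicted, classes)
accuracy = overall_accuracy(matrix)
print(matrix)
print(round(accuracy, 3))
```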

### Validation on AOI (Mai Ndombe)

### Export the result

Export the result directly to your computer:

In [None]:
import os
out_dir = os.path.join(os.path.expanduser('~'), 'Downloads')
out_file = os.path.join(out_dir, 'landcover.tif')

In [None]:
geemap.ee_export_image(landcover, filename=out_file, scale=900)

Export the result to Google Drive:

In [None]:
geemap.ee_export_image_to_drive(landcover, description='landcover', folder='export', scale=900)