______

<img src="./imgs/DRRlogo.jpg" width="350" />

# Land Cover Classification for use in the CAPRA Model 

## Image Classification
_____

### Learning Objectives

In this lesson you will learn concepts of supervised classification. You will explore processes of training data collection, classifier selection, classifier training, image classification and accuracy assessment. 

The purpose of this tutorial is to derive a land cover map from satellite imagery using a classification method called recursive partitioning. The procedure for the classification process is outlined in Figure 1.
The tutorial uses Google Earth Engine (GEE) through its open source Python API. We will cover common tasks for data loading, image visualization and processing. The main steps of the classification process developed for this tutorial are described in the Remote Sensing Process section below, with the relevant software indicated for each step.

Image classification means mapping the values captured by remote sensors that are encoded as image digital levels to specific land cover types. <br>
Classifying remotely sensed data into a thematic map is a relevant task because the resulting information is the basis for many environmental and socioeconomic applications. In this example, a land cover map is produced by applying a [*supervised*](https://en.wikipedia.org/wiki/Supervised_learning) [*Random Forest*](https://en.wikipedia.org/wiki/Random_forest) algorithm. 

<img src="./imgs/JensenClfProcess_.PNG">

**Figure 1. Procedure for the acquisition and classification process of remotely sensed imagery (adapted from Jensen 2005)**

### Supervised Classification

A supervised classification method trains a classifier (algorithm) on ground truth data and then uses it to classify previously unseen data. <br>
A supervised classification workflow includes the following basic steps:
* Collect the ground data. This data is used to train the classifier and validate its results.
* Train the classifier.
* Apply the classifier to produce a classified image.
* Assess the classification accuracy.

### Classification Scheme Definition

The classes and the detail of a classification scheme of a land cover map are determined by its intended use.  A classification scheme can have multiple levels of detail.  It is good practice to design a classification scheme with mutually exclusive and exhaustive classes of either land cover OR land use.  For different levels or scales the class details need to be determined in a way that they can be easily aggregated such as in a hierarchical structure.  

The classification scheme used in the runoff factor map delineation in the CAPRA flood model consists of a mixture of detailed land cover and land use classes as described in Table 1.  This is not an optimal scenario, as overlapping classes are no longer mutually exclusive (e.g., impermeable surface and paved roads).  The scheme is also inconsistent in its hierarchical levels.  It contains two level one forest classes, “natural forests” and “forests”, and the latter is further divided at the second level into seeded or cultivated.  In addition, “cropped furrows”, “cereals”, “leguminous” and their subclasses can be grouped under agricultural land or cropland at the first level.  Roads, rangeland and rest (uncultivated) are pure land use classes, and the class “rangeland” overlaps with the class “pasture”.  

In general it is a lot more difficult to derive land use from a remotely sensed image than it is to detect land cover, as land cover describes the surface material and therefore has a direct linkage to the spectral reflectance behavior of the material (e.g. land cover grassland vs. land use classes pasture, football field, city park).  Land use classification of detailed agricultural classes as suggested by the CAPRA classification scheme requires a very good knowledge and expertise of a region’s agricultural practices.  

An important component of a land cover classification procedure is the proper choice and delineation of training sites, which are used to train the computer in pattern recognition.  It is crucial to develop a database of reference points with reliable land cover or land use information at the spatial scale and thematic detail at which the mapping will be performed.  Preferably such reference points are acquired in the field (in situ) or from high resolution aerial photography.  For each land cover type of interest, a set of representative samples is collected using a GPS (or a digitizing technique).  

In our case we have neither a database of ground reference points or aerial photography nor an intimate knowledge of the region's agricultural practices. We will therefore refer to the U.S. Geological Survey Land Use/Land Cover Classification System for use with Remote Sensor Data (Table 2).  

**Table 2. Land use and land cover classification system for use with remote sensor data (Anderson et al., 1976)**

<img src="./imgs/LCclassTable_.PNG">

For this tutorial we will use classes based on level one of the USGS classification system.  The classes of interest for our mapping application are: Water, Forest Land, Rangeland, Agricultural Land, Barren Land, and Urban or Built-up Land.  
Since the classification scheme is a mix of land cover and land use we will rename classes with land use character to their corresponding land cover.  We also want to avoid spaces in the class names since we are going to use the names in code and spaces can be problematic.  The class names we are going to use instead are:

1. water
2. forestLand
3. cropLand
4. buildUpLand

#### Question:
#### (1) Which of the level one classes of the U.S. Geological Survey Land Use/Land Cover Classification System for use with Remote Sensor Data do you expect to be present in your region of interest (ROI)? Search online if needed.

### Remote Sensing Process

One of the more common procedures for generating thematic land cover maps of a region is to apply pattern-recognition algorithms to satellite imagery in a digital image processing environment.  The use of remote sensing classification methods involves three general steps: (1) the selection and pre-processing of imagery adequate for mapping the land cover classes of interest at their appropriate scales, (2) the extraction and evaluation of spectral signatures and their statistical separability for all land cover classes of interest, together with the development of a classification method, and (3) the evaluation and documentation of the mapping process and the final map product.   

(1)	Discrimination and mapping of various land use/land cover classes depend on the spatial resolution (pixel size), the spectral resolution (number of bands or channels and their bandwidth), the radiometric resolution (range of discrete brightness values) and the temporal resolution (return time of the sensor) in relation to the classification scheme and the minimum mapping unit or appropriate scale (Teillet et al., 1997; Rao et al., 2007).  Since mapping projects are often constrained by their budget, the most economic approach is to use data that are free of charge and to process them with open source image processing software, which is the solution demonstrated in this tutorial.  

A large Landsat archive of 45 years of image acquisitions is available and open to the public. Landsat sensor specific information is accessible through the data portal of USGS (http://landsat.usgs.gov/Landsat_Search_and_Download.php).  For this exercise we are going to use Landsat 8 images. The Scan Line Corrector (SLC) of Landsat 7 failed in May 2003, resulting in imagery with significant geometric error and considerable areas of no data value in each scene.  The Landsat 8 data source is adequate for the classes we identified in the classification scheme. Google Earth Engine (GEE) contains a variety of Landsat specific processing methods. Specifically, there are methods to compute at-sensor radiance, top-of-atmosphere reflectance, surface reflectance, cloud score and cloud-free composites.   
    
(2)	The extraction and evaluation of spectral signatures of all land cover classes of interest requires a training dataset.  In this exercise a set of training points for each class will be digitized on screen using the Google Earth application.  For the digitized sampling points spectral characteristics will then be extracted from the pre-processed imagery.  The variables we are interested in evaluating are reflectance values.  In addition we will consider derivative information, the Normalized Difference Vegetation Index (NDVI) and tasseled cap or Kauth-Thomas transformation which is based on the relationship of spectral characteristics to soil brightness, moisture content and vegetation cover.  The pre-processing and preparation of all data and the extraction of variable values for each sample of the training data set will be performed in Google Earth and GEE.

The classification process is guided by the evaluation of the separability of classes for different combinations of reflectance bands and other derived variables based on the training set.  Separability analysis and feature selection are frequently based on descriptive class parameters such as the class specific mean vector and variance-covariance matrices.  For this tutorial, instead of using parametric measures to evaluate class separability, we will evaluate the effectiveness of a recursive partitioning algorithm, a non-parametric classification procedure (see pg. 38 in landCoverClassification_Theory.pdf).  The performance of all evaluated models is assessed with accuracy measures derived from confusion matrices of the classified training dataset.  This initial evaluation is biased towards overestimating accuracy and will only be used to select the model with the best fit.  The analysis will provide two important components for the final classification: a list of variables that are most suitable to differentiate the classes of interest, and the decision rules or classification tree (the model or classifier) that will be applied to all pixels of the entire area of interest, the Dominican Republic.  This part of the analysis will be performed with the GEE Python API.      

(3)	The final step of the mapping project is the accuracy assessment of the final classification results (the final map).  For this purpose we will use a stratified random sampling design (see pg. 45 in landCoverClassification_Theory.pdf) applied to the sampling frame of all classified image elements (pixels).  For lack of ground reference data we will use Google Earth (or Google Maps) as the reference source to determine classification accuracy.


In [1]:
# Import required modules

import bqplot
import datetime
import dateutil.parser
import ee
import ipywidgets
import ipyleaflet
import IPython.display
import numpy as np
import pprint
import pandas as pd
import traitlets

# Configure the pretty printing output & initialize earthengine.
pp = pprint.PrettyPrinter(depth=4)
ee.Initialize()

In [2]:
import ee
from IPython.display import Image
ee.Initialize()
import pprint
import matplotlib as mp

# Configure the pretty printing output.
pp = pprint.PrettyPrinter(depth=4)


In [3]:
# Function to get sizes in Human readable format
suffixes = ['B', 'KB', 'MB', 'GB', 'TB', 'PB']
def humansize(nbytes):
    i = 0
    while nbytes >= 1024 and i < len(suffixes)-1:
        nbytes /= 1024.
        i += 1
    f = ('%.2f' % nbytes).rstrip('0').rstrip('.')
    return '%s %s' % (f, suffixes[i])

In [4]:
# Function to get tilelayer url from earthengine server
def GetTileLayerUrl(ee_image_object):
  map_id = ee.Image(ee_image_object).getMapId()
  tile_url_template = "https://earthengine.googleapis.com/map/{mapid}/{{z}}/{{x}}/{{y}}?token={token}"
  return tile_url_template.format(**map_id)

## Masking clouds and cloud shadows

Clouds and atmospheric conditions present a significant challenge when working with multispectral remote sensing data. Extreme cloud cover and shadows can make the data in those areas unusable if reflectance values are either washed out (too bright - as the clouds scatter all light back to the sensor) or are too dark (shadows which represent blocked or absorbed light).

Many remote sensing data sets come with quality layers that you can use as a mask to remove “bad” pixels from your analysis. In the case of Landsat, the mask layers identify pixels that are likely representative of cloud cover, shadow and even water. Landsat 8 data comes with a processed cloud shadow/mask raster layer called pixel_qa.
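
Quality bands such as pixel_qa pack several yes/no flags into the bits of a single integer. The following minimal sketch, written in plain Python and independent of Earth Engine, shows how such flags can be tested. The bit positions (3 for cloud shadow, 5 for cloud) are the ones used by the masking function in this tutorial; the three QA values are illustrative assumptions chosen so that only the bits of interest differ.

```python
# Bit positions used for the Landsat 8 SR pixel_qa band in this tutorial.
CLOUD_SHADOW_BIT = 3
CLOUD_BIT = 5

def is_clear(qa_value):
    """Return True when neither the cloud-shadow nor the cloud bit is set."""
    shadow_mask = 1 << CLOUD_SHADOW_BIT  # binary 000000001000
    cloud_mask = 1 << CLOUD_BIT          # binary 000000100000
    return (qa_value & shadow_mask) == 0 and (qa_value & cloud_mask) == 0

# Illustrative QA values (assumed for this sketch, not read from an image).
samples = {"clear": 322, "cloud shadow": 328, "cloud": 352}
flags = {label: is_clear(value) for label, value in samples.items()}

for label, value in samples.items():
    # Print the value in decimal and binary so the tested bits are visible.
    print(f"{label:>12}: {value:4d} = {value:011b} -> clear={flags[label]}")
```

On an Earth Engine image the same test is expressed with `bitwiseAnd()` and `eq(0)`, which operate on every pixel at once.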


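In the following section of code you will sort the Landsat 8 surface reflectance image collection by its CLOUD_COVER property. A collection can be sorted by any property, for example to get the most recent or the oldest image. Because the sort here is from least to most cloudy, calling first() on the filtered collection returns the least cloudy image.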

In [5]:
# Sort collection from least to most cloudy
l8sr = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR').sort('CLOUD_COVER')

# Get the first, least cloudy image from the collection
pixel_qa_image = ee.Image(
    l8sr.filterDate('2015-01-01', '2019-12-30')
        .filterBounds(ee.Geometry.Point(-70.0522, 18.7490))
        .first()
        .select('pixel_qa')
)
pp.pprint(pixel_qa_image.getInfo())

{'bands': [{'crs': 'EPSG:32619',
            'crs_transform': [30.0, 0.0, 305985.0, 0.0, -30.0, 2193615.0],
            'data_type': {'max': 65535,
                          'min': 0,
                          'precision': 'int',
                          'type': 'PixelType'},
            'dimensions': [7621, 7771],
            'id': 'pixel_qa'}],
 'id': 'LANDSAT/LC08/C01/T1_SR/LC08_007047_20160116',
 'properties': {'CLOUD_COVER': 4.49,
                'CLOUD_COVER_LAND': 7.32,
                'EARTH_SUN_DISTANCE': 0.983695,
                'ESPA_VERSION': '2_23_0_1a',
                'GEOMETRIC_RMSE_MODEL': 7.083,
                'GEOMETRIC_RMSE_MODEL_X': 4.812,
                'GEOMETRIC_RMSE_MODEL_Y': 5.198,
                'IMAGE_QUALITY_OLI': 9.0,
                'IMAGE_QUALITY_TIRS': 9.0,
                'LANDSAT_ID': 'LC08_L1TP_007047_20160116_20170405_01_T1',
                'LEVEL1_PRODUCTION_DATE': 1491376888000.0,
                'PIXEL_QA_VERSION': 'generate_pixel_qa_1.6.0'

In [6]:
# The maximum possible pixel value is 65535 because pixel_qa is stored as an unsigned 16-bit integer
url  = pixel_qa_image.getThumbUrl({'min':0, 'max':3000})
# you could copy and paste this url in a browser to see the image 
print(url)
# display image thumbnails.
Image(url=url)

https://earthengine.googleapis.com/api/thumb?thumbid=2fc330c1b208cf8a1ad3d67bab94696b&token=b21e9cbc68b42bdb5305eca71b4250cf


In [7]:
vis_image = ee.Image(
    l8sr.filterDate('2015-01-01', '2019-12-30')
        .filterBounds(ee.Geometry.Point(-70.0522, 18.7490))
        .first()
        .select('B[2-5]')
)
pp.pprint(vis_image.getInfo())

Image(url=vis_image.getThumbUrl({'min': 0, 'max': 6000,'bands': 'B4,B3,B2', 'gamma': '0.95, 1.1, 1'}))

{'bands': [{'crs': 'EPSG:32619',
            'crs_transform': [30.0, 0.0, 305985.0, 0.0, -30.0, 2193615.0],
            'data_type': {'max': 32767,
                          'min': -32768,
                          'precision': 'int',
                          'type': 'PixelType'},
            'dimensions': [7621, 7771],
            'id': 'B2'},
           {'crs': 'EPSG:32619',
            'crs_transform': [30.0, 0.0, 305985.0, 0.0, -30.0, 2193615.0],
            'data_type': {'max': 32767,
                          'min': -32768,
                          'precision': 'int',
                          'type': 'PixelType'},
            'dimensions': [7621, 7771],
            'id': 'B3'},
           {'crs': 'EPSG:32619',
            'crs_transform': [30.0, 0.0, 305985.0, 0.0, -30.0, 2193615.0],
            'data_type': {'max': 32767,
                          'min': -32768,
                          'precision': 'int',
                          'type': 'PixelType'},
            'dimensio

In [8]:
# Create a split-map widget showing the Landsat 8 SR median composite (the PlanetScope 4B SR layer is commented out)
map1 = ipyleaflet.Map(
    center=(18.7490, -70.0522), zoom=12,
    layout={'height':'500px'},
)
# ps4bsr_tile_url=GetTileLayerUrl(ps4bsr.median().visualize(min=600, max=4000, bands=['b4', 'b3', 'b2']))
l8sr_tile_url = GetTileLayerUrl(l8sr.median().visualize(min=100, max=3500, gamma=1.5, bands= ['B5', 'B3', 'B2']))  #Landsat 8 SR
# left = ipyleaflet.TileLayer(url=ps4bsr_tile_url)
right=ipyleaflet.TileLayer(url=l8sr_tile_url)
control = ipyleaflet.SplitMapControl(right_layer=right)
map1.add_control(control)
map1

Map(basemap={'url': 'https://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png', 'max_zoom': 19, 'attribution': 'Map …

In [9]:
# Define a function that uses the pixel QA band to mask clouds and cloud shadows in surface reflectance (SR) data.
def maskL8sr(image):
    # Bits 3 and 5 are cloud shadow and cloud, respectively.
    cloud_shadow_bit_mask = (1 << 3)
    clouds_bit_mask = (1 << 5)
    # Get the pixel QA band.
    qa = image.select('pixel_qa')
    # Both flags should be set to zero, indicating clear conditions.
    # Use the Earth Engine And() method; Python's `and` would return only the second image.
    mask = qa.bitwiseAnd(cloud_shadow_bit_mask).eq(0).And(qa.bitwiseAnd(clouds_bit_mask).eq(0))
    # Return the masked image without the QA bands (scaling to reflectance with divide(10000) is left disabled).
    return image.updateMask(mask).select("B[0-9]*").copyProperties(image, ["system:time_start"])#.divide(10000)

l8sr_mask = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR').sort('CLOUD_COVER').map(maskL8sr)

mask_image = ee.Image(
    l8sr_mask.filterDate('2015-01-01', '2019-12-30')
             .filterBounds(ee.Geometry.Point(-70.0522, 18.7490))
             .first()
             .select('B[2-5]')
)

#mask_image = l8sr_mask.filterDate('2015-01-01', '2019-12-30').filterBounds(ee.Geometry.Point(-70.0522, 18.7490)).first()

pp.pprint(mask_image.getInfo())

Image(url=mask_image.getThumbUrl({'min': 0, 'max': 2000,'bands': 'B4,B3,B2', 'gamma': '0.95, 1.1, 1'}))

{'bands': [{'crs': 'EPSG:32619',
            'crs_transform': [30.0, 0.0, 305985.0, 0.0, -30.0, 2193615.0],
            'data_type': {'max': 32767,
                          'min': -32768,
                          'precision': 'int',
                          'type': 'PixelType'},
            'dimensions': [7621, 7771],
            'id': 'B2'},
           {'crs': 'EPSG:32619',
            'crs_transform': [30.0, 0.0, 305985.0, 0.0, -30.0, 2193615.0],
            'data_type': {'max': 32767,
                          'min': -32768,
                          'precision': 'int',
                          'type': 'PixelType'},
            'dimensions': [7621, 7771],
            'id': 'B3'},
           {'crs': 'EPSG:32619',
            'crs_transform': [30.0, 0.0, 305985.0, 0.0, -30.0, 2193615.0],
            'data_type': {'max': 32767,
                          'min': -32768,
                          'precision': 'int',
                          'type': 'PixelType'},
            'dimensio

In [10]:
#region = area_of_interest.bounds().getInfo()['coordinates'][0]
task = ee.batch.Export.image.toDrive(
  image = mask_image,
  description = "mask_image",
  #assetId = 'users/ximenamesa/LC' + "DR",
  maxPixels = 12997211221,   
  #region = region,
  scale = 30)  # Landsat 8 pixel size in metres (see crs_transform above); scale = 2 would exceed maxPixels
task.start()
print("Task started")

Task started


## ee.Algorithms.Landsat.simpleComposite()

For creating simple cloud-free Landsat composites, Earth Engine provides the ee.Algorithms.Landsat.simpleComposite() method. This method selects a subset of scenes at each location, converts them to TOA reflectance, applies the simple cloud score and takes the median of the least cloudy pixels. A composite built with the default parameters can be compared with one built with custom values for the cloud score threshold and the percentile.
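
A minimal sketch of such a comparison follows. It assumes an authenticated Earth Engine session and the raw Landsat 8 collection ID `LANDSAT/LC08/C01/T1`, chosen here to match the C01 data used elsewhere in this tutorial; the function name, the date range defaults and the custom parameter values are illustrative assumptions.

```python
def build_simple_composites(start="2015-01-01", end="2019-12-30"):
    """Return (default_composite, custom_composite) as ee.Image objects.

    Requires an initialized Earth Engine session (ee.Initialize()).
    """
    import ee  # imported here so the sketch can be loaded without a session

    # simpleComposite expects raw Landsat scenes, not surface reflectance.
    # The collection ID is an assumption matching the C01 data used above.
    l8raw = (ee.ImageCollection("LANDSAT/LC08/C01/T1")
             .filterDate(start, end)
             .filterBounds(ee.Geometry.Point(-70.0522, 18.7490)))

    # Composite with default parameters.
    default_composite = ee.Algorithms.Landsat.simpleComposite(l8raw)

    # Composite with a custom percentile and cloud score range
    # (illustrative values).
    custom_composite = ee.Algorithms.Landsat.simpleComposite(
        collection=l8raw,
        percentile=75,
        cloudScoreRange=5,
    )
    return default_composite, custom_composite
```

Both results are TOA reflectance composites, so they are not directly comparable with the surface reflectance values used in the rest of this notebook.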

## Stack L8 bands, NDVI, texture

Classification results can often be improved by adding more input data, such as vegetation indices and textural features.

In [11]:
# NDVI data
# Compute the normalized difference vegetation index from the near-infrared (B5) and red (B4) bands
# NDVI = (NIR - red) / (NIR + red)

# Functions to derive vegetation indices and other raster operations
def NDVI(image):
    return image.normalizedDifference(['B5', 'B4'])

ndvi = NDVI(mask_image)

# Texture Data
# Define the kernel
kernel = ee.Kernel.square(1, 'pixels')  # filter mean of 3*3
# Compute the 3x3 neighborhood mean of the masked Landsat 8 image as a simple texture measure
mean = mask_image.reduceNeighborhood(
    ee.Reducer.mean(),
    kernel
)

pp.pprint(mean.getInfo())

{'bands': [{'crs': 'EPSG:32619',
            'crs_transform': [30.0, 0.0, 305985.0, 0.0, -30.0, 2193615.0],
            'data_type': {'max': 32767.0,
                          'min': -32768.0,
                          'precision': 'double',
                          'type': 'PixelType'},
            'dimensions': [7621, 7771],
            'id': 'B2_mean'},
           {'crs': 'EPSG:32619',
            'crs_transform': [30.0, 0.0, 305985.0, 0.0, -30.0, 2193615.0],
            'data_type': {'max': 32767.0,
                          'min': -32768.0,
                          'precision': 'double',
                          'type': 'PixelType'},
            'dimensions': [7621, 7771],
            'id': 'B3_mean'},
           {'crs': 'EPSG:32619',
            'crs_transform': [30.0, 0.0, 305985.0, 0.0, -30.0, 2193615.0],
            'data_type': {'max': 32767.0,
                          'min': -32768.0,
                          'precision': 'double',
                          'type': 'Pix
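
A square kernel of radius 1 covers a 3 x 3 window, so each output pixel is the mean of itself and its eight neighbours. The plain-Python sketch below illustrates this on a small hypothetical grid. Edge pixels are skipped for brevity, and the way Earth Engine treats image edges and masked pixels is not modelled.

```python
def mean_3x3(grid):
    """Return the 3x3 neighbourhood mean for every interior cell of a 2-D list."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(1, rows - 1):
        out_row = []
        for c in range(1, cols - 1):
            # Collect the centre cell and its eight neighbours.
            window = [grid[r + dr][c + dc]
                      for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out

# Hypothetical scaled reflectance values, not taken from the scene.
band = [
    [700, 710, 690, 705],
    [695, 699, 702, 698],
    [680, 685, 700, 710],
    [690, 688, 694, 701],
]
smoothed = mean_3x3(band)
print(smoothed)
```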

In [12]:
spectral_indices_stack = ee.Image(mask_image).addBands(ndvi).addBands(mean)
pp.pprint(spectral_indices_stack.getInfo())

{'bands': [{'crs': 'EPSG:32619',
            'crs_transform': [30.0, 0.0, 305985.0, 0.0, -30.0, 2193615.0],
            'data_type': {'max': 32767,
                          'min': -32768,
                          'precision': 'int',
                          'type': 'PixelType'},
            'dimensions': [7621, 7771],
            'id': 'B2'},
           {'crs': 'EPSG:32619',
            'crs_transform': [30.0, 0.0, 305985.0, 0.0, -30.0, 2193615.0],
            'data_type': {'max': 32767,
                          'min': -32768,
                          'precision': 'int',
                          'type': 'PixelType'},
            'dimensions': [7621, 7771],
            'id': 'B3'},
           {'crs': 'EPSG:32619',
            'crs_transform': [30.0, 0.0, 305985.0, 0.0, -30.0, 2193615.0],
            'data_type': {'max': 32767,
                          'min': -32768,
                          'precision': 'int',
                          'type': 'PixelType'},
            'dimensio

## Training Points

We're going to perform a supervised land cover classification using the random forests algorithm. To do so, we need to provide the algorithm with training data, i.e. we need to delineate areas of known land cover to characterize the spectral signatures of the different classes. In particular, we're interested in the following 4 classes:

1. Water
2. Forest
3. Urban
4. Agriculture

## Upload fusion table


In [None]:


def edit_kmz(kmz,output,image):
  zf = zipfile.ZipFile(kmz)
  temp = r'tempfolder\doc.kml'
  for line in zf.read("doc.kml").split("\n"):
    with open(temp,'a') as wf: #Create the doc.kml...
        

In [19]:
import pprint
# Load tables into feature collection
#points = ee.FeatureCollection('ft:10q302Kafv_V_Le-jm1Cj274Ejz0kTsSWTKNxf3br').remap([1, 2, 3,4], [0, 1, 2, 3], "class")
## https://fusiontables.google.com/data?docid=10q302Kafv_V_Le-jm1Cj274Ejz0kTsSWTKNxf3br
# points = ee.FeatureCollection('ft:1p-YuR8JdqopYb-LrUxb43xzGHQyKmhGr1EJU3wku').remap([1, 2, 3,4], [0, 1, 2, 3], "Name")
## find table in https://fusiontables.google.com/data?docid=1Axs89HgE3yIgPoTMxTjL965OYgtBrp2i1O2waz29#rows:id=1
# points = ee.FeatureCollection('ft:1Axs89HgE3yIgPoTMxTjL965OYgtBrp2i1O2waz29')
# 07/11 latest points. Find them here: https://fusiontables.google.com/data?docid=1jHRNbgg-nvJ-TNJBjeB28apGsIvdrySp3WbL_PZG#rows:id=1
points = ee.FeatureCollection('ft:1jHRNbgg-nvJ-TNJBjeB28apGsIvdrySp3WbL_PZG')#.remap([1, 2, 3,4], [0, 1, 2, 3], "name")
        
pprint.pprint(points.getInfo())


{'columns': {'name': 'Number'},
 'features': [{'geometry': {'coordinates': [-69.821972, 18.509205],
                            'type': 'Point'},
               'id': '2',
               'properties': {'name': 4.0},
               'type': 'Feature'},
              {'geometry': {'coordinates': [-69.814749, 18.514787],
                            'type': 'Point'},
               'id': '3',
               'properties': {'name': 4.0},
               'type': 'Feature'},
              {'geometry': {'coordinates': [-69.816642, 18.515114],
                            'type': 'Point'},
               'id': '4',
               'properties': {'name': 4.0},
               'type': 'Feature'},
              {'geometry': {'coordinates': [-69.821361, 18.51591],
                            'type': 'Point'},
               'id': '5',
               'properties': {'name': 4.0},
               'type': 'Feature'},
              {'geometry': {'coordinates': [-69.816524, 18.519229],
                         

## Extract cell values

We need to extract the imagery cell values for each training point. This will produce a single table that associates pixels of each class with the spectral band values in those pixels. 

In [20]:
# Sample the input imagery to get a FeatureCollection of training data.
training = spectral_indices_stack.sampleRegions(points,["name"],30)
pprint.pprint(training.getInfo())

{'columns': {},
 'features': [{'geometry': None,
               'id': '2_0',
               'properties': {'B2': 699,
                              'B2_mean': 688.5555555555554,
                              'B3': 1130,
                              'B3_mean': 1118.1111111111106,
                              'B4': 1222,
                              'B4_mean': 1247.8888888888887,
                              'B5': 2226,
                              'B5_mean': 2166.666666666666,
                              'name': 4.0,
                              'nd': 0.2911833},
               'type': 'Feature'},
              {'geometry': None,
               'id': '3_0',
               'properties': {'B2': 1319,
                              'B2_mean': 894.7777777777776,
                              'B3': 1815,
                              'B3_mean': 1361.8888888888887,
                              'B4': 2095,
                              'B4_mean': 1522.3333333333328,
                   

                              'B4_mean': 175.99999999999994,
                              'B5': 2212,
                              'B5_mean': 2100.3333333333326,
                              'name': 2.0,
                              'nd': 0.8387365},
               'type': 'Feature'},
              {'geometry': None,
               'id': '96_0',
               'properties': {'B2': 118,
                              'B2_mean': 135.66666666666663,
                              'B3': 328,
                              'B3_mean': 345.7777777777777,
                              'B4': 184,
                              'B4_mean': 207.11111111111103,
                              'B5': 2570,
                              'B5_mean': 2800.666666666666,
                              'name': 2.0,
                              'nd': 0.86637616},
               'type': 'Feature'},
              {'geometry': None,
               'id': '97_0',
               'properties': {'B2': 175,
           

## Train classifier

We now have our imagery and our training data and it's time to run the random forests classification.

In [21]:
# Make a Random Forest classifier and train it.
rf = ee.Classifier.randomForest(10).train(training, ["name"])
# cart = ee.Classifier.cart().train(training,["landcover"],bands)
print(rf)

ee.Classifier({
  "type": "Invocation",
  "arguments": {
    "classifier": {
      "type": "Invocation",
      "arguments": {
        "numberOfTrees": 10
      },
      "functionName": "Classifier.randomForest"
    },
    "features": {
      "type": "Invocation",
      "arguments": {
        "image": {
          "type": "Invocation",
          "arguments": {
            "dstImg": {
              "type": "Invocation",
              "arguments": {
                "dstImg": {
                  "type": "Invocation",
                  "arguments": {
                    "input": {
                      "type": "Invocation",
                      "arguments": {
                        "collection": {
                          "type": "Invocation",
                          "arguments": {
                            "collection": {
                              "type": "Invocation",
                              "arguments": {
                                "collection": {
                   

In [22]:
# Classify the input imagery.
# import matplotlib.pyplot as plt
# import matplotlib.image as mpimg

result = spectral_indices_stack.classify(rf)
type(result)

# imgplot = plt.imshow(result)
#result_cart = image.select(bands).classify(cart)
#pprint.pprint(result.getInfo())
# Image(url=result.getThumbUrl({'min': 0, 'max': 3}))
#pp.pprint(spectral_indices_stack.getInfo())

ee.image.Image

In [None]:
# Create a split-map widget showing the classified image (the PlanetScope 4B SR layer is commented out)
map1 = ipyleaflet.Map(
    center=(18.7490, -70.0522), zoom=12,
    layout={'height':'500px'},
)
# ps4bsr_tile_url=GetTileLayerUrl(ps4bsr.median().visualize(min=0, max=3, bands=['b4', 'b3', 'b2']))
l8sr_tile_url = GetTileLayerUrl(result.visualize(min=1, max=4, gamma=1.5))  # classified image; class codes in the training table run from 1 to 4
# left = ipyleaflet.TileLayer(url=ps4bsr_tile_url)
right=ipyleaflet.TileLayer(url=l8sr_tile_url)
control = ipyleaflet.SplitMapControl(right_layer=right)
map1.add_control(control)
map1

In [71]:
import ipywidgets

In [None]:
#region = boundary.geometry().bounds().getInfo()['coordinates'][0]
task = ee.batch.Export.image.toDrive(
  image = result,
  description = "LCimage",
  #assetId = 'users/ximenamesa/LC' + "DR",
  maxPixels = 100000000,   
  #region = region,
  scale = 30)  # Landsat 8 pixel size in metres; scale = 2 would exceed maxPixels
task.start()
print("Task started")

In [None]:
# clas_col = ','.join(['red','green','blue','pink'])   
Image(url=result.getThumbUrl({'min': 1, 'max': 4}))  # class codes in the training table run from 1 to 4

## Other Classifiers

Try some other classifiers in GEE to see if the result is better or different.

## Accuracy Assessment

The final aspect of the classification is an unbiased estimate of the overall and class specific classification accuracy. For the accuracy assessment we will use a stratified random sampling procedure where the number of samples is equal for all strata. It is also possible to sample based on inclusion probability of each stratum, which depends on the frequency of pixels of each stratum.
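
Overall and class specific accuracies are derived from an error (confusion) matrix. The sketch below uses a hypothetical 4-class matrix, not results from this tutorial, and assumes that rows hold the reference classes and columns the predicted classes. The function names are introduced here for illustration.

```python
# Hypothetical error matrix: rows are reference classes, columns are
# predicted classes, in the order water, forest, urban, agriculture.
matrix = [
    [48, 1, 0, 1],
    [0, 45, 1, 4],
    [1, 0, 44, 5],
    [0, 6, 4, 40],
]

def overall_accuracy(m):
    """Share of all samples that lie on the diagonal."""
    correct = sum(m[i][i] for i in range(len(m)))
    total = sum(sum(row) for row in m)
    return correct / total

def producers_accuracy(m):
    """Correct samples divided by the reference total of each class."""
    return [m[i][i] / sum(m[i]) for i in range(len(m))]

def users_accuracy(m):
    """Correct samples divided by the predicted total of each class."""
    return [m[j][j] / sum(row[j] for row in m) for j in range(len(m))]

print(overall_accuracy(matrix))
print(producers_accuracy(matrix))
print(users_accuracy(matrix))
```

Producer's accuracy is the complement of the omission error and user's accuracy is the complement of the commission error, so the two can differ considerably for the same class.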

## Stratified Random Sampling by Land-Cover Class

Before generating random pixels from the sampling frame, we need to know how many pixels per class are required to estimate a specified accuracy with an acceptable confidence. To calculate the number of samples per class according to sampling theory, refer to equation 26 on page 44 of landCoverClassification_Theory.pdf.
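
Equation 26 of the theory document is not reproduced here. As a hedged illustration, a commonly used approximation based on the binomial distribution is n = z² · p · (1 − p) / E², where p is the expected accuracy, E the tolerated half-width of the confidence interval and z the standard normal quantile (1.96 for 95 % confidence). This form may differ from the equation in the theory document, and the input values below are assumptions.

```python
import math

def samples_needed(expected_accuracy, margin, z=1.96):
    """Binomial approximation of the sample size per class, rounded up."""
    p = expected_accuracy
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

# Assumed inputs: 85 % expected accuracy, +/- 5 % margin, 95 % confidence.
n = samples_needed(expected_accuracy=0.85, margin=0.05)
print(n)
```

The required sample size grows as the expected accuracy approaches 50 % and as the tolerated margin shrinks.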

In [27]:
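trainAccuracy = rf.confusionMatrix()
# Printing these objects shows the serialized request, as in the output below;
# append .getInfo() to fetch the computed values from the server.
print('Error matrix: ', trainAccuracy)
print('Training overall accuracy: ', trainAccuracy.accuracy())

Error matrix:  ee.ConfusionMatrix({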
  "type": "Invocation",
  "arguments": {
    "array": {
      "type": "Invocation",
      "arguments": {
        "classifier": {
          "type": "Invocation",
          "arguments": {
            "classifier": {
              "type": "Invocation",
              "arguments": {
                "numberOfTrees": 10
              },
              "functionName": "Classifier.randomForest"
            },
            "features": {
              "type": "Invocation",
              "arguments": {
                "image": {
                  "type": "Invocation",
                  "arguments": {
                    "dstImg": {
                      "type": "Invocation",
                      "arguments": {
                        "dstImg": {
                          "type": "Invocation",
                          "arguments": {
                            "input": {
                              "type": "Invocation",
                              "arguments

In [None]:
validation = result.sample(240)

In [36]:
trainingTesting = training.randomColumn()
trainingSet = trainingTesting.filter(ee.Filter.lessThan('random', 0.6)) 
testingSet = trainingTesting.filter(ee.Filter.greaterThanOrEquals('random', 0.6)) 
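
The random split above can be used to estimate accuracy on samples that the classifier has not seen during training. A minimal sketch follows, assuming the `trainingSet` and `testingSet` collections from the previous cell and the `name` label property. The function name is hypothetical, and the classifier is passed in so that any Earth Engine classifier can be evaluated.

```python
def holdout_error_matrix(training_set, testing_set, classifier, label="name"):
    """Train on one subset and return the error matrix for the other.

    All arguments are Earth Engine objects created elsewhere, for example
    classifier = ee.Classifier.randomForest(10) as used earlier.
    """
    trained = classifier.train(training_set, label)
    # classify() adds a 'classification' property to each test feature.
    classified = testing_set.classify(trained)
    # Compare the reference label with the predicted class.
    return classified.errorMatrix(label, "classification")

# Usage inside the notebook (requires an initialized Earth Engine session):
# matrix = holdout_error_matrix(trainingSet, testingSet,
#                               ee.Classifier.randomForest(10))
# print(matrix.getInfo())
# print(matrix.accuracy().getInfo())
```

Unlike the confusion matrix of the training data, this estimate is not biased by the classifier having already seen the samples, although it still depends on how representative the digitized points are.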