### This notebook demonstrates the use of post-classification filters created by the MapBiomas team.
The MapBiomas team provides guidance on post-classification filters, including gap filling, spatial filters, temporal filters, incidence filters, and frequency filters. These filters were implemented in the module `post_classification_filters.py` of the [wri_change_detection repository](https://github.com/wri/rw-dynamicworld-cd/tree/master/wri_change_detection). This notebook is a tutorial on how to apply these filters to a land cover classification time series classified using Dynamic World. The demo area used in the code below is Goldsboro, North Carolina.

You can learn more about the MapBiomas project at their [home page](https://mapbiomas.org/). MapBiomas was developed by several groups, one for each biome and cross-cutting theme that occurs in Brazil. You can read more about the methodology on the [Algorithm Theoretical Basis Document (ATBD) page](https://mapbiomas.org/en/download-of-atbds) of their website, which includes the main ATBD and appendices for each biome and cross-cutting theme.

From Section 3.5 of the ATBD, MapBiomas defines post-classification filters,
"[due] to the pixel-based classification method and the long temporal series, a chain of post-classification filters was applied. The first post-classification action involves the application of temporal filters. Then, a spatial filter was applied followed by a gap fill filter. The application of these filters remove classification noise. 
These post-classification procedures were implemented in the Google Earth Engine platform"

Below is the copy of the licensing for MapBiomas:
The MapBiomas data are public, open and free through license Creative Commons CC-BY-SA and the simple reference of the source observing the following format:
"Project MapBiomas - Collection v5.0 of Brazilian Land Cover & Use Map Series, accessed on 12/14/2020 through the link: https://github.com/mapbiomas-brazil/mapbiomas-brazil.github.io"
"MapBiomas Project - is a multi-institutional initiative to generate annual land cover and use maps using automatic classification processes applied to satellite images. The complete description of the project can be found at http://mapbiomas.org".
Access here the scientific publication: Souza et al. (2020) - Reconstructing Three Decades of Land Use and Land Cover Changes in Brazilian Biomes with Landsat Archive and Earth Engine - Remote Sensing, Volume 12, Issue 17, 10.3390/rs12172735.


This notebook includes 6 Steps:
1. Loading land cover classifications from Dynamic World
2. Applying Gap Filling for Clouds
3. Applying Temporal Filters
4. Applying Spatial Filters
5. Applying Incidence Filters
6. Applying Frequency Filters


## Step 0: Load libraries and initialize Earth Engine

In [None]:
#Load necessary libraries
import sys
import os
import ee
import geemap
import numpy as np
import pandas as pd
from IPython.display import HTML, display
from ipyleaflet import Map, basemaps
import random
import json
import time
import ast

# relative import for this folder hierarchy, credit: https://stackoverflow.com/a/35273613
module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(module_path)
    
from wri_change_detection import preprocessing as npv
from wri_change_detection import gee_classifier as gclass
from wri_change_detection import post_classification_filters as pcf



<font size="4">Initialize Earth Engine and Google Cloud authentication</font>

In [None]:
#Initialize earth engine
try:
    ee.Initialize()
except Exception as e:
    ee.Authenticate()
    ee.Initialize()

<font size="4">Define a seed number to ensure reproducibility across random processes. This seed is also passed to the Earth Engine sampling calls later in the notebook.</font>

In [None]:
num_seed=30
random.seed(num_seed)
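
As a small illustration of what the seed does (a sketch, not part of the original workflow): re-seeding Python's `random` module with the same value reproduces the same draws. Note that `random.seed` only affects client-side Python. Earth Engine sampling calls take their own `seed` argument, which is why `num_seed` is passed to them explicitly.

```python
import random

num_seed = 30

# Draw five pseudo-random integers after seeding
random.seed(num_seed)
first_draw = [random.randint(0, 100) for _ in range(5)]

# Re-seeding with the same value restarts the sequence, so the draws repeat
random.seed(num_seed)
second_draw = [random.randint(0, 100) for _ in range(5)]

print(first_draw == second_draw)  # True
```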


## Step 1: Load Land Cover Classifications and Label Data


<font size="4">

Define land cover classification image collection, with one image for each time period. Each image should have one band representing the classification in that pixel for one time period.</font>

In [None]:
#Load collection
#This collection represents monthly dynamic world classifications of land cover, later we'll squash it to annual
dynamic_world_classifications_monthly = ee.ImageCollection('projects/wings-203121/assets/dynamic-world/v3-5_stack_tests/wri_test_goldsboro')

#Get classes from first image
dw_classes = dynamic_world_classifications_monthly.first().bandNames()
dw_classes_str = dw_classes.getInfo()
full_dw_classes_str = ['No Data']+dw_classes_str

#Get dictionary of classes and values
#Define array of land cover classification values
dw_class_values = np.arange(1,10).tolist()
dw_class_values_ee = ee.List(dw_class_values)
#Create dictionary representing land cover classes and land cover class values
dw_classes_dict = ee.Dictionary.fromLists(dw_classes, dw_class_values_ee)

#Make sure the dictionary looks good
print(dw_classes_dict.getInfo())


<font size="4">Define color palettes to map land cover</font>

In [None]:
change_detection_palette = ['#ffffff', # no_data=0
                              '#419bdf', # water=1
                              '#397d49', # trees=2
                              '#88b053', # grass=3
                              '#7a87c6', # flooded_vegetation=4
                              '#e49535', # crops=5
                              '#dfc25a', # scrub_shrub=6
                              '#c4291b', # builtup=7
                              '#a59b8f', # bare_ground=8
                              '#a8ebff', # snow_ice=9
                              '#616161', # clouds=10
]
statesViz = {'min': 0, 'max': 10, 'palette': change_detection_palette};

oneChangeDetectionViz = {'min': 0, 'max': 1, 'palette': ['696a76','ff2b2b']}; #gray = 0, red = 1
consistentChangeDetectionViz = {'min': 0, 'max': 1, 'palette': ['0741df','df07b5']}; #blue = 0, pink = 1



<font size="4">Gather projection and geometry information from the land cover classifications</font>

In [None]:
projection_ee = dynamic_world_classifications_monthly.first().projection()
projection = projection_ee.getInfo()
crs = projection.get('crs')
crsTransform = projection.get('transform')
scale = dynamic_world_classifications_monthly.first().projection().nominalScale().getInfo()
print('CRS and Transform: ',crs, crsTransform)

geometry = dynamic_world_classifications_monthly.first().geometry().bounds()


<font size="4">Convert the land cover collection to a multiband image, one band for each year</font>

In [None]:
#Define years to get annual classifications for
years = np.arange(2016,2020)

#Squash scenes from monthly to annual
dynamic_world_classifications = npv.squashScenesToAnnualClassification(dynamic_world_classifications_monthly,years,method='median',image_name='dw_{}')
#Get image names 
dw_band_names = dynamic_world_classifications.aggregate_array('system:index').getInfo()
#Convert to a multiband image and rename using dw_band_names
dynamic_world_classifications_image = dynamic_world_classifications.toBands().rename(dw_band_names)
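
The internals of `squashScenesToAnnualClassification` are not shown in this notebook. As a rough per-pixel sketch of what squashing with `method='median'` could mean, assuming each monthly image holds one probability band per class: take the per-class median across months, then pick the class with the highest median. The function name `squash_to_annual_class` and the sample numbers are hypothetical.

```python
from statistics import median

def squash_to_annual_class(monthly_probs):
    """monthly_probs: one list of class probabilities per month.
    Returns the 1-based class value with the highest median probability."""
    n_classes = len(monthly_probs[0])
    medians = [median(month[c] for month in monthly_probs) for c in range(n_classes)]
    return medians.index(max(medians)) + 1

# Three hypothetical months for one pixel, three classes each.
# The second month is an outlier that favours class 2.
monthly = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.6, 0.3, 0.1],
]
print(squash_to_annual_class(monthly))  # 1
```

Using the median rather than the mean keeps a single noisy month from flipping the annual class in this example.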


<font size="4">
Load label data to later compare land cover classification to label data. Export points of labelled data in order to compare to classifications later.
</font>

In [None]:
#only labels for regions in Modesto CA, Goldsboro NC, the Everglades in FL, and one region in Brazil have been
#uploaded to this collection
labels = ee.ImageCollection('projects/wri-datalab/DynamicWorld_CD/DW_Labels')

#Filter to where we have DW classifications
labels_filtered = labels.filterBounds(dynamic_world_classifications_monthly.geometry())
print('Number of labels that overlap classifications', labels_filtered.size().getInfo())

#Save labels projection
labels_projection = labels_filtered.first().projection()
#Define geometry to sample points from 
labels_geometry = labels_filtered.geometry().bounds()

#Compress labels by majority vote
labels_filtered = labels_filtered.reduce(ee.Reducer.mode())
#Remove pixels that were classified as no data
labels_filtered = labels_filtered.mask(labels_filtered.neq(0))
#Rename band
labels_filtered = labels_filtered.rename(['labels'])


#Sample points from label image at every pixel
labelPoints = labels_filtered.sample(region=labels_geometry, projection=labels_projection, 
                                     factor=1, 
                                     seed=num_seed, dropNulls=True,
                                     geometries=True)

#Export sampled points
labelPoints_export_name = 'goldsboro'
labelPoints_assetID = 'projects/wri-datalab/DynamicWorld_CD/DW_LabelPoints_{}'
labelPoints_description = 'DW_LabelPoints_{}'

export_results_task = ee.batch.Export.table.toAsset(
    collection=labelPoints, 
    description = labelPoints_description.format(labelPoints_export_name), 
    assetId = labelPoints_assetID.format(labelPoints_export_name))
export_results_task.start()


<font size="4">
Map years to check them out.</font>

In [None]:
#Map years to check them out!
center = [35.410769, -78.100163]
zoom = 12
Map1 = geemap.Map(center=center, zoom=zoom,basemap=basemaps.Esri.WorldImagery,add_google_map = False)
Map1.addLayer(dynamic_world_classifications_image.select('dw_2016'),statesViz,name='2016 DW LC')
Map1.addLayer(dynamic_world_classifications_image.select('dw_2017'),statesViz,name='2017 DW LC')
Map1.addLayer(dynamic_world_classifications_image.select('dw_2018'),statesViz,name='2018 DW LC')
Map1.addLayer(dynamic_world_classifications_image.select('dw_2019'),statesViz,name='2019 DW LC')
Map1.addLayer(labels_filtered,statesViz,name='Labels')
display(Map1)


<font size="4">
Calculate Accuracy and Confusion Matrix for Original Classifications on Label Data</font>

In [None]:
#Load label points (the export task started above runs asynchronously and must finish before this asset exists)
labelPointsFC = ee.FeatureCollection(labelPoints_assetID.format('goldsboro'))

#Save 2019 DW classifications and rename to "dw_classifications"
dw_2019 = dynamic_world_classifications_image.select('dw_2019').rename('dw_classifications')

#Sample the 2019 classifications at each label point
labelPointsWithDW = dw_2019.sampleRegions(collection=labelPointsFC, projection = projection_ee, 
                                          tileScale=4, geometries=True)

#Calculate confusion matrix, which we will use for an accuracy assessment
originalErrorMatrix = labelPointsWithDW.errorMatrix('labels', 'dw_classifications')

#Print the confusion matrix with the class names as a dataframe
errorMatrixDf = gclass.pretty_print_confusion_matrix_multiclass(originalErrorMatrix, full_dw_classes_str)
#Axis 0 (the rows) of the matrix corresponds to the actual values, and Axis 1 (the columns) to the predicted values.
print('Axis 0 (the rows) of the matrix corresponds to the actual values, and Axis 1 (the columns) to the predicted values.')
display(errorMatrixDf)

#You can also print further accuracy scores from the confusion matrix, however each one takes a couple minutes 
#to load
print('Accuracy',originalErrorMatrix.accuracy().getInfo())
# print('Consumers Accuracy',originalErrorMatrix.consumersAccuracy().getInfo())
# print('Producers Accuracy',originalErrorMatrix.producersAccuracy().getInfo())
# print('Kappa',originalErrorMatrix.kappa().getInfo())
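
For reference, the overall accuracy reported by `accuracy()` is the number of correctly classified samples (the diagonal of the confusion matrix) divided by the total number of samples. A minimal pure-Python sketch with a made-up 3-class matrix, where rows are actual classes and columns are predicted classes:

```python
def overall_accuracy(matrix):
    """Overall accuracy = correctly classified samples / all samples."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Hypothetical counts: rows are actual classes, columns are predicted classes
example_matrix = [
    [50, 3, 2],
    [4, 40, 6],
    [1, 5, 39],
]
print(round(overall_accuracy(example_matrix), 3))  # 0.86
```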



In [None]:
#Calculate the number of changes for each year

for year in years[0:-1]:
    year_list = ['dw_{}'.format(year),'dw_{}'.format(year+1)]
    num_changes = pcf.calculateNumberOfChanges(dynamic_world_classifications_image.select(year_list), year_list)

    num_changes_mean = num_changes.reduceRegion(reducer=ee.Reducer.mean(), 
                                                  geometry=geometry,
                                                  crs=crs, crsTransform=crsTransform, 
                                                  bestEffort=True, 
                                                  maxPixels=1e13, tileScale=4)
    print('Number of changes from',year,'to',year+1,"{:.4f}".format(num_changes_mean.get('sum').getInfo()))
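
To make the printed value easier to interpret: with only two bands selected, each pixel changes either 0 or 1 times, so the regional mean is the fraction of pixels whose class differs between the two years. The sketch below counts changes per pixel in plain Python. It assumes, as the name suggests, that `calculateNumberOfChanges` counts year-to-year class differences. The helper name `count_changes` and the pixel values are hypothetical.

```python
def count_changes(series):
    """Number of times the class differs between consecutive years."""
    return sum(1 for prev, cur in zip(series, series[1:]) if prev != cur)

# Four hypothetical pixels, each with classes for two consecutive years
pixels = [[2, 2], [2, 5], [7, 7], [3, 3]]
changes = [count_changes(p) for p in pixels]
mean_changes = sum(changes) / len(changes)
print(changes, mean_changes)  # [0, 1, 0, 0] 0.25
```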



## Begin Applying Filters

<font size="4">

Now that we have prepared all of the necessary variables for post-processing, we'll start applying the filters defined by MapBiomas. The filters are designed to be applied serially, but because only a few years of Dynamic World classifications are available, here we'll apply each filter individually (after the gap filling) to see how each one performs on its own. For each filter, we'll apply the filter, then compute the overall accuracy against the label data.
</font>

## Step 2: Apply Gap Filling


<font size="4">Section 3.5.1. of the ATBD: Gap fill:
The Gap fill filter was used to fill possible no-data values. In a long time series of severely cloud-affected regions, it is expected that no-data values may populate some of the resultant median composite pixels. In this filter, no-data values (“gaps”) are theoretically not allowed and are replaced by the temporally nearest valid classification. In this procedure, if no “future” valid position is available, then the no-data value is replaced by its previous valid class. Up to three prior years can be used to fill in persistent no-data positions. Therefore, gaps should only exist if a given pixel has been permanently classified as no-data throughout the entire temporal domain.

All code for the Gap Filters was provided by the [Pampa Team](https://github.com/mapbiomas-brazil/pampa) in [this file](https://github.com/mapbiomas-brazil/pampa/blob/master/Step006_Filter_01_gagfill.js), although the same gap fill is applied to all cross-cutting themes and biome groups.
    
Functions were rewritten in Python and made independent of the land cover classification image. The implementation of the gap fill in the MapBiomas code actually applies both a forward no-data filter and a backwards no-data filter. 

For the demo Dynamic World classifications in this notebook, none of the years have any missing data! Therefore we'll introduce some fake missing data areas in order to demonstrate the gap filling.</font>
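
To make the gap-fill rule concrete, here is a minimal per-pixel sketch in plain Python. It is not the `pcf.applyGapFilter` implementation: it assumes a forward pass (fill from the most recent earlier valid year) followed by a backward pass (fill leading gaps from the next valid year), and it does not limit how many years back a value may be carried. The function names are hypothetical.

```python
NO_DATA = None  # stand-in for a masked pixel

def fill_forward(series):
    """Replace each gap with the most recent earlier valid class, if any."""
    filled, last_valid = [], NO_DATA
    for value in series:
        if value is not NO_DATA:
            last_valid = value
        filled.append(last_valid)
    return filled

def gap_fill(series):
    """Forward pass, then a backward pass for gaps at the start of the series."""
    forward = fill_forward(series)
    return fill_forward(forward[::-1])[::-1]

print(gap_fill([None, 2, None, None, 5]))  # [2, 2, 2, 2, 5]
print(gap_fill([None, None, None]))        # [None, None, None]
```

The second call shows the case described in the ATBD: a pixel that is no-data in every year stays a gap.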

In [None]:
#Introducing no data pixels for some years
dw_2016_with_gaps = dynamic_world_classifications_image.select('dw_2016').mask(dynamic_world_classifications_image.select('dw_2016').neq(ee.Image.constant(3)))
dw_2017_with_gaps = dynamic_world_classifications_image.select('dw_2017').mask(dynamic_world_classifications_image.select('dw_2017').neq(ee.Image.constant(5)))
dw_2019_with_gaps = dynamic_world_classifications_image.select('dw_2019').mask(dynamic_world_classifications_image.select('dw_2019').neq(ee.Image.constant(1)))
dw_with_gaps = dw_2016_with_gaps.addBands(dw_2017_with_gaps).addBands(dynamic_world_classifications_image.select('dw_2018')).addBands(dw_2019_with_gaps)
dw_with_gaps = dw_with_gaps.rename(dw_band_names)

#Apply gap filtering
gap_filled = pcf.applyGapFilter(dw_with_gaps, dw_band_names)


<font size="4">Map the before and after to see the effects of the gap filtering</font>

In [None]:
#Map years to check them out!
center = [35.410769, -78.100163]
zoom = 12
Map2 = geemap.Map(center=center, zoom=zoom,basemap=basemaps.Esri.WorldImagery,add_google_map = False)
Map2.addLayer(dw_with_gaps.select('dw_2016'),statesViz,name='2016 DW LC')
Map2.addLayer(gap_filled.select('dw_2016'),statesViz,name='2016 Gap Filled')
Map2.addLayer(dw_with_gaps.select('dw_2017'),statesViz,name='2017 DW LC')
Map2.addLayer(gap_filled.select('dw_2017'),statesViz,name='2017 Gap Filled')
Map2.addLayer(dw_with_gaps.select('dw_2018'),statesViz,name='2018 DW LC')
Map2.addLayer(gap_filled.select('dw_2018'),statesViz,name='2018 Gap Filled')
Map2.addLayer(dw_with_gaps.select('dw_2019'),statesViz,name='2019 DW LC')
Map2.addLayer(gap_filled.select('dw_2019'),statesViz,name='2019 Gap Filled')
display(Map2)


<font size="4">Calculate accuracy and confusion matrix for gap filled classifications on label data</font>

In [None]:
#Load label points
labelPointsFC = ee.FeatureCollection(labelPoints_assetID.format('goldsboro'))

#Save 2019 gap-filled DW classifications and rename to "dw_gap_filled_classifications"
classifications_filtered_2019 = gap_filled.select('dw_2019').rename('dw_gap_filled_classifications')

#Sample the 2019 classifications at each label point
labelPointsWithFilteredDW = classifications_filtered_2019.sampleRegions(collection=labelPointsFC, 
                                                                        projection = projection_ee, 
                                                                        tileScale=4, geometries=True)

#Calculate confusion matrix, which we will use for an accuracy assessment
filteredErrorMatrix = labelPointsWithFilteredDW.errorMatrix('labels', 'dw_gap_filled_classifications')

#Print the confusion matrix with the class names as a dataframe
errorMatrixDf = gclass.pretty_print_confusion_matrix_multiclass(filteredErrorMatrix, full_dw_classes_str)
#Axis 0 (the rows) of the matrix corresponds to the actual values, and Axis 1 (the columns) to the predicted values.
print('Axis 0 (the rows) of the matrix corresponds to the actual values, and Axis 1 (the columns) to the predicted values.')
display(errorMatrixDf)

#You can also print further accuracy scores from the confusion matrix, however each one takes a couple minutes 
#to load
print('Accuracy',filteredErrorMatrix.accuracy().getInfo())
# print('Consumers Accuracy',filteredErrorMatrix.consumersAccuracy().getInfo())
# print('Producers Accuracy',filteredErrorMatrix.producersAccuracy().getInfo())
# print('Kappa',filteredErrorMatrix.kappa().getInfo())


In [None]:
#Calculate the number of changes for each year

for year in years[0:-1]:
    year_list = ['dw_{}'.format(year),'dw_{}'.format(year+1)]
    num_changes = pcf.calculateNumberOfChanges(gap_filled.select(year_list), year_list)

    num_changes_mean = num_changes.reduceRegion(reducer=ee.Reducer.mean(), 
                                                  geometry=geometry,
                                                  crs=crs, crsTransform=crsTransform, 
                                                  bestEffort=True, 
                                                  maxPixels=1e13, tileScale=4)
    print('Number of changes from',year,'to',year+1,"{:.4f}".format(num_changes_mean.get('sum').getInfo()))



## Step 3: Apply Temporal Filters

<font size="4">
<br>
Section 3.5.3. of the ATBD: Temporal filter:
"The temporal filter uses sequential classifications in a three-to-five-years unidirectional moving window to identify temporally non-permitted transitions. Based on generic rules (GR), the temporal filter inspects the central position of three to five consecutive years, and if the extremities of the consecutive years are identical but the centre position is not, then the central pixels are reclassified to match its temporal neighbour class. For the three years based temporal filter, a single central position shall exist, for the four and five years filters, two and there central positions are respectively considered.
Another generic temporal rule is applied to extremity of consecutive years. In this case, a three consecutive years window is used and if the classifications of the first and last years are different from its neighbours, this values are replaced by the classification of its matching neighbours."
    
All code for the Temporal Filters was provided by the [Pampa Team](https://github.com/mapbiomas-brazil/pampa) in [this file](https://github.com/mapbiomas-brazil/pampa/blob/master/Step006_Filter_03_temporal.js)

Functions were rewritten in Python and made independent of the land cover classification image.

The MapBiomas implementation of the temporal filters includes the ability to perform the temporal filtering for one land cover class at a time.</font>
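
The three-year rule quoted above can be sketched for a single pixel and a single class as follows. This is an illustration of the rule as described, not the `pcf.applyWindow3years` implementation, and the function name is hypothetical. Each year is tested against the unfiltered series, so one correction does not feed into the next.

```python
def apply_window_3_years(series, class_value):
    """If a year is flanked on both sides by class_value but differs from it,
    reclassify that year to class_value."""
    out = list(series)
    for i in range(1, len(series) - 1):
        flanked = series[i - 1] == class_value and series[i + 1] == class_value
        if flanked and series[i] != class_value:
            out[i] = class_value
    return out

# One hypothetical pixel over four years: a single-year blip of class 5
series = [2, 5, 2, 2]
# Apply the rule one class at a time, as the notebook does
for class_value in [2, 5]:
    series = apply_window_3_years(series, class_value)
print(series)  # [2, 2, 2, 2]

# A lasting transition is left alone
print(apply_window_3_years([5, 5, 3, 3], 5))  # [5, 5, 3, 3]
```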

In [None]:
#Load classifications into an image that will be filtered
temporally_filtered = dynamic_world_classifications_image

#Get a list of land cover values to apply the filters
class_dictionary = dw_classes_dict.getInfo()
order_of_values = [class_dictionary.get('trees'),class_dictionary.get('crops'),class_dictionary.get('built_area'),
                  class_dictionary.get('grass'),class_dictionary.get('scrub'),class_dictionary.get('bare_ground'),
                   class_dictionary.get('flooded_vegetation'),class_dictionary.get('water'),class_dictionary.get('snow_and_ice')]

#Loop through order_of_values and apply temporal filters, in order applied by MapBiomas in https://github.com/mapbiomas-brazil/pampa/blob/master/Step006_Filter_03_temporal.js
#We'll first apply the filter to the first year
#Then apply the filter for the final year
#Then apply the 3 year window, 4 year window, and 5 year window

for id_class in order_of_values:
    temporally_filtered = pcf.applyMask3first(temporally_filtered, id_class, dw_band_names)

for id_class in order_of_values:
    temporally_filtered = pcf.applyMask3last(temporally_filtered, id_class, dw_band_names)

for id_class in order_of_values:
    temporally_filtered = pcf.applyWindow3years(temporally_filtered, id_class, dw_band_names)

for id_class in order_of_values:
    temporally_filtered = pcf.applyWindow4years(temporally_filtered, id_class, dw_band_names)

for id_class in order_of_values:
    temporally_filtered = pcf.applyWindow5years(temporally_filtered, id_class, dw_band_names)

    

In [None]:
#Map before and after along with a layer to see pixels that changed
changed_with_temporal_filter = temporally_filtered.select('dw_2017').neq(dynamic_world_classifications_image.select('dw_2017'))

center = [35.410769, -78.100163]
zoom = 12
Map3 = geemap.Map(center=center, zoom=zoom,basemap=basemaps.Esri.WorldImagery,add_google_map = False)
Map3.addLayer(dynamic_world_classifications_image.select('dw_2016'),statesViz,name='2016 LC')
Map3.addLayer(dynamic_world_classifications_image.select('dw_2017'),statesViz,name='2017 LC')
Map3.addLayer(dynamic_world_classifications_image.select('dw_2018'),statesViz,name='2018 LC')
Map3.addLayer(dynamic_world_classifications_image.select('dw_2019'),statesViz,name='2019 LC')
Map3.addLayer(temporally_filtered.select('dw_2016'),statesViz,name='2016 Post Filter')
Map3.addLayer(temporally_filtered.select('dw_2017'),statesViz,name='2017 Post Filter')
Map3.addLayer(temporally_filtered.select('dw_2018'),statesViz,name='2018 Post Filter')
Map3.addLayer(temporally_filtered.select('dw_2019'),statesViz,name='2019 Post Filter')
Map3.addLayer(changed_with_temporal_filter,oneChangeDetectionViz,name='LC Classes in 2017 that changed after filter')
#Grey areas show no change with the filter, red areas show change with the filter
display(Map3)


<font size="4">Calculate accuracy and confusion matrix for temporally filtered classifications on label data</font>

In [None]:
#Load label points
labelPointsFC = ee.FeatureCollection(labelPoints_assetID.format('goldsboro'))

#Save 2019 temporally filtered DW classifications and rename to "dw_temp_filt_classifications"
classifications_filtered_2019 = temporally_filtered.select('dw_2019').rename('dw_temp_filt_classifications')

#Sample the 2019 classifications at each label point
labelPointsWithFilteredDW = classifications_filtered_2019.sampleRegions(collection=labelPointsFC, 
                                                                        projection = projection_ee, 
                                                                        tileScale=4, geometries=True)

#Calculate confusion matrix, which we will use for an accuracy assessment
filteredErrorMatrix = labelPointsWithFilteredDW.errorMatrix('labels', 'dw_temp_filt_classifications')

#Print the confusion matrix with the class names as a dataframe
errorMatrixDf = gclass.pretty_print_confusion_matrix_multiclass(filteredErrorMatrix, full_dw_classes_str)
#Axis 0 (the rows) of the matrix corresponds to the actual values, and Axis 1 (the columns) to the predicted values.
print('Axis 0 (the rows) of the matrix corresponds to the actual values, and Axis 1 (the columns) to the predicted values.')
display(errorMatrixDf)

#You can also print further accuracy scores from the confusion matrix, however each one takes a couple minutes 
#to load
print('Accuracy',filteredErrorMatrix.accuracy().getInfo())
# print('Consumers Accuracy',filteredErrorMatrix.consumersAccuracy().getInfo())
# print('Producers Accuracy',filteredErrorMatrix.producersAccuracy().getInfo())
# print('Kappa',filteredErrorMatrix.kappa().getInfo())


In [None]:
#Calculate the number of changes for each year

for year in years[0:-1]:
    year_list = ['dw_{}'.format(year),'dw_{}'.format(year+1)]
    num_changes = pcf.calculateNumberOfChanges(temporally_filtered.select(year_list), year_list)

    num_changes_mean = num_changes.reduceRegion(reducer=ee.Reducer.mean(), 
                                                  geometry=geometry,
                                                  crs=crs, crsTransform=crsTransform, 
                                                  bestEffort=True, 
                                                  maxPixels=1e13, tileScale=4)
    print('Number of changes from',year,'to',year+1,"{:.4f}".format(num_changes_mean.get('sum').getInfo()))



## Step 4: Apply Spatial Filters
<font size="4">
<br>
Section 3.5.2. of the ATBD: Spatial filter:
Spatial filter was applied to avoid unwanted modifications to the edges of the pixel groups (blobs), a spatial filter was built based on the “connectedPixelCount” function. Native to the GEE platform, this function locates connected components (neighbours) that share the same pixel value. Thus, only pixels that do not share connections to a predefined number of identical neighbours are considered isolated. In this filter, at least five connected pixels are needed to reach the minimum connection value. Consequently, the minimum mapping unit is directly affected by the spatial filter applied, and it was defined as 5 pixels (~0.5 ha).
    
All code for the spatial filter was provided within the [integration-toolkit](https://github.com/mapbiomas-brazil/integration-toolkit) that is used to combine land cover classifications from each biome and cross-cutting theme team. The direct code was provided in [this file](https://github.com/mapbiomas-brazil/integration-toolkit/blob/master/mapbiomas-integration-toolkit.js).
    
Functions were rewritten in Python and made independent of the land cover classification image.

The spatial filters are applied for each land cover class defined by the user. For each class the user can define the minimum number of connected pixels needed to not filter out the cluster. If the number of connected pixels is too small, the central pixel is replaced by the mode of the time series.</font>
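
The connectivity test described above can be sketched on a small grid in plain Python. This is only an illustration of the idea behind `connectedPixelCount`, not the `pcf.applySpatialFilter` implementation: it assumes 8-connected neighbours, counts whole patches without an upper limit, and only flags the pixels that would be replaced rather than replacing them. The function names and the grid are hypothetical.

```python
from collections import deque

def connected_pixel_count(grid):
    """For each pixel, the size of its 8-connected patch of identical values."""
    rows, cols = len(grid), len(grid[0])
    counts = [[0] * cols for _ in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            # Breadth-first search over the patch that contains (r, c)
            patch, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                patch.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and not seen[ny][nx]
                                and grid[ny][nx] == grid[r][c]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            for y, x in patch:
                counts[y][x] = len(patch)
    return counts

def small_patch_mask(grid, min_size_by_class):
    """True where a pixel's patch is smaller than the minimum size for its class."""
    counts = connected_pixel_count(grid)
    return [[counts[r][c] < min_size_by_class.get(grid[r][c], 0)
             for c in range(len(grid[0]))] for r in range(len(grid))]

grid = [
    [2, 2, 2, 2],
    [2, 7, 2, 2],
    [2, 2, 2, 5],
    [2, 2, 5, 5],
]
counts = connected_pixel_count(grid)
mask = small_patch_mask(grid, {2: 5, 5: 10, 7: 3})
print(counts[1][1], counts[3][3], counts[0][0])  # 1 3 12
print(sum(flag for row in mask for flag in row))  # 4
```

With these thresholds the single class-7 pixel and the three class-5 pixels are flagged, while the large class-2 patch is kept.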

In [None]:
#Define a list of dictionaries, where each dictionary contains 'classValue' representing the value of the land cover
#class and 'minSize' representing the minimum connectedPixelCount needed to not be replaced by the filter

# no_data=0
# water=1
# trees=2
# grass=3
# flooded_vegetation=4
# crops=5
# scrub_shrub=6
# builtup=7
# bare_ground=8
# snow_ice=9
# clouds=10

filterParams = [
    {'classValue': 1, 'minSize': 5},
    {'classValue': 2, 'minSize': 5},
    {'classValue': 3, 'minSize': 5},
    {'classValue': 4, 'minSize': 5},
    {'classValue': 5, 'minSize': 10},
    {'classValue': 6, 'minSize': 5},
    {'classValue': 7, 'minSize': 3},
    {'classValue': 8, 'minSize': 5},
    {'classValue': 9, 'minSize': 10},
]

#Load classifications into an image that will be filtered
spatially_filtered = dynamic_world_classifications_image

#Define empty list to append outputted images from spatial filter
spatial_filter_output = []
#Loop through years
for band in dw_band_names:
    #Apply spatial filter for one year using the filterParams
    out_image = pcf.applySpatialFilter(spatially_filtered.select(band), filterParams)
    #Append result to list
    spatial_filter_output.append(out_image)
#Convert list to image collection, then to multiband image
spatially_filtered = ee.ImageCollection(spatial_filter_output).toBands().rename(dw_band_names)


In [None]:
#Map the before and after!
#The spatial filter depends on the scale, so to see the final results, reproject the image to the original 10 m resolution
changed_with_spatial_filter = dynamic_world_classifications_image.select('dw_2017').neq(spatially_filtered.select('dw_2017').reproject(crs='EPSG:3857', scale=10))

center = [35.410769, -78.100163]
zoom = 12
Map4 = geemap.Map(center=center, zoom=zoom,basemap=basemaps.Esri.WorldImagery,add_google_map = False)
Map4.addLayer(dynamic_world_classifications_image.select('dw_2017'),statesViz,name='2017 LC')
Map4.addLayer(spatially_filtered.select('dw_2017').reproject(crs='EPSG:3857', scale=10),statesViz,name='2017 LC Post Spatial Filter')
Map4.addLayer(changed_with_spatial_filter,oneChangeDetectionViz,name='Changed with spatial filter')
#Grey areas show no change with the filter, red areas show change with the filter
display(Map4)


<font size="4">Calculate accuracy and confusion matrix for spatially filtered classifications on label data</font>

In [None]:
#Load label points
labelPointsFC = ee.FeatureCollection(labelPoints_assetID.format('goldsboro'))

#Save 2019 spatially filtered DW classifications and rename to "dw_temp_spatial_filt_classifications"
classifications_filtered_2019 = spatially_filtered.select('dw_2019').rename('dw_temp_spatial_filt_classifications')

#Sample the 2019 classifications at each label point
labelPointsWithFilteredDW = classifications_filtered_2019.sampleRegions(collection=labelPointsFC, 
                                                                        projection = projection_ee, 
                                                                        tileScale=4, geometries=True)

#Calculate confusion matrix, which we will use for an accuracy assessment
filteredErrorMatrix = labelPointsWithFilteredDW.errorMatrix('labels', 'dw_temp_spatial_filt_classifications')

#Print the confusion matrix with the class names as a dataframe
errorMatrixDf = gclass.pretty_print_confusion_matrix_multiclass(filteredErrorMatrix, full_dw_classes_str)
#Axis 0 (the rows) of the matrix corresponds to the actual values, and Axis 1 (the columns) to the predicted values.
print('Axis 0 (the rows) of the matrix corresponds to the actual values, and Axis 1 (the columns) to the predicted values.')
display(errorMatrixDf)

#You can also print further accuracy scores from the confusion matrix, however each one takes a couple minutes 
#to load
print('Accuracy',filteredErrorMatrix.accuracy().getInfo())
# print('Consumers Accuracy',filteredErrorMatrix.consumersAccuracy().getInfo())
# print('Producers Accuracy',filteredErrorMatrix.producersAccuracy().getInfo())
# print('Kappa',filteredErrorMatrix.kappa().getInfo())

## Step 5: Apply Incidence Filter
<font size="4">
<br>
Section 3.5.5. of the ATBD: Incident Filter
"An incident filter were applied to remove pixels that changed too many times in the 34 years of time spam. All pixels that changed more than eight times and is connected to less than 6 pixels was replaced by the MODE value of that given pixel position in the stack of years. This avoids changes in the border of the classes and helps to stabilize originally noise pixel trajectories. Each biome and cross-cutting themes may have constituted customized applications of incident filters, see more details in its respective appendices."

This was not clearly implemented in the MapBiomas code, so this filter was coded by the WRI Team. The incidence filter finds all pixels that changed more than `numChangesCutoff` times and are connected to fewer than `connectedPixelCutoff` pixels, then replaces those pixels with the MODE value of that pixel position in the stack of years.
</font>
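
A per-pixel sketch of the rule just described, in plain Python. It is a simplification rather than the `pcf.applyIncidenceFilter` implementation: the connected-pixel count is passed in as a single number per pixel, and every year of a flagged pixel is set to the mode. The function name and sample values are hypothetical.

```python
from collections import Counter

def incidence_filter(series, connected_count, num_changes_cutoff, connected_pixel_cutoff):
    """Replace every year with the series mode when the pixel changes too often
    and sits in a small patch; otherwise return the series unchanged."""
    n_changes = sum(1 for a, b in zip(series, series[1:]) if a != b)
    if n_changes > num_changes_cutoff and connected_count < connected_pixel_cutoff:
        mode_value = Counter(series).most_common(1)[0][0]
        return [mode_value] * len(series)
    return list(series)

# Three changes in four years, small patch: replaced by the mode (class 2)
print(incidence_filter([2, 5, 2, 3], 3, 2, 6))   # [2, 2, 2, 2]
# Only one change: left alone
print(incidence_filter([2, 2, 5, 5], 3, 2, 6))   # [2, 2, 5, 5]
# Many changes but part of a large patch: left alone
print(incidence_filter([2, 5, 2, 3], 20, 2, 6))  # [2, 5, 2, 3]
```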

In [None]:
#Load classifications into an image that will be filtered
incident_filtered = dynamic_world_classifications_image

#Calculate the number of changes in each pixel before the incidence filter
num_changes = pcf.calculateNumberOfChanges(dynamic_world_classifications_image, dw_band_names)

#Apply incidence filter
incident_filtered = pcf.applyIncidenceFilter(incident_filtered, dw_band_names, dw_classes_dict, 
                                             numChangesCutoff = 2, connectedPixelCutoff=6)

#Calculate the number of changes in each pixel after the incidence filter
num_changes_post_incidence = pcf.calculateNumberOfChanges(incident_filtered, dw_band_names)

#Flag pixels where the number of changes differs before and after the filter
changed_from_incidence = num_changes.neq(num_changes_post_incidence)


In [None]:
#Map the results!
numChangesViz = {'min': 0, 'max': 3, 'palette': ['131b7a','04ecff']} #dark blue = 0 changes, cyan = 3 or more changes
center = [35.410769, -78.100163]
zoom = 12
Map5 = geemap.Map(center=center, zoom=zoom,basemap=basemaps.Esri.WorldImagery,add_google_map = False)
Map5.addLayer(num_changes,numChangesViz,name='Number of Changes Pre Filter')
Map5.addLayer(num_changes_post_incidence,numChangesViz,name='Number of Changes Post Filter')
Map5.addLayer(changed_from_incidence,oneChangeDetectionViz,name='Changed with Filter')
display(Map5)

<font size="4">Calculate accuracy and confusion matrix for incidence filtered classifications on label data</font>
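As a rough illustration of the scores requested from Earth Engine in the next cell, the sketch below computes overall accuracy, producer's accuracy, consumer's accuracy, and kappa from a confusion matrix held as a plain nested list. The function name `accuracy_scores` and the example matrix are hypothetical, and the sketch assumes that rows hold the actual (label) classes and columns hold the predicted classes.

```python
def accuracy_scores(matrix):
    """Accuracy metrics for a square confusion matrix (rows = actual, columns = predicted)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    diag = [matrix[i][i] for i in range(n)]

    # Overall accuracy: share of samples on the diagonal
    overall = sum(diag) / total
    # Producer's accuracy: correct / total for each row (actual class)
    producers = [d / s if s else float('nan') for d, s in zip(diag, row_sums)]
    # Consumer's accuracy: correct / total for each column (predicted class)
    consumers = [d / s if s else float('nan') for d, s in zip(diag, col_sums)]
    # Kappa: agreement beyond what the row and column totals would give by chance
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    return {'accuracy': overall, 'producers': producers,
            'consumers': consumers, 'kappa': kappa}

# Made-up two-class example, not results from this notebook
scores = accuracy_scores([[8, 2],
                          [1, 9]])
print(scores['accuracy'])   # 0.85
print(scores['producers'])  # [0.8, 0.9]
```

For this made-up matrix the chance agreement is 0.5, so kappa comes out at about 0.7.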

In [None]:
#Load label points
labelPointsFC = ee.FeatureCollection(labelPoints_assetID.format('goldsboro'))

#Save 2019 post-filtered DW classifications and rename to "dw_temp_incidence_filt_classifications"
classifications_filtered_2019 = incident_filtered.select('dw_2019').rename('dw_temp_incidence_filt_classifications')

#Sample the 2019 classifications at each label point
labelPointsWithFilteredDW = classifications_filtered_2019.sampleRegions(collection=labelPointsFC, 
                                                                        projection = projection_ee, 
                                                                        tileScale=4, geometries=True)

#Calculate confusion matrix, which we will use for an accuracy assessment
filteredErrorMatrix = labelPointsWithFilteredDW.errorMatrix('labels', 'dw_temp_incidence_filt_classifications')

#Print the confusion matrix with the class names as a dataframe
errorMatrixDf = gclass.pretty_print_confusion_matrix_multiclass(filteredErrorMatrix, full_dw_classes_str)
#Axis 0 (the rows) of the matrix corresponds to the actual values, and Axis 1 (the columns) to the predicted values.
print('Axis 0 (the rows) of the matrix corresponds to the actual values, and Axis 1 (the columns) to the predicted values.')
display(errorMatrixDf)

#You can also print further accuracy scores from the confusion matrix; however, each one takes a couple of minutes
#to load
print('Accuracy', filteredErrorMatrix.accuracy().getInfo())
# print('Consumers Accuracy', filteredErrorMatrix.consumersAccuracy().getInfo())
# print('Producers Accuracy', filteredErrorMatrix.producersAccuracy().getInfo())
# print('Kappa', filteredErrorMatrix.kappa().getInfo())


## Step 6: Apply Frequency Filter
<font size="4">
<br>
Section 3.5.5. of the ATBD: Frequency Filter
"This filter takes into consideration the occurrence frequency throughout the entire time series. Thus, all class occurrence with less than given percentage of temporal persistence (eg. 3 years or fewer out of 33) are filtered out. This mechanism contributes to reducing the temporal oscillation associated to a given class, decreasing the number of false positives and preserving consolidated trajectories. Each biome and cross-cutting themes may have constituted customized applications of frequency filters, see more details in their respective appendices."

This was not clearly implemented in the MapBiomas code, so this filter was coded by the WRI Team. Any class that occurs with less than a given temporal persistence (e.g. 3 years or fewer out of 33) is replaced with the mode value of that pixel position in the stack of years.
</font>
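For a single pixel, the replacement rule described above can be sketched in plain Python. This is an illustration only, not the `pcf.applyFrequencyFilter` implementation: the function name `frequency_filter` is hypothetical, and the sketch assumes a class is filtered out when it occurs fewer times than the minimum given for it.

```python
from collections import Counter

def frequency_filter(series, min_occurrences):
    """Replace rarely occurring classes in one pixel's time series with the mode.

    series: list of class names, one per year.
    min_occurrences: dict mapping class name to the minimum number of years
        the class must occur to be kept (classes not listed are always kept).
    """
    counts = Counter(series)
    mode = counts.most_common(1)[0][0]
    return [mode if counts[cls] < min_occurrences.get(cls, 0) else cls
            for cls in series]

min_occurrences = {'trees': 2, 'crops': 2, 'grass': 2}

# 'crops' and 'grass' each occur once, so both are replaced by the mode
noisy = ['trees', 'trees', 'crops', 'trees', 'grass', 'trees']
print(frequency_filter(noisy, min_occurrences))
# ['trees', 'trees', 'trees', 'trees', 'trees', 'trees']

# 'crops' occurs twice here, which meets the minimum, so nothing changes
stable = ['trees', 'crops', 'crops', 'trees', 'trees']
print(frequency_filter(stable, min_occurrences))
# ['trees', 'crops', 'crops', 'trees', 'trees']
```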

In [None]:
#Load classifications into an image that will be filtered
frequency_filtered = dynamic_world_classifications_image

#Define filterParams, which maps each class name to the minimum number of occurrences required to keep that class
filterParams = {'water':2, 
                'trees': 2, 
                'grass': 2,
                'flooded_vegetation':2,
                'crops': 2,
                'scrub': 2,
                'built_area': 2, 
                'bare_ground': 2, 
                'snow_and_ice': 2}
filterParams = ee.Dictionary(filterParams)

#Apply frequency filter
frequency_filtered = pcf.applyFrequencyFilter(frequency_filtered, dw_band_names, 
                                              dw_classes_dict, filterParams)


In [None]:
#Get binary images of the land cover classifications before the frequency filter
binary_class_images = npv.convertClassificationsToBinaryImages(dynamic_world_classifications_image, dw_classes_dict)
#Get the frequency of each class through the years by reducing the image collection to an image
class_frequency = binary_class_images.reduce(ee.Reducer.sum().unweighted()).rename(filterParams.keys().getInfo())

#Get binary images of the land cover classifications after the frequency filter
post_binary_class_images = npv.convertClassificationsToBinaryImages(frequency_filtered, dw_classes_dict)
#Get the frequency of each class through the years by reducing the image collection to an image
post_class_frequency = post_binary_class_images.reduce(ee.Reducer.sum().unweighted()).rename(filterParams.keys())

#Flag pixels where the frequency of a class differs before and after the filter
changed_from_frequency_filter = class_frequency.neq(post_class_frequency)


#Map the results!
numChangesViz = {'min': 0, 'max': 3, 'palette': ['131b7a','04ecff']} #dark blue = 0 occurrences, cyan = 3 or more occurrences
center = [35.410769, -78.100163]
zoom = 12
Map4 = geemap.Map(center=center, zoom=zoom,basemap=basemaps.Esri.WorldImagery,add_google_map = False)
Map4.addLayer(class_frequency.select('grass'),numChangesViz,name='Number of Occurrences Pre Filter')
Map4.addLayer(post_class_frequency.select('grass'),numChangesViz,name='Number of Occurrences Post Filter')
Map4.addLayer(changed_from_frequency_filter.select('grass'),oneChangeDetectionViz,name='Changed with Filter')
display(Map4)

<font size="4">Calculate accuracy and confusion matrix for frequency filtered classifications on label data</font>

In [None]:
#Load label points
labelPointsFC = ee.FeatureCollection(labelPoints_assetID.format('goldsboro'))

#Save 2019 post-filtered DW classifications and rename to "dw_temp_frequency_filt_classifications"
classifications_filtered_2019 = frequency_filtered.select('dw_2019').rename('dw_temp_frequency_filt_classifications')

#Sample the 2019 classifications at each label point
labelPointsWithFilteredDW = classifications_filtered_2019.sampleRegions(collection=labelPointsFC, 
                                                                        projection = projection_ee, 
                                                                        tileScale=4, geometries=True)

#Calculate confusion matrix, which we will use for an accuracy assessment
filteredErrorMatrix = labelPointsWithFilteredDW.errorMatrix('labels', 'dw_temp_frequency_filt_classifications')

#Print the confusion matrix with the class names as a dataframe
errorMatrixDf = gclass.pretty_print_confusion_matrix_multiclass(filteredErrorMatrix, full_dw_classes_str)
#Axis 0 (the rows) of the matrix corresponds to the actual values, and Axis 1 (the columns) to the predicted values.
print('Axis 0 (the rows) of the matrix corresponds to the actual values, and Axis 1 (the columns) to the predicted values.')
display(errorMatrixDf)

#You can also print further accuracy scores from the confusion matrix; however, each one takes a couple of minutes
#to load
print('Accuracy', filteredErrorMatrix.accuracy().getInfo())
# print('Consumers Accuracy', filteredErrorMatrix.consumersAccuracy().getInfo())
# print('Producers Accuracy', filteredErrorMatrix.producersAccuracy().getInfo())
# print('Kappa', filteredErrorMatrix.kappa().getInfo())
