In [1]:
# !pip install geemap

# Machine Learning with Earth Engine - Unsupervised Classification

## Unsupervised classification algorithms available in Earth Engine

Source: https://developers.google.com/earth-engine/clustering

The `ee.Clusterer` package handles unsupervised classification (or clustering) in Earth Engine. These algorithms are currently based on the algorithms with the same name in [Weka](http://www.cs.waikato.ac.nz/ml/weka/). More details about each Clusterer are available in the reference docs in the Code Editor.

Clusterers are used in the same manner as classifiers in Earth Engine. The general workflow for clustering is:

1. Assemble features with numeric properties in which to find clusters.
2. Instantiate a clusterer. Set its parameters if necessary.
3. Train the clusterer using the training data.
4. Apply the clusterer to an image or feature collection.
5. Label the clusters.

The training data is a `FeatureCollection` with properties that will be input to the clusterer. Unlike classifiers, there is no input class value for a `Clusterer`. Like classifiers, the data for the train and apply steps are expected to have the same number of values. When a trained clusterer is applied to an image or table, it assigns an integer cluster ID to each pixel or feature.
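To make the train and apply steps concrete, here is a minimal pure-Python k-means sketch. It is an illustration only, not Earth Engine's or Weka's implementation: the helper names (`train_kmeans`, `apply_clusterer`) and the two-band sample values are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the point (the integer cluster ID)."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def train_kmeans(samples, k, iterations=20, seed=0):
    """Train step: fit k centroids to the sampled feature vectors."""
    rng = random.Random(seed)
    centroids = rng.sample(samples, k)
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for sample in samples:
            groups[nearest(sample, centroids)].append(sample)
        for i, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(col) / len(group) for col in zip(*group))
    return centroids


def apply_clusterer(centroids, pixels):
    """Apply step: assign an integer cluster ID to every pixel."""
    return [nearest(pixel, centroids) for pixel in pixels]


# Hypothetical two-band samples: three dark pixels and three bright pixels.
training = [(0.10, 0.12), (0.11, 0.09), (0.09, 0.10),
            (0.80, 0.82), (0.79, 0.85), (0.83, 0.80)]
centroids = train_kmeans(training, k=2)

# Pixels passed to the apply step must have the same number of values (bands).
labels = apply_clusterer(centroids, [(0.1, 0.1), (0.8, 0.8), (0.12, 0.08)])
print(labels)
```

Note that no class value is supplied anywhere: the cluster IDs come only from how the samples group together, so which ID ends up meaning "dark" or "bright" is arbitrary.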

Here is a simple example of building and using an ee.Clusterer:

![](https://i.imgur.com/IcBapEx.png)

## Step-by-step tutorial

### Import libraries

In [7]:
import ee
import geemap
ee.Authenticate()
ee.Initialize(project='ee-eslamelnahas-jupyter')

In [8]:
# Define a region in which to generate a segmented map.
region = ee.Geometry.Rectangle(29.7, 30, 32.5, 31.7)

# Load a Landsat composite for input.
input = (
    ee.ImageCollection('LANDSAT/COMPOSITES/C02/T1_L2_32DAY')
    .filterDate('2001-05', '2001-06')
    .first()
    .clip(region)
)

# Display the sample region.
m = geemap.Map()
m.set_center(31.5, 31.0, 8)
m.add_layer(ee.Image().paint(region, 0, 2), {}, 'region')

# Make the training dataset.
training = input.sample(region=region, scale=30, numPixels=5000)

# Instantiate the clusterer and train it.
n_cluster = 15  # Number of clusters (classes) to produce
clusterer = ee.Clusterer.wekaKMeans(n_cluster).train(training)

# Cluster the input using the trained clusterer.
result = input.cluster(clusterer)

# Display the clusters with random colors.
m.add_layer(result.randomVisualizer(), {}, 'clusters')
m

### Create an interactive map

In [10]:
Map = geemap.Map()
Map

### Add data to the map

In [12]:
point = ee.Geometry.Point([31.5, 31.5])

# Pick the least cloudy Sentinel-2 surface reflectance scene over the point.
first = (
    ee.ImageCollection('COPERNICUS/S2_SR')
    .filterBounds(point)
    .filterDate('2024-01-01', '2024-08-31')
    .sort('CLOUDY_PIXEL_PERCENTAGE')
    .first()
)
Map.centerObject(first, 10)
Map.addLayer(first, {"bands": ['B4', 'B3', 'B2'], "min": 0, "max": 4000}, 'first')
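The filter, sort, and `first()` chain above selects the least cloudy scene in the date range. The sketch below mimics that selection logic on plain Python dictionaries. The scene list is hypothetical, apart from the 2024-07-11 entry, which reuses the values reported in the property output further down. The helper `least_cloudy` is not part of any library. It treats the end date as exclusive, as `filterDate` does.

```python
from datetime import date

# Hypothetical scene metadata; only the 2024-07-11 values come from the notebook.
scenes = [
    {"IMAGE_DATE": date(2024, 3, 3), "CLOUDY_PIXEL_PERCENTAGE": 12.5},
    {"IMAGE_DATE": date(2024, 7, 11), "CLOUDY_PIXEL_PERCENTAGE": 0.002369},
    {"IMAGE_DATE": date(2024, 8, 31), "CLOUDY_PIXEL_PERCENTAGE": 0.0},
    {"IMAGE_DATE": date(2023, 12, 30), "CLOUDY_PIXEL_PERCENTAGE": 0.0},
]


def least_cloudy(scenes, start, end):
    """Return the scene with the lowest cloud percentage in [start, end)."""
    in_range = [s for s in scenes if start <= s["IMAGE_DATE"] < end]
    ranked = sorted(in_range, key=lambda s: s["CLOUDY_PIXEL_PERCENTAGE"])
    return ranked[0] if ranked else None


best = least_cloudy(scenes, date(2024, 1, 1), date(2024, 8, 31))
print(best["IMAGE_DATE"])
```

In this sketch the cloud-free scene dated 2024-08-31 is not selected because the end of the range is exclusive; extend the end date by one day if the last day should be included.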


### Check image properties

In [14]:
props = geemap.image_props(first)
props.getInfo()

{'AOT_RETRIEVAL_ACCURACY': 0,
 'AOT_RETRIEVAL_METHOD': 'SEN2COR_DDV',
 'BOA_ADD_OFFSET_B1': -1000,
 'BOA_ADD_OFFSET_B10': -1000,
 'BOA_ADD_OFFSET_B11': -1000,
 'BOA_ADD_OFFSET_B12': -1000,
 'BOA_ADD_OFFSET_B2': -1000,
 'BOA_ADD_OFFSET_B3': -1000,
 'BOA_ADD_OFFSET_B4': -1000,
 'BOA_ADD_OFFSET_B5': -1000,
 'BOA_ADD_OFFSET_B6': -1000,
 'BOA_ADD_OFFSET_B7': -1000,
 'BOA_ADD_OFFSET_B8': -1000,
 'BOA_ADD_OFFSET_B8A': -1000,
 'BOA_ADD_OFFSET_B9': -1000,
 'CLOUDY_PIXEL_OVER_LAND_PERCENTAGE': 0.001766,
 'CLOUDY_PIXEL_PERCENTAGE': 0.002369,
 'CLOUD_COVERAGE_ASSESSMENT': 0.002369,
 'CLOUD_SHADOW_PERCENTAGE': 0,
 'DARK_FEATURES_PERCENTAGE': 0.008222,
 'DATASTRIP_ID': 'S2B_OPER_MSI_L2A_DS_2BPS_20240711T111659_S20240711T083525_N05.10',
 'DATATAKE_IDENTIFIER': 'GS2B_20240711T082559_038371_N05.10',
 'DATATAKE_TYPE': 'INS-NOBS',
 'DEGRADED_MSI_DATA_PERCENTAGE': 0.0213,
 'FORMAT_CORRECTNESS': 'PASSED',
 'GENERAL_QUALITY': 'PASSED',
 'GENERATION_TIME': 1720696619000,
 'GEOMETRIC_QUALITY': 'PASSED',
 'GRA

In [15]:
props.get("IMAGE_DATE").getInfo()

'2024-07-11'

In [16]:
props.get("CLOUD_COVERAGE_ASSESSMENT").getInfo()

0.002369

In [19]:
Map.user_roi.getInfo()

{'geodesic': False,
 'type': 'Polygon',
 'coordinates': [[[31.217651, 31.102407],
   [31.217651, 31.594986],
   [31.992188, 31.594986],
   [31.992188, 31.102407],
   [31.217651, 31.102407]]]}

In [20]:
# region = Map.user_roi
# region = ee.Geometry.Rectangle([31.217651, 31.102407, 31.992188, 31.594986])
# region = ee.Geometry.Point([31.6049195, 31.3486965]).buffer(10000)

### Make training dataset

There are several ways you can create a region for generating the training dataset.

- Draw a shape (e.g., a rectangle) on the map and then use `region = Map.user_roi`
- Define a geometry, such as `region = ee.Geometry.Rectangle([31.217651, 31.102407, 31.992188, 31.594986])`
- Create a buffer zone around a point, such as `region = ee.Geometry.Point([31.6049195, 31.3486965]).buffer(10000)`
- If you don't define a region, sampling uses the image footprint by default
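The rectangle and the buffer point in the options above can both be derived from the drawn polygon. As a small sketch, the hypothetical helper `bounding_box` below takes the coordinate ring returned by `Map.user_roi.getInfo()` earlier in this notebook and computes the `[xmin, ymin, xmax, ymax]` list that `ee.Geometry.Rectangle` expects, plus the centre point used for the buffer.

```python
def bounding_box(ring):
    """Return [xmin, ymin, xmax, ymax] for a ring of [lon, lat] pairs."""
    lons = [lon for lon, _ in ring]
    lats = [lat for _, lat in ring]
    return [min(lons), min(lats), max(lons), max(lats)]


# Outer ring copied from the Map.user_roi.getInfo() output above.
roi_ring = [
    [31.217651, 31.102407],
    [31.217651, 31.594986],
    [31.992188, 31.594986],
    [31.992188, 31.102407],
    [31.217651, 31.102407],
]

bbox = bounding_box(roi_ring)
centre = [(bbox[0] + bbox[2]) / 2, (bbox[1] + bbox[3]) / 2]
print(bbox)
print(centre)
```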

In [24]:
# Make the training dataset.
training = input.sample(
    **{
        # 'region': region,
        "scale": 30,
        "numPixels": 5000,
        "seed": 0,
        "geometries": True,  # Set this to False to ignore geometries
    }
)

Map.addLayer(training, {}, "training", False)
Map

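The `seed` argument makes the random draw of training pixels repeatable. The sketch below illustrates that idea with Python's `random` module on a hypothetical 100 x 80 pixel grid. It is not how Earth Engine samples internally, and `sample_pixels` is a made-up helper.

```python
import random


def sample_pixels(width, height, num_pixels, seed):
    """Draw num_pixels distinct (row, col) positions from a width x height grid."""
    rng = random.Random(seed)
    all_pixels = [(row, col) for row in range(height) for col in range(width)]
    return rng.sample(all_pixels, min(num_pixels, len(all_pixels)))


run_a = sample_pixels(100, 80, 50, seed=0)
run_b = sample_pixels(100, 80, 50, seed=0)  # same seed: identical sample
run_c = sample_pixels(100, 80, 50, seed=1)  # different seed: different sample

print(run_a == run_b, run_a == run_c)
```

Fixing the seed keeps the training set stable between runs, which makes it easier to compare results when only the number of clusters changes.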

### Train the clusterer

In [43]:
# Instantiate the clusterer and train it.
n_clusters = 5
clusterer = ee.Clusterer.wekaKMeans(n_clusters).train(training)

### Classify the image

In [46]:
# Cluster the input using the trained clusterer.
result = input.cluster(clusterer)

# Display the clusters with random colors.
Map.addLayer(result.randomVisualizer(), {}, "clusters")
Map

### Label the clusters

In [49]:
legend_keys = ["Vegetation", "Building", "Desert", "Water", "Other"]
legend_colors = ["#008000", "#A52A2A", "#FFFF00", "#0000FF", "#80B1D3"]

# Reclassify the map
result = result.remap([0, 1, 2, 3, 4], [1, 2, 3, 4, 5])

Map.addLayer(
    result, {"min": 1, "max": 5, "palette": legend_colors}, "Labelled clusters"
)
Map.add_legend(
    legend_keys=legend_keys, legend_colors=legend_colors, position="bottomright"
)
Map

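The labelling step combines a value remap with a palette lookup. The sketch below reproduces that bookkeeping in plain Python so the mapping is easy to inspect. The helpers `remap` and `legend_entry` are hypothetical, and returning `None` for unmatched values is only a stand-in for a pixel that gets no class. Cluster IDs from k-means carry no inherent meaning, so the order of the labels is an assumption that should be checked against the map before it is trusted.

```python
legend_keys = ["Vegetation", "Building", "Desert", "Water", "Other"]
legend_colors = ["#008000", "#A52A2A", "#FFFF00", "#0000FF", "#80B1D3"]

# Cluster IDs produced by the clusterer and the class values they become.
cluster_ids = [0, 1, 2, 3, 4]
class_values = [1, 2, 3, 4, 5]
remap_table = dict(zip(cluster_ids, class_values))


def remap(pixels, table, default=None):
    """Replace each cluster ID by its class value; unmatched IDs get default."""
    return [table.get(pixel, default) for pixel in pixels]


def legend_entry(class_value):
    """With min=1 and max=5, class value v uses palette entry v - 1."""
    index = class_value - 1
    return legend_keys[index], legend_colors[index]


remapped = remap([0, 3, 4, 7], remap_table)
print(remapped)
print(legend_entry(1), legend_entry(4))
```

If visual inspection shows that, say, cluster 2 is actually water, reorder `class_values` rather than the legend so the colors stay tied to their labels.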

### Visualize the result

In [32]:
print("Change layer opacity:")
cluster_layer = Map.layers[-1]
cluster_layer.interact(opacity=(0, 1, 0.1))

Change layer opacity:



### Export the result

Export the result directly to your computer:

In [34]:
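import os

# Save to the user's Downloads folder, creating it if it does not exist.
out_dir = os.path.join(os.path.expanduser("~"), "Downloads")
os.makedirs(out_dir, exist_ok=True)
out_file = os.path.join(out_dir, "cluster.tif")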

In [35]:
geemap.ee_export_image(result, filename=out_file, scale=90)

Generating URL ...
Downloading data from https://earthengine.googleapis.com/v1/projects/ee-eslamelnahas-jupyter/thumbnails/02dfdfb3fdd5aaf77c0667c525de60e4-f8c00e877514cb67e22557a8ef47418b:getPixels
Please wait ...
An error occurred while downloading.


Export the result to Google Drive:

In [51]:
geemap.ee_export_image_to_drive(
    result, description="clusters", folder="export", scale=90
)