# Removing clouds and using image collections

* **Special requirements:** A Google account, access to Google Earth Engine.
* **Prerequisites:** You should have completed the "Week 2 - Prac 1" notebook.


## Background

This notebook covers filtering clouds, filtering by location, and filtering by date.

There are many ways to remove clouds and cloud shadows from images; however, not all of these algorithms are available in GEE. In this notebook you'll learn to remove (i.e. 'filter') clouds and cloud shadows from Landsat images using the algorithms that are available.

Likewise, sometimes we need images from one region of interest (ROI) or from a certain date or date range. Here we'll see how to search the Landsat archive to filter for those images, but you can do the same with other optical sensors such as Sentinel-2 or MODIS (although filtering methods will vary).


***

## Aims of the practical session

This practical has three aims:
1. to demonstrate how to filter clouds using Landsat images,
1. to demonstrate how to filter image collections by location, and
1. to demonstrate how to filter images by date.


***

## Description

In this notebook you'll learn to remove (i.e. 'filter') clouds, and to filter images by location and by date.

First:
- Create a Region of Interest (ROI). This can be a polygon or a single point.

Then:
- load an image collection from the Canberra region (filter by location based on created ROI) and visualize some images corresponding to the year 2022 (filter by date), and

Finally:
- remove the clouds and evaluate the result.

**Challenge:**
- replicate the workflow for the Amazon region and compare the results to check the effectiveness of different filtering approaches in the different regions.

<div class="alert alert-block alert-warning">
<b>Assessment:</b> Once you finish the practical and the exercises, remember to submit your notebook through Wattle.
Challenges are optional and will not be part of the assessment.
</div>

***

## Getting started


### Load packages

Import Python packages that are used for the analysis.


In [9]:
%matplotlib inline

import geemap as gmap
import ee
import matplotlib.pyplot as plt

### Connect to Google Earth Engine (GEE)

Connect to GEE so we can access its datasets and computing resources.
You may be asked to sign in with your Google account. Keep your credentials safe and don't share them with anyone.

In [10]:
# Creating a map prompts geemap to authenticate and initialise Earth Engine if needed.
m = gmap.Map()

***

## Load a satellite image of the ACT region.


Use a Landsat 8 image.

First, let's display the image in 'true color'. True color means that we display the Red, Green, and Blue (RGB) bands, which makes it more intuitive to distinguish the features in the image.

You can find more information about the spectral bands of the Landsat 8 sensor [here](https://www.usgs.gov/media/images/landsat-8-band-designations).

**GEE TIP:** be mindful of whether you're using *Collection 1* or *Collection 2* images, because the band names are different. You'll have to adjust your code accordingly.
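
To make the tip concrete, here is a minimal sketch of how the band names line up between the two Landsat 8 surface reflectance collections. The dictionary `C1_TO_C2` and the helper `true_color_bands` are hypothetical names introduced for illustration; they are not part of `ee` or `geemap`.

```python
# Hypothetical lookup table: Landsat 8 Collection 1 (T1_SR) band names
# mapped to their Collection 2, Level 2 (T1_L2) counterparts.
# B11 is left out because the Collection 2, Level 2 product provides a
# single surface temperature band (ST_B10).
C1_TO_C2 = {
    'B1': 'SR_B1',
    'B2': 'SR_B2',
    'B3': 'SR_B3',
    'B4': 'SR_B4',
    'B5': 'SR_B5',
    'B6': 'SR_B6',
    'B7': 'SR_B7',
    'B10': 'ST_B10',
    'sr_aerosol': 'SR_QA_AEROSOL',
    'pixel_qa': 'QA_PIXEL',
    'radsat_qa': 'QA_RADSAT',
}


def true_color_bands(collection_number):
    """Return the red, green and blue band names for a Landsat 8 collection."""
    c1_rgb = ['B4', 'B3', 'B2']
    if collection_number == 1:
        return c1_rgb
    if collection_number == 2:
        return [C1_TO_C2[band] for band in c1_rgb]
    raise ValueError('collection_number must be 1 or 2')


print(true_color_bands(1))
print(true_color_bands(2))
```

The two lists printed here match the `bands` entries of the visualisation dictionaries used in the next cell.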

In [11]:
# We give the 'center' location, and a 'zoom' level.
Map = gmap.Map(center=[-35.2041, 149.2721], zoom=9)

# Search for a specific landsat 8 image collection 1
clearC1 = ee.Image('LANDSAT/LC08/C01/T1_SR/LC08_090084_20210219')

# Now we select the bands we want to display for the collection 1 image
landsatC1_vis = {'bands': ['B4', 'B3', 'B2'],
              'min': 0,
              'max': 3000}

# Search for a landsat 8 image collection 2
clearC2 = ee.Image('LANDSAT/LC08/C02/T1_L2/LC08_090085_20210118')

# Now we select the bands we want to display for the collection 2 image
landsatC2_vis = {'bands': ['SR_B4', 'SR_B3', 'SR_B2'],
              'min': -0.10,
              'max': 65000}

# Add the landsat 8 image to our map
Map.addLayer(clearC1, landsatC1_vis, 'a clear Landsat 8 collection 1 img')
Map.addLayer(clearC2, landsatC2_vis, 'a clear Landsat 8 collection 2 img')

Map

Map(center=[-35.2041, 149.2721], controls=(WidgetControl(options=['position', 'transparent_bg'], widget=HBox(c…

Let's have a look at the metadata of each image.

In [12]:
gmap.image_props(clearC1).getInfo()

{'CLOUD_COVER': 3.81,
 'CLOUD_COVER_LAND': 3.98,
 'EARTH_SUN_DISTANCE': 0.988715,
 'ESPA_VERSION': '2_23_0_1b',
 'GEOMETRIC_RMSE_MODEL': 6.224,
 'GEOMETRIC_RMSE_MODEL_X': 3.977,
 'GEOMETRIC_RMSE_MODEL_Y': 4.787,
 'IMAGE_DATE': '2021-02-19',
 'IMAGE_QUALITY_OLI': 9,
 'IMAGE_QUALITY_TIRS': 9,
 'LANDSAT_ID': 'LC08_L1TP_090084_20210219_20210304_01_T1',
 'LEVEL1_PRODUCTION_DATE': 1614856179000,
 'NOMINAL_SCALE': 30,
 'PIXEL_QA_VERSION': 'generate_pixel_qa_1.6.0',
 'SATELLITE': 'LANDSAT_8',
 'SENSING_TIME': '2021-02-19T23:50:20.9151729Z',
 'SOLAR_AZIMUTH_ANGLE': 62.88945,
 'SOLAR_ZENITH_ANGLE': 40.404987,
 'SR_APP_VERSION': 'LaSRC_1.3.0',
 'WRS_PATH': 90,
 'WRS_ROW': 84,
 'system:asset_size': '621.984133 MB',
 'system:band_names': ['B1',
  'B2',
  'B3',
  'B4',
  'B5',
  'B6',
  'B7',
  'B10',
  'B11',
  'sr_aerosol',
  'pixel_qa',
  'radsat_qa'],
 'system:id': 'LANDSAT/LC08/C01/T1_SR/LC08_090084_20210219',
 'system:index': 'LC08_090084_20210219',
 'system:time_end': '2021-02-19 23:50:20',
 

In [13]:
gmap.image_props(clearC2).getInfo()

{'ALGORITHM_SOURCE_SURFACE_REFLECTANCE': 'LaSRC_1.5.0',
 'ALGORITHM_SOURCE_SURFACE_TEMPERATURE': 'st_1.3.0',
 'CLOUD_COVER': 4.7,
 'CLOUD_COVER_LAND': 4.74,
 'COLLECTION_CATEGORY': 'T1',
 'COLLECTION_NUMBER': 2,
 'DATA_SOURCE_AIR_TEMPERATURE': 'MODIS',
 'DATA_SOURCE_ELEVATION': 'GLS2000',
 'DATA_SOURCE_OZONE': 'MODIS',
 'DATA_SOURCE_PRESSURE': 'Calculated',
 'DATA_SOURCE_REANALYSIS': 'GEOS-5 FP-IT',
 'DATA_SOURCE_TIRS_STRAY_LIGHT_CORRECTION': 'TIRS',
 'DATA_SOURCE_WATER_VAPOR': 'MODIS',
 'DATE_ACQUIRED': '2021-01-18',
 'DATE_PRODUCT_GENERATED': 1615075141000,
 'DATUM': 'WGS84',
 'EARTH_SUN_DISTANCE': 0.9839023,
 'ELLIPSOID': 'WGS84',
 'GEOMETRIC_RMSE_MODEL': 7.166,
 'GEOMETRIC_RMSE_MODEL_X': 4.107,
 'GEOMETRIC_RMSE_MODEL_Y': 5.872,
 'GEOMETRIC_RMSE_VERIFY': 3.9,
 'GRID_CELL_SIZE_REFLECTIVE': 30,
 'GRID_CELL_SIZE_THERMAL': 30,
 'GROUND_CONTROL_POINTS_MODEL': 1157,
 'GROUND_CONTROL_POINTS_VERIFY': 286,
 'GROUND_CONTROL_POINTS_VERSION': 5,
 'IMAGE_DATE': '2021-01-18',
 'IMAGE_QUALITY_OLI'

### <a name="ex1"></a> Exercise 1 - Answer the following questions using the information above

<div class="alert alert-block alert-danger">
    
1. What are the main differences in the metadata of the two images?
1. Why is the metadata so different if both images come from the same Landsat 8 sensor?

Answer these questions in the cell below.
</div>

.

These images look very clear. As you zoom in and out, you can easily distinguish some landscape features.

However, not all satellite images are as good as these. So let's look at another image and its metadata:


In [14]:
# Get a Landsat Image
cloudy = ee.Image('LANDSAT/LC08/C02/T1_L2/LC08_090084_20190622') 

Map1 = gmap.Map(center=[-35.2041, 149.2721], zoom=9)

# Add the image to the map
Map1.addLayer(cloudy, landsatC2_vis, 'a cloudy Landsat 8 image')

# Display the map
Map1

Map(center=[-35.2041, 149.2721], controls=(WidgetControl(options=['position', 'transparent_bg'], widget=HBox(c…

In [15]:
gmap.image_props(cloudy).getInfo()

{'ALGORITHM_SOURCE_SURFACE_REFLECTANCE': 'LaSRC_1.5.0',
 'ALGORITHM_SOURCE_SURFACE_TEMPERATURE': 'st_1.3.0',
 'CLOUD_COVER': 71.31,
 'CLOUD_COVER_LAND': 71.36,
 'COLLECTION_CATEGORY': 'T1',
 'COLLECTION_NUMBER': 2,
 'DATA_SOURCE_AIR_TEMPERATURE': 'MODIS',
 'DATA_SOURCE_ELEVATION': 'GLS2000',
 'DATA_SOURCE_OZONE': 'MODIS',
 'DATA_SOURCE_PRESSURE': 'Calculated',
 'DATA_SOURCE_REANALYSIS': 'GEOS-5 FP-IT',
 'DATA_SOURCE_TIRS_STRAY_LIGHT_CORRECTION': 'TIRS',
 'DATA_SOURCE_WATER_VAPOR': 'MODIS',
 'DATE_ACQUIRED': '2019-06-22',
 'DATE_PRODUCT_GENERATED': 1598554331000,
 'DATUM': 'WGS84',
 'EARTH_SUN_DISTANCE': 1.0163335,
 'ELLIPSOID': 'WGS84',
 'GEOMETRIC_RMSE_MODEL': 7.397,
 'GEOMETRIC_RMSE_MODEL_X': 4.717,
 'GEOMETRIC_RMSE_MODEL_Y': 5.698,
 'GEOMETRIC_RMSE_VERIFY': 4.466,
 'GRID_CELL_SIZE_REFLECTIVE': 30,
 'GRID_CELL_SIZE_THERMAL': 30,
 'GROUND_CONTROL_POINTS_MODEL': 489,
 'GROUND_CONTROL_POINTS_VERIFY': 134,
 'GROUND_CONTROL_POINTS_VERSION': 5,
 'IMAGE_DATE': '2019-06-22',
 'IMAGE_QUALITY_

<div class="alert alert-block alert-danger">

Based on the metadata above, can you tell if this is a Collection 1 or Collection 2 image?
</div>


There are several ways to remove clouds from satellite images, but for now you only need to know that (*most*) Landsat images have a pixel quality band called `pixel_qa` (Collection 1) or `QA_PIXEL` (Collection 2).

**GEE TIP:** again, be mindful of the image(s) you're loading; for example, Landsat 4, 5, and 7 use different coefficients from Landsat 8 and 9. If you get any errors when applying cloud filtering, inspect the metadata of the image you are loading and adjust your code accordingly.

You can learn more about the different Landsat processing levels [here](https://www.usgs.gov/landsat-missions/landsat-collection-2), and about the contents of the `pixel_qa` band [here](https://www.usgs.gov/landsat-missions/landsat-collection-2-quality-assessment-bands).
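
One of the simpler approaches is to discard whole scenes whose `CLOUD_COVER` metadata exceeds a threshold, before masking any individual pixels. The sketch below does this in plain Python using the three cloud cover values printed earlier in this notebook. The 20 % threshold and the names `scenes` and `keep_clear_scenes` are assumptions chosen for illustration.

```python
# CLOUD_COVER values copied from the metadata printed earlier in this notebook.
scenes = [
    {'id': 'LANDSAT/LC08/C01/T1_SR/LC08_090084_20210219', 'CLOUD_COVER': 3.81},
    {'id': 'LANDSAT/LC08/C02/T1_L2/LC08_090085_20210118', 'CLOUD_COVER': 4.7},
    {'id': 'LANDSAT/LC08/C02/T1_L2/LC08_090084_20190622', 'CLOUD_COVER': 71.31},
]


def keep_clear_scenes(scene_list, max_cloud_cover=20):
    """Keep scenes whose CLOUD_COVER is below the (assumed) threshold."""
    return [s for s in scene_list if s['CLOUD_COVER'] < max_cloud_cover]


clear_scenes = keep_clear_scenes(scenes)
for scene in clear_scenes:
    print(scene['id'], scene['CLOUD_COVER'])
```

On an image collection in GEE, the equivalent would be along the lines of `collection.filter(ee.Filter.lt('CLOUD_COVER', 20))`. This removes entire scenes, so it complements pixel-level masking rather than replacing it.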

Ok, let's try to remove the clouds from the `cloudy` and the `clear` images. To do that, we'll have to create a function, and then we'll apply that function to the images.

In [16]:
# This example demonstrates the use of the Landsat 8 Collection 2, Level 2
# QA_PIXEL band (CFMask) to mask unwanted pixels.
# First, we define a function to mask the clouds.

def cloudMaskL8Collection2(image):
    # Keep only pixels where none of these QA_PIXEL bits is set:
    # Bit 0 - Fill
    # Bit 1 - Dilated Cloud
    # Bit 2 - Cirrus
    # Bit 3 - Cloud
    # Bit 4 - Cloud Shadow
    qaMask = image.select('QA_PIXEL').bitwiseAnd(int('11111', 2)).eq(0)
    # QA_RADSAT.eq(0) = valid (unsaturated) data
    saturationMask = image.select('QA_RADSAT').eq(0)

    # Apply the scaling factors to the appropriate bands.
    opticalBands = image.select('SR_B.').multiply(0.0000275).add(-0.2)
    thermalBands = image.select('ST_B.*').multiply(0.00341802).add(149.0)

    # Replace the original bands with the scaled ones and apply the masks.
    return image.addBands(opticalBands, None, True) \
        .addBands(thermalBands, None, True) \
        .updateMask(qaMask) \
        .updateMask(saturationMask)
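
To see what `bitwiseAnd(int('11111',2)).eq(0)` does to a single pixel, the same test can be reproduced in plain Python. This is a minimal sketch: the pixel values are hypothetical and are built bit by bit for illustration, and the names `FLAGS`, `flags_set` and `is_kept` are not part of the Earth Engine API.

```python
# Bit positions tested by the mask, as listed in the function's comments.
FLAGS = {0: 'Fill', 1: 'Dilated Cloud', 2: 'Cirrus', 3: 'Cloud', 4: 'Cloud Shadow'}

MASK = int('11111', 2)  # decimal 31: selects bits 0 to 4


def flags_set(qa_value):
    """Return the names of the unwanted flags set in a QA_PIXEL value."""
    return [name for bit, name in FLAGS.items() if qa_value & (1 << bit)]


def is_kept(qa_value):
    """Mirror bitwiseAnd(MASK).eq(0): keep the pixel if none of bits 0-4 is set."""
    return (qa_value & MASK) == 0


# Hypothetical QA_PIXEL values, constructed for illustration only.
clear_pixel = 1 << 6                 # only a higher bit set, outside the mask
cloud_pixel = (1 << 3) | (1 << 1)    # Cloud and Dilated Cloud
shadow_pixel = 1 << 4                # Cloud Shadow

for value in (clear_pixel, cloud_pixel, shadow_pixel):
    print(value, bin(value), flags_set(value), is_kept(value))
```

Only the first value passes the test, because none of its five lowest bits is set. Bits above bit 4 are ignored by this particular mask.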

Let's apply the function and display the images to see the result.

Change the opacity of the images so you can see the differences more clearly.

In [17]:
Map2 = gmap.Map(center=[-35.2041, 149.2721], zoom=9)

# Apply the function to the images
clearMasked = cloudMaskL8Collection2(clearC2)
cloudyMasked = cloudMaskL8Collection2(cloudy)

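# Display the results. The masking function rescales the optical bands to
# surface reflectance (roughly 0-1), so the masked layers need their own
# display range. Adjust 'max' if the layers look too dark or too bright.
scaled_vis = {'bands': ['SR_B4', 'SR_B3', 'SR_B2'], 'min': 0, 'max': 0.3}
Map2.addLayer(clearMasked, scaled_vis, 'clearMasked')
Map2.addLayer(cloudyMasked, scaled_vis, 'cloudyMasked')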
Map2.addLayer(clearC2, landsatC2_vis, 'a clear Landsat 8 image')
Map2.addLayer(cloudy, landsatC2_vis, 'a cloudy Landsat 8 image')

Map2

Map(center=[-35.2041, 149.2721], controls=(WidgetControl(options=['position', 'transparent_bg'], widget=HBox(c…
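
The scale and offset applied inside `cloudMaskL8Collection2` can be checked by hand. The sketch below applies the same factors to a few hypothetical digital numbers (DN); the function names and input values are assumptions for illustration.

```python
def scale_optical(dn):
    """Convert a Collection 2 SR_B* digital number to surface reflectance."""
    return dn * 0.0000275 + (-0.2)


def scale_thermal(dn):
    """Convert a Collection 2 ST_B* digital number to temperature in kelvin."""
    return dn * 0.00341802 + 149.0


# Hypothetical digital numbers, chosen only to show the order of magnitude.
for dn in (10000, 20000):
    print(dn, round(scale_optical(dn), 4))

print(44000, round(scale_thermal(44000), 2))
```

After scaling, the optical values fall roughly between 0 and 1, so visualisation ranges for scaled images are much smaller than those used for the raw digital numbers.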

It's your turn to code.

### <a name="ex2"></a> Exercise 2 -  Create a cloud masking function for a Landsat 8 Collection 1 image. 


In [18]:
# Your code goes here.

def cloudMaskL8Collection1(image):
  
    return 

In [19]:
Map3 = gmap.Map(center=[-35.2041, 149.2721], zoom=9)



Map3

Map(center=[-35.2041, 149.2721], controls=(WidgetControl(options=['position', 'transparent_bg'], widget=HBox(c…

<a href="#ex2answer">Answer to Exercise 2</a>

***

## Add multiple images at once (i.e. an Image Collection)

We are now going to use `ee.ImageCollection` to load the Landsat 8 image collection.

This shows *all available* Landsat 8 images for the whole globe. In some circumstances this could be useful, but we don't want (or need) all those images.

In [22]:
Map4 = gmap.Map(center=[-35.2041, 149.2721], zoom=8)

# Load the full Landsat 8 Collection 2, Level 2 collection (no filtering yet).
collection = ee.ImageCollection('LANDSAT/LC08/C02/T1_L2')

# Display the results.
Map4.addLayer(collection, landsatC2_vis, 'collection')
Map4

Map(center=[-35.2041, 149.2721], controls=(WidgetControl(options=['position', 'transparent_bg'], widget=HBox(c…

But we don't need *all* Landsat 8 images; we only want those directly above the ACT.
Let's filter the image collection using a polygon (i.e. a Region of Interest).

You can create an ROI using the drawing tools on the left of your map (see below), or you can upload your own vector file (not shown in this notebook).
![2.2_ROI.PNG](attachment:2.2_ROI.PNG)

Once we have a polygon, we need to extract its coordinates so we can tell GEE that we want *only the images that intersect that polygon*. We do that with the map's `user_roi` property:

In [25]:
Map4.user_roi.getInfo()
# Note that this will be different for you because my polygon is not the same as yours.
# Also note the 'BaseMap' you're using to create the polygon
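
If you have not drawn a polygon, a region of interest can also be written by hand. The sketch below builds a GeoJSON-style polygon dictionary in plain Python. The helper name `rectangle_roi` and the bounding coordinates (a rough box around Canberra) are assumptions for illustration, and the dictionary returned by `user_roi.getInfo()` may contain additional keys.

```python
def rectangle_roi(west, south, east, north):
    """Build a GeoJSON-style polygon from bounding coordinates in degrees."""
    ring = [
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south],  # repeat the first vertex to close the ring
    ]
    return {'type': 'Polygon', 'coordinates': [ring]}


# Hypothetical bounds roughly enclosing Canberra: longitude first, then latitude.
roi = rectangle_roi(148.9, -35.5, 149.4, -35.1)
print(roi)
```

Note the coordinate order: GeoJSON and `ee.Geometry` use `[longitude, latitude]`, whereas the `center` argument of `gmap.Map` uses `[latitude, longitude]`. A dictionary of this shape can be wrapped with `ee.Geometry(roi)` before being passed to a spatial filter.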

## Filter an image collection by location

To filter the image collection, we use the `filterBounds` method in GEE. Read more [here](https://developers.google.com/earth-engine/apidocs/ee-imagecollection-filterbounds).

In [27]:
Map5 = gmap.Map(center=[-35,149], zoom=8)

# Filter the image collection to select only the images that intersect the ROI
filtered = collection.filterBounds(Map4.user_roi.getInfo())

# Add the filtered Landsat 8 images to our map
Map5.addLayer(filtered, landsatC2_vis, 'only images from the ACT')

Map5

In [None]:
# we can see how many Landsat 8 images there are over the ACT:
print(f'there are: {filtered.size().getInfo()} Landsat 8 images that intersect our ROI')

Now let's use a point to filter the image collection:

In [None]:
Map5 = gmap.Map(center=[-35,149], zoom=8)

# Create a point with set coordinates
point = ee.Geometry.Point([149.158494, -35.156445])
    
# Filter the image collection to select only the images that intersect the point
filteredByPoint = collection.filterBounds(point)

# Add the filtered Landsat 8 images to our map
Map5.addLayer(filteredByPoint, {}, 'Landsat 8 images that intersect our point')
Map5.addLayer(point,{},'The point')
Map5
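
Conceptually, `filterBounds` keeps every image whose footprint intersects the geometry you pass in. The plain-Python sketch below imitates that idea with simple bounding boxes. It is a simplification: the scene names and footprints are hypothetical, and real Landsat footprints are tilted quadrilaterals rather than axis-aligned boxes.

```python
# Hypothetical scene footprints as (west, south, east, north) boxes in degrees.
footprints = {
    'scene_a': (147.9, -36.3, 150.5, -34.1),
    'scene_b': (150.0, -34.9, 152.6, -32.7),
}


def box_contains_point(box, lon, lat):
    """Return True if the point lies inside (or on the edge of) the box."""
    west, south, east, north = box
    return west <= lon <= east and south <= lat <= north


# The same point used in the cell above (longitude, latitude).
lon, lat = 149.158494, -35.156445

matching = [name for name, box in footprints.items()
            if box_contains_point(box, lon, lat)]
print(matching)
```

Only the footprints that contain the point are kept, which is why filtering by a point typically returns fewer images than filtering by a large polygon.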

***

## Filter an image collection by date

Similar to `filterBounds`, GEE has a `filterDate` method. Let's see it in action.
You can learn more [here](https://developers.google.com/earth-engine/apidocs/ee-imagecollection-filterdate)

In [None]:
Map6 = gmap.Map(center=[-35,149], zoom=8)

# Note we're still using the whole Landsat 8 collection from above, and
# the same ROI. The end date is exclusive, so this selects all of 2021.
filteredByDate = collection \
                        .filterBounds(Map4.user_roi.getInfo()) \
                        .filterDate('2021', '2022')

# We can see how many Landsat 8 images were collected:
print(f'there are: {filteredByDate.size().getInfo()} Landsat 8 images over the ACT for 2021')

Map6.addLayer(filteredByDate, landsatC2_vis, 'Landsat 8 images over the ACT for 2021')

Map6

But we can be much more specific than just giving the years: we can filter the image collection using years and months, or using specific dates.

In the cells below, try using the following:
>- `startDate = '2014-01'`,  `endDate = '2014-12'`, and
>- `startDate = '2021-04-16'`,  `endDate = '2021-04-30'`
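
When choosing dates, keep in mind that `filterDate` treats the start date as inclusive and the end date as exclusive. The sketch below mimics that behaviour with Python's `datetime`. It assumes that a partial date such as `'2014-12'` is read as the first instant of that month; the acquisition dates and the helper `in_date_range` are hypothetical.

```python
from datetime import datetime


def in_date_range(acquired, start, end):
    """Return True if start <= acquired < end (start inclusive, end exclusive)."""
    return start <= acquired < end


start = datetime(2014, 1, 1)   # assumed meaning of '2014-01'
end = datetime(2014, 12, 1)    # assumed meaning of '2014-12'

# Hypothetical acquisition dates.
mid_year = datetime(2014, 6, 15)
december = datetime(2014, 12, 10)

print(in_date_range(mid_year, start, end))
print(in_date_range(december, start, end))
print(in_date_range(end, start, end))
```

Under these assumptions, an end date of `'2014-12'` leaves out images acquired during December, so you would need a later end date to include them.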

In [None]:
# Your code goes here.


In [None]:
# Your code goes here.


***

## Put everything together

It's time to put all the things we've learned together:

1. Load a Landsat 8 image collection.
1. Filter the collection by date.
1. Filter the collection by location.
1. Remove the clouds from the image collection, and add the masked collection to the map.

In [None]:
Map6 = gmap.Map(center=[-35.2041, 149.2721], zoom=7)

# Get the image collection
actImages = (ee.ImageCollection('LANDSAT/LC08/C02/T1_L2')
             # Filter by date
             .filterDate('2021-01-10', '2021-01-30')
             # Filter by location
             .filterBounds(Map4.user_roi.getInfo())
             # Remove the clouds from the selected images
             .map(cloudMaskL8Collection2))

# Display the results.
Map6.addLayer(actImages, {'bands': ['SR_B4',  'SR_B3',  'SR_B2'], 'min': 0, 'max': 0.9},'ACT Images')
Map6

***

## Summary

In this notebook you have learned about:
- Different Landsat collections, by loading images from Collection 1 and Collection 2. They're similar, but not exactly the same, so be mindful of which images/collections you're using. This **will** affect your code.
- Single images and image collections. A single image is a single 'photo' of Earth, while an image collection contains all the images taken by a single satellite and processed to a certain quality.
- Filtering images and image collections by date and location. You did this by using specific dates, specific months, or just the years you're interested in. You also drew polygons (regions of interest) and gathered the images that intersected those regions of interest.

***

## References and useful readings

- Chapters 12, 13, and 14 from the book "Earth Observation: Data, Processing and Applications, Volume 1A: Data—Basics and Acquisition". Available through Wattle, or http://www.crcsi.com.au/earth-observation-series.
- https://geemap.org/
- http://dx.doi.org/10.1016/j.rse.2015.11.032
- https://doi.org/10.3390/rs1030184
- https://doi.org/10.1016/j.rse.2014.02.001
- https://doi.org/10.1016/j.rse.2019.05.024

***

## Additional information

**Sources:** The code in this notebook has several sources, including https://github.com/giswqs/geemap.

**License:** Some of the code in this notebook was initially created by [Qiusheng Wu](https://github.com/giswqs), and has been modified by Nicolas Younes. The code in this notebook is licensed under a [Creative Commons Attribution 4.0 International License](https://creativecommons.org/licenses/by/4.0/), and an [MIT Licence](https://mit-license.org/). 

**Contact:** If you need assistance, please post a question on the ENGN3903 Wattle course forum.

**Last modified:** July 2022

***