# Cloudcover

*****

Cloud cover is a significant problem for optical instruments. Many data products provide a quality assessment as a separate layer: `QA_PIXEL` for Landsat and `SCL` for Sentinel 2.

**However**, these quality layers are far from perfect. This notebook provides an overview of cloud cover assessment. **In summary**, it is often best to **inspect your scenes and select the dates of interest individually**, or at least to test different quality classes. Some quality flags are useful in one environment (for example, a specific cloud class in the lowlands) but ineffective or wrong in another (for example, over glaciers and snow in the mountains).

Different satellite instruments (three for Landsat alone) come with different quality layers, and it is worth looking into them individually. Unfortunately, no single method is valid for all regions and all sensors.

*****

In the following you see four examples:
- Landsat 8:
    - Fribourg area
    - Mountain area
- Sentinel 2:
    - Fribourg area
    - Mountain area

*****

Note that the approaches are **slightly different between Landsat and Sentinel 2**:
- Landsat uses a bit mask, which is handled with the function `create_mask_from_bits`
- Sentinel 2 uses integer class values, which are handled with the function `create_mask_from_values`
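
To make the difference between the two approaches concrete, here is a minimal NumPy sketch. The helpers `mask_from_bits` and `mask_from_values` are hypothetical stand-ins written for this illustration; they are not the `sdc_utilities` functions, whose implementation may differ.

```python
import numpy as np

def mask_from_bits(qa, bit_positions):
    """Return 1 where any of the given bits is set, else 0 (hypothetical helper)."""
    qa = np.asarray(qa, dtype=np.uint16)
    combined = 0
    for bit in bit_positions:
        combined |= 1 << bit  # build one integer with all requested bits set
    return ((qa & combined) != 0).astype(np.uint8)

def mask_from_values(classes, invalid_values):
    """Return 1 where the class number is in invalid_values, else 0 (hypothetical helper)."""
    return np.isin(classes, invalid_values).astype(np.uint8)

# Bit mask: each pixel packs several yes/no flags into one integer
qa = np.array([0b0000_0001, 0b0000_1000, 0b0100_0000, 0b0000_1001])
bit_mask = mask_from_bits(qa, [0, 3])

# Class values: each pixel holds exactly one category number
scl = np.array([0, 4, 9, 11])
value_mask = mask_from_values(scl, [0, 1, 9, 10])

print(bit_mask, value_mask)
```

With a bit mask, one pixel can carry several flags at once, so the test is "is this bit set?". With class values, a pixel belongs to exactly one class, so the test is "is this number in my list?".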

In [None]:
# Load packages
import ast
import time
import warnings

import dask.distributed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import psutil
import rioxarray
import xarray as xr
from odc.stac import stac_load
from pystac_client import Client

from sdc_utilities import *

# Silence warnings (not recommended during development)
warnings.filterwarnings("ignore")



# Landsat 8

# Example 1: Fribourg Region

Later, to check the cloud cover in your own region of interest, you can load your own config file: delete everything except `%load config_cell.txt` and execute the cell. **For now, keep everything as is**. A representative example with cloud cover is provided.

In [None]:
# %load config_cell.txt
# Configuration

product = 'landsat_ot_c2_l2'
measurements = ['QA_PIXEL', 'SR_B2', 'SR_B3', 'SR_B4', 'SR_B5', 'SR_B6', 'SR_B7', 'ST_B10']
aliases = ['QA_PIXEL', 'blue', 'green', 'red', 'nir', 'swir_1', 'swir_2', 'surface_temperature']  # you can also provide only the aliases and get the measurements with:
# measurements, aliases = get_alias_band(aliases)
# to make your life easier, you can manually replace the measurements variable
# with their aliases:

longitude = (7.127, 7.199)
latitude = (46.773, 46.816)
crs = 'epsg:4326'

time = ('2016-04-01', '2016-07-01')
# the following date formats are also valid:
# time = ('2000-01-01', '2001-12-31')
# time=('2000-01', '2001-12')
# time=('2000', '2001')

# You can use an UTM zone according to the DataCube System.
# We prefer not to use this, instead specifying SwissGrid (epsg:2056).
# output_crs = 'epsg:2056'

output_crs = 'epsg:2056'
resolution = -30.0, 30.0

# These are the pixel classifications for Sentinel (SCL) and Landsat (QA_PIXEL); 
# you can use values to mask out values that belong to certain classes

###################################
# SCL categories:                 #
#   0 - no data                   #
#   1 - saturated or defective    #
#   2 - dark area pixels          #
#   3 - cloud_shadows             #
#   4 * vegetation                #
#   5 * not vegetated             #
#   6 * water                     #
#   7 * unclassified              #
#   8 - cloud medium probability  #
#   9 - cloud high probability    #
#  10 - thin cirrus               #
#  11 * snow                      #
###################################

# Check for more detailed information: 
# - Landsat 8/9 (OLI/TIRS), Page 19:
# https://d9-wret.s3.us-west-2.amazonaws.com/assets/palladium/production/s3fs-public/media/files/LSDS-1619_Landsat8-9-Collection2-Level2-Science-Product-Guide-v6.pdf
# - Landsat 7 (ETM+), Page 15:
# https://d9-wret.s3.us-west-2.amazonaws.com/assets/palladium/production/s3fs-public/media/files/LSDS-1337_Landsat7ETM-C2-L2-DFCB-v6.pdf
# - Landsat 4,5 (TM), Page 18:
# https://d9-wret.s3.us-west-2.amazonaws.com/assets/palladium/production/s3fs-public/atoms/files/LSDS-1415_Landsat4-5-TM-C2-L1-DFCB-v3.pdf

#############################################
# QA_PIXEL BITS : CATEGORIES (Collection 2) #
#    0 : Fill                               #
#    1 : Dilated cloud                      #
#    2 : Cirrus (Landsat 8/9 only)          #
#    3 : Cloud                              #
#    4 : Cloud shadow                       #
#    5 : Snow                               #
#    6 : Clear                              #
#    7 : Water                              #
#############################################

chunks = {"x": 2048, "y": 2048, "time": 1}  # 2048 values are OK with ~21 GB memory available



In [None]:
# %load config_cell.txt
# Configuration

product = 'landsat_ot_c2_l2'
measurements = ['QA_PIXEL', 'SR_B2', 'SR_B3', 'SR_B4', 'SR_B5', 'SR_B6']
aliases = ['QA_PIXEL', 'blue', 'green', 'red', 'nir', 'swir_1']  # you can also provide only the aliases and get the measurements with:
# measurements, aliases = get_alias_band(aliases)
# to make your life easier, you can manually replace the measurements variable
# with their aliases:

longitude = (7.127, 7.199)
latitude = (46.773, 46.816)
crs = 'epsg:4326'

time = ('2015-03-10', '2015-04-01')
# the following date formats are also valid:
# time = ('2000-01-01', '2001-12-31')
# time=('2000-01', '2001-12')
# time=('2000', '2001')

# You can use an UTM zone according to the DataCube System.
# We prefer not to use this, instead specifying SwissGrid (epsg:2056).
# output_crs = 'epsg:2056'

output_crs = 'epsg:2056'
resolution = -30.0, 30.0

# These are the pixel classifications for Sentinel (SCL) and Landsat (QA_PIXEL); 
# you can use values to mask out values that belong to certain classes

###################################
# SCL categories:                 #
#   0 - no data                   #
#   1 - saturated or defective    #
#   2 - dark area pixels          #
#   3 - cloud_shadows             #
#   4 * vegetation                #
#   5 * not vegetated             #
#   6 * water                     #
#   7 * unclassified              #
#   8 - cloud medium probability  #
#   9 - cloud high probability    #
#  10 - thin cirrus               #
#  11 * snow                      #
###################################

# Check for more detailed information: 
# - Landsat 8/9 (OLI/TIRS), Page 19:
# https://d9-wret.s3.us-west-2.amazonaws.com/assets/palladium/production/s3fs-public/media/files/LSDS-1619_Landsat8-9-Collection2-Level2-Science-Product-Guide-v6.pdf
# - Landsat 7 (ETM+), Page 15:
# https://d9-wret.s3.us-west-2.amazonaws.com/assets/palladium/production/s3fs-public/media/files/LSDS-1337_Landsat7ETM-C2-L2-DFCB-v6.pdf
# - Landsat 4,5 (TM), Page 18:
# https://d9-wret.s3.us-west-2.amazonaws.com/assets/palladium/production/s3fs-public/atoms/files/LSDS-1415_Landsat4-5-TM-C2-L1-DFCB-v3.pdf

#############################################
# QA_PIXEL BITS : CATEGORIES (Collection 2) #
#    0 : Fill                               #
#    1 : Dilated cloud                      #
#    2 : Cirrus (Landsat 8/9 only)          #
#    3 : Cloud                              #
#    4 : Cloud shadow                       #
#    5 : Snow                               #
#    6 : Clear                              #
#    7 : Water                              #
#############################################

chunks = {"x": 2048, "y": 2048, "time": 1}  # 2048 values are OK with ~21 GB memory available





In [None]:
# Establish connection to SDC
catalog = Client.open("https://explorer.swissdatacube.org/stac")


In [None]:
# Load dataset with parameters defined in the config cell
ds_tmp = load_product_ts(catalog=catalog,
                        product=product,
                        longitude=longitude,
                        latitude=latitude,
                        output_crs=output_crs,
                        measurements=measurements,
                        resolution = resolution,
                        time=time,
                        chunks=chunks,
                         rename=True,
                         alias_names = aliases
                        )


## Pre-defined cloud identification and classification

The first example is for Landsat 8 (OLI/TIRS, abbreviated OT). As noted in the config cell, the QA_PIXEL layer is described on page 19 of the product guide: https://d9-wret.s3.us-west-2.amazonaws.com/assets/palladium/production/s3fs-public/media/files/LSDS-1619_Landsat8-9-Collection2-Level2-Science-Product-Guide-v6.pdf

<img src="https://www.dropbox.com/scl/fi/zslaub479bjrpmqnlrkb3/Landsat8_QA_PIXEL.png?rlkey=jglvektr4mw7vax5bcm97jd1p&dl=1" width="800" />

*Figure 1: Landsat 8 QA_PIXEL Bit flags*
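
To see how a single QA_PIXEL number encodes several flags, the following sketch decodes the lower eight bits of a value into flag names. It assumes the Collection 2 bit layout (0 fill, 1 dilated cloud, 2 cirrus, 3 cloud, 4 cloud shadow, 5 snow, 6 clear, 7 water); the dictionary `QA_FLAGS`, the function `decode_qa_pixel`, and the two input values are illustrative choices, not part of `sdc_utilities`.

```python
# Assumed Collection 2 QA_PIXEL layout for bits 0-7 (higher bits hold confidence levels)
QA_FLAGS = {
    0: "fill",
    1: "dilated cloud",
    2: "cirrus",
    3: "cloud",
    4: "cloud shadow",
    5: "snow",
    6: "clear",
    7: "water",
}

def decode_qa_pixel(value):
    """Return the names of all flags (bits 0-7) that are set in a QA_PIXEL value."""
    return [name for bit, name in QA_FLAGS.items() if (value >> bit) & 1]

# Illustrative values: one with only the clear bit set, one with the cloud bit set
clear_flags = decode_qa_pixel(21824)
cloud_flags = decode_qa_pixel(22280)

print(clear_flags, cloud_flags)
```

Choosing `bit_positions = [0, 3]` then means: flag a pixel as invalid if it is fill or cloud, regardless of what the other bits say.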

In [None]:
# From the table in Figure 1, we choose bit 3 (Cloud).
# We also include bit 0 (Fill), where the instrument had no view: [0, 3].
# The first row of the plot shows what is flagged (yellow);
# the second row shows the RGB image.
# How well are the clouds represented?
bit_positions = [0, 3]

ds_tmp = create_mask_from_bits(ds_tmp, bit_positions)

# Define figure and axis grid with two rows: one for the mask and one for RGB
n_time_steps = len(ds_tmp['time'])
fig, axes = plt.subplots(2, n_time_steps, figsize=(5 * n_time_steps, 10))

# Plot the mask in the first row
for i, time_step in enumerate(ds_tmp['time']):
    ds_tmp['mask'].sel(time=time_step).plot.imshow(ax=axes[0, i], add_colorbar=False, vmin=0, vmax=1)
    axes[0, i].set_title(f'Mask - Time {time_step.values}')
    axes[0, i].set_xlabel('')
    axes[0, i].set_ylabel('')

# Plot RGB composite in the second row
for i, time_step in enumerate(ds_tmp['time']):
    rgb = ds_tmp[['red', 'green', 'blue']].sel(time=time_step).to_array()
    rgb.plot.imshow(ax=axes[1, i], add_colorbar=False, robust=True)
    axes[1, i].set_title(f'RGB - Time {time_step.values}')
    axes[1, i].set_xlabel('')
    axes[1, i].set_ylabel('')

# Adjust layout and show plot
plt.tight_layout()
plt.show()

## Assessment

In the first image, some thin cloud cover seems to smudge the imagery, even though it is not identified as cloud. Be aware that such scenes can slip through automatic masking and that the values of your bands of interest are then altered (somewhat higher or lower). If you simply extract values without checking for this, your final results may be affected. The best way to check is to plot the imagery as a time series, as shown here.
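
Visual inspection can be complemented by a simple number per date: the fraction of pixels that the mask flags as invalid. The sketch below uses a synthetic `(time, y, x)` mask and made-up dates; the threshold `max_invalid` is an assumption you would tune for your own region.

```python
import numpy as np

# Synthetic mask with shape (time, y, x): 1 = invalid (cloud or fill), 0 = valid
mask = np.zeros((3, 4, 4), dtype=np.uint8)
mask[1, :2, :] = 1  # second date: half of the pixels flagged
mask[2, :, :] = 1   # third date: every pixel flagged

# Hypothetical acquisition dates, for illustration only
dates = np.array(["2015-03-12", "2015-03-21", "2015-03-28"], dtype="datetime64[D]")

# Fraction of flagged pixels per date
invalid_fraction = mask.mean(axis=(1, 2))

# Keep only dates below an assumed tolerance
max_invalid = 0.3
selected_dates = dates[invalid_fraction <= max_invalid]

print(invalid_fraction, selected_dates)
```

On the dataset itself, the same idea could be expressed as `ds_tmp['mask'].mean(dim=('x', 'y'))`, assuming the spatial dimensions are named `x` and `y`. Keep in mind that this only measures what the mask detects, so undetected thin clouds still require a look at the images.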


The next cell shows how to set all values to `np.nan` ("not a number") where we identified invalid pixels.

In [None]:
ds_clean = ds_tmp.where(ds_tmp['mask'] != 1, other=np.nan)
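
The `where` call keeps values where the condition is true and writes `np.nan` everywhere else. A small NumPy analogue with synthetic reflectance values (not real data) shows why this matters for statistics:

```python
import numpy as np

# Synthetic red-band reflectances; one pixel is a bright cloud
red = np.array([[0.10, 0.12],
                [0.55, 0.11]])
mask = np.array([[0, 0],
                 [1, 0]])  # 1 = invalid

# Keep values where the pixel is valid, set the rest to NaN
red_clean = np.where(mask != 1, red, np.nan)

mean_all = red.mean()               # biased by the cloudy pixel
mean_clean = np.nanmean(red_clean)  # ignores the masked pixel

print(mean_all, mean_clean)
```

After masking, use NaN-aware reductions (for example `np.nanmean`, or xarray reductions, which skip NaN by default for float data); a plain NumPy `mean` over an array containing NaN returns NaN.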

In [None]:
# Define figure and axis grid with two rows: one for the mask and one for RGB
n_time_steps = len(ds_tmp['time'])
fig, axes = plt.subplots(2, n_time_steps, figsize=(5 * n_time_steps, 10))

# Plot the mask in the first row
for i, time_step in enumerate(ds_tmp['time']):
    ds_tmp['mask'].sel(time=time_step).plot.imshow(ax=axes[0, i], add_colorbar=False, vmin=0, vmax=1)
    axes[0, i].set_title(f'Mask - Time {time_step.values}')
    axes[0, i].set_xlabel('')
    axes[0, i].set_ylabel('')

# Plot RGB composite in the second row
for i, time_step in enumerate(ds_clean['time']):
    rgb = ds_clean[['red', 'green', 'blue']].sel(time=time_step).to_array()
    rgb.plot.imshow(ax=axes[1, i], add_colorbar=False, robust=True)
    axes[1, i].set_title(f'RGB - Time {time_step.values}')
    axes[1, i].set_xlabel('')
    axes[1, i].set_ylabel('')

# Adjust layout and show plot
plt.tight_layout()
plt.show()

# Example 2: Mountains

This is simply a copy of the code above, but with the longitude and latitude changed to match the example from the glacier_mapping notebook. The `time` variable is also adjusted to include some examples with clouds and snow.

In [None]:
product = 'landsat_ot_c2_l2'
measurements = ['QA_PIXEL', 'SR_B2', 'SR_B3', 'SR_B4', 'SR_B5', 'SR_B6']
aliases = ['QA_PIXEL', 'blue', 'green', 'red', 'nir', 'swir_1']  # you can also provide only the aliases and get the measurements with:
# measurements, aliases = get_alias_band(aliases)
# to make your life easier, you can manually replace the measurements variable
# with their aliases:

# longitude = (7.1514, 7.1769)
# latitude = (46.7965, 46.8079)
longitude = (7.73228, 7.957461)
latitude = (45.877007, 46.022142)
crs = 'epsg:4326'

# time = ('2014-02-10', '2014-04-01')
time = ('2015-03-10', '2015-04-01')
# the following date formats are also valid:
# time = ('2000-01-01', '2001-12-31')
# time=('2000-01', '2001-12')
# time=('2000', '2001')

# You can use an UTM zone according to the DataCube System.
# We prefer not to use this, instead specifying SwissGrid (epsg:2056).
# output_crs = 'epsg:2056'

output_crs = 'epsg:2056'
resolution = -30.0, 30.0


In [None]:
# Establish connection to SDC
catalog = Client.open("https://explorer.swissdatacube.org/stac")


In [None]:
# Load dataset with parameters defined in the config cell
ds_tmp = load_product_ts(catalog=catalog,
                        product=product,
                        longitude=longitude,
                        latitude=latitude,
                        output_crs=output_crs,
                        measurements=measurements,
                        resolution = resolution,
                        time=time,
                        chunks=chunks,
                         rename=True,
                         alias_names = aliases
                        )


In [None]:
# From the QA_PIXEL table (Figure 1), we choose bit 3 (Cloud).
# We also include bit 0 (Fill), where the instrument had no view: [0, 3].
# The first row of the plot shows what is flagged (yellow);
# the second row shows the RGB image.
# How well are the clouds represented?
bit_positions = [0, 3]

ds_tmp = create_mask_from_bits(ds_tmp, bit_positions)

# Define figure and axis grid with two rows: one for the mask and one for RGB
n_time_steps = len(ds_tmp['time'])
fig, axes = plt.subplots(2, n_time_steps, figsize=(5 * n_time_steps, 10))

# Plot the mask in the first row
for i, time_step in enumerate(ds_tmp['time']):
    ds_tmp['mask'].sel(time=time_step).plot.imshow(ax=axes[0, i], add_colorbar=False, vmin=0, vmax=1)
    axes[0, i].set_title(f'Mask - Time {time_step.values}')
    axes[0, i].set_xlabel('')
    axes[0, i].set_ylabel('')

# Plot RGB composite in the second row
for i, time_step in enumerate(ds_tmp['time']):
    rgb = ds_tmp[['red', 'green', 'blue']].sel(time=time_step).to_array()
    rgb.plot.imshow(ax=axes[1, i], add_colorbar=False, robust=True)
    axes[1, i].set_title(f'RGB - Time {time_step.values}')
    axes[1, i].set_xlabel('')
    axes[1, i].set_ylabel('')

# Adjust layout and show plot
plt.tight_layout()
plt.show()

## Assessment

This is a good example of how difficult it can be to identify clouds. Even by eye, using the RGB imagery or the longer-wavelength bands (SWIR and NIR, example below), it is difficult to discriminate clouds from snow. In either case, the provided mask seems to miss some clouds (middle panel, above the center) and to flag clouds that do not exist (the middle of the left image, or the shaded parts near the side moraines in the middle image).
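
One reason the shortwave infrared helps is that snow reflects strongly in the visible range but absorbs in the SWIR, whereas clouds tend to stay bright in both. A common way to express this is the Normalized Difference Snow Index, NDSI = (green - SWIR) / (green + SWIR). The sketch below is an illustration with synthetic reflectance values, not part of this notebook's workflow; the function name `ndsi` and the often-quoted threshold of about 0.4 are assumptions to verify for your own scenes.

```python
import numpy as np

def ndsi(green, swir):
    """Normalized Difference Snow Index; NaN where green + swir is zero."""
    green = np.asarray(green, dtype=float)
    swir = np.asarray(swir, dtype=float)
    total = green + swir
    out = np.full(total.shape, np.nan)
    np.divide(green - swir, total, out=out, where=total != 0)
    return out

# Synthetic reflectances for three pixels: snow, cloud, bare rock
green = np.array([0.80, 0.70, 0.15])
swir = np.array([0.10, 0.50, 0.20])

index = ndsi(green, swir)
print(index)
```

With the band aliases used here, the inputs would be `green` and `swir_1`. Even so, thin clouds over snow and shaded slopes can remain ambiguous, which is why the false-color plot is still worth inspecting.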



In [None]:
# Test with different bands
# Define figure and axis grid with two rows: one for the mask and one for RGB
n_time_steps = len(ds_tmp['time'])
fig, axes = plt.subplots(2, n_time_steps, figsize=(5 * n_time_steps, 10))

# Plot the mask in the first row
for i, time_step in enumerate(ds_tmp['time']):
    ds_tmp['mask'].sel(time=time_step).plot.imshow(ax=axes[0, i], add_colorbar=False, vmin=0, vmax=1)
    axes[0, i].set_title(f'Mask - Time {time_step.values}')
    axes[0, i].set_xlabel('')
    axes[0, i].set_ylabel('')

# Plot false-color composite (SWIR, NIR, red) in the second row
for i, time_step in enumerate(ds_tmp['time']):
    rgb = ds_tmp[['swir_1', 'nir', 'red']].sel(time=time_step).to_array()
    rgb.plot.imshow(ax=axes[1, i], add_colorbar=False, robust=True)
    axes[1, i].set_title(f'False color - Time {time_step.values}')
    axes[1, i].set_xlabel('')
    axes[1, i].set_ylabel('')

# Adjust layout and show plot
plt.tight_layout()
plt.show()

The next cell shows how to set all values to `np.nan` ("not a number") where we identified invalid pixels.

In [None]:
ds_clean = ds_tmp.where(ds_tmp['mask'] != 1, other=np.nan)

In [None]:
# Define figure and axis grid with two rows: one for the mask and one for RGB
n_time_steps = len(ds_tmp['time'])
fig, axes = plt.subplots(2, n_time_steps, figsize=(5 * n_time_steps, 10))

# Plot the mask in the first row
for i, time_step in enumerate(ds_tmp['time']):
    ds_tmp['mask'].sel(time=time_step).plot.imshow(ax=axes[0, i], add_colorbar=False, vmin=0, vmax=1)
    axes[0, i].set_title(f'Mask - Time {time_step.values}')
    axes[0, i].set_xlabel('')
    axes[0, i].set_ylabel('')

# Plot masked false-color composite (SWIR, NIR, red) in the second row
for i, time_step in enumerate(ds_clean['time']):
    rgb = ds_clean[['swir_1', 'nir', 'red']].sel(time=time_step).to_array()
    rgb.plot.imshow(ax=axes[1, i], add_colorbar=False, robust=True)
    axes[1, i].set_title(f'False color - Time {time_step.values}')
    axes[1, i].set_xlabel('')
    axes[1, i].set_ylabel('')

# Adjust layout and show plot
plt.tight_layout()
plt.show()

# Sentinel 2

# Example 3: Fribourg Region

Later, to check the cloud cover in your own region of interest, you can load your own config file: delete everything except `%load config_cell.txt` and execute the cell. **For now, keep everything as is**. A representative example with cloud cover is provided.

In [None]:
# %load config_cell.txt
# Configuration

product = 's2_l2'
measurements = ['B02', 'B03', 'B04', 'B05', 'B08', 'B11', 'SCL']
aliases = ['blue', 'green', 'red', 'red_edge_1', 'nir_1', 'swir_16', 'SCL']  # you can also provide only the aliases and get the measurements with:
# measurements, aliases = get_alias_band(aliases)
# to make your life easier, you can manually replace the measurements variable
# with their aliases:

longitude = (7.127, 7.199)
latitude = (46.773, 46.816)
crs = 'epsg:4326'

# time = ('2016-03-04', '2016-03-27')
time = ('2016-12-01', '2016-12-12')
# the following date formats are also valid:
# time = ('2000-01-01', '2001-12-31')
# time=('2000-01', '2001-12')
# time=('2000', '2001')

# You can use an UTM zone according to the DataCube System.
# We prefer not to use this, instead specifying SwissGrid (epsg:2056).
# output_crs = 'epsg:2056'

output_crs = 'epsg:2056'
resolution = -10.0, 10.0

# These are the pixel classifications for Sentinel (SCL) and Landsat (QA_PIXEL); 
# you can use values to mask out values that belong to certain classes

###################################
# SCL categories:                 #
#   0 - no data                   #
#   1 - saturated or defective    #
#   2 - dark area pixels          #
#   3 - cloud_shadows             #
#   4 * vegetation                #
#   5 * not vegetated             #
#   6 * water                     #
#   7 * unclassified              #
#   8 - cloud medium probability  #
#   9 - cloud high probability    #
#  10 - thin cirrus               #
#  11 * snow                      #
###################################

# Check for more detailed information: 
# - Landsat 8/9 (OLI/TIRS), Page 19:
# https://d9-wret.s3.us-west-2.amazonaws.com/assets/palladium/production/s3fs-public/media/files/LSDS-1619_Landsat8-9-Collection2-Level2-Science-Product-Guide-v6.pdf
# - Landsat 7 (ETM+), Page 15:
# https://d9-wret.s3.us-west-2.amazonaws.com/assets/palladium/production/s3fs-public/media/files/LSDS-1337_Landsat7ETM-C2-L2-DFCB-v6.pdf
# - Landsat 4,5 (TM), Page 18:
# https://d9-wret.s3.us-west-2.amazonaws.com/assets/palladium/production/s3fs-public/atoms/files/LSDS-1415_Landsat4-5-TM-C2-L1-DFCB-v3.pdf

#############################################
# QA_PIXEL BITS : CATEGORIES (Collection 2) #
#    0 : Fill                               #
#    1 : Dilated cloud                      #
#    2 : Cirrus (Landsat 8/9 only)          #
#    3 : Cloud                              #
#    4 : Cloud shadow                       #
#    5 : Snow                               #
#    6 : Clear                              #
#    7 : Water                              #
#############################################

chunks = {"x": 2048, "y": 2048, "time": 1}  # 2048 values are OK with ~21 GB memory available



In [None]:
# Establish connection to SDC
catalog = Client.open("https://explorer.swissdatacube.org/stac")


In [None]:
# Load dataset with parameters defined in the config cell
ds_tmp = load_product_ts(catalog=catalog,
                        product=product,
                        longitude=longitude,
                        latitude=latitude,
                        output_crs=output_crs,
                        measurements=measurements,
                        resolution = resolution,
                        time=time,
                        chunks=chunks,
                         rename=True,
                         alias_names = aliases
                        )


## Pre-defined cloud identification and classification

This example uses Sentinel 2 data, which provides a *Scene Classification Layer* (SCL): https://custom-scripts.sentinel-hub.com/custom-scripts/sentinel-2/scene-classification/

<img src="https://www.dropbox.com/scl/fi/axvewt4y7nhm64rmruvmj/Sentinel2_SCL.png?rlkey=im4aonv9xq6efjj8ciu0s2vue&dl=1" width="400" />

*Figure 2: Sentinel 2 Scene Classification Scheme*
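
Before deciding which classes to mask, it can help to count how many pixels fall into each SCL class. The sketch below does this for a small synthetic SCL array; the class names follow the list in the config cell, and the dictionary name `SCL_CLASSES` is chosen for this illustration.

```python
import numpy as np

# SCL class numbers and names, as listed in the config cell
SCL_CLASSES = {
    0: "no data",
    1: "saturated or defective",
    2: "dark area pixels",
    3: "cloud shadows",
    4: "vegetation",
    5: "not vegetated",
    6: "water",
    7: "unclassified",
    8: "cloud medium probability",
    9: "cloud high probability",
    10: "thin cirrus",
    11: "snow",
}

# Synthetic 3 x 3 SCL scene
scl = np.array([[4, 4, 9],
                [3, 9, 11],
                [4, 10, 0]])

# Count the pixels per class that actually occur in the scene
values, counts = np.unique(scl, return_counts=True)
summary = {SCL_CLASSES[int(v)]: int(c) for v, c in zip(values, counts)}

print(summary)
```

For a real scene, the input could be one time step of the `SCL` variable converted to a NumPy array, for example `ds_tmp['SCL'].isel(time=0).values`.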

In [None]:
ds_tmp['SCL'].plot(col='time', vmin=0)

In [None]:
# We want to remove:
# 0 - missing data 
# 1 - saturated/defective
# 9 - cloud high probability
# 10 - thin cirrus clouds
invalid_values = [0,1,9,10]  

ds_tmp = create_mask_from_values(ds_tmp, invalid_values)

# Define figure and axis grid with two rows: one for the mask and one for RGB
n_time_steps = len(ds_tmp['time'])
fig, axes = plt.subplots(2, n_time_steps, figsize=(5 * n_time_steps, 10))

# Plot the mask in the first row
for i, time_step in enumerate(ds_tmp['time']):
    ds_tmp['mask'].sel(time=time_step).plot.imshow(ax=axes[0, i], add_colorbar=False, vmin=0, vmax=1)
    axes[0, i].set_title(f'Mask - Time {time_step.values}')
    axes[0, i].set_xlabel('')
    axes[0, i].set_ylabel('')

# Plot RGB composite in the second row
for i, time_step in enumerate(ds_tmp['time']):
    rgb = ds_tmp[['red', 'green', 'blue']].sel(time=time_step).to_array()
    rgb.plot.imshow(ax=axes[1, i], add_colorbar=False, robust=True)
    axes[1, i].set_title(f'RGB - Time {time_step.values}')
    axes[1, i].set_xlabel('')
    axes[1, i].set_ylabel('')

# Adjust layout and show plot
plt.tight_layout()
plt.show()

## Assessment

In the first image, there again seems to be a thin cirrus band that makes the image **lighter**. This appears to be correctly identified. In the last image, the north-western part is classified as valid even though some clouds are visible.


The next cell shows how to set all values to `np.nan` ("not a number") where we identified invalid pixels.

In [None]:
ds_clean = ds_tmp.where(ds_tmp['mask'] != 1, other=np.nan)

In [None]:
# Define figure and axis grid with two rows: one for the mask and one for RGB
n_time_steps = len(ds_tmp['time'])
fig, axes = plt.subplots(2, n_time_steps, figsize=(5 * n_time_steps, 10))

# Plot the mask in the first row
for i, time_step in enumerate(ds_tmp['time']):
    ds_tmp['mask'].sel(time=time_step).plot.imshow(ax=axes[0, i], add_colorbar=False, vmin=0, vmax=1)
    axes[0, i].set_title(f'Mask - Time {time_step.values}')
    axes[0, i].set_xlabel('')
    axes[0, i].set_ylabel('')

# Plot RGB composite in the second row
for i, time_step in enumerate(ds_clean['time']):
    rgb = ds_clean[['red', 'green', 'blue']].sel(time=time_step).to_array()
    rgb.plot.imshow(ax=axes[1, i], add_colorbar=False, robust=True)
    axes[1, i].set_title(f'RGB - Time {time_step.values}')
    axes[1, i].set_xlabel('')
    axes[1, i].set_ylabel('')

# Adjust layout and show plot
plt.tight_layout()
plt.show()

# Example 4: Mountains

This is simply a copy of the code above, but with the longitude and latitude changed to match the example from the glacier_mapping notebook. The `time` variable is also adjusted to include some examples with clouds and snow.

In [None]:
product = 's2_l2'
measurements = ['B02', 'B03', 'B04', 'B05', 'B08', 'B11', 'SCL']
aliases = ['blue', 'green', 'red', 'red_edge_1', 'nir_1', 'swir_16', 'SCL']  # you can also provide only the aliases and get the measurements with:
# measurements, aliases = get_alias_band(aliases)
# to make your life easier, you can manually replace the measurements variable
# with their aliases:

# longitude = (7.127, 7.199)
# latitude = (46.773, 46.816)
longitude = (7.73228, 7.957461)
latitude = (45.877007, 46.022142)
crs = 'epsg:4326'

time = ('2016-03-12', '2016-03-23')
# time = ('2016-12-01', '2016-12-12')
# the following date formats are also valid:
# time = ('2000-01-01', '2001-12-31')
# time=('2000-01', '2001-12')
# time=('2000', '2001')

# You can use an UTM zone according to the DataCube System.
# We prefer not to use this, instead specifying SwissGrid (epsg:2056).
# output_crs = 'epsg:2056'

output_crs = 'epsg:2056'
resolution = -10.0, 10.0


In [None]:
# Establish connection to SDC
catalog = Client.open("https://explorer.swissdatacube.org/stac")


In [None]:
# Load dataset with parameters defined in the config cell
ds_tmp = load_product_ts(catalog=catalog,
                        product=product,
                        longitude=longitude,
                        latitude=latitude,
                        output_crs=output_crs,
                        measurements=measurements,
                        resolution = resolution,
                        time=time,
                        chunks=chunks,
                         rename=True,
                         alias_names = aliases
                        )


In [None]:
# We want to remove:
# 0 - missing data 
# 1 - saturated/defective
# 9 - cloud high probability
# 10 - thin cirrus clouds
invalid_values = [0,1,9,10]  

ds_tmp = create_mask_from_values(ds_tmp, invalid_values)

# Define figure and axis grid with two rows: one for the mask and one for RGB
n_time_steps = len(ds_tmp['time'])
fig, axes = plt.subplots(2, n_time_steps, figsize=(5 * n_time_steps, 10))

# Plot the mask in the first row
for i, time_step in enumerate(ds_tmp['time']):
    ds_tmp['mask'].sel(time=time_step).plot.imshow(ax=axes[0, i], add_colorbar=False, vmin=0, vmax=1)
    axes[0, i].set_title(f'Mask - Time {time_step.values}')
    axes[0, i].set_xlabel('')
    axes[0, i].set_ylabel('')

# Plot RGB composite in the second row
for i, time_step in enumerate(ds_tmp['time']):
    rgb = ds_tmp[['red', 'green', 'blue']].sel(time=time_step).to_array()
    rgb.plot.imshow(ax=axes[1, i], add_colorbar=False, robust=True)
    axes[1, i].set_title(f'RGB - Time {time_step.values}')
    axes[1, i].set_xlabel('')
    axes[1, i].set_ylabel('')

# Adjust layout and show plot
plt.tight_layout()
plt.show()

## Assessment

This is again a good example of how difficult it can be to identify clouds. Some bands appear to be saturated, which causes artifacts. On the right-hand side, some clouds are correctly identified. Let's check with a false-color composite. For this we need different **band names**, so let's first look up the correct names for `swir` and `nir` in the dataset.



In [None]:
ds_tmp

In [None]:
# Test with different bands
# Define figure and axis grid with two rows: one for the mask and one for RGB
n_time_steps = len(ds_tmp['time'])
fig, axes = plt.subplots(2, n_time_steps, figsize=(5 * n_time_steps, 10))

# Plot the mask in the first row
for i, time_step in enumerate(ds_tmp['time']):
    ds_tmp['mask'].sel(time=time_step).plot.imshow(ax=axes[0, i], add_colorbar=False, vmin=0, vmax=1)
    axes[0, i].set_title(f'Mask - Time {time_step.values}')
    axes[0, i].set_xlabel('')
    axes[0, i].set_ylabel('')

# Plot false-color composite (SWIR, NIR, red) in the second row
for i, time_step in enumerate(ds_tmp['time']):
    rgb = ds_tmp[['swir_16', 'nir_1', 'red']].sel(time=time_step).to_array()
    rgb.plot.imshow(ax=axes[1, i], add_colorbar=False, robust=True)
    axes[1, i].set_title(f'False color - Time {time_step.values}')
    axes[1, i].set_xlabel('')
    axes[1, i].set_ylabel('')

# Adjust layout and show plot
plt.tight_layout()
plt.show()

The next cell shows how to set all values to `np.nan` ("not a number") where we identified invalid pixels.

In [None]:
ds_clean = ds_tmp.where(ds_tmp['mask'] != 1, other=np.nan)

In [None]:
# Define figure and axis grid with two rows: one for the mask and one for RGB
n_time_steps = len(ds_tmp['time'])
fig, axes = plt.subplots(2, n_time_steps, figsize=(5 * n_time_steps, 10))

# Plot the mask in the first row
for i, time_step in enumerate(ds_tmp['time']):
    ds_tmp['mask'].sel(time=time_step).plot.imshow(ax=axes[0, i], add_colorbar=False, vmin=0, vmax=1)
    axes[0, i].set_title(f'Mask - Time {time_step.values}')
    axes[0, i].set_xlabel('')
    axes[0, i].set_ylabel('')

# Plot masked false-color composite (SWIR, NIR, red) in the second row
for i, time_step in enumerate(ds_clean['time']):
    rgb = ds_clean[['swir_16', 'nir_1', 'red']].sel(time=time_step).to_array()
    rgb.plot.imshow(ax=axes[1, i], add_colorbar=False, robust=True)
    axes[1, i].set_title(f'False color - Time {time_step.values}')
    axes[1, i].set_xlabel('')
    axes[1, i].set_ylabel('')

# Adjust layout and show plot
plt.tight_layout()
plt.show()

This does not look too bad, but some pixels are still under cloud shadows. You can check whether the result improves (at the cost of losing pixels) if you also include the cloud shadow class **3**.
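
The trade-off can be quantified by comparing the fraction of pixels that remains valid under each choice of classes. The following sketch uses a synthetic SCL array; `valid_fraction` is a hypothetical helper written for this illustration.

```python
import numpy as np

# Synthetic 3 x 3 SCL scene containing two cloud-shadow pixels (class 3)
scl = np.array([[4, 3, 9],
                [11, 3, 4],
                [5, 10, 4]])

def valid_fraction(classes, invalid_values):
    """Fraction of pixels whose class is not in invalid_values."""
    return float(1.0 - np.isin(classes, invalid_values).mean())

without_shadow = valid_fraction(scl, [0, 1, 9, 10])
with_shadow = valid_fraction(scl, [0, 1, 3, 9, 10])

print(without_shadow, with_shadow)
```

A large drop in the valid fraction after adding a class is a hint to look at the images again: the class may be removing real shadows, or it may be misclassifying dark terrain such as shaded mountain slopes.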

In [None]:
# We want to remove:
# 0 - missing data 
# 1 - saturated/defective
# 3 - cloud shadows
# 9 - cloud high probability
# 10 - thin cirrus clouds
invalid_values = [0, 1, 3, 9, 10]

ds_tmp = create_mask_from_values(ds_tmp, invalid_values)
ds_clean = ds_tmp.where(ds_tmp['mask'] != 1, other=np.nan)

# Define figure and axis grid with two rows: one for the mask and one for RGB
n_time_steps = len(ds_tmp['time'])
fig, axes = plt.subplots(2, n_time_steps, figsize=(5 * n_time_steps, 10))

# Plot the mask in the first row
for i, time_step in enumerate(ds_tmp['time']):
    ds_tmp['mask'].sel(time=time_step).plot.imshow(ax=axes[0, i], add_colorbar=False, vmin=0, vmax=1)
    axes[0, i].set_title(f'Mask - Time {time_step.values}')
    axes[0, i].set_xlabel('')
    axes[0, i].set_ylabel('')

# Plot masked false-color composite (SWIR, NIR, red) in the second row
for i, time_step in enumerate(ds_clean['time']):
    rgb = ds_clean[['swir_16', 'nir_1', 'red']].sel(time=time_step).to_array()
    rgb.plot.imshow(ax=axes[1, i], add_colorbar=False, robust=True)
    axes[1, i].set_title(f'False color - Time {time_step.values}')
    axes[1, i].set_xlabel('')
    axes[1, i].set_ylabel('')

# Adjust layout and show plot
plt.tight_layout()
plt.show()

That is not so good!