# Sentinel-2 spectral indices using Google Earth Engine

This demo gives an overview of filtering Sentinel-2 data by date and area and of calculating spectral indices. It uses the Google Earth Engine (GEE) API package for Python, which makes it possible to process large amounts of satellite (and other remote sensing) data without downloading the huge data sets. A GEE account is required to use this package.

### Start a GEE session

The first time the "ee" package is used, you need to run ee.Authenticate(). This opens a browser window where you log in with your GEE credentials. The login creates an access token that needs to be pasted into the box that appears below the cell (this needs to be done at the beginning or when the kernel/session is restarted). To start the connection with your GEE account, run ee.Initialize(). If you have only one project on your GEE account, you do not need to specify anything else. If you have multiple projects, you can select one with ee.Initialize(project='your-project-id'). To find the project ID or number, log in to GEE online and click on the respective project.

In [None]:
# Load the packages
import ee       # GEE API package
import geemap   # package for interactive plotting (does not work in PyCharm)

# Login with the GEE credentials and connect to your account
ee.Authenticate()   # needs to be done once in a while
ee.Initialize()     # starts the connection to your GEE account and allows you to use all the datasets you might have stored there
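
The two steps described above can be wrapped in a small helper that only asks for a new login when initialization fails. This is a minimal sketch, not an official pattern: the helper name `init_gee` and the project ID in the usage comment are hypothetical, and the module is passed in as an argument so that the logic can be tried without a GEE account. It also assumes that a failed initialization is caused by missing credentials, which is not always the case.

```python
def init_gee(ee_module, project=None):
    """Initialize a GEE session, authenticating first only if needed.

    ee_module: the imported `ee` package (passed in explicitly).
    project:   optional project ID or number; omit it for single-project accounts.
    """
    # Only pass `project` when one was given
    kwargs = {'project': project} if project else {}
    try:
        ee_module.Initialize(**kwargs)
        return 'initialized'
    except Exception:
        # Assumption: the failure is due to missing or expired credentials,
        # so log in and try once more
        ee_module.Authenticate()
        ee_module.Initialize(**kwargs)
        return 'authenticated and initialized'

# Usage (assumes the `ee` package is installed and you have a GEE account):
# import ee
# init_gee(ee, project='my-project-id')   # 'my-project-id' is a placeholder
```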

---

## Selecting and Visualizing Sentinel-2 Data

**Sentinel-2**<br>
Sentinel-2 is a passive optical multispectral satellite mission that images the Earth's surface in 13 spectral bands from the visible to the short-wave infrared parts of the spectrum$^{[1]}$.<br>
The table below lists the 13 bands with their band number, a description, the central wavelength in the electromagnetic spectrum and the spatial resolution.

| Sentinel-2 band | Description     | Central wavelength [nm] | Resolution [m] | 
|:----------------|:----------------|:------------------------|:---------------|
| B01             | Coastal aerosol | 443                     | 60             | 
| B02             | Blue            | 490                     | 10             |
| B03             | Green           | 560                     | 10             | 
| B04             | Red             | 665                     | 10             | 
| B05             | Veg. red edge   | 705                     | 20             | 
| B06             | Veg. red edge   | 740                     | 20             | 
| B07             | Veg. red edge   | 783                     | 20             |
| B08             | NIR             | 842                     | 10             | 
| B8A             | Veg. red edge   | 865                     | 20             | 
| B09             | Water vapour    | 945                     | 60             | 
| B10             | SWIR cirrus     | 1375                    | 60             | 
| B11             | SWIR            | 1610                    | 20             | 
| B12             | SWIR            | 2190                    | 20             | 
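
The code cells in this demo address the bands without zero padding ('B4', 'B8'), while the table writes them as B04 and B08. The following sketch stores the resolutions from the table in a dictionary and converts between the two naming styles. The names `S2_RESOLUTION_M`, `to_gee_band_name` and `bands_with_resolution` are hypothetical helpers introduced only for illustration.

```python
# Spatial resolution in metres per band, copied from the table above
S2_RESOLUTION_M = {
    'B01': 60, 'B02': 10, 'B03': 10, 'B04': 10, 'B05': 20, 'B06': 20,
    'B07': 20, 'B08': 10, 'B8A': 20, 'B09': 60, 'B10': 60, 'B11': 20,
    'B12': 20,
}

def to_gee_band_name(table_name):
    """Convert a zero-padded band name ('B04') to the unpadded form ('B4')."""
    return table_name[0] + table_name[1:].lstrip('0')

def bands_with_resolution(resolution_m):
    """Return the unpadded names of all bands with the given resolution."""
    return [to_gee_band_name(name)
            for name, res in S2_RESOLUTION_M.items() if res == resolution_m]

print(bands_with_resolution(10))  # ['B2', 'B3', 'B4', 'B8']
```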

### Filter Sentinel-2 data by date

We select the Sentinel-2 ImageCollection ('COPERNICUS/S2_SR_HARMONIZED') provided by GEE and filter the data by a start and an end date. For this demo, we only need one Sentinel-2 image, so we define the time range to include the desired day. We know that on 23 August 2024 there were no clouds over the study site, so we do not have to additionally filter and mask clouds in our Sentinel-2 data.<br>  
After filtering, we can print the size of the newly generated ImageCollection. We see that there are over 20,000 images available in the selected time frame! This is because we did not define an area of interest to filter the Sentinel-2 collection, so it returns all images available globally in the time frame defined by startTime and endTime.

In [None]:
# Define the study period
startTime = '2024-08-22'
endTime = '2024-08-24'

# Filter the Sentinel-2 image collection by date
S2_images_col = ee.ImageCollection('COPERNICUS/S2_SR_HARMONIZED')\
    .filter(ee.Filter.date(startTime, endTime))

# Check how many images are available
print('Number of images in study period:', S2_images_col.size().getInfo())
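
According to the Earth Engine documentation, the date filter treats the start date as inclusive and the end date as exclusive, which is why the end date is set to the day after the desired acquisition. The sketch below mimics this half-open interval with plain Python datetimes. It does not call GEE, and `in_date_window` is a hypothetical helper.

```python
from datetime import datetime

def in_date_window(acquired, start, end):
    """True if `acquired` lies in the half-open interval [start, end)."""
    return start <= acquired < end

start = datetime(2024, 8, 22)
end = datetime(2024, 8, 24)

# Acquisition time of the image used in this demo (23 August 2024, 10:15:59 UTC)
print(in_date_window(datetime(2024, 8, 23, 10, 15, 59), start, end))  # True
# An image acquired exactly at the end date would not be selected
print(in_date_window(datetime(2024, 8, 24), start, end))              # False
```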

### Define an area of interest

We can define an area of interest (AOI) to further filter the Sentinel-2 ImageCollection. The filtered ImageCollection will then only comprise Sentinel-2 images whose footprint intersects the AOI.<br>
There are multiple ways to specify an area of interest:<br>
- Define the geometry using coordinates (e.g., the 4 corners of a square, with a 5th coordinate equal to the first to close the shape)
- Load an existing shapefile (e.g., from QGIS) and convert it using geemap.shp_to_ee(shapefile)
- Draw the geometry directly in the Map using the polygon icon and retrieve it via the .user_roi property, e.g., AOI = Map.user_roi

In [None]:
# Define an area of interest using a set of coordinates as a list
AOI = ee.Geometry.Polygon([[9.829207878112806,46.80626536296483],[9.88104961395265,46.80626536296483],[9.88104961395265,46.82638289500436],
                           [9.829207878112806,46.82638289500436],[9.829207878112806,46.80626536296483]])
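
Typing the five coordinate pairs by hand is error-prone. For a rectangular AOI, the closed ring can instead be built from the four edge values. This is a small sketch with a hypothetical helper name, `bbox_to_ring`, and it reproduces the coordinate list used in the cell above.

```python
def bbox_to_ring(west, south, east, north):
    """Return a closed ring of [lon, lat] pairs for a rectangular AOI."""
    ring = [[west, south], [east, south], [east, north], [west, north]]
    ring.append(list(ring[0]))  # repeat the first corner to close the shape
    return ring

coords = bbox_to_ring(9.829207878112806, 46.80626536296483,
                      9.88104961395265, 46.82638289500436)

# The list can then be used as in the cell above:
# AOI = ee.Geometry.Polygon(coords)
```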

We can then display the defined AOI polygon in an interactive map using the "geemap" package. After initializing the map, we define the appearance of the basemap (the background). Then we set the visualization parameters for the polygon (e.g., color, line type, etc.) and pass them to "addLayer". Note: the interactive map only works in JupyterLab or other interactive Python environments.

In [None]:
# Initialize the map
Map = geemap.Map()

# Define how the basemap should be displayed
Map.set_center(9.85, 46.82, 13) # coordinates and zoom
Map.add_basemap('TERRAIN')      # can also be "SATELLITE" 

# Define the visualization parameters for the AOI, such as color (as hex-code)
vis_params = {'color': 'FFD700',
              'pointSize': 3,
              'pointShape': 'circle',
              'width': 2,
              'lineType': 'solid'}

# Add the area of interest as a polygon to the map
Map.addLayer(AOI, vis_params, 'Area of interest')

# Display the map
Map

### Filter Sentinel-2 data by date and area of interest

After filtering the Sentinel-2 ImageCollection by date and area of interest, we are left with only one image in the ImageCollection, as intended. If we extended the time period, more images would be in the Collection. However, some of these images would contain clouds over the AOI, which we would then need to filter (for more information, see: https://developers.google.com/earth-engine/datasets/catalog/GOOGLE_CLOUD_SCORE_PLUS_V1_S2_HARMONIZED).

In [None]:
# Filter the Sentinel-2 image collection by date and area

# Define the study period
startTime = '2024-08-22'
endTime = '2024-08-24'

# import the Sentinel-2 ImageCollection, filter by date and area
S2_images_col_aoi = ee.ImageCollection('COPERNICUS/S2_SR_HARMONIZED')\
    .filter(ee.Filter.date(startTime, endTime))\
    .filter(ee.Filter.bounds(AOI))

# Check how many images are available
print('Number of images in study period:', S2_images_col_aoi.size().getInfo())
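
Filtering by bounds keeps the images whose footprint intersects the AOI. The idea can be sketched with simple bounding boxes in plain Python. The helper `bboxes_intersect` and both tile footprints are hypothetical values chosen for illustration; GEE itself works with the real image geometries.

```python
def bboxes_intersect(a, b):
    """a, b: (west, south, east, north). True if the rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

# Bounding box of the AOI defined above (rounded)
aoi_bbox = (9.829, 46.806, 9.881, 46.826)

# Hypothetical tile footprints, NOT real Sentinel-2 tile extents
tile_over_aoi = (9.0, 46.0, 10.5, 47.0)
tile_elsewhere = (12.0, 50.0, 13.5, 51.0)

print(bboxes_intersect(aoi_bbox, tile_over_aoi))   # True  -> image is kept
print(bboxes_intersect(aoi_bbox, tile_elsewhere))  # False -> image is dropped
```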

To display the image, we need to convert the ImageCollection (which currently contains only one image) to a single image. As we only have one image in the Collection, we can simply select the first image using ImageCollection.first(). If there were multiple images in the Collection, we would need to decide how to reduce our Collection to a single image:
- .first(): use the first image in the Collection (the earliest in the time range)
- reducers such as .median(), .mean() etc.: calculate, e.g., the pixel-wise mean value of all available images in the Collection (the mean value of all pixels with the same coordinates)
- .mosaic(): composites the images in Collection order, so that the last (newest) available pixel value ends up on top
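
The difference between these options can be illustrated with a tiny stack of pixel values in plain Python. This is a conceptual sketch, not GEE code: the stack values are hypothetical, None stands for a masked pixel (e.g., a cloud), and `per_pixel` is a helper introduced only for this example.

```python
from statistics import mean, median

# Hypothetical stack: three acquisitions (oldest first) of the same three pixels
stack = [
    [0.25, None, 0.0],
    [0.5, 0.5, 0.25],
    [None, 0.75, 1.0],
]

def per_pixel(stack, func):
    """Apply `func` to the unmasked values at every pixel position."""
    result = []
    for values in zip(*stack):
        valid = [v for v in values if v is not None]
        result.append(func(valid) if valid else None)
    return result

first_image = stack[0]                            # like .first()
mean_image = per_pixel(stack, mean)               # like .mean()
median_image = per_pixel(stack, median)           # like .median()
mosaic_image = per_pixel(stack, lambda v: v[-1])  # like .mosaic(): last valid value wins

print(first_image)
print(mean_image)
print(median_image)
print(mosaic_image)
```

Note that .first() keeps the masked pixel of the first image, whereas the reducers and the mosaic fill it from the other acquisitions.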

In [None]:
# We select the first image in our ImageCollection and can have a look at the output
S2_image = S2_images_col_aoi.first()
S2_image

The "bands" list comprises the 13 spectral bands of Sentinel-2 as well as bands used for processing and quality flags. Information about the additional bands can be found here: https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2_SR_HARMONIZED#bands<br>

The "properties/datastrip_id" indicates the date and time the image was acquired (2024-08-23, 10:15:59 UTC) and which Sentinel-2 satellite was used (S2B; also visible in "spacecraft_name"). The "granule_id" tells us that the image is L2A, meaning level-2A data, which in this case is bottom-of-atmosphere (or surface) reflectance. This means the data was already corrected for atmospheric effects (using sen2cor) and geometry. The "properties" also give information about the cloud cover, water vapour, etc., used for the atmospheric correction.<br>

We can visualize the Sentinel-2 image as true color RGB:

In [None]:
# Add the filtered Sentinel-2 image to our map
# Initialize the map
Map = geemap.Map()

# Define how the basemap should be displayed
Map.set_center(9.85, 46.82, 9.5)
Map.add_basemap('TERRAIN')

# Define the visualization parameters for the aoi, such as color (as hex-code)
vis_params = {'color': 'FFD700',
              'pointSize': 3,
              'pointShape': 'circle',
              'width': 2,
              'lineType': 'solid'}

# Add the Sentinel-2 image as true color RGB
Map.addLayer(S2_image, {'bands': ['B4', 'B3', 'B2'], 'min': 0, 'max': 2500, 'gamma': 1.1},'Sentinel-2 image')

# Add the area of interest as polygon to the map
Map.addLayer(AOI, vis_params, 'Area of interest')

# Display the map
Map

In the displayed map, we can see how large the extent of one Sentinel-2 image (also called a "tile") is. The filters we used above searched through the Sentinel-2 Collection and returned the one tile that contains our area of interest. As we are only interested in the data within our AOI, we can clip the image to this area and discard all the data around it. This saves a lot of computation time and storage if we decide to download the data in the end. All the information in "bands" and "properties" stays the same, as it is valid for the whole Sentinel-2 tile.

In [None]:
# Clip the desired area
clippedImage = S2_image.clip(AOI)
clippedImage

This clipped image can also be visualized the same way as the whole Sentinel-2 tile. We will use this clipped image for further analysis.

In [None]:
# Add the filtered Sentinel-2 image to our map
# Initialize the map
Map = geemap.Map()

# Define how the basemap should be displayed
Map.set_center(9.85, 46.82, 13)
Map.add_basemap('TERRAIN')

# Add the clipped Sentinel-2 image as true color RGB
Map.addLayer(clippedImage, {'bands': ['B4', 'B3', 'B2'], 'min': 0, 'max': 2500, 'gamma': 1.1},'Sentinel-2 image clipped')

# Display the map
Map

---

## Image Analysis using Spectral Indices

By combining bands from different parts of the spectrum in so-called spectral indices, information about the Earth's surface, such as vegetation greenness and canopy water content, can be derived$^{[2]}$. This method relies on the fact that the phenomenon we want to observe affects different parts of the spectrum differently. Normalized difference indices combine an affected spectral band with an unaffected one, used as a reference, to detect the phenomenon (e.g., canopy water content).

**Normalized difference vegetation index - NDVI**<br>
The normalized difference vegetation index (NDVI) was developed by Tucker in the 1970s and is widely used to estimate the amount of green vegetation$^{[3]}$. The index normalizes green leaf scattering in the NIR with chlorophyll absorption in the red wavelengths and uses the following formula$^{[4]}$:<br>

$NDVI := \frac {NIR - red}{NIR + red} = \frac {B08 - B04}{B08 + B04}$ <br>

The NDVI is defined to have a value range of -1 to 1. Negative values correspond to water bodies (see lake surface), values close to 0 correspond to no-vegetation areas such as rock, snow, and urban. With increasing NDVI values the proportion of green vegetation also increases$^{[4, 5]}$.

In [None]:
# Define a function to add the NDVI as a new band
def calcNDVI(image):
    return image.addBands(image.normalizedDifference(['B8', 'B4']).rename('NDVI'))
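
To make the formula concrete, the normalized difference can be computed by hand for a single pixel. This is a plain-Python sketch with hypothetical reflectance values. It only mirrors the arithmetic of the formula above and does not reproduce every detail of the GEE function (e.g., its masking rules).

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or None where the sum is zero."""
    total = a + b
    if total == 0:
        return None
    return (a - b) / total

# Hypothetical reflectances of a vegetated pixel: high NIR (B8), low red (B4)
ndvi = normalized_difference(0.5, 0.125)

# Because the index is a ratio, scaling both bands by the same factor
# (e.g. reflectance stored as scaled integers) does not change the result
ndvi_scaled = normalized_difference(5000, 1250)

print(ndvi, ndvi_scaled)
```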

**Normalized difference moisture index - NDMI**<br>
The normalized difference moisture index (NDMI) was developed by Gao in the 1990s. It is used to estimate the vegetation water content or moisture$^{[6]}$ and is widely used for drought monitoring. The index uses the different absorption features of water in leaves in the NIR and SWIR parts of the spectrum$^{[7]}$.
It is calculated using bands in the NIR (B08) and SWIR (B11) regions of the spectrum$^{[8]}$:<br>

$NDMI := \frac {NIR - SWIR}{NIR + SWIR} = \frac {B08 - B11}{B08 + B11}$ <br>

NDMI values below 0 can indicate water stress (if vegetation is present), values above 0.4 indicate no water stress in the vegetation$^{[9]}$. The highest NDMI values in our image can be found in the forest where we have a lot of vegetation to store water in the canopy. The lake surface has a value of -1 indicating a non-vegetation surface. This is a bit counter-intuitive as we would expect a moisture or water index to be high over a large water body. To detect open water bodies we need to use a different water index.<br>

Note: NDWI index is often used synonymously with the NDMI index, often using NIR-SWIR combination as one of the two options. Gao$^{[6]}$ also called the index NDWI. NDMI seems to be consistently described using NIR-SWIR combination. As the indices with these two combinations work very differently, with NIR-SWIR highlighting differences in water content of leaves, and GREEN-NIR highlighting differences in water content of water bodies, we have decided to separate the indices on our repository as NDMI using NIR-SWIR, and NDWI using GREEN-NIR$^{[8]}$ (quote from Sentinel-Hub).

In [None]:
# Define a function to add the NDMI as a new band
def calcNDMI(image):
    return image.addBands(image.normalizedDifference(['B8', 'B11']).rename('NDMI'))
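
The NDMI thresholds mentioned above (below 0 and above 0.4) can be turned into a simple rule of thumb. The sketch below is only an illustration: the function name `classify_ndmi` and the label for the range in between are assumptions, and the interpretation only makes sense where vegetation is present.

```python
def classify_ndmi(ndmi):
    """Rough interpretation of an NDMI value for a vegetated pixel."""
    if ndmi < 0:
        return 'possible water stress'
    if ndmi > 0.4:
        return 'no water stress'
    # The range 0 to 0.4 is not characterized in the text; label is an assumption
    return 'intermediate'

for value in (-0.25, 0.25, 0.5):
    print(value, '->', classify_ndmi(value))
```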

**Normalized difference water index - NDWI**<br>
Around the same time as Gao developed the NDMI, McFeeters developed the normalized difference water index (NDWI) to detect water bodies$^{[10]}$. Water bodies reflect some light in the visible domain but absorb almost all light in the NIR, therefore the index uses the band combination of green and NIR$^{[11]}$:<br>

$NDWI := \frac {green - NIR}{green + NIR} = \frac {B03 - B08}{B03 + B08}$ <br>

NDWI values above 0 indicate potential water bodies or flooding. The lake and river are clearly visible. However, some of the urban area also has high values. This is because the index is sensitive to built-up structures as well as water$^{[12]}$.<br>

This example nicely illustrates that we need to keep the limitations of the method in mind when interpreting the calculated index images!

In [None]:
# Define a function to add the NDWI as a new band
def calcNDWI(image):
    return image.addBands(image.normalizedDifference(['B3', 'B8']).rename('NDWI'))
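
Thresholding the NDWI at 0, as described above, yields a simple mask of potential water pixels. The following plain-Python sketch applies the threshold to a small grid of hypothetical NDWI values. The helper `water_mask` is introduced only for illustration, and the limitations described above (e.g., high values over built-up areas) still apply.

```python
def water_mask(ndwi_grid, threshold=0.0):
    """Mark every pixel whose NDWI exceeds the threshold as potential water."""
    return [[value > threshold for value in row] for row in ndwi_grid]

# Hypothetical 2 x 2 grid of NDWI values
grid = [[-0.5, 0.25],
        [0.0, 0.75]]

mask = water_mask(grid)
print(mask)  # [[False, True], [False, True]]
```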

We now apply all the index functions from above in one step by nesting the function calls. Three new bands with the indices appear in the "bands" list.

In [None]:
# Use the defined functions to add the indices as new layers to the Sentinel-2 image
S2_new = calcNDWI(calcNDMI(calcNDVI(clippedImage)))
S2_new
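
Nested calls become hard to read as more indices are added. One alternative is to apply a list of functions in order. The sketch below shows the idea on a plain dictionary standing in for an image, with hypothetical band values. The helpers `apply_all` and `add_index` are not part of the "ee" package.

```python
from functools import reduce

def apply_all(image, functions):
    """Apply each function in order, feeding the result into the next one."""
    return reduce(lambda img, func: func(img), functions, image)

def add_index(name, band_a, band_b):
    """Build a function that adds a normalized difference of two bands."""
    def add(pixel):
        a, b = pixel[band_a], pixel[band_b]
        return {**pixel, name: (a - b) / (a + b)}
    return add

steps = [add_index('NDVI', 'B8', 'B4'),
         add_index('NDMI', 'B8', 'B11'),
         add_index('NDWI', 'B3', 'B8')]

# Hypothetical reflectances of a single vegetated pixel
pixel = {'B3': 0.25, 'B4': 0.125, 'B8': 0.5, 'B11': 0.25}
result = apply_all(pixel, steps)
print(list(result))
```

With the functions defined in this demo, apply_all(clippedImage, [calcNDVI, calcNDMI, calcNDWI]) should give the same result as the nested call above.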

To visualize the index values as a colormap, we need to define the visualization parameters for each index (or use the same ones for all). In a dictionary ({}), we define the min. and max. values of the indices (adjust these for different indices, stretches, etc.) as well as some key colors for the palette. Here, we only define the colors for the min. and max. values; the colors in between are interpolated automatically. You could also add more colors as hex codes. An overview of the hex colors can be found here: https://en.wikipedia.org/wiki/Web_colors

In [None]:
# Initialize the map
Map = geemap.Map()

# Define how the basemap should be displayed
Map.set_center(9.85, 46.82, 13)
Map.add_basemap('TERRAIN')

# NDVI: From grey (low vegetation) to green (high vegetation)
ndvi_vis = {
    'min': -1,
    'max': 1,
    'palette': ['C0C0C0', '008000'],
}

# NDMI: From yellow (low moisture) to blue (high moisture)
ndmi_vis = {
    'min': -1,
    'max': 1,
    'palette': ['FFFF00', '0000FF'],
}

# NDWI: From gray (no water) to blue (water)
ndwi_vis = {
    'min': -1,
    'max': 1,
    'palette': ['C0C0C0', '0000FF'],
}

# Add the Sentinel-2 image
Map.addLayer(clippedImage, {'bands': ['B4', 'B3', 'B2'], 'min': 0, 'max': 2500, 'gamma': 1.1}, 'Sentinel-2 image clipped')

# Add the indices as layers with their respective visualizations
Map.addLayer(S2_new.select('NDVI'), ndvi_vis, 'NDVI')
Map.addLayer(S2_new.select('NDMI'), ndmi_vis, 'NDMI')
Map.addLayer(S2_new.select('NDWI'), ndwi_vis, 'NDWI')

# Display the map
Map
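
To get a feeling for how a two-color palette translates index values into colors, the interpolation can be sketched in plain Python. This assumes a simple linear interpolation in RGB between the two palette colors and clamping outside the min./max. range; the exact rendering in GEE may differ, and `interpolate_color` is a hypothetical helper.

```python
def interpolate_color(value, vmin, vmax, palette):
    """Linearly interpolate between two hex colors (e.g. 'C0C0C0')."""
    low, high = palette
    # Clamp the value to [vmin, vmax] and rescale it to the range 0..1
    t = (min(max(value, vmin), vmax) - vmin) / (vmax - vmin)
    channels = []
    for i in (0, 2, 4):  # red, green and blue parts of the hex code
        c0 = int(low[i:i + 2], 16)
        c1 = int(high[i:i + 2], 16)
        channels.append(round(c0 + t * (c1 - c0)))
    return ''.join(f'{c:02X}' for c in channels)

# NDVI palette from the cell above: gray (min) to green (max)
ndvi_palette = ['C0C0C0', '008000']
print(interpolate_color(-1, -1, 1, ndvi_palette))  # C0C0C0
print(interpolate_color(0, -1, 1, ndvi_palette))   # 60A060
print(interpolate_color(1, -1, 1, ndvi_palette))   # 008000
```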

---

*Literature*<br>
$^{[1]}$: https://en.wikipedia.org/wiki/Sentinel-2<br>
$^{[2]}$: https://custom-scripts.sentinel-hub.com/custom-scripts/sentinel/sentinel-2/<br>
$^{[3]}$: Tucker, C.J., 1979. Red and photographic infrared linear combinations for monitoring vegetation. Remote Sensing of Environment 8, 127–150.<br>
$^{[4]}$: https://custom-scripts.sentinel-hub.com/sentinel-2/ndvi/<br>
$^{[5]}$: https://en.wikipedia.org/wiki/Normalized_difference_vegetation_index<br>
$^{[6]}$: Gao, B., 1996. NDWI—A normalized difference water index for remote sensing of vegetation liquid water from space. Remote Sens. Environ. 58, 257–266.<br>
$^{[7]}$: Colombo, R., Meroni, M., Marchesi, A., Busetto, L., Rossini, M., Giardino, C., Panigada, C., 2008. Estimation of leaf and canopy water content in poplar plantations by means of hyperspectral indices and inverse modeling. Remote Sens. Environ. 112, 1820–1834.<br>
$^{[8]}$: https://custom-scripts.sentinel-hub.com/sentinel-2/ndmi/<br>
$^{[9]}$: https://eos.com/make-an-analysis/ndmi/<br>
$^{[10]}$: McFeeters, S.K., 1996. The use of the Normalized Difference Water Index (NDWI) in the delineation of open water features. Int. J. Remote Sens. 17, 1425–1432.<br>
$^{[11]}$: https://custom-scripts.sentinel-hub.com/sentinel-2/ndwi/<br>
$^{[12]}$: https://eos.com/make-an-analysis/ndwi/<br>