# Mosaic from a single multitemporal dataset


The goal of this notebook is to show how to create a cloud-free mosaic from Sentinel-2 imagery over a specific area and time period. We first use `satsearch` to search for Sentinel-2 data, then combine the results using `stackstac`. A median operation is then applied to merge the images into a single layer, which could be saved to Azure Blob Storage as COGs for later use.


## 1. Sentinel-2 Dataset

Satellite images (also called Earth observation imagery, spaceborne photography, or simply satellite photos) are images of Earth collected by imaging satellites operated by governments and businesses around the world (see https://en.wikipedia.org/wiki/Satellite_imagery). Their major applications include Earth observation and land cover monitoring.


SENTINEL-2 (https://sentinel.esa.int/web/sentinel/user-guides/sentinel-2-msi/overview) is a wide-swath, high-resolution, multi-spectral imaging mission, supporting Copernicus Land Monitoring studies, including the monitoring of vegetation, soil and water cover, as well as observation of inland waterways and coastal areas.

## 2. Environment setup

The necessary libraries are imported below.

In [None]:
from dask_gateway import GatewayCluster
from dask_gateway import Gateway
from distributed import Client

from datashader import Canvas
from PIL import Image

import stackstac
from satsearch import Search

import xrspatial.multispectral as ms

Let's create a new cluster configured to use Dask-Gateway, and a new client that executes all Dask computations on that cluster. We also put the cluster in adaptive mode so that it resizes itself automatically based on the workload.

In [None]:
cluster = GatewayCluster()  # Creates the Dask Scheduler. Might take a minute.
client = cluster.get_client()
cluster.adapt(minimum=8, maximum=100)

client

## 3. Load Sentinel-2 data

In this example, we use data from the `sentinel-s2-l2a-cogs` collection within a bounding box of `[-97.185642, 27.569157, -95.117574, 29.500710]`, over the time range `2019-07-01` to `2020-06-30`. Only items with less than 25% cloud cover are kept.

In [None]:
items = Search(
    url="https://earth-search.aws.element84.com/v0",
    bbox=[-97.185642, 27.569157, -95.117574, 29.500710],
    collections=["sentinel-s2-l2a-cogs"],
    query={'eo:cloud_cover': {'lt': 25}},
    datetime="2019-07-01/2020-06-30"
).items()

len(items)

Let's combine all of the above STAC items into a lazy xarray `DataArray` with the following settings:
- projection: epsg=32613
- resolution: 100m
- bands: red (B04), green (B03), blue (B02)

In [None]:
stack_ds = stackstac.stack(
    items, epsg=32613, resolution=100, assets=['B04', 'B03', 'B02']
)

stack_ds
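
To get a rough sense of how large the stacked array will be before computing anything, the spatial extent can be converted to pixels at the chosen resolution. The sketch below is a back-of-the-envelope estimate only: it assumes a spherical Earth with roughly 111.32 km per degree of latitude, and the helper name `approx_grid_shape` is hypothetical. The shape that `stackstac` actually reports depends on the target projection and will differ.

```python
import math

KM_PER_DEGREE = 111.32  # approximate length of one degree of latitude (assumption)


def approx_grid_shape(bbox, resolution_m):
    """Estimate (rows, cols) for a lon/lat bbox at a given pixel size in metres."""
    min_lon, min_lat, max_lon, max_lat = bbox
    mid_lat = math.radians((min_lat + max_lat) / 2)
    height_m = (max_lat - min_lat) * KM_PER_DEGREE * 1000
    # A degree of longitude shrinks with the cosine of the latitude.
    width_m = (max_lon - min_lon) * KM_PER_DEGREE * 1000 * math.cos(mid_lat)
    return math.ceil(height_m / resolution_m), math.ceil(width_m / resolution_m)


rows, cols = approx_grid_shape([-97.185642, 27.569157, -95.117574, 29.500710], 100)
print(rows, cols)
```

Multiplying the two numbers by the band count and the number of items gives an order-of-magnitude idea of how much data the lazy array represents.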

We can get a median composite for each month in the considered period of time:

In [None]:
monthly = stack_ds.resample(time="MS").median("time", keep_attrs=True)
monthly.data = monthly.data.rechunk(1024, 1024)
monthly
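
Conceptually, `resample(time="MS")` buckets observations by the first day of their calendar month and then reduces each bucket. The pure-Python sketch below mimics that grouping for a single pixel. The dates and reflectance values are made up, and `monthly_median` is a hypothetical helper, not part of xarray.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def monthly_median(observations):
    """Group (date, value) pairs by month start and take the median of each group."""
    buckets = defaultdict(list)
    for day, value in observations:
        buckets[day.replace(day=1)].append(value)  # label = first day of the month
    return {month: median(values) for month, values in sorted(buckets.items())}


# Made-up values for one pixel; 9000 stands in for a cloudy acquisition.
observations = [
    (date(2019, 7, 3), 820), (date(2019, 7, 13), 9000), (date(2019, 7, 23), 800),
    (date(2019, 8, 2), 790), (date(2019, 8, 12), 810),
]
result = monthly_median(observations)
for month, value in result.items():
    print(month, value)
```

The single bright outlier in July does not affect that month's median, which is why a median is preferred over a mean here.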

## 4. Cloud-free scene using median operator

In this step, we use a median operation to merge all monthly images into a single cloud-free layer. The underlying assumption is that, across a multitemporal stack, clouds do not persist at the same geographic location from image to image, so the per-pixel median tends to pick a cloud-free value. The more data we have, the better the chance of removing clouds.

In [None]:
median_scene = monthly.median(dim=['time'])
median_scene.data = median_scene.data.rechunk(2048, 2048)
median_scene
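
The effect of the median can be checked on a toy stack. The sketch below uses plain NumPy with synthetic values (not Sentinel-2 data): a constant ground reflectance, with a bright "cloud" placed at a different pixel in most time steps. Because each pixel is cloudy in only a minority of the time steps, the per-pixel median recovers the ground value.

```python
import numpy as np

ground = 0.1  # synthetic surface reflectance
cloud = 0.9   # synthetic cloud reflectance

# Stack of 5 time steps over a 4x4 scene, shape (time, y, x).
stack = np.full((5, 4, 4), ground)
for t in range(5):
    stack[t, t % 4, (2 * t) % 4] = cloud  # one cloudy pixel per time step

composite = np.median(stack, axis=0)  # per-pixel median across time
print(composite.max())
```

If a pixel were cloudy in half or more of the time steps, the median would return the cloud value instead, which is why a longer time series helps.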

## 5. Downsample for visualization

With the three bands (red, green, and blue) available, we can visualize the cloud-free scene using the `true_color` function from the `xrspatial.multispectral` module. First, we downsample the scene to a size suitable for display.

In [None]:
h, w = 600, 800
canvas = Canvas(plot_height=h, plot_width=w)
resampled_agg = canvas.raster(median_scene)

resampled_agg
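
Downsampling trades spatial detail for a smaller array that is cheap to render. As a rough illustration of the idea (not datashader's actual resampling code), the NumPy sketch below shrinks a 2D band by averaging non-overlapping blocks. `block_mean` is a hypothetical helper and assumes the array dimensions are divisible by the block factor.

```python
import numpy as np


def block_mean(band, factor):
    """Downsample a 2D array by averaging non-overlapping factor x factor blocks."""
    rows, cols = band.shape
    if rows % factor or cols % factor:
        raise ValueError("array dimensions must be divisible by factor")
    # Split each axis into (number of blocks, block size), then average each block.
    blocks = band.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))


band = np.arange(16, dtype=float).reshape(4, 4)
small = block_mean(band, 2)
print(small.shape)
```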

The `true_color` function takes three bands (red, green, and blue) as inputs and returns an RGBA image array, which the following cells compute and convert to a PIL image.

In [None]:
# Bands follow the order of `assets`: index 0 is red (B04), 1 is green (B03), 2 is blue (B02).
image = ms.true_color(resampled_agg[0], resampled_agg[1], resampled_agg[2])
image

Compute the image and display it with PIL:

In [None]:
image = image.compute()

Image.fromarray(image.data, 'RGBA')
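
For intuition about what an RGBA array holds, the sketch below builds one from three bands with a simple linear stretch. This is an illustrative stand-in, not the algorithm that `true_color` uses. The helper `to_rgba` and the synthetic band values are assumptions.

```python
import numpy as np


def to_rgba(red, green, blue):
    """Stack three 2D bands into an (H, W, 4) uint8 RGBA array with a linear stretch."""
    def stretch(band):
        band = band.astype(float)
        lo, hi = np.nanmin(band), np.nanmax(band)
        scaled = (band - lo) / (hi - lo) if hi > lo else np.zeros_like(band)
        return (np.nan_to_num(scaled) * 255).astype(np.uint8)

    # Make pixels with missing red values fully transparent, everything else opaque.
    alpha = np.where(np.isnan(red.astype(float)), 0, 255).astype(np.uint8)
    return np.dstack([stretch(red), stretch(green), stretch(blue), alpha])


rng = np.random.default_rng(0)
red, green, blue = (rng.uniform(0, 3000, size=(4, 6)) for _ in range(3))
rgba = to_rgba(red, green, blue)
print(rgba.shape, rgba.dtype)
```

An array with this shape and dtype is the layout that `Image.fromarray(..., 'RGBA')` expects.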

Finally, close the client and the cluster.

In [None]:
client.close()
cluster.close()