## Dask Sentinel-2 cloudless

Before running the cells in this notebook, open a Terminal and run:

```bash
cd cloudless-mosaic
pip install -e .
```

In [1]:
import os
from dask_gateway import Gateway
from eopf_sentinel_2.app import main

## Dask Gateway connection

The environment variable `DASK_GATEWAY_ADDRESS` contains the Dask Gateway internal service address.

Create a `Gateway` object, which is used later to create a Dask cluster.

In [2]:
gateway_url = os.environ.get("DASK_GATEWAY_ADDRESS")

gateway = Gateway(gateway_url)
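
Because `os.environ.get` returns `None` when a variable is unset, a missing `DASK_GATEWAY_ADDRESS` only surfaces later, when the gateway is used. The following is a minimal sketch of a fail-early lookup; the helper name `require_env` is hypothetical and not part of `dask_gateway` or of this project.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper for illustration; it is not part of dask_gateway.
    """
    value = os.environ.get(name)
    if not value:
        # Covers both an unset variable and an empty string.
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Possible usage in the cell above (assumes dask_gateway is installed):
# gateway = Gateway(require_env("DASK_GATEWAY_ADDRESS"))
```

The same helper could be applied to `DASK_IMAGE` in the next section.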



## Create a Dask cluster

The environment variable `DASK_IMAGE` contains the container image to use for the Dask cluster workers.

In [3]:
cluster_options = gateway.cluster_options()

image = os.environ.get("DASK_IMAGE")
worker_cores = 0.5
worker_cores_limit = 5
worker_memory = 2

cluster_options['image'] = image
cluster_options['worker_cores'] = worker_cores
cluster_options['worker_cores_limit'] = worker_cores_limit
cluster_options['worker_memory'] = f"{worker_memory} G"
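
The option values above could also be collected and sanity-checked in one place before they are applied. This is only a sketch: `build_worker_options` is a hypothetical helper, the option names are the ones used in this notebook, and which options exist depends on how the Dask Gateway server is configured.

```python
def build_worker_options(image, cores, cores_limit, memory_gb):
    """Collect worker options in a plain dict after a basic consistency check.

    Hypothetical helper; the keys mirror the options set in this notebook.
    """
    if cores > cores_limit:
        raise ValueError("worker_cores must not exceed worker_cores_limit")
    return {
        "image": image,
        "worker_cores": cores,
        "worker_cores_limit": cores_limit,
        # Same string format as in the cell above.
        "worker_memory": f"{memory_gb} G",
    }


# Possible usage with the objects defined above:
# for key, value in build_worker_options(image, 0.5, 5, 2).items():
#     cluster_options[key] = value
```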

In [4]:
cluster = gateway.new_cluster(cluster_options)

Print the cluster name:


In [5]:
cluster.name

'eoap-dask-gateway.b336c686a3b74374a91124410d759130'

Print the cluster dashboard link. It points to an in-cluster service address, so it is not directly reachable from a browser.

In [6]:
cluster.dashboard_link

'http://traefik-dask-gateway.eoap-dask-gateway.svc.cluster.local:80/clusters/eoap-dask-gateway.b336c686a3b74374a91124410d759130/status'

If the `traefik-dask-gateway` service is port-forwarded to local port 8001, the dashboard becomes reachable at:

In [7]:
print(f"https://localhost:8001/clusters/{cluster.name}/status")

https://localhost:8001/clusters/eoap-dask-gateway.b336c686a3b74374a91124410d759130/status
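
As an alternative to rebuilding the path by hand, the local URL can be derived from `cluster.dashboard_link` by swapping only the scheme and host. The sketch below uses the standard library; `local_dashboard_url` is a hypothetical helper, and the default port and scheme are simply the values printed above, so they should be adjusted to match the actual port forward.

```python
from urllib.parse import urlsplit, urlunsplit


def local_dashboard_url(dashboard_link, local_port=8001, scheme="https"):
    """Rewrite an in-cluster dashboard link to point at a local port forward.

    Hypothetical helper; the defaults mirror the values used in this notebook.
    """
    parts = urlsplit(dashboard_link)
    # Keep path, query and fragment; replace only scheme and host:port.
    return urlunsplit(
        (scheme, f"localhost:{local_port}", parts.path, parts.query, parts.fragment)
    )


# Possible usage with the cluster created above:
# print(local_dashboard_url(cluster.dashboard_link))
```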


Get the cluster client and enable adaptive scaling between 4 and 6 workers:

In [8]:

client = cluster.get_client()

cluster.adapt(minimum=4, maximum=6)


+---------+--------+-----------+---------+
| Package | Client | Scheduler | Workers |
+---------+--------+-----------+---------+
| lz4     | 4.4.4  | None      | None    |
+---------+--------+-----------+---------+


## Cloudless monthly mosaic generation

In [10]:
params = {
    "item_url": "https://stac.core.eopf.eodc.eu/collections/sentinel-2-l1c/items/S2B_MSIL1C_20250113T103309_N0511_R108_T32TLQ_20250113T122458"
}

main(**params)

2025-05-12 13:27:38.832 | INFO     | eopf_sentinel_2.app:main:19 - Area of interest
  dt = xr.open_datatree(remote_product_path, engine="zarr", chunks={})


RuntimeError: Error during deserialization of the task graph. This frequently
occurs if the Scheduler and Client have different environments.
For more information, see
https://docs.dask.org/en/stable/deployment-considerations.html#consistent-software-environments
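
The error points at an environment mismatch, which is consistent with the `lz4` row in the version table printed after the client was created. One way to narrow this down is to compare package versions between client and scheduler. The function below is a sketch that works on plain dictionaries; `find_version_mismatches` is a hypothetical helper, the `numpy` entry is made-up illustrative data, and the commented usage assumes that `client.get_versions()` returns per-role `"packages"` mappings.

```python
def find_version_mismatches(client_packages, scheduler_packages):
    """Return {package: (client_version, scheduler_version)} where they differ.

    Hypothetical helper operating on plain dicts of package versions.
    """
    mismatches = {}
    for name in sorted(set(client_packages) | set(scheduler_packages)):
        client_version = client_packages.get(name)
        scheduler_version = scheduler_packages.get(name)
        if client_version != scheduler_version:
            mismatches[name] = (client_version, scheduler_version)
    return mismatches


# Illustrative data: the lz4 values mirror the table above, numpy is made up.
example = find_version_mismatches(
    {"lz4": "4.4.4", "numpy": "2.0.0"},
    {"lz4": None, "numpy": "2.0.0"},
)
print(example)

# Possible usage with the client created above (layout is an assumption):
# versions = client.get_versions()
# find_version_mismatches(
#     versions["client"]["packages"], versions["scheduler"]["packages"]
# )
```

Any package reported here is a candidate for alignment between the notebook environment and the image referenced by `DASK_IMAGE`.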


## Dispose of the cluster

In [None]:
cluster.shutdown()
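
If a cell fails before this point, as happened above, the cluster keeps running until `shutdown()` is called manually. A minimal sketch of a guard that always shuts the cluster down is shown below; `disposable` is a hypothetical helper that relies only on the object having a `shutdown()` method.

```python
from contextlib import contextmanager


@contextmanager
def disposable(cluster):
    """Yield the cluster and always call shutdown() afterwards.

    Hypothetical helper; works with any object exposing shutdown().
    """
    try:
        yield cluster
    finally:
        # Runs even if the body of the with-block raises.
        cluster.shutdown()


# Possible usage with the objects defined in this notebook:
# with disposable(gateway.new_cluster(cluster_options)) as cluster:
#     client = cluster.get_client()
#     main(**params)
```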