<img src='https://gitlab.eumetsat.int/eumetlab/oceans/ocean-training/tools/frameworks/-/raw/main/img/Standard_banner.png' align='right' width='100%'/>

<font color="#138D75">**WEkEO Training**</font> <br>
**Copyright:** 2024 EUMETSAT <br>
**License:** MIT <br>
**Authors:** Anna-Lena Erdmann (EUMETSAT)

<html>
  <div style="width:100%">
    <div style="float:left"><a href="https://jupyterhub.prod.wekeo2.eu/hub/user-redirect/lab/tree/public/wekeo4data/wekeo-eocanvas/EOCAnvasWorkflow_Create_an_OLCI_DataCube.ipynb"><img src="https://img.shields.io/badge/launch-WEKEO-1a4696.svg?style=flat&logo=" alt="Open in WEkEO"></a></div>
    <div style="float:left"><p>&emsp;</p></div>
  </div>    
</html>

<div class="alert alert-block alert-success">
<h3> WEkEO EOCanvas Workflows - Applied Examples of EOCanvas for your EO Data Processing</h3></div>

<div class="alert alert-block alert-warning">
    
<b>PREREQUISITES </b>
    
This notebook has the following prerequisites:
  - **<a href="https://my.wekeo.eu/user-registration" target="_blank">A WEkEO account</a>**
  - basic knowledge of EOCanvas through executing the **<a href="https://github.com/wekeo/wekeo4data/blob/main/wekeo-eocanvas/01_Introduction_to_EOCanvas.ipynb" target="_blank">EOCanvas Introduction Notebook</a>**
  

</div>
<hr>

# Creating an OLCI Data Cube 

### Learning outcomes

At the end of this notebook you will know:

* how to reproject and resample OLCI marine satellite imagery using the EOCanvas
* how to work with the EOCanvas if you have more than one input
* how to merge several outputs into a datacube
* how to visualize a datacube using the <a href="https://xcube.readthedocs.io/en/latest/index.html" target="_blank">xcube</a> Viewer


### Outline

The EOCanvas is a WEkEO service for processing Copernicus data in the cloud. In the EOCanvas Workflows series of notebooks, we explore applied examples of how the EOCanvas can be used in EO data processing. This notebook focuses on creating a datacube from Sentinel-3 OLCI satellite imagery.

<div class="alert alert-info" role="alert">

### Contents <a id='totop'></a>

</div>
    
 1. [Setting Up](#section0)
 2. [Definition of the Input Parameters](#section1)
 3. [Access the OLCI data](#section2)
 4. [OLCI Data Processing using the EOCanvas](#section3)
 5. [Create the Datacube from EOCanvas Output](#section4)
 6. [Visualize the DataCube in the xcube Viewer](#section5)

<hr>

<div class="alert alert-info" role="alert">

## 1. <a id='section0'></a>Setting up
[Back to top](#totop)
    
</div>

Load necessary modules:

In [None]:
from eocanvas import API, Credentials
from eocanvas.api import Input, Config, ConfigOption
from eocanvas.datatailor.chain import Chain
from eocanvas.processes import DataTailorProcess

You must replace `<your_user_name>` and `<your_password>` with the information from your WEkEO account (if you don't have one yet, register <a href="https://www.wekeo.eu/" target="_blank">here</a>).

Save your credentials. They will be automatically loaded when required.

In [None]:
c = Credentials(username="<your_user_name>", password="<your_password>")
c.save()

<div class="alert alert-info" role="alert">

## 2. <a id='section1'></a>Definition of the Input Parameters
[Back to top](#totop)
    
</div>

In this section, we define the Area of Interest `W, S, E, N` and the Time of Interest `start_date` and `end_date` for data analysis. 

For this example, we have chosen an area of interest in the Baltic Sea.

<img src='./img/04_datacube_aoi.png' alt='' align='centre' width='30%'></img>

In [None]:
W, S, E, N = [
    15,
    53,
    22,
    58,
  ]

In [None]:
start_date = "2024-07-01T04:00:00.000Z"
end_date = "2024-07-07T04:00:00.000Z"

<div class="alert alert-info" role="alert">
    
## 3. <a id='section2'></a>Access the OLCI data  
[Back to top](#totop)
    
</div>

We will be working with the Sentinel-3 OLCI Water Full Resolution product. For more information on this dataset, refer to the [WEkEO data description](https://www.wekeo.eu/data?view=dataset&dataset=EO%3AEUM%3ADAT%3ASENTINEL-3%3AOL_2_WFR___).

This cell generates the WEkEO API request to retrieve Sentinel-3 OLCI data, based on the Area of Interest (AOI) and Time of Interest. For further details on creating API requests, see the [WEkEO API notebook](https://github.com/wekeo/wekeo4data/blob/main/wekeo-hda/wekeo_harmonised_data_access_api.ipynb).

In [None]:
q = {
  "dataset_id": "EO:EUM:DAT:SENTINEL-3:OL_2_WFR___",
  "dtstart": start_date,
  "dtend": end_date,
  "bbox": [W, S, E, N],
  "type": "OL_2_WFR___",
  "sat": "Sentinel-3A",
  "itemsPerPage": 200,
  "startIndex": 0
}


Using the WEkEO API client, this cell searches for Sentinel-3 OLCI data according to the specified parameters and returns a list of download URLs, one for each satellite imagery product.

In [None]:
from hda import Client

c = Client()
r = c.search(q)
url_list = r.get_download_urls()

This cell creates a list of IDs from the search results to examine the returned dataset in more detail.

In [None]:
id_list = [result['id'] for result in r.results]

In [7]:
id_list

['S3A_OL_2_WFR____20240706T094017_20240706T094317_20240707T171404_0179_114_193_1980_MAR_O_NT_003.SEN3',
 'S3A_OL_2_WFR____20240705T100628_20240705T100928_20240706T180520_0180_114_179_1980_MAR_O_NT_003.SEN3',
 'S3A_OL_2_WFR____20240705T082529_20240705T082829_20240706T160511_0180_114_178_1980_MAR_O_NT_003.SEN3',
 'S3A_OL_2_WFR____20240704T085140_20240704T085440_20240705T171100_0179_114_164_1980_MAR_O_NT_003.SEN3',
 'S3A_OL_2_WFR____20240703T091751_20240703T092051_20240704T155449_0179_114_150_1980_MAR_O_NT_003.SEN3',
 'S3A_OL_2_WFR____20240702T094402_20240702T094702_20240703T164724_0179_114_136_1980_MAR_O_NT_003.SEN3',
 'S3A_OL_2_WFR____20240701T101013_20240701T101313_20240702T172844_0179_114_122_1980_MAR_O_NT_003.SEN3',
 'S3A_OL_2_WFR____20240701T082914_20240701T083214_20240702T152801_0179_114_121_1980_MAR_O_NT_003.SEN3']
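
The product IDs already carry the acquisition times, which is handy for a quick check that the results fall inside the requested time window. The sketch below is a minimal illustration and assumes the usual Sentinel-3 naming convention, in which the first two timestamps in the name are the sensing start and stop times. The helper name `parse_sensing_times` is hypothetical and not part of any WEkEO library.

```python
import re
from datetime import datetime


def parse_sensing_times(product_id):
    """Return (sensing_start, sensing_stop) parsed from a Sentinel-3 product ID.

    Assumes the first two YYYYMMDDTHHMMSS stamps in the name are the
    sensing start and stop times.
    """
    stamps = re.findall(r"\d{8}T\d{6}", product_id)
    if len(stamps) < 2:
        raise ValueError(f"No sensing times found in {product_id!r}")
    fmt = "%Y%m%dT%H%M%S"
    return datetime.strptime(stamps[0], fmt), datetime.strptime(stamps[1], fmt)


# One of the IDs returned by the search above
example_id = (
    "S3A_OL_2_WFR____20240706T094017_20240706T094317_"
    "20240707T171404_0179_114_193_1980_MAR_O_NT_003.SEN3"
)
sensing_start, sensing_stop = parse_sensing_times(example_id)
print(sensing_start.isoformat(), sensing_stop.isoformat())
```

Applied to the whole list, this could be written as `[parse_sensing_times(pid)[0] for pid in id_list]`.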

<div class="alert alert-info" role="alert">
    
## 4. <a id='section3'></a>OLCI Data Processing using the EOCanvas  
[Back to top](#totop)
    
</div>


First, we load the processing chain from the YAML configuration file. The Region of Interest (ROI) is set to the coordinates `[N, S, W, E]`, which we defined in Section 2. Note that this order differs from the `[W, S, E, N]` order used for the search request. The ROI specifies the area to which the images are cropped.

In [None]:
chain = Chain.from_file("input_graphs/olci_datacube.yaml")
chain.roi =  {'NSWE': [N, S, W, E], 'name': 'Baltic', 'id': 'baltic'}

In [9]:
chain

Chain(id=None, product='OLL2WFR', format='netcdf4', name=None, description=None, aggregation=None, projection='geographic', roi={'NSWE': [58, 53, 15, 22], 'name': 'Baltic', 'id': 'baltic'}, filter=Filter(id=None, bands=['chl_nn'], name=None, product=None), quicklook=None, resample_method=None, resample_resolution=[0.003, 0.003], compression=None, xrit_segments=None)
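
Because the search request uses the order `[W, S, E, N]` while the ROI uses `[N, S, W, E]`, the two are easy to mix up. The following is a small sketch of a conversion helper with basic range checks. The function name `bbox_to_nswe` is hypothetical, and the check assumes the area does not cross the antimeridian.

```python
def bbox_to_nswe(west, south, east, north):
    """Reorder a (W, S, E, N) bounding box into the [N, S, W, E] ROI order."""
    if not (-90 <= south < north <= 90):
        raise ValueError("Latitudes must satisfy -90 <= S < N <= 90")
    # Assumption: the box does not cross the antimeridian (so W < E)
    if not (-180 <= west < east <= 180):
        raise ValueError("Longitudes must satisfy -180 <= W < E <= 180")
    return [north, south, west, east]


# Same Baltic Sea area of interest as defined in Section 2
roi_coords = bbox_to_nswe(15, 53, 22, 58)
print(roi_coords)
```

The result matches the `NSWE` value shown in the chain output above, so it could be passed as `chain.roi = {'NSWE': roi_coords, 'name': 'Baltic', 'id': 'baltic'}`.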

Because we want to process the different Sentinel-3 products as quickly as possible, we launch the EOCanvas processes in parallel. For this we use Python's `ThreadPoolExecutor`.

Note: Processing is subject to a **quota**, which limits the number of processes you can run per hour. The default quota is 3 requests/hour, so this example processes only 3 products. If you want to process more products, you can delay the submission of further processes by one hour, or request a **quota increase** at **support@wekeo.eu**.

In [None]:
from concurrent.futures import ThreadPoolExecutor
MAX_QUOTA = 3
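
If you want to process more products than the quota allows in one go, one option is to split the URL list into quota-sized batches and submit one batch per hour. The sketch below only shows the splitting step. The helper name `chunk_by_quota` and the placeholder URLs are illustrative assumptions, and the one-hour spacing follows the quota note above rather than any documented scheduling feature.

```python
def chunk_by_quota(urls, quota):
    """Split a list of URLs into consecutive batches of at most `quota` items."""
    if quota < 1:
        raise ValueError("quota must be at least 1")
    return [urls[i:i + quota] for i in range(0, len(urls), quota)]


# Placeholder URLs standing in for the entries of url_list
example_urls = [f"https://example.invalid/product/{n}" for n in range(8)]
batches = chunk_by_quota(example_urls, 3)

for index, batch in enumerate(batches):
    # With a quota of 3 requests/hour, batch `index` would be submitted after `index` hours
    print(f"Batch {index}: {len(batch)} product(s), submit after {index} hour(s)")
```

Each batch could then be handed to the executor shown in the following cells, with a pause between batches.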

Next, we define the `exec_data_tailor_process` function. This function takes a URL, initiates a data processing chain (Data Tailor), and saves the output in the specified directory. It also logs the start and finish of each process for tracking purposes.

In [None]:
def exec_data_tailor_process(url):
    print(f"Starting Data Tailor for product at {url}")
    inputs = Input(key="img1", url=url)
    process = DataTailorProcess(epct_chain=chain, epct_input=inputs)
    process.run(download_dir="result/datacube_test")
    return f"Finishing Data Tailor for product at {url}"

Finally, we are ready to execute the `exec_data_tailor_process` function on multiple URLs in parallel using a `ThreadPoolExecutor`. It submits up to three tasks at a time and prints the result of each task once it has finished.

In [12]:
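with ThreadPoolExecutor(max_workers=MAX_QUOTA) as executor:
    # Submit at most MAX_QUOTA products (fewer if the search returned fewer URLs)
    futures = [executor.submit(exec_data_tailor_process, url) for url in url_list[:MAX_QUOTA]]

    # Collect the results in submission order; result() blocks until that job has finished
    for future in futures:
        print(future.result())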

Starting Data Tailor for product at https://gateway.prod.wekeo2.eu/hda-broker/api/v1/dataaccess/download/6706a59994f83a7bc7938ce1
Starting Data Tailor for product at https://gateway.prod.wekeo2.eu/hda-broker/api/v1/dataaccess/download/6706a59e94f83a7bc7938ce2
Starting Data Tailor for product at https://gateway.prod.wekeo2.eu/hda-broker/api/v1/dataaccess/download/6706a5a494f83a7bc7938ce7
Job: 00c3901f-86ca-566a-9eb5-5ab04e06a5cf - Status: accepted at 2024-10-09T17:49:16.788703
Job: 757e5fdd-be4d-5def-a228-1fbcca1ce7cb - Status: accepted at 2024-10-09T17:49:16.995631
Job: f5eb5d9f-c1d2-5882-8f65-f90d7984eff2 - Status: accepted at 2024-10-09T17:49:17.012637
Job: 00c3901f-86ca-566a-9eb5-5ab04e06a5cf - Status: running at 2024-10-09T17:49:27.451889
Job: 757e5fdd-be4d-5def-a228-1fbcca1ce7cb - Status: running at 2024-10-09T17:49:27.754149
Job: f5eb5d9f-c1d2-5882-8f65-f90d7984eff2 - Status: running at 2024-10-09T17:49:27.763400
Job: 00c3901f-86ca-566a-9eb5-5ab04e06a5cf - Status: running at 2024
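
The loop above collects the results in the order in which the jobs were submitted. If you would rather handle each result as soon as its job finishes, `concurrent.futures.as_completed` is one alternative. The sketch below illustrates the pattern with a stand-in function, `fake_process`, that only sleeps; it is not the Data Tailor process, and the URLs and delays are made up for the illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_process(url, delay):
    """Stand-in for a long-running job: sleep for `delay` seconds, then report."""
    time.sleep(delay)
    return f"Finished {url}"


# Hypothetical jobs with different run times (in seconds)
jobs = {"url-a": 0.3, "url-b": 0.1, "url-c": 0.2}
completed = []

with ThreadPoolExecutor(max_workers=3) as executor:
    # Map each future back to its URL so we know which job has finished
    futures = {executor.submit(fake_process, url, delay): url for url, delay in jobs.items()}
    for future in as_completed(futures):
        print(future.result())
        completed.append(futures[future])
```

Here the shortest job is normally reported first, regardless of the submission order.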

<div class="alert alert-info" role="alert">
    
## 5. <a id='section4'></a>Create the Datacube from EOCanvas Output  
[Back to top](#totop)
    
</div>

Each output file corresponds to one processed Sentinel-3 image. In this step, we open each result file sequentially and add a time dimension to represent the acquisition time of each image.


In [None]:
import glob
import xarray as xr
import pandas as pd

datasets = []

for file in glob.glob("result/datacube_test/*.nc"):
    ds = xr.open_dataset(file)
    timestamp = pd.to_datetime(ds.attrs['start_time'])
    ds = ds.expand_dims(time=[timestamp])
    #ds = ds.drop_vars([var for var in ds.variables if var not in ['lat', 'lon', 'time', 'chl_nn']], errors='ignore')
    datasets.append(ds)

We merge the files along latitude, longitude, and time dimensions to create a unified datacube, enabling spatial and temporal analysis across the dataset.


In [None]:
data_cube = xr.merge(datasets)
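
To see what the expand-and-merge steps do without waiting for real EOCanvas output, the following sketch repeats them on two tiny synthetic scenes. The grid, the values, and the `make_scene` helper are invented for the illustration; only the `start_time` attribute and the `chl_nn` variable name mirror the cells above.

```python
import numpy as np
import pandas as pd
import xarray as xr

# A tiny synthetic grid standing in for the resampled OLCI grid
lat = np.array([53.0, 53.5])
lon = np.array([15.0, 15.5, 16.0])


def make_scene(value, start_time):
    """Build a 2D dataset that mimics one processed scene."""
    return xr.Dataset(
        {"chl_nn": (("lat", "lon"), np.full((2, 3), value))},
        coords={"lat": lat, "lon": lon},
        attrs={"start_time": start_time},
    )


scenes = [
    make_scene(0.2, "2024-07-01T08:29:14"),
    make_scene(0.4, "2024-07-02T09:44:02"),
]

stamped = []
for ds in scenes:
    # Turn the acquisition time into a length-1 time dimension
    timestamp = pd.to_datetime(ds.attrs["start_time"])
    stamped.append(ds.expand_dims(time=[timestamp]))

# Merging aligns the scenes on lat/lon and stacks them along time
toy_cube = xr.merge(stamped)
print(toy_cube["chl_nn"].dims, toy_cube["chl_nn"].shape)
```

Since both scenes share the same grid here, the merged variable has one slice per scene along `time`. With real products whose footprints differ, cells not covered by a scene are filled with NaN.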

After merging, we assign a name to the datacube for easier reference and examine its structure. The resulting datacube contains the data from the three different OLCI products.

In [None]:
data_cube.attrs["title"] = "Sentinel-3 OLCI CHL_NN Data Cube"

In [16]:
data_cube

<div class="alert alert-info" role="alert">
    
## 6. <a id='section5'></a>Visualize the DataCube in the xcube Viewer  
[Back to top](#totop)
    
</div>

In this section, we visualize the datacube using the xcube viewer. For additional information and examples, check out the [WEkEO xcube viewer examples](https://github.com/wekeo/wekeo4data/tree/main/wekeo-xcube).


We set an environment variable to ensure that the xcube viewer displays correctly within the WEkEO JupyterHub. This step is only necessary if you are running this notebook inside the [WEkEO JupyterHub](https://jupyterhub.prod.wekeo2.eu/).


In [None]:
import os

from xcube.webapi.viewer import Viewer

# Replace <your_username> with your WEkEO JupyterHub user name
os.environ["XCUBE_JUPYTER_LAB_URL"] = "https://jupyterhub.prod.wekeo2.eu/user/<your_username>/"

Here, we configure the viewer settings and choose a style for displaying the datacube. This includes setting parameters such as color maps and value range to enhance data visualization.


In [None]:
viewer = Viewer(server_config={
    "Styles": [
        {
            "Identifier": "chl_nn",
            "ColorMappings": {
                # The key is the name of the variable in the datacube that the mapping applies to
                "chl_nn": {
                    "ValueRange": [0, 1],
                    "ColorBar": "viridis"
                },
            }
        }
    ]
})

In this cell, we add the datacube layer to the viewer and specify the display style for this layer. This allows us to customize how the data appears on the map.


In [19]:
viewer.add_dataset(data_cube, style="chl_nn")

'2e1d58bf-ef29-4d70-83de-96d15ca0a3c7'

Finally, we display the datacube in the viewer, enabling interactive exploration of the data across different time steps and spatial regions.

In [20]:
viewer.show()

404 GET /viewer/config/config.json (127.0.0.1): xcube viewer has not been been configured
404 GET /viewer/config/config.json (127.0.0.1) 51.47ms
  dim_name: cube_chunks.get(dim_name, cube.dims[dim_name])
