<img src='https://gitlab.eumetsat.int/eumetlab/oceans/ocean-training/tools/frameworks/-/raw/main/img/Standard_banner.png' align='right' width='100%'/>

<font color="#138D75">**NERO Winter School training**</font> <br>
**Copyright:** (c) 2025 EUMETSAT <br>
**License:** GPL-3.0-or-later <br>
**Authors:** Dominika Leskow-Czyżewska (EUMETSAT), based on <a href='https://github.com/eu-cdse/notebook-samples/blob/main/openeo/Batch_job.ipynb'>OpenEO Copernicus Data Space Ecosystem Sample notebooks</a>.

# Sentinel-2 full resolution imagery - Using openEO Batch Jobs To Run Large and Heavy Workflows


Most basic openEO usage examples show synchronous execution of process graphs:
you submit a process graph with an HTTP request and receive the result as the direct response to that same request.
This is only feasible if the processing doesn’t take too long (a couple of minutes at most).


For heavier work (large regions of interest, long time series, more intensive processing, etc.), you have to use batch jobs.

This notebook shows how to programmatically create and interact with batch jobs using the openEO Python client library.

## Set up

Import the `openeo` package and establish an authenticated connection to the Copernicus Data Space Ecosystem openEO back-end.

In [1]:
import os  # needed later for os.path.join when building the download folder
from os import chmod

import openeo


In [3]:
# Uncomment the line below if the cell below encounters an error
# chmod('/home/jovyan/.local/share/openeo-python-client/refresh-tokens.json', 0o600)
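The `chmod` call above restricts the refresh token file to its owner. As a minimal sketch, assuming the error in question is a complaint that the file is readable by other users, the same fix can be applied only when needed. The helper name `ensure_private` is hypothetical and not part of the openEO client.

```python
import os
import stat
from pathlib import Path


def ensure_private(path):
    """Set a file to mode 0o600 if group or others have any access to it.

    Returns True if the mode was changed, False otherwise.
    (Hypothetical helper for illustration only.)
    """
    path = Path(path)
    if not path.exists():
        # Nothing to fix: the token file only exists after a first login.
        return False
    mode = stat.S_IMODE(path.stat().st_mode)
    if mode & 0o077:  # any group/other permission bit is set
        os.chmod(path, 0o600)
        return True
    return False
```

In this notebook it could be called as `ensure_private('/home/jovyan/.local/share/openeo-python-client/refresh-tokens.json')` before authenticating.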

In [4]:
connection = openeo.connect(url="openeo.dataspace.copernicus.eu")
connection.authenticate_oidc()

Authenticated using refresh token.


<Connection to 'https://openeo.dataspace.copernicus.eu/openeo/1.2/' with OidcBearerAuth>

## Build data cube

Start with a simple data cube: a small spatiotemporal slice of `SENTINEL2_L2A` data:

In [5]:
# Lat-lon geographical bounds of the search area (degrees, EPSG:4326)
W = 23.3
S = 37.8
E = 24.5
N = 38.5

start_date = "2024-08-07"
end_date = "2024-08-18"

run_name = "varnavas_example"
output_dir = './example_data/'

bands = ["B08", "B04", "B03", "B02", "B12"]

In [6]:
datacube = connection.load_collection(
    "SENTINEL2_L2A",
    bands=bands,
    temporal_extent=(start_date, end_date),
    spatial_extent={
        "west": W,
        "south": S,
        "east": E,
        "north": N,
        "crs": "EPSG:4326",
    },
    max_cloud_cover=100,  # 100 % keeps every scene, i.e. no cloud filtering
)

Set the output format to GeoTIFF:

In [7]:
job = datacube.save_result(format="GTiff")

## Run as Batch Job

The easiest way to run our processing as a batch job is the `execute_batch()` helper,
which creates the batch job, starts it, and keeps polling its status until it has finished (or failed).

While not necessary, it is recommended to give your batch job a descriptive title so it’s easier to identify in your job listing.

In [8]:
job = job.execute_batch(title="Slice of S2 data")
print("Batch job finished!")  # results are downloaded in a later step

0:00:00 Job 'j-2502191742104f609ad6db65dac8d066': send 'start'
0:00:13 Job 'j-2502191742104f609ad6db65dac8d066': created (progress 0%)
0:00:18 Job 'j-2502191742104f609ad6db65dac8d066': created (progress 0%)
0:00:24 Job 'j-2502191742104f609ad6db65dac8d066': created (progress 0%)
0:00:32 Job 'j-2502191742104f609ad6db65dac8d066': created (progress 0%)
0:00:42 Job 'j-2502191742104f609ad6db65dac8d066': created (progress 0%)
0:00:54 Job 'j-2502191742104f609ad6db65dac8d066': running (progress N/A)
0:01:10 Job 'j-2502191742104f609ad6db65dac8d066': running (progress N/A)
0:01:29 Job 'j-2502191742104f609ad6db65dac8d066': running (progress N/A)
0:01:53 Job 'j-2502191742104f609ad6db65dac8d066': running (progress N/A)
0:02:23 Job 'j-2502191742104f609ad6db65dac8d066': running (progress N/A)
0:03:00 Job 'j-2502191742104f609ad6db65dac8d066': running (progress N/A)
0:03:47 Job 'j-2502191742104f609ad6db65dac8d066': running (progress N/A)
0:04:45 Job 'j-2502191742104f609ad6db65dac8d066': running (progres

If you need more control over the lifetime of a batch job,
you can do each step manually:
- create a job with `job = cube.create_job()`
- start the job with `job.start()` (called `start_job()` in older client versions)
- wait until `job.status()` reaches `"finished"`
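The manual steps can be sketched as a small polling loop. This is an illustration rather than the client's own implementation: the helper name `run_job_manually` and its `poll_interval` and `max_wait` parameters are assumptions, and `cube` is assumed to be an openEO data cube such as the result of `save_result()`. The status strings are those defined by the openEO API.

```python
import time


def run_job_manually(cube, title, poll_interval=10.0, max_wait=3600.0):
    """Create, start and poll a batch job; return the job once it has finished.

    Hypothetical helper: `cube` is assumed to offer `create_job()`, and the
    returned job `start()`, `status()` and `job_id`.
    """
    job = cube.create_job(title=title)
    job.start()  # called start_job() in older client versions
    waited = 0.0
    while True:
        status = job.status()
        if status == "finished":
            return job
        if status in ("error", "canceled"):
            raise RuntimeError(f"Batch job {job.job_id} ended with status {status!r}")
        if waited >= max_wait:
            raise TimeoutError(f"Batch job {job.job_id} still {status!r} after {max_wait} s")
        time.sleep(poll_interval)
        waited += poll_interval
```

Unlike `execute_batch()`, this form lets you decide how often to poll and what to do between polls, for example storing the job id before waiting.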


## Optional: Inspecting a Job

A batch job on a back-end is fully identified by its job id. If a job fails, its logs are the first place to look for the cause.

For the job we created above:

In [9]:
job.job_id

'j-2502031342484645a2d4283c99fcad56'

Take note of the batch job id.
It allows you to “reconnect” to your job on the back-end (using `connection.job(job_id)`),
even if it was created at another time, by another script/notebook or even with another openEO client.
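One way to keep the job id across sessions is to write it to a small file and read it back before calling `connection.job(job_id)`. The sketch below assumes a JSON file named `job_id.json`; both the file name and the helper names are hypothetical.

```python
import json
from pathlib import Path


def save_job_id(job_id, path="job_id.json"):
    """Store a batch job id on disk so that a later session can find it."""
    Path(path).write_text(json.dumps({"job_id": job_id}))


def load_job(connection, path="job_id.json"):
    """Read a stored job id and reconnect to that job on the back-end."""
    job_id = json.loads(Path(path).read_text())["job_id"]
    return connection.job(job_id)
```

In this notebook, `save_job_id(job.job_id)` would run after the job is created, and `load_job(connection)` in a later session once the connection is authenticated again.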


A batch job typically takes some time to finish, and you can check its status with the `status()` method.

In [7]:
job.status()

'finished'

Batch job logs can be fetched with `job.logs()`. If you prefer a graphical, web-based interactive environment to manage and monitor your batch jobs, feel free to switch to an openEO web editor like [openeo.dataspace.copernicus.eu](https://openeo.dataspace.copernicus.eu/) at any time.

In [23]:
job.logs()
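Job logs can be long, so it often helps to look at the error entries first. A minimal sketch, assuming each log entry behaves like a dictionary with `level` and `message` keys (as in the openEO API log format); the helper name `error_messages` is hypothetical.

```python
def error_messages(log_entries):
    """Return the messages of all log entries whose level is "error"."""
    return [
        entry["message"]
        for entry in log_entries
        if entry.get("level") == "error"
    ]
```

For the job above this would be used as `error_messages(job.logs())`.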

## Fetch Batch Job Results

The result of a finished batch job consists of several elements:
- a STAC-compatible description (metadata) of the batch job results
- one or more output files (e.g. multiple GeoTIFF or netCDF assets)

You can get a handle to these results with `get_results()`:

In [16]:
results = job.get_results()

In the general case, when you have one or more result files (also called “assets”), the easiest way to download them is `download_files()` (plural): you specify a download folder, or omit it to use the current working directory.

In [17]:
output_dir = os.path.join(output_dir, run_name, 'Satellite_Imagery', 'S2_DAILY')
results.download_files(output_dir)

[PosixPath('test/testrun/Satellite_Imagery/S2_DAILY/openEO_2024-08-07Z.tif'),
 PosixPath('test/testrun/Satellite_Imagery/S2_DAILY/openEO_2024-08-12Z.tif'),
 PosixPath('test/testrun/Satellite_Imagery/S2_DAILY/openEO_2024-08-17Z.tif'),
 PosixPath('test/testrun/Satellite_Imagery/S2_DAILY/job-results.json')]
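If only some of the assets are needed, they can also be downloaded one by one. The sketch below assumes that `results.get_assets()` returns objects with a `name` attribute and a `download(target)` method, as the result assets of the openEO Python client do; the helper name `download_geotiffs` is hypothetical. It skips the `job-results.json` metadata file and keeps only GeoTIFFs.

```python
from pathlib import Path


def download_geotiffs(results, target_dir):
    """Download only the GeoTIFF assets of a batch job result.

    Returns the list of local paths that were written.
    """
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for asset in results.get_assets():
        if asset.name.lower().endswith((".tif", ".tiff")):
            target = target_dir / asset.name
            asset.download(target)
            paths.append(target)
    return paths
```

With the variables of this notebook, the call would be `download_geotiffs(results, output_dir)`.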

<hr>

<p style="text-align:left;">This project is licensed under the <a href="./LICENSE.TXT">GPL-3.0-or-later</a> license <span style="float:right;"><a href="https://gitlab.eumetsat.int/eumetlab/atmosphere/trainings/nero-winter-school-2025">View on GitLab</a> | <a href="https://classroom.eumetsat.int/">EUMETSAT Training</a> | <a href=mailto:ops@eumetsat.int>Contact</a></span></p>