### Development and example of patch-to-point extraction workflow

In [1]:
import openeo
import pandas as pd 
import pystac
import pystac_client
import requests

from shapely.geometry import shape, MultiPolygon

from worldcereal.extract.patch_to_point_worldcereal import create_job_patch_to_point_worldcereal, get_label_points
from worldcereal.rdm_api import RdmInteraction

#### Create job dataframe

We orchestrate the jobs by splitting them per `ref_id` and `EPSG`. The code below will eventually be replaced by a `create_job_dataframe_patch_to_point_worldcereal` function (or one with a fancier name). For now, we just create a dummy `pandas.DataFrame` containing all necessary columns (which are prone to change).
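
As a minimal sketch of the split described above, assuming a hypothetical sample table with `ref_id` and `epsg` columns (the sample values and the helper `split_jobs` are illustrative and not part of `worldcereal`), one job row per unique combination could be derived like this:

```python
import pandas as pd

# Hypothetical sample table: one row per labelled sample, with the
# collection it belongs to (ref_id) and the EPSG code of its UTM zone.
samples = pd.DataFrame(
    {
        "sample_id": ["a", "b", "c", "d"],
        "ref_id": ["coll_1", "coll_1", "coll_1", "coll_2"],
        "epsg": [32736, 32736, 32737, 32637],
    }
)


def split_jobs(samples: pd.DataFrame) -> pd.DataFrame:
    """Return one row per unique (ref_id, epsg) pair with its sample count."""
    return (
        samples.groupby(["ref_id", "epsg"])
        .size()
        .reset_index(name="n_samples")
    )


job_rows = split_jobs(samples)
print(job_rows)
```

Each resulting row would then be completed with the remaining columns (backend, dates, `geometry_url`, and so on) before being submitted as a job.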

The `geometry_url` column is obtained as follows:

- For each row in the dataframe, run `get_sample_points_from_rdm` from the `worldcereal.extract.patch_to_point_worldcereal` module
- Upload the resulting geodataframe to object storage or Artifactory (choose something stable)
- The resulting URL is added to the dataframe under the `geometry_url` column
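
The steps above can be sketched as follows. This is only an outline under stated assumptions: `add_geometry_urls` and `fake_upload` are hypothetical helpers, and the placeholder builds a made-up URL instead of extracting points and uploading them to real storage.

```python
import pandas as pd


def add_geometry_urls(job_df: pd.DataFrame, upload_fn) -> pd.DataFrame:
    """Fill `geometry_url` by calling `upload_fn(row)` for every job row.

    `upload_fn` is a hypothetical callable that would extract the sample
    points for the row, upload them, and return the resulting URL.
    """
    out = job_df.copy()
    out["geometry_url"] = [upload_fn(row) for _, row in out.iterrows()]
    return out


def fake_upload(row) -> str:
    # Placeholder for illustration: no extraction or upload happens here.
    return f"https://example.invalid/{row['ref_id']}_{row['epsg']}.parquet"


demo_jobs = pd.DataFrame(
    {"ref_id": ["coll_1", "coll_1"], "epsg": [32736, 32737]}
)
demo_jobs = add_geometry_urls(demo_jobs, fake_upload)
print(demo_jobs)
```

Passing the upload step in as a callable keeps the storage choice (object storage, Artifactory, or something else) separate from the dataframe logic.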

In [2]:
job_df = pd.read_parquet('job_df.parquet')
job_df

| | backend | start_date | end_date | epsg | ref_id | geometry_url | ground_truth_file |
|---|---|---|---|---|---|---|---|
| 0 | Terrascope | 2020-10-01 | 2022-03-31 | 32736 | 2021_KEN_COPERNICUS-GEOGLAM-LR_POINT_111 | | /vitodata/worldcereal/data/RDM/2021_KEN_COPERN... |
| 1 | Terrascope | 2020-10-01 | 2022-03-31 | 32737 | 2021_KEN_COPERNICUS-GEOGLAM-LR_POINT_111 | https://s3.prod.warsaw.openeo.dataspace.copern... | /vitodata/worldcereal/data/RDM/2021_KEN_COPERN... |
| 2 | Terrascope | 2020-10-01 | 2022-03-31 | 32637 | 2021_KEN_COPERNICUS-GEOGLAM-LR_POINT_111 | https://s3.prod.warsaw.openeo.dataspace.copern... | /vitodata/worldcereal/data/RDM/2021_KEN_COPERN... |


#### Create job patch to point

Here we create the openEO process graph that will be sent to the backend.

In [4]:
row = job_df.iloc[-1]

# IPv6 appears to be disabled on this network, so tell urllib3 not to use it either.
requests.packages.urllib3.util.connection.HAS_IPV6 = False
connection = openeo.connect('openeo.vito.be').authenticate_oidc()

Authenticated using refresh token.


In [5]:
job = create_job_patch_to_point_worldcereal(
    row=row,
    connection=connection,
    provider=None,
    connection_provider=None,
    executor_memory='2g',
    python_memory='2g',
)
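
The two memory arguments are plain strings such as `'2g'`. A minimal sketch of how such values might be validated and collected into a job-options dictionary is shown below. The helper `build_job_options`, the accepted string pattern, and the key names `executor-memory` and `python-memory` are assumptions for illustration; the notebook does not show how `create_job_patch_to_point_worldcereal` handles these arguments internally.

```python
import re

# Assumed format for memory strings: digits followed by "m" or "g".
_MEMORY_PATTERN = re.compile(r"^\d+[mg]$", re.IGNORECASE)


def build_job_options(executor_memory: str = "2g", python_memory: str = "2g") -> dict:
    """Validate the memory strings and return them as a job-options dict."""
    for name, value in (
        ("executor_memory", executor_memory),
        ("python_memory", python_memory),
    ):
        if not _MEMORY_PATTERN.match(value):
            raise ValueError(f"{name} must look like '2g' or '512m', got {value!r}")
    # Key names are an assumption about the backend's job options.
    return {"executor-memory": executor_memory, "python-memory": python_memory}


options = build_job_options("2g", "2g")
print(options)
```

Validating the strings before submission surfaces a typo locally rather than after the job has waited in the queue.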

In [6]:
job.start_and_wait()

0:00:00 Job 'j-25042214475940a5b0854808799781af': send 'start'


0:00:19 Job 'j-25042214475940a5b0854808799781af': created (progress 0%)
0:00:24 Job 'j-25042214475940a5b0854808799781af': queued (progress 0%)
0:00:30 Job 'j-25042214475940a5b0854808799781af': queued (progress 0%)
0:00:38 Job 'j-25042214475940a5b0854808799781af': queued (progress 0%)
0:00:48 Job 'j-25042214475940a5b0854808799781af': queued (progress 0%)
0:01:01 Job 'j-25042214475940a5b0854808799781af': queued (progress 0%)
0:01:16 Job 'j-25042214475940a5b0854808799781af': queued (progress 0%)
0:01:35 Job 'j-25042214475940a5b0854808799781af': running (progress 11.4%)
0:01:59 Job 'j-25042214475940a5b0854808799781af': running (progress 14.4%)
0:02:29 Job 'j-25042214475940a5b0854808799781af': running (progress 17.9%)
0:03:06 Job 'j-25042214475940a5b0854808799781af': running (progress 21.9%)
0:03:53 Job 'j-25042214475940a5b0854808799781af': finished (progress 100%)
