### Installing openEO client libraries

To use openEO, you need to make sure that the openEO client libraries are installed. The library is available on pypi, so it can be installed with pip: 
https://pypi.org/project/openeo/

We recommend using at least Python 3.6 so in the notebook environment, the install command is:

In [1]:
!pip3.6 install --user openeo

Looking in indexes: https://artifactory.vgt.vito.be/api/pypi/python-packages/simple
You should consider upgrading via the 'pip install --upgrade pip' command.[0m


In [2]:
## loading of the required functions 

import openeo
import json
import os
import geopandas as gpd
import pandas as pd

### Create openeo session

In [3]:
endpoint = os.environ.get("ENDPOINT", "https://openeo.vito.be")
openeo_url = "{e}/openeo/{v}".format(e=endpoint, v="1.0.0") 
openeo_session = openeo.connect(openeo_url).authenticate_oidc(provider_id="egi") # The first time you will try to connect to openeo you will need to link your Terrascope account to EGI credentials. Follow the procedure list below to allow this. 
#if you want more info, please consult: https://docs.terrascope.be/#/Developers/WebServices/OpenEO/OpenEO?id=logging-in 

OIDC token response did not contain refresh token.


Authenticated using refresh token.


### Description on about the harvest detection service

### Harvest detector summary
The **Harvest detector** service is used to automatically predict harvest dates from a set of input geometries. The harvest date prediction gives per field the amount of harvest events within the requested time period. 


### Methodology
A trained neural network model is used to automatically predict harvest dates based on input timeseries of Sentinel data. Both timeseries generated from **Sentinel-1 and Sentinel-2** are used to ingest in the model, more specifically the **VH/VV ratio**, the **daily fAPAR (from CropSAR)**  and a **Sentinel-2 metric sensitive to crop residue(NBR2)** are used. The model is only trained on detecting the moment the **above ground biomass** is removed from the field. 

### Quality
The model was trained on field data from Belgium, Italy & Greece with a focus on potato, maize and flax and cover crops. Overall, the harvest date could be detected within 10 days accurate based on validation of some reference fields, but for cover crops the uncertainty may be somewhat larger.
However, it can be expected that for certain crop types and areas with different field management around harvest the performance might be less accurate.
Furthermore, if the harvest of a certain crop does not match with the removal of all above ground biomass, the harvest detector will detect the actual removal of all biomass. This is for example the case for cotton.

### Disclaimer
The time range parameter should span more than 2 months and preferably centered around the expected harvest date otherwise the performance will drop.
Requesting harvest date prediction in near-real time modus is not yet supported.
The users must **inwards buffer** their geometries beforehand to reduce the risk of including to match noise into the timeseries.
The buffered geometries should be larger than 20x20 m.

### Links

For a more detailed description on the fundaments of the approach, please consult:

[Link]https://blog.vito.be/remotesensing/what-happens-on-the-fields-monitoring-the-crop-calendars"

### Version 
Version from October 2022 based on the work for E-shape by the addition of cover crops in the model training by making use of the in-situ data collected by the CropObserve app.

### Required parameters

* **date**: The period for which satellite data should be retrieved and used to estimate harvest date(s)
  * e.g. "2017-01-01","2019-12-31".  
* **polygon**: Featurecollection of polygons for which harvest should be estimated

### Output

The output of the service is a JSON format with all the harvest date(s) that are detected for each field.In case no harvest event could be detected, it is set to 'None'. Each field will get a unique label in the outut JSON, based on the order the input fields are read when opening the GEOJSON file. For example, the harvest date of the first read field can be found in the JSON under the key 'Field_0' as Python starts counting by zero. For the next field it can be found under key 'Field_1' etc. (see example further below.

In [4]:
# Define input parameters
 
time_range = "2021-10-31", "2022-05-18"

# now we only need to still specify the polygons for which the predictions should be made
# please first upload the geometry files to the Private/Public folder in the notebook environment!

ID_identifier = 'OBJECTID' # the column name of the input geometries files that contains the unique names of the fields. 
base_public_folder = os.path.join('/data', 'users', 'Public', os.environ['USER']) # use the public folder to store the input geometries so that they can be loaded from there
output_folder = 'Harvest_output_fields_demo' # name of the subfolder in the working directory that will be used to store the results
os.makedirs(os.path.join(os.getcwd(), output_folder), exist_ok = True)


file_geom_path = r'Public/e_shape/Notebooks/test_fields/test_fields.shp' #Copy the file path from the available shapefile(s) on the left side of the screen. Do this by right clicking on the file and click on 'Copy Path'
file_geom_path = file_geom_path.replace('Public/','') # remove redundant Public name in directory
full_geom_path = os.path.join(base_public_folder, file_geom_path)

# now that we have the path, the shapefile can be loaded and coverted to the desired format to send to openEO
gpd_shp = gpd.read_file(full_geom_path)
field_ids = gpd_shp[ID_identifier].to_list()
polygons = json.loads(gpd_shp.to_json()) # Mandatory parameter

Harvest_process = openeo_session.datacube_from_process("Harvest_detector", namespace="https://openeo.vito.be/openeo/1.0/processes/u:bontek/Harvest_detector"
                                                         , date=time_range ,polygon = polygons) #create the process graph for the service


#print(Harvest_process.graph)


# Obtain result

### option 1 -> When executing for many fields over a long time period (do asynchronous call)
Harvest_result = Harvest_process.send_job().start_and_wait().get_result().load_json() # # once the job is launched it will take a while before you will get the result so wait until in the screen  below the message 'finished' pops-up. First you will see 'queued' -> 'running' -> 'finished'
    # in case the message is 'error' -> Please try again to run the job once again, if still fails please reach out.  

### option 2 -> When executing for only few fields and for a short time period, a synchronous call can be done which is quite fast

#Harvest_result = Harvest_process.execute()

  complain("No cube:dimensions metadata")


0:00:00 Job 'j-8e428dc60f1b4ada988e801e81015453': send 'start'
0:00:31 Job 'j-8e428dc60f1b4ada988e801e81015453': queued (progress N/A)
0:00:36 Job 'j-8e428dc60f1b4ada988e801e81015453': queued (progress N/A)
0:00:43 Job 'j-8e428dc60f1b4ada988e801e81015453': queued (progress N/A)
0:00:51 Job 'j-8e428dc60f1b4ada988e801e81015453': queued (progress N/A)
0:01:01 Job 'j-8e428dc60f1b4ada988e801e81015453': queued (progress N/A)
0:01:14 Job 'j-8e428dc60f1b4ada988e801e81015453': queued (progress N/A)
0:01:30 Job 'j-8e428dc60f1b4ada988e801e81015453': running (progress N/A)
0:01:49 Job 'j-8e428dc60f1b4ada988e801e81015453': running (progress N/A)
0:02:14 Job 'j-8e428dc60f1b4ada988e801e81015453': running (progress N/A)
0:02:44 Job 'j-8e428dc60f1b4ada988e801e81015453': running (progress N/A)
0:03:22 Job 'j-8e428dc60f1b4ada988e801e81015453': running (progress N/A)
0:04:09 Job 'j-8e428dc60f1b4ada988e801e81015453': running (progress N/A)
0:05:07 Job 'j-8e428dc60f1b4ada988e801e81015453': running (progress



### Store the output result

Below the link between the predicted harvest date and the field ids will be made again. 
The index of dataframe contains the ID of the field.

In [5]:
Harvest_dates = []
for field in Harvest_result.keys():
    harvest_field = Harvest_result.get(field)
    if not harvest_field:
        harvest_field = [None]
    Harvest_dates.append(harvest_field)
    
df_harv_result = pd.DataFrame(data = Harvest_dates, index = field_ids, columns = ['Harvest_date'])     
df_harv_result.to_csv('Harv_result.csv', index = True) # This stores the CSV file in the same folder as the notebook is stored
print(df_harv_result)

        Harvest_date
318575  [2022-03-30]
330692            []
76880   [2022-02-28]
162529  [2022-04-17]
227385  [2022-03-06]
328898  [2022-04-11]
433449  [2022-03-06]
436653  [2022-04-17]
472259            []
477289  [2022-03-18]
575251  [2022-01-23]
560832  [2022-02-04]
