# Scenario 3: Choosing the Right Pesticide: Data-Driven Crop Protection
This notebook is complementary material for the walkthrough scenario **"Choosing the Right Pesticide: Data-Driven Crop Protection"** used with the STELAR KLMS.
It is not intended to be run as a standalone notebook. It **requires access to a deployment of STELAR KLMS** and an **account** on the respective instance. 

Some of the instances used during the evaluation period of the STELAR Project are:

Internal Pilot Instance: https://klms.stelar.gr

Public Sandbox Instance: https://sandbox.stelar.gr


*If you don't have an account on the STELAR KLMS, you can create one on the respective instance. 
Kindly note that the internal pilot instance is only accessible to STELAR project members, while the public sandbox instance is open to everyone by registration.*

---
# Overview

This notebook runs the Agri Products Match tool, which matches agricultural products against the user's pesticide data. The results help users make informed decisions about pesticide use for the specific crops they grow.

### Prerequisites

- Fill in your account credentials in the code cell below.
- Select datasets according to the walkthrough directions.
- Ensure you have a modern Python version installed (3.9 or later).
- Install the STELAR Python SDK and any other required libraries (`pip install stelar_client --upgrade`).
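
The Python and SDK prerequisites above can be checked before running the rest of the notebook. The following is a minimal sketch: the helper names `python_is_supported` and `sdk_is_installed` are hypothetical, and the module name `stelar` is taken from the import statement used later in this notebook.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 9)  # minimum version named in the prerequisites


def python_is_supported(version=None, minimum=MIN_PYTHON):
    """Return True when the interpreter version meets the minimum."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= tuple(minimum)


def sdk_is_installed(module_name="stelar"):
    """Return True when the named top-level module can be located."""
    return importlib.util.find_spec(module_name) is not None


if not python_is_supported():
    found = sys.version.split()[0]
    print(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required; found {found}")
if not sdk_is_installed():
    print("STELAR SDK not found; run: pip install stelar_client --upgrade")
```

If neither message is printed, both prerequisites are satisfied.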

### Instantiate a STELAR Client object
**Modify credentials and base URL as needed.**

In [None]:
from stelar.client import Client, Dataset, TaskSpec, Process
from datetime import datetime

# Base URL
# Sandbox: https://sandbox.stelar.gr
# Internal Pilots: https://klms.stelar.gr

BASE_URL = "https://sandbox.stelar.gr"
USERNAME = "your_username"  # Replace with your username
PASSWORD = "your_password"  # Replace with your password

c = Client(base_url=BASE_URL, username=USERNAME, password=PASSWORD)
print(f"Connected to STELAR KLMS @ {c._base_url} as {c._username}")
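
Hard-coded credentials are easy to leak when a notebook is shared. As a sketch of one alternative, the helper below reads them from environment variables and prompts only for what is missing. The function name `load_credentials` and the variable names `STELAR_BASE_URL`, `STELAR_USERNAME` and `STELAR_PASSWORD` are assumptions made for this example; they are not names the SDK is known to read itself.

```python
import os
from getpass import getpass


def load_credentials(env=None, prompt=getpass):
    """Return (base_url, username, password) without hard-coding secrets.

    The STELAR_* variable names are hypothetical, chosen for this sketch.
    """
    env = os.environ if env is None else env
    # Default to the public sandbox instance listed at the top of the notebook
    base_url = env.get("STELAR_BASE_URL", "https://sandbox.stelar.gr")
    username = env.get("STELAR_USERNAME") or input("STELAR username: ")
    # Prompt (without echo) only when the password is not already provided
    password = env.get("STELAR_PASSWORD") or prompt("STELAR password: ")
    return base_url, username, password
```

The returned values can replace the constants above, for example `BASE_URL, USERNAME, PASSWORD = load_credentials()`, before constructing the `Client`.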

### Select the dataset with missing weather data and weather station coordinates

In [None]:
weather_stations_dataset = c.datasets["lombardy-weather-data"]
print(f"Selected Dataset: {weather_stations_dataset.id} | {weather_stations_dataset.title}")
print(f"Browse the dataset at: {c._base_url}/console/v1/catalog/{weather_stations_dataset.id}")

### Create/Select a Workflow Process to run the task

In [None]:
ORGANIZATION = "stelar-klms"

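try:
    # Keep the returned process: the task is submitted through `proc` later on.
    # Assumes `create` returns the new process, as `c.datasets.create` does for datasets.
    proc = c.processes.create(**{
        "title": "Evaluation Workflow for " + c._username,
        "name": "evaluation-workflow-" + c._username,
        "organization": c.organizations[ORGANIZATION]
    })
    print(f"Created new process for evaluation: {proc.id} | {proc.title}")
except Exception as e:
    # Creation fails when the process already exists, so fall back to it
    proc = c.processes["evaluation-workflow-" + c._username]
    print(f"Using existing process for evaluation: {proc.id} | {proc.title}")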
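
The create-or-reuse pattern in this cell recurs for the results dataset. A minimal sketch of a shared helper follows; `get_or_create` is a hypothetical name, and the sketch assumes only what this notebook already relies on: a collection with `create(**fields)` and lookup by name. Unlike a bare fallback, it re-raises the original creation error when the lookup also fails, so problems such as a wrong organization or missing permissions are not hidden.

```python
def get_or_create(collection, name, **fields):
    """Create `name` in `collection`, or return the existing entry.

    `collection` is any object offering `create(**fields)` and lookup by
    name, which is how `c.processes` and `c.datasets` are used here.
    """
    try:
        return collection.create(name=name, **fields)
    except Exception as create_error:
        try:
            # Creation usually fails because the entry already exists
            return collection[name]
        except Exception:
            # Nothing to fall back to: surface the original failure
            raise create_error from None
```

Assuming `create` returns the created object, a call could look like `proc = get_or_create(c.processes, "evaluation-workflow-" + c._username, title="Evaluation Workflow for " + c._username, organization=c.organizations[ORGANIZATION])`.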

### Create a dataset to store the results of the matching task

In [None]:
ORGANIZATION = "stelar-klms"

try:
    res_dset = c.datasets.create(**{
        "title": "Pesticides Matches for " + c._username,
        "name": "pesticides-matches-" + c._username,
        "organization": c.organizations[ORGANIZATION],
        "notes": "Pesticides Matches curated by " + c._username,
    })
    print(f"Created new dataset for matching pesticide products: {res_dset.id} | {res_dset.title}")
except Exception as e:
    res_dset = c.datasets["pesticides-matches-" + c._username]
    print(f"Using existing dataset for matching pesticides: {res_dset.id} | {res_dset.title}")

### Prepare the Missing Data Interpolation task

In [None]:
# Start building the TaskSpec for Missing Data Interpolation
t = TaskSpec(tool="agri-products-match", name="MDI Weather Data for "+c._username)

# Define the local dataset aliases
t.d(alias='d0', dset=res_dset)

# Define the inputs
t.i(meteo_file=str(weather_stations_dataset.resources[0].id),
    coords_file=str(weather_stations_dataset.resources[1].id))

# Set the outputs
timestamp = datetime.now().strftime("%Y%m%d%H%M%S")
t.o(interpolated_file={
    "url": f"s3://klms-bucket/experiments/evaluation/proc-{proc.id}/interpolated_meteo_{timestamp}_{c._username}.xlsx",
    "dataset": "d0",
    "resource": {"name": "Interpolated Meteo Station Data", "relation": "interpolated"}
})

# Run the task using the workflow process created earlier
predictions_task = proc.run(t)
print(f"Task {predictions_task.id} is running. Check the status at: {c._base_url}/console/v1/task/{str(proc.id)}/{str(predictions_task.id)}")
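
The output URL in the cell above combines the process ID, a timestamp and the username so that repeated runs do not overwrite each other. A small sketch of that naming scheme as a standalone function is shown below; `build_output_url` and its parameters are hypothetical names, while the bucket, prefix and file name pattern are copied from the cell above.

```python
from datetime import datetime


def build_output_url(proc_id, username, when=None,
                     bucket="klms-bucket", prefix="experiments/evaluation"):
    """Return the S3 URL for one run's output file, unique per timestamp."""
    when = datetime.now() if when is None else when
    timestamp = when.strftime("%Y%m%d%H%M%S")
    return (f"s3://{bucket}/{prefix}/proc-{proc_id}/"
            f"interpolated_meteo_{timestamp}_{username}.xlsx")


print(build_output_url("1234", "alice", datetime(2025, 1, 31, 9, 5, 0)))
# s3://klms-bucket/experiments/evaluation/proc-1234/interpolated_meteo_20250131090500_alice.xlsx
```

Passing an explicit `when` makes the path reproducible, which helps when the same URL is needed again to locate the output.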