# 🍏 Welcome to the **Smart Droplets Hackathon**!

![image](assets/sd_logo.png)

Smart Droplets is an EU-funded project that aims to reduce pesticide and fertilizer use through Digital Twins and Reinforcement Learning. One of its pilot projects focuses on controlling **apple scab**.
In commercial apple production, **apple scab (_Venturia inaequalis_)** is the most economically important disease. Growers traditionally rely on heuristics such as **calendar-based fungicide programs**, which can lead to unnecessary sprays, fungicide resistance, and environmental impact.
In this hackathon we’ll flip that paradigm: **you will train a reinforcement-learning (RL) agent, or an intelligent conditional agent, to decide _when_ (and _how much_) to spray, balancing disease risk with sustainability.**

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/WUR-AI/A-scab/blob/hackathon/hackathon/general_hackathon_notebook.ipynb)

---

## 🧩 The Challenge

**Goal**: **Learn an optimal and adaptive spraying policy.**

**How to achieve that?** Use **RL** to interact with the **A-scab** disease simulator. Your agent chooses a daily action (how much to spray, or not to spray at all) throughout an entire season to reduce the risk of an outbreak.

**Why is this important?** Smarter timing reduces chemical use, lowers costs, and lowers environmental impact, while keeping orchards healthy and limiting yield loss.

Success is measured by a **cumulative reward**:

$R_t = - Risk_{t} - \beta P_t$

* $Risk_t$ is the **infection risk**: cumulative infection severity at harvest
* $\beta$ is the **trade-off coefficient**: describes the economic and ecological price of pesticides
* $P_t$ is the **pesticide amount**: the amount of pesticide sprayed

A higher score is better -- closer to zero!
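
To make the formula concrete, here is a minimal sketch that scores a short, made-up sequence of days. The value of `beta`, the daily risk values, and the doses are illustrative assumptions, not numbers taken from the simulator.

```python
def step_reward(risk: float, pesticide: float, beta: float) -> float:
    """Per-step reward R_t = -Risk_t - beta * P_t."""
    return -risk - beta * pesticide


# Hypothetical three-day episode: (risk on that day, pesticide sprayed that day)
days = [(0.00, 0.0), (0.02, 0.4), (0.01, 0.0)]
beta = 0.025  # assumed trade-off coefficient, for illustration only

# The score is the sum of the per-step rewards over the episode
cumulative_reward = sum(step_reward(risk, dose, beta) for risk, dose in days)
print(round(cumulative_reward, 4))  # -0.04
```

Both terms are penalties, so the best achievable score is zero: no risk and no pesticide.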

---

## 🌱 The Environment: **Ascabgym**

Ascabgym is a stochastic, weather-driven simulation of apple scab dynamics, adapted from the A-scab model.
At each daily time step, an agent selects a pesticide dosage based on observations related to fungal state, weather conditions, and host susceptibility.
We provide a **Gymnasium-style API**, enabling easy integration with any reinforcement learning library (e.g., Stable-Baselines3, RLlib, CleanRL).
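
If you have not worked with a Gymnasium-style API before, the interaction loop looks roughly like the sketch below. `ToyOrchardEnv` is a made-up stand-in (not the real environment) that only mimics the `reset()` / `step()` signatures, so the snippet runs without installing anything; its reward is a toy penalty, not the real reward function.

```python
import random


class ToyOrchardEnv:
    """Hypothetical stand-in that mimics the Gymnasium reset/step signatures."""

    def __init__(self, season_length: int = 5):
        self.season_length = season_length
        self.day = 0

    def reset(self, seed=None):
        self.day = 0
        return {"day": self.day}, {}  # (observation, info)

    def step(self, action: float):
        self.day += 1
        reward = -0.1 * action  # toy penalty for spraying
        terminated = self.day >= self.season_length  # season is over
        truncated = False
        return {"day": self.day}, reward, terminated, truncated, {}


env = ToyOrchardEnv()
rng = random.Random(0)

observation, info = env.reset(seed=0)
episode_return, done = 0.0, False
while not done:
    action = rng.choice([0.0, 1.0])  # random policy: spray or don't spray
    observation, reward, terminated, truncated, info = env.step(action)
    episode_return += reward
    done = terminated or truncated

print(f"Episode finished after {env.day} days, return = {episode_return:.2f}")
```

The same loop works with the real environments once they are created with `gym.make(...)`; only the policy that picks `action` changes.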

---

## 🏆 Evaluation & Leaderboards

1. **Local validation** – run episodes on the provided weather years (fast iteration).
2. **Public leaderboard** – on submission, Kaggle simulates **eight hidden seasons** from a particular location and reveals your average score.

> **Tip:** Aim for generalisation, not leaderboard-hacking!

---


## 🚀 First things first

Install all the necessary packages. Feel free to install the RL framework you feel most comfortable working with. By default, you can use Stable Baselines 3. Make sure to install the hackathon version of the A-scab repository.

In [None]:
import os
import subprocess
import sys

# Clone the repository
!rm -rf A-scab/
!git clone -b hackathon https://github.com/WUR-AI/A-scab.git

# Change directory
%cd A-scab

# Get the absolute path of the project root *after* changing directory
PROJECT_ROOT = os.getcwd()

# Install poetry if needed.
!pip install -qqq poetry

# Configure poetry to create virtual environments in the project directory
!poetry config virtualenvs.in-project true

# Install project dependencies
# This step may take some time (e.g., 5 minutes)
!poetry install --quiet --all-extras

# These are things you don't need to know about for now :)
# 1. Add the project root to sys.path
if PROJECT_ROOT not in sys.path:
    sys.path.insert(0, PROJECT_ROOT)
    print(f"Added project root {PROJECT_ROOT} to sys.path")

# 2. Get the path to the site-packages directory of the poetry virtual environment
try:
    venv_path_output = subprocess.check_output(['poetry', 'env', 'info', '--path']).decode('utf-8').strip()
    python_version_major_minor = f"python{sys.version_info.major}.{sys.version_info.minor}"
    SITE_PACKAGES_PATH = os.path.join(venv_path_output, 'lib', python_version_major_minor, 'site-packages')
except Exception as e:
    print(f"Could not determine poetry venv site-packages path using 'poetry env info': {e}")
    print("Attempting a common fallback path structure...")
    SITE_PACKAGES_PATH = os.path.abspath(os.path.join('.venv', 'lib', f'python{sys.version_info.major}.{sys.version_info.minor}', 'site-packages'))

# 3. Add the site-packages directory to sys.path
if os.path.exists(SITE_PACKAGES_PATH) and SITE_PACKAGES_PATH not in sys.path:
    sys.path.insert(0, SITE_PACKAGES_PATH)
    print(f"Added {SITE_PACKAGES_PATH} to sys.path")
else:
    print(f"Warning: Could not find site-packages at {SITE_PACKAGES_PATH} or it's already in sys.path.")

In [None]:
# Install further packages with poetry to keep the package dependencies intact.
# In general the syntax is `poetry add PACKAGE_NAME`, followed by `poetry install`.
# Below is an example of how to install [RLlib](https://docs.ray.io/en/latest/rllib/rllib-algorithms.html).

# Note: you don't have to install it now for this notebook!
!poetry add "ray[rllib]"
!poetry install

Initialize the Gymnasium environment by importing the necessary packages. You must do `import ascab` in this hackathon!

This may take 1 minute or so :)

In [None]:
import os
import ascab
import gymnasium as gym

For this hackathon, we have provided pre-registered environments, so you only need the environment ID to create one. For example, the code below will construct the AscabGym training environment!

In [None]:
ascab_train = gym.make('AscabTrainEnv-Discrete')

The code below will be your validation gym environment, i.e., the place where you test your trained agents! (Note the different environment id)

In [None]:
ascab_val = gym.make('AscabValEnv-Discrete')

Additionally, if you don't plan on training an RL agent, please use the environment below:

In [None]:
ascab_val_nonrl = gym.make('AscabValEnv-Continuous-NonRL')

# Let's get to know the Ascab environment! (_Gym speed dating?_)

We're gonna introduce you to the _action space_, the _observation space_ and the _goal_ of the A-scab gym environment!

### Action space, A.K.A. what your agent can do!

With the code below, you can check what the agent can do at each timestep!

In [None]:
print(ascab_train.action_space)

Notice that it is a discrete action space of 6? The six actions correspond to the following amounts: $A = \{0.2\,i \mid i = 0, 1, \dots, 5\}$.

You can also use a continuous action space, with actions in the range [0, 1]! That allows the agent to make more fine-grained decisions when spraying.

_hint: initialize the environment with `gym.make('AscabTrainEnv-Continuous')`_

Let's check what the agent can do by sampling an action:


In [None]:
# Try running this cell a few times: you should see different things the agent can do in the AscabGym!
print(ascab_train.action_space.sample())

See different numbers pop up? What do they mean? Remember the formula above! 1 means the agent sprays an amount of 0.2 on that day, 2 means 0.4, and so on. 0 just means the agent did not spray at all.
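
The mapping from action index to sprayed amount can be sketched as a small helper. This assumes the six actions are the indices 0 to 5 (as in a Gymnasium `Discrete(6)` space) and that the largest action equals the upper bound of the continuous space; `action_to_dose` is a hypothetical name, not a function from the A-scab package.

```python
N_ACTIONS = 6   # size of the discrete action space
MAX_DOSE = 1.0  # assumed to match the upper bound of the continuous space


def action_to_dose(action: int) -> float:
    """Map a discrete action index to a pesticide amount in steps of 0.2."""
    if not 0 <= action < N_ACTIONS:
        raise ValueError(f"action must be in [0, {N_ACTIONS - 1}], got {action}")
    return MAX_DOSE * action / (N_ACTIONS - 1)


for action in range(N_ACTIONS):
    print(action, action_to_dose(action))
```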

### Observation space, A.K.A what your agent can see!

In general, the agent can see the following features:

#### *Fungus (n = 2)*

| Feature           | Description                                                      | Units |
|-------------------|------------------------------------------------------------------|-------|
| InfectionWindow   | An indicator of whether the simulation is in the risk period     | -     |
| SRA_discharge     | Portion of ascospores becoming airborne during a discharge event | -     |

#### *RL Agent (n = 4)*

| Feature          | Description                                           | Units |
|------------------|-------------------------------------------------------|-------|
| AppliedPesticide | Total amount of applied pesticide                     | -     |
| ActionHistory    | Total number of spraying events in the growing season | -     |
| SinceLastAction  | Number of days since last spraying event              | -     |
| β (beta)         | Trade-off coefficient[^a]                             | -     |

#### *Host (n = 1)*

| Feature         | Description                            | Units |
|------------------|----------------------------------------|-------|
| LAI              | Leaf Area Index                        | -     |

#### *Weather (n = 15)*[^b]

| Feature         | Description                                                                 | Units   |
|------------------|-----------------------------------------------------------------------------|---------|
| LWD              | Leaf wetness duration                                                       | hours   |
| Precip           | Amount of precipitation                                                     | mm      |
| Temp             | Average temperature                                                         | °C      |
| HasRainEvent     | Presence of significant rainfall (≥ 0.2 mm/h)                               | -       |
| HighHumDur       | Hours of high humidity (≥ 85%)                                              | hours   |

---

[^a]: Used in the reward function.
[^b]: Weather variables are computed for the current day and the two-day forecast (5 variables × 3 days), aggregated over 24 hours.

You can also check this through code:


In [None]:
print(ascab_train.observation_space)

A bit confusing maybe, but each dictionary key represents the features, or what your RL agent can "see". It directly maps from the table above!
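
If you bring your own RL framework, you may need the dictionary observation as one flat vector. Below is a minimal, dependency-free sketch of that idea. The example observation is hypothetical: it borrows a few feature names from the tables above, and the values are made up.

```python
def flatten_observation(observation: dict) -> list[float]:
    """Concatenate the dict entries into one flat list, in sorted key order."""
    flat = []
    for key in sorted(observation):  # fixed order, so positions stay stable
        value = observation[key]
        if isinstance(value, (list, tuple)):
            flat.extend(float(v) for v in value)
        else:
            flat.append(float(value))
    return flat


# Hypothetical observation, for illustration only
example_obs = {"LAI": 0.8, "SinceLastAction": 3, "InfectionWindow": 1, "Temp": [12.5]}
print(flatten_observation(example_obs))  # [1.0, 0.8, 3.0, 12.5]
```

Real observations are likely NumPy arrays rather than plain lists, so treat this as a sketch. Gymnasium also ships `gymnasium.spaces.flatten` for this purpose, and Stable Baselines 3 can consume dictionary observations directly through its `MultiInputPolicy`.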

### Cool! But what's the goal here? _How do I $\mathcal{W}\mathcal{I}\mathcal{N}$_? 🧐

Oh OK, eager are we? To win, you have to minimize risk by spraying pesticide! But also, don't spray too much. The environment will not like that.

You're going to train (or create) a decision-making agent that optimally sprays pesticide to minimize risk! Check again the reward formula from above:

$R_t = - Risk_{t} - \beta P_t$

* $Risk_t$ is the **infection risk**: cumulative infection severity at harvest
* $\beta$ is the **trade-off coefficient**: describes the economic and ecological price of pesticides
* $P_t$ is the **pesticide amount**: the amount of pesticide sprayed

Ultimately, you want to minimize cumulative risk per year!

Let's get hands-on to see how the reward works:

In [None]:
# reset the environment
_, _ = ascab_train.reset()

# Now let's try spraying a bit more than half of the maximum amount (action index 3 out of 0-5)
action = 3

_, reward, _, _, info = ascab_train.step(action)

print(f"I sprayed {action} and got a reward of {reward:.03} :(")

Did you see a negative reward signal from spraying? Too bad... But we actually need to spray to minimize risk of infections. But not too much! Your task is then to find this balance. Feel free to change the value of action and see how it changes the reward you get.

Try running the next code block to see how risk affects reward:

In [None]:
# reset the environment
_, _ = ascab_train.reset()

# let's just not do anything for now and see what happens in the environment
action = 0

_, reward, terminated, truncated, info = ascab_train.step(action)
# let's loop until we get some risk going on (or until the season ends):
while info['Risk'][-1] < 0.01 and not (terminated or truncated):
    _, reward, terminated, truncated, info = ascab_train.step(action)

print(f"I did not spray and now I see some apple scab on the leaves!\n"
      f"I now have a risk of {info['Risk'][-1]:.03} and got a reward of {reward:.03} :(")

Uh oh, with high risk there is a chance of yield loss, and nobody wants that! Now that you understand how the reward function works, let's build an agent that can learn when to optimally spray!

# Example of training your own agent

Here we show you how to train your own agent! You can either:

1. Train an RL agent with your own framework, or
2. Create your own intelligent conditional agent!

Scroll below for further instructions.

### Training with Stable Baselines 3

Below we provide an example of training your RL-based pesticide expert with the DQN algorithm, provided by the Stable Baselines 3 library. You can do so with our `RLAgent` class! Alternatively, you are free to create your own class by subclassing `BaseAgent`. Confused? Feel free to ask us!

In [None]:
from ascab.train import RLAgent
from stable_baselines3 import DQN

# Create the log directory if it does not exist yet
log_dir = os.path.join(os.getcwd(), 'log')
os.makedirs(log_dir, exist_ok=True)

# Let's try using these hyperparameters
hyperparameters = {"learning_rate": 0.0001, 'gamma': 1}
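
A quick note on `gamma`, the discount factor: with `gamma = 1` the agent weighs a penalty late in the season exactly as much as one today, which fits a score that is summed over the whole season. The sketch below compares that with a discounted return. The 100-day reward sequence is a made-up example, not simulator output.

```python
def discounted_return(rewards, gamma):
    """Return sum_t gamma**t * rewards[t], computed backwards for stability."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical season: no reward for 99 days, then a single penalty of -1
rewards = [0.0] * 99 + [-1.0]

print(discounted_return(rewards, 1.0))   # -1.0: the late penalty counts in full
print(discounted_return(rewards, 0.99))  # much closer to zero: the penalty is discounted away
```

With a discount below 1, an agent may learn to ignore infections that only show up weeks after a skipped spray, so it is worth checking how your framework's default `gamma` interacts with the season length.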


Here, we first start TensorBoard, so you can track the training progress of your RL agents!

In [None]:
# Start tensorboard.
# .... keep on hitting the "refresh" icon (the circle-with-arrow) during training

%load_ext tensorboard
# or %reload_ext tensorboard if you loaded it already


In [None]:
# change this dot if you want to point to a specific folder!
%tensorboard --logdir .

In [None]:
# now train!
rl_agent = RLAgent(
    ascab_train=ascab_train,  # train in the training environment
    ascab_test=ascab_val,
    observation_filter=list(ascab_train.observation_space.keys()),
    render=False,
    path_model=os.path.join(os.getcwd(), 'rl_agent'),
    path_log=os.path.join(os.getcwd(), 'log'),
    rl_algorithm=DQN,  # pass in the Stable Baselines 3 algorithm class
    seed=107,  # change the random seed if you like
    n_steps=50_000,  # train for 50k steps. NOTE: this is nowhere near enough for an agent to learn
    hyperparameters=hyperparameters,
)

During training, you can follow the agent's performance with TensorBoard! Scroll up a bit to the TensorBoard panel and keep hitting that refresh button!

### "Do I _have_ to use Stable Baselines 3?"
If you don't want to use the Stable Baselines 3 framework, you can of course start training using any RL framework you prefer, starting by using the `ascab_train` gym environment defined above.
There's a bunch available! Some popular ones are:

- [CleanRL](https://docs.cleanrl.dev/rl-algorithms/overview/)
- [RLlib](https://docs.ray.io/en/latest/rllib/rllib-algorithms.html)
- [Google Dopamine](https://github.com/google/dopamine)
- [Meta's Pearl](https://pearlagent.github.io)
- [Tianshou](https://tianshou.org/en/stable/)
- or go hardcore with [TorchRL](https://docs.pytorch.org/rl/stable/index.html) 😎

Quite a few choices available, huh? But don't worry, this is just a matter of taste. Each framework has its own pros and cons. If you want to keep it simple, [StableBaselines3](https://stable-baselines3.readthedocs.io/en/master/guide/algos.html) is more than enough as a starting point!

### "What if I don't want to use RL?" No problem, we got you!
It's perfectly possible to create your own non-RL agent: a conditional agent!
One example of a conditional agent is a spraying schedule based on weather forecasts, a strategy typically employed by farmers. Here's an example of how to do it:

### Example of creating a conditional agent

In [None]:
from datetime import datetime
from ascab.train import BaseAgent
from ascab.env.env import AScabEnv

# First define the class, not forgetting to subclass `BaseAgent`
# In general, you want to change the `get_action` method to apply your conditional spraying strategy
class HowMyLocalFarmerSprays(BaseAgent):
    def __init__(
        self,
        ascab: AScabEnv = None,
        render: bool = True,
    ):
        super().__init__(ascab=ascab, render=render)

    def get_action(self, observation: dict = None) -> float:
        # The code below means: "If rain is forecast two days from now, I will spray today".
        rain_forecast = self.ascab.get_wrapper_attr("info")["Forecast_day2_HasRain"]
        if rain_forecast and rain_forecast[-1]:
            return 1.0  # the agent sprays this much if rain is forecast two days from now
        return 0.0


Then, you can try and run your expert strategy in the validation environment!

In [None]:
farmer_strategy = HowMyLocalFarmerSprays(ascab_val_nonrl, render=False)

# Use the class method below to run your agent!
farmer_strategy_results = farmer_strategy.run()

Also, you can evaluate your trained DQN RL agent the same way:

In [None]:
rl_results = rl_agent.run()
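
Before wrapping a new rule in a `BaseAgent` subclass, it can help to prototype the decision logic as a plain function and check it on a few hand-made days. The sketch below extends the rain rule with a minimum interval between sprays. `rain_rule`, `min_interval`, and the forecast sequence are hypothetical illustrations, not part of the A-scab package.

```python
def rain_rule(rain_forecast: bool, days_since_last_spray: int,
              min_interval: int = 7, dose: float = 1.0) -> float:
    """Spray `dose` if rain is forecast and the last spray is old enough."""
    if rain_forecast and days_since_last_spray >= min_interval:
        return dose
    return 0.0


# Hypothetical five-day stretch of rain forecasts
forecasts = [False, True, True, False, True]
days_since = 10  # assume the last spray was 10 days ago

doses = []
for rain in forecasts:
    dose = rain_rule(rain, days_since)
    doses.append(dose)
    # reset the counter after a spray, otherwise count one more day
    days_since = 0 if dose > 0 else days_since + 1

print(doses)  # [0.0, 1.0, 0.0, 0.0, 0.0]
```

Inside a real agent, the two inputs would come from the environment (for example the rain forecast used in `HowMyLocalFarmerSprays` and the `SinceLastAction` feature), and the function body would move into `get_action`.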

### Want to check out how your agent did?

Use the code below to plot results after using the `.run()` method!

In [None]:
from ascab.utils.plot import plot_results


# First, make a dictionary of your agents
dict_to_plot = {"RL DQN":rl_results,
                "MyFarmer":farmer_strategy_results,}

plot_results(dict_to_plot,
             save_path=os.getcwd(),
        )

These graphs are quite dense, but they show important information! They show 6 features for the whole season. The top 3 are precipitation in millimetres, fungus development (this is the risk period!), and discharge events of the fungus.

The bottom three are the pesticide level on the tree, the risk index of the season, and the pesticide spraying actions. On the right you can see a zoomed-in version of these three during the risk period.

The label below shows the total reward each agent achieved. Zero is the highest possible reward.

##### So, how did your agent(s) do? Not satisfied? Try another strategy or RL agent!

## Want to train on your own machine? Or on a supercomputer? Piece of cake!

Just install everything there, as shown in the first code block of this notebook. Remember to use the `hackathon` branch of the A-scab repository, and download the training and testing data from the Kaggle competition. Make sure both `train.csv` and `val.csv` are placed under `..\A-scab\dataset\`.

# Is there a model to beat? How do I know I'm doing good?

Good question! The table below shows the performance of an RL agent that Hilmy trained, evaluated on the `AscabValEnv-Discrete` environment for every validation year.


| No  | 2017 | 2019 | 2021 | 2023 |
| --- | -------- | -------- | -------- | -------- |
| 1   | -0.07      | -0.09       | -0.04        | -0.06       |

It does quite well huh? When trained properly, RL can do great things!
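
To compare your own validation results against this baseline, a small sketch like the one below can help. The baseline numbers are copied from the table above; `my_scores` is a made-up placeholder that you would replace with your own per-year results.

```python
from statistics import mean

# Baseline scores per validation year, copied from the table above
baseline = {2017: -0.07, 2019: -0.09, 2021: -0.04, 2023: -0.06}
# Hypothetical scores of your own agent, for illustration only
my_scores = {2017: -0.10, 2019: -0.08, 2021: -0.05, 2023: -0.06}

print(f"baseline mean: {mean(baseline.values()):.3f}")
print(f"my mean:       {mean(my_scores.values()):.3f}")

# Higher (closer to zero) is better
better_years = [year for year in baseline if my_scores[year] > baseline[year]]
print("years where my agent beats the baseline:", better_years)
```

Looking at every year separately, not only at the mean, gives a first impression of how well an agent generalises across seasons.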

# 🔑🔑🔑 Important things below! 🔑🔑🔑

Last but not least, while we would like for you to enjoy the hackathon process, there are a few hard rules we would like to enforce:
1. No sharing agents between teams.
2. All RL models must be trained in the given environment(s). No training with additional data or features!
3. Neither conditional agents nor RL agents are permitted to use these two features:
    - `"Risk"`, this is the target :)
    - `"Pesticide"`, this makes the agent(s) cheat a bit, since it will know how much pesticide is left in the canopy.

If you're wondering if Hilmy's RL agent adhered to rules 2 and 3 to get the above results: _yes, he did adhere to them_ 😎

We will manually check every model during testing and submission.

_Any violations could result in a disqualification from the leaderboard._

Specific questions? Feel free to ask us directly or send us a quick email!
michiel.kallenberg@wur.nl, hilmy.baja@wur.nl, zehao.lu@wur.nl

# Instructions for submitting your winning Agent

The testing will be done in an unseen location! Here are the instructions to submit your agent:

### For RL agents:

### For Conditional agents: