# FarmVibes.AI Harvest Period

This notebook demonstrates how to infer germination and harvest periods from an NDVI timeseries. It uses an existing workflow that computes an NDVI timeseries for a given area.

To install the required packages, see [this README file](../README.md).

### Notebook outline
The user provides a geographical region and a date range of interest, which are the inputs to a FarmVibes.AI workflow. The workflow fetches Sentinel-2 data for the corresponding region and period, runs cloud detection algorithms to obtain cloud-free imagery, and computes daily NDVI values at 10 m resolution.
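
For reference, NDVI is the normalized difference between near-infrared and red reflectance, `(NIR - Red) / (NIR + Red)`. The snippet below is a minimal sketch of that formula on made-up reflectance values. It is not the workflow's implementation, and the function name `ndvi` and the sample arrays are assumptions introduced only for illustration.

```python
import numpy as np


def ndvi(nir, red):
    """Compute NDVI = (NIR - Red) / (NIR + Red), returning NaN where NIR + Red is zero."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    denom = nir + red
    out = np.full(denom.shape, np.nan)
    # Only divide where the denominator is non-zero; other entries stay NaN
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out


# Hypothetical reflectance values: dense vegetation, sparse vegetation, no signal
nir = np.array([0.45, 0.30, 0.0])
red = np.array([0.05, 0.25, 0.0])
values = ndvi(nir, red)
print(values)
```

Values close to 1 indicate dense green vegetation, while values near 0 are typical of bare soil, which is why low NDVI is used later as a marker for the start of germination and the end of harvest.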

Below are the main libraries used in this example:
- [Shapely](https://github.com/shapely/shapely) is a library for manipulating geometric shapes.
- [Pandas](https://pandas.pydata.org/) is a library for manipulating tabular data.

### Imports & Constants

In [None]:
# Utility imports
from datetime import datetime
from shapely import wkt
import matplotlib.pyplot as plt
import numpy as np
import os
import pandas as pd

# FarmVibes.AI imports
from vibe_core.client import get_default_vibe_client

# FarmVibes.AI workflow name and run description
WORKFLOW_NAME = "farm_ai/agriculture/ndvi_summary"
RUN_NAME = "ndvi summary"

### Generate the NDVI dataset with FarmVibes.AI platform

Let's define the region and the time range to consider for this task:
- **Region:** the FarmVibes.AI platform expects a `.wkt` file with the polygon of the region of interest (ROI). An example, `input_region.wkt`, is already provided and represents a field chosen at random in Iowa.
- **Time Range:** we define the range as a tuple of two datetimes (start and end dates). In the example below, we analyze NDVI observations from the 1st of April until the end of October. The chosen period must cover an entire crop season.
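
As a quick sanity check of the two inputs described above, the sketch below parses a WKT polygon from a string and measures the length of a time range. The polygon coordinates are hypothetical (they are not the contents of `input_region.wkt`), and the names `example_wkt`, `example_geometry`, and `example_time_range` are introduced only for this illustration.

```python
from datetime import datetime

from shapely import wkt

# Hypothetical rectangle in (longitude, latitude) order; NOT the provided input_region.wkt
example_wkt = (
    "POLYGON ((-93.60 42.00, -93.59 42.00, -93.59 42.01, -93.60 42.01, -93.60 42.00))"
)
example_geometry = wkt.loads(example_wkt)

# Same date range as used in this notebook
example_time_range = (datetime(2020, 4, 1), datetime(2020, 10, 30))

# The geometry should be a valid polygon, and the range should start before it ends
geometry_ok = example_geometry.is_valid and example_geometry.geom_type == "Polygon"
range_days = (example_time_range[1] - example_time_range[0]).days

print(geometry_ok, range_days)
```

Checking the inputs before submitting a run is cheap compared with waiting for a workflow to fail on a malformed polygon or an inverted date range.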

In [None]:
input_geometry_path = "./input_region.wkt"
time_range = (datetime(2020, 4, 1), datetime(2020, 10, 30))

# Reading the geometry file 
with open(input_geometry_path) as f:
    geometry = wkt.load(f)

For the germination and harvest period task, we will run the `farm_ai/agriculture/ndvi_summary` workflow.
To build the dataset, we will instantiate the FarmVibes.AI remote client and run the workflow:

In [None]:
# Instantiate the client
client = get_default_vibe_client()

In [None]:
# Start the workflow
wf_run = client.run(WORKFLOW_NAME, RUN_NAME, geometry=geometry, time_range=time_range)

`wf_run` is a `VibeWorkflowRun` that holds the information about the workflow execution. A few of its important attributes:
- `wf_run.id`: the ID of the run
- `wf_run.status`: indicates the status of the run (pending, running, failed, or done)
- `wf_run.workflow`: the name of the workflow being executed (i.e., `WORKFLOW_NAME`)
- `wf_run.name`: the description provided by `RUN_NAME`
- `wf_run.output`: the dictionary with outputs produced by the workflow, indexed by sink names

If you need to retrieve a previous workflow run, you can use `client.list_runs()` to list all existing executions and find the ID of the desired run. The run can then be recovered with `wf_run = client.get_run_by_id("ID-of-the-run")`.

We can also use the method `monitor` from `VibeWorkflowRun` to verify the progress of each op/inner workflow of our run.

In [None]:
wf_run.monitor()

Once finished, we can access the generated outputs through `wf_run.output`.

The list of outputs of the dataset generation workflow is:

In [None]:
wf_run.output.keys()

To access a specific output, we can do:

In [None]:
ndvi_timeseries = wf_run.output["timeseries"]

### Preprocess data
With the NDVI timeseries produced by FarmVibes.AI, we will infer the germination and harvest periods based on the NDVI difference between successive observations. There are three parameters in this section:

- `ndvi_threshold`: upper limit for NDVI at the beginning of germination and ending of harvest periods. Default: 0.15
- `delta_threshold`: upper limit for NDVI difference between successive observations. Default: 0.1
- `rolling_window`: the number of NDVI observations on each rolling window step. Default: 14

#### The next steps will:
- load the CSV file in a Pandas dataframe
- smooth the timeseries using the rolling window method
- identify germination and harvest dates by looking for periods of small NDVI values (smaller than **ndvi_threshold**) and small variation between successive observations (`df['delta_mean']` smaller than **delta_threshold** in absolute value). Note: only periods with at least ten observations should be selected.
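
The smoothing and thresholding steps listed above can be sketched on synthetic data. Everything in this snippet is an assumption made for illustration: the `toy` dataframe, its made-up NDVI values, and the window of 5 are not produced by the workflow. Only the column name `mean` mirrors the column used in this notebook.

```python
import numpy as np
import pandas as pd

# Synthetic NDVI curve (made up): bare soil, green-up, plateau, senescence, bare soil
toy_ndvi = np.concatenate([
    np.full(15, 0.10),
    np.linspace(0.10, 0.80, 15),
    np.full(15, 0.80),
    np.linspace(0.80, 0.10, 15),
    np.full(10, 0.10),
])
toy = pd.DataFrame({
    "date": pd.date_range("2020-04-01", periods=len(toy_ndvi), freq="D"),
    "mean": toy_ndvi,
})

toy_window = 5  # hypothetical window size for this toy series

# Rolling mean smooths the series; the first (window - 1) rows are NaN
toy["rolled_mean"] = toy["mean"].rolling(window=toy_window).mean()
# Difference between successive smoothed observations
toy["delta_mean"] = toy["rolled_mean"].diff()

# Observations with low NDVI and little change between successive observations
low_and_flat = (toy["rolled_mean"] < 0.15) & (toy["delta_mean"].abs() < 0.1)

print(toy.loc[low_and_flat, ["date", "rolled_mean", "delta_mean"]])
```

On this toy series, the flagged observations fall in the bare-soil segments at the start and end of the season, while the plateau is excluded because its smoothed NDVI is well above the threshold.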


In [None]:
ndvi_threshold = 0.15
delta_threshold = 0.1
rolling_window = 14

In [None]:
timeseries = wf_run.output["timeseries"]
df = pd.read_csv(timeseries[0].assets[0].path_or_url)
df

In [None]:
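# Smooth the NDVI timeseries with a rolling mean over `rolling_window` observations
df['rolled_mean'] = df['mean'].rolling(window=rolling_window).mean()
# Difference between successive smoothed observations
df['delta_mean'] = df['rolled_mean'].diff()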
df

In [None]:
# Harvest: low NDVI, small variation, and decreasing values
df.loc[(df['rolled_mean'] < ndvi_threshold) & (df['delta_mean'].abs() < delta_threshold) & (df['delta_mean'] < 0), 'harvest_period'] = ndvi_threshold
# Germination: low NDVI, small variation, and increasing values
df.loc[(df['rolled_mean'] < ndvi_threshold) & (df['delta_mean'].abs() < delta_threshold) & (df['delta_mean'] > 0), 'germination_period'] = ndvi_threshold
df

In [None]:
df[['rolled_mean', 'harvest_period', 'germination_period']].plot()
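
The preprocessing notes mention keeping only periods with at least ten observations, which the cells above do not enforce. One possible way to do that is sketched below: consecutive flagged observations are grouped into runs, and short runs are discarded. The helper `flagged_runs`, its `min_observations` parameter, and the toy input are hypothetical and introduced only for this illustration.

```python
import pandas as pd


def flagged_runs(flags: pd.Series, min_observations: int = 10):
    """Return (first_index, last_index, length) for each run of consecutive
    True values that contains at least `min_observations` entries.

    `flags` is expected to be a boolean Series without missing values.
    """
    flags = flags.astype(bool)
    # A new run id starts whenever the flag value changes
    run_id = (flags != flags.shift()).cumsum()
    runs = []
    for _, run in flags.groupby(run_id):
        if run.iloc[0] and len(run) >= min_observations:
            runs.append((run.index[0], run.index[-1], len(run)))
    return runs


# Toy flags: one run of 12 observations (kept) and one run of 4 (discarded)
toy_flags = pd.Series([False] * 3 + [True] * 12 + [False] * 5 + [True] * 4 + [False] * 2)
periods = flagged_runs(toy_flags, min_observations=10)
print(periods)
```

With the dataframe from this notebook, the same helper could be applied to `df['harvest_period'].notna()` or `df['germination_period'].notna()`, since those columns are only filled for flagged observations.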