# VERDE benchmarking tool
This notebook is a simple benchmarking example: it compares NDVI, chlorophyll content and green vegetation cover fraction for a chosen field against nearby fields growing the same crop.
Given a field ID, it guides you through retrieving information about that field and about the other fields within a 2km radius of it.

## Using a GraphQL query
We will use the Agrimetrics GraphQL API to retrieve data about fields. Each query must be sent with a subscription key. For more information about finding and using your API key, see the [introduction to using GraphQL](../graphql-examples/using_graphql_intro.ipynb) and the [Agrimetrics developer portal](https://developer.agrimetrics.co.uk).

In [2]:
import os

GRAPHQL_ENDPOINT = "https://api.agrimetrics.co.uk/query-api/v1/graphql"

if "API_KEY" in os.environ:
    API_KEY = os.environ["API_KEY"]
else:
    API_KEY = input("Query API Subscription Key: ").strip()
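
The `input` prompt above echoes the key as it is typed. A minimal sketch of an alternative, assuming the standard-library `getpass` module suits your environment; the helper name `get_api_key` is hypothetical and not part of any Agrimetrics tooling:

```python
import os
from getpass import getpass


def get_api_key(env=os.environ, prompt=getpass):
    """Return the key from the API_KEY environment variable, else prompt for it.

    Hypothetical helper: `env` and `prompt` are injectable so the function
    can be exercised without a terminal.
    """
    key = env.get("API_KEY")
    if not key:
        # getpass hides the typed characters, unlike input().
        key = prompt("Query API Subscription Key: ")
    return key.strip()
```

In the notebook this could be used as `API_KEY = get_api_key()`.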

Let's start by retrieving the location of a field, identified by its ID, along with the crops recorded as growing in that field. See the [introduction to using GraphQL](../graphql-examples/using_graphql_intro.ipynb) tutorial for more details on making queries.

In [3]:
import requests

FIELD_ID = 'https://data.agrimetrics.co.uk/fields/pE9QRHmyDMX9BzvtP8hvcg'
# Note: this redefines GRAPHQL_ENDPOINT from the previous cell; the requests below use this URL.
GRAPHQL_ENDPOINT = 'https://api.agrimetrics.co.uk/graphql'
headers = {
    'Ocp-Apim-Subscription-Key': API_KEY,
    'Content-Type': 'application/json',
    'Accept-Encoding': 'gzip, deflate, br',
}

response = requests.post(GRAPHQL_ENDPOINT, headers=headers, json={
    'query': '''
        query getFieldByID($fieldID: [ID!]!) {
            fields(where: {id: {EQ: $fieldID}}) {
                sownCrop {
                    harvestYear
                    cropType
                }
                shape
            }
        }
    ''',
    'variables': { 'fieldID': FIELD_ID }
})
response.raise_for_status()
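
`raise_for_status()` only catches HTTP-level failures. GraphQL servers commonly report query problems in an `errors` array in the response body, often with a 200 status, so it can be worth checking for that as well. A minimal sketch, assuming the response follows the usual GraphQL `data`/`errors` layout; the helper name `extract_data` is hypothetical:

```python
def extract_data(payload):
    """Return payload['data'], raising if the GraphQL response reports errors.

    Hypothetical helper: assumes the usual GraphQL response layout with
    optional top-level 'errors' and 'data' keys.
    """
    errors = payload.get("errors")
    if errors:
        messages = "; ".join(
            str(e["message"]) if isinstance(e, dict) and "message" in e else str(e)
            for e in errors
        )
        raise RuntimeError(f"GraphQL query failed: {messages}")
    if payload.get("data") is None:
        raise RuntimeError("GraphQL response contained no data")
    return payload["data"]


# Example with a hand-written payload rather than a live API response.
sample = {"data": {"fields": [{"shape": None, "sownCrop": []}]}}
print(extract_data(sample)["fields"])
```

With a live response this would be called as `extract_data(response.json())`.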

The response is a JSON document. From it we extract the field centroid (as `[longitude, latitude]`) and the crop type of the last `sownCrop` entry:

In [4]:
FIELD_ID_info = response.json()
FIELD_ID_centroid = FIELD_ID_info['data']['fields'][0]['shape']['features'][0]['geometry']['coordinates']
FIELD_ID_crop_species = FIELD_ID_info['data']['fields'][0]['sownCrop'][-1]['cropType']
print('Chosen field centroid:', FIELD_ID_centroid, 'species:', FIELD_ID_crop_species)

Chosen field centroid: [-0.941086646, 51.410013151] species: GRASS
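
The cell above takes the last `sownCrop` entry (`[-1]`), which is the most recent crop only if the API returns entries in ascending harvest-year order. That ordering is an assumption here, not something the query states. A minimal sketch that selects the latest harvest year explicitly; the helper name `latest_crop_type` is hypothetical and the sample records are made up for illustration:

```python
def latest_crop_type(sown_crops):
    """Return the cropType of the entry with the highest harvestYear."""
    if not sown_crops:
        raise ValueError("field has no sownCrop records")
    latest = max(sown_crops, key=lambda crop: crop["harvestYear"])
    return latest["cropType"]


# Illustrative records in the shape returned by the sownCrop query above.
sample = [
    {"harvestYear": 2018, "cropType": "GRASS"},
    {"harvestYear": 2016, "cropType": "MAIZE"},
    {"harvestYear": 2017, "cropType": "GRASS"},
]
print(latest_crop_type(sample))  # GRASS
```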


## Find all fields inside a 2km radius
Now we use a second GraphQL query to find all fields inside a 2km circle centred on the chosen field.

In [5]:
response = requests.post(GRAPHQL_ENDPOINT, headers=headers, json={
    'query': '''
        query getFieldsWithinRadius($centroid: CoordinateScalar!, $distance: Float!) {
            fields(geoFilter: {location: {type: Point, coordinates: $centroid}, distance: {LE: $distance}}) {
                id
                sownCrop {
                    cropType
                    harvestYear
                }
            }
        }
    ''',
    'variables': { 'centroid': FIELD_ID_centroid, 'distance': 2000 } # distance in m
})
response.raise_for_status()
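
The `geoFilter` is evaluated by the API, so the exact distance calculation it uses is not visible here. To get a feel for the scale of a 2km radius, or to sanity-check coordinates locally, a great-circle distance can be sketched with the haversine formula. This assumes a spherical Earth of radius 6371 km and `[longitude, latitude]` coordinate order, as in the centroid printed earlier; the function name `haversine_m` is hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; a spherical approximation


def haversine_m(point_a, point_b):
    """Great-circle distance in metres between two [lon, lat] points in degrees."""
    lon1, lat1 = map(radians, point_a)
    lon2, lat2 = map(radians, point_b)
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(h))


centroid = [-0.941086646, 51.410013151]  # chosen field centroid from above
north = [centroid[0], centroid[1] + 0.018]  # roughly 2km further north
print(round(haversine_m(centroid, north)))
```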

The response can be converted to a pandas DataFrame:

In [6]:
import pandas as pd
results = response.json()
nearby_fields = pd.json_normalize(  # top-level function since pandas 1.0
    results['data']['fields'],
    record_path=['sownCrop'],
    meta=['id'],
)

nearby_fields

Unnamed: 0,cropType,harvestYear,id
0,GRASS,2016,https://data.agrimetrics.co.uk/fields/-Dzkwq1l...
1,GRASS,2017,https://data.agrimetrics.co.uk/fields/-Dzkwq1l...
2,GRASS,2018,https://data.agrimetrics.co.uk/fields/-Dzkwq1l...
3,GRASS,2016,https://data.agrimetrics.co.uk/fields/0IQiYLak...
4,MAIZE,2016,https://data.agrimetrics.co.uk/fields/0K6omy7x...
...,...,...,...
171,GRASS,2018,https://data.agrimetrics.co.uk/fields/y6Ot7itb...
172,GRASS,2017,https://data.agrimetrics.co.uk/fields/yCNQqWFv...
173,GRASS,2018,https://data.agrimetrics.co.uk/fields/yCNQqWFv...
174,MAIZE,2017,https://data.agrimetrics.co.uk/fields/zMYGv-uH...
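
Normalising with `record_path=['sownCrop']` and `meta=['id']` produces one row per sown-crop record and copies the parent field's `id` onto each row. As an illustration of that behaviour only, the same flattening can be sketched in plain Python on a made-up payload (the field IDs below are placeholders, not real Agrimetrics fields):

```python
# Hand-written sample in the shape of results['data']['fields'].
fields = [
    {"id": "field-a", "sownCrop": [
        {"cropType": "GRASS", "harvestYear": 2017},
        {"cropType": "GRASS", "harvestYear": 2018},
    ]},
    {"id": "field-b", "sownCrop": [
        {"cropType": "MAIZE", "harvestYear": 2018},
    ]},
]

# One output row per sownCrop record, with the parent id attached.
rows = [
    {**crop, "id": field["id"]}
    for field in fields
    for crop in field["sownCrop"]
]

for row in rows:
    print(row)
```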


For benchmarking, we want to compare fields growing the same crop in the same season. Here we keep only the fields whose crop type matches the chosen field (grass) for the 2018 harvest year.

In [7]:
nearby_species_2018_fields = nearby_fields[(nearby_fields['cropType'] == FIELD_ID_crop_species) & (nearby_fields['harvestYear'] == 2018)]
print(f'There are {len(nearby_species_2018_fields)} {FIELD_ID_crop_species} fields within 2km')

There are 36 GRASS fields within 2km


## Accessing Verde field attributes
To access Verde field attributes, we must first register our Agrimetrics subscription for crop observations. Registration is per field, so we register every field for which we want Verde attributes. This is done with a GraphQL mutation:

In [8]:
for field_id in nearby_species_2018_fields['id']:
    response = requests.post(GRAPHQL_ENDPOINT, headers=headers, json={
        'query': '''
            mutation registerCropObservations($fieldId: ID!) {
                account {
                    premiumData {
                        addCropObservationRegistrations(registrations: {fieldId: $fieldId, layerType: NON_CROP_SPECIFIC, season: SEP2017TOSEP2018}) {
                            id 
                        }
                    }
                }
            }
        ''',
        'variables': {'fieldId': field_id}
    })
    response.raise_for_status()

Once registered, we can access Verde crop observations for each field. For this simple benchmark, we retrieve three time series: `normalisedDifferenceVegetationIndex`, `chlorophyllContent` and `greenVegetationCoverFraction`.

In [9]:
attributes_selection = ['normalisedDifferenceVegetationIndex', 'chlorophyllContent', 'greenVegetationCoverFraction']
attributes_data = {}
response = requests.post(GRAPHQL_ENDPOINT, headers=headers, json={
    'query': '''
        query getCropObservations($fieldIds: [ID!]!) {
            fields(where: {id: {EQ: $fieldIds}}) {
                id
                cropObservations {
                    normalisedDifferenceVegetationIndex {
                        mean
                        dateTime
                    }
                    chlorophyllContent {
                        mean
                        dateTime
                    }
                    greenVegetationCoverFraction {
                        mean
                        dateTime
                    }
                }
            }
        }
    ''',
    'variables': {'fieldIds': [*nearby_species_2018_fields['id']]}
})
response.raise_for_status()
results = response.json()

for attribute in attributes_selection:
    # One row per observation, with the parent field id attached.
    attribute_data = pd.json_normalize(
        results['data']['fields'],
        record_path=['cropObservations', attribute],
        meta=['id'],
    )
    attribute_data['date_time'] = pd.to_datetime(attribute_data['dateTime'])
    attribute_data['value'] = attribute_data['mean']
    # Drop observations with missing values.
    attribute_data = attribute_data.dropna()
    attributes_data[attribute] = attribute_data[['id', 'date_time', 'value']]
    
attributes_data['normalisedDifferenceVegetationIndex']

Unnamed: 0,id,date_time,value
0,https://data.agrimetrics.co.uk/fields/-Dzkwq1l...,2017-11-23 11:13:49+00:00,0.740598
1,https://data.agrimetrics.co.uk/fields/-Dzkwq1l...,2017-11-23 11:13:49+00:00,0.740598
2,https://data.agrimetrics.co.uk/fields/-Dzkwq1l...,2017-12-08 11:14:41+00:00,0.725569
3,https://data.agrimetrics.co.uk/fields/-Dzkwq1l...,2017-12-08 11:14:41+00:00,0.725569
4,https://data.agrimetrics.co.uk/fields/-Dzkwq1l...,2017-12-18 11:14:51+00:00,0.709237
...,...,...,...
1444,https://data.agrimetrics.co.uk/fields/yCNQqWFv...,2018-09-04 11:06:21+00:00,0.385951
1445,https://data.agrimetrics.co.uk/fields/yCNQqWFv...,2018-09-17 10:51:45+00:00,0.303537
1446,https://data.agrimetrics.co.uk/fields/yCNQqWFv...,2018-09-19 11:07:19+00:00,0.148777
1447,https://data.agrimetrics.co.uk/fields/yCNQqWFv...,2018-09-24 10:57:59+00:00,0.355927
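
The first rows of the output above show the same field and timestamp appearing twice with identical values. Left as they are, such repeats would weight some observations more heavily in later averages. A minimal sketch of removing them with pandas `drop_duplicates`, shown on a small made-up frame rather than the live data:

```python
import pandas as pd

# Made-up observations with one exact repeat, mimicking the output above.
observations = pd.DataFrame({
    "id": ["field-a", "field-a", "field-a", "field-b"],
    "date_time": pd.to_datetime(
        ["2017-11-23 11:13:49", "2017-11-23 11:13:49",
         "2017-12-08 11:14:41", "2017-11-23 11:13:49"],
        utc=True,
    ),
    "value": [0.740598, 0.740598, 0.725569, 0.701234],
})

# Keep the first row for each (field, timestamp) pair.
deduplicated = (
    observations
    .drop_duplicates(subset=["id", "date_time"])
    .reset_index(drop=True)
)
print(deduplicated)
```

In the notebook, the same call could be applied to each `attribute_data` frame before it is stored in `attributes_data`.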


We now have crop observations for our chosen field and for nearby fields of the same species. Next, we pick a date with a slider and compare the fields over a window of 14 days either side of it.

In [10]:
from datetime import timedelta
import matplotlib.pyplot as plt

def compare(date):
    """Plot each field's mean value in a +/-14 day window around `date`."""
    plt.figure(figsize=(15, 1))
    from_date = date - timedelta(days=14)
    to_date = date + timedelta(days=14)
    for i, attribute in enumerate(attributes_selection):
        data = attributes_data[attribute]
        filtered = data[(data.date_time > from_date) & (data.date_time < to_date)]

        # Average only the numeric column, giving one value per field.
        mean_values = filtered.groupby('id')['value'].mean()
        ax = plt.subplot(1, 3, i + 1)
        ax.set_title(attribute)

        if mean_values.empty:
            ax.text(0.5, 0.5, 'No data', color='black')
            continue
        for field_id, value in mean_values.items():
            # Chosen field in red, nearby fields in translucent black.
            colour = 'red' if field_id == FIELD_ID else 'black'
            alpha = 1 if field_id == FIELD_ID else 0.6
            ax.axvline(value, color=colour, alpha=alpha, linewidth=3)

        # Blue line: mean across all fields in the window.
        ax.axvline(mean_values.mean(), color='blue', linewidth=3, label='Nearby fields average')
        ax.legend(loc='upper right', fontsize='small')  # show the label set above

import ipywidgets as widgets

start_date = min([attributes_data[attribute].date_time.min() for attribute in attributes_selection])
end_date = max([attributes_data[attribute].date_time.max() for attribute in attributes_selection])
dates = pd.date_range(start_date, end_date)

widgets.interact(compare,
    date=widgets.SelectionSlider(description='Date', options=dates, style={'description_width': 'initial'})
)
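
The slider plot gives a visual comparison. For a numeric summary of the same ±14-day window, a sketch along these lines could report the chosen field's mean, the mean across all fields, and the share of fields that the chosen field exceeds. The function name `benchmark_summary` and the sample data are hypothetical; the frame is assumed to have the `id`, `date_time` and `value` columns built earlier.

```python
from datetime import timedelta

import pandas as pd


def benchmark_summary(data, field_id, date, window_days=14):
    """Summarise one field against all fields within +/- window_days of date."""
    window = timedelta(days=window_days)
    in_window = data[
        (data["date_time"] > date - window) & (data["date_time"] < date + window)
    ]
    per_field = in_window.groupby("id")["value"].mean()
    if field_id not in per_field.index:
        return None  # no observations for the chosen field in this window
    chosen = per_field[field_id]
    return {
        "chosen_mean": chosen,
        "all_fields_mean": per_field.mean(),
        "fraction_of_fields_below": (per_field < chosen).mean(),
    }


# Made-up data: three fields observed on the same day.
sample = pd.DataFrame({
    "id": ["a", "b", "c"],
    "date_time": pd.to_datetime(["2018-05-01"] * 3, utc=True),
    "value": [0.6, 0.4, 0.8],
})
print(benchmark_summary(sample, "a", pd.Timestamp("2018-05-10", tz="UTC")))
```

In the notebook this could be called as `benchmark_summary(attributes_data[attribute], FIELD_ID, date)` for each attribute.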

