<span style="background-color: yellow; color: black; font-size: 26px;">**Before you start, make sure you ran the *setup* notebook and have your GEE account.**</span>

You can find the *setup* notebook at this link: https://github.com/soilwater/precisionag-workshop-2024/blob/main/setup_python_geospatial_analysis.ipynb

# Create Field Management Zones

In this tutorial you will learn how to:

- Find, access, and download products from Google Earth Engine
- Compute NDVI from image bands 
- Load custom field boundaries in vector format
- Plot raster files
- Stack raster files to generate space-time arrays
- Merge multiple NDVI images using the Mean Relative Difference concept
- Cluster multiple images to generate field management zones
- Export the resulting field management zones as a prescription map in Shapefile format

Notebook created by Andres Patrignani and Gabriel da Rocha Hintz - November 2024


### Import necessary modules

In [None]:
# Import modules
import ee
import glob
import json
import requests
import numpy as np
import xarray as xr
import pandas as pd
import geopandas as gpd

import matplotlib.pyplot as plt
from matplotlib import colors

from datetime import datetime, timedelta
from scipy.ndimage import gaussian_filter

from sklearn.cluster import BisectingKMeans
from sklearn.impute import SimpleImputer

import rasterio.features
from shapely.geometry import shape # shape creates a Shapely geometry


### Initialize Google Earth Engine

In [None]:
# Authenticate
ee.Authenticate()

# Initialize API
ee.Initialize()

### Create helper functions

In [None]:
def save_gee_to_geotiff(ee_image, filename, crs, scale, geom, bands=None):
    """
    Save an image from Google Earth Engine to the local hard drive as a GeoTIFF.
    """
    # Avoid a mutable default argument; an empty list keeps the original behavior
    bands = [] if bands is None else bands

    # Build the download URL for the requested region, scale, bands, and CRS
    image_url = ee_image.getDownloadUrl({'region': geom, 'scale': scale,
                                         'bands': bands,
                                         'crs': f'EPSG:{crs}',
                                         'format': 'GEO_TIFF'})

    # Request data using URL and save data as a new GeoTiff file
    response = requests.get(image_url)
    if response.status_code == 200:
        with open(filename, 'wb') as f:
            f.write(response.content)
        print(f"Saved image {filename}")
    else:
        print(f"Failed to download image (HTTP status {response.status_code})")


def array_to_df(arr):
    """Convert a list of rows (the first row holds the column names) into a dataframe"""
    df = pd.DataFrame(arr[1:])
    df.columns = arr[0]
    df['time'] = pd.to_datetime(df['time'], unit='ms')
    return df


In [None]:
# Format axis tick labels without an offset (avoids scientific-style offset notation)
plt.rcParams['axes.formatter.useoffset'] = False


In [None]:
# Create our own colormap
hex_palette = ['#FEFEFE','#CE7E45', '#DF923D', '#F1B555', '#FCD163', '#99B718', '#74A901',
             '#66A000', '#529400', '#3E8601', '#207401', '#056201', '#004C00', '#023B01',
             '#012E01', '#011D01', '#011301']


# Use the built-in ListedColormap function to do the conversion
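One possible way to do this conversion is sketched below. The palette is repeated so that the snippet runs on its own, and the variable name `ndvi_cmap` is our own choice rather than something fixed by the tutorial.

```python
from matplotlib import colors

# Same palette as above, repeated so this snippet is self-contained
hex_palette = ['#FEFEFE', '#CE7E45', '#DF923D', '#F1B555', '#FCD163', '#99B718', '#74A901',
               '#66A000', '#529400', '#3E8601', '#207401', '#056201', '#004C00', '#023B01',
               '#012E01', '#011D01', '#011301']

# ListedColormap accepts a list of color specifications (hex strings work)
ndvi_cmap = colors.ListedColormap(hex_palette)  # hypothetical name

# The colormap has one discrete color per palette entry
print(ndvi_cmap.N)
```

The resulting object can be passed to the `cmap` argument of most Matplotlib and Xarray plotting functions.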


### Define area of interest

We will use the boundaries of an actual farmer production field near Gypsum, KS.

In [None]:
# Read field boundary with Geopandas


In [None]:
# Visualize field boundaries


In [None]:
# Get JSON format of GeoDataframe
# Note that .to_json() gives us a string, so we use the json module to create a proper json object

# Define the region of interest (roi) in GEE as a ee.Geometry

# Create mask in GEE for field
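As a rough sketch of the JSON step, the snippet below parses a GeoJSON string and pulls out the polygon coordinates. The string is a made-up stand-in for the output of `.to_json()`, and the coordinates are illustrative only, not the real field boundary.

```python
import json

# Hypothetical GeoJSON string standing in for the output of GeoDataFrame.to_json()
geojson_str = """
{"type": "FeatureCollection",
 "features": [{"type": "Feature", "properties": {},
               "geometry": {"type": "Polygon",
                            "coordinates": [[[-97.45, 38.70], [-97.44, 38.70],
                                             [-97.44, 38.71], [-97.45, 38.71],
                                             [-97.45, 38.70]]]}}]}
"""

# json.loads turns the string into nested Python dictionaries and lists
boundary = json.loads(geojson_str)

# Geometry of the first (and here only) feature
geometry = boundary["features"][0]["geometry"]
coords = geometry["coordinates"]
print(geometry["type"], len(coords[0]))

# With an authenticated Earth Engine session, the coordinates could then be
# passed to ee.Geometry.Polygon(coords) to define the region of interest.
```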


In [None]:
# Get centroid of GEE geometry (note that this is of type: Point)


### Get NDVI timeseries to inspect growing season trend

To inspect the temporal variability and identify the period for which to download data for our project, let's plot an NDVI time series for the centroid (you can do this for any point within the field).

We will use the following GEE product: https://developers.google.com/earth-engine/datasets/catalog/MODIS_061_MOD13Q1

In [None]:
# Define start and end dates
start_date = 
end_date = 


In [None]:
# Get collection for Modis-Terra 16-day composite at 250-meter resolution


In [None]:
# Convert array into dataframe
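To see what the conversion does without calling Earth Engine, here is a small sketch that applies `array_to_df` to a hand-made list shaped like the output of `getRegion` (a header row followed by data rows). The values are invented for illustration. The 0.0001 factor is the scale listed for the NDVI band of MOD13Q1 in the product catalog.

```python
import pandas as pd

def array_to_df(arr):
    """Convert a list of rows (first row holds the column names) into a dataframe"""
    df = pd.DataFrame(arr[1:])
    df.columns = arr[0]
    df['time'] = pd.to_datetime(df['time'], unit='ms')
    return df

# Made-up rows mimicking a getRegion() result: header first, then one row per image
mock_region = [
    ['id', 'longitude', 'latitude', 'time', 'NDVI'],
    ['2024_01_01', -97.445, 38.705, 1704067200000, 2450],
    ['2024_01_17', -97.445, 38.705, 1705449600000, 2380],
    ['2024_02_02', -97.445, 38.705, 1706832000000, 2610],
]

df = array_to_df(mock_region)

# Apply the MOD13Q1 scale factor to obtain NDVI in its usual -1 to 1 range
df['NDVI'] = df['NDVI'] * 0.0001
print(df[['time', 'NDVI']])
```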


In [None]:
# Create figure to visualize time series


### Download Sentinel data

First we are going to request the available image dates within a user-defined period of time. Then we will download only a few of the images in the list so that the code runs within a short period of time.

For this part we will use a product derived from Sentinel-2 that carries cloud information, which allows us to filter out images from cloudy days. The product link is: https://developers.google.com/earth-engine/datasets/catalog/GOOGLE_CLOUD_SCORE_PLUS_V1_S2_HARMONIZED?hl=en#description
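The idea behind cloud filtering and NDVI can be sketched locally with NumPy. In the snippet below the tiny arrays are invented, `cs` stands for the Cloud Score+ clear-sky score (0 = cloudy, 1 = clear), and the 0.6 threshold is an assumption that you may want to tune. For Sentinel-2, red is band B4 and near-infrared is band B8.

```python
import numpy as np

# Invented 2x2 reflectance values for the red (B4) and near-infrared (B8) bands
red = np.array([[0.05, 0.06],
                [0.30, 0.04]])
nir = np.array([[0.45, 0.40],
                [0.32, 0.50]])

# Invented clear-sky scores; lower values mean a pixel is more likely cloudy
cs = np.array([[0.90, 0.80],
               [0.20, 0.95]])

clear_threshold = 0.6  # assumed cutoff, not a fixed rule

# NDVI is the normalized difference between near-infrared and red
ndvi = (nir - red) / (nir + red)

# Keep NDVI only where the pixel is considered clear; set the rest to NaN
ndvi_clear = np.where(cs >= clear_threshold, ndvi, np.nan)
print(ndvi_clear)
```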

In [None]:
# Re-define start and end dates based on time series


In [None]:
# Select Sentinel 2 image collection


# Get the list of available image dates



In [None]:
# Select a subset of the collection dates to download

# Download each image



In [None]:
# Read one image to inspect data

# Get number of rows and columns for later use

# Visualize map for specific date



### Create space-time data array using all the images

In [None]:
# Read all image names


In [None]:
# Create DataArray
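A minimal sketch of the stacking step is shown below, using random arrays in place of the downloaded GeoTIFFs. The dates, grid size, and dimension names (`time`, `y`, `x`) are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

rng = np.random.default_rng(0)

# Invented image dates and random 4x5 "NDVI" layers standing in for the GeoTIFFs
dates = pd.to_datetime(['2024-05-01', '2024-05-16', '2024-06-05'])
layers = [rng.uniform(0.2, 0.9, size=(4, 5)) for _ in dates]

# Stack the 2D layers along a new first axis to get a (time, y, x) array
data = np.stack(layers, axis=0)

# Wrap the stack in a labeled DataArray so layers can be selected by date
ndvi = xr.DataArray(data,
                    dims=('time', 'y', 'x'),
                    coords={'time': dates, 'y': np.arange(4), 'x': np.arange(5)},
                    name='NDVI')

# Select one image by date, and compute the field-average NDVI per date
single = ndvi.sel(time=pd.Timestamp('2024-05-16'))
field_mean = ndvi.mean(dim=['y', 'x'])
print(single.shape, field_mean.values)
```

With real data, the `x` and `y` coordinates would come from the raster georeferencing rather than from `np.arange`.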


In [None]:
# Plot all of them


### Access and select data within DataArray

In [None]:
# Plot single image by date


In [None]:
# Only areas with specific NDVI


In [None]:
# Plot time series


### Compute relative difference

For each NDVI layer we will normalize the values by first subtracting the field mean and then dividing by the field mean. This way we will obtain a new grid showing areas of the field that have more or less biomass than the field average.

$$ RD = \frac{NDVI - \overline{NDVI}}{\overline{NDVI}}$$
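The formula can be tried on a toy example before applying it to the real images. The sketch below uses two invented 2x2 NDVI layers, with one missing pixel, and then averages the relative difference across dates to obtain the mean relative difference (MRD).

```python
import numpy as np

# Two invented NDVI layers (time, y, x); NaN marks a pixel missing on the first date
ndvi = np.array([
    [[0.2, 0.4],
     [0.6, np.nan]],
    [[0.3, 0.5],
     [0.7, 0.5]],
])

# Field-average NDVI for each date, ignoring missing pixels
layer_mean = np.nanmean(ndvi, axis=(1, 2), keepdims=True)

# Relative difference: RD = (NDVI - mean) / mean, computed per date
rd = (ndvi - layer_mean) / layer_mean

# Mean relative difference: average RD across dates for each pixel
mrd = np.nanmean(rd, axis=0)
print(mrd)
```

Pixels with a positive MRD are consistently greener than the field average, and pixels with a negative MRD are consistently less green.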


In [None]:
# Create lambda function to compute RD


In [None]:
# Compute relative difference


In [None]:
# Create data array


In [None]:
# Compute mean relative difference



In [None]:
# Plot mean relative difference



### Create management zones

We will use clustering analysis to find homogeneous management zones. In the input table for clustering, each column represents a different feature (an NDVI date, or any other variable) and each row represents an observation (in our case, each pixel of the field).

In [None]:
# Input data using MRD

# Imputing missing data


In [None]:
# Clustering (this will result in integer labels)

# Smooth clusters using Gaussian filter
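One way to sketch the smoothing step is to filter the label grid and round the result back to integers. The toy grid and `sigma=1` below are assumptions. Note that a Gaussian filter averages neighboring label values, so intermediate labels can appear along boundaries, and the approach is mainly sensible when labels are ordered (for instance, sorted by MRD).

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Toy label grid: zone 0 on the left, zone 2 on the right
labels = np.zeros((7, 7), dtype=int)
labels[:, 4:] = 2
labels[3, 1] = 2   # isolated pixel, the kind of speckle we want to remove

# Smooth the labels as floats, then round back to integer labels
smoothed = np.round(gaussian_filter(labels.astype(float), sigma=1)).astype(int)
print(smoothed)
```

A larger `sigma` removes larger patches but also blurs the boundaries between zones more strongly.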


In [None]:
# Create DataArray with resulting clusters, which are the new management zones


# Restore NaN cells


# Plot management zones


In [None]:
# Create a version where each cluster is represented by the median MRD
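A possible way to build this version is to loop over the zone labels and assign the median MRD of each zone to all of its pixels. The small grids below are invented for illustration.

```python
import numpy as np

# Invented MRD grid and matching zone labels (NaN marks a pixel outside the field)
mrd = np.array([[-0.30, -0.20, 0.10],
                [-0.25,  0.05, 0.30],
                [np.nan, 0.15, 0.35]])
zones = np.array([[0, 0, 1],
                  [0, 1, 2],
                  [np.nan, 1, 2]])

# Start from an all-NaN grid and fill one zone at a time
zone_mrd = np.full_like(mrd, np.nan)
for z in np.unique(zones[~np.isnan(zones)]):
    in_zone = zones == z
    zone_mrd[in_zone] = np.nanmedian(mrd[in_zone])

print(zone_mrd)
```

Representing zones by their median MRD makes the map easier to interpret than arbitrary integer labels, since the values indicate how each zone compares with the field average.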


In [None]:
# Show mapped clusters using MRD values


### Vectorize resulting raster management zones

In [None]:
# Create empty arrays to save polygons for each cluster and their MRD value

# Create a GeoDataFrame with the polygons and their cluster labels


In [None]:
# Plot mapped clusters


In [None]:
# Visualize interactive map


In [None]:
# Export shapefile to use as a prescription map
