# Exploring ACIS / PRISM weather data and GDU curves

In this notebook we will download ACIS / PRISM gridded weather data and use it to compute GDU curves at a specific field. It is the start of a data processing scheme that brings insight into field state and operational planning.

Let us begin by importing some useful libraries that will be used throughout the notebook. Those are:
1. requests: Makes HTTP requests. We will use it to grab data from ACIS.
2. pandas: A popular tool for doing data analysis in Python. We will use it for just about everything :).
3. matplotlib: A tool for making graphical plots. We will use it to, well, make plots!

_Note:_ I set the matplotlib style to “seaborn” because I like it better. It is not required and you're welcome to adjust it to your liking.

In [4]:
import requests
import pandas as pd
import matplotlib.pyplot as plt # This is just so we don't have to type "matplotlib.pyplot" all the time.

In [48]:
# Set up some plotting styles
#%matplotlib notebook
%matplotlib inline

# Note: Matplotlib 3.6 renamed this style to 'seaborn-v0_8'; use that name on newer versions
plt.style.use('seaborn')

plt.rc('font', size=20)          # controls default text sizes
plt.rc('axes', titlesize=20)     # fontsize of the axes title
plt.rc('axes', labelsize=20)    # fontsize of the x and y labels
plt.rc('xtick', labelsize=20)    # fontsize of the tick labels
plt.rc('ytick', labelsize=20)    # fontsize of the tick labels
plt.rc('legend', fontsize=20)    # legend fontsize
plt.rc('figure', titlesize=20, figsize=(15, 8))  # fontsize of the figure title

Let us start by grabbing some ACIS weather data for a field at Purdue's [ACRE farm](https://www.google.com/maps/place/40%C2%B028'27.2%22N+86%C2%B059'43.7%22W/@40.4742259,-86.9975974,1242m/data=!3m1!1e3!4m5!3m4!1s0x0:0x0!8m2!3d40.4742222!4d-86.9954722). That is GPS coordinate: (40.4742259,-86.9975974).

We will grab all the _historical_ data that ACIS/PRISM has available. Let's ignore _this_ year's data for now (because it is incomplete).

It is okay if you don't understand all this code right now. We will spend some time getting used to Python/Pandas before we compute the GDU curves.

In [30]:
# GPS coordinates of interest
lat = 40.4742259
lon = -86.9975974

# PRISM usually goes back to 1981
sdate = "1981-01-01"
edate = "2020-12-31"

# Make ACIS API request
w = requests.post('http://data.rcc-acis.org/GridData', json=...)

# Parse the JSON response into Python data types (in this case a dictionary)


# Convert the raw Python map to a Pandas DataFrame


# Convert the date "string" into a real Pandas date object (enables some Pandas niceties)


# Tell Pandas to use the "date" column as the index. Basically this means it will be easy
# to select data from the DataFrame by date and date range.
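The commented steps above can be sketched without touching the network. The `sample_response` dictionary below is a hand-written stand-in: its layout (a `"data"` list of `[date, maxt, mint, pcpn]` rows), the column names `mint` and `pcpn`, and the name `w_demo` are all assumptions made for illustration, so check them against the response you actually receive.

```python
import pandas as pd

# Hypothetical stand-in for the parsed JSON (what `w.json()` might return).
# The layout and the values are assumptions, not real ACIS output.
sample_response = {
    "data": [
        ["1981-01-01", 30, 14, 0.00],
        ["1981-01-02", 33, 20, 0.12],
        ["1981-01-03", 28, 11, 0.00],
    ]
}

# Convert the raw Python structure to a DataFrame with named columns
w_demo = pd.DataFrame(sample_response["data"], columns=["date", "maxt", "mint", "pcpn"])

# Convert the date strings into real Pandas timestamps
w_demo["date"] = pd.to_datetime(w_demo["date"])

# Use the "date" column as the index so rows can be selected by date
w_demo = w_demo.set_index("date")

print(w_demo)
```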



# Getting comfortable with Pandas

## Useful utilities

**Task:** View the first few rows

**Task:** View the last few rows

**Task:** Print the column-wise statistics of the data
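One possible way to approach these three tasks is sketched below on a synthetic DataFrame. The values are random and the column names `maxt`, `mint`, and `pcpn` are assumed; swap in the real weather DataFrame when working through the notebook.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the weather DataFrame (made-up values, assumed column names)
dates = pd.date_range("2020-01-01", "2020-12-31", freq="D", name="date")
rng = np.random.default_rng(0)
w_demo = pd.DataFrame(
    {
        "maxt": rng.integers(30, 95, len(dates)),
        "mint": rng.integers(0, 30, len(dates)),
        "pcpn": rng.random(len(dates)).round(2),
    },
    index=dates,
)

first_rows = w_demo.head()    # first 5 rows by default
last_rows = w_demo.tail(3)    # last 3 rows
stats = w_demo.describe()     # count, mean, std, min, quartiles, max per column

print(first_rows, last_rows, stats, sep="\n\n")
```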

## Selecting columns

**Task:** Select the entire maximum temperature column

**Task:** Select the entire precipitation column

**Task:** Select the "date" column
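A minimal sketch of column selection, using a tiny hand-made DataFrame with assumed column names. One detail worth noting: if "date" was made the index earlier, it is no longer an ordinary column, so it is read from `.index` instead.

```python
import pandas as pd

# Tiny made-up example; column names are assumptions
dates = pd.date_range("2020-06-01", periods=4, freq="D", name="date")
w_demo = pd.DataFrame(
    {"maxt": [80, 84, 79, 88], "mint": [60, 63, 58, 66], "pcpn": [0.0, 0.4, 0.0, 0.1]},
    index=dates,
)

maxt = w_demo["maxt"]   # a single column comes back as a Series
pcpn = w_demo["pcpn"]
date = w_demo.index     # "date" is the index here, so w_demo["date"] would raise a KeyError

print(maxt, pcpn, date, sep="\n\n")
```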

## Selecting rows

**Task:** Select rows 36 through 38

**Task:** Select rows from March 4th, 1987 to May 6th, 1987
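Row selection might look like the sketch below. It assumes "rows 36 through 38" means zero-based positions, and it uses a synthetic 1987 DataFrame whose values are simply the row positions so the result is easy to check.

```python
import numpy as np
import pandas as pd

# Synthetic data: "maxt" just counts rows so positions are easy to see
dates = pd.date_range("1987-01-01", "1987-12-31", freq="D", name="date")
w_demo = pd.DataFrame(
    {"maxt": np.arange(len(dates)), "mint": np.arange(len(dates)) - 20},
    index=dates,
)

# By position: the end of an iloc slice is exclusive, so 36:39 gives rows 36, 37, 38
rows_36_38 = w_demo.iloc[36:39]

# By date label: both ends of a loc slice are inclusive
spring_1987 = w_demo.loc["1987-03-04":"1987-05-06"]

print(rows_36_38)
print(spring_1987.head())
```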

## Selecting rows and columns

**Task:** Select the maxt column from Feb 1, 2002 to April 15th, 2002
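Rows and columns can be selected together by giving `.loc` a row slice and a column name. A small sketch on synthetic 2002 data (values are made up):

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2002-01-01", "2002-12-31", freq="D", name="date")
w_demo = pd.DataFrame(
    {"maxt": np.linspace(30, 90, len(dates)), "mint": np.linspace(10, 70, len(dates))},
    index=dates,
)

# Row slice first, column label second; the date slice includes both end dates
maxt_slice = w_demo.loc["2002-02-01":"2002-04-15", "maxt"]

print(maxt_slice)
```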

## Modifying part of column

**Task:** Compute each day's average temperature and save it as the new column "avg"

In [120]:
# Let's "copy" the DataFrame so we don't break the data for the rest of the notebook


# Compute average temp and save back to the data frame                                                   
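A sketch of the copy-then-add-column pattern, with hand-made values and assumed column names. The name `w_avg` is hypothetical.

```python
import pandas as pd

dates = pd.date_range("2020-06-01", periods=3, freq="D", name="date")
w_demo = pd.DataFrame({"maxt": [80, 84, 79], "mint": [60, 63, 58]}, index=dates)

# Work on a copy so the original DataFrame stays untouched
w_avg = w_demo.copy()

# Column arithmetic is element-wise, so this computes one average per day
w_avg["avg"] = (w_avg["maxt"] + w_avg["mint"]) / 2

print(w_avg)
```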


## Modifying part of a column in place

**Task:** Limit the min temp to 0 F. Plot the original and modified data series to prove it worked.

In [122]:
# Select the rows to modify and then modify them.                                                   
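One way to modify only part of a column is a boolean mask combined with `.loc`, sketched here on a few made-up winter days. Plotting both series shows where the values were limited.

```python
import matplotlib.pyplot as plt
import pandas as pd

dates = pd.date_range("2019-01-28", periods=6, freq="D", name="date")
w_demo = pd.DataFrame({"mint": [5, -3, -12, -8, 2, 10]}, index=dates)

w_limited = w_demo.copy()

# The mask selects rows below 0 F; loc assigns to just those rows
w_limited.loc[w_limited["mint"] < 0, "mint"] = 0

plt.figure()
plt.plot(w_demo["mint"], label="original")
plt.plot(w_limited["mint"], label="limited to 0 F")
plt.legend()
```

`w_limited["mint"].clip(lower=0)` would give the same values; the mask form is shown because it generalizes to other conditions.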


# Getting comfortable with Matplotlib

## Plot a line

**Task:** Plot all the max temperature data in one plot.

In [51]:
# Matplotlib combined with Pandas makes plotting really easy!
# Note: Matplotlib was able to determine the correct x-axis based on the DataFrame's index (in this case, date)


# We can improve the plot by adding labels, title, and tick rotation (the date strings are long)


# A "tight layout" just reduces the amount of white space in the figure. The result is a bit busier, but the plots are larger.
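The three commented steps might look like the following sketch. The temperature curve is synthetic (a smooth seasonal cycle), and the label text is only a suggestion.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic seasonal temperatures, for illustration only
dates = pd.date_range("2018-01-01", "2020-12-31", freq="D", name="date")
seasonal = 52 - 30 * np.cos(2 * np.pi * (dates.dayofyear.to_numpy() - 20) / 365.25)
w_demo = pd.DataFrame({"maxt": seasonal + 11, "mint": seasonal - 11}, index=dates)

plt.figure()

# Passing a Series lets Matplotlib take the x values from its date index
plt.plot(w_demo["maxt"])

# Labels, title, and rotated tick labels
plt.xlabel("Date")
plt.ylabel("Max temperature (F)")
plt.title("Daily maximum temperature")
plt.xticks(rotation=45)

plt.tight_layout()
```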



## Plotting multiple lines

**Task:** Plot both the 2020 minimum and maximum temperatures as separate lines on the same plot

In [54]:
# We store the slice into 'w' as w20 for convenience (we will use it over and over again)


# Calling `plot` multiple times adds additional lines to the same figure

# Add a legend to distinguish between the two curves
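A sketch of plotting two lines with a legend, again on synthetic data. `w20` follows the name suggested in the comments above; everything else is made up.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

dates = pd.date_range("2019-01-01", "2020-12-31", freq="D", name="date")
seasonal = 52 - 30 * np.cos(2 * np.pi * (dates.dayofyear.to_numpy() - 20) / 365.25)
w_demo = pd.DataFrame({"maxt": seasonal + 11, "mint": seasonal - 11}, index=dates)

# A year string with .loc selects every row in that year
w20 = w_demo.loc["2020"]

plt.figure()
plt.plot(w20["mint"], label="Min temperature")
plt.plot(w20["maxt"], label="Max temperature")

# The legend uses the `label` given to each plot call
plt.legend()
```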



## Plotting two plots on one figure

**Task:** Plot both the 2020 minimum and maximum temperatures as separate lines on one plot and 2020 precipitation on the other. Make the two plots stack vertically and share the same time axis.

In [55]:
### Put two plots on one figure

# The `subplot` command allows us to slice up one figure and draw an independent plot into each slice
# In this case, we are slicing the image into 2 rows and 1 column. We select the first slice.


# Now we select the second slice. Notice we share the x-axis. This is so when you interact with either plot,
# the other plot is automatically adjusted to the same scale.
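The stacked layout described in the comments could be sketched like this. The data are synthetic, and `ax_temp` and `ax_pcpn` are hypothetical names for the two slices.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

dates = pd.date_range("2020-01-01", "2020-12-31", freq="D", name="date")
seasonal = 52 - 30 * np.cos(2 * np.pi * (dates.dayofyear.to_numpy() - 20) / 365.25)
rng = np.random.default_rng(1)
w20 = pd.DataFrame(
    {"maxt": seasonal + 11, "mint": seasonal - 11, "pcpn": rng.random(len(dates)).round(2)},
    index=dates,
)

plt.figure()

# 2 rows, 1 column, first slice: temperatures
ax_temp = plt.subplot(2, 1, 1)
plt.plot(w20["mint"], label="Min temperature")
plt.plot(w20["maxt"], label="Max temperature")
plt.ylabel("Temperature (F)")
plt.legend()

# Second slice: precipitation, sharing the x-axis with the first slice
ax_pcpn = plt.subplot(2, 1, 2, sharex=ax_temp)
plt.plot(w20["pcpn"])
plt.ylabel("Precipitation")

plt.tight_layout()
```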



## Grouping and reducing

In Pandas, you can group a DataFrame by a field or condition and then apply an operation to each group. The output is a new DataFrame indexed by the "group by" value and with values equal to the return of each operation.

**Task:** Find the maximum precipitation for each year
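A minimal group-and-reduce sketch on a few hand-made rows. Grouping by `index.year` is one option among several; the `pcpn` column name is assumed.

```python
import pandas as pd

dates = pd.to_datetime(
    ["2019-03-01", "2019-07-04", "2019-11-20", "2020-02-14", "2020-06-30", "2020-09-09"]
)
w_demo = pd.DataFrame({"pcpn": [0.2, 1.5, 0.0, 0.7, 0.3, 2.1]}, index=dates)

# Group rows by calendar year, then take the max of each group
yearly_max_pcpn = w_demo.groupby(w_demo.index.year)["pcpn"].max()

print(yearly_max_pcpn)
```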

## Merge two group reductions

**Task:** Find the maximum and minimum temperature for each year
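Two reductions over the same grouping can be combined side by side, for example with `pd.concat`. The sketch below uses made-up values; the output column names `max_temp` and `min_temp` are hypothetical.

```python
import pandas as pd

dates = pd.to_datetime(
    ["2019-03-01", "2019-07-04", "2019-11-20", "2020-02-14", "2020-06-30", "2020-09-09"]
)
w_demo = pd.DataFrame(
    {"maxt": [40, 91, 35, 28, 88, 79], "mint": [22, 68, 12, -4, 61, 55]},
    index=dates,
)

by_year = w_demo.groupby(w_demo.index.year)

# Each reduction is a Series indexed by year
yearly_max = by_year["maxt"].max().rename("max_temp")
yearly_min = by_year["mint"].min().rename("min_temp")

# axis=1 lines the two Series up as columns on their shared year index
yearly_extremes = pd.concat([yearly_max, yearly_min], axis=1)

print(yearly_extremes)
```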

## Plot group by reduction

**Task:** Make an error bar plot of temperature over a year

In [78]:
# Day of year: Jan 1 -> 0, Jan 2 -> 1, Jan 3 -> 2, ... without regard to the actual year


# You can select a column out of the "GroupBy" before processing them
# You can use the resulting DataFrame just like before, e.g., plotting.
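One way to build the error bar plot is to group by day of year and use the mean as the point and the standard deviation as the bar. This is a sketch on noisy synthetic data. Note that pandas' `dayofyear` starts at 1, so 1 is subtracted here to match the zero-based numbering in the comment above.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic seasonal temperatures with random day-to-day noise
dates = pd.date_range("2016-01-01", "2020-12-31", freq="D", name="date")
rng = np.random.default_rng(2)
seasonal = 63 - 30 * np.cos(2 * np.pi * (dates.dayofyear.to_numpy() - 20) / 365.25)
w_demo = pd.DataFrame({"maxt": seasonal + rng.normal(0, 5, len(dates))}, index=dates)

# Zero-based day of year: Jan 1 -> 0
dayofyear = w_demo.index.dayofyear - 1

# Select the column out of the GroupBy before reducing
by_day = w_demo.groupby(dayofyear)["maxt"]
day_mean = by_day.mean()
day_std = by_day.std()

plt.figure()
plt.errorbar(day_mean.index, day_mean, yerr=day_std, fmt="o", markersize=2)
plt.xlabel("Day of year")
plt.ylabel("Max temperature (F)")
```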



## Combining DataFrame and GroupBy data on one plot

**Task:** Overlay the average minimum and maximum temperatures on the 2020 data.
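A possible approach, assuming "average" means the day-of-year average across all years: compute the averages with a group-by, look them up for each 2020 date, and plot them next to the 2020 series. The data and the names `day_avg` and `avg_2020` are made up for this sketch.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

dates = pd.date_range("2016-01-01", "2020-12-31", freq="D", name="date")
rng = np.random.default_rng(3)
seasonal = 52 - 30 * np.cos(2 * np.pi * (dates.dayofyear.to_numpy() - 20) / 365.25)
noise = rng.normal(0, 5, len(dates))
w_demo = pd.DataFrame({"maxt": seasonal + 11 + noise, "mint": seasonal - 11 + noise}, index=dates)

w20 = w_demo.loc["2020"]

# Average min and max temperature for each day of year, over all years
day_avg = w_demo.groupby(w_demo.index.dayofyear)[["mint", "maxt"]].mean()

# Look up the average for each 2020 date, then index the result by those dates
avg_2020 = day_avg.loc[w20.index.dayofyear]
avg_2020.index = w20.index

plt.figure()
plt.plot(w20["mint"], label="2020 min")
plt.plot(w20["maxt"], label="2020 max")
plt.plot(avg_2020["mint"], "--", label="Average min")
plt.plot(avg_2020["maxt"], "--", label="Average max")
plt.legend()
```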

# Growing Degree Units
## Computing GDUs for an ACRE field using public ACIS/PRISM gridded weather data

Let's start by simply plotting this year's current GDU curve.

We didn't fetch 2021 weather data in our earlier request to ACIS. So, here we use the same code with new date ranges to download up to 2021-07-01 (today's date as of this writing). Just like before, we finish up by converting the fetched data into a Pandas DataFrame.

### Planting records
Let's also note the planting records for this field.

- The field was (hypothetically) planted with Dekalb DKC64-35RIB.
- The manufacturer claims 2954 GDUs to black layer ([Dekalb DKC64-35RIB Datasheet](https://cdn.websites.hibu.com/f091a3ebdd4e480a8da11c597fdbfb00/files/uploaded/DKC64-35RIB.pdf)).
- The field was (hypothetically) planted on April 25th, 2021.

In [101]:
plant_date = "2021-04-25"
gdu_to_black = 2594

now_date = "2021-07-01"

# We will use the "standard" corn 86/50 max/base values
t_max = 86
t_base = 50

# Fetch data from ACIS, let's call it "w_field" so we all have the same name


# Create a Pandas DataFrame to hold the min and max temperature and precipitation data


# Convert the date "string" into a date object (so Pandas understands it as a date)


# Then make it the primary index so we can easily slice the data up by time



## GDU accumulation from planting date

**Recall**: 

$GDU = \frac{T_\textrm{high} + T_\textrm{low}}{2} - T_\textrm{base}$

where $T_\textrm{high}$ is replaced by $T_\textrm{max}$ whenever $T_\textrm{high} > T_\textrm{max}$. Any negative GDU is replaced with zero.

**Task:** Compute the current GDU accumulation for the field

In [123]:
# "Modify" the temp data for corn's upper threshold max


# Save the daily GDU totals back to the DataFrame as a new column


# Negative GDUs are assumed to be zero ... that is a colder day doesn't reduce the growth


# Determine the total GDUs from the plant date
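The four commented steps can be sketched on a handful of made-up days. This assumes the usual form of the calculation, in which the base temperature is subtracted from the daily average; `w_field_demo` and its values are hypothetical.

```python
import pandas as pd

t_max = 86
t_base = 50
plant_date = "2021-04-25"

# Six made-up days around the planting date
dates = pd.date_range("2021-04-23", periods=6, freq="D", name="date")
w_field_demo = pd.DataFrame(
    {"maxt": [70, 90, 86, 60, 52, 80], "mint": [50, 62, 58, 44, 38, 56]},
    index=dates,
)

# Cap the daily high at corn's upper threshold
capped_maxt = w_field_demo["maxt"].clip(upper=t_max)

# Daily GDU: average of the capped high and the low, minus the base temperature
w_field_demo["gdu"] = (capped_maxt + w_field_demo["mint"]) / 2 - t_base

# Negative GDUs count as zero
w_field_demo["gdu"] = w_field_demo["gdu"].clip(lower=0)

# Total GDUs from the planting date onward
gdu_total = w_field_demo.loc[plant_date:, "gdu"].sum()

print(w_field_demo)
print("GDUs since planting:", gdu_total)
```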



## Estimating the current growth stage

**Recall:**

$V = 42 \frac{GDU}{GDU_\textrm{black}} - 2.23$

$R = 10.3 \frac{GDU}{GDU_\textrm{black}} - 4.37$

**Task:** Using the given growth stage formulas, estimate the current stage.

In [126]:
# Note: these formulas are fit to data found in the literature


# Note: A "negative" R indicates that the plant is not yet in the reproductive stage.
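As a worked example of the two formulas, the sketch below plugs in a hypothetical accumulation of 1000 GDUs (not a measured value) together with the `gdu_to_black` value from the code cell above.

```python
gdu_to_black = 2594   # value from the planting-records code cell
gdu_total = 1000.0    # hypothetical accumulation, for illustration only

fraction = gdu_total / gdu_to_black

# Vegetative and reproductive stage estimates
v_stage = 42 * fraction - 2.23
r_stage = 10.3 * fraction - 4.37

# A negative R means the plant has not reached the reproductive stage yet
in_reproductive = r_stage >= 0

print(f"V stage: {v_stage:.1f}, R stage: {r_stage:.1f}, reproductive: {in_reproductive}")
```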


## Computing historical GDU curves

**Task:** Compute the year long GDU curve for each year and plot them all on the same figure.

_Note: A "GDU curve" is the cumulative sum of the GDU from some starting point. That is usually the first of the year, unless you are studying a particular field, in which case it is the planting date, so the curve shows the GDUs accumulated by the corn itself._

In [133]:
# Create a helper date range for plotting "day of year" data. Use this as the x-axis data for any plot where the 
# y data is based on "day of year" to get a real date on the x-axis
y21 = pd.date_range('2021-01-01', periods=365)

# Save a copy of the weather data as "w_corn" because we are going to "modify" the temperature data
w_corn = w.copy()

# "Modify" the temp curve for corn max


# Compute the GDU values (be sure to limit negative values)

# The 'cumsum' function is "cumulative sum". That means the output of a 'cumsum' operation is a vector the
# same length as the input vector. However, the output vector is like:
#   index 0: equal to original value at index 0
#   index 1: equal to original value at index 0 + original value at index 1
#   index 2: equal to output at index 1 + original value at index 2
#   index 3: equal to output at index 2 + original value at index 3
#   index N: equal to output at index N-1 + original value at index N
#
# In this case, we are 'cumsum'ing down the GDUs, grouped by year. So we will end up with a DataFrame indexed by
# the same date column.

# Now, re-group the accumulation by year again and loop over each, plotting a line each time (year).
for year, gdu_curve in ...:
    # We only plot from 0:365 to avoid an x and y mismatch
    plt.plot(y21, gdu_curve[0:365], label=year)
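The whole cell might be filled in along the lines of this sketch, which runs on three years of synthetic temperatures. The names `w_corn_demo` and `gdu_cum` are hypothetical, and grouping by `index.year` is only one way to write the loop.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

t_max, t_base = 86, 50

# Synthetic seasonal temperatures, for illustration only
dates = pd.date_range("2018-01-01", "2020-12-31", freq="D", name="date")
seasonal = 52 - 30 * np.cos(2 * np.pi * (dates.dayofyear.to_numpy() - 20) / 365.25)
w_corn_demo = pd.DataFrame({"maxt": seasonal + 11, "mint": seasonal - 11}, index=dates)

# Cap the highs, then compute daily GDUs with negatives set to zero
w_corn_demo["maxt"] = w_corn_demo["maxt"].clip(upper=t_max)
w_corn_demo["gdu"] = ((w_corn_demo["maxt"] + w_corn_demo["mint"]) / 2 - t_base).clip(lower=0)

# Grouping by year makes the cumulative sum restart every Jan 1
years = w_corn_demo.index.year
w_corn_demo["gdu_cum"] = w_corn_demo.groupby(years)["gdu"].cumsum()

y21 = pd.date_range("2021-01-01", periods=365)

plt.figure()
for year, gdu_curve in w_corn_demo.groupby(years)["gdu_cum"]:
    # Keep the first 365 days so leap years still match the x-axis
    plt.plot(y21, gdu_curve.iloc[0:365], label=year)
plt.legend()
```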


## Plot the min, max, and average curves

**Task:** Plot the historical extremes of the GDU curve, the average GDU curve, the current year's GDU curve, and a projection for the rest of the year on one plot. Use a graphical technique to estimate the date of the next growth stage.

In [134]:
#######
### Find the historical extremes
#######

# We will need to know the plant date's day of year
plant_dayofyear = pd.Period(plant_date).dayofyear

# Slice the historical dataset down to days after plant date's calendar day


# Compute the curves from that new starting date


# Find the min, max, and avg accumulations for each day over the years


# Helper date range computations


# Plot the historical extremes (plt.fill_between might be nice for this)


# Plot the historical average


#######
### Find this year's GDU curve
#######

# Compute this year's GDU curve from plant date


# Plot the current year's GDU curve


#######
### Project this year's GDU curve out using the historical average
#######

# Compute the projection


# Helper date range computations


# Plot the projection of the current year's GDU curve


#######
### Plot stage V16
#######
# Note: plt.hlines will plot a horizontal line on the plot
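The outline above could be filled in roughly as follows. This is a sketch on synthetic weather, not the notebook's reference solution. The helpers `synthetic_weather` and `daily_gdu` are hypothetical, the projection simply adds the historical average daily gain to today's total, and the V16 threshold comes from inverting the V formula given earlier.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

t_max, t_base = 86, 50
plant_date, now_date = "2021-04-25", "2021-07-01"
gdu_to_black = 2594

def synthetic_weather(start, end, shift):
    """Made-up seasonal temperatures (illustration only); `shift` warms or cools them."""
    dates = pd.date_range(start, end, freq="D", name="date")
    seasonal = 52 + shift - 30 * np.cos(2 * np.pi * (dates.dayofyear.to_numpy() - 20) / 365.25)
    return pd.DataFrame({"maxt": seasonal + 11, "mint": seasonal - 11}, index=dates)

def daily_gdu(frame):
    """Daily GDU with the high capped at t_max and negative values set to zero."""
    avg = (frame["maxt"].clip(upper=t_max) + frame["mint"]) / 2
    return (avg - t_base).clip(lower=0)

shifts = {2015: -3, 2016: -1, 2017: 0, 2018: 1, 2019: 2, 2020: 3}
hist = pd.concat([synthetic_weather(f"{y}-01-01", f"{y}-12-31", s) for y, s in shifts.items()])
this_year = synthetic_weather(plant_date, now_date, -3)

plant_dayofyear = pd.Period(plant_date).dayofyear
now_dayofyear = pd.Period(now_date).dayofyear

# Historical curves, each restarted at the planting calendar day (day 366 dropped)
doy = hist.index.dayofyear
hist = hist[(doy >= plant_dayofyear) & (doy <= 365)].copy()
hist["gdu_cum"] = daily_gdu(hist).groupby(hist.index.year).cumsum()

# Min, max, and average accumulation for each day of year
by_day = hist.groupby(hist.index.dayofyear)["gdu_cum"]
stats = pd.DataFrame({"min": by_day.min(), "max": by_day.max(), "avg": by_day.mean()})
stat_dates = pd.date_range(plant_date, periods=len(stats))

plt.figure()
plt.fill_between(stat_dates, stats["min"], stats["max"], alpha=0.3, label="Historical range")
plt.plot(stat_dates, stats["avg"], label="Historical average")

# This year's curve from the planting date
this_year["gdu_cum"] = daily_gdu(this_year).cumsum()
plt.plot(this_year.index, this_year["gdu_cum"], label="2021 so far")

# Projection: today's total plus the historical average daily gain
avg_gain = stats["avg"].diff().loc[now_dayofyear + 1:]
proj_dates = pd.date_range(pd.Timestamp(now_date) + pd.Timedelta(days=1), periods=len(avg_gain))
projection = pd.Series(this_year["gdu_cum"].iloc[-1] + avg_gain.cumsum().to_numpy(), index=proj_dates)
plt.plot(projection.index, projection, "--", label="Projection")

# GDUs needed for V16, from inverting V = 42 * GDU / GDU_black - 2.23
gdu_v16 = (16 + 2.23) / 42 * gdu_to_black
plt.hlines(gdu_v16, stat_dates[0], stat_dates[-1], colors="k", linestyles=":", label="V16")
plt.legend()

# First date on which the observed-plus-projected curve crosses the V16 line
combined = pd.concat([this_year["gdu_cum"], projection])
v16_date = combined[combined >= gdu_v16].index[0]
print("Estimated V16 date:", v16_date.date())
```

Reading the date where the curve crosses the horizontal line is the graphical estimate; `v16_date` is the same crossing computed numerically.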
