# Calculate Yield Response to Variable Nitrogen Rates
---

**Name**: Adrian Correndo

**Semester**: Spring 2019

**Project area**: Agronomy

## **Objective**

Automating the calculation of grain yield (GY) response to different rates of nitrogen (N) fertilizer, related fertilizer use efficiencies (NUE). For each experiment, get the GY with no N added (Y0) and the maximum observed yield (Ymax). Create subgroups of experiments based on soil texture (STx).

## **Data example**
![Example](https://github.com/adriancorrendo/project/Scatter.JPG)

## **Outcomes**

*.csv file with up to 8 columns: Trial, STx, Nrate, GY, Y0, Ymax, NR, and NUE, where:

**-Y0**: GY when Nrate=0;
**-Ymax**: maximum observed GY;
**-NR**: abs. nitrogen response corresponding to each fertilizer rate different from 0.

Challenge could be related to: 

i) Nrate levels (# and kg applied N) vary across trials;
ii) Y0 and Ymax values happen at **Trial** level, while the NR and NUE values, at a sub-level by a given **Trial-Nrate combination**.

## **Rationale**

Database compund by hundreds of corn nitrogen fertilizer experiments. Automating these calculations will save me a significant amount of time when processing and analyzing the database.

## **Sketch**
![Main steps of the project](https://github.com/adriancorrendo/project/sketch.jpg)

## **Coding**
### Importing modules and datafile

In [232]:
import glob
import pandas as pd
import numpy as np
glob.os.chdir('C:/Users/correndo/Desktop/Coding/project2/')
data = pd.read_csv('CornNFR.csv') # File
df = pd.DataFrame(data)
df.head()

Unnamed: 0,TRIAL,TEXT,Nrate,GY
0,39,clay,0,3.7
1,39,clay,36,5.057
2,39,clay,62,5.392
3,39,clay,89,7.774
4,39,clay,116,8.765


In [237]:
sdf = df.sort_values(['TRIAL', 'Nrate'],ascending=True) # Sorting by trials and then Nrate in Ascending order
sdf = pd.DataFrame(sdf) # Print first 20 values
sdf.head(10)

Unnamed: 0,TRIAL,TEXT,Nrate,GY
329,1,silty_clay,0,13.317
330,1,silty_clay,84,14.434
331,1,silty_clay,140,15.267
332,1,silty_clay,196,15.405
333,1,silty_clay,280,15.496
334,2,silty_clay,0,6.604
335,2,silty_clay,84,10.691
336,2,silty_clay,140,11.122
337,2,silty_clay,196,11.246
338,2,silty_clay,280,11.722


In [238]:
# Filtering for N0 plots using pandas "CHAINING OPERATION" (also possible with BOOLEAN VARIABLE)
N0_plots = pd.DataFrame(sdf[sdf.Nrate == 0])
#print(N0_plots.head(10))
N0_plots.head()

Unnamed: 0,TRIAL,TEXT,Nrate,GY
329,1,silty_clay,0,13.317
334,2,silty_clay,0,6.604
9,3,loamy_sand,0,9.088
73,4,silt_loam,0,9.742
14,5,loamy_sand,0,12.172


In [239]:
# Filtering for N fertilized plots (Nrate > 0) using pandas "CHAINING OPERATION" (also possible with BOOLEAN VARIABLE)
Nf_plots = pd.DataFrame(sdf[sdf.Nrate > 0]) #could use "!= 0" command for 'different' from zero
print("The total number of N fertilized plots is:", len(Nf_plots))
Nrates = Nf_plots['Nrate'].unique().tolist()
print("The number of different N rates is:", len(Nrates))
Nrates_freq = Nf_plots['Nrate'].value_counts().tolist()
#Table with all the N rates and their frequency
Nrates = pd.DataFrame(data=({'N rate': Nrates, 'Counts': Nrates_freq}))
Nrates = Nrates.sort_values(['N rate'],ascending=True)
Nrates.head()

The total number of N fertilized plots is: 345
The number of different N rates is: 34


Unnamed: 0,N rate,Counts
26,27,1
19,36,3
27,54,1
10,56,16
20,62,3


In [240]:
#How many unique types of soil texture are in the database?
text_class = N0_plots['TEXT'].unique().tolist() 
text_freq = N0_plots['TEXT'].value_counts().tolist()
print("The number of different soil texture classes is:", len(texture))
STx = pd.DataFrame(data=({'Texture Class': text_class, 'Frequency': text_freq}))
STx

The number of different soil texture classes is: 6


Unnamed: 0,Texture Class,Frequency
0,silty_clay,36
1,loamy_sand,8
2,silt_loam,7
3,sandy_loam,4
4,clay,3
5,silty_clay_loam,1


In [245]:
trial_1 = pd.DataFrame(sdf[sdf.TRIAL == 1])
for i in trial_1:
    max_GY = trial_1.GY.max()
print('The Ymax of Trial#1 is:', max_GY, 't/ha')
trial_1.head()

The Ymax of Trial#1 is: 15.495999999999999 t/ha


Unnamed: 0,TRIAL,TEXT,Nrate,GY
329,1,silty_clay,0,13.317
330,1,silty_clay,84,14.434
331,1,silty_clay,140,15.267
332,1,silty_clay,196,15.405
333,1,silty_clay,280,15.496


In [214]:
# group data by TRIAL
trials = sdf.groupby("TRIAL")

In [302]:
# For loop to create sub data frames for each trial
for TRIAL, trials_df in trials:
    print(trials_df)

     TRIAL        TEXT  Nrate      GY
329      1  silty_clay      0  13.317
330      1  silty_clay     84  14.434
331      1  silty_clay    140  15.267
332      1  silty_clay    196  15.405
333      1  silty_clay    280  15.496
     TRIAL        TEXT  Nrate      GY
334      2  silty_clay      0   6.604
335      2  silty_clay     84  10.691
336      2  silty_clay    140  11.122
337      2  silty_clay    196  11.246
338      2  silty_clay    280  11.722
    TRIAL        TEXT  Nrate      GY
9       3  loamy_sand      0   9.088
10      3  loamy_sand     84  11.854
11      3  loamy_sand    140  12.826
12      3  loamy_sand    196  13.150
13      3  loamy_sand    280  13.086
    TRIAL       TEXT  Nrate      GY
73      4  silt_loam      0   9.742
74      4  silt_loam     84  11.287
75      4  silt_loam    140  11.037
76      4  silt_loam    196  10.963
77      4  silt_loam    280  11.473
    TRIAL        TEXT  Nrate      GY
14      5  loamy_sand      0  12.172
15      5  loamy_sand    125  12

In [310]:
for i in trials:
    max = trials.GY.max()
Ymax = pd.DataFrame(max)
Ymax.columns = ['GYmax']
Ymax.head()

Unnamed: 0_level_0,GYmax
TRIAL,Unnamed: 1_level_1
1,15.496
2,11.722
3,13.15
4,11.473
5,14.523


Merge data of Y0 and Ymax data

In [360]:
df_Y = pd.DataFrame(pd.merge(N0_plots, Ymax, on='TRIAL'))

In [361]:
df_Y = df_Y.drop(columns=['Ymax', 'Nrate'])
df_Y.columns = ['TRIAL', 'STx', 'Y0', 'Ymax']
df_Y.head()

Unnamed: 0,TRIAL,STx,Y0,Ymax
0,1,silty_clay,13.317,15.496
1,2,silty_clay,6.604,11.722
2,3,loamy_sand,9.088,13.15
3,4,silt_loam,9.742,11.473
4,5,loamy_sand,12.172,14.523


In [362]:
for i in df_Y:
    Max_Nresp = df_Y.Ymax - df_Y.Y0
Max_Nresp = pd.DataFrame(Max_Nresp)
Max_Nresp.columns = ['MaxNR']
Max_Nresp.head()

Unnamed: 0,MaxNR
0,2.179
1,5.118
2,4.062
3,1.731
4,2.351


In [363]:
df_Y.insert(4, 'Delta-Y', Max_Nresp.MaxNR)
df_Y.head()

Unnamed: 0,TRIAL,STx,Y0,Ymax,Delta-Y
0,1,silty_clay,13.317,15.496,2.179
1,2,silty_clay,6.604,11.722,5.118
2,3,loamy_sand,9.088,13.15,4.062
3,4,silt_loam,9.742,11.473,1.731
4,5,loamy_sand,12.172,14.523,2.351


**Next Steps:**
 - Exploring N responses to each fertilizer rate.
 - Estimating their corresponding efficiencies.

In [None]:
NR = GY(for j in Nrate if Nrate > 0) - GY (when Nrate ==0) # 'j' is a level of Nrate nested in TRIAL