# RIDGE REGRESSION    
--- 
### Reason for using Ridge Regression: 
+ As an initial modeling step, we use three algorithms (Ridge Regression, LASSO, and PCR, Principal Component Regression) to identify the predictors that have a strong impact on predicting crop yield. 
+ We use Ridge Regression mainly for two reasons:      
    + The dataset has 161 predictors and 42007 records. When we model at the eco district level, however, the ratio of observations to predictors is small: the largest eco district has 1199 records, so the number of observations is not much larger than the number of predictors.     
    + With this many predictors there is a high chance of collinearity. Ridge regression does not remove multicollinearity, but it stabilizes the coefficient estimates when predictors are correlated.   
+ It introduces a little bias so that the variance can be substantially reduced, which can lead to a lower overall MSE.    
+ It is a regularization algorithm: it adds an L2 penalty on the coefficients to the least-squares loss. 
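
The effect of the penalty on correlated predictors can be illustrated with a small sketch. It is not part of the notebook's pipeline: the data is synthetic, and the helper `ridge_coefficients` is a hypothetical name that implements the textbook closed-form solution for centered data without an intercept. With two nearly identical predictors, the unpenalized fit (`alpha=0`) tends to produce large, unstable coefficients, while the ridge fit keeps them small.

```python
import numpy as np

def ridge_coefficients(X, y, alpha):
    """Closed-form ridge solution (X'X + alpha*I)^-1 X'y for centered data, no intercept."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

# Synthetic data: x2 is almost a copy of x1, so the two predictors are strongly collinear.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)
X = np.column_stack([x1, x2])
X = X - X.mean(axis=0)                      # center the predictors
y = 3.0 * x1 + rng.normal(scale=0.5, size=n)
y = y - y.mean()                            # center the target

beta_ols = ridge_coefficients(X, y, alpha=0.0)     # ordinary least squares
beta_ridge = ridge_coefficients(X, y, alpha=10.0)  # ridge with an L2 penalty

print("OLS coefficients:  ", beta_ols)
print("Ridge coefficients:", beta_ridge)
```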
--- 
### Overview of what has been done for modeling in this file: 
+ The first step in ridge regression is to **standardize** the predictors. The ridge penalty depends on the scale of each variable, so all calculations are done on standardized variables.    
+ We also used **cross-validation** to choose the regularization strength and to test how well the model predicts data that was not used to fit it. This flags problems such as overfitting or selection bias and indicates how the model will generalize to an independent (new) dataset. 
+ We then calculated, for each eco district model, the training and test error (the code takes the square root of the MSE, so the reported values are root mean squared errors, RMSE), R squared for the training and test sets, the Mean Absolute Error (MAE), and an accuracy figure defined as 100 minus the mean absolute percentage error (MAPE). 
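
As a quick reference for how these metrics relate to each other, here is a minimal sketch on four made-up yield values (the numbers are illustrative, not taken from the dataset). It uses `mean_absolute_error` from scikit-learn, which the notebook itself does not import; the notebook computes the MAE by hand.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Made-up observed and predicted yields (kg/acre), for illustration only
y_true = np.array([800.0, 1000.0, 1250.0, 500.0])
y_pred = np.array([760.0, 1050.0, 1200.0, 550.0])

mse = mean_squared_error(y_true, y_pred)    # mean of the squared errors
rmse = np.sqrt(mse)                         # same units as the target
mae = mean_absolute_error(y_true, y_pred)   # mean of the absolute errors
r2 = r2_score(y_true, y_pred)               # share of variance explained (1.0 is perfect)

# Mean absolute percentage error and the "accuracy" derived from it
mape = np.mean(np.abs(y_true - y_pred) / y_true) * 100
accuracy = 100 - mape

print("MSE:", mse, "RMSE:", rmse, "MAE:", mae)
print("R squared:", r2, "Accuracy (%):", accuracy)
```

Note that the accuracy figure is only meaningful when the observed values are well away from zero, because each error is divided by the observed yield.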

--- 

**General information**: During data exploration we found 144 eco districts. After data wrangling and joining the four datasets into one final dataset, 137 eco districts remain, so the rest of the analysis is carried out for these 137 eco districts.     

**NOTE: The first model (the model for the first eco district) is commented in detail. The code for all of the other models follows the same pattern.** 

---
---

## Code: 

In [1]:
# importing all of the required libraries. 
# various functions are imported from different libraries and packages. 

import pandas as pd    # pandas
import numpy as np     # numpy
from sklearn import model_selection   # from sklearn package imported model_selection
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import Ridge
from sklearn.linear_model import Lasso
from sklearn.linear_model import ElasticNet
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import RepeatedKFold
from sklearn.linear_model import RidgeCV
from sklearn.preprocessing import scale
from math import sqrt

In [2]:
dataframe = pd.read_csv("aafc_data.csv")  # importing the dataset
# summarize shape
print(dataframe.shape)
print("\n")
# summarize first few lines
dataframe.head()

(42007, 166)




Unnamed: 0.1,Unnamed: 0,TWP_ID,ECODISTRICT_ID,YEAR,YieldKgAcre,SumPcpn18_20,SumPcpn19_21,SumPcpn20_22,SumPcpn21_23,SumPcpn22_24,...,SoilMoisture29_31,SoilMoisture30_32,SoilMoisture31_33,SoilMoisture32_34,SoilMoisture33_35,SoilMoisture34_36,SoilMoisture35_37,SoilMoisture36_38,SoilMoisture37_39,SoilMoisture38_40
0,0,00101E1,852.0,2010,867.766846,53.6,111.1,109.7,117.9,46.4,...,16.960125,18.766207,17.186998,15.461519,19.738222,22.958089,27.206203,26.480087,28.678156,26.308484
1,1,00101W1,852.0,2010,673.685028,57.2,114.7,110.5,114.0,46.2,...,16.32852,17.926029,16.787544,14.779726,20.245149,23.608204,28.56099,27.324254,29.079177,26.927224
2,2,00101W2,796.0,2010,824.303864,39.0,96.4,109.8,101.2,111.4,...,13.117879,12.869142,12.831834,14.126196,16.385776,18.650751,20.287069,20.514132,19.564788,16.681692
3,3,00102E1,853.0,2010,1006.708496,37.5,158.2,157.8,161.4,46.9,...,17.060778,18.699156,17.345822,15.998957,20.091525,22.761273,26.33743,25.559602,27.611729,25.575794
4,4,00102W1,852.0,2010,869.040283,57.2,114.7,110.5,114.0,46.2,...,16.050993,17.55686,16.612026,14.48015,20.467884,23.893858,29.156274,27.695178,29.255386,27.199097


In [3]:
# dropping those columns that are not useful
dataframe.drop(['Unnamed: 0', 'TWP_ID', 'YEAR'], axis=1, inplace=True)

In [4]:
# looking at the dataset after dropping the columns 
dataframe.head()

Unnamed: 0,ECODISTRICT_ID,YieldKgAcre,SumPcpn18_20,SumPcpn19_21,SumPcpn20_22,SumPcpn21_23,SumPcpn22_24,SumPcpn23_25,SumPcpn24_26,SumPcpn25_27,...,SoilMoisture29_31,SoilMoisture30_32,SoilMoisture31_33,SoilMoisture32_34,SoilMoisture33_35,SoilMoisture34_36,SoilMoisture35_37,SoilMoisture36_38,SoilMoisture37_39,SoilMoisture38_40
0,852.0,867.766846,53.6,111.1,109.7,117.9,46.4,69.3,60.0,44.6,...,16.960125,18.766207,17.186998,15.461519,19.738222,22.958089,27.206203,26.480087,28.678156,26.308484
1,852.0,673.685028,57.2,114.7,110.5,114.0,46.2,68.1,55.9,34.9,...,16.32852,17.926029,16.787544,14.779726,20.245149,23.608204,28.56099,27.324254,29.079177,26.927224
2,796.0,824.303864,39.0,96.4,109.8,101.2,111.4,153.0,163.6,98.8,...,13.117879,12.869142,12.831834,14.126196,16.385776,18.650751,20.287069,20.514132,19.564788,16.681692
3,853.0,1006.708496,37.5,158.2,157.8,161.4,46.9,79.5,67.5,40.4,...,17.060778,18.699156,17.345822,15.998957,20.091525,22.761273,26.33743,25.559602,27.611729,25.575794
4,852.0,869.040283,57.2,114.7,110.5,114.0,46.2,68.1,55.9,34.9,...,16.050993,17.55686,16.612026,14.48015,20.467884,23.893858,29.156274,27.695178,29.255386,27.199097


---
### The 10 eco districts with the largest number of records are modeled separately.   


| Ecodistrict | Count of Unique Township IDs |Count of Records |     
| :----: | :-----: | :-----: |
|748| 101 | 1199|    
|826|107|1177|    
|752|101|1111|
|745|100|1100|
|808|97|1067|
|792|92|1012|
|849|91|1001|
|729|86|946|
|753|86|946|
|709|83|913|
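
A table like this can be produced with a pandas `groupby`. The sketch below uses a tiny made-up dataframe named `toy`; in the notebook it would have to run on `dataframe` before the `TWP_ID` column is dropped. The output column names `unique_townships` and `records` are chosen here for illustration.

```python
import pandas as pd

# Tiny made-up stand-in for the real dataset
toy = pd.DataFrame({
    "TWP_ID": ["a", "a", "b", "c", "c", "d"],
    "ECODISTRICT_ID": [748, 748, 748, 748, 826, 826],
})

summary = (
    toy.groupby("ECODISTRICT_ID")
       .agg(unique_townships=("TWP_ID", "nunique"),   # distinct township IDs
            records=("TWP_ID", "count"))              # number of rows
       .sort_values("records", ascending=False)
)

top_districts = summary.head(10)   # the 10 eco districts with the most records
print(top_districts)
```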

---
### # For Eco District ID: 748         


In [5]:
# create a dataframe for one eco district and drop the 'ECODISTRICT_ID' column before modeling
# (drop() without inplace=True returns a new dataframe, which avoids pandas' SettingWithCopyWarning)
df1 = dataframe[dataframe['ECODISTRICT_ID']==748].drop(columns=['ECODISTRICT_ID'])

# split data into X and y
x = pd.DataFrame(df1.drop(labels=['YieldKgAcre'], axis=1)) # x contains the predictors (everything except the target 'YieldKgAcre')
y = pd.DataFrame(df1['YieldKgAcre'])                       # y contains the dependent variable ('YieldKgAcre')


In [6]:
# splitting the x and y datasets into train and test sets with a 70:30 ratio
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.30, random_state=40)

# standardize the X variables
# note: scale() standardizes each array with its own mean and standard deviation,
# so the test set is scaled with test-set statistics rather than training-set statistics
X_train = scale(X_train)
X_test = scale(X_test)

# looking at the shapes of the train and test splits
print(X_train.shape); print(X_test.shape)


(839, 161)
(360, 161)
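
A common alternative to calling `scale()` on each split separately is to learn the mean and standard deviation from the training set only and reuse them for the test set, so that no test-set information enters the preprocessing. The sketch below shows this with scikit-learn's `StandardScaler` on synthetic data. It is not what the notebook does, and applying it would change the reported numbers.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic predictors and target, for illustration only
rng = np.random.default_rng(40)
X_demo = rng.normal(loc=50.0, scale=10.0, size=(100, 3))
y_demo = rng.normal(size=100)

X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.30, random_state=40)

scaler = StandardScaler().fit(X_tr)   # learn mean and std from the training set only
X_tr_std = scaler.transform(X_tr)     # training columns now have mean 0 and std 1
X_te_std = scaler.transform(X_te)     # test set reuses the training statistics

print(X_tr_std.shape, X_te_std.shape)
```

With this approach the test columns will generally not have exactly zero mean and unit variance, which is expected.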


In [7]:
# RidgeCV fits ridge regression and selects alpha by cross-validation.
# Parameters: 
# 1. alphas: Array of alpha values to try. Regularization strength; must be a positive float. 
#            Regularization improves the conditioning of the problem and reduces the variance of the estimates. 
#            Larger values specify stronger regularization.    
# 2. cv: Determines the cross-validation splitting strategy (here, 10 folds).
rr = RidgeCV(alphas=[0.1, 1.0, 10.0], cv=10)  

# fitting the model on training data
rr.fit(X_train, y_train)    

pred_train_rr = rr.predict(X_train)  # predict 'YieldKgAcre' for the training set with the fitted model
# np.sqrt() of the mean squared error gives the root mean squared error (RMSE)
print("Train RMSE: ", np.sqrt(mean_squared_error(y_train, pred_train_rr)))
print("R squared training set: ", r2_score(y_train, pred_train_rr)*100)  # R squared for the training set, as a percentage

pred_test_rr = rr.predict(X_test)  # predict 'YieldKgAcre' for the test set with the fitted model
print("Test RMSE: ", np.sqrt(mean_squared_error(y_test, pred_test_rr)))
print("R squared test set: ", r2_score(y_test, pred_test_rr)*100)   # R squared for the test set, as a percentage

# Calculate the absolute errors
errors = abs(pred_test_rr - y_test)

# Print out the mean absolute error (MAE), in the units of the target (kg/acre)
print('Mean Absolute Error:', np.mean(errors), 'kg/acre.')

# Calculate the absolute percentage error of each test prediction
mape = 100 * (errors / y_test)

# Calculate and display the accuracy (100 minus the mean absolute percentage error)
accuracy = 100 - np.mean(mape)
print('Accuracy:', round(accuracy, 2), '%.')

Train RMSE:  76.1919905362895
R squared training set:  84.72096804187296
Test RMSE:  87.90676524170678
R squared test set:  80.78561045557035
Mean Absolute Error: YieldKgAcre    69.988648
dtype: float64 kg/acre.
Accuracy: YieldKgAcre    90.11
dtype: float64 %.
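
After fitting, `RidgeCV` stores the selected regularization strength in its `alpha_` attribute. Inspecting it is a simple way to check whether the candidate grid is wide enough: if the chosen value sits at either end of the grid, a wider grid may be worth trying. The sketch below uses synthetic data and a wider logarithmic grid than the notebook's `[0.1, 1.0, 10.0]`; the grid bounds are an arbitrary choice for illustration.

```python
import numpy as np
from sklearn.linear_model import RidgeCV

# Synthetic regression problem, for illustration only
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(120, 5))
true_coef = np.array([2.0, -1.0, 0.0, 0.5, 0.0])
y_demo = X_demo @ true_coef + rng.normal(scale=0.1, size=120)

alphas = np.logspace(-3, 3, 13)   # 13 candidates from 0.001 to 1000
model = RidgeCV(alphas=alphas, cv=10).fit(X_demo, y_demo)

print("selected alpha:", model.alpha_)
print("at edge of grid:", model.alpha_ in (alphas[0], alphas[-1]))
```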


---
### # For Eco District ID: 826  

In [8]:
df2 = dataframe[dataframe['ECODISTRICT_ID']==826].drop(columns=['ECODISTRICT_ID'])
# split data into X and y
x = pd.DataFrame(df2.drop(labels=['YieldKgAcre'], axis=1))
y = pd.DataFrame(df2['YieldKgAcre'])


In [9]:
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.30, random_state=40)

# Run standardization on X variables
X_train = scale(X_train)
X_test = scale(X_test)

print(X_train.shape); print(X_test.shape)

(823, 161)
(354, 161)


In [10]:
rr = RidgeCV(alphas=[0.1, 1.0, 10.0], cv=10)
rr.fit(X_train, y_train) 
pred_train_rr = rr.predict(X_train)
print("Train RMSE: ", np.sqrt(mean_squared_error(y_train, pred_train_rr)))
print("R squared training set: ", r2_score(y_train, pred_train_rr)*100)

pred_test_rr = rr.predict(X_test)
print("Test RMSE: ", np.sqrt(mean_squared_error(y_test, pred_test_rr))) 
print("R squared test set: ", r2_score(y_test, pred_test_rr)*100)

# Calculate the absolute errors
errors = abs(pred_test_rr - y_test)

# Print out the mean absolute error (MAE) in kg/acre
print('Mean Absolute Error:', np.mean(errors), 'kg/acre.')

# Calculate the absolute percentage errors
mape = 100 * (errors / y_test)
# Calculate and display accuracy (100 minus the mean absolute percentage error)
accuracy = 100 - np.mean(mape)
print('Accuracy:', round(accuracy, 2), '%.')

Train RMSE:  101.09165827651603
R squared training set:  75.60317443634008
Test RMSE:  105.57282197153029
R squared test set:  73.57893809551786
Mean Absolute Error: YieldKgAcre    82.970694
dtype: float64 kg/acre.
Accuracy: YieldKgAcre    87.34
dtype: float64 %.


---
### # For Eco District ID: 752

In [11]:
df3 = dataframe[dataframe['ECODISTRICT_ID']==752].drop(columns=['ECODISTRICT_ID'])
# split data into X and y
x = pd.DataFrame(df3.drop(labels=['YieldKgAcre'], axis=1))
y = pd.DataFrame(df3['YieldKgAcre'])


In [12]:
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.30, random_state=40)

# Run standardization on X variables
X_train = scale(X_train)
X_test = scale(X_test)

print(X_train.shape); print(X_test.shape)

(777, 161)
(334, 161)


In [13]:
rr = RidgeCV(alphas=[0.1, 1.0, 10.0], cv=10)
rr.fit(X_train, y_train) 
pred_train_rr = rr.predict(X_train)
print("Train RMSE: ", np.sqrt(mean_squared_error(y_train, pred_train_rr)))
print("R squared training set: ", r2_score(y_train, pred_train_rr)*100)

pred_test_rr = rr.predict(X_test)
print("Test RMSE: ", np.sqrt(mean_squared_error(y_test, pred_test_rr))) 
print("R squared test set: ", r2_score(y_test, pred_test_rr)*100)

# Calculate the absolute errors
errors = abs(pred_test_rr - y_test)

# Print out the mean absolute error (MAE) in kg/acre
print('Mean Absolute Error:', np.mean(errors), 'kg/acre.')

# Calculate the absolute percentage errors
mape = 100 * (errors / y_test)
# Calculate and display accuracy (100 minus the mean absolute percentage error)
accuracy = 100 - np.mean(mape)
print('Accuracy:', round(accuracy, 2), '%.')

Train RMSE:  78.66255663549335
R squared training set:  86.53688382863933
Test RMSE:  92.89526127482065
R squared test set:  77.86303456647643
Mean Absolute Error: YieldKgAcre    71.969368
dtype: float64 kg/acre.
Accuracy: YieldKgAcre    90.07
dtype: float64 %.


---
### # For Eco District ID: 745  

In [14]:
df4 = dataframe[dataframe['ECODISTRICT_ID']==745].drop(columns=['ECODISTRICT_ID'])
# split data into X and y
x = pd.DataFrame(df4.drop(labels=['YieldKgAcre'], axis=1))
y = pd.DataFrame(df4['YieldKgAcre'])


In [15]:
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.30, random_state=40)

# Run standardization on X variables
X_train = scale(X_train)
X_test = scale(X_test)


print(X_train.shape); print(X_test.shape)

(770, 161)
(330, 161)


In [16]:
rr = RidgeCV(alphas=[0.1, 1.0, 10.0], cv=10)
rr.fit(X_train, y_train) 
pred_train_rr = rr.predict(X_train)
print("Train RMSE: ", np.sqrt(mean_squared_error(y_train, pred_train_rr)))
print("R squared training set: ", r2_score(y_train, pred_train_rr)*100)

pred_test_rr = rr.predict(X_test)
print("Test RMSE: ", np.sqrt(mean_squared_error(y_test, pred_test_rr))) 
print("R squared test set: ", r2_score(y_test, pred_test_rr)*100)

# Calculate the absolute errors
errors = abs(pred_test_rr - y_test)

# Print out the mean absolute error (MAE) in kg/acre
print('Mean Absolute Error:', np.mean(errors), 'kg/acre.')

# Calculate the absolute percentage errors
mape = 100 * (errors / y_test)
# Calculate and display accuracy (100 minus the mean absolute percentage error)
accuracy = 100 - np.mean(mape)
print('Accuracy:', round(accuracy, 2), '%.')

Train RMSE:  74.30969520931816
R squared training set:  89.2187411341816
Test RMSE:  83.60354856900985
R squared test set:  87.42100132377485
Mean Absolute Error: YieldKgAcre    64.359002
dtype: float64 kg/acre.
Accuracy: YieldKgAcre    91.73
dtype: float64 %.


---
### # For Eco District ID: 808  

In [17]:
df5 = dataframe[dataframe['ECODISTRICT_ID']==808].drop(columns=['ECODISTRICT_ID'])
# split data into X and y
x = pd.DataFrame(df5.drop(labels=['YieldKgAcre'], axis=1))
y = pd.DataFrame(df5['YieldKgAcre'])


In [18]:
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.30, random_state=40)   
   
# Run standardization on X variables
X_train = scale(X_train)
X_test = scale(X_test)     
        
        
print(X_train.shape); print(X_test.shape)

(746, 161)
(321, 161)


In [19]:
rr = RidgeCV(alphas=[0.1, 1.0, 10.0], cv=10)
rr.fit(X_train, y_train) 
pred_train_rr = rr.predict(X_train)
print("Train RMSE: ", np.sqrt(mean_squared_error(y_train, pred_train_rr)))
print("R squared training set: ", r2_score(y_train, pred_train_rr)*100)

pred_test_rr = rr.predict(X_test)
print("Test RMSE: ", np.sqrt(mean_squared_error(y_test, pred_test_rr))) 
print("R squared test set: ", r2_score(y_test, pred_test_rr)*100)

# Calculate the absolute errors
errors = abs(pred_test_rr - y_test)

# Print out the mean absolute error (MAE) in kg/acre
print('Mean Absolute Error:', np.mean(errors), 'kg/acre.')

# Calculate the absolute percentage errors
mape = 100 * (errors / y_test)
# Calculate and display accuracy (100 minus the mean absolute percentage error)
accuracy = 100 - np.mean(mape)
print('Accuracy:', round(accuracy, 2), '%.')

Train RMSE:  108.80776688895307
R squared training set:  71.47017415604348
Test RMSE:  125.33205832535845
R squared test set:  60.32007296901117
Mean Absolute Error: YieldKgAcre    93.876136
dtype: float64 kg/acre.
Accuracy: YieldKgAcre    87.32
dtype: float64 %.


---
### # For Eco District ID: 792

In [20]:
df6 = dataframe[dataframe['ECODISTRICT_ID']==792].drop(columns=['ECODISTRICT_ID'])
# split data into X and y
x = pd.DataFrame(df6.drop(labels=['YieldKgAcre'], axis=1))
y = pd.DataFrame(df6['YieldKgAcre'])


In [21]:
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.30, random_state=40)  

# Run standardization on X variables
X_train = scale(X_train)
X_test = scale(X_test)
  
    
print(X_train.shape); print(X_test.shape)

(708, 161)
(304, 161)


In [22]:
rr = RidgeCV(alphas=[0.1, 1.0, 10.0], cv=10)
rr.fit(X_train, y_train) 
pred_train_rr = rr.predict(X_train)
print("Train RMSE: ", np.sqrt(mean_squared_error(y_train, pred_train_rr)))
print("R squared training set: ", r2_score(y_train, pred_train_rr)*100)

pred_test_rr = rr.predict(X_test)
print("Test RMSE: ", np.sqrt(mean_squared_error(y_test, pred_test_rr))) 
print("R squared test set: ", r2_score(y_test, pred_test_rr)*100)

# Calculate the absolute errors
errors = abs(pred_test_rr - y_test)

# Print out the mean absolute error (MAE) in kg/acre
print('Mean Absolute Error:', np.mean(errors), 'kg/acre.')

# Calculate the absolute percentage errors
mape = 100 * (errors / y_test)
# Calculate and display accuracy (100 minus the mean absolute percentage error)
accuracy = 100 - np.mean(mape)
print('Accuracy:', round(accuracy, 2), '%.')

Train RMSE:  95.13732923966857
R squared training set:  78.53762506583996
Test RMSE:  107.29538676736196
R squared test set:  72.22176207981319
Mean Absolute Error: YieldKgAcre    81.675159
dtype: float64 kg/acre.
Accuracy: YieldKgAcre    89.82
dtype: float64 %.


---
### # For Eco District ID: 849

In [23]:
df7 = dataframe[dataframe['ECODISTRICT_ID']==849].drop(columns=['ECODISTRICT_ID'])
# split data into X and y
x = pd.DataFrame(df7.drop(labels=['YieldKgAcre'], axis=1))
y = pd.DataFrame(df7['YieldKgAcre'])


In [24]:
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.30, random_state=40)   
  
# Run standardization on X variables
X_train = scale(X_train)
X_test = scale(X_test)     
        
        
print(X_train.shape); print(X_test.shape)

(700, 161)
(301, 161)


In [25]:
rr = RidgeCV(alphas=[0.1, 1.0, 10.0], cv=10)
rr.fit(X_train, y_train) 
pred_train_rr = rr.predict(X_train)
print("Train RMSE: ", np.sqrt(mean_squared_error(y_train, pred_train_rr)))
print("R squared training set: ", r2_score(y_train, pred_train_rr)*100)

pred_test_rr = rr.predict(X_test)
print("Test RMSE: ", np.sqrt(mean_squared_error(y_test, pred_test_rr))) 
print("R squared test set: ", r2_score(y_test, pred_test_rr)*100)

# Calculate the absolute errors
errors = abs(pred_test_rr - y_test)

# Print out the mean absolute error (MAE) in kg/acre
print('Mean Absolute Error:', np.mean(errors), 'kg/acre.')

# Calculate the absolute percentage errors
mape = 100 * (errors / y_test)
# Calculate and display accuracy (100 minus the mean absolute percentage error)
accuracy = 100 - np.mean(mape)
print('Accuracy:', round(accuracy, 2), '%.')

Train RMSE:  78.54796667665148
R squared training set:  90.73648086893138
Test RMSE:  95.63859918528432
R squared test set:  85.97108872906384
Mean Absolute Error: YieldKgAcre    71.924404
dtype: float64 kg/acre.
Accuracy: YieldKgAcre    89.99
dtype: float64 %.


---
### # For Eco District ID: 729

In [26]:
df8 = dataframe[dataframe['ECODISTRICT_ID']==729].drop(columns=['ECODISTRICT_ID'])
# split data into X and y
x = pd.DataFrame(df8.drop(labels=['YieldKgAcre'], axis=1))
y = pd.DataFrame(df8['YieldKgAcre'])


In [27]:
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.30, random_state=40)   

# Run standardization on X variables
X_train = scale(X_train)
X_test = scale(X_test) 
     
         
print(X_train.shape); print(X_test.shape)

(662, 161)
(284, 161)


In [28]:
rr = RidgeCV(alphas=[0.1, 1.0, 10.0], cv=10)
rr.fit(X_train, y_train) 
pred_train_rr = rr.predict(X_train)
print("Train RMSE: ", np.sqrt(mean_squared_error(y_train, pred_train_rr)))
print("R squared training set: ", r2_score(y_train, pred_train_rr)*100)

pred_test_rr = rr.predict(X_test)
print("Test RMSE: ", np.sqrt(mean_squared_error(y_test, pred_test_rr))) 
print("R squared test set: ", r2_score(y_test, pred_test_rr)*100)

# Calculate the absolute errors
errors = abs(pred_test_rr - y_test)

# Print out the mean absolute error (MAE) in kg/acre
print('Mean Absolute Error:', np.mean(errors), 'kg/acre.')

# Calculate the absolute percentage errors
mape = 100 * (errors / y_test)
# Calculate and display accuracy (100 minus the mean absolute percentage error)
accuracy = 100 - np.mean(mape)
print('Accuracy:', round(accuracy, 2), '%.')

Train RMSE:  71.19193192790989
R squared training set:  81.91779666344638
Test RMSE:  98.91196333229948
R squared test set:  67.7959532281787
Mean Absolute Error: YieldKgAcre    75.362232
dtype: float64 kg/acre.
Accuracy: YieldKgAcre    91.5
dtype: float64 %.


---
### # For Eco District ID: 753  

In [29]:
df9 = dataframe[dataframe['ECODISTRICT_ID']==753].drop(columns=['ECODISTRICT_ID'])
# split data into X and y
x = pd.DataFrame(df9.drop(labels=['YieldKgAcre'], axis=1))
y = pd.DataFrame(df9['YieldKgAcre'])


In [30]:
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.30, random_state=40)   
  
# Run standardization on X variables
X_train = scale(X_train)
X_test = scale(X_test)  

print(X_train.shape); print(X_test.shape)

(662, 161)
(284, 161)


In [31]:
rr = RidgeCV(alphas=[0.1, 1.0, 10.0], cv=10)
rr.fit(X_train, y_train) 
pred_train_rr = rr.predict(X_train)
print("Train RMSE: ", np.sqrt(mean_squared_error(y_train, pred_train_rr)))
print("R squared training set: ", r2_score(y_train, pred_train_rr)*100)

pred_test_rr = rr.predict(X_test)
print("Test RMSE: ", np.sqrt(mean_squared_error(y_test, pred_test_rr))) 
print("R squared test set: ", r2_score(y_test, pred_test_rr)*100)

# Calculate the absolute errors
errors = abs(pred_test_rr - y_test)

# Print out the mean absolute error (MAE) in kg/acre
print('Mean Absolute Error:', np.mean(errors), 'kg/acre.')

# Calculate the absolute percentage errors
mape = 100 * (errors / y_test)
# Calculate and display accuracy (100 minus the mean absolute percentage error)
accuracy = 100 - np.mean(mape)
print('Accuracy:', round(accuracy, 2), '%.')

Train RMSE:  58.01717143917308
R squared training set:  89.16091262673463
Test RMSE:  70.3129400188808
R squared test set:  84.31207850205715
Mean Absolute Error: YieldKgAcre    56.129276
dtype: float64 kg/acre.
Accuracy: YieldKgAcre    93.31
dtype: float64 %.


---
### # For Eco District ID: 709  

In [32]:
df10 = dataframe[dataframe['ECODISTRICT_ID']==709].drop(columns=['ECODISTRICT_ID'])
# split data into X and y
x = pd.DataFrame(df10.drop(labels=['YieldKgAcre'], axis=1))
y = pd.DataFrame(df10['YieldKgAcre'])


In [33]:
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.30, random_state=40)   

# Run standardization on X variables
X_train = scale(X_train)
X_test = scale(X_test)
     
print(X_train.shape); print(X_test.shape)

(639, 161)
(274, 161)


In [34]:
rr = RidgeCV(alphas=[0.1, 1.0, 10.0], cv=10)
rr.fit(X_train, y_train) 
pred_train_rr = rr.predict(X_train)
print("Train RMSE: ", np.sqrt(mean_squared_error(y_train, pred_train_rr)))
print("R squared training set: ", r2_score(y_train, pred_train_rr)*100)

pred_test_rr = rr.predict(X_test)
print("Test RMSE: ", np.sqrt(mean_squared_error(y_test, pred_test_rr))) 
print("R squared test set: ", r2_score(y_test, pred_test_rr)*100)

# Calculate the absolute errors
errors = abs(pred_test_rr - y_test)

# Print out the mean absolute error (MAE) in kg/acre
print('Mean Absolute Error:', np.mean(errors), 'kg/acre.')

# Calculate the absolute percentage errors
mape = 100 * (errors / y_test)
# Calculate and display accuracy (100 minus the mean absolute percentage error)
accuracy = 100 - np.mean(mape)
print('Accuracy:', round(accuracy, 2), '%.')

Train RMSE:  84.4591216123592
R squared training set:  87.89976256831535
Test RMSE:  104.67585292319163
R squared test set:  82.28193554839056
Mean Absolute Error: YieldKgAcre    80.437732
dtype: float64 kg/acre.
Accuracy: YieldKgAcre    90.28
dtype: float64 %.


---
---
+ **Next, we use three for loops to compute the error metrics, R-squared values, MAE, and accuracy for the remaining eco districts.**      
+ **This lets us see where the accuracy drops and decide on a threshold at which to stop modeling.**  
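
One way to make the accuracy drop easier to spot is to collect the metrics of every eco district in a table instead of printing them. The sketch below wraps the per-district steps in a function and builds a pandas dataframe from the results. The function name `evaluate_district`, the result column names, and the synthetic `demo` dataframe are all hypothetical; the modeling steps mirror the cells above, including the use of `scale()` on each split.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import RidgeCV
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import scale

def evaluate_district(df, district_id, cv=10):
    """Fit RidgeCV for one eco district and return its metrics as a dict."""
    sub = df[df["ECODISTRICT_ID"] == district_id].drop(columns=["ECODISTRICT_ID"])
    X = sub.drop(columns=["YieldKgAcre"])
    y = sub["YieldKgAcre"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.30, random_state=40)
    X_train, X_test = scale(X_train), scale(X_test)   # same standardization as the cells above
    model = RidgeCV(alphas=[0.1, 1.0, 10.0], cv=cv).fit(X_train, y_train)
    pred_train, pred_test = model.predict(X_train), model.predict(X_test)
    abs_err = np.abs(pred_test - y_test)
    return {
        "district": district_id,
        "n_records": len(sub),
        "train_rmse": np.sqrt(mean_squared_error(y_train, pred_train)),
        "test_rmse": np.sqrt(mean_squared_error(y_test, pred_test)),
        "train_r2": r2_score(y_train, pred_train) * 100,
        "test_r2": r2_score(y_test, pred_test) * 100,
        "mae": float(np.mean(abs_err)),
        "accuracy": float(100 - np.mean(100 * abs_err / y_test)),
    }

# Synthetic stand-in for the real dataset: two districts, four predictors
rng = np.random.default_rng(0)
frames = []
for district in (1, 2):
    X_syn = rng.normal(size=(80, 4))
    frames.append(
        pd.DataFrame(X_syn, columns=["p1", "p2", "p3", "p4"]).assign(
            ECODISTRICT_ID=district,
            YieldKgAcre=800 + 50 * X_syn[:, 0] + rng.normal(scale=10, size=80),
        )
    )
demo = pd.concat(frames, ignore_index=True)

results = pd.DataFrame([evaluate_district(demo, d) for d in (1, 2)])
print(results.sort_values("accuracy"))
```

Sorting the resulting table by `n_records` or `accuracy` shows at a glance which districts fall below a chosen threshold.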

In [35]:
# created a list of eco districts to be used in the 1st for loop
eco_district_ids_list1 = [782,825,760,767,794,749,816,830,756,765,770,773,707,724,795,736,763,780,832,723,783,696,733,822,754,687,758,766,
                          726,717,701,705,850,755,803,706,813,741,789,841,852,702,735,375,689,714,839,693,711,694,784,820,821,831,751,680,704,
                          757,817,838,715,747,776,807,690]

In [36]:
# created another list of eco districts to be used in the 2nd for loop 
eco_district_ids_list2 = [697,695,698,710,796,809,840,854,685,700,771,805,810,824,844,734,762,764,774,785,851,775,716,739,843,661,682,847,669,815,846,
       772,837,742,761,853,672,699,768,778,657,677,691,743,759,827,660,819,848,652,718,686,720]

In [37]:
# created another list of eco districts to be used in the 3rd for loop. 
# These eco districts have very few records,
# so we have used cv=5 (instead of cv=10) for them.
eco_district_ids_list3 = [811,855,379,647,659,662,668,833,834]

### # First for loop for modeling for the eco districts in the 1st list. 

In [38]:
for i in eco_district_ids_list1:   # iterate through each eco district ID in the 1st list
    print("Summary for the Eco District ID:", i)   # printing the eco district ID
    
    # The modeling code is the same as for the individual eco districts above.
    
    # drop() without inplace=True returns a new dataframe and avoids the SettingWithCopyWarning
    df_i = dataframe[dataframe['ECODISTRICT_ID']==i].drop(columns=['ECODISTRICT_ID'])
    # split data into X and y
    x = pd.DataFrame(df_i.drop(labels=['YieldKgAcre'], axis=1))
    y = pd.DataFrame(df_i['YieldKgAcre'])
    
    X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.30, random_state=40)   

    # Run standardization on X variables
    X_train = scale(X_train)
    X_test = scale(X_test)
     
    rr = RidgeCV(alphas=[0.1, 1.0, 10.0], cv=10)
    rr.fit(X_train, y_train) 
    pred_train_rr = rr.predict(X_train)
    # np.sqrt() of the MSE means the values printed under "MSE" are root mean squared errors (RMSE)
    print("Train MSE: ", np.sqrt(mean_squared_error(y_train, pred_train_rr)))
    print("R squared training set: ", r2_score(y_train, pred_train_rr)*100)

    pred_test_rr = rr.predict(X_test)
    print("Test MSE: ", np.sqrt(mean_squared_error(y_test, pred_test_rr))) 
    print("R squared test set: ", r2_score(y_test, pred_test_rr)*100)

    # Calculate the absolute errors
    errors = abs(pred_test_rr - y_test)

    # Print out the mean absolute error (MAE); the values are in kg/acre, the units of the target
    print('Mean Absolute Error:', np.mean(errors), 'degrees.')

    # Calculate the absolute percentage errors
    mape = 100 * (errors / y_test)
    # Calculate and display accuracy (100 minus the mean absolute percentage error)
    accuracy = 100 - np.mean(mape)
    print('Accuracy:', round(accuracy, 2), '%.')
    print("\n")
    print("\n")

Summary for the Eco District ID: 782
Train MSE:  73.38728951043555
R squared training set:  74.87542579446689
Test MSE:  79.92267342047288
R squared test set:  75.49278534579338
Mean Absolute Error: YieldKgAcre    62.013096
dtype: float64 degrees.
Accuracy: YieldKgAcre    91.62
dtype: float64 %.




Summary for the Eco District ID: 825




Train MSE:  101.94780596272146
R squared training set:  71.06056475475584
Test MSE:  107.45455398170826
R squared test set:  70.17855644822903
Mean Absolute Error: YieldKgAcre    82.541958
dtype: float64 degrees.
Accuracy: YieldKgAcre    88.16
dtype: float64 %.




Summary for the Eco District ID: 760
Train MSE:  68.25776560815834
R squared training set:  83.37844977534044
Test MSE:  78.9628443209741
R squared test set:  78.19727071184695
Mean Absolute Error: YieldKgAcre    61.849039
dtype: float64 degrees.
Accuracy: YieldKgAcre    91.2
dtype: float64 %.




Summary for the Eco District ID: 767




Train MSE:  74.13721632869554
R squared training set:  81.20664943261424
Test MSE:  78.50980174701876
R squared test set:  79.65978225909683
Mean Absolute Error: YieldKgAcre    61.076218
dtype: float64 degrees.
Accuracy: YieldKgAcre    92.47
dtype: float64 %.




Summary for the Eco District ID: 794
Train MSE:  85.48554531488693
R squared training set:  76.62423463925943
Test MSE:  92.36910297978065
R squared test set:  69.7757362857084
Mean Absolute Error: YieldKgAcre    71.19386
dtype: float64 degrees.
Accuracy: YieldKgAcre    87.82
dtype: float64 %.




Summary for the Eco District ID: 749




Train MSE:  76.43473286488509
R squared training set:  88.65195737751273
Test MSE:  89.23401734556235
R squared test set:  82.65732273404048
Mean Absolute Error: YieldKgAcre    70.499937
dtype: float64 degrees.
Accuracy: YieldKgAcre    90.92
dtype: float64 %.




Summary for the Eco District ID: 816
Train MSE:  104.62950089579601
R squared training set:  66.90036097208564
Test MSE:  104.06460208064665
R squared test set:  66.77266956313753
Mean Absolute Error: YieldKgAcre    82.794587
dtype: float64 degrees.
Accuracy: YieldKgAcre    87.89
dtype: float64 %.




Summary for the Eco District ID: 830
Train MSE:  89.02521199344935
R squared training set:  77.91334134477958
Test MSE:  103.71318267066971
R squared test set:  69.05758727868336
Mean Absolute Error: YieldKgAcre    81.224847
dtype: float64 degrees.
Accuracy: YieldKgAcre    87.03
dtype: float64 %.




Summary for the Eco District ID: 756
Train MSE:  76.01340503382283
R squared training set:  84.99087372701656
Test MSE:  81.41265667312669
R squared test set:  80.89661750257291
Mean Absolute Error: YieldKgAcre    64.252502
dtype: float64 degrees.
Accuracy: YieldKgAcre    91.52
dtype: float64 %.




Summary for the Eco District ID: 765
Train MSE:  68.64659014703703
R squared training set:  86.45072656257864
Test MSE:  77.23504037289355
R squared test set:  83.24250029777875
Mean Absolute Error: YieldKgAcre    60.74956
dtype: float64 degrees.
Accuracy: YieldKgAcre    91.74
dtype: float64 %.




Summary for the Eco District ID: 770
Train MSE:  92.34138056181669
R squared training set:  68.41570413906219
Test MSE:  107.18603409976953
R squared test set:  61.5896773785051
Mean Absolute Error: YieldKgAcre    84.552845
dtype: float64 degrees.
Accuracy: YieldKgAcre    89.15
dtype: float64 %.




Summary for the Eco District ID: 773
Train MSE:  69.14351032836005
R squared training set:  77.96318206125255
Test MSE:  79.7486041446179
R squared test set:  73.43662808389197
Mean Absolute Error: YieldKgAcre    61.077351
dtype: float64 degrees.
Accuracy: YieldKgAcre    91.54
dtype: float64 %.




Summary for the Eco District ID: 707
Train MSE:  71.91620122752379
R squared training set:  89.93682411940377
Test MSE:  88.43148521709207
R squared test set:  84.2614751875006
Mean Absolute Error: YieldKgAcre    67.610559
dtype: float64 degrees.
Accuracy: YieldKgAcre    91.02
dtype: float64 %.




Summary for the Eco District ID: 724
Train MSE:  84.55168894549475
R squared training set:  88.20515632496537
Test MSE:  107.24964381757127
R squared test set:  82.1567045215535
Mean Absolute Error: YieldKgAcre    85.161541
dtype: float64 degrees.
Accuracy: YieldKgAcre    86.68
dtype: float64 %.




Summary for the Eco District ID: 795
Train MSE:  89.48125976113583
R squared training set:  81.06673272873253
Test MSE:  106.2514773745604
R squared test set:  67.61902486738951
Mean Absolute Error: YieldKgAcre    81.350338
dtype: float64 degrees.
Accuracy: YieldKgAcre    86.63
dtype: float64 %.




Summary for the Eco District ID: 736
Train MSE:  68.82779743079814
R squared training set:  87.73926306461357
Test MSE:  91.87381660747705
R squared test set:  76.94552123167107
Mean Absolute Error: YieldKgAcre    74.848647
dtype: float64 degrees.
Accuracy: YieldKgAcre    91.78
dtype: float64 %.




Summary for the Eco District ID: 763
Train MSE:  79.27992247579762
R squared training set:  80.656742875237
Test MSE:  114.70009470377248
R squared test set:  62.225279629719374
Mean Absolute Error: YieldKgAcre    85.242105
dtype: float64 degrees.
Accuracy: YieldKgAcre    83.72
dtype: float64 %.




Summary for the Eco District ID: 780
Train MSE:  99.9352351289837
R squared training set:  71.56228962173367
Test MSE:  110.90031561083913
R squared test set:  53.21540476197584
Mean Absolute Error: YieldKgAcre    91.595072
dtype: float64 degrees.
Accuracy: YieldKgAcre    89.28
dtype: float64 %.




Summary for the Eco District ID: 832
Train MSE:  88.22697590029398
R squared training set:  84.5750873777774
Test MSE:  104.29188106950662
R squared test set:  77.43062923646829
Mean Absolute Error: YieldKgAcre    81.321895
dtype: float64 degrees.
Accuracy: YieldKgAcre    86.31
dtype: float64 %.




Summary for the Eco District ID: 723
Train MSE:  100.26748710592125
R squared training set:  78.41880391915394
Test MSE:  137.9321467451304
R squared test set:  64.00405032024135
Mean Absolute Error: YieldKgAcre    109.601808
dtype: float64 degrees.
Accuracy: YieldKgAcre    82.35
dtype: float64 %.




Summary for the Eco District ID: 783
Train MSE:  67.12055977107775
R squared training set:  78.78496598718712
Test MSE:  86.67161448715586
R squared test set:  64.86034856814061
Mean Absolute Error: YieldKgAcre    64.5649
dtype: float64 degrees.
Accuracy: YieldKgAcre    91.72
dtype: float64 %.




Summary for the Eco District ID: 696
Train MSE:  92.66766218035988
R squared training set:  79.73788483618351
Test MSE:  111.92691753602013
R squared test set:  65.50409908337869
Mean Absolute Error: YieldKgAcre    86.077828
dtype: float64 degrees.
Accuracy: YieldKgAcre    90.03
dtype: float64 %.




Summary for the Eco District ID: 733
Train MSE:  75.33717371888459
R squared training set:  77.35970517306608
Test MSE:  100.50024932404587
R squared test set:  54.595100744707615
Mean Absolute Error: YieldKgAcre    74.426157
dtype: float64 degrees.
Accuracy: YieldKgAcre    89.97
dtype: float64 %.




Summary for the Eco District ID: 822
Train MSE:  96.64457619939014
R squared training set:  74.35163400269833
Test MSE:  114.56637477126031
R squared test set:  60.02361807513276
Mean Absolute Error: YieldKgAcre    86.607306
dtype: float64 degrees.
Accuracy: YieldKgAcre    87.69
dtype: float64 %.




Summary for the Eco District ID: 754
Train MSE:  86.98614148637238
R squared training set:  79.7216254510308
Test MSE:  89.86077287464694
R squared test set:  75.63154655162144
Mean Absolute Error: YieldKgAcre    72.574391
dtype: float64 degrees.
Accuracy: YieldKgAcre    91.05
dtype: float64 %.




Summary for the Eco District ID: 687
Train MSE:  78.15221659983183
R squared training set:  82.84006890649282
Test MSE:  103.12372303047867
R squared test set:  66.9670808949343
Mean Absolute Error: YieldKgAcre    75.786608
dtype: float64 degrees.
Accuracy: YieldKgAcre    90.09
dtype: float64 %.




Summary for the Eco District ID: 758
Train MSE:  69.42585927843577
R squared training set:  82.190985789855
Test MSE:  90.22725069571881
R squared test set:  72.3071707299397
Mean Absolute Error: YieldKgAcre    69.809188
dtype: float64 degrees.
Accuracy: YieldKgAcre    91.42
dtype: float64 %.




Summary for the Eco District ID: 766
Train MSE:  60.45174828422903
R squared training set:  89.0337789840208
Test MSE:  77.59244373134737
R squared test set:  77.634840983337
Mean Absolute Error: YieldKgAcre    61.952711
dtype: float64 degrees.
Accuracy: YieldKgAcre    93.19
dtype: float64 %.




Summary for the Eco District ID: 726
Train MSE:  111.36056392894443
R squared training set:  79.42085089363296
Test MSE:  139.42581153454898
R squared test set:  69.43869566312222
Mean Absolute Error: YieldKgAcre    112.935284
dtype: float64 degrees.
Accuracy: YieldKgAcre    77.77
dtype: float64 %.




Summary for the Eco District ID: 717
Train MSE:  110.26047525067037
R squared training set:  85.51184791197866
Test MSE:  118.23620441469254
R squared test set:  79.60587032650032
Mean Absolute Error: YieldKgAcre    93.847536
dtype: float64 degrees.
Accuracy: YieldKgAcre    85.03
dtype: float64 %.




Summary for the Eco District ID: 701
Train MSE:  94.17405232938776
R squared training set:  69.6519994478588
Test MSE:  115.25806220530667
R squared test set:  52.69835334438917
Mean Absolute Error: YieldKgAcre    85.282456
dtype: float64 degrees.
Accuracy: YieldKgAcre    90.49
dtype: float64 %.




Summary for the Eco District ID: 705
Train MSE:  57.89963522112043
R squared training set:  92.94719820535255
Test MSE:  59.222704205730544
R squared test set:  93.2248537337932
Mean Absolute Error: YieldKgAcre    46.148762
dtype: float64 degrees.
Accuracy: YieldKgAcre    94.31
dtype: float64 %.




Summary for the Eco District ID: 850
Train MSE:  68.8259762092893
R squared training set:  83.05820639610842
Test MSE:  99.93873554940471
R squared test set:  67.78832141520027
Mean Absolute Error: YieldKgAcre    77.097037
dtype: float64 degrees.
Accuracy: YieldKgAcre    88.76
dtype: float64 %.




Summary for the Eco District ID: 755
Train MSE:  66.35215389929598
R squared training set:  85.14999503324752
Test MSE:  93.34898149335382
R squared test set:  65.59181404654055
Mean Absolute Error: YieldKgAcre    74.735212
dtype: float64 degrees.
Accuracy: YieldKgAcre    90.63
dtype: float64 %.




Summary for the Eco District ID: 803
Train MSE:  88.01207756778086
R squared training set:  76.88729740757906
Test MSE:  107.08294205714647
R squared test set:  51.5267343174555
Mean Absolute Error: YieldKgAcre    79.986261
dtype: float64 degrees.
Accuracy: YieldKgAcre    89.8
dtype: float64 %.




Summary for the Eco District ID: 706
Train MSE:  60.81746054150349
R squared training set:  90.57500880255712
Test MSE:  65.57668962738548
R squared test set:  89.63290330195815
Mean Absolute Error: YieldKgAcre    52.209333
dtype: float64 degrees.
Accuracy: YieldKgAcre    92.86
dtype: float64 %.




Summary for the Eco District ID: 813
Train MSE:  116.32024694764539
R squared training set:  58.26393691327423
Test MSE:  150.66230175644583
R squared test set:  35.80590688747487
Mean Absolute Error: YieldKgAcre    118.864363
dtype: float64 degrees.
Accuracy: YieldKgAcre    84.69
dtype: float64 %.




Summary for the Eco District ID: 741
Train MSE:  50.429640158685785
R squared training set:  91.87927008183948
Test MSE:  69.30599385733828
R squared test set:  84.78484288085953
Mean Absolute Error: YieldKgAcre    56.893968
dtype: float64 degrees.
Accuracy: YieldKgAcre    93.05
dtype: float64 %.




Summary for the Eco District ID: 789
Train MSE:  98.70952539961003
R squared training set:  72.35449128946051
Test MSE:  111.6488193343207
R squared test set:  68.9870845376098
Mean Absolute Error: YieldKgAcre    91.979941
dtype: float64 degrees.
Accuracy: YieldKgAcre    88.27
dtype: float64 %.




Summary for the Eco District ID: 841
Train MSE:  93.5851506269529
R squared training set:  76.51730951542426
Test MSE:  134.09556575270224
R squared test set:  57.383366577226965
Mean Absolute Error: YieldKgAcre    107.86505
dtype: float64 degrees.
Accuracy: YieldKgAcre    77.88
dtype: float64 %.




Summary for the Eco District ID: 852
Train MSE:  62.08463443827214
R squared training set:  87.76178636426938
Test MSE:  81.63646469181802
R squared test set:  78.17363999883706
Mean Absolute Error: YieldKgAcre    63.236526
dtype: float64 degrees.
Accuracy: YieldKgAcre    93.4
dtype: float64 %.




Summary for the Eco District ID: 702
Train MSE:  64.29654846446343
R squared training set:  86.97595552143291
Test MSE:  108.14919404956976
R squared test set:  74.08054698131437
Mean Absolute Error: YieldKgAcre    86.602732
dtype: float64 degrees.
Accuracy: YieldKgAcre    88.95
dtype: float64 %.




Summary for the Eco District ID: 735
Train MSE:  74.22415576363984
R squared training set:  78.2559375892295
Test MSE:  86.37100579295685
R squared test set:  69.03530271962157
Mean Absolute Error: YieldKgAcre    67.805859
dtype: float64 degrees.
Accuracy: YieldKgAcre    92.18
dtype: float64 %.




Summary for the Eco District ID: 375
Train MSE:  101.06601640908858
R squared training set:  89.14648320613932
Test MSE:  110.53224657983675
R squared test set:  88.40543760843772
Mean Absolute Error: YieldKgAcre    85.527392
dtype: float64 degrees.
Accuracy: YieldKgAcre    86.08
dtype: float64 %.




Summary for the Eco District ID: 689
Train MSE:  75.85352728474341
R squared training set:  85.14539183818796
Test MSE:  135.22190195221552
R squared test set:  57.60368894235026
Mean Absolute Error: YieldKgAcre    102.662272
dtype: float64 degrees.
Accuracy: YieldKgAcre    85.88
dtype: float64 %.




Summary for the Eco District ID: 714
Train MSE:  84.34288662011973
R squared training set:  86.99513459104159
Test MSE:  129.23642299857096
R squared test set:  71.76235689082961
Mean Absolute Error: YieldKgAcre    98.524879
dtype: float64 degrees.
Accuracy: YieldKgAcre    86.42
dtype: float64 %.




Summary for the Eco District ID: 839
Train MSE:  72.53779147893329
R squared training set:  91.41876340375906
Test MSE:  94.54779465987403
R squared test set:  83.98391730999855
Mean Absolute Error: YieldKgAcre    69.695287
dtype: float64 degrees.
Accuracy: YieldKgAcre    90.05
dtype: float64 %.




Summary for the Eco District ID: 693
Train MSE:  98.2930104743628
R squared training set:  83.07413744083108
Test MSE:  131.30969923452986
R squared test set:  66.31448825451571
Mean Absolute Error: YieldKgAcre    105.813395
dtype: float64 degrees.
Accuracy: YieldKgAcre    86.61
dtype: float64 %.




Summary for the Eco District ID: 711
Train MSE:  118.08869309487137
R squared training set:  74.19879922734968
Test MSE:  148.1299410988609
R squared test set:  56.69744009905469
Mean Absolute Error: YieldKgAcre    120.845238
dtype: float64 degrees.
Accuracy: YieldKgAcre    85.5
dtype: float64 %.




Summary for the Eco District ID: 694
Train MSE:  87.90070564687439
R squared training set:  83.65019345768737
Test MSE:  99.58800710669067
R squared test set:  78.4593806692013
Mean Absolute Error: YieldKgAcre    80.044868
dtype: float64 degrees.
Accuracy: YieldKgAcre    88.87
dtype: float64 %.




Summary for the Eco District ID: 784
Train MSE:  65.27289812411328
R squared training set:  81.21655532699475
Test MSE:  77.18096837899719
R squared test set:  69.6411851226662
Mean Absolute Error: YieldKgAcre    62.94923
dtype: float64 degrees.
Accuracy: YieldKgAcre    92.23
dtype: float64 %.




Summary for the Eco District ID: 820
Train MSE:  103.94420705390152
R squared training set:  67.55611965787412
Test MSE:  129.91649829344007
R squared test set:  55.28419566603543
Mean Absolute Error: YieldKgAcre    101.998709
dtype: float64 degrees.
Accuracy: YieldKgAcre    87.6
dtype: float64 %.




Summary for the Eco District ID: 821
Train MSE:  96.24472848236172
R squared training set:  78.66505375289856
Test MSE:  120.95915443245998
R squared test set:  73.77640917463013
Mean Absolute Error: YieldKgAcre    95.129019
dtype: float64 degrees.
Accuracy: YieldKgAcre    83.24
dtype: float64 %.




Summary for the Eco District ID: 831
Train MSE:  69.01277530086101
R squared training set:  85.18017364041907
Test MSE:  71.4510520068194
R squared test set:  87.12702479494804
Mean Absolute Error: YieldKgAcre    53.753053
dtype: float64 degrees.
Accuracy: YieldKgAcre    91.2
dtype: float64 %.




Summary for the Eco District ID: 751
Train MSE:  96.26318992441406
R squared training set:  83.70169485477824
Test MSE:  106.7402998537572
R squared test set:  75.6692891906654
Mean Absolute Error: YieldKgAcre    84.677117
dtype: float64 degrees.
Accuracy: YieldKgAcre    88.91
dtype: float64 %.




Summary for the Eco District ID: 680
Train MSE:  89.58920344794215
R squared training set:  83.34779389061188
Test MSE:  122.3199426971109
R squared test set:  73.11158834965268
Mean Absolute Error: YieldKgAcre    98.519445
dtype: float64 degrees.
Accuracy: YieldKgAcre    86.65
dtype: float64 %.




Summary for the Eco District ID: 704
Train MSE:  87.6746354105886
R squared training set:  85.4090555836775
Test MSE:  129.2782681414233
R squared test set:  58.214729506014876
Mean Absolute Error: YieldKgAcre    95.293708
dtype: float64 degrees.
Accuracy: YieldKgAcre    88.76
dtype: float64 %.




Summary for the Eco District ID: 757
Train MSE:  57.56392688256911
R squared training set:  88.02651063109619
Test MSE:  72.8188723748231
R squared test set:  69.57072236764313
Mean Absolute Error: YieldKgAcre    63.856624
dtype: float64 degrees.
Accuracy: YieldKgAcre    92.3
dtype: float64 %.




Summary for the Eco District ID: 817
Train MSE:  96.36295413369552
R squared training set:  81.81867456098354
Test MSE:  118.08059903725734
R squared test set:  77.42655636300724
Mean Absolute Error: YieldKgAcre    88.595507
dtype: float64 degrees.
Accuracy: YieldKgAcre    86.84
dtype: float64 %.




Summary for the Eco District ID: 838
Train MSE:  78.81590963751701
R squared training set:  82.60838938903026
Test MSE:  124.4711290906284
R squared test set:  53.4733401451546
Mean Absolute Error: YieldKgAcre    99.449104
dtype: float64 degrees.
Accuracy: YieldKgAcre    82.3
dtype: float64 %.




Summary for the Eco District ID: 715
Train MSE:  87.26700677723757
R squared training set:  90.57862569403615
Test MSE:  137.6489184763085
R squared test set:  73.03651810538796
Mean Absolute Error: YieldKgAcre    115.987014
dtype: float64 degrees.
Accuracy: YieldKgAcre    86.23
dtype: float64 %.




Summary for the Eco District ID: 747
Train MSE:  92.41242692545985
R squared training set:  85.43591868273658
Test MSE:  116.79568200932405
R squared test set:  72.53168901986483
Mean Absolute Error: YieldKgAcre    99.582758
dtype: float64 degrees.
Accuracy: YieldKgAcre    86.43
dtype: float64 %.




Summary for the Eco District ID: 776
Train MSE:  76.18981479453473
R squared training set:  77.40801375226508
Test MSE:  126.35003795746113
R squared test set:  52.492266749021375
Mean Absolute Error: YieldKgAcre    103.834055
dtype: float64 degrees.
Accuracy: YieldKgAcre    86.81
dtype: float64 %.




Summary for the Eco District ID: 807
Train MSE:  92.95206049833237
R squared training set:  73.06743402020201
Test MSE:  104.80237933837796
R squared test set:  62.48074682440987
Mean Absolute Error: YieldKgAcre    85.617267
dtype: float64 degrees.
Accuracy: YieldKgAcre    88.88
dtype: float64 %.




Summary for the Eco District ID: 690
Train MSE:  72.40653350051025
R squared training set:  85.97173660603734
Test MSE:  115.84336944673001
R squared test set:  65.48769958919904
Mean Absolute Error: YieldKgAcre    83.986691
dtype: float64 degrees.
Accuracy: YieldKgAcre    90.01
dtype: float64 %.
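
A note on reading the summaries above: in the loop code, the values labeled `Train MSE` and `Test MSE` are computed as `np.sqrt(mean_squared_error(...))`, so they are root mean squared errors (RMSE) rather than MSE. The `degrees.` suffix after the mean absolute error is only a printed label; the error is in the units of `YieldKgAcre`. As a minimal sketch of how these quantities relate, the helper below computes them as plain floats. The name `regression_metrics` and the sample numbers are hypothetical and are not taken from the notebook.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, MAPE and 'accuracy' (100 - MAPE) as plain floats."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE is undefined when a true value is zero; this sketch assumes yields > 0.
    mape = 100 * sum(abs(e) / abs(t) for t, e in zip(y_true, errors)) / n
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),   # this is what the summaries label "MSE"
        "mae": mae,               # same units as the response (kg per acre)
        "mape": mape,
        "accuracy": 100 - mape,   # the "Accuracy" figure in the summaries
    }


# Made-up yields in kg/acre, for illustration only.
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 180.0, 400.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

Returning scalars instead of a one-column pandas Series also avoids the `YieldKgAcre ... dtype: float64` wrapping seen in the printed summaries.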








### Second for loop: modeling for the eco districts in the 2nd list

In [39]:
for i in eco_district_ids_list2:       # iterate through each eco district ID in the 2nd list
    print("Summary for the Eco District ID:", i)  # print the eco district ID

    # The modeling code is similar to what we have done before in this file.
    # Select this district's rows and drop the ID column. Dropping without
    # inplace=True avoids pandas' SettingWithCopyWarning on a filtered slice.
    df9 = dataframe[dataframe['ECODISTRICT_ID'] == i].drop(columns=['ECODISTRICT_ID'])

    # Split data into predictors (x) and response (y)
    x = df9.drop(columns=['YieldKgAcre'])
    y = df9[['YieldKgAcre']]

    X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.30, random_state=40)

    # Run standardization on X variables.
    # Note: scale() standardizes each array with its own mean and standard
    # deviation, so the test set is standardized independently of the training set.
    X_train = scale(X_train)
    X_test = scale(X_test)

    # Ridge regression; alpha is selected from the candidates by 10-fold cross-validation
    rr = RidgeCV(alphas=[0.1, 1.0, 10.0], cv=10)
    rr.fit(X_train, y_train)

    # Note: np.sqrt(MSE) is the root mean squared error (RMSE), so the values
    # printed under the "MSE" labels are RMSE values.
    pred_train_rr = rr.predict(X_train)
    print("Train MSE: ", np.sqrt(mean_squared_error(y_train, pred_train_rr)))
    print("R squared training set: ", r2_score(y_train, pred_train_rr)*100)

    pred_test_rr = rr.predict(X_test)
    print("Test MSE: ", np.sqrt(mean_squared_error(y_test, pred_test_rr)))
    print("R squared test set: ", r2_score(y_test, pred_test_rr)*100)

    # Calculate the absolute errors
    errors = abs(pred_test_rr - y_test)

    # Print out the mean absolute error (MAE); it is in the units of YieldKgAcre
    print('Mean Absolute Error:', np.mean(errors), 'degrees.')

    # Percentage error of each test observation; its mean is the MAPE
    mape = 100 * (errors / y_test)
    # Calculate and display accuracy (100 - MAPE)
    accuracy = 100 - np.mean(mape)
    print('Accuracy:', round(accuracy, 2), '%.')
    print("\n")
    print("\n")

Summary for the Eco District ID: 697
Train MSE:  92.26475182846802
R squared training set:  80.23738533516618
Test MSE:  117.49643664153372
R squared test set:  76.66075982220899
Mean Absolute Error: YieldKgAcre    95.776049
dtype: float64 degrees.
Accuracy: YieldKgAcre    88.19
dtype: float64 %.




Summary for the Eco District ID: 695
Train MSE:  50.36281590526053
R squared training set:  88.23430578956376
Test MSE:  73.11520215990265
R squared test set:  77.19528499791099
Mean Absolute Error: YieldKgAcre    60.237586
dtype: float64 degrees.
Accuracy: YieldKgAcre    93.85
dtype: float64 %.




Summary for the Eco District ID: 698
Train MSE:  68.86713388100733
R squared training set:  85.25784236932364
Test MSE:  86.74400743106114
R squared test set:  79.93409688117701
Mean Absolute Error: YieldKgAcre    68.082548
dtype: float64 degrees.
Accuracy: YieldKgAcre    90.89
dtype: float64 %.




Summary for the Eco District ID: 710
Train MSE:  81.22690618014171
R squared training set:  83.95181081671481
Test MSE:  111.85023242415411
R squared test set:  74.96194087714181
Mean Absolute Error: YieldKgAcre    81.333185
dtype: float64 degrees.
Accuracy: YieldKgAcre    90.41
dtype: float64 %.




Summary for the Eco District ID: 796
Train MSE:  76.40465676521221
R squared training set:  70.70591767604837
Test MSE:  103.43968765727672
R squared test set:  62.29148260686139
Mean Absolute Error: YieldKgAcre    83.889014
dtype: float64 degrees.
Accuracy: YieldKgAcre    87.59
dtype: float64 %.




Summary for the Eco District ID: 809
Train MSE:  94.58100786767758
R squared training set:  82.96926626488958
Test MSE:  149.91111961714506
R squared test set:  42.79347673017326
Mean Absolute Error: YieldKgAcre    119.275572
dtype: float64 degrees.
Accuracy: YieldKgAcre    81.87
dtype: float64 %.




Summary for the Eco District ID: 840
Train MSE:  68.00302496975543
R squared training set:  93.16399574515796
Test MSE:  96.86988442231286
R squared test set:  85.20492358794436
Mean Absolute Error: YieldKgAcre    76.100272
dtype: float64 degrees.
Accuracy: YieldKgAcre    88.91
dtype: float64 %.




Summary for the Eco District ID: 854
Train MSE:  50.21001248252734
R squared training set:  88.5714662561619
Test MSE:  73.90043224149271
R squared test set:  79.51098708724062
Mean Absolute Error: YieldKgAcre    59.013894
dtype: float64 degrees.
Accuracy: YieldKgAcre    93.27
dtype: float64 %.




Summary for the Eco District ID: 685
Train MSE:  84.6406399129241
R squared training set:  84.29722184715848
Test MSE:  107.2205137575512
R squared test set:  80.46982707242822
Mean Absolute Error: YieldKgAcre    82.754672
dtype: float64 degrees.
Accuracy: YieldKgAcre    90.08
dtype: float64 %.




Summary for the Eco District ID: 700
Train MSE:  59.68942068916356
R squared training set:  91.24395893180454
Test MSE:  78.82072989042332
R squared test set:  88.31027328125477
Mean Absolute Error: YieldKgAcre    60.747129
dtype: float64 degrees.
Accuracy: YieldKgAcre    93.32
dtype: float64 %.




Summary for the Eco District ID: 771
Train MSE:  94.61260886595315
R squared training set:  70.46831668849009
Test MSE:  99.67810328920245
R squared test set:  69.2342407614134
Mean Absolute Error: YieldKgAcre    74.130448
dtype: float64 degrees.
Accuracy: YieldKgAcre    87.65
dtype: float64 %.




Summary for the Eco District ID: 805
Train MSE:  90.39270794116916
R squared training set:  75.82678393021229
Test MSE:  131.14620349043736
R squared test set:  46.214123869225666
Mean Absolute Error: YieldKgAcre    96.671586
dtype: float64 degrees.
Accuracy: YieldKgAcre    86.49
dtype: float64 %.




Summary for the Eco District ID: 810
Train MSE:  88.45250259038569
R squared training set:  74.55661422087587
Test MSE:  74.45494796585528
R squared test set:  85.13934361199313
Mean Absolute Error: YieldKgAcre    57.625745
dtype: float64 degrees.
Accuracy: YieldKgAcre    91.53
dtype: float64 %.




Summary for the Eco District ID: 824
Train MSE:  99.70098946365695
R squared training set:  71.23119672501231
Test MSE:  136.2999357348129
R squared test set:  52.51556250896285
Mean Absolute Error: YieldKgAcre    102.571806
dtype: float64 degrees.
Accuracy: YieldKgAcre    84.5
dtype: float64 %.




Summary for the Eco District ID: 844
Train MSE:  48.06398039181587
R squared training set:  93.39164567592263
Test MSE:  110.79748899613769
R squared test set:  64.27486054626688
Mean Absolute Error: YieldKgAcre    85.226939
dtype: float64 degrees.
Accuracy: YieldKgAcre    88.6
dtype: float64 %.




Summary for the Eco District ID: 734
Train MSE:  67.3069978895773
R squared training set:  81.78376365024936
Test MSE:  102.73583809874057
R squared test set:  53.43549479489693
Mean Absolute Error: YieldKgAcre    83.452381
dtype: float64 degrees.
Accuracy: YieldKgAcre    91.14
dtype: float64 %.




Summary for the Eco District ID: 762
Train MSE:  65.98006050365191
R squared training set:  80.71864199285538
Test MSE:  81.21094251972357
R squared test set:  65.17053071348498
Mean Absolute Error: YieldKgAcre    64.791754
dtype: float64 degrees.
Accuracy: YieldKgAcre    90.86
dtype: float64 %.




Summary for the Eco District ID: 764
Train MSE:  53.27979708614614
R squared training set:  88.71570588170987
Test MSE:  77.28767221924717
R squared test set:  69.40528977728813
Mean Absolute Error: YieldKgAcre    58.833584
dtype: float64 degrees.
Accuracy: YieldKgAcre    92.21
dtype: float64 %.




Summary for the Eco District ID: 774
Train MSE:  49.084927128750564
R squared training set:  88.28012528918642
Test MSE:  73.43326764582302
R squared test set:  78.17620243451607
Mean Absolute Error: YieldKgAcre    56.701207
dtype: float64 degrees.
Accuracy: YieldKgAcre    91.45
dtype: float64 %.




Summary for the Eco District ID: 785
Train MSE:  38.466942601788695
R squared training set:  93.67365394353646
Test MSE:  120.14160622392868
R squared test set:  55.26384604232118
Mean Absolute Error: YieldKgAcre    90.204023
dtype: float64 degrees.
Accuracy: YieldKgAcre    86.9
dtype: float64 %.




Summary for the Eco District ID: 851
Train MSE:  56.40113340381052
R squared training set:  91.46348296500778
Test MSE:  139.95320323786632
R squared test set:  59.75351983116499
Mean Absolute Error: YieldKgAcre    102.770962
dtype: float64 degrees.
Accuracy: YieldKgAcre    83.74
dtype: float64 %.




Summary for the Eco District ID: 775
Train MSE:  47.976983622690696
R squared training set:  90.69297129539483
Test MSE:  122.34757114785711
R squared test set:  42.45058609032718
Mean Absolute Error: YieldKgAcre    94.435315
dtype: float64 degrees.
Accuracy: YieldKgAcre    86.6
dtype: float64 %.




Summary for the Eco District ID: 716
Train MSE:  66.63718643001312
R squared training set:  91.22154711580174
Test MSE:  91.9833498288868
R squared test set:  70.87917695154201
Mean Absolute Error: YieldKgAcre    70.800527
dtype: float64 degrees.
Accuracy: YieldKgAcre    90.93
dtype: float64 %.




Summary for the Eco District ID: 739
Train MSE:  80.00999909820297
R squared training set:  80.62983420340942
Test MSE:  91.87904547770002
R squared test set:  70.58539540966956
Mean Absolute Error: YieldKgAcre    76.347593
dtype: float64 degrees.
Accuracy: YieldKgAcre    91.31
dtype: float64 %.




Summary for the Eco District ID: 843
Train MSE:  62.8008756881042
R squared training set:  88.36040293346502
Test MSE:  100.38426315517526
R squared test set:  72.77181249599958
Mean Absolute Error: YieldKgAcre    73.473813
dtype: float64 degrees.
Accuracy: YieldKgAcre    86.88
dtype: float64 %.




Summary for the Eco District ID: 661
Train MSE:  81.84162091624451
R squared training set:  85.48142068586671
Test MSE:  187.22892516430844
R squared test set:  28.456888822162774
Mean Absolute Error: YieldKgAcre    119.302626
dtype: float64 degrees.
Accuracy: YieldKgAcre    77.68
dtype: float64 %.




Summary for the Eco District ID: 682
Train MSE:  115.63163619254493
R squared training set:  70.33453955597977
Test MSE:  170.64218843460608
R squared test set:  48.82914242698677
Mean Absolute Error: YieldKgAcre    130.061637
dtype: float64 degrees.
Accuracy: YieldKgAcre    82.37
dtype: float64 %.




Summary for the Eco District ID: 847
Train MSE:  47.27222027634582
R squared training set:  92.16013625171703
Test MSE:  136.7735126807981
R squared test set:  48.252443606954444
Mean Absolute Error: YieldKgAcre    102.696516
dtype: float64 degrees.
Accuracy: YieldKgAcre    83.81
dtype: float64 %.




Summary for the Eco District ID: 669
Train MSE:  247.2424532303618
R squared training set:  64.86200088328388
Test MSE:  339.0580466692743
R squared test set:  -150.7473633206342
Mean Absolute Error: YieldKgAcre    211.490551
dtype: float64 degrees.
Accuracy: YieldKgAcre    54.3
dtype: float64 %.




Summary for the Eco District ID: 815
Train MSE:  96.92294046949475
R squared training set:  80.19061429311996
Test MSE:  128.61938919905032
R squared test set:  -3.2241975712602455
Mean Absolute Error: YieldKgAcre    84.831199
dtype: float64 degrees.
Accuracy: YieldKgAcre    88.03
dtype: float64 %.




Summary for the Eco District ID: 846
Train MSE:  63.64443199423893
R squared training set:  89.9726113070759
Test MSE:  121.29960615356352
R squared test set:  68.76537791067555
Mean Absolute Error: YieldKgAcre    96.613196
dtype: float64 degrees.
Accuracy: YieldKgAcre    81.28
dtype: float64 %.




Summary for the Eco District ID: 772
Train MSE:  83.62987451034536
R squared training set:  84.37896101596185
Test MSE:  102.41669568779669
R squared test set:  67.09305625145306
Mean Absolute Error: YieldKgAcre    80.454079
dtype: float64 degrees.
Accuracy: YieldKgAcre    87.72
dtype: float64 %.




Mean Absolute Error: YieldKgAcre    115.988382
dtype: float64 degrees.
Accuracy: YieldKgAcre    78.39
dtype: float64 %.




Summary for the Eco District ID: 742
Train MSE:  44.610982708049306
R squared training set:  90.32001220215868
Test MSE:  119.78809046495817
R squared test set:  55.974528455762986
Mean Absolute Error: YieldKgAcre    100.6183
dtype: float64 degrees.
Accuracy: YieldKgAcre    86.66
dtype: float64 %.




Summary for the Eco District ID: 761
Train MSE:  54.08938702727001
R squared training set:  93.32167225400589
Test MSE:  133.544255272733
R squared test set:  -35.838373453394446
Mean Absolute Error: YieldKgAcre    96.26783
dtype: float64 degrees.
Accuracy: YieldKgAcre    87.56
dtype: float64 %.




Summary for the Eco District ID: 853
Train MSE:  57.81486264611546
R squared training set:  92.33054978477672
Test MSE:  126.60055467922174
R squared test set:  31.453605613835045
Mean Absolute Error: YieldKgAcre    96.02813
dtype: float64 degrees.
Accuracy: YieldKgAcre    90.2
dtype: float64 %.




Summary for the Eco District ID: 672
Train MSE:  64.85901837991176
R squared training set:  89.56261683581562
Test MSE:  156.17756044194533
R squared test set:  15.435620438958841
Mean Absolute Error: YieldKgAcre    128.738925
dtype: float64 degrees.
Accuracy: YieldKgAcre    83.31
dtype: float64 %.




Summary for the Eco District ID: 699
Train MSE:  30.572440268909475
R squared training set:  97.49428460597704
Test MSE:  119.9331700209189
R squared test set:  49.66557762087208
Mean Absolute Error: YieldKgAcre    88.943537
dtype: float64 degrees.
Accuracy: YieldKgAcre    88.41
dtype: float64 %.




Summary for the Eco District ID: 768
Train MSE:  26.562260713731234
R squared training set:  96.57785437532883
Test MSE:  50.51867427258863
R squared test set:  81.65763060455431
Mean Absolute Error: YieldKgAcre    39.47126
dtype: float64 degrees.
Accuracy: YieldKgAcre    95.59
dtype: float64 %.




Summary for the Eco District ID: 778
Train MSE:  68.80186965270343
R squared training set:  79.80789923904375
Test MSE:  86.7912248168655
R squared test set:  44.23434118608765
Mean Absolute Error: YieldKgAcre    75.028177
dtype: float64 degrees.
Accuracy: YieldKgAcre    90.64
dtype: float64 %.




Summary for the Eco District ID: 657
Train MSE:  69.66757541134547
R squared training set:  93.27365484030588
Test MSE:  162.86682793525884
R squared test set:  62.49708829850884
Mean Absolute Error: YieldKgAcre    119.373891
dtype: float64 degrees.
Accuracy: YieldKgAcre    81.39
dtype: float64 %.




Summary for the Eco District ID: 677
Train MSE:  58.48594912348685
R squared training set:  94.78868787587929
Test MSE:  138.85387304230895
R squared test set:  83.16828418426326
Mean Absolute Error: YieldKgAcre    106.084295
dtype: float64 degrees.
Accuracy: YieldKgAcre    80.0
dtype: float64 %.




Summary for the Eco District ID: 691
Train MSE:  86.9838687111354
R squared training set:  87.44582972268506
Test MSE:  155.81412177528352
R squared test set:  -33.64738468594795
Mean Absolute Error: YieldKgAcre    128.314073
dtype: float64 degrees.
Accuracy: YieldKgAcre    82.19
dtype: float64 %.




Summary for the Eco District ID: 743
Train MSE:  30.966621465732338
R squared training set:  96.00291771451256
Test MSE:  113.13836093574442
R squared test set:  24.51124025738418
Mean Absolute Error: YieldKgAcre    96.674393
dtype: float64 degrees.
Accuracy: YieldKgAcre    88.45
dtype: float64 %.




Summary for the Eco District ID: 759
Train MSE:  39.429182196124685
R squared training set:  94.66156551158547
Test MSE:  89.23242397602766
R squared test set:  61.62583274929921
Mean Absolute Error: YieldKgAcre    81.172967
dtype: float64 degrees.
Accuracy: YieldKgAcre    91.65
dtype: float64 %.




Summary for the Eco District ID: 827
Train MSE:  122.84151206502861
R squared training set:  74.13819724454052
Test MSE:  106.85341878541453
R squared test set:  77.85516907975327
Mean Absolute Error: YieldKgAcre    84.11746
dtype: float64 degrees.
Accuracy: YieldKgAcre    86.02
dtype: float64 %.




Summary for the Eco District ID: 660
Train MSE:  51.33494854316939
R squared training set:  93.97927898591348
Test MSE:  261.7145654007187
R squared test set:  -115.00087870320796
Mean Absolute Error: YieldKgAcre    200.295791
dtype: float64 degrees.
Accuracy: YieldKgAcre    78.47
dtype: float64 %.




Summary for the Eco District ID: 819
Train MSE:  51.405027914878964
R squared training set:  94.42114493957409
Test MSE:  178.382368470144
R squared test set:  50.080686816433925
Mean Absolute Error: YieldKgAcre    143.614512
dtype: float64 degrees.
Accuracy: YieldKgAcre    71.76
dtype: float64 %.




Summary for the Eco District ID: 848
Train MSE:  64.55769732989704
R squared training set:  90.33175717042786
Test MSE:  103.18076332659875
R squared test set:  66.5860072790304
Mean Absolute Error: YieldKgAcre    82.30503
dtype: float64 degrees.
Accuracy: YieldKgAcre    88.57
dtype: float64 %.




Summary for the Eco District ID: 652
Train MSE:  28.70399514683778
R squared training set:  98.6912030990024
Test MSE:  86.28994650893029
R squared test set:  77.37463149420121
Mean Absolute Error: YieldKgAcre    67.78657
dtype: float64 degrees.
Accuracy: YieldKgAcre    91.94
dtype: float64 %.




Summary for the Eco District ID: 718
Train MSE:  43.42593857129907
R squared training set:  96.53540093411947
Test MSE:  138.1462489886452
R squared test set:  44.44751162099474
Mean Absolute Error: YieldKgAcre    119.4876
dtype: float64 degrees.
Accuracy: YieldKgAcre    83.78
dtype: float64 %.




Summary for the Eco District ID: 686
Train MSE:  0.2889206178196729
R squared training set:  99.9998272247727
Test MSE:  239.3636009025442
R squared test set:  -12.500840018374149
Mean Absolute Error: YieldKgAcre    192.089789
dtype: float64 degrees.
Accuracy: YieldKgAcre    78.09
dtype: float64 %.




Summary for the Eco District ID: 720
Train MSE:  4.932685982710488
R squared training set:  99.93397926426118
Test MSE:  156.6876755306399
R squared test set:  23.0525387947258
Mean Absolute Error: YieldKgAcre    140.198162
dtype: float64 degrees.
Accuracy: YieldKgAcre    52.85
dtype: float64 %.
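
Several districts above show a near-perfect training fit next to a weak or even negative test R squared. One detail of the loop that may affect the test numbers is that `scale(X_test)` standardizes the test set with its own mean and standard deviation. A common alternative is to learn the scaling on the training data only and reuse it for the test data. The sketch below shows that pattern with a scikit-learn pipeline. It is an illustration under assumptions, not the method used to produce the results in this file: `fit_ridge_for_district` is a hypothetical helper, and the data is synthetic.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def fit_ridge_for_district(X, y, cv=10, alphas=(0.1, 1.0, 10.0), random_state=40):
    """Hypothetical helper: 70/30 split, then scaler + RidgeCV in one pipeline."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.30, random_state=random_state
    )
    # StandardScaler is fitted on X_train only; the same means and standard
    # deviations are then applied whenever the pipeline sees new data.
    model = make_pipeline(StandardScaler(), RidgeCV(alphas=list(alphas), cv=cv))
    model.fit(X_train, y_train)
    return model, (X_train, X_test, y_train, y_test)


# Synthetic stand-in for one eco district (not project data).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
y_demo = X_demo @ np.array([3.0, -2.0, 0.5, 0.0, 1.0]) + rng.normal(scale=0.1, size=200)

model, (X_tr, X_te, y_tr, y_te) = fit_ridge_for_district(X_demo, y_demo)
test_r2 = model.score(X_te, y_te)                 # R squared on the held-out 30%
chosen_alpha = model.named_steps["ridgecv"].alpha_  # alpha selected by cross-validation
print("Test R squared:", test_r2)
print("Chosen alpha:", chosen_alpha)
```

Swapping this in for the `scale(...)` calls would change the reported numbers, so the summaries in this file would need to be regenerated before comparing.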






### Third for loop: modeling the eco districts in the 3rd list

In [40]:
for i in eco_district_ids_list3:     # iterate over each eco district ID in the 3rd list
    print("Summary for the Eco District ID:", i)  # print the eco district ID

    # The modeling code is the same as for the first model in this file.
    # drop() without inplace=True returns a new frame, which avoids pandas'
    # SettingWithCopyWarning on the filtered slice.
    df9 = dataframe[dataframe['ECODISTRICT_ID'] == i].drop(columns=['ECODISTRICT_ID'])

    # Split data into X and y
    x = df9.drop(columns=['YieldKgAcre'])
    y = df9[['YieldKgAcre']]

    X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.30, random_state=40)

    # Standardize the X variables.
    # Note: scale() standardizes the test set with its own mean and standard
    # deviation, not with the statistics of the training set.
    X_train = scale(X_train)
    X_test = scale(X_test)

    # RidgeCV chooses alpha from the candidates by 5-fold cross-validation
    rr = RidgeCV(alphas=[0.1, 1.0, 10.0], cv=5)
    rr.fit(X_train, y_train)
    pred_train_rr = rr.predict(X_train)
    # The printed value is sqrt(MSE), i.e. the RMSE, although the label says "MSE"
    print("Train MSE: ", np.sqrt(mean_squared_error(y_train, pred_train_rr)))
    print("R squared training set: ", r2_score(y_train, pred_train_rr)*100)

    pred_test_rr = rr.predict(X_test)
    print("Test MSE: ", np.sqrt(mean_squared_error(y_test, pred_test_rr)))
    print("R squared test set: ", r2_score(y_test, pred_test_rr)*100)

    # Calculate the absolute errors
    errors = abs(pred_test_rr - y_test)

    # Print the mean absolute error (MAE); it is in the units of YieldKgAcre,
    # the 'degrees.' suffix is only a label
    print('Mean Absolute Error:', np.mean(errors), 'degrees.')

    # Calculate the mean absolute percentage error (MAPE)
    mape = 100 * (errors / y_test)
    # Calculate and display accuracy, defined here as 100 - MAPE
    accuracy = 100 - np.mean(mape)
    print('Accuracy:', round(accuracy, 2), '%.')
    print("\n")
    print("\n")

Summary for the Eco District ID: 811
Train MSE:  2.259945942462853
R squared training set:  99.99334876269245
Test MSE:  184.56667845592872
R squared test set:  55.23924040698831
Mean Absolute Error: YieldKgAcre    160.406923
dtype: float64 degrees.
Accuracy: YieldKgAcre    73.5
dtype: float64 %.




Summary for the Eco District ID: 855
Train MSE:  1.5038022564333173
R squared training set:  99.98903726301795
Test MSE:  98.61064706242355
R squared test set:  57.71973418295067
Mean Absolute Error: YieldKgAcre    85.349931
dtype: float64 degrees.
Accuracy: YieldKgAcre    90.29
dtype: float64 %.




Summary for the Eco District ID: 379
Train MSE:  0.15057708273482312
R squared training set:  99.99997270585291
Test MSE:  239.96571136450584
R squared test set:  -80.91422041888387
Mean Absolute Error: YieldKgAcre    219.129738
dtype: float64 degrees.
Accuracy: YieldKgAcre    78.07
dtype: float64 %.




Summary for the Eco District ID: 647


A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return super().drop(


Train MSE:  0.22331908923287028
R squared training set:  99.99994586474308
Test MSE:  180.8050054459498
R squared test set:  -50.43112130109042
Mean Absolute Error: YieldKgAcre    152.316491
dtype: float64 degrees.
Accuracy: YieldKgAcre    76.13
dtype: float64 %.
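
The "A value is trying to be set on a copy of a slice from a DataFrame" message in the output above is pandas' SettingWithCopyWarning, and the `return super().drop(` line suggests it is triggered by an in-place `drop` on a filtered subset. The code that prepares each eco district's data is not shown in this section, so the following is only a sketch of one common way to avoid the warning. The column names `EcoDistrictID` and `Rainfall` and the toy values are assumptions. Only `YieldKgAcre` appears in the output.

```python
import pandas as pd

# Toy frame; every column except YieldKgAcre is hypothetical.
df = pd.DataFrame({
    "EcoDistrictID": [647, 647, 662, 662],
    "Rainfall": [310.0, 295.0, 402.0, 388.0],
    "YieldKgAcre": [820.0, 790.0, 640.0, 655.0],
})

# Filtering returns a subset of df; .copy() makes it an independent frame,
# so later modifications cannot be mistaken for writes into df.
district = df[df["EcoDistrictID"] == 647].copy()

# A non-inplace drop returns a new frame instead of mutating the subset.
X = district.drop(columns=["EcoDistrictID", "YieldKgAcre"])
y = district[["YieldKgAcre"]]

print(X.shape, y.shape)
```

Either change alone (the explicit `.copy()` or the non-inplace `drop`) is usually enough. The original frame `df` is left untouched in both cases.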




Summary for the Eco District ID: 659
Train MSE:  0.18859021891603053
R squared training set:  99.99994016220529
Test MSE:  199.26835782786264
R squared test set:  48.773725892018795
Mean Absolute Error: YieldKgAcre    193.833892
dtype: float64 degrees.
Accuracy: YieldKgAcre    70.15
dtype: float64 %.




Summary for the Eco District ID: 662




Train MSE:  0.10604083095882147
R squared training set:  99.99995717495662
Test MSE:  364.5052921929056
R squared test set:  -125.46391123251146
Mean Absolute Error: YieldKgAcre    282.099143
dtype: float64 degrees.
Accuracy: YieldKgAcre   -70.21
dtype: float64 %.




Summary for the Eco District ID: 668
Train MSE:  0.660710058724972
R squared training set:  99.99990584325825
Test MSE:  508.55592354437016
R squared test set:  -10422.541972935682
Mean Absolute Error: YieldKgAcre    471.368038
dtype: float64 degrees.
Accuracy: YieldKgAcre   -10.49
dtype: float64 %.




Summary for the Eco District ID: 833
Train MSE:  0.1305875679943832
R squared training set:  99.99994886835745
Test MSE:  155.54788401724187
R squared test set:  -27.266891431934393
Mean Absolute Error: YieldKgAcre    142.75108
dtype: float64 degrees.
Accuracy: YieldKgAcre    59.73
dtype: float64 %.




Summary for the Eco District ID: 834
Train MSE:  0.12092114991315477
R squared training set:  99.99997425829564
Test MSE:

