<a href="https://colab.research.google.com/github/tapiwamesa/Amini-Soil-Prediction/blob/main/Amini_Soil_Prediction.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **1. Problem Statement**

Soil fertility is a critical determinant of agricultural productivity and food security. Essential macronutrients like Nitrogen (N), Phosphorus (P), and Potassium (K), alongside micronutrients such as Calcium (Ca) and Magnesium (Mg), directly influence the growth and yield potential of crops. However, conventional soil testing methods are often costly, time-consuming, and inaccessible for many smallholder farmers, especially across Africa.

There is a pressing need for a scalable, data-driven solution that can accurately predict soil nutrient availability and recommend nutrient management strategies to optimize crop yields. By accurately predicting soil nutrient status and identifying nutrient deficits, the solution aims to enable cost-effective soil analysis at scale, empower farmers with tailored recommendations, improve fertilizer efficiency, and contribute to greater food security across the region.

# **2. Objectives**

Develop a machine learning model that:

1. Predicts the availability of 11 key soil nutrients based on soil and environmental features.

2. Calculates the nutrient gaps necessary to achieve a maize yield target of 4 tons per hectare.

3. Provides actionable insights to support precise fertilizer application and sustainable soil management.

# **3. Exploratory Data Analysis**

In [2]:
# importing dependencies

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputRegressor
from sklearn.ensemble import RandomForestRegressor
import lightgbm as lgbm
from sklearn.preprocessing import StandardScaler

In [3]:
# mounting drive

from google.colab import drive

drive.mount('/content/drive')

Mounted at /content/drive


In [4]:
# importing the data

train_gap = pd.read_csv("/content/drive/MyDrive/Colab Notebooks/Data /Zindi | Soil Nutrients Prediction/Gap_Train.csv")
test_gap = pd.read_csv("/content/drive/MyDrive/Colab Notebooks/Data /Zindi | Soil Nutrients Prediction/Gap_Test.csv")
train = pd.read_csv("/content/drive/MyDrive/Colab Notebooks/Data /Zindi | Soil Nutrients Prediction/Train.csv")
test = pd.read_csv("/content/drive/MyDrive/Colab Notebooks/Data /Zindi | Soil Nutrients Prediction/Test.csv")

In [5]:
# viewing the data

train_gap.head()

Unnamed: 0,Nutrient,Required,Available,Gap,PID
0,N,100.0,3796.0,-3696.0,ID_I5RGjv
1,P,40.0,0.9928,39.0072,ID_I5RGjv
2,K,52.0,429.24,-377.24,ID_I5RGjv
3,Ca,12.0,19943.6,-19931.6,ID_I5RGjv
4,Mg,8.0,6745.2,-6737.2,ID_I5RGjv


In [6]:
train.head()

Unnamed: 0,site,PID,lon,lat,pH,alb,bio1,bio12,bio15,bio7,...,P,K,Ca,Mg,S,Fe,Mn,Zn,Cu,B
0,site_id_bIEHwl,ID_I5RGjv,70.603761,46.173798,7.75,176,248,920,108,190,...,0.34,147,6830,2310,5.66,75.2,85.0,0.82,2.98,0.24
1,site_id_nGvnKc,ID_8jWzJ5,70.590479,46.078924,7.1,181,250,1080,113,191,...,11.7,151,1180,235,19.4,96.2,409.0,2.57,4.32,0.1
2,site_id_nGvnKc,ID_UgzkN8,70.582553,46.04882,6.95,188,250,1109,111,191,...,21.8,151,1890,344,11.0,76.7,65.0,1.95,1.24,0.22
3,site_id_nGvnKc,ID_DLLHM9,70.573267,46.02191,7.83,174,250,1149,112,191,...,39.9,201,6660,719,14.9,81.9,73.0,4.9,3.08,0.87
4,site_id_7SA9rO,ID_d009mj,70.58533,46.204336,8.07,188,250,869,114,191,...,1.0,90,7340,1160,8.66,69.4,149.0,0.55,3.03,0.31


In [7]:
test_gap.head()

Unnamed: 0,Nutrient,Required,PID
0,N,100.0,ID_NGS9Bx
1,P,40.0,ID_NGS9Bx
2,K,52.0,ID_NGS9Bx
3,Ca,12.0,ID_NGS9Bx
4,Mg,8.0,ID_NGS9Bx


In [8]:
test.head()

Unnamed: 0,site,PID,lon,lat,pH,alb,bio1,bio12,bio15,bio7,...,para,parv,ph20,slope,snd20,soc20,tim,wp,xhp20,BulkDensity
0,site_id_hgJpkz,ID_NGS9Bx,69.170794,44.522885,6.86,144,256,910,108,186,...,37.940418,467.619293,6.825,1.056416,25.5,15.25,8.732471,0.016981,0.005831,1.2
1,site_id_olmuI5,ID_YdVKXw,68.885265,44.741057,7.08,129,260,851,110,187,...,35.961353,542.590149,6.725,0.730379,18.75,14.0,10.565657,0.02103,0.005134,1.24
2,site_id_PTZdJz,ID_MZAlfE,68.97021,44.675777,6.5,142,259,901,109,187,...,38.983898,416.385437,6.825,1.146542,21.0,14.0,9.590125,0.018507,0.00448,1.23
3,site_id_DOTgr8,ID_GwCCMN,69.068751,44.647707,6.82,142,261,847,109,187,...,39.948471,374.971008,6.725,0.56721,23.25,12.25,9.669279,0.021688,0.006803,1.22
4,site_id_1rQNvy,ID_K8sowf,68.990002,44.577607,6.52,145,253,1109,110,186,...,33.658615,361.233643,6.2,1.169207,26.25,18.25,7.89592,0.023016,0.000874,1.23


In [9]:
# checking columns in test dataset
test.columns

Index(['site', 'PID', 'lon', 'lat', 'pH', 'alb', 'bio1', 'bio12', 'bio15',
       'bio7', 'bp', 'cec20', 'dows', 'ecec20', 'hp20', 'ls', 'lstd', 'lstn',
       'mb1', 'mb2', 'mb3', 'mb7', 'mdem', 'para', 'parv', 'ph20', 'slope',
       'snd20', 'soc20', 'tim', 'wp', 'xhp20', 'BulkDensity'],
      dtype='object')

In [10]:
# checking columns in train dataset
train.columns

Index(['site', 'PID', 'lon', 'lat', 'pH', 'alb', 'bio1', 'bio12', 'bio15',
       'bio7', 'bp', 'cec20', 'dows', 'ecec20', 'hp20', 'ls', 'lstd', 'lstn',
       'mb1', 'mb2', 'mb3', 'mb7', 'mdem', 'para', 'parv', 'ph20', 'slope',
       'snd20', 'soc20', 'tim', 'wp', 'xhp20', 'BulkDensity', 'N', 'P', 'K',
       'Ca', 'Mg', 'S', 'Fe', 'Mn', 'Zn', 'Cu', 'B'],
      dtype='object')

In [11]:
# data types and missing values for test dataset
test.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2418 entries, 0 to 2417
Data columns (total 33 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   site         2418 non-null   object 
 1   PID          2418 non-null   object 
 2   lon          2418 non-null   float64
 3   lat          2418 non-null   float64
 4   pH           2418 non-null   float64
 5   alb          2418 non-null   int64  
 6   bio1         2418 non-null   int64  
 7   bio12        2418 non-null   int64  
 8   bio15        2418 non-null   int64  
 9   bio7         2418 non-null   int64  
 10  bp           2418 non-null   float64
 11  cec20        2418 non-null   float64
 12  dows         2418 non-null   float64
 13  ecec20       2418 non-null   float64
 14  hp20         2418 non-null   float64
 15  ls           2418 non-null   float64
 16  lstd         2418 non-null   float64
 17  lstn         2418 non-null   float64
 18  mb1          2418 non-null   float64
 19  mb2   

In [12]:
# Null values in test set
test.isnull().sum()[test.isnull().sum() != 0]

Unnamed: 0,0


In [13]:
# data types and missing values for train

train.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7744 entries, 0 to 7743
Data columns (total 44 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   site         7744 non-null   object 
 1   PID          7744 non-null   object 
 2   lon          7744 non-null   float64
 3   lat          7744 non-null   float64
 4   pH           7744 non-null   float64
 5   alb          7744 non-null   int64  
 6   bio1         7744 non-null   int64  
 7   bio12        7744 non-null   int64  
 8   bio15        7744 non-null   int64  
 9   bio7         7744 non-null   int64  
 10  bp           7744 non-null   float64
 11  cec20        7744 non-null   float64
 12  dows         7744 non-null   float64
 13  ecec20       7739 non-null   float64
 14  hp20         7739 non-null   float64
 15  ls           7744 non-null   float64
 16  lstd         7744 non-null   float64
 17  lstn         7744 non-null   float64
 18  mb1          7744 non-null   float64
 19  mb2   

In [14]:
# Null values in train set
train.isnull().sum()[train.isnull().sum() != 0]

Unnamed: 0,0
ecec20,5
hp20,5
xhp20,5
BulkDensity,4


In [15]:
# filling the null values in train set

for column in train.columns:
    if train[column].isnull().any():
        train[column].fillna(np.mean(train[column]), inplace = True)

The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.

For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.


  train[column].fillna(np.mean(train[column]), inplace = True)


In [16]:
# checking if all nulls have been filled with column averages
train.isnull().sum()[train.isnull().sum() != 0]

Unnamed: 0,0


In [17]:
# Lets explore the train set

train.head()

Unnamed: 0,site,PID,lon,lat,pH,alb,bio1,bio12,bio15,bio7,...,P,K,Ca,Mg,S,Fe,Mn,Zn,Cu,B
0,site_id_bIEHwl,ID_I5RGjv,70.603761,46.173798,7.75,176,248,920,108,190,...,0.34,147,6830,2310,5.66,75.2,85.0,0.82,2.98,0.24
1,site_id_nGvnKc,ID_8jWzJ5,70.590479,46.078924,7.1,181,250,1080,113,191,...,11.7,151,1180,235,19.4,96.2,409.0,2.57,4.32,0.1
2,site_id_nGvnKc,ID_UgzkN8,70.582553,46.04882,6.95,188,250,1109,111,191,...,21.8,151,1890,344,11.0,76.7,65.0,1.95,1.24,0.22
3,site_id_nGvnKc,ID_DLLHM9,70.573267,46.02191,7.83,174,250,1149,112,191,...,39.9,201,6660,719,14.9,81.9,73.0,4.9,3.08,0.87
4,site_id_7SA9rO,ID_d009mj,70.58533,46.204336,8.07,188,250,869,114,191,...,1.0,90,7340,1160,8.66,69.4,149.0,0.55,3.03,0.31


In [18]:
# dropping the site and PID columns
train_set = train.drop(columns = ["site", "PID"])

In [19]:
# target columns

target_columns = ['N', 'P', 'K', 'Ca', 'Mg', 'S', 'Fe', 'Mn', 'Zn', 'Cu', 'B']
train_data = train_set.drop(columns = target_columns)

In [20]:
# viewing clean train data

train_data.head()

Unnamed: 0,lon,lat,pH,alb,bio1,bio12,bio15,bio7,bp,cec20,...,para,parv,ph20,slope,snd20,soc20,tim,wp,xhp20,BulkDensity
0,70.603761,46.173798,7.75,176,248,920,108,190,0.581573,22.0,...,20.544283,126.83548,7.05,1.962921,39.0,9.75,7.962668,0.016853,0.000708,1.46
1,70.590479,46.078924,7.1,181,250,1080,113,191,0.707011,24.0,...,18.869566,109.835541,6.975,0.162065,40.0,8.0,8.4395,0.018321,0.001676,1.52
2,70.582553,46.04882,6.95,188,250,1109,111,191,0.362439,15.25,...,24.719807,214.385269,6.725,0.744845,46.0,9.25,8.289246,0.020588,0.003885,1.46
3,70.573267,46.02191,7.83,174,250,1149,112,191,0.531739,22.0,...,27.230274,255.713043,6.625,0.708708,43.75,10.0,8.666523,0.016913,0.001714,1.48
4,70.58533,46.204336,8.07,188,250,869,114,191,0.039202,14.75,...,20.434782,86.220909,6.7,0.634153,49.25,7.0,15.139549,0.019791,0.0,1.43


In [21]:
train_data.shape

(7744, 31)

In [22]:
test.shape

(2418, 33)

# **4. Model Building**

In [23]:
# importing metrics

from sklearn.metrics import mean_squared_error, mean_absolute_error

In [24]:
# splitting train data into feature and target variables

X = train_data
y = train_set[target_columns]
X_test = test.drop(columns  = ["site", "PID"])

In [25]:
# Splitting the data into train and test

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size = 0.2, random_state = 42)

### **a. Linear Regression**

In [26]:
# importing linear regression
from sklearn.linear_model import LinearRegression

In [27]:
# Scaling the data

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_val_scaled = scaler.transform(X_val)
X_test_scaled = scaler.transform(X_test)

In [28]:
# Training the linear regression on scaled data

linear = MultiOutputRegressor(LinearRegression())
linear.fit(X_train_scaled, y_train)

In [29]:
# predicting the validation set

y_pred_linear = linear.predict(X_val_scaled)
mse_linear = mean_absolute_error(y_val, y_pred_linear)
rmse_linear = np.sqrt(mse_linear)
print("MSE:", mse_linear)
print("\nRMSE:", rmse_linear)

MSE: 214.04092670467008

RMSE: 14.630137617420763


### **b. Random Forest**

In [30]:
# Training the model

forest = MultiOutputRegressor(RandomForestRegressor(n_estimators = 100, random_state = 42))
forest.fit(X_train, y_train)

In [31]:
# predicting the validation set

y_pred_forest = forest.predict(X_val)
mse_forest = mean_absolute_error(y_val, y_pred_forest)
rmse_forest = np.sqrt(mse_forest)
print("MAE:", mse_forest)
print("\n RMSE", rmse_forest)

MAE: 159.88389656669992

 RMSE 12.644520416635022


### **c. LightGBM**

In [32]:
# Training the model

lgbm = MultiOutputRegressor(lgbm.LGBMRegressor(random_state = 42))
lgbm.fit(X_train, y_train)

[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.002483 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 6693
[LightGBM] [Info] Number of data points in the train set: 6195, number of used features: 31
[LightGBM] [Info] Start training from score 1659.143341
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.002348 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 6693
[LightGBM] [Info] Number of data points in the train set: 6195, number of used features: 31
[LightGBM] [Info] Start training from score 15.498505
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.002324 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 6693
[LightGBM] [Info] Number of data points in the train set: 6195, number of used features: 31
[LightGBM] [Info] Start tr

In [33]:
# predicting the validation set

y_pred_lgbm = lgbm.predict(X_val)
mse_lgbm = mean_absolute_error(y_val, y_pred_lgbm)
rmse_lgbm = np.sqrt(mse_lgbm)
print("MAE:", mse_lgbm)
print("\n RMSE", rmse_lgbm)

MAE: 158.05734207181388

 RMSE 12.57208582820742


## **4.1 Using LightGBM to predict Nutrient requirements**

In [39]:
# Predicting with the test set
y_test_lgbm = lgbm.predict(X_test)
y_test_lgbm

array([[ 1.69319482e+03,  3.29593200e+00,  1.75680041e+02, ...,
         1.57231855e+00,  4.55783201e+00,  2.04633225e-01],
       [ 1.26455437e+03,  4.22721938e+00,  1.77410516e+02, ...,
         1.09475113e+00,  4.03090181e+00,  1.63974895e-01],
       [ 1.73585134e+03,  6.75341888e-01,  1.75120220e+02, ...,
         1.86431114e+00,  5.26031086e+00,  2.11529735e-01],
       ...,
       [ 3.01571412e+03, -9.04307115e+00,  1.78930961e+02, ...,
         4.54512928e+00,  1.48853911e+00,  3.42467119e-01],
       [ 2.48350521e+03,  1.86489602e+01,  2.51205364e+02, ...,
         8.16556513e+00,  1.07969504e+00,  4.42044548e-01],
       [ 1.88550845e+03,  7.38597129e+00,  4.27878386e+02, ...,
         7.27905417e+00,  1.30568948e+00,  5.25903771e-01]])

In [42]:
# storing the predictions in a dataframe

nutrients = pd.DataFrame(y_test_lgbm, columns = target_columns)
nutrients.head()

Unnamed: 0,N,P,K,Ca,Mg,S,Fe,Mn,Zn,Cu,B
0,1693.194816,3.295932,175.680041,5789.811036,1823.182984,8.52325,134.157665,119.103855,1.572319,4.557832,0.204633
1,1264.554373,4.227219,177.410516,6836.541161,2356.737597,8.634354,117.000133,116.622436,1.094751,4.030902,0.163975
2,1735.851342,0.675342,175.12022,5025.604462,1959.940103,8.25744,130.323303,159.430042,1.864311,5.260311,0.21153
3,1878.928764,2.395137,183.582715,5588.023051,2037.932729,8.52325,144.155745,157.752277,1.346423,4.354029,0.164895
4,1786.109449,8.740927,235.840022,4325.733944,1428.156915,7.780887,130.658769,132.285718,1.688838,5.546616,0.196814


In [67]:
# Creating series of Available Nutrients in ppm
N_pred = nutrients['N']
P_pred = nutrients['P']
K_pred = nutrients['K']
Ca_pred = nutrients['Ca']
Mg_pred = nutrients['Mg']
S_pred = nutrients['S']
Fe_pred = nutrients['Fe']
Mn_pred = nutrients['Mn']
Zn_pred = nutrients['Zn']
Cu_pred = nutrients['Cu']
B_pred = nutrients['B']

In [72]:
# Concatenating the test data with the predicted available nutrients to extract PID and Bulk Density
nutrients_pid = pd.DataFrame({'PID':test['PID'], 'BulkDensisty':test['BulkDensity'], 'N':N_pred, 'P':P_pred, 'K':K_pred, 'Ca':Ca_pred, 'Mg':Mg_pred, 'S':S_pred, 'Fe':Fe_pred, 'Mn':Mn_pred, 'Zn':Zn_pred, 'Cu':Cu_pred, 'B':B_pred})
nutrients_pid.head()

Unnamed: 0,PID,BulkDensisty,N,P,K,Ca,Mg,S,Fe,Mn,Zn,Cu,B
0,ID_NGS9Bx,1.2,1693.194816,3.295932,175.680041,5789.811036,1823.182984,8.52325,134.157665,119.103855,1.572319,4.557832,0.204633
1,ID_YdVKXw,1.24,1264.554373,4.227219,177.410516,6836.541161,2356.737597,8.634354,117.000133,116.622436,1.094751,4.030902,0.163975
2,ID_MZAlfE,1.23,1735.851342,0.675342,175.12022,5025.604462,1959.940103,8.25744,130.323303,159.430042,1.864311,5.260311,0.21153
3,ID_GwCCMN,1.22,1878.928764,2.395137,183.582715,5588.023051,2037.932729,8.52325,144.155745,157.752277,1.346423,4.354029,0.164895
4,ID_K8sowf,1.23,1786.109449,8.740927,235.840022,4325.733944,1428.156915,7.780887,130.658769,132.285718,1.688838,5.546616,0.196814


In [91]:
# unpivoting the data to have nutrients columns as row entries

nutrients_unpivoted = pd.melt(nutrients_pid, id_vars = ["PID", "BulkDensisty"], value_vars = target_columns, var_name = 'Nutrient', value_name = 'Available Nutrients (ppm)')
nutrients_unpivoted.head()

Unnamed: 0,PID,BulkDensisty,Nutrient,Available Nutrients (ppm)
0,ID_NGS9Bx,1.2,N,1693.194816
1,ID_YdVKXw,1.24,N,1264.554373
2,ID_MZAlfE,1.23,N,1735.851342
3,ID_GwCCMN,1.22,N,1878.928764
4,ID_K8sowf,1.23,N,1786.109449


In [93]:
# Merging nutrients with the test_gap dataset to retrieve the reqiore nutrients

nutrients_complete = pd.merge(nutrients_unpivoted, test_gap, on = ['PID', 'Nutrient'])
nutrients_complete

Unnamed: 0,PID,BulkDensisty,Nutrient,Available Nutrients (ppm),Required
0,ID_NGS9Bx,1.20,N,1693.194816,100.00
1,ID_YdVKXw,1.24,N,1264.554373,100.00
2,ID_MZAlfE,1.23,N,1735.851342,100.00
3,ID_GwCCMN,1.22,N,1878.928764,100.00
4,ID_K8sowf,1.23,N,1786.109449,100.00
...,...,...,...,...,...
26593,ID_mZTENs,1.08,B,0.666246,0.08
26594,ID_oxY8vm,1.08,B,0.524198,0.08
26595,ID_aUAOl1,1.08,B,0.342467,0.08
26596,ID_6qaAmn,1.07,B,0.442045,0.08


In [103]:
# Calculating the Available Nutrients (kg/ha) and Gap

soil_depth = 20
nutrients_complete['Available Nutrients (kg/a)'] = nutrients_complete['Available Nutrients (ppm)'] * soil_depth * nutrients_complete['BulkDensisty'] * 0.1
nutrients_complete['Gap'] = nutrients_complete['Required'] - nutrients_complete['Available Nutrients (ppm)']
final = nutrients_complete.copy()

In [104]:
final['PID'] = final['PID'] + "_" + final['Nutrient']

In [125]:
# Creating a submission file

submission_file = final[["PID", "Gap"]]
submission_file.set_index("PID", inplace = True)
submission_file.head()

Unnamed: 0_level_0,Gap
PID,Unnamed: 1_level_1
ID_NGS9Bx_N,-1593.194816
ID_YdVKXw_N,-1164.554373
ID_MZAlfE_N,-1635.851342
ID_GwCCMN_N,-1778.928764
ID_K8sowf_N,-1686.109449


In [126]:
submission_file.to_csv('/content/drive/MyDrive/Colab Notebooks/Data /Zindi | Soil Nutrients Prediction/Submission_file.csv')

# **5. Conclusion**

In this project, we developed and evaluated three machine learning models thus Linear Regression, Random Forest, and LightGBM to predict soil nutrient availability and calculate nutrient gaps for optimizing maize yields.

The models were assessed using Mean Squared Error (MSE) and Root Mean Squared Error (RMSE) metrics below:

In [135]:
Results = pd.DataFrame([[mse_linear, mse_forest, mse_lgbm], [rmse_linear, rmse_forest, rmse_lgbm]], columns = ["Linear Regression", "Random Forest", "LightGBM"], index = ["MSE", 'RMSE'])
Results

Unnamed: 0,Linear Regression,Random Forest,LightGBM
MSE,214.040927,159.883897,158.057342
RMSE,14.630138,12.64452,12.572086


- The Linear Regression model showed the highest error, indicating that simple linear relationships were insufficient to capture the complexity of soil nutrient dynamics.

- The Random Forest model significantly reduced the MSE and RMSE compared to Linear Regression, demonstrating that ensemble-based methods better model nonlinear relationships in the data.

- LightGBM achieved the best performance overall, with the lowest MSE (158.06) and RMSE (12.57), suggesting it can most accurately predict soil nutrient levels in this context.

Given its superior accuracy, **LightGBM** is the recommended model for deployment to support soil analysis and fertilizer recommendation systems. Its ability to handle complex feature interactions and large datasets makes it particularly suitable for scaling this solution across diverse agricultural regions.