## Baseline Models

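This notebook builds and evaluates baseline deep learning models for crop yield prediction, using the processed features from `02_feature_engineering.ipynb`. All models are trained on the log-transformed target (`y_log`), so the reported MSE and MAE are in log units.
The models are: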

- Convolutional Neural Network (CNN)
- Long Short-Term Memory network (LSTM)
- Deep Neural Network (DNN)


### Ignore Future Warnings

In [58]:
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)


### Import Libraries

In [59]:
import numpy as np
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Conv1D, Flatten


### Load Processed Data

In [60]:
# Load features and targets
X = np.load("../processed/X_final.npy")
y = np.load("../processed/y.npy")        # original target
y_log = np.load("../processed/y_log.npy")  # log-transformed target

print("X shape:", X.shape)
print("y shape:", y.shape)


X shape: (56717, 225)
y shape: (56717,)


### Train-Test Split

In [61]:
X_train, X_test, y_train, y_test = train_test_split(
    X, y_log, test_size=0.2, random_state=42
)

print("Train shape:", X_train.shape)
print("Test shape:", X_test.shape)


Train shape: (45373, 225)
Test shape: (11344, 225)
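
The split sizes printed above can be reproduced by hand. The sketch below assumes that `train_test_split` rounds the test count up and assigns the remaining rows to training; the variable names are illustrative and not part of the notebook.

```python
import math

n_samples = 56717   # rows in X, taken from the shapes printed earlier
test_size = 0.2     # same fraction passed to train_test_split

# Assumption: the test count is rounded up, the rest goes to training.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

print("train rows:", n_train)
print("test rows:", n_test)
```

Under that assumption the counts agree with the reported train and test shapes.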


## Baseline CNN Model

We first reshape the training and test feature matrices to make them compatible with a 1D Convolutional Neural Network.


In [62]:
# Reshape for Conv1D: (samples, timesteps, features)
X_train_cnn = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)
X_test_cnn = X_test.reshape(X_test.shape[0], X_test.shape[1], 1)

print("CNN input shape:", X_train_cnn.shape)


CNN input shape: (45373, 225, 1)
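
To make the reshape concrete, here is a minimal sketch on a toy array (the sizes 4 and 5 are made up for illustration). It shows that the operation only appends a channel axis of length 1 and leaves the values untouched. One consequence worth keeping in mind is that the 225 feature columns are then treated as positions along a sequence, so the convolution assumes neighbouring columns are related.

```python
import numpy as np

# Toy stand-in for X_train: 4 samples, 5 features (hypothetical sizes).
X_toy = np.arange(20, dtype=np.float32).reshape(4, 5)

# Append a channel axis of length 1, mirroring the reshape used for Conv1D.
X_toy_cnn = X_toy.reshape(X_toy.shape[0], X_toy.shape[1], 1)

# Two equivalent spellings of the same operation.
same_as_newaxis = np.array_equal(X_toy_cnn, X_toy[..., np.newaxis])
same_as_expand = np.array_equal(X_toy_cnn, np.expand_dims(X_toy, axis=-1))

print(X_toy_cnn.shape)
print(same_as_newaxis, same_as_expand)
```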


### Define and Compile CNN Model

We create a simple baseline 1D CNN with two convolutional layers, followed by a flatten layer and dense layers for regression output.


In [63]:
from tensorflow.keras.layers import Input

# Define CNN model (Sequential, Conv1D, Flatten and Dense are imported above)
cnn_model = Sequential([
    Input(shape=(X_train_cnn.shape[1], 1)),  # 225 steps, 1 channel
    Conv1D(filters=32, kernel_size=3, activation='relu'),
    Conv1D(filters=16, kernel_size=3, activation='relu'),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(1)  # regression output
])

# Compile model
cnn_model.compile(optimizer='adam', loss='mse', metrics=['mae'])

# Print summary
cnn_model.summary()
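
The summary table is not reproduced in this export, so the following sketch works out the layer shapes and parameter counts by hand. It assumes the Keras defaults for `Conv1D` (stride 1, `'valid'` padding) and uses helper functions that are illustrative only, not Keras APIs.

```python
def conv1d_out_len(length, kernel_size):
    """Output length of a 1D convolution with stride 1 and 'valid' padding."""
    return length - kernel_size + 1

def conv1d_params(in_channels, filters, kernel_size):
    """Kernel weights plus one bias per filter."""
    return kernel_size * in_channels * filters + filters

def dense_params(in_units, out_units):
    """Weight matrix plus one bias per output unit."""
    return in_units * out_units + out_units

len_after_conv1 = conv1d_out_len(225, 3)              # first Conv1D
len_after_conv2 = conv1d_out_len(len_after_conv1, 3)  # second Conv1D
flat_units = len_after_conv2 * 16                     # Flatten over 16 filters

cnn_params = {
    "conv1d_1": conv1d_params(1, 32, 3),
    "conv1d_2": conv1d_params(32, 16, 3),
    "dense_1": dense_params(flat_units, 64),
    "dense_2": dense_params(64, 1),
}
cnn_total = sum(cnn_params.values())

print(len_after_conv1, len_after_conv2, flat_units)
print(cnn_params, cnn_total)
```

Almost all of the parameters sit in the first dense layer, because `Flatten` hands it every position of every filter.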




### Train the CNN Model

We train the CNN on the training data, using 20% of it for validation.  
For this baseline, we use 10 epochs and a batch size of 64.
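
These settings determine the step counts that appear in the training and evaluation logs. The sketch below is a back-of-the-envelope check, assuming that Keras holds out the last 20% of the rows (rounding the training count down) and that `evaluate` uses its default batch size of 32; the variable names are illustrative.

```python
import math

n_train_full = 45373    # rows in X_train
n_test = 11344          # rows in X_test
validation_split = 0.2
batch_size = 64
eval_batch_size = 32    # assumption: Keras default when batch_size is not given

# Assumption: the training portion is rounded down, the rest is validation.
n_fit = math.floor(n_train_full * (1 - validation_split))
n_val = n_train_full - n_fit

steps_per_epoch = math.ceil(n_fit / batch_size)
eval_steps = math.ceil(n_test / eval_batch_size)

print("fit rows:", n_fit, "validation rows:", n_val)
print("steps per epoch:", steps_per_epoch)
print("evaluation steps:", eval_steps)
```

Note that `validation_split` takes the validation rows from the end of the arrays before any shuffling, so the held-out rows depend on the row order produced by `train_test_split`.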


In [64]:
##  Train CNN 

history = cnn_model.fit(
    X_train_cnn, y_train,         # training features and target
    validation_split=0.2,         # 20% of training data for validation
    epochs=10,                    # baseline epochs
    batch_size=64                 # standard batch size
)


Epoch 1/10
568/568 ━━━━━━━━━━━━━━━━━━━━ 4s 5ms/step - loss: 2.8897 - mae: 0.8855 - val_loss: 0.3832 - val_mae: 0.4547
Epoch 2/10
568/568 ━━━━━━━━━━━━━━━━━━━━ 5s 9ms/step - loss: 0.3186 - mae: 0.4224 - val_loss: 0.3683 - val_mae: 0.4378
Epoch 3/10
568/568 ━━━━━━━━━━━━━━━━━━━━ 12s 21ms/step - loss: 0.2993 - mae: 0.4072 - val_loss: 0.3048 - val_mae: 0.3925
Epoch 4/10
568/568 ━━━━━━━━━━━━━━━━━━━━ 17s 15ms/step - loss: 0.3084 - mae: 0.4130 - val_loss: 0.3340 - val_mae: 0.4268
Epoch 5/10
568/568 ━━━━━━━━━━━━━━━━━━━━ 9s 16ms/step - loss: 0.3031 - mae: 0.4100 - val_loss: 0.3382 - val_mae: 0.4309
Epoch 6/10
568/568 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - loss: 0.3020 - mae: 0.4100 - val_loss: 0.3342 - val_mae: 0.4087
Epoch 7/10
568/568 ━━━━━━━━━━━━━━━━━━━━ 7s 12ms/s

In [65]:
## Evaluate on Test Set

test_loss, test_mae = cnn_model.evaluate(X_test_cnn, y_test)
print("Test MSE:", test_loss)
print("Test MAE:", test_mae)


355/355 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.3881 - mae: 0.4639
Test MSE: 0.38806986808776855
Test MAE: 0.46392112970352173
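
Because the model is fit to `y_log`, these errors are in log space rather than in yield units. The sketch below shows one way to report errors on the original scale. It assumes the target was produced with `np.log1p`, which this notebook does not confirm; if a plain `np.log` was used, `np.exp` would be the matching inverse. The function name and toy values are hypothetical.

```python
import numpy as np

def original_scale_errors(y_true_log, y_pred_log):
    """Return (MAE, RMSE) after undoing an assumed log1p transform."""
    y_true = np.expm1(np.asarray(y_true_log, dtype=float))
    y_pred = np.expm1(np.asarray(y_pred_log, dtype=float))
    err = y_pred - y_true
    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(np.mean(err ** 2)))
    return mae, rmse

# Toy example: the same log-space error of 0.2 on small and large yields.
toy_yields = np.array([10.0, 100.0, 1000.0])
toy_true_log = np.log1p(toy_yields)
toy_pred_log = toy_true_log + 0.2

toy_mae, toy_rmse = original_scale_errors(toy_true_log, toy_pred_log)
print("MAE on original scale:", toy_mae)
print("RMSE on original scale:", toy_rmse)
```

With the trained model, the call would look like `original_scale_errors(y_test, cnn_model.predict(X_test_cnn).ravel())`. The toy example illustrates that a fixed log-space error translates into a larger absolute error for larger yields.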


In [66]:
## Save the model 

# cnn_model.save("../processed/cnn_baseline_model.h5")
cnn_model.save("../processed/cnn_baseline_model.keras")



## Baseline LSTM Model

We build a simple LSTM network for crop yield prediction, using the processed features.


In [67]:
# Reshape for LSTM
X_train_lstm = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)
X_test_lstm = X_test.reshape(X_test.shape[0], X_test.shape[1], 1)

print("LSTM input shape:", X_train_lstm.shape)


LSTM input shape: (45373, 225, 1)


### Define and Compile LSTM Model

In [68]:
from tensorflow.keras.layers import Input

lstm_model = Sequential([
    Input(shape=(X_train_lstm.shape[1], 1)),  # input layer
    LSTM(64, return_sequences=False),
    Dense(32, activation='relu'),
    Dense(1)  # regression output
])

lstm_model.compile(optimizer='adam', loss='mse', metrics=['mae'])
lstm_model.summary()
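
As with the CNN, the summary output is not shown here, so the parameter counts can be sketched by hand. The formula assumes a standard LSTM layer with four gates, each with input weights, recurrent weights and a bias; the helper functions are illustrative only.

```python
def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights and a bias."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(in_units, out_units):
    """Weight matrix plus one bias per output unit."""
    return in_units * out_units + out_units

lstm_layer_params = {
    "lstm": lstm_params(input_dim=1, units=64),
    "dense_1": dense_params(64, 32),
    "dense_2": dense_params(32, 1),
}
lstm_total = sum(lstm_layer_params.values())

print(lstm_layer_params, lstm_total)
```

The model is small in parameter terms, but the LSTM has to walk through all 225 steps of each sample in sequence, which is consistent with the much longer step times in the training log below compared with the CNN.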


### Train the LSTM Model

In [69]:
history_lstm = lstm_model.fit(
    X_train_lstm, y_train,
    validation_split=0.2,
    epochs=10,
    batch_size=64
)


Epoch 1/10
568/568 ━━━━━━━━━━━━━━━━━━━━ 44s 75ms/step - loss: 5.0953 - mae: 1.3060 - val_loss: 1.2078 - val_mae: 0.9037
Epoch 2/10
568/568 ━━━━━━━━━━━━━━━━━━━━ 41s 72ms/step - loss: 1.1796 - mae: 0.9007 - val_loss: 1.1777 - val_mae: 0.8904
Epoch 3/10
568/568 ━━━━━━━━━━━━━━━━━━━━ 54s 95ms/step - loss: 1.1476 - mae: 0.8823 - val_loss: 1.1113 - val_mae: 0.8562
Epoch 4/10
568/568 ━━━━━━━━━━━━━━━━━━━━ 33s 57ms/step - loss: 1.0847 - mae: 0.8461 - val_loss: 1.0680 - val_mae: 0.8321
Epoch 5/10
568/568 ━━━━━━━━━━━━━━━━━━━━ 31s 54ms/step - loss: 1.0307 - mae: 0.8198 - val_loss: 1.0287 - val_mae: 0.8176
Epoch 6/10
568/568 ━━━━━━━━━━━━━━━━━━━━ 32s 56ms/step - loss: 0.8827 - mae: 0.7519 - val_loss: 0.7698 - val_mae: 0.6939
Epoch 7/10
568/568 ━━━━━━━━━━━━━━━━━━━━ 41s

### Evaluate the LSTM Model

In [70]:
test_loss_lstm, test_mae_lstm = lstm_model.evaluate(X_test_lstm, y_test)
print("LSTM Test MSE:", test_loss_lstm)
print("LSTM Test MAE:", test_mae_lstm)


355/355 ━━━━━━━━━━━━━━━━━━━━ 25s 72ms/step - loss: 0.5457 - mae: 0.5769
LSTM Test MSE: 0.5456959009170532
LSTM Test MAE: 0.5768837928771973


In [71]:
## Save LSTM model
lstm_model.save("../processed/lstm_baseline_model.keras")


## Baseline DNN Model

We build a simple fully connected Deep Neural Network for crop yield prediction using the processed features.


### Define and Compile DNN Model

In [72]:
from tensorflow.keras.layers import Input

# Define DNN model (Sequential and Dense are imported above)
dnn_model = Sequential([
    Input(shape=(X_train.shape[1],)),  # 225 input features
    Dense(128, activation='relu'),
    Dense(64, activation='relu'),
    Dense(32, activation='relu'),
    Dense(1)  # regression output
])

dnn_model.compile(optimizer='adam', loss='mse', metrics=['mae'])
dnn_model.summary()




In [73]:
## Train DNN 

history_dnn = dnn_model.fit(
    X_train, y_train,
    validation_split=0.2,
    epochs=10,
    batch_size=64
)


Epoch 1/10
568/568 ━━━━━━━━━━━━━━━━━━━━ 16s 17ms/step - loss: 6.3292 - mae: 1.1769 - val_loss: 0.2837 - val_mae: 0.3793
Epoch 2/10
568/568 ━━━━━━━━━━━━━━━━━━━━ 8s 14ms/step - loss: 0.2020 - mae: 0.3260 - val_loss: 0.1815 - val_mae: 0.2913
Epoch 3/10
568/568 ━━━━━━━━━━━━━━━━━━━━ 9s 12ms/step - loss: 0.1315 - mae: 0.2550 - val_loss: 0.1358 - val_mae: 0.2329
Epoch 4/10
568/568 ━━━━━━━━━━━━━━━━━━━━ 11s 13ms/step - loss: 0.1069 - mae: 0.2274 - val_loss: 0.1248 - val_mae: 0.2204
Epoch 5/10
568/568 ━━━━━━━━━━━━━━━━━━━━ 7s 13ms/step - loss: 0.0959 - mae: 0.2145 - val_loss: 0.1217 - val_mae: 0.2184
Epoch 6/10
568/568 ━━━━━━━━━━━━━━━━━━━━ 11s 14ms/step - loss: 0.0911 - mae: 0.2079 - val_loss: 0.1193 - val_mae: 0.2163
Epoch 7/10
568/568 ━━━━━━━━━━━━━━━━━━━━ 11s 1

In [74]:
## Evaluate DNN 
test_loss_dnn, test_mae_dnn = dnn_model.evaluate(X_test, y_test)
print("DNN Test MSE:", test_loss_dnn)
print("DNN Test MAE:", test_mae_dnn)


355/355 ━━━━━━━━━━━━━━━━━━━━ 3s 8ms/step - loss: 0.0943 - mae: 0.2029
DNN Test MSE: 0.09427804499864578
DNN Test MAE: 0.20291753113269806


In [75]:
## Save DNN model 

dnn_model.save("../processed/dnn_baseline_model.keras")
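
To compare the three baselines side by side, the sketch below collects the test metrics printed by the evaluation cells (rounded to four decimals) and ranks the models by MSE. The `results` dictionary is hard-coded for illustration and is not part of the original code; all values are on the log-transformed target.

```python
# Test metrics copied from the evaluation outputs, rounded to 4 decimals.
results = {
    "CNN": {"mse": 0.3881, "mae": 0.4639},
    "LSTM": {"mse": 0.5457, "mae": 0.5769},
    "DNN": {"mse": 0.0943, "mae": 0.2029},
}

# Rank models from lowest to highest test MSE.
ranked = sorted(results.items(), key=lambda item: item[1]["mse"])
ranking = [name for name, _ in ranked]

for name, metrics in ranked:
    rmse = metrics["mse"] ** 0.5
    print(f"{name:<5} MSE={metrics['mse']:.4f}  RMSE={rmse:.4f}  MAE={metrics['mae']:.4f}")
```

On these numbers the fully connected DNN has the lowest test error of the three baselines, followed by the CNN and then the LSTM.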
