Using the monthly Mauna Loa CO2 dataset, perform forecasting with an MLP and compare the results with those of MA (Moving Average) and ARMA (Auto Regressive Moving Average) models. Main setting: use the previous “K” readings to predict the next “T” reading(s). For example, if “K=3” and “T=1”, we use the data from January, February, and March to predict the reading for April. Comment on why you observe such results. For MA or ARMA you can use any library or implement the model from scratch. The choice of MLP is up to you.
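
The K/T setting described above amounts to a sliding-window split of the series. The following is a minimal sketch on a toy list of made-up values, not the notebook's own code; `make_windows` is a hypothetical helper name, and the target is assumed to be the T readings that immediately follow each window.

```python
import numpy as np

def make_windows(series, k, t):
    """Split a 1-D series into (inputs, targets) pairs.

    Each input row holds k consecutive readings; the matching target row
    holds the t readings that immediately follow. `make_windows` is a
    hypothetical helper used only for this illustration.
    """
    series = np.asarray(series, dtype=float)
    n_windows = len(series) - k - t + 1
    X = np.array([series[i:i + k] for i in range(n_windows)])
    y = np.array([series[i + k:i + k + t] for i in range(n_windows)])
    return X, y

# Toy monthly readings standing in for Jan..Jun (made-up values).
readings = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
X, y = make_windows(readings, k=3, t=1)
print(X[0], y[0])  # first pair: Jan, Feb, Mar as input, Apr as target
```

A series of length N yields N - K - T + 1 such pairs, which is the same loop bound the cells below use.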

In [151]:
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

# Load the Mauna Loa CO2 dataset
data = pd.read_csv("/Users/heerkubadia/Desktop/Sem-4/Machine Learning/co2_mm_mlo.csv")
data

Unnamed: 0,year,month,decimal date,average,deseasonalized,ndays,sdev,unc
0,1958,3,1958.2027,315.70,314.43,-1,-9.99,-0.99
1,1958,4,1958.2877,317.45,315.16,-1,-9.99,-0.99
2,1958,5,1958.3699,317.51,314.71,-1,-9.99,-0.99
3,1958,6,1958.4548,317.24,315.14,-1,-9.99,-0.99
4,1958,7,1958.5370,315.86,315.18,-1,-9.99,-0.99
...,...,...,...,...,...,...,...,...
787,2023,10,2023.7917,418.82,422.12,27,0.47,0.17
788,2023,11,2023.8750,420.46,422.46,21,0.91,0.38
789,2023,12,2023.9583,421.86,422.58,20,0.68,0.29
790,2024,1,2024.0417,422.80,422.45,27,0.73,0.27


In [152]:
# Extract the CO2 readings
co2_data = data[['average']]

In [153]:
k = 3  # Previous K readings
t = 1  # Predict next T reading

In [154]:
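# Moving-average forecaster: predict the next reading as the mean of the previous k readings
def moving_average(data, k, t):
    predictions = []
    # One forecast per window that still has t readings after it
    for i in range(len(data)-(k+t)+1):
        window = data.iloc[i:i+k]
        # data is a one-column DataFrame, so take the scalar mean of its values
        prediction = window.values.mean()
        predictions.append(prediction)
    return np.array(predictions)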
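
One property of this forecaster is worth keeping in mind: on a steadily rising series, the mean of the last k readings always sits below the next reading. The sketch below checks this on a made-up linear series (the start value and slope are arbitrary assumptions, not CO2 data) and compares it with a last-value baseline.

```python
import numpy as np

# Made-up linear series: starts at 300 and rises by 0.2 per step.
slope = 0.2
series = 300.0 + slope * np.arange(24)

k = 3
targets = series[k:]  # the reading that follows each window
# Mean of each window of k consecutive readings (one forecast per target).
ma_forecast = np.array([series[i:i + k].mean() for i in range(len(series) - k)])
# Persistence baseline: repeat the last reading of each window.
last_value = series[k - 1:-1]

ma_error = targets - ma_forecast
naive_error = targets - last_value
print("moving-average error:", ma_error[0])
print("last-value error:", naive_error[0])
```

For a linear trend the moving-average error equals slope × (k + 1) / 2, so a larger K lags further behind, while repeating the last reading is off by only one step of the slope.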

In [155]:
# Fit an ARIMA(p, d, q) model to `data` and return its one-step-ahead forecast
def arma_model(data, p, d, q):
    model = ARIMA(data, order=(p, d, q))
    model_fit = model.fit()
    return model_fit.forecast().iloc[0]

In [156]:
# Split chronologically: first 80% for training, last 20% for testing
train_size = int(len(data) * 0.8)
train, test = co2_data[:train_size], co2_data[train_size:]

In [157]:
# MLP Model: build (previous k readings -> next t readings) training pairs
X_train, y_train = [], []
for i in range(len(train)-k-t+1):
    X_train.append(train.iloc[i:i+k].values.flatten())
    y_train.append(train.iloc[i+k:i+k+t].values.flatten())
X_train, y_train = np.array(X_train), np.array(y_train)
# With t = 1 the targets form a single column; flatten to 1-D to avoid sklearn's DataConversionWarning
if t == 1:
    y_train = y_train.ravel()

mlp_model = MLPRegressor(hidden_layer_sizes=(50, 50), max_iter=1000, random_state=42)
mlp_model.fit(X_train, y_train)

In [158]:
ma_predictions = moving_average(test, k, t)

In [159]:
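# Note: each ARIMA(3, 0, 2) model is fitted on a window of only k = 3 readings,
# which is what triggers the "Too few observations" warnings in the output
arma_predictions = [arma_model(test[i:i+k], p=3, d=0, q=2) for i in range(len(test)-k-t+1)]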


  warn('Too few observations to estimate starting parameters%s.'
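
The warning above appears because every call estimates an ARMA(3, 2) model from just three observations. One alternative is to estimate the coefficients once on a long history and then apply them to each test window. The sketch below illustrates that idea with a plain AR(k) model fitted by ordinary least squares in NumPy. It is a swapped-in simplification, not statsmodels' ARIMA and not the notebook's method, and it runs on a synthetic trend-plus-seasonal series rather than the CO2 file. `lagged`, `fit_ar`, and `predict_ar` are hypothetical helper names.

```python
import numpy as np

def lagged(series, k):
    """Rows of k consecutive readings and the reading that follows each row."""
    X = np.array([series[i:i + k] for i in range(len(series) - k)])
    return X, series[k:]

def fit_ar(series, k):
    """Least-squares AR(k) with intercept (hypothetical helper, not ARIMA)."""
    X, y = lagged(series, k)
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict_ar(coef, windows):
    """One-step forecast for each row of k readings."""
    return coef[0] + windows @ coef[1:]

# Synthetic monthly series (an assumption): linear trend + yearly cycle + small noise.
rng = np.random.default_rng(0)
months = np.arange(480)
series = 315 + 0.15 * months + 3 * np.sin(2 * np.pi * months / 12)
series = series + rng.normal(0, 0.1, size=months.size)

k = 3
split = int(len(series) * 0.8)
train, test = series[:split], series[split:]

coef = fit_ar(train, k)           # estimated once, on the training part only
X_test, y_test = lagged(test, k)  # windows of k readings and their next reading
ar_rmse = np.sqrt(np.mean((y_test - predict_ar(coef, X_test)) ** 2))
ma_rmse = np.sqrt(np.mean((y_test - X_test.mean(axis=1)) ** 2))
print("AR(3) by least squares RMSE:", ar_rmse)
print("3-month moving average RMSE:", ma_rmse)
```

The point of the design is that the number of fitted parameters stays small relative to the number of training rows, which is not the case when a five-parameter model is fitted to three readings.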

In [160]:
# Forecast using MLP
mlp_predictions = []
for i in range(len(test)-k-t+1):
    input_data = test.iloc[i:i+k].values.flatten().reshape(1, -1)
    prediction = mlp_model.predict(input_data)
    mlp_predictions.append(prediction.flatten()[0])

In [162]:

# Evaluate performance
# np.sqrt(MSE) gives the RMSE and also works on scikit-learn versions that no longer accept `squared=False`
ma_rmse = np.sqrt(mean_squared_error(test.iloc[k:], ma_predictions))
arma_rmse = np.sqrt(mean_squared_error(test.iloc[k:], arma_predictions))
mlp_rmse = np.sqrt(mean_squared_error(test.iloc[k:], mlp_predictions))

print("Moving Average RMSE:", ma_rmse)
print("ARMA RMSE:", arma_rmse)
print("MLP RMSE:", mlp_rmse)


Moving Average RMSE: 2.3006705791921
ARMA RMSE: 3.4128304694359604
MLP RMSE: 2.1546906549447864
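
For reference, the RMSE reported above is the square root of the mean squared difference between each forecast and the reading it targets. A minimal NumPy sketch on made-up numbers follows; the `rmse` helper is hypothetical and not part of the notebook.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    if actual.shape != predicted.shape:
        raise ValueError("actual and predicted must have the same length")
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Made-up readings and forecasts, for illustration only.
actual = [400.0, 401.0, 402.0, 403.0]
predicted = [399.0, 401.0, 404.0, 403.0]
print(rmse(actual, predicted))
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones, and the value is in the same units as the readings (ppm for the CO2 series).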




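On the test split, the MLP has the lowest RMSE (2.15), followed by the moving average (2.30) and ARMA (3.41). The moving average predicts the mean of the previous three months, so it lags a series that rises year over year and swings with the seasons. The ARMA result is not a fair reflection of the model class: each ARMA(3, 2) fit sees only k = 3 observations, too few to estimate its parameters (hence the "Too few observations" warnings), and d = 0 assumes a stationary series although the CO2 readings show a clear upward trend. The MLP is the only model whose parameters are learned from the full training split, which lets it map the last three readings to the next one more accurately, although its margin over the moving average is small.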