## **#06. Time Series and ARIMA Models**
- Instructor: [Jaeung Sim](https://jaeungs.github.io/) (University of Connecticut)
- Course: OPIM 5671 Data Mining and Time Series Forecasting
- Last updated: October 14, 2025

**Objectives**
- Implement and assess ARIMA models for time series forecasting.

### **Part 1: Colab Environment**

In [None]:
# Mount Google Drive in Colab
from google.colab import drive
drive.mount('/content/drive')

In [None]:
# Check your current directory
import os
os.getcwd()

In [None]:
# Set your working directory
os.chdir('/content/drive/My Drive/Colab Notebooks/OPIM 5671 (Fall 2025)') # Change the directory to your own

### **Part 2: Data Exploration**

In [None]:
import pandas as pd

# Load the dataset
df = pd.read_csv("dataset_06_Apple_weekly_price.csv")

# Parse the 'Date' column as datetime and sort
df['Date'] = pd.to_datetime(df['Date'])
df = df.sort_values('Date')

In [None]:
# Explore the data structure
df.head()

In [None]:
# Remove commas from 'Volume' and convert to numeric
df['Volume'] = df['Volume'].str.replace(',', '').astype(int)

# Explore the updated structure
df.head()

In [None]:
import matplotlib.pyplot as plt

# Plot 'Close' price trend
plt.figure(figsize=(12, 5))
plt.plot(df['Date'], df['Close'], label='Close Price', color='blue')
plt.title('Apple Weekly Close Price')
plt.xlabel('Date')
plt.ylabel('Close Price')
plt.grid(True)
plt.legend()
plt.show()

# Plot 'Volume' trend
plt.figure(figsize=(12, 5))
plt.plot(df['Date'], df['Volume'], label='Volume', color='green')
plt.title('Apple Weekly Trading Volume')
plt.xlabel('Date')
plt.ylabel('Volume')
plt.grid(True)
plt.legend()
plt.show()

### **Part 3: ARIMA Model Estimation and Assessment**

In [None]:
from statsmodels.tsa.arima.model import ARIMA
from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_error
import numpy as np

In [None]:
# Set 'Date' as index
df_arima = df.set_index('Date')

# Use only the 'Close' column
ts = df_arima['Close']

In [None]:
df_arima.head()

Try an ARIMA(1, 0, 0) model: one autoregressive lag, no differencing, and no moving-average term.

In [None]:
# Fit ARIMA(p=1, d=0, q=0)
model = ARIMA(ts, order=(1, 0, 0))
model_fit = model.fit()

# Summary of the model
print(model_fit.summary())

In [None]:
# Predict in-sample to evaluate model performance
# (start=1 skips the first observation, which has no lagged value to predict from)
y_pred = model_fit.predict(start=1, end=len(ts)-1)
mse = mean_squared_error(ts.iloc[1:], y_pred)
rmse = np.sqrt(mse)
mae = mean_absolute_error(ts.iloc[1:], y_pred)

print(f"In-sample MSE: {mse:.4f}")
print(f"In-sample RMSE: {rmse:.4f}")
print(f"In-sample MAE:  {mae:.4f}")
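
In-sample errors are easier to judge next to a benchmark. The sketch below computes the same three metrics for a naive forecast that predicts each week's close with the previous week's close. The helper names (`naive_forecast`, `error_metrics`) and the toy prices are illustrative assumptions, not part of the dataset or the course code.

```python
import math

def naive_forecast(values):
    """One-step naive forecast: predict each value with the one before it."""
    return list(values[:-1])

def error_metrics(actual, predicted):
    """Return (MSE, RMSE, MAE) for two equal-length sequences."""
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e ** 2 for e in errors) / len(errors)
    mae = sum(abs(e) for e in errors) / len(errors)
    return mse, math.sqrt(mse), mae

# Toy prices for illustration only (not taken from the Apple dataset)
prices = [100.0, 102.0, 101.0, 105.0]

# Align the forecasts with the observations they predict (second value onward)
mse, rmse, mae = error_metrics(prices[1:], naive_forecast(prices))
print(f"Naive MSE: {mse:.4f}, RMSE: {rmse:.4f}, MAE: {mae:.4f}")
```

With the notebook's series, the same benchmark could be obtained by passing `ts.iloc[1:]` and `ts.shift(1).iloc[1:]` to the scikit-learn metric functions. An ARIMA model that does not clearly beat these numbers adds little over "next week looks like this week".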

In [None]:
# Forecast the next 10 weeks
forecast = model_fit.forecast(steps=10)
print("\nForecast for next 10 weeks:\n", forecast)
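
Forecasts from an AR(1) model with a constant are pulled back toward the estimated mean as the horizon grows. The sketch below reproduces that pattern by hand, assuming the mean form $\hat{y}_{T+h} = \mu + \phi^h (y_T - \mu)$, under which the `const` row of the summary is read as the mean $\mu$ rather than as an intercept. The function name `ar1_forecast` and the numbers are illustrative assumptions, not estimates from this dataset.

```python
def ar1_forecast(mu, phi, last_value, steps):
    """h-step-ahead AR(1) forecasts in mean form: mu + phi**h * (y_T - mu)."""
    return [mu + phi ** h * (last_value - mu) for h in range(1, steps + 1)]

# Illustrative values only (not estimates from the Apple series)
path = ar1_forecast(mu=100.0, phi=0.5, last_value=110.0, steps=3)
print(path)  # [105.0, 102.5, 101.25]
```

When the estimated AR coefficient is close to 1, as is common for price levels, the pull toward the mean is slow and the 10-week forecast stays near the last observed close.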

Try an ARIMA(1, 1, 1) model: one autoregressive lag, first differencing, and one moving-average term.
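
Setting d=1 means the AR and MA terms are fitted to week-over-week changes rather than to price levels, and forecasts are converted back to levels by cumulating the predicted changes. A minimal sketch of that round trip, with hypothetical helper names and toy numbers:

```python
from itertools import accumulate

def difference(values):
    """First difference: change from each observation to the next."""
    return [b - a for a, b in zip(values, values[1:])]

def undifference(first_value, changes):
    """Rebuild levels by cumulating changes on top of the first observation."""
    return list(accumulate(changes, initial=first_value))

prices = [10.0, 12.0, 15.0, 14.0]  # toy numbers for illustration
changes = difference(prices)
print(changes)  # [2.0, 3.0, -1.0]
print(undifference(prices[0], changes))  # [10.0, 12.0, 15.0, 14.0]
```

Differencing shortens the series by one observation, which is why in-sample evaluation of a d=1 model starts at the second data point.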

In [None]:
# Fit ARIMA(p=1, d=1, q=1)
model = ARIMA(ts, order=(1, 1, 1))
model_fit = model.fit()

# Summary of the model
print(model_fit.summary())

# Predict in-sample to evaluate model performance
# (start=1 skips the first observation, which is lost to differencing)
y_pred = model_fit.predict(start=1, end=len(ts)-1)
mse = mean_squared_error(ts.iloc[1:], y_pred)
rmse = np.sqrt(mse)
mae = mean_absolute_error(ts.iloc[1:], y_pred)

print(f"In-sample MSE: {mse:.4f}")
print(f"In-sample RMSE: {rmse:.4f}")
print(f"In-sample MAE:  {mae:.4f}")

In [None]:
# Forecast the next 10 weeks
forecast = model_fit.forecast(steps=10)
print("\nForecast for next 10 weeks:\n", forecast)
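
The metrics above are computed on the same data used for fitting, so they tend to flatter the model. One common check is to hold out the most recent weeks, fit on the rest, and score forecasts on the held-out part. The sketch below shows a time-ordered split and scores a flat last-value forecast as a stand-in for a fitted model. `split_ordered`, `n_test`, and the toy numbers are assumptions for illustration.

```python
import math

def split_ordered(values, n_test):
    """Split a time-ordered sequence, keeping the last n_test points for testing."""
    if not 0 < n_test < len(values):
        raise ValueError("n_test must be between 1 and len(values) - 1")
    return list(values[:-n_test]), list(values[-n_test:])

def rmse(actual, predicted):
    """Root mean squared error of two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

prices = [100.0, 102.0, 101.0, 105.0, 107.0, 103.0]  # toy numbers
train, test = split_ordered(prices, n_test=2)

# Stand-in forecast: repeat the last training value over the test horizon
flat_forecast = [train[-1]] * len(test)
holdout_rmse = rmse(test, flat_forecast)
print(f"Hold-out RMSE: {holdout_rmse:.4f}")
```

In the notebook, the stand-in could be replaced with `ARIMA(train, order=(1, 1, 1)).fit().forecast(steps=len(test))`, where `train` and `test` are the leading and trailing slices of `ts`. The split is never shuffled, because the model must only see data that precedes the weeks it forecasts.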