
# Time Series Analysis using ARIMA and (Optional) GARCH Modeling

## 1. Introduction
In this notebook, we will perform time series analysis using the ARIMA model.  
We will use the **AirPassengers dataset** (monthly totals of international airline passengers from 1949 to 1960).  
Students who apply ARCH/GARCH models for volatility analysis will earn **extra marks**.


## 2. Import Required Libraries

In [None]:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from arch import arch_model



## 3. Load the Dataset

We use the AirPassengers dataset. It records monthly totals of international airline passengers (in thousands).


In [None]:

url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/airline-passengers.csv'
df = pd.read_csv(url)
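df['Month'] = pd.to_datetime(df['Month'], format='%Y-%m')
# Set a month-start frequency on the index so later models do not have to infer it
df = df.set_index('Month').asfreq('MS')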
df.head()
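
The loading step can also be checked offline. The following is a minimal sketch, assuming the same two-column layout (`Month`, `Passengers`) and using a few inline sample rows instead of the URL. The name `sample` is introduced here for illustration only.

```python
import io
import pandas as pd

# Small inline sample in the same layout as the CSV (Month, Passengers).
csv_text = """Month,Passengers
1949-01,112
1949-02,118
1949-03,132
1949-04,129
"""

sample = pd.read_csv(io.StringIO(csv_text))
# Parse 'YYYY-MM' strings explicitly rather than relying on format inference.
sample["Month"] = pd.to_datetime(sample["Month"], format="%Y-%m")
# asfreq('MS') attaches a month-start frequency; missing months would appear as NaN rows.
sample = sample.set_index("Month").asfreq("MS")

print(sample.index.freqstr)
print(int(sample["Passengers"].isna().sum()))
```

If the second printed number is greater than zero, some months are missing from the data and should be handled before modeling.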



## 4. Visualize the Time Series

We plot the original series to look for a trend, a repeating seasonal pattern, and changes in variability over time.


In [None]:

plt.figure(figsize=(10,5))
plt.plot(df)
plt.title('Monthly International Airline Passengers (1949-1960)')
plt.xlabel('Date')
plt.ylabel('Passengers (thousands)')
plt.grid(True)
plt.show()



## 5. Stationarity Check

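A (weakly) stationary series has a constant mean, variance, and autocovariance structure over time.  
We first difference the series to remove the trend, then apply the **Augmented Dickey-Fuller (ADF)** test. Its null hypothesis is that the series has a unit root (is non-stationary), so a small p-value (for example, below 0.05) is evidence in favor of stationarity.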


In [None]:

# Differencing
df_diff = df.diff().dropna()

# Plot differenced data
plt.figure(figsize=(10,5))
plt.plot(df_diff)
plt.title('Differenced Series')
plt.grid(True)
plt.show()

# ADF Test
from statsmodels.tsa.stattools import adfuller
adf_test = adfuller(df_diff['Passengers'])
print('ADF Statistic:', adf_test[0])
print('p-value:', adf_test[1])
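
A first difference removes a trend but not a repeating yearly pattern. The sketch below, on a synthetic monthly series (an assumption for illustration, not the airline data), compares a first difference with a seasonal difference at lag 12, which is a common additional step for monthly data.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series: linear trend plus a fixed 12-month pattern.
t = np.arange(48)
season = np.tile([0, 2, 5, 3, 1, 6, 9, 9, 4, 0, -3, 1], 4)
y = pd.Series(
    10.0 + 2.0 * t + season,
    index=pd.date_range("2000-01-01", periods=48, freq="MS"),
)

first_diff = y.diff().dropna()        # y_t - y_{t-1}: removes the trend, keeps the seasonal pattern
seasonal_diff = y.diff(12).dropna()   # y_t - y_{t-12}: removes the repeating pattern
both = y.diff(12).diff().dropna()     # seasonal difference followed by a first difference

print(first_diff.std(), seasonal_diff.std(), both.std())
```

On this noise-free toy series the seasonal difference is constant and the combined difference is zero, while the first difference alone still varies with the season.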



## 6. Identify ARIMA Parameters (p, d, q)

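We use ACF and PACF plots of the differenced series to choose **p** and **q**: the PACF suggests the autoregressive order **p**, and the ACF suggests the moving-average order **q**. Since we differenced once, **d** = 1.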


In [None]:

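# ACF and PACF plots of the differenced series (40 lags covers three seasonal cycles)
plot_acf(df_diff['Passengers'], lags=40)
plot_pacf(df_diff['Passengers'], lags=40)
plt.show()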
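
To make clear what the ACF plot displays, here is a minimal sketch that computes the sample autocorrelation by hand with NumPy. The function name `sample_acf` is hypothetical, and the constant band of ±1.96/√n is only a rough white-noise guide; `plot_acf` may draw its intervals differently.

```python
import numpy as np

def sample_acf(x, nlags):
    """Sample autocorrelation r_k for k = 0..nlags (biased estimator, normalized by lag 0)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[: n - k], x[k:]) / denom for k in range(nlags + 1)])

# White noise: autocorrelations beyond lag 0 should mostly fall inside the band.
rng = np.random.default_rng(0)
noise = rng.normal(size=200)
r = sample_acf(noise, nlags=10)
band = 1.96 / np.sqrt(len(noise))

print(r[0], band)
```

Lags whose autocorrelation lies well outside the band are the ones worth considering when choosing **q** (from the ACF) or, with the partial autocorrelation, **p**.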



## 7. Build and Fit the ARIMA Model

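Based on the plots, we fit an **ARIMA(2,1,2)** model. Note that this is a non-seasonal specification, so it does not explicitly model the 12-month pattern in the data.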


In [None]:

# ARIMA Model
model = ARIMA(df['Passengers'], order=(2,1,2))
model_fit = model.fit()
model_fit.summary()
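
Written out, an ARIMA(2,1,2) model applies an ARMA(2,2) to the first difference $y'_t = y_t - y_{t-1}$. The equation below is a sketch in one common sign convention. The symbols $\phi_i$, $\theta_j$, and $\varepsilon_t$ are notation introduced here, and we assume no constant term, which to our understanding is the statsmodels default when d > 0.

```latex
y'_t = \phi_1 y'_{t-1} + \phi_2 y'_{t-2}
     + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2},
\qquad y'_t = y_t - y_{t-1}
```

The `ar.L1`, `ar.L2`, `ma.L1`, and `ma.L2` rows of the summary table correspond to $\phi_1$, $\phi_2$, $\theta_1$, and $\theta_2$.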



## 8. Forecast Future Values

We forecast the next 12 months and plot the forecast along with confidence intervals.


In [None]:

forecast = model_fit.get_forecast(steps=12)
forecast_ci = forecast.conf_int()

plt.figure(figsize=(10,5))
plt.plot(df, label='Observed')
plt.plot(forecast.predicted_mean, label='Forecast', color='red')
plt.fill_between(forecast_ci.index, forecast_ci.iloc[:, 0], forecast_ci.iloc[:, 1],
                 color='pink', alpha=0.5, label='95% confidence interval')
plt.title('Forecast vs Observed')
plt.legend()
plt.grid(True)
plt.show()
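
A forecast beyond the end of the data cannot be scored. One hedged way to judge a model is to hold out the last cycle, forecast it, and compare the error with a simple baseline. The sketch below uses a seasonal-naive baseline and RMSE on a toy series with period 4; the function names and values are illustrative assumptions, not part of the airline data.

```python
import numpy as np

def seasonal_naive(history, steps, period=12):
    """Forecast by repeating the last `period` observed values."""
    history = np.asarray(history, dtype=float)
    last_cycle = history[-period:]
    return np.array([last_cycle[h % period] for h in range(steps)])

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Toy series with period 4 (illustrative values only).
series = np.array([10, 20, 30, 40, 12, 22, 32, 42, 14, 24, 34, 44], dtype=float)
train, test = series[:-4], series[-4:]

baseline = seasonal_naive(train, steps=4, period=4)
print(baseline, rmse(test, baseline))
```

For the airline data, the same idea would be to fit the ARIMA model on all but the last 12 months and check whether its RMSE on those 12 months beats the seasonal-naive baseline with `period=12`.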



## 9. (Optional) Volatility Analysis Using GARCH Model

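For extra marks, we apply a **GARCH(1,1)** model to the differenced series. GARCH models the conditional variance, so here it describes how the variability of the month-to-month changes evolves over time.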


In [None]:

# Month-to-month changes in passengers, used here in place of financial returns
returns = df_diff['Passengers']
model_garch = arch_model(returns, vol='Garch', p=1, q=1)
model_garch_fit = model_garch.fit(disp='off')
model_garch_fit.summary()
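
The idea behind GARCH(1,1) is a recursion for the conditional variance: today's variance depends on yesterday's squared shock and yesterday's variance. A minimal NumPy sketch follows. The function name, the start-up value, and the parameter values are assumptions chosen for illustration, not estimates from the fitted model.

```python
import numpy as np

def garch11_variance(resid, omega, alpha, beta):
    """Variance path: sigma2[t] = omega + alpha * resid[t-1]**2 + beta * sigma2[t-1]."""
    resid = np.asarray(resid, dtype=float)
    sigma2 = np.empty(len(resid))
    sigma2[0] = resid.var()  # start-up value: sample variance (one common choice)
    for t in range(1, len(resid)):
        sigma2[t] = omega + alpha * resid[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2

# Hypothetical shocks and parameters, for illustration only.
shocks = np.array([1.0, -1.0, 3.0, 0.0, 0.0])
path = garch11_variance(shocks, omega=0.5, alpha=0.2, beta=0.5)
print(path)
```

The large shock at position 2 raises the variance at the next step, after which it decays back. In the summary table, `omega`, `alpha[1]`, and `beta[1]` play the roles of the three parameters in this recursion.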



## 10. Conclusion

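- The ARIMA(2,1,2) model produced a 12-month forecast with confidence intervals for the AirPassengers dataset; being non-seasonal, it does not explicitly capture the yearly pattern.
- First differencing removed the trend; the ADF p-value from Section 5 indicates whether the differenced series can be treated as stationary.
- (Optional) The GARCH(1,1) model describes how the variance of the month-to-month changes evolves over time.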
