# Time Series Analysis with Air Passengers Dataset

This notebook demonstrates various time series analysis techniques using the Air Passengers dataset. We'll cover:
1. Time series visualization and decomposition
2. Stationarity testing
3. ARIMA modeling
4. Prophet forecasting
5. Model evaluation

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.arima.model import ARIMA
from prophet import Prophet
from sklearn.metrics import mean_squared_error, mean_absolute_error
import warnings
warnings.filterwarnings('ignore')

# Set random seed for reproducibility
np.random.seed(42)

# Set style for plots (the 'seaborn' style was renamed to 'seaborn-v0_8' in Matplotlib 3.6)
plt.style.use('seaborn-v0_8')
sns.set_palette('husl')

In [None]:
# Load the Air Passengers dataset
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/airline-passengers.csv"
df = pd.read_csv(url)
df['Month'] = pd.to_datetime(df['Month'])
df.set_index('Month', inplace=True)
df.columns = ['Passengers']

# Display basic information
print(f"Number of observations: {len(df)}")
print(f"Date range: {df.index.min()} to {df.index.max()}")
print("\nFirst few rows:")
df.head()

In [None]:
# Plot the time series
plt.figure(figsize=(12, 6))
plt.plot(df.index, df['Passengers'])
plt.title('Air Passengers Over Time')
plt.xlabel('Date')
plt.ylabel('Number of Passengers')
plt.grid(True)
plt.show()

In [None]:
# Decompose the time series
decomposition = seasonal_decompose(df['Passengers'], model='multiplicative', period=12)

plt.figure(figsize=(12, 8))

plt.subplot(4, 1, 1)
plt.plot(df.index, df['Passengers'])
plt.title('Original Time Series')
plt.grid(True)

plt.subplot(4, 1, 2)
plt.plot(df.index, decomposition.trend)
plt.title('Trend Component')
plt.grid(True)

plt.subplot(4, 1, 3)
plt.plot(df.index, decomposition.seasonal)
plt.title('Seasonal Component')
plt.grid(True)

plt.subplot(4, 1, 4)
plt.plot(df.index, decomposition.resid)
plt.title('Residual Component')
plt.grid(True)

plt.tight_layout()
plt.show()
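
For intuition, a classical multiplicative decomposition can be approximated by hand: estimate the trend with a centered moving average, divide it out, and average the resulting ratios by position in the cycle. The sketch below runs on a synthetic series rather than the real data, and `classical_multiplicative` is a hypothetical helper that assumes an even period. It is a simplified illustration, not the exact `seasonal_decompose` implementation.

```python
import numpy as np

def classical_multiplicative(y, period=12):
    """Simplified classical multiplicative decomposition (assumes an even period)."""
    y = np.asarray(y, dtype=float)
    # Centered moving average: half weight on the two end points, full weight elsewhere
    weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
    half = period // 2
    trend = np.full(len(y), np.nan)  # the first and last `half` points have no estimate
    trend[half:len(y) - half] = np.convolve(y, weights, mode="valid")
    # Detrend by division, then average the ratios for each position in the cycle
    ratios = y / trend
    seasonal_idx = np.array([np.nanmean(ratios[i::period]) for i in range(period)])
    seasonal_idx /= seasonal_idx.mean()  # normalise so the indices average to 1
    seasonal = np.tile(seasonal_idx, len(y) // period + 1)[:len(y)]
    resid = y / (trend * seasonal)
    return trend, seasonal, resid

# Synthetic series: linear trend times a yearly sine-shaped seasonal factor
t = np.arange(120)
true_seasonal = 1 + 0.2 * np.sin(2 * np.pi * t / 12)
y = (100 + 2 * t) * true_seasonal

trend, seasonal, resid = classical_multiplicative(y, period=12)
print(np.round(seasonal[:12], 3))
```

The recovered seasonal indices should sit close to the sine-shaped factor used to build the series, which is the same structure the seasonal panel above shows for the passenger data.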

In [None]:
# Test for stationarity
def adf_test(timeseries):
    """Run the Augmented Dickey-Fuller test and print the statistic, p-value, and critical values."""
    print('Results of Augmented Dickey-Fuller Test:')
    dftest = adfuller(timeseries, autolag='AIC')
    dfoutput = pd.Series(dftest[0:4], index=['Test Statistic', 'p-value', '#Lags Used', 'Number of Observations Used'])
    for key, value in dftest[4].items():
        dfoutput['Critical Value (%s)' % key] = value
    print(dfoutput)

adf_test(df['Passengers'])

In [None]:
# Difference the series once to remove the trend
df['Passengers_diff'] = df['Passengers'].diff()
passengers_diff = df['Passengers_diff'].dropna()  # the first value is NaN after differencing

# Plot the differenced series
plt.figure(figsize=(12, 6))
plt.plot(passengers_diff.index, passengers_diff)
plt.title('Differenced Air Passengers')
plt.xlabel('Date')
plt.ylabel('Differenced Passengers')
plt.grid(True)
plt.show()

# Test stationarity of differenced series
adf_test(df['Passengers_diff'].dropna())
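
First differencing removes a trend but leaves the yearly cycle and the growing variance in place. A common next step is to take logs, which turns multiplicative seasonality into additive seasonality, and then difference at the seasonal lag. The sketch below shows the idea on a synthetic series; `seasonal_difference` is a hypothetical helper and the numbers are not from the passenger data.

```python
import numpy as np

def seasonal_difference(x, lag=12):
    """Return x[t] - x[t - lag]; the first `lag` values are dropped."""
    x = np.asarray(x, dtype=float)
    return x[lag:] - x[:-lag]

# Synthetic stand-in: exponential growth with multiplicative yearly seasonality
t = np.arange(144)
y = 100 * np.exp(0.01 * t) * (1 + 0.2 * np.sin(2 * np.pi * t / 12))

log_y = np.log(y)                                   # multiplicative effects become additive
first_diff = np.diff(log_y)                         # removes the trend, keeps the seasonal swing
seasonal_diff = seasonal_difference(log_y, lag=12)  # removes the repeating yearly pattern

print(f"std of first difference:    {first_diff.std():.4f}")
print(f"std of seasonal difference: {seasonal_diff.std():.4f}")
```

On the notebook's data the equivalent would be along the lines of `np.log(df['Passengers']).diff(12).dropna()`, passed to `adf_test` in the same way as the first-differenced series.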

In [None]:
# Split the data into train and test sets
train_size = int(len(df) * 0.8)
train, test = df['Passengers'][:train_size], df['Passengers'][train_size:]

# Fit ARIMA model
model = ARIMA(train, order=(1, 1, 1))
model_fit = model.fit()

# Make predictions and align them with the test dates by position
predictions = model_fit.forecast(steps=len(test))
predictions = pd.Series(np.asarray(predictions), index=test.index)

# Plot predictions
plt.figure(figsize=(12, 6))
plt.plot(train.index, train, label='Training Data')
plt.plot(test.index, test, label='Actual Test Data')
plt.plot(predictions.index, predictions, label='ARIMA Predictions')
plt.title('ARIMA Model Predictions')
plt.xlabel('Date')
plt.ylabel('Number of Passengers')
plt.legend()
plt.grid(True)
plt.show()

In [None]:
# Evaluate ARIMA model
mse = mean_squared_error(test, predictions)
rmse = np.sqrt(mse)
mae = mean_absolute_error(test, predictions)

print("ARIMA Model Evaluation:")
print(f"Mean Squared Error: {mse:.2f}")
print(f"Root Mean Squared Error: {rmse:.2f}")
print(f"Mean Absolute Error: {mae:.2f}")
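
Error metrics are easier to interpret against a simple baseline. One common choice for seasonal data is the seasonal naive forecast, which repeats the last observed season. The sketch below uses toy numbers with a period of 4, and `seasonal_naive_forecast` is a hypothetical helper, not part of statsmodels or scikit-learn.

```python
import numpy as np

def seasonal_naive_forecast(train, horizon, period=12):
    """Repeat the last `period` observations of `train` for `horizon` steps."""
    train = np.asarray(train, dtype=float)
    if len(train) < period:
        raise ValueError("need at least one full season of training data")
    last_season = train[-period:]
    reps = -(-horizon // period)  # ceiling division: enough copies to cover the horizon
    return np.tile(last_season, reps)[:horizon]

# Toy example: two seasons of length 4, forecast six steps ahead
toy_train = [10, 20, 30, 40, 12, 22, 32, 42]
toy_forecast = seasonal_naive_forecast(toy_train, horizon=6, period=4)
print(toy_forecast)
```

In this notebook, `seasonal_naive_forecast(train, len(test))` could be scored with the same `mean_squared_error` and `mean_absolute_error` calls to give the ARIMA numbers a reference point.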

In [None]:
# Prepare data for Prophet (select only the original column; df also holds 'Passengers_diff')
prophet_df = df[['Passengers']].reset_index()
prophet_df.columns = ['ds', 'y']

# Split into train and test
prophet_train = prophet_df[:train_size]
prophet_test = prophet_df[train_size:]

# Fit Prophet model
prophet_model = Prophet(seasonality_mode='multiplicative')
prophet_model.fit(prophet_train)

# Make predictions (freq='MS' requests month-start dates; the default is daily)
future = prophet_model.make_future_dataframe(periods=len(test), freq='MS')
forecast = prophet_model.predict(future)

# Plot predictions
plt.figure(figsize=(12, 6))
plt.plot(prophet_train['ds'], prophet_train['y'], label='Training Data')
plt.plot(prophet_test['ds'], prophet_test['y'], label='Actual Test Data')
plt.plot(forecast['ds'][train_size:], forecast['yhat'][train_size:], label='Prophet Predictions')
plt.fill_between(forecast['ds'][train_size:], 
                 forecast['yhat_lower'][train_size:], 
                 forecast['yhat_upper'][train_size:], 
                 alpha=0.2)
plt.title('Prophet Model Predictions')
plt.xlabel('Date')
plt.ylabel('Number of Passengers')
plt.legend()
plt.grid(True)
plt.show()

In [None]:
# Evaluate Prophet model
prophet_predictions = forecast['yhat'][train_size:].values
mse = mean_squared_error(test, prophet_predictions)
rmse = np.sqrt(mse)
mae = mean_absolute_error(test, prophet_predictions)

print("Prophet Model Evaluation:")
print(f"Mean Squared Error: {mse:.2f}")
print(f"Root Mean Squared Error: {rmse:.2f}")
print(f"Mean Absolute Error: {mae:.2f}")
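
Because both evaluation cells reuse the names `mse`, `rmse`, and `mae`, the ARIMA values are overwritten once the Prophet cell runs. One way to keep results side by side is a small helper that returns the metrics as a dictionary. The sketch below is an assumption about how that might look: `forecast_metrics` is a hypothetical name, the numbers are toy values, and MAPE is added only as a scale-free complement to RMSE and MAE.

```python
import numpy as np

def forecast_metrics(actual, predicted):
    """Return RMSE, MAE, and MAPE (in percent) for two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    errors = actual - predicted
    return {
        "RMSE": float(np.sqrt(np.mean(errors ** 2))),
        "MAE": float(np.mean(np.abs(errors))),
        # MAPE is undefined when any actual value is zero
        "MAPE (%)": float(np.mean(np.abs(errors / actual)) * 100),
    }

# Toy values, not results from the models above
toy_actual = [100, 200, 400]
toy_pred = [110, 180, 400]
toy_scores = forecast_metrics(toy_actual, toy_pred)
for name, value in toy_scores.items():
    print(f"{name}: {value:.2f}")
```

Calling it once with `predictions` and once with `prophet_predictions` against `test` would give two dictionaries that can be compared directly.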

In [None]:
# Plot components of Prophet model
fig = prophet_model.plot_components(forecast)
plt.show()

## Conclusion

In this notebook, we explored various time series analysis techniques using the Air Passengers dataset:

1. **Time Series Decomposition**:
   - Identified trend, seasonal, and residual components
   - Observed clear seasonality and upward trend

2. **Stationarity Analysis**:
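   - The original series is non-stationary: it has a clear upward trend and growing seasonal swings
   - First differencing removes the trend, but seasonality and increasing variance remain, so check the ADF p-value before treating the differenced series as stationary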

3. **Modeling Approaches**:
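   - The non-seasonal ARIMA(1, 1, 1) model captured the overall level but has no seasonal terms
   - Prophet, fitted with multiplicative seasonality, produced forecasts with uncertainty intervals

4. **Model Performance**:
   - Prophet outperformed ARIMA in terms of RMSE and MAE
   - Only Prophet modeled the yearly seasonal pattern; a seasonal ARIMA (SARIMA) would likely be a fairer comparison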
   - Prophet provided additional insights through component analysis

5. **Key Insights**:
   - Time series decomposition helps understand underlying patterns
   - Stationarity is important for traditional time series models
   - Modern approaches like Prophet can handle complex patterns

This notebook serves as a good starting point for understanding time series analysis and forecasting techniques. 