# Experiment 12: Time Series Forecasting Using ARIMA

## Aim
To implement time series forecasting using the ARIMA (AutoRegressive Integrated Moving Average) model and analyze the forecast results.

## Objectives
- Understand the concept of ARIMA for time series forecasting.
- Apply ARIMA to forecast a real-world time series dataset.
- Evaluate and visualize the forecasting results.
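
For reference, one common way to write the ARIMA(p, d, q) model uses the backshift operator $B$, defined by $B y_t = y_{t-1}$. The symbols below ($\phi_i$, $\theta_j$, $c$, $\varepsilon_t$) are standard textbook notation introduced here for illustration and are not taken from the rest of this experiment:

$$
\left(1 - \sum_{i=1}^{p} \phi_i B^i\right)(1 - B)^d \, y_t = c + \left(1 + \sum_{j=1}^{q} \theta_j B^j\right)\varepsilon_t
$$

Here $p$ is the number of autoregressive terms, $d$ is the number of times the series is differenced, $q$ is the number of moving-average terms, and $\varepsilon_t$ is white noise.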

## Tools Used
- **numpy**: For numerical operations such as the square root in RMSE.
- **pandas**: For handling time series data.
- **statsmodels**: For implementing the ARIMA model.
- **matplotlib** and **seaborn**: For data visualization.
- **scikit-learn**: For calculating performance metrics.

## Implementation

### Step 1: Import Libraries
```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from statsmodels.tsa.arima.model import ARIMA
from sklearn.metrics import mean_squared_error

# Apply seaborn's default styling to the matplotlib plots below
sns.set_theme()
```

### Step 2: Load and Visualize the Data
```python
# Load sample time series data (monthly airline passenger numbers, 1949-1960)
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/airline-passengers.csv"
data = pd.read_csv(url, parse_dates=['Month'], index_col='Month')
data = data.rename(columns={'Passengers': 'Passenger_Count'})
# Declare the month-start frequency so statsmodels does not have to infer it
data = data.asfreq('MS')

# Visualize the time series data
plt.figure(figsize=(10, 6))
plt.plot(data, label='Passenger Count')
plt.title("Monthly Airline Passenger Numbers")
plt.xlabel("Date")
plt.ylabel("Passenger Count")
plt.legend()
plt.show()
```

### Step 3: Check for Stationarity
```python
from statsmodels.tsa.stattools import adfuller

# Perform Augmented Dickey-Fuller test
result = adfuller(data['Passenger_Count'])
print("ADF Statistic:", result[0])
print("p-value:", result[1])

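# Interpret the result. The null hypothesis is that the series has a unit root
# (is non-stationary), so a p-value below 0.05 is evidence of stationarity.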
if result[1] < 0.05:
    print("The time series is stationary.")
else:
    print("The time series is not stationary.")
```

### Step 4: Differencing to Make Data Stationary
```python
# Apply first differencing
data_diff = data.diff().dropna()

# Visualize the differenced data
plt.figure(figsize=(10, 6))
plt.plot(data_diff, label='Differenced Data')
plt.title("First Differenced Time Series")
plt.xlabel("Date")
plt.ylabel("Difference in Passenger Count")
plt.legend()
plt.show()
```

### Step 5: Fit the ARIMA Model
```python
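# Fit ARIMA model (order=(p, d, q)).
# With d=1 the model differences the series internally, so pass the original data.
p, d, q = 2, 1, 2  # These values can be tuned based on ACF and PACF analysis
model = ARIMA(data['Passenger_Count'], order=(p, d, q))
model_fit = model.fit()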

# Summary of the model
print(model_fit.summary())
```

### Step 6: Forecast Using the Model
```python
# Forecast for the next 12 months
forecast = model_fit.get_forecast(steps=12)
forecast_mean = forecast.predicted_mean
forecast_ci = forecast.conf_int()

# Visualize the forecast
plt.figure(figsize=(10, 6))
plt.plot(data, label='Observed', color='blue')
plt.plot(forecast_mean, label='Forecast', color='orange')
plt.fill_between(forecast_ci.index,
                 forecast_ci.iloc[:, 0],
                 forecast_ci.iloc[:, 1], color='orange', alpha=0.2)
plt.title("ARIMA Forecast")
plt.xlabel("Date")
plt.ylabel("Passenger Count")
plt.legend()
plt.show()
```

### Step 7: Evaluate the Model
```python
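# Calculate in-sample (training) RMSE from one-step-ahead predictions.
# Skip the first observation: with d=1 there is no earlier value to difference against.
actual = data['Passenger_Count'].iloc[1:]
predictions = model_fit.predict(start=1, end=len(data) - 1)
rmse = np.sqrt(mean_squared_error(actual, predictions))
print(f"In-sample Root Mean Squared Error (RMSE): {rmse:.2f}")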
```

### Step 8: Save the Model
```python
# Save the model to a file
model_fit.save('arima_model.pkl')
print("Model saved successfully!")
```

### Step 9: Summary and Observations
```python
print("\nSummary:")
print("1. The ARIMA model was implemented for time series forecasting.")
print("2. The data was made stationary using differencing techniques.")
print(f"3. The model achieved an in-sample RMSE of {rmse:.2f}; lower values indicate a closer fit.")
print("4. Forecasts for future data points were visualized along with confidence intervals.")
```
