<a href="https://colab.research.google.com/github/calmrocks/master-machine-learning-engineer/blob/main/BasicModels/TimeSeries.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Case Study: Forecasting Energy Demand Using Time-Series Analysis

In this case study, we apply time-series forecasting techniques to predict energy demand using the **Electricity Load Diagrams 2011** dataset, a publicly available dataset hosted by the UCI Machine Learning Repository. We walk through preparing the data, analyzing its trend and seasonality, and building two forecasting models: SARIMA and LSTM.

### Dataset Overview

The Electricity Load Diagrams 2011 dataset contains electrical load readings from clients of a Portuguese electricity provider, with one column per client. The data used here spans a full year, and the code below resamples it to hourly means, which makes it suitable for exploring trend, daily seasonality, and short-term variation.

#### Key Features:
- **Datetime**: The timestamp for each observation.
- **Load (kW)**: The electricity consumption of each client, one column per client (averaged to hourly values in Step 1).

The dataset is available for download [here](https://archive.ics.uci.edu/ml/datasets/ElectricityLoadDiagrams2011).

### Step 1: Data Preparation

Before building a model, we need to load, clean, and preprocess the data.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

# Load the dataset (the UCI file is semicolon-separated and uses a decimal comma)
data = pd.read_csv("LD2011_2011.txt", sep=";", decimal=",", index_col=0, parse_dates=True)

# Select a single client's data for simplicity
household_data = data.iloc[:, 0]  # First column
household_data = household_data.resample('H').mean()  # Average the readings within each hour

# Plot the time series
plt.figure(figsize=(10, 6))
plt.plot(household_data, label="Hourly Energy Consumption")
plt.title("Hourly Energy Consumption (2011)")
plt.xlabel("Time")
plt.ylabel("Energy Consumption (kW)")
plt.legend()
plt.show()
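To make the resampling step concrete, here is a minimal sketch on a small synthetic series. It assumes the raw readings arrive at 15-minute intervals; the `raw` series and its values are made up for illustration. The alias `"60min"` is used in place of `'H'` only because it behaves the same way across pandas versions.

```python
import pandas as pd

# Hypothetical 15-minute readings covering two hours (values are made up)
index = pd.date_range("2011-01-01 00:00", periods=8, freq="15min")
raw = pd.Series([1.0, 2.0, 3.0, 4.0, 10.0, 10.0, 20.0, 20.0], index=index)

# Each hourly value is the mean of the four readings that fall inside that hour
hourly = raw.resample("60min").mean()

# Count empty hours; gaps would need to be filled before decomposition
n_missing = int(hourly.isna().sum())

print(hourly)
print("missing hours:", n_missing)
```

If the quantity of interest is energy rather than average power, summing or rescaling the readings may be more appropriate than taking the mean; which one applies depends on the units of the raw file.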

### Step 2: Exploratory Data Analysis

#### Decomposition of Time-Series
We use STL (Seasonal and Trend decomposition using Loess) to separate the series into trend, seasonal, and residual components.

In [None]:
from statsmodels.tsa.seasonal import STL

# Perform STL decomposition
# `period` sets the cycle length (24 hours); `seasonal` is the smoother length and must be odd
stl = STL(household_data, period=24)
result = stl.fit()

# Plot components
result.plot()
plt.show()
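A quick sanity check on the 24-hour seasonality is to average the load by hour of day and see how much variance that profile removes. The sketch below is a simplified stand-in for the seasonal component, not STL itself, and it runs on a made-up sine-plus-noise series rather than the real data.

```python
import numpy as np

# Synthetic hourly series: 14 days with a daily cycle plus noise (made-up data)
rng = np.random.default_rng(0)
n_days = 14
hours = np.arange(n_days * 24)
series = 10 + 3 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 0.5, hours.size)

# Reshape to (days, 24) and average over days: mean load for each hour of the day
daily_profile = series.reshape(n_days, 24).mean(axis=0)

# Subtracting the profile leaves what a fixed daily cycle does not explain
remainder = series - np.tile(daily_profile, n_days)

print("variance before:", round(series.var(), 3))
print("variance after: ", round(remainder.var(), 3))
```

A large drop in variance suggests that a fixed daily cycle explains much of the series, which supports using a seasonal period of 24 in the models that follow.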

### Step 3: Model Building

#### A. SARIMA Model
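SARIMA (Seasonal ARIMA) extends ARIMA with seasonal autoregressive, differencing, and moving-average terms, which makes it well suited to series with a trend and a fixed seasonal period, such as the 24-hour cycle here.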
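For reference, the model fitted in this section, with `order=(1, 1, 1)` and `seasonal_order=(1, 1, 1, 24)`, can be written with the backshift operator $B$ (where $B y_t = y_{t-1}$). The symbols $\phi_1$, $\Phi_1$, $\theta_1$, $\Theta_1$ are notation introduced here, and the signs follow a common convention in which the AR polynomials carry a minus and the MA polynomials a plus:

$$
(1 - \phi_1 B)\,(1 - \Phi_1 B^{24})\,(1 - B)\,(1 - B^{24})\, y_t
= (1 + \theta_1 B)\,(1 + \Theta_1 B^{24})\, \varepsilon_t
$$

The factors $(1 - B)$ and $(1 - B^{24})$ are the regular and seasonal differences ($d = 1$, $D = 1$), and $\varepsilon_t$ is white noise.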

In [None]:
from statsmodels.tsa.statespace.sarimax import SARIMAX
from sklearn.metrics import mean_squared_error

# Split data into training and testing sets
train = household_data[:'2011-10']
test = household_data['2011-11':]

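# Fit SARIMA model: non-seasonal (p, d, q) = (1, 1, 1), seasonal (P, D, Q, s) = (1, 1, 1, 24)
# Fitting with a 24-step seasonal period on several months of hourly data can be slow
sarima_model = SARIMAX(train, order=(1, 1, 1), seasonal_order=(1, 1, 1, 24))
sarima_result = sarima_model.fit(disp=False)  # disp=False silences optimizer output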

# Forecast
forecast = sarima_result.forecast(steps=len(test))

# Evaluate model
mse = mean_squared_error(test, forecast)
print(f"SARIMA Test MSE: {mse}")

# Plot forecast
plt.figure(figsize=(10, 6))
plt.plot(train, label="Training Data")
plt.plot(test, label="Test Data", color="orange")
plt.plot(test.index, forecast, label="SARIMA Forecast", color="green")
plt.legend()
plt.show()
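An MSE is easier to interpret next to a simple baseline. The sketch below shows a seasonal-naive forecast, which repeats the last observed day across the forecast horizon. The helper name `seasonal_naive_forecast` and the synthetic data are assumptions made for illustration; they are not part of the original notebook.

```python
import numpy as np

def seasonal_naive_forecast(train_values, horizon, period=24):
    """Repeat the last `period` training observations across the horizon."""
    last_cycle = np.asarray(train_values, dtype=float)[-period:]
    repeats = int(np.ceil(horizon / period))
    return np.tile(last_cycle, repeats)[:horizon]

# Made-up stand-ins for `train` and `test`: a clean daily cycle, 10 days + 2 days
hours = np.arange(12 * 24)
values = 10 + 3 * np.sin(2 * np.pi * hours / 24)
train_values, test_values = values[:240], values[240:]

baseline = seasonal_naive_forecast(train_values, horizon=len(test_values))
baseline_mse = float(np.mean((test_values - baseline) ** 2))
print(f"Seasonal-naive MSE: {baseline_mse:.6f}")
```

With the notebook's data, the equivalent call would be `seasonal_naive_forecast(train.values, len(test))`. If SARIMA does not clearly beat this baseline, the chosen orders are probably worth revisiting.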

#### B. LSTM Model
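Long Short-Term Memory (LSTM) networks are recurrent neural networks that can learn non-linear patterns and longer-range dependencies in time-series data. Here each training example is a window of the previous 24 hourly values, and the target is the value for the next hour.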
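Before an LSTM can be trained, the series has to be recast as supervised pairs of lagged inputs and a next-step target. The sketch below shows that framing on a toy series, using NumPy's `sliding_window_view` (available from NumPy 1.20) as a vectorized alternative to an explicit loop. The helper name `make_windows` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_windows(series, lag):
    """Return (X, y) where each row of X holds `lag` past values and y is the next value."""
    series = np.asarray(series, dtype=float)
    windows = sliding_window_view(series, lag + 1)  # shape: (len(series) - lag, lag + 1)
    return windows[:, :lag].copy(), windows[:, lag].copy()

# Toy series so the pairing is easy to read
toy = np.arange(6)  # 0, 1, 2, 3, 4, 5
X_toy, y_toy = make_windows(toy, lag=3)

print(X_toy)  # rows: [0 1 2], [1 2 3], [2 3 4]
print(y_toy)  # targets: 3, 4, 5

# Keras LSTM layers expect input shaped (samples, timesteps, features)
X_toy_3d = X_toy.reshape((X_toy.shape[0], X_toy.shape[1], 1))
print(X_toy_3d.shape)
```

A series of length `n` yields `n - lag` training pairs, so a lag of 24 costs the first day of data.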

In [None]:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Prepare data for LSTM
def create_lagged_data(series, lag):
    X, y = [], []
    for i in range(len(series) - lag):
        X.append(series[i:i+lag])
        y.append(series[i+lag])
    return np.array(X), np.array(y)

# Create lagged dataset
lag = 24
X, y = create_lagged_data(household_data.values, lag)

# Split into train and test
train_size = int(len(X) * 0.8)
X_train, X_test = X[:train_size], X[train_size:]
y_train, y_test = y[:train_size], y[train_size:]

# Reshape for LSTM input
X_train = X_train.reshape((X_train.shape[0], X_train.shape[1], 1))
X_test = X_test.reshape((X_test.shape[0], X_test.shape[1], 1))

# Build LSTM model
model = Sequential([
    LSTM(50, activation='relu', input_shape=(lag, 1)),
    Dense(1)
])
model.compile(optimizer='adam', loss='mse')

# Train model
model.fit(X_train, y_train, epochs=10, batch_size=32, verbose=1)

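# Forecast (one step ahead: each prediction uses the previous 24 actual values)
lstm_forecast = model.predict(X_test).ravel()

# Evaluate model
lstm_mse = np.mean((y_test - lstm_forecast) ** 2)
print(f"LSTM Test MSE: {lstm_mse}")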

# Plot results
plt.figure(figsize=(10, 6))
plt.plot(y_test, label="Actual Data", color="orange")
plt.plot(lstm_forecast, label="LSTM Forecast", color="green")
plt.legend()
plt.show()
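Neural networks usually train more reliably when inputs are on a small, consistent scale, and the LSTM above is fed raw kW values. The sketch below shows one way to min-max scale the series using constants taken from the training portion only. It is an optional addition rather than part of the original pipeline, it splits the raw values before windowing (whereas the code above splits after windowing), and the numbers are made up.

```python
import numpy as np

# Made-up load values standing in for `household_data.values`
values = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 9.0, 7.0, 5.0, 3.0])

# Chronological 80/20 split
split = int(len(values) * 0.8)
train_values, test_values = values[:split], values[split:]

# Take the scaling constants from the training portion only,
# so no information from the test period leaks into training
lo, hi = train_values.min(), train_values.max()

def scale(x):
    return (x - lo) / (hi - lo)

def unscale(x):
    return x * (hi - lo) + lo

train_scaled = scale(train_values)
test_scaled = scale(test_values)  # can fall outside [0, 1] if the test range differs

# Predictions made on the scaled series are mapped back to kW before computing errors
restored = unscale(test_scaled)
print(train_scaled.min(), train_scaled.max())
print(np.allclose(restored, test_values))
```

The lagged windows would then be built from the scaled values, and `unscale` applied to the model's predictions before plotting or scoring.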

### Step 4: Summary and Recommendations

#### Summary:
- **SARIMA**: Suitable for capturing linear trends and seasonality in time-series data.
- **LSTM**: Effective for handling non-linear patterns and long-term dependencies.

#### Recommendations:
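- For simpler time-series data with a single, clear seasonal cycle, SARIMA provides interpretable forecasts with few parameters to tune.
- For complex data with non-linear patterns or multiple seasonal cycles, LSTM can adapt better, at the cost of more data, tuning, and compute. Any accuracy gain should be confirmed by scoring both models on the same test window and forecast horizon.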
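One caveat when comparing the two models as coded above: the SARIMA cell forecasts the entire test period in a single multi-step call, while the LSTM cell predicts each hour from the 24 actual values before it. Their errors are therefore not directly comparable. A minimal sketch of a walk-forward, one-step-ahead evaluation that puts any forecaster on the same footing is shown below. The helper `walk_forward_mse` and the two toy forecasters are hypothetical, and the data is synthetic.

```python
import numpy as np

def walk_forward_mse(values, n_test, forecast_fn):
    """One-step-ahead MSE: at each test step, forecast from all data observed so far."""
    values = np.asarray(values, dtype=float)
    start = len(values) - n_test
    errors = []
    for t in range(start, len(values)):
        prediction = forecast_fn(values[:t])  # history available before time t
        errors.append(values[t] - prediction)
    return float(np.mean(np.square(errors)))

# Two toy one-step forecasters, used only to exercise the helper
def last_value(history):
    return history[-1]

def same_hour_yesterday(history):
    return history[-24]

hours = np.arange(10 * 24)
values = 10 + 3 * np.sin(2 * np.pi * hours / 24)  # made-up daily cycle

mse_last = walk_forward_mse(values, 48, last_value)
mse_daily = walk_forward_mse(values, 48, same_hour_yesterday)
print("last value:", mse_last)
print("same hour yesterday:", mse_daily)
```

In practice `forecast_fn` would wrap a fitted model. Refitting SARIMA at every step is expensive, so extending a fitted model with new observations instead of refitting is a common compromise.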

#### Next Steps:
- Experiment with hybrid models combining SARIMA and LSTM to leverage their strengths.
- Explore advanced techniques like Prophet or Transformer-based time-series models for further improvement.