# ðŸš‡ Masar â€” TCN 30-Minute Crowd Forecast (Deep Learning)
---


# ðŸ“˜ Overview

This notebook trains a **Temporal Convolutional Network (TCN)** to predict station crowd levels 30 minutes into the future as part of the Masar Digital Twin.

The model uses 1-minuteâ€“interval data from the September simulation and converts each station's passenger flow into fixed-length time windows for sequence-based forecasting.


---



# ðŸŽ¯ Goal

**Forecast the future station_total (crowd level)** by learning patterns directly from past time series values using:

* Sliding window sequences (e.g., 24-minute history)

* Future prediction horizon (30 minutes ahead)

* Station-level sequences

* TCN layers that capture short- and mid-term temporal patterns



---





# ðŸ“‚ Steps





1. Load and sort station data

2. Generate time-based features (hour, weekday, events, etc.)

3. Create lag and rolling features

4. Extract station-level sequences

5. Build sliding window inputs (24 â†’ next 30 minutes)

6. Train/validation/test chronological split

7. Define and train the TCN model

8. Evaluate performance using RMSE, MAE, and RÂ²

9. Save the trained model for API integration



In [None]:
%cd /content
!git clone https://github.com/Jana-Alrzoog/2025_GP_28.git
%cd /content/2025_GP_28/masar-sim
!ls

/content
Cloning into '2025_GP_28'...
remote: Enumerating objects: 853, done.[K
remote: Counting objects: 100% (93/93), done.[K
remote: Compressing objects: 100% (78/78), done.[K
remote: Total 853 (delta 39), reused 26 (delta 13), pack-reused 760 (from 4)[K
Receiving objects: 100% (853/853), 38.37 MiB | 13.54 MiB/s, done.
Resolving deltas: 100% (319/319), done.
Updating files: 100% (186/186), done.
/content/2025_GP_28/masar-sim
data  lib  notebooks  requirements.txt	server.py  sim_core.py	sims


# Load & Prepare September Dataset

In [None]:
# Ensure we are inside the cloned GitHub repository
%cd /content/2025_GP_28

import pandas as pd

# 1) Load September dataset (generated by the simulator)
FILE_PATH = "masar-sim/data/generated/2025-09_StationData.csv"

df = pd.read_csv(FILE_PATH, parse_dates=["timestamp"])

# 2) Sort by station and timestamp to ensure correct time sequence
df = df.sort_values(["station_id", "timestamp"]).reset_index(drop=True)

# 3) Display the first 5 rows to verify that the dataset loaded correctly
df.head()

/content/2025_GP_28


  df = pd.read_csv(FILE_PATH, parse_dates=["timestamp"])


Unnamed: 0,date,timestamp,hour,minute_of_day,day_of_week,is_weekend,station_id,headway_seconds,base_demand,modifier,...,holiday_flag,lag_5,lag_15,lag_30,lag_60,lag_120,roll_mean_15,roll_std_15,roll_mean_60,target_30min
0,2025-09-01,2025-09-01 00:00:00,0,0,0,0,S1,660,0.11,1.0,...,0,,,,,,154.0,,154.0,
1,2025-09-01,2025-09-01 06:00:00,6,360,0,0,S1,660,0.210551,1.0,...,0,,,,,,295.0,,295.0,
2,2025-09-01,2025-09-01 06:01:00,6,361,0,0,S1,660,0.216663,1.0,...,0,,,,,,299.5,6.363961,299.5,
3,2025-09-01,2025-09-01 06:02:00,6,362,0,0,S1,660,0.223091,1.0,...,0,,,,,,304.0,9.0,304.0,
4,2025-09-01,2025-09-01 06:03:00,6,363,0,0,S1,660,0.229847,1.0,...,0,,,,,,308.5,11.61895,308.5,


# Import Libraries

In [None]:
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
import numpy as np


# Feature Engineering for TCN Sequence Modeling

In [None]:
df["hour"] = df["timestamp"].dt.hour
df["minute_of_day"] = df["hour"] * 60 + df["timestamp"].dt.minute
df["day_of_week"] = df["timestamp"].dt.dayofweek
df["is_weekend"] = df["day_of_week"].isin([4,5]).astype(int)


In [None]:
df["event_flag"] = df["event_flag"].astype(int)
df["holiday_flag"] = df["holiday_flag"].astype(int)
df["special_event_type"] = df["special_event_type"].astype("category").cat.codes

In [None]:
lags = [5, 15, 30, 60, 120]
for l in lags:
    df[f"lag_{l}"] = df.groupby("station_id")["station_total"].shift(l)

In [None]:
df["roll_mean_15"] = df.groupby("station_id")["station_total"].rolling(15).mean().reset_index(level=0, drop=True)
df["roll_std_15"] = df.groupby("station_id")["station_total"].rolling(15).std().reset_index(level=0, drop=True)
df["roll_mean_60"] = df.groupby("station_id")["station_total"].rolling(60).mean().reset_index(level=0, drop=True)

In [None]:
df["target_30m"] = df.groupby("station_id")["station_total"].shift(-30)
df = df.dropna(subset=["target_30m"])

In [None]:
df["station_id"].unique()

array(['S1', 'S2', 'S3', 'S4', 'S5', 'S6'], dtype=object)

In [None]:
STATION_ID = "S1"

df_s = (
    df[df["station_id"] == STATION_ID]
    .sort_values("timestamp")
    .reset_index(drop=True)
)

print("Rows for station", STATION_ID, ":", len(df_s))
df_s[["timestamp", "station_id", "station_total"]].head()


Rows for station S1 : 32400


Unnamed: 0,timestamp,station_id,station_total
0,2025-09-01 00:00:00,S1,154
1,2025-09-01 06:00:00,S1,295
2,2025-09-01 06:01:00,S1,304
3,2025-09-01 06:02:00,S1,313
4,2025-09-01 06:03:00,S1,322


# Sliding Window Generation for TCN Input Sequences (24 â†’ 30 Minutes)

In [None]:
window = 24
horizon = 6

values = df_s["station_total"].astype("float32").values

X_list = []
y_list = []

for i in range(len(values) - window - horizon):
    X_list.append(values[i : i + window])

    y_list.append(values[i + window + horizon - 1])

X_seq = np.array(X_list)  # (N, window)
y_seq = np.array(y_list)  # (N,)

print("X_seq shape (N, window):", X_seq.shape)
print("y_seq shape (N,):", y_seq.shape)


X_seq shape (N, window): (32370, 24)
y_seq shape (N,): (32370,)


In [None]:
import torch

X_seq = X_seq[:, np.newaxis, :]

X_tensor = torch.from_numpy(X_seq)
y_tensor = torch.from_numpy(y_seq).view(-1, 1)

print("X_tensor shape:", X_tensor.shape)
print("y_tensor shape:", y_tensor.shape)


X_tensor shape: torch.Size([32370, 1, 24])
y_tensor shape: torch.Size([32370, 1])


# Train/Val/Test Split & PyTorch DataLoader Preparation

In [None]:
N = X_tensor.shape[0]
train_end = int(N * 0.7)
val_end   = int(N * 0.85)

X_train = X_tensor[:train_end]
y_train = y_tensor[:train_end]

X_val   = X_tensor[train_end:val_end]
y_val   = y_tensor[train_end:val_end]

X_test  = X_tensor[val_end:]
y_test  = y_tensor[val_end:]

print("Train:", X_train.shape, y_train.shape)
print("Val:  ", X_val.shape,   y_val.shape)
print("Test: ", X_test.shape,  y_test.shape)


Train: torch.Size([22659, 1, 24]) torch.Size([22659, 1])
Val:   torch.Size([4855, 1, 24]) torch.Size([4855, 1])
Test:  torch.Size([4856, 1, 24]) torch.Size([4856, 1])


In [None]:
from torch.utils.data import TensorDataset, DataLoader

batch_size = 64

train_ds = TensorDataset(X_train, y_train)
val_ds   = TensorDataset(X_val, y_val)
test_ds  = TensorDataset(X_test, y_test)

train_loader = DataLoader(train_ds, batch_size=batch_size, shuffle=True)
val_loader   = DataLoader(val_ds, batch_size=batch_size, shuffle=False)
test_loader  = DataLoader(test_ds, batch_size=batch_size, shuffle=False)

print("Dataloaders ready.")


Dataloaders ready.


# TCN Model Architecture Definition & Initialization

In [None]:
import torch.nn as nn
import torch
import numpy as np

class SimpleTCN(nn.Module):
    def __init__(self, in_channels=1, n_filters=32, kernel_size=3):
        super().__init__()

        self.conv1 = nn.Conv1d(
            in_channels=in_channels,
            out_channels=n_filters,
            kernel_size=kernel_size,
            padding=(kernel_size - 1),
            dilation=1,
        )
        self.relu1 = nn.ReLU()

        self.conv2 = nn.Conv1d(
            in_channels=n_filters,
            out_channels=n_filters,
            kernel_size=kernel_size,
            padding=2 * (kernel_size - 1),  # dilation=2
            dilation=2,
        )
        self.relu2 = nn.ReLU()

        self.global_pool = nn.AdaptiveAvgPool1d(1)  # (batch, filters, 1)

        self.fc = nn.Linear(n_filters, 1)

    def forward(self, x):
        x = self.conv1(x)
        x = self.relu1(x)

        x = self.conv2(x)
        x = self.relu2(x)

        x = self.global_pool(x)   # (batch, filters, 1)
        x = x.squeeze(-1)         # (batch, filters)
        x = self.fc(x)            # (batch, 1)
        return x

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = SimpleTCN(in_channels=1, n_filters=32, kernel_size=3).to(device)
model


SimpleTCN(
  (conv1): Conv1d(1, 32, kernel_size=(3,), stride=(1,), padding=(2,))
  (relu1): ReLU()
  (conv2): Conv1d(32, 32, kernel_size=(3,), stride=(1,), padding=(4,), dilation=(2,))
  (relu2): ReLU()
  (global_pool): AdaptiveAvgPool1d(output_size=1)
  (fc): Linear(in_features=32, out_features=1, bias=True)
)

# Training Setup: Loss Function, Optimizer & Evaluation Function

In [None]:
import torch.optim as optim

criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)

def eval_loader(loader, name="Split"):
    model.eval()
    total_loss = 0.0
    all_true = []
    all_pred = []

    with torch.no_grad():
        for xb, yb in loader:
            xb = xb.to(device)
            yb = yb.to(device)

            preds = model(xb)
            loss = criterion(preds, yb)
            total_loss += loss.item() * xb.size(0)

            all_true.append(yb.cpu().numpy())
            all_pred.append(preds.cpu().numpy())

    total_loss /= len(loader.dataset)
    all_true = np.concatenate(all_true).ravel()
    all_pred = np.concatenate(all_pred).ravel()

    rmse = np.sqrt(np.mean((all_true - all_pred) ** 2))
    mae  = np.mean(np.abs(all_true - all_pred))

    print(f"{name} â†’ MSE: {total_loss:.2f}, RMSE: {rmse:.2f}, MAE: {mae:.2f}")
    return rmse, mae


# TCN Evaluation Function (RMSE, MAE, RÂ²)

In [None]:
from sklearn.metrics import r2_score
import numpy as np

def eval_loader(loader, name="Split"):
    model.eval()
    total_loss = 0.0
    all_true = []
    all_pred = []

    with torch.no_grad():
        for xb, yb in loader:
            xb = xb.to(device)
            yb = yb.to(device)

            preds = model(xb)
            loss = criterion(preds, yb)
            total_loss += loss.item() * xb.size(0)

            all_true.append(yb.cpu().numpy())
            all_pred.append(preds.cpu().numpy())

    total_loss /= len(loader.dataset)
    all_true = np.concatenate(all_true).ravel()
    all_pred = np.concatenate(all_pred).ravel()

    rmse = np.sqrt(np.mean((all_true - all_pred) ** 2))
    mae  = np.mean(np.abs(all_true - all_pred))
    r2   = r2_score(all_true, all_pred)

    print(f"{name} â†’ MSE: {total_loss:.2f}, RMSE: {rmse:.2f}, MAE: {mae:.2f}, RÂ²: {r2:.3f}")
    return rmse, mae, r2


In [None]:
print("\nFinal evaluation:")
eval_loader(train_loader, name="Train")
eval_loader(val_loader,   name="Val")
eval_loader(test_loader,  name="Test")


Final evaluation:
Train â†’ MSE: 40385.59, RMSE: 200.96, MAE: 84.74, RÂ²: 0.981
Val â†’ MSE: 44522.63, RMSE: 211.00, MAE: 106.57, RÂ²: 0.987
Test â†’ MSE: 39769.37, RMSE: 199.42, MAE: 83.56, RÂ²: 0.981


(np.float32(199.42259), np.float32(83.56273), 0.9812371134757996)

# ðŸ“Œ Model Performance Summary (TCN)
The TCN model shows consistent performance across Train, Validation, and Test splits, indicating that it generalizes well across different temporal windows.


* **Train RÂ² = 0.981**

    Strong fit to historical sequences; the model learns core patterns well.

* **Validation RÂ² = 0.987**

    Slightly higher accuracy than training, showing the validation window matches the learned temporal structure.

* **Test RÂ² = 0.981**

    Nearly identical to training performance, indicating excellent generalization to future unseen data.


---


# ðŸ§  Interpretation

The TCN performs consistently across all splits with minimal overfitting and no major distribution shift.
Its stable RMSE and MAE values show that it captures short- and mid-term demand patterns reliably.


---


# ðŸŽ¯ Conclusion

The model is robust, stable, and well-suited for 30-minute crowd forecasting.
Further improvements are optional rather than necessary, as performance is already strong and consistent.



---



# ðŸ§  ProTCN: Advanced Temporal Convolutional Network Model

This section defines a **deeper TCN** architecture with multiple dilated convolution layers, dropout regularization, and global average pooling. The model is designed to capture longer temporal dependencies and improve 30-minute crowd forecasting accuracy compared to the simpler baseline TCN.

In [None]:
import torch.nn as nn
import torch
import numpy as np

class ProTCNPlain(nn.Module):
    def __init__(self, in_channels=1, n_filters=64, kernel_size=3, num_layers=4, dropout=0.2):
        super().__init__()

        layers = []
        current_channels = in_channels

        # ------------------------------------------------------------
        # Build multiple stacked dilated convolution layers.
        # Each layer increases dilation exponentially: 1, 2, 4, 8, ...
        # This expands the receptive field, allowing the TCN to
        # "see" far back in the time series without using RNNs.
        # ------------------------------------------------------------
        for i in range(num_layers):
            dilation = 2 ** i
            padding = dilation  # keeps output length roughly aligned

            conv = nn.Conv1d(
                in_channels=current_channels,
                out_channels=n_filters,
                kernel_size=kernel_size,
                padding=padding,
                dilation=dilation,
            )

            # Save the block: Conv â†’ ReLU â†’ Dropout
            layers.append(conv)
            layers.append(nn.ReLU())
            layers.append(nn.Dropout(dropout))

            current_channels = n_filters  # next layer input channels

        # Stack all layers into one TCN module
        self.tcn = nn.Sequential(*layers)

        # ------------------------------------------------------------
        # Global Average Pooling:
        # Reduces the time dimension (seq_len) into a single feature
        # per filter â†’ good for stable forecasting.
        # ------------------------------------------------------------
        self.global_pool = nn.AdaptiveAvgPool1d(1)

        # Fully-connected output â†’ predict 30-minute future value
        self.fc = nn.Linear(n_filters, 1)

    def forward(self, x):
        # x shape: (batch, in_channels, seq_len)
        out = self.tcn(x)            # (batch, n_filters, L')
        out = self.global_pool(out)  # (batch, n_filters, 1)
        out = out.squeeze(-1)        # (batch, n_filters)
        out = self.fc(out)           # (batch, 1) â†’ final prediction
        return out


# ============================================================
#  Initialize the ProTCN model
# ------------------------------------------------------------
# Moves the model to GPU if available.
# in_channels = 1 â†’ input is station_total time series.
# ============================================================
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = ProTCNPlain(
    in_channels=1,
    n_filters=64,
    kernel_size=3,
    num_layers=4,
    dropout=0.2,
).to(device)

model


ProTCNPlain(
  (tcn): Sequential(
    (0): Conv1d(1, 64, kernel_size=(3,), stride=(1,), padding=(1,))
    (1): ReLU()
    (2): Dropout(p=0.2, inplace=False)
    (3): Conv1d(64, 64, kernel_size=(3,), stride=(1,), padding=(2,), dilation=(2,))
    (4): ReLU()
    (5): Dropout(p=0.2, inplace=False)
    (6): Conv1d(64, 64, kernel_size=(3,), stride=(1,), padding=(4,), dilation=(4,))
    (7): ReLU()
    (8): Dropout(p=0.2, inplace=False)
    (9): Conv1d(64, 64, kernel_size=(3,), stride=(1,), padding=(8,), dilation=(8,))
    (10): ReLU()
    (11): Dropout(p=0.2, inplace=False)
  )
  (global_pool): AdaptiveAvgPool1d(output_size=1)
  (fc): Linear(in_features=64, out_features=1, bias=True)
)

# Training the ProTCN Model (Epoch Loop + Validation Checks)

In [None]:
import torch.optim as optim
import torch.nn as nn
from sklearn.metrics import r2_score

criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)


In [None]:
n_epochs = 10

for epoch in range(1, n_epochs + 1):
    model.train()
    total_train_loss = 0.0

    for xb, yb in train_loader:
        xb = xb.to(device)
        yb = yb.to(device)

        optimizer.zero_grad()
        preds = model(xb)
        loss = criterion(preds, yb)
        loss.backward()
        optimizer.step()

        total_train_loss += loss.item() * xb.size(0)

    total_train_loss /= len(train_loader.dataset)
    print(f"\nEpoch {epoch}/{n_epochs} - Train MSE: {total_train_loss:.2f}")
    eval_loader(val_loader, name="Val")

print("\nFinal evaluation:")
eval_loader(train_loader, name="Train")
eval_loader(val_loader,   name="Val")
eval_loader(test_loader,  name="Test")



Epoch 1/10 - Train MSE: 272213.07
Val â†’ MSE: 116237.96, RMSE: 340.94, MAE: 247.71, RÂ²: 0.966

Epoch 2/10 - Train MSE: 75451.71
Val â†’ MSE: 192709.92, RMSE: 438.99, MAE: 300.64, RÂ²: 0.944

Epoch 3/10 - Train MSE: 69170.49
Val â†’ MSE: 50444.92, RMSE: 224.60, MAE: 113.95, RÂ²: 0.985

Epoch 4/10 - Train MSE: 69005.11
Val â†’ MSE: 154843.80, RMSE: 393.50, MAE: 258.75, RÂ²: 0.955

Epoch 5/10 - Train MSE: 69206.64
Val â†’ MSE: 132184.63, RMSE: 363.57, MAE: 238.18, RÂ²: 0.961

Epoch 6/10 - Train MSE: 59068.80
Val â†’ MSE: 159735.18, RMSE: 399.67, MAE: 261.37, RÂ²: 0.953

Epoch 7/10 - Train MSE: 65100.19
Val â†’ MSE: 179002.10, RMSE: 423.09, MAE: 274.70, RÂ²: 0.948

Epoch 8/10 - Train MSE: 60051.13
Val â†’ MSE: 66946.87, RMSE: 258.74, MAE: 146.16, RÂ²: 0.980

Epoch 9/10 - Train MSE: 57599.98
Val â†’ MSE: 81483.55, RMSE: 285.45, MAE: 166.03, RÂ²: 0.976

Epoch 10/10 - Train MSE: 55895.30
Val â†’ MSE: 51269.64, RMSE: 226.43, MAE: 126.19, RÂ²: 0.985

Final evaluation:
Train â†’ MSE: 36637.12

(np.float32(191.97609), np.float32(93.63421), 0.9826121926307678)

# ðŸ“Œ Model Performance Summary (ProTCN)
The ProTCN model shows consistent and strong performance across training, validation, and test sets, with very similar RÂ² scores in all splits.

* **Train RÂ² = 0.983**

    Strong fit on historical sequences with low error values.

* **Validation RÂ² = 0.985**

    Slightly better than training, showing stable learning and good alignment with mid-range unseen data.

* **Test RÂ² = 0.983**

    Matches training performance, indicating excellent generalization to future sequences.



---


# ðŸ§  Interpretation

The ProTCN behaves consistently across all data splits with no signs of overfitting.
Training, validation, and test metrics remain tightly aligned, showing strong robustness to temporal variation.


---


# ðŸŽ¯ Conclusion
The ProTCN model performs reliably and consistently across all evaluation windows.
Its stable RÂ² scores (â‰ˆ0.98 in all splits) make it suitable for real deployment in Masarâ€™s 30-minute forecasting pipeline.
Any further improvements would be optional refinements rather than necessary fixes.