### Environment Setup

This experiment is implemented using PyTorch within a dedicated Conda virtual environment (`oran311`) based on Python 3.11.

The development environment includes:

- Python 3.11
- PyTorch 2.x (CPU version)
- NumPy
- Pandas
- XGBoost


The following cell verifies the installed PyTorch version and checks whether GPU acceleration (CUDA) is available.

Because this environment uses the CPU-only build of PyTorch, `torch.cuda.is_available()` is expected to return `False`.


In [1]:
import torch
print(torch.__version__)
print("CUDA available:", torch.cuda.is_available())

2.10.0+cpu
CUDA available: False


In [2]:
import sys
print(sys.executable)

!{sys.executable} -m pip install -U pip
!{sys.executable} -m pip install -U xgboost



c:\Users\10199\anaconda3\envs\oran311\python.exe
Collecting xgboost
  Using cached xgboost-3.2.0-py3-none-win_amd64.whl.metadata (2.1 kB)
Using cached xgboost-3.2.0-py3-none-win_amd64.whl (101.7 MB)
Installing collected packages: xgboost
Successfully installed xgboost-3.2.0


In [2]:
import xgboost
from xgboost import XGBRegressor
print("xgboost version:", xgboost.__version__)


xgboost version: 3.2.0


## Model 3: Hybrid DNN–XGBoost (DNN Feature Extractor + XGBoost Regressor)

This model follows the hybrid pipeline described in the camera-ready version of the paper.  
The key idea is to split the learning process into two stages:

1) A DNN is trained as a **feature extractor** to learn compact latent representations.  
2) The DNN is frozen, and a separate **XGBoost regressor** is trained on the extracted embeddings.
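
As a rough illustration of what "frozen" means in stage 2, the sketch below disables gradients on a small stand-in network and extracts embeddings as a NumPy array. The network `feature_net`, its layer sizes, and the random input are hypothetical placeholders, not the model trained in this notebook.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for a trained feature extractor (3 inputs -> 16-dim embedding).
feature_net = nn.Sequential(nn.Linear(3, 32), nn.ReLU(), nn.Linear(32, 16))

# Freeze: stop gradient tracking for every parameter and switch to eval mode.
for p in feature_net.parameters():
    p.requires_grad_(False)
feature_net.eval()

# Extract embeddings without building an autograd graph.
x = torch.randn(8, 3)  # placeholder batch of 8 samples
with torch.no_grad():
    emb = feature_net(x).numpy()

print(emb.shape)
```

The resulting array can be passed to any tabular regressor, which is the role XGBoost plays here.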

### DNN Feature Extractor Architecture
- Dense layers: 587 → 261 → 186 → 99
- Bottleneck embedding layer: 16 neurons
- Output head (for DNN training): 1 neuron (MSE loss)
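
To get a feel for the size of this network, the following sketch counts the trainable parameters of the listed layers. It assumes fully connected layers with biases and an input dimension of 3 (the three features used later in this notebook).

```python
# Layer widths: input -> 587 -> 261 -> 186 -> 99 -> 16 (embedding) -> 1 (head).
# input_dim = 3 is an assumption matching the three input features.
input_dim = 3
widths = [input_dim, 587, 261, 186, 99, 16, 1]

def linear_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

per_layer = [linear_params(a, b) for a, b in zip(widths[:-1], widths[1:])]
total = sum(per_layer)

print("per layer:", per_layer)
print("total    :", total)
```

Under these assumptions the network has 224,678 parameters, most of them in the 587 → 261 layer.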

### XGBoost Regressor (trained on embeddings)
- max_depth = 5
- n_estimators = 256
- learning_rate = 0.22


In [3]:
import numpy as np
import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_absolute_error, mean_squared_error

import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

# 1) load data
df = pd.read_csv("clean_ul_stage1.csv")

feature_cols = ["airtime", "selected_mcs", "txgain"]

target_col = "pm_power"

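# drop rows with missing values in the columns we use
df = df.dropna(subset=feature_cols + [target_col]).copy()
# force numeric dtypes; unparseable entries become NaN and are dropped next
for c in feature_cols + [target_col]:
    df[c] = pd.to_numeric(df[c], errors="coerce")
df = df.dropna(subset=feature_cols + [target_col]).copy()
# keep only rows with a positive power reading
df = df[df[target_col] > 0].copy()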

X = df[feature_cols].values
y = df[target_col].values


# 2) split: train/test then train/val
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train, X_val,  y_train, y_val  = train_test_split(X_train, y_train, test_size=0.1, random_state=42)

# 3) scale (fit ONLY on train)
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_val_s   = scaler.transform(X_val)
X_test_s  = scaler.transform(X_test)

print("Shapes:", X_train_s.shape, X_val_s.shape, X_test_s.shape)


Shapes: (4153, 3) (462, 3) (1154, 3)
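
The printed shapes can be reproduced by hand. The sketch below assumes scikit-learn's rounding rule for a fractional `test_size` (the held-out count is rounded up and the remainder goes to the other split); the helper `split_sizes` is a hypothetical name used only for this illustration.

```python
import math

def split_sizes(n_samples, test_size):
    """Return (n_kept, n_held_out), rounding the held-out count up."""
    n_held_out = math.ceil(test_size * n_samples)
    return n_samples - n_held_out, n_held_out

n_total = 4153 + 462 + 1154                      # rows left after cleaning
n_trainval, n_test = split_sizes(n_total, 0.2)   # first split: 80/20
n_train, n_val = split_sizes(n_trainval, 0.1)    # second split: 90/10 of the 80%

print(n_train, n_val, n_test)
print(round(n_train / n_total, 3), round(n_val / n_total, 3), round(n_test / n_total, 3))
```

The two nested splits therefore give roughly 72% training, 8% validation, and 20% test data.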


In [4]:
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim

from sklearn.metrics import mean_squared_error, mean_absolute_error

from xgboost import XGBRegressor

class HybridFeatureExtractor(nn.Module):
    """
    DNN feature extractor + a small regression head.
    We train this end-to-end first, then freeze and use the 16-dim embeddings for XGBoost.
    """
    def __init__(self, input_dim):
        super().__init__()

        self.feature_net = nn.Sequential(
            nn.Linear(input_dim, 587),
            nn.ReLU(),

            nn.Linear(587, 261),
            nn.ReLU(),

            nn.Linear(261, 186),
            nn.ReLU(),

            nn.Linear(186, 99),
            nn.ReLU(),

            nn.Linear(99, 16)   # bottleneck embeddings (paper uses 16)
        )

        # regression head for training the DNN stage
        self.reg_head = nn.Linear(16, 1)

    def forward(self, x):
        emb = self.feature_net(x)
        out = self.reg_head(emb)
        return out, emb

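# convert the scaled features and targets to float32 tensors
X_train_tensor = torch.tensor(X_train_s, dtype=torch.float32)
X_test_tensor  = torch.tensor(X_test_s, dtype=torch.float32)

y_train_tensor = torch.tensor(y_train, dtype=torch.float32).view(-1, 1)
y_test_tensor  = torch.tensor(y_test, dtype=torch.float32).view(-1, 1)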

input_dim = X_train_s.shape[1]




In [5]:
model3_dnn = HybridFeatureExtractor(input_dim)

criterion = nn.MSELoss()
optimizer = optim.Adam(model3_dnn.parameters(), lr=0.001)

epochs = 200
batch_size = 64

best_val_mse = float("inf")

print("\nUL dataset (model3 - Hybrid DNN) training:")

X_val_tensor = torch.tensor(X_val_s, dtype=torch.float32)
y_val_tensor = torch.tensor(y_val, dtype=torch.float32).view(-1, 1)


for epoch in range(epochs):
    model3_dnn.train()

    perm = torch.randperm(X_train_tensor.size(0))

    for i in range(0, X_train_tensor.size(0), batch_size):
        idx = perm[i:i+batch_size]
        bx = X_train_tensor[idx]
        by = y_train_tensor[idx]

        optimizer.zero_grad()
        pred, _ = model3_dnn(bx)
        loss = criterion(pred, by)
        loss.backward()
        optimizer.step()

    # ---- epoch-end evaluation ----
    model3_dnn.eval()
    with torch.no_grad():
        train_pred, _ = model3_dnn(X_train_tensor)
        val_pred, _   = model3_dnn(X_val_tensor)

        train_mse = criterion(train_pred, y_train_tensor).item()
        val_mse   = criterion(val_pred, y_val_tensor).item()

    # track the best validation MSE (for reporting only; the weights are not checkpointed)
    if val_mse < best_val_mse:
        best_val_mse = val_mse

    # same printing style as model1
    if (epoch == 0) or ((epoch + 1) % 10 == 0):
        print(f"Epoch {epoch+1:03d} | train MSE {train_mse:.6f} | val MSE {val_mse:.6f}")

print(f"Best val MSE: {best_val_mse}")

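# Switch to inference mode and extract the 16-dim embeddings for the XGBoost stage.
# Note: these come from the final-epoch weights, not the best-validation epoch.
model3_dnn.eval()

with torch.no_grad():
    _, emb_train = model3_dnn(X_train_tensor)
    _, emb_test  = model3_dnn(X_test_tensor)

# XGBoost expects NumPy arrays
emb_train = emb_train.numpy()
emb_test  = emb_test.numpy()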

print("Embeddings shape (train):", emb_train.shape)
print("Embeddings shape (test) :", emb_test.shape)



UL dataset (model3 - Hybrid DNN) training:
Epoch 001 | train MSE 2.250002 | val MSE 2.081907
Epoch 010 | train MSE 0.100358 | val MSE 0.089086
Epoch 020 | train MSE 0.092565 | val MSE 0.084168
Epoch 030 | train MSE 0.096473 | val MSE 0.086918
Epoch 040 | train MSE 0.105854 | val MSE 0.100533
Epoch 050 | train MSE 0.101289 | val MSE 0.091574
Epoch 060 | train MSE 0.088105 | val MSE 0.080541
Epoch 070 | train MSE 0.161307 | val MSE 0.159567
Epoch 080 | train MSE 0.099074 | val MSE 0.091948
Epoch 090 | train MSE 0.083177 | val MSE 0.077287
Epoch 100 | train MSE 0.103379 | val MSE 0.100173
Epoch 110 | train MSE 0.090587 | val MSE 0.087209
Epoch 120 | train MSE 0.124728 | val MSE 0.122327
Epoch 130 | train MSE 0.127948 | val MSE 0.126945
Epoch 140 | train MSE 0.101011 | val MSE 0.094028
Epoch 150 | train MSE 0.097970 | val MSE 0.095075
Epoch 160 | train MSE 0.102417 | val MSE 0.094853
Epoch 170 | train MSE 0.080623 | val MSE 0.076628
Epoch 180 | train MSE 0.095710 | val MSE 0.091870
Epoch 
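
The validation MSE in the log above fluctuates noticeably between epochs, so the final-epoch weights are not necessarily the best ones. One possible remedy, sketched below on synthetic data, is to snapshot the `state_dict` whenever the validation loss improves and restore it after training. The tiny model, the data, and the hyperparameters are hypothetical placeholders, not this notebook's setup.

```python
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-in data (hypothetical; not the notebook's dataset).
X = torch.randn(200, 3)
y = X.sum(dim=1, keepdim=True) + 0.1 * torch.randn(200, 1)
X_tr, y_tr, X_va, y_va = X[:160], y[:160], X[160:], y[160:]

model = nn.Sequential(nn.Linear(3, 16), nn.ReLU(), nn.Linear(16, 1))
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

best_val, best_state = float("inf"), None
for epoch in range(50):
    # one full-batch training step per epoch, to keep the sketch short
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(X_tr), y_tr)
    loss.backward()
    optimizer.step()

    # epoch-end validation
    model.eval()
    with torch.no_grad():
        val = criterion(model(X_va), y_va).item()

    if val < best_val:
        best_val = val
        # deepcopy takes a snapshot; state_dict() alone returns live references
        best_state = copy.deepcopy(model.state_dict())

# restore the best-validation weights before extracting embeddings
model.load_state_dict(best_state)
model.eval()
with torch.no_grad():
    restored_val = criterion(model(X_va), y_va).item()

print(f"best val MSE {best_val:.6f} | restored val MSE {restored_val:.6f}")
```

Applied to the notebook's loop, the same pattern would make the embeddings passed to XGBoost come from the epoch with the lowest validation MSE; whether that improves the final test metrics would need to be checked empirically.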

In [6]:
xgb = XGBRegressor(
    max_depth=5,
    n_estimators=256,
    learning_rate=0.22,
    objective="reg:squarederror",
    random_state=42
)

xgb.fit(emb_train, y_train)

y_pred_xgb = xgb.predict(emb_test).reshape(-1)
y_true = np.asarray(y_test).reshape(-1)

def mean_relative_error(y_true, y_pred, eps=1e-9):
    """Mean relative error in percent; eps guards against division by zero."""
    y_true = np.asarray(y_true).reshape(-1)
    y_pred = np.asarray(y_pred).reshape(-1)
    return float(np.mean(np.abs(y_true - y_pred) / (np.abs(y_true) + eps)) * 100)

mse  = mean_squared_error(y_true, y_pred_xgb)
rmse = float(np.sqrt(mse))
mae  = mean_absolute_error(y_true, y_pred_xgb)
mre  = mean_relative_error(y_true, y_pred_xgb)

print("\n=== Model 3: Hybrid DNN–XGBoost ===")
print("X:", feature_cols, " y:", target_col)
print(f"MSE  : {mse:.6f}")
print(f"RMSE : {rmse:.6f}")
print(f"MAE  : {mae:.6f}")
print(f"MRE% : {mre:.4f}")



=== Model 3: Hybrid DNN–XGBoost ===
X: ['airtime', 'selected_mcs', 'txgain']  y: pm_power
MSE  : 0.106532
RMSE : 0.326393
MAE  : 0.226258
MRE% : 1.9640
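
As a small sanity check on how these metrics relate, the sketch below evaluates the same MRE formula on three made-up values and confirms that the reported RMSE is the square root of the reported MSE. The toy arrays are illustrative only and are not taken from the dataset.

```python
import numpy as np

def mean_relative_error(y_true, y_pred, eps=1e-9):
    """Mean relative error in percent (same formula as in the cell above)."""
    y_true = np.asarray(y_true).reshape(-1)
    y_pred = np.asarray(y_pred).reshape(-1)
    return float(np.mean(np.abs(y_true - y_pred) / (np.abs(y_true) + eps)) * 100)

# Toy values: relative errors are 1/10, 2/20 and 0/40, i.e. 0.1, 0.1, 0.0
y_true_toy = np.array([10.0, 20.0, 40.0])
y_pred_toy = np.array([9.0, 22.0, 40.0])
mre_toy = mean_relative_error(y_true_toy, y_pred_toy)
print(f"toy MRE%: {mre_toy:.4f}")

# RMSE is the square root of MSE; check against the values printed above
rmse_from_mse = float(np.sqrt(0.106532))
print(f"sqrt(MSE): {rmse_from_mse:.6f}")
```

Because every error is divided by its own true value, MRE weights errors on small targets more heavily than MAE does.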
