In [1]:
import pandas as pd
from src.utils import evaluate

In [2]:
df = pd.read_csv('data/STORM_preprocessed_medianfill_1.csv', index_col=0) # 200 column
evaluate_dict = dict()

| Deep Learning Network  | Strengths                                                 | Weaknesses                                             | When to Use                                      |
|------------------------|-----------------------------------------------------------|--------------------------------------------------------|--------------------------------------------------|
| **MLP (Multi-Layer Perceptron)** | - Simple, easy to implement<br>- Effective for small, simple datasets | - Poor fit for data with complex structure<br>- Prone to overfitting on large datasets | - Tabular data<br>- When no sequential or spatial processing is needed |
| **RNN (Recurrent Neural Network)** | - Good for sequential data<br>- Effective for time-series analysis | - Hard to train because of vanishing gradients<br>- Slow on long sequences | - Time-series forecasting<br>- Analyzing transaction histories or event sequences |
| **LSTM (Long Short-Term Memory)** | - Addresses the vanishing gradient problem of RNNs<br>- Retains long-term information well | - Computationally expensive<br>- Hard to tune | - Long time series<br>- When important events from far back must be remembered |
| **GRU (Gated Recurrent Unit)** | - Lighter and faster than LSTM<br>- Effective on short sequences | - Weaker than LSTM at representing long-term information | - When more speed than LSTM is needed<br>- Short time series |
| **CNN (Convolutional Neural Network)** | - Strong feature extraction<br>- Well suited to image and spatial data | - Not effective for sequential data | - Regression on images (e.g., predicting price from photos)<br>- Spatial data analysis |
| **ResNet (Residual Network)** | - Trains deep networks without running into vanishing gradients<br>- Effective at extracting complex features | - Computationally expensive<br>- Needs a lot of data to avoid overfitting | - When a deep network is needed<br>- Prediction or classification on complex image data |
| **Transformers**       | - Handles both sequential and spatial data well<br>- Effective with large datasets | - Resource intensive<br>- Needs large datasets to train well | - Long-horizon time-series forecasting<br>- Natural language processing and complex sequence forecasting |
| **TabNet**             | - Effective on tabular data<br>- Interpretable thanks to its attention mechanism | - Hard to tune and optimize<br>- Needs more data than an MLP | - When the model must be both strong and interpretable<br>- Complex tabular-data problems |


# Model Selection for Hurricane Data Regression

1. **`TabNet`** 
- TabNet is highly effective for **structured tabular data**, such as historical hurricane records where features may include wind speed, pressure, sea surface temperature, and atmospheric conditions.
- It uses a **sequential attention mechanism** that allows the model to focus on relevant features at different steps, improving interpretability.
- Unlike traditional neural networks, TabNet balances both **accuracy and interpretability**, making it useful when understanding the contribution of each feature to predictions is essential.

2. **`ResNet (Residual Network)`**
- ResNet is a powerful architecture for **deep models**, especially for complex data representations. Its **residual connections** mitigate vanishing gradients, which makes deeper networks easier to train.
- While originally designed for image data, ResNet has been adapted for other types of data where deeper architectures are needed to capture complex patterns.


3. **`LSTM (Long Short-Term Memory)`**
- LSTM networks are well suited to **time-series data** because they can **retain long-term dependencies** and model sequential relationships.
- Unlike standard RNNs, LSTM largely mitigates the **vanishing gradient problem**, making it suitable for learning patterns in long-term hurricane data.
- Since hurricanes are influenced by **seasonal cycles and long-term climatic trends**, LSTM is a reasonable choice for capturing these **temporal dependencies** across years.


## 1. Target 1: TotalDeaths

In [3]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

LINEAR_TARGETS = ["TotalDeaths", "NoInjured", "TotalDamageAdjusted(000US$)"]
ATTRIBUTES = ['Year', 'Month', 'MainLandfallLocation', 'OFDAResponse', 'Appeal', 'Declaration', 'LandfallMagnitude(kph)', 'LandfallPressure(mb)']
CATEGORICAL_TARGETS = ['Flood', 'Slide']

X = df[ATTRIBUTES + CATEGORICAL_TARGETS]
y = df[LINEAR_TARGETS[0]]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

# standardize
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
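
The scaler above is fit on the training split only and then reused on the test split. A minimal NumPy sketch of the same arithmetic, assuming made-up arrays (`train` and `test` are illustrative names, not the hurricane data):

```python
import numpy as np

# Illustrative arrays only; these are not rows from the STORM dataset.
train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
test = np.array([[2.5, 25.0], [5.0, 50.0]])

# Statistics come from the training split alone (the "fit" part of fit_transform).
mean = train.mean(axis=0)
std = train.std(axis=0)  # population std (ddof=0), which StandardScaler also uses

train_scaled = (train - mean) / std
# The test split reuses the training statistics (what transform does).
test_scaled = (test - mean) / std

print(train_scaled.mean(axis=0))  # close to 0 for every column
print(test_scaled)
```

Reusing the training mean and standard deviation keeps information about the test rows out of the preprocessing step.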

### 1.1. TabNet

In [4]:
import numpy as np
import torch
from pytorch_tabnet.tab_model import TabNetRegressor
from sklearn.metrics import mean_squared_error

# Convert data to NumPy arrays (if they are not already)
X_train_np = X_train_scaled
X_test_np = X_test_scaled
y_train_np = y_train.values.reshape(-1, 1)
y_test_np = y_test.values.reshape(-1, 1)

# Define the TabNet Regressor
tabnet_model = TabNetRegressor()

# Train the model; patience equals max_epochs, so early stopping never triggers
tabnet_model.fit(
    X_train_np, y_train_np,
    eval_set=[(X_test_np, y_test_np)],
    eval_metric=['rmse'],
    max_epochs=100,
    patience=100,
    batch_size=32,
    virtual_batch_size=8
)


In [32]:
eval_values = evaluate(tabnet_model, X_test_np, y_test_np, threshold=0.3, mode="regression")
evaluate_dict = {}
evaluate_dict["TabNet"] = eval_values

eval_values

{'mae': 48.32,
 'mse': 6010.4,
 'rmse': 77.53,
 'mae_upperbound_tolerance': -32.68,
 'rmse_upperbound_tolerance': -51.47,
 'mse_upperbound_tolerance': -3747.03}
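
The `evaluate` helper comes from `src.utils` and its source is not shown here. As a hedged sketch, the three standard error metrics in its output could be computed as below; `regression_errors` is a hypothetical name, and the `*_upperbound_tolerance` values are not reproduced because their definition (and the role of `threshold`) is not given in this notebook.

```python
import numpy as np

def regression_errors(y_true, y_pred):
    """Return MAE, MSE and RMSE for two equally sized arrays."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    residual = y_true - y_pred
    mse = float(np.mean(residual ** 2))
    return {
        "mae": float(np.mean(np.abs(residual))),
        "mse": mse,
        "rmse": float(np.sqrt(mse)),  # RMSE is the square root of MSE
    }

# Illustrative values only, not predictions from the notebook.
errors = regression_errors([10, 0, 5, 25], [12, 1, 5, 20])
print(errors)
```

Because MSE squares each residual, a few badly predicted storms can dominate MSE and RMSE while leaving MAE comparatively small.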

### 1.2. ResNet

In [33]:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

In [36]:
def residual_block(x, units):
    shortcut = x
    x = layers.Dense(units, activation='relu')(x)
    x = layers.Dense(units)(x)  # No activation for the second layer
    x = layers.add([x, shortcut])  # Add the shortcut
    x = layers.Activation('relu')(x)
    return x


def build_resnet(input_shape, output_units):
    inputs = keras.Input(shape=input_shape)
    x = layers.Dense(64, activation='relu')(inputs)

    # Add several residual blocks
    for _ in range(3):  # Adjust the number of blocks as needed
        x = residual_block(x, 64)

    x = layers.Dense(32, activation='relu')(x)
    outputs = layers.Dense(output_units)(x)  # For regression, no activation here

    model = keras.Model(inputs=inputs, outputs=outputs)
    return model

resnet_model = build_resnet(input_shape=(X_train_scaled.shape[1],), output_units=1)
resnet_model.compile(optimizer='adam', loss='mean_squared_error')

checkpoint = tf.keras.callbacks.ModelCheckpoint("best_model.keras", save_best_only=True, monitor="val_loss", mode="min")

resnet_model.fit(X_train_scaled, y_train_np, epochs=100, batch_size=32, validation_data=(X_test_scaled, y_test_np), callbacks=[checkpoint], verbose=0)


<keras.src.callbacks.history.History at 0x23ab41815b0>
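
To make the shortcut in `residual_block` concrete, here is a rough NumPy sketch of its forward pass. The function name `residual_block_forward` and the zero-weight setup are assumptions chosen for illustration; this is not the Keras implementation.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block_forward(x, w1, b1, w2, b2):
    """Dense(relu) -> Dense(linear) -> add shortcut -> relu."""
    hidden = relu(x @ w1 + b1)
    out = hidden @ w2 + b2   # second layer has no activation
    return relu(out + x)     # the shortcut needs matching widths

rng = np.random.default_rng(0)
units = 64
x = rng.normal(size=(5, units))  # batch of 5 illustrative inputs

# With all-zero weights the block reduces to relu(x):
# the shortcut carries the input through unchanged.
zeros_w = np.zeros((units, units))
zeros_b = np.zeros(units)
passthrough = residual_block_forward(x, zeros_w, zeros_b, zeros_w, zeros_b)
print(np.allclose(passthrough, relu(x)))
```

The addition only works when the block input and output have the same width, which is why `build_resnet` first projects the features to 64 units before stacking the blocks.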

In [37]:
eval_values = evaluate(resnet_model, X_test_scaled, y_test_np, threshold=0.3, mode="regression")
evaluate_dict["ResNet"] = eval_values

eval_values

2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 89ms/step


{'mae': 204.56,
 'mse': 458741.26,
 'rmse': 677.3,
 'mae_upperbound_tolerance': -188.91,
 'rmse_upperbound_tolerance': -651.25,
 'mse_upperbound_tolerance': -456477.89}

### 1.3. LSTM

In [5]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout


In [42]:
# Reshape the data for LSTM
def create_dataset(X, y, time_steps=1):
    Xs, ys = [], []
    for i in range(len(X) - time_steps):
        Xs.append(X[i:(i + time_steps)])
        ys.append(y.iloc[i + time_steps])  # Target of the row that follows the window
    return np.array(Xs), np.array(ys)

TIME_STEPS = 1  # You can change this value based on your needs
X_train_lstm, y_train_lstm = create_dataset(pd.DataFrame(X_train_scaled), pd.Series(y_train), TIME_STEPS)
X_test_lstm, y_test_lstm = create_dataset(pd.DataFrame(X_test_scaled), pd.Series(y_test), TIME_STEPS)

# Reshape input to be [samples, time steps, features]
X_train_lstm = X_train_lstm.reshape((X_train_lstm.shape[0], X_train_lstm.shape[1], X_train_lstm.shape[2]))
X_test_lstm = X_test_lstm.reshape((X_test_lstm.shape[0], X_test_lstm.shape[1], X_test_lstm.shape[2]))

# Build the LSTM model

model = Sequential()

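# Backbone (`keras` comes from the earlier `from tensorflow import keras`)
model.add(keras.Input(shape=(TIME_STEPS, X_train_lstm.shape[2])))
# return_sequences=True so the second LSTM receives a 3D sequence
model.add(LSTM(512, activation='relu', return_sequences=True))
model.add(Dropout(0.5))
model.add(LSTM(256, activation='relu'))
model.add(Dropout(0.5))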

# fully connected
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.2))

model.add(Dense(64, activation='relu'))
model.add(Dropout(0.2))

model.add(Dense(32, activation='relu'))
model.add(Dropout(0.2))

model.add(Dense(16, activation='relu'))
model.add(Dropout(0.2))

model.add(Dense(1))

model.summary()



<keras.src.callbacks.history.History at 0x23ab879d730>
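
The windowing done by `create_dataset` is easier to see on toy data. The sketch below applies the same pairing rule to plain NumPy arrays; `make_windows` and the toy arrays are illustrative assumptions, not part of the notebook.

```python
import numpy as np

def make_windows(X, y, time_steps=1):
    """Same pairing rule as create_dataset, for plain NumPy arrays."""
    Xs, ys = [], []
    for i in range(len(X) - time_steps):
        Xs.append(X[i:i + time_steps])  # rows i .. i + time_steps - 1
        ys.append(y[i + time_steps])    # target of the row after the window
    return np.array(Xs), np.array(ys)

# Toy data: row k has feature value k and target 100 + k.
X = np.arange(5).reshape(-1, 1)
y = 100 + np.arange(5)

Xw, yw = make_windows(X, y, time_steps=1)
print(Xw.shape)         # (4, 1, 1): samples, time steps, features
print(Xw[:, 0, 0], yw)  # features 0..3 paired with targets 101..104
```

With `TIME_STEPS = 1`, each sample is a single row paired with the target of the next row, and one row is lost per split. Since `train_test_split` shuffles rows by default, the "next" row is not necessarily the next storm in time; whether that offset is intended is not stated in the notebook.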

In [None]:
# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')

# Train the model
model.fit(X_train_lstm, y_train_lstm, epochs=100, batch_size=32, validation_data=(X_test_lstm, y_test_lstm), verbose=0)

In [43]:
eval_values = evaluate(model, X_test_lstm, y_test_lstm, threshold=0.3, mode="regression")
evaluate_dict["LSTM"] = eval_values

eval_values

2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 181ms/step


{'mae': 45.44,
 'mse': 8351.36,
 'rmse': 91.39,
 'mae_upperbound_tolerance': -31.02,
 'rmse_upperbound_tolerance': -66.05,
 'mse_upperbound_tolerance': -6212.31}

In [44]:
# compare metrics value
def highlight_max(s):
    is_max = s == s.max()
    return ['color: red' if v else '' for v in is_max]

def highlight_min(s):
    is_min = s == s.min()
    return ['color: red' if v else '' for v in is_min]

def highlight_row(row, selected_method):
    return ['background-color: black;' if row['Method'] in selected_method else ''
            for _ in row]

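# model is a Keras Sequential, so this yields ['Sequential']; it matches no key in evaluate_dict and no row is highlighted
selected_method = [model.__class__.__name__]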
eval_value_df = pd.DataFrame(evaluate_dict).T.reset_index().rename(columns={"index":"Method"})

eval_value_df = (
    eval_value_df.style
    .apply(highlight_max, subset=["mae_upperbound_tolerance", "rmse_upperbound_tolerance", "mse_upperbound_tolerance"])
    .apply(highlight_min, subset=["mae", "mse", "rmse"])
    .apply(lambda row: highlight_row(row, selected_method), axis=1 )
    .format(precision=2)
)

eval_value_df

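|   | Method | mae | mse | rmse | mae_upperbound_tolerance | rmse_upperbound_tolerance | mse_upperbound_tolerance |
|---|--------|-----|-----|------|--------------------------|---------------------------|--------------------------|
| 0 | TabNet | 48.32 | 6010.4 | 77.53 | -32.68 | -51.47 | -3747.03 |
| 1 | ResNet | 204.56 | 458741.26 | 677.3 | -188.91 | -651.25 | -456477.89 |
| 2 | LSTM | 45.44 | 8351.36 | 91.39 | -31.02 | -66.05 | -6212.31 |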
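
The styled table marks the best value per column visually. A small sketch of the same comparison in plain Python, using the TotalDeaths error values reported above (the `results` and `best` names are illustrative):

```python
# Values copied from the TotalDeaths comparison above.
results = {
    "TabNet": {"mae": 48.32, "mse": 6010.4, "rmse": 77.53},
    "ResNet": {"mae": 204.56, "mse": 458741.26, "rmse": 677.3},
    "LSTM": {"mae": 45.44, "mse": 8351.36, "rmse": 91.39},
}

# Lower is better for all three error metrics.
best = {
    metric: min(results, key=lambda name: results[name][metric])
    for metric in ("mae", "mse", "rmse")
}
print(best)  # {'mae': 'LSTM', 'mse': 'TabNet', 'rmse': 'TabNet'}
```

For this target the ranking depends on the metric: LSTM has the lowest MAE, while TabNet has the lowest MSE and RMSE.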


## 2. Target 2: NoInjured

In [48]:
LINEAR_TARGETS = ["TotalDeaths", "NoInjured", "TotalDamageAdjusted(000US$)"]
ATTRIBUTES = ['Year', 'Month', 'MainLandfallLocation', 'OFDAResponse', 'Appeal', 'Declaration', 'LandfallMagnitude(kph)', 'LandfallPressure(mb)']
CATEGORICAL_TARGETS = ['Flood', 'Slide']

X = df[ATTRIBUTES + CATEGORICAL_TARGETS]
y = df[LINEAR_TARGETS[1]]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

# standardize
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

### 2.1. TabNet

In [52]:
import numpy as np
import torch
from pytorch_tabnet.tab_model import TabNetRegressor
from sklearn.metrics import mean_squared_error

# Convert data to NumPy arrays (if they are not already)
X_train_np = X_train_scaled
X_test_np = X_test_scaled
y_train_np = y_train.values.reshape(-1, 1)
y_test_np = y_test.values.reshape(-1, 1)

# Define the TabNet Regressor
tabnet_model = TabNetRegressor()

# Train the model; patience equals max_epochs, so early stopping never triggers
tabnet_model.fit(
    X_train_np, y_train_np,
    eval_set=[(X_test_np, y_test_np)],
    eval_metric=['rmse'],
    max_epochs=100,
    patience=100,
    batch_size=32,
    virtual_batch_size=8
)



epoch 0  | loss: 51176.87402| val_0_rmse: 419.46671|  0:00:00s
epoch 1  | loss: 41880.02051| val_0_rmse: 419.15594|  0:00:00s
epoch 2  | loss: 43763.24365| val_0_rmse: 418.67334|  0:00:00s
epoch 3  | loss: 32999.54761| val_0_rmse: 417.40374|  0:00:00s
epoch 4  | loss: 41008.11987| val_0_rmse: 415.96654|  0:00:00s
epoch 5  | loss: 43509.63452| val_0_rmse: 415.42674|  0:00:01s
epoch 6  | loss: 47634.48047| val_0_rmse: 414.59451|  0:00:01s
epoch 7  | loss: 35576.15259| val_0_rmse: 416.31958|  0:00:01s
epoch 8  | loss: 35163.25983| val_0_rmse: 415.82687|  0:00:01s
epoch 9  | loss: 41436.69873| val_0_rmse: 414.20771|  0:00:01s
epoch 10 | loss: 35672.5542| val_0_rmse: 412.48017|  0:00:01s
epoch 11 | loss: 42418.5011| val_0_rmse: 410.94681|  0:00:02s
epoch 12 | loss: 42225.78857| val_0_rmse: 410.6045|  0:00:02s
epoch 13 | loss: 29692.9375| val_0_rmse: 410.91237|  0:00:02s
epoch 14 | loss: 37681.67114| val_0_rmse: 411.19666|  0:00:02s
epoch 15 | loss: 36882.25058| val_0_rmse: 410.53452|  0:00:
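
The `fit` call uses `patience=100` together with `max_epochs=100`. The sketch below applies a generic patience rule to the validation RMSE values visible in the log above, to show what a smaller patience would do. `early_stop_epoch` is a hypothetical helper and only approximates how a library such as pytorch-tabnet counts epochs without improvement.

```python
# Validation RMSE for epochs 0-15, copied from the log above.
val_rmse = [
    419.46671, 419.15594, 418.67334, 417.40374, 415.96654, 415.42674,
    414.59451, 416.31958, 415.82687, 414.20771, 412.48017, 410.94681,
    410.6045, 410.91237, 411.19666, 410.53452,
]

def early_stop_epoch(scores, patience):
    """Return the epoch at which training would stop, or None if it never does.

    Training stops once `patience` consecutive epochs pass without a new best.
    """
    best, since_best = float("inf"), 0
    for epoch, score in enumerate(scores):
        if score < best:
            best, since_best = score, 0
        else:
            since_best += 1
            if since_best >= patience:
                return epoch
    return None

print(early_stop_epoch(val_rmse, patience=2))    # stops early on this log
print(early_stop_epoch(val_rmse, patience=100))  # never triggers within these epochs
```

On these values a patience of 2 would stop at epoch 8, before the later improvements through epoch 15, which illustrates the trade-off a patience setting controls.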



In [53]:
eval_values = evaluate(tabnet_model, X_test_np, y_test_np, threshold=0.3, mode="regression")
evaluate_dict = {}
evaluate_dict["TabNet"] = eval_values

eval_values

{'mae': 122.22,
 'mse': 154905.79,
 'rmse': 393.58,
 'mae_upperbound_tolerance': -88.02,
 'rmse_upperbound_tolerance': -272.35,
 'mse_upperbound_tolerance': -105915.95}

### 2.2. ResNet

In [54]:
def residual_block(x, units):
    shortcut = x
    x = layers.Dense(units, activation='relu')(x)
    x = layers.Dense(units)(x)  # No activation for the second layer
    x = layers.add([x, shortcut])  # Add the shortcut
    x = layers.Activation('relu')(x)
    return x


def build_resnet(input_shape, output_units):
    inputs = keras.Input(shape=input_shape)
    x = layers.Dense(64, activation='relu')(inputs)

    # Add several residual blocks
    for _ in range(3):  # Adjust the number of blocks as needed
        x = residual_block(x, 64)

    x = layers.Dense(32, activation='relu')(x)
    outputs = layers.Dense(output_units)(x)  # For regression, no activation here

    model = keras.Model(inputs=inputs, outputs=outputs)
    return model

resnet_model = build_resnet(input_shape=(X_train_scaled.shape[1],), output_units=1)
resnet_model.compile(optimizer='adam', loss='mean_squared_error')

checkpoint = tf.keras.callbacks.ModelCheckpoint("best_model.keras", save_best_only=True, monitor="val_loss", mode="min")

resnet_model.fit(X_train_scaled, y_train_np, epochs=100, batch_size=32, validation_data=(X_test_scaled, y_test_np), callbacks=[checkpoint], verbose=0)

<keras.src.callbacks.history.History at 0x23abac1bcb0>

In [55]:
eval_values = evaluate(resnet_model, X_test_scaled, y_test_np, threshold=0.3, mode="regression")
evaluate_dict["ResNet"] = eval_values

eval_values

2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 91ms/step


{'mae': 127.81,
 'mse': 168930.73,
 'rmse': 411.01,
 'mae_upperbound_tolerance': -93.61,
 'rmse_upperbound_tolerance': -289.78,
 'mse_upperbound_tolerance': -119940.89}

### 2.3. LSTM

In [56]:
# Reshape the data for LSTM
def create_dataset(X, y, time_steps=1):
    Xs, ys = [], []
    for i in range(len(X) - time_steps):
        Xs.append(X[i:(i + time_steps)])
        ys.append(y.iloc[i + time_steps])  # Target of the row that follows the window
    return np.array(Xs), np.array(ys)

TIME_STEPS = 1  # You can change this value based on your needs
X_train_lstm, y_train_lstm = create_dataset(pd.DataFrame(X_train_scaled), pd.Series(y_train), TIME_STEPS)
X_test_lstm, y_test_lstm = create_dataset(pd.DataFrame(X_test_scaled), pd.Series(y_test), TIME_STEPS)

# Reshape input to be [samples, time steps, features]
X_train_lstm = X_train_lstm.reshape((X_train_lstm.shape[0], X_train_lstm.shape[1], X_train_lstm.shape[2]))
X_test_lstm = X_test_lstm.reshape((X_test_lstm.shape[0], X_test_lstm.shape[1], X_test_lstm.shape[2]))

# Build the LSTM model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(X_train_lstm.shape[1], X_train_lstm.shape[2])))
model.add(Dropout(0.2))
model.add(Dense(1))  # Output layer for regression (adjust this based on the number of targets)

# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')

# Train the model
model.fit(X_train_lstm, y_train_lstm, epochs=100, batch_size=32, validation_data=(X_test_lstm, y_test_lstm), verbose=0)



<keras.src.callbacks.history.History at 0x23abb10b470>

In [57]:
eval_values = evaluate(model, X_test_lstm, y_test_lstm, threshold=0.3, mode="regression")
evaluate_dict["LSTM"] = eval_values

eval_values

2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 198ms/step


{'mae': 111.92,
 'mse': 176573.78,
 'rmse': 420.21,
 'mae_upperbound_tolerance': -76.94,
 'rmse_upperbound_tolerance': -297.4,
 'mse_upperbound_tolerance': -126298.8}

In [58]:
# compare metrics value
def highlight_max(s):
    is_max = s == s.max()
    return ['color: red' if v else '' for v in is_max]

def highlight_min(s):
    is_min = s == s.min()
    return ['color: red' if v else '' for v in is_min]

def highlight_row(row, selected_method):
    return ['background-color: black;' if row['Method'] in selected_method else ''
            for _ in row]

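# model is a Keras Sequential, so this yields ['Sequential']; it matches no key in evaluate_dict and no row is highlighted
selected_method = [model.__class__.__name__]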
eval_value_df = pd.DataFrame(evaluate_dict).T.reset_index().rename(columns={"index":"Method"})

eval_value_df = (
    eval_value_df.style
    .apply(highlight_max, subset=["mae_upperbound_tolerance", "rmse_upperbound_tolerance", "mse_upperbound_tolerance"])
    .apply(highlight_min, subset=["mae", "mse", "rmse"])
    .apply(lambda row: highlight_row(row, selected_method), axis=1 )
    .format(precision=2)
)

eval_value_df

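|   | Method | mae | mse | rmse | mae_upperbound_tolerance | rmse_upperbound_tolerance | mse_upperbound_tolerance |
|---|--------|-----|-----|------|--------------------------|---------------------------|--------------------------|
| 0 | TabNet | 122.22 | 154905.79 | 393.58 | -88.02 | -272.35 | -105915.95 |
| 1 | ResNet | 127.81 | 168930.73 | 411.01 | -93.61 | -289.78 | -119940.89 |
| 2 | LSTM | 111.92 | 176573.78 | 420.21 | -76.94 | -297.4 | -126298.8 |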


## 3. Target 3: TotalDamageAdjusted(000US$)

In [59]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

LINEAR_TARGETS = ["TotalDeaths", "NoInjured", "TotalDamageAdjusted(000US$)"]
ATTRIBUTES = ['Year', 'Month', 'MainLandfallLocation', 'OFDAResponse', 'Appeal', 'Declaration', 'LandfallMagnitude(kph)', 'LandfallPressure(mb)']
CATEGORICAL_TARGETS = ['Flood', 'Slide']

X = df[ATTRIBUTES + CATEGORICAL_TARGETS]
y = df[LINEAR_TARGETS[2]]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

# standardize
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

### 3.1. TabNet

In [60]:
import numpy as np
import torch
from pytorch_tabnet.tab_model import TabNetRegressor
from sklearn.metrics import mean_squared_error

# Convert data to NumPy arrays (if they are not already)
X_train_np = X_train_scaled
X_test_np = X_test_scaled
y_train_np = y_train.values.reshape(-1, 1)
y_test_np = y_test.values.reshape(-1, 1)

# Define the TabNet Regressor
tabnet_model = TabNetRegressor()

# Train the model; patience equals max_epochs, so early stopping never triggers
tabnet_model.fit(
    X_train_np, y_train_np,
    eval_set=[(X_test_np, y_test_np)],
    eval_metric=['rmse'],
    max_epochs=100,
    patience=100,
    batch_size=32,
    virtual_batch_size=8
)



epoch 0  | loss: 80834277376.0| val_0_rmse: 111230.12582|  0:00:00s
epoch 1  | loss: 71588611072.0| val_0_rmse: 111229.40948|  0:00:00s
epoch 2  | loss: 74787424256.0| val_0_rmse: 111228.98446|  0:00:00s
epoch 3  | loss: 58221058048.0| val_0_rmse: 111227.06913|  0:00:00s
epoch 4  | loss: 66301756416.0| val_0_rmse: 111226.37414|  0:00:00s
epoch 5  | loss: 69256152064.0| val_0_rmse: 111224.56016|  0:00:00s
epoch 6  | loss: 79422047232.0| val_0_rmse: 111222.99023|  0:00:01s
epoch 7  | loss: 67533618176.0| val_0_rmse: 111219.85812|  0:00:01s
epoch 8  | loss: 66835597824.0| val_0_rmse: 111219.45679|  0:00:01s
epoch 9  | loss: 71169016320.0| val_0_rmse: 111216.7922|  0:00:01s
epoch 10 | loss: 65461849088.0| val_0_rmse: 111213.91041|  0:00:01s
epoch 11 | loss: 72556908032.0| val_0_rmse: 111209.91132|  0:00:01s
epoch 12 | loss: 71420956160.0| val_0_rmse: 111205.8656|  0:00:01s
epoch 13 | loss: 75314334208.0| val_0_rmse: 111203.0334|  0:00:02s
epoch 14 | loss: 72367283456.0| val_0_rmse: 111198.



In [63]:
eval_values = evaluate(tabnet_model, X_test_np, y_test_np, threshold=0.3, mode="regression")
evaluate_dict = {}
evaluate_dict["TabNet"] = eval_values

eval_values

{'mae': 58275.8,
 'mse': 12093829806.1,
 'rmse': 109971.95,
 'mae_upperbound_tolerance': -40384.24,
 'rmse_upperbound_tolerance': -81804.5,
 'mse_upperbound_tolerance': -9449145216.65}

### 3.2. ResNet

In [64]:
def residual_block(x, units):
    shortcut = x
    x = layers.Dense(units, activation='relu')(x)
    x = layers.Dense(units)(x)  # No activation for the second layer
    x = layers.add([x, shortcut])  # Add the shortcut
    x = layers.Activation('relu')(x)
    return x


def build_resnet(input_shape, output_units):
    inputs = keras.Input(shape=input_shape)
    x = layers.Dense(64, activation='relu')(inputs)

    # Add several residual blocks
    for _ in range(3):  # Adjust the number of blocks as needed
        x = residual_block(x, 64)

    x = layers.Dense(32, activation='relu')(x)
    outputs = layers.Dense(output_units)(x)  # For regression, no activation here

    model = keras.Model(inputs=inputs, outputs=outputs)
    return model

resnet_model = build_resnet(input_shape=(X_train_scaled.shape[1],), output_units=1)
resnet_model.compile(optimizer='adam', loss='mean_squared_error')

checkpoint = tf.keras.callbacks.ModelCheckpoint("best_model.keras", save_best_only=True, monitor="val_loss", mode="min")

resnet_model.fit(X_train_scaled, y_train_np, epochs=100, batch_size=32, validation_data=(X_test_scaled, y_test_np), callbacks=[checkpoint], verbose=0)

<keras.src.callbacks.history.History at 0x23ac061c6b0>

In [65]:
eval_values = evaluate(resnet_model, X_test_scaled, y_test_np, threshold=0.3, mode="regression")
evaluate_dict["ResNet"] = eval_values

eval_values

2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 93ms/step


{'mae': 88226.12,
 'mse': 21733006445.02,
 'rmse': 147421.19,
 'mae_upperbound_tolerance': -70334.56,
 'rmse_upperbound_tolerance': -119253.73,
 'mse_upperbound_tolerance': -19088321855.58}

### 3.3. LSTM

In [69]:
# Reshape the data for LSTM
def create_dataset(X, y, time_steps=1):
    Xs, ys = [], []
    for i in range(len(X) - time_steps):
        Xs.append(X[i:(i + time_steps)])
        ys.append(y.iloc[i + time_steps])  # Target of the row that follows the window
    return np.array(Xs), np.array(ys)

TIME_STEPS = 1  # You can change this value based on your needs
X_train_lstm, y_train_lstm = create_dataset(pd.DataFrame(X_train_scaled), pd.Series(y_train), TIME_STEPS)
X_test_lstm, y_test_lstm = create_dataset(pd.DataFrame(X_test_scaled), pd.Series(y_test), TIME_STEPS)

# Reshape input to be [samples, time steps, features]
X_train_lstm = X_train_lstm.reshape((X_train_lstm.shape[0], X_train_lstm.shape[1], X_train_lstm.shape[2]))
X_test_lstm = X_test_lstm.reshape((X_test_lstm.shape[0], X_test_lstm.shape[1], X_test_lstm.shape[2]))

# Build the LSTM model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(X_train_lstm.shape[1], X_train_lstm.shape[2])))
model.add(Dropout(0.2))
model.add(Dense(1))  # Output layer for regression (adjust this based on the number of targets)

# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')

# Train the model
model.fit(X_train_lstm, y_train_lstm, epochs=100, batch_size=32, validation_data=(X_test_lstm, y_test_lstm), verbose=0)



<keras.src.callbacks.history.History at 0x23ac3b42d20>

In [70]:
eval_values = evaluate(model, X_test_lstm, y_test_lstm, threshold=0.3, mode="regression")
evaluate_dict["LSTM"] = eval_values

eval_values

2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 180ms/step


{'mae': 60492.84,
 'mse': 12691205641.19,
 'rmse': 112655.25,
 'mae_upperbound_tolerance': -42337.64,
 'rmse_upperbound_tolerance': -84144.33,
 'mse_upperbound_tolerance': -9981629863.11}

In [68]:
# compare metrics value
def highlight_max(s):
    is_max = s == s.max()
    return ['color: red' if v else '' for v in is_max]

def highlight_min(s):
    is_min = s == s.min()
    return ['color: red' if v else '' for v in is_min]

def highlight_row(row, selected_method):
    return ['background-color: black;' if row['Method'] in selected_method else ''
            for _ in row]

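# model is a Keras Sequential, so this yields ['Sequential']; it matches no key in evaluate_dict and no row is highlighted
selected_method = [model.__class__.__name__]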
eval_value_df = pd.DataFrame(evaluate_dict).T.reset_index().rename(columns={"index":"Method"})

eval_value_df = (
    eval_value_df.style
    .apply(highlight_max, subset=["mae_upperbound_tolerance", "rmse_upperbound_tolerance", "mse_upperbound_tolerance"])
    .apply(highlight_min, subset=["mae", "mse", "rmse"])
    .apply(lambda row: highlight_row(row, selected_method), axis=1 )
    .format(precision=2)
)

eval_value_df

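|   | Method | mae | mse | rmse | mae_upperbound_tolerance | rmse_upperbound_tolerance | mse_upperbound_tolerance |
|---|--------|-----|-----|------|--------------------------|---------------------------|--------------------------|
| 0 | TabNet | 58275.8 | 12093829806.1 | 109971.95 | -40384.24 | -81804.5 | -9449145216.65 |
| 1 | ResNet | 88226.12 | 21733006445.02 | 147421.19 | -70334.56 | -119253.73 | -19088321855.58 |
| 2 | LSTM | 60495.14 | 12691482300.88 | 112656.48 | -42339.94 | -84145.56 | -9981906522.8 |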


TabNet achieves the lowest MSE and RMSE on all three targets; LSTM has a slightly lower MAE on targets 1 and 2.