# Neural Networks

For: Pao Pao

The idea for this file is to train a a ANN model given the dataset. The data files you will need to import is unfortunately not ready. But for now, write and test the code using `model_building_data.csv` which is provided in the data folder. Keep in mind that the final training/testing files will have more fields.

Neural Network based models cant handle missing data. Also, the range should be between -1 to 1 where possible. Also, an ANN is *not* a time series model. So lagged variables are needed.

Use Keras the both models. The task itself is not too complex so Keras is more suitable.

Tune all parameters using Optuna if possible (or use 3-fold cv) with the timesplit function like in assignment 1.

This file should save the output of the prediction in the format:

| ticker | quarter_year  | log_revenue_prediction | CAR_prediction |
|--------|---------------|------------------------|----------------|
| BAC    | Q1 2001       | 123                    | 0.5            |
| JPM    | Q1 2001       | 456                    | 0.8            |
| WFC    | Q1 2001       | 789                    | 0.25           |

In [1]:
!pip install optuna

Collecting optuna
  Downloading optuna-4.3.0-py3-none-any.whl.metadata (17 kB)
Collecting alembic>=1.5.0 (from optuna)
  Downloading alembic-1.15.2-py3-none-any.whl.metadata (7.3 kB)
Collecting colorlog (from optuna)
  Downloading colorlog-6.9.0-py3-none-any.whl.metadata (10 kB)
Downloading optuna-4.3.0-py3-none-any.whl (386 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m386.6/386.6 kB[0m [31m8.4 MB/s[0m eta [36m0:00:00[0m
[?25hDownloading alembic-1.15.2-py3-none-any.whl (231 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m231.9/231.9 kB[0m [31m21.3 MB/s[0m eta [36m0:00:00[0m
[?25hDownloading colorlog-6.9.0-py3-none-any.whl (11 kB)
Installing collected packages: colorlog, alembic, optuna
Successfully installed alembic-1.15.2 colorlog-6.9.0 optuna-4.3.0


In [2]:
import pandas as pd
import numpy as np

from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.callbacks import EarlyStopping

from sklearn.model_selection import KFold
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error
from sklearn.decomposition import PCA

import optuna

In [3]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


## Revenue Prediction

In [4]:
df_train_rev = pd.read_csv("/content/drive/MyDrive/NUS/FT5005/training_data/train_data_REV_with_text.csv")
df_test_rev = pd.read_csv("/content/drive/MyDrive/NUS/FT5005/training_data/test_data_REV_with_text.csv")
df_train_rev = df_train_rev.sort_values(by=['tic', 'datacqtr']).reset_index(drop=True)
df_test_rev = df_test_rev.sort_values(by=['tic', 'datacqtr']).reset_index(drop=True)

**Lag some variables**

- Total Current Operating Revenue
- Net Charge-Offs
- Invested Capital - Total

In [5]:
# Create the lagged column
df_train_rev['Total Current Operating Revenue_lag1'] = df_train_rev.groupby('tic')['Total Current Operating Revenue'].shift(1)
df_train_rev['Total Current Operating Revenue_lag2'] = df_train_rev.groupby('tic')['Total Current Operating Revenue'].shift(2)
df_train_rev['Total Current Operating Revenue_lag3'] = df_train_rev.groupby('tic')['Total Current Operating Revenue'].shift(3)
df_train_rev['Total Current Operating Revenue_lag4'] = df_train_rev.groupby('tic')['Total Current Operating Revenue'].shift(4)

df_train_rev['Net Charge-Offs_lag1'] = df_train_rev.groupby('tic')['Net Charge-Offs'].shift(1)
df_train_rev['Net Charge-Offs_lag2'] = df_train_rev.groupby('tic')['Net Charge-Offs'].shift(2)
df_train_rev['Net Charge-Offs_lag3'] = df_train_rev.groupby('tic')['Net Charge-Offs'].shift(3)
df_train_rev['Net Charge-Offs_lag4'] = df_train_rev.groupby('tic')['Net Charge-Offs'].shift(4)

df_train_rev['Invested Capital - Total_lag1'] = df_train_rev.groupby('tic')['Invested Capital - Total'].shift(1)
df_train_rev['Invested Capital - Total_lag2'] = df_train_rev.groupby('tic')['Invested Capital - Total'].shift(2)
df_train_rev['Invested Capital - Total_lag3'] = df_train_rev.groupby('tic')['Invested Capital - Total'].shift(3)
df_train_rev['Invested Capital - Total_lag4'] = df_train_rev.groupby('tic')['Invested Capital - Total'].shift(4)



df_test_rev['Total Current Operating Revenue_lag1'] = df_test_rev.groupby('tic')['Total Current Operating Revenue'].shift(1)
df_test_rev['Total Current Operating Revenue_lag2'] = df_test_rev.groupby('tic')['Total Current Operating Revenue'].shift(2)
df_test_rev['Total Current Operating Revenue_lag3'] = df_test_rev.groupby('tic')['Total Current Operating Revenue'].shift(3)
df_test_rev['Total Current Operating Revenue_lag4'] = df_test_rev.groupby('tic')['Total Current Operating Revenue'].shift(4)
df_test_rev['Net Charge-Offs_lag1'] = df_test_rev.groupby('tic')['Net Charge-Offs'].shift(1)
df_test_rev['Net Charge-Offs_lag2'] = df_test_rev.groupby('tic')['Net Charge-Offs'].shift(2)
df_test_rev['Net Charge-Offs_lag3'] = df_test_rev.groupby('tic')['Net Charge-Offs'].shift(3)
df_test_rev['Net Charge-Offs_lag4'] = df_test_rev.groupby('tic')['Net Charge-Offs'].shift(4)
df_test_rev['Invested Capital - Total_lag1'] = df_test_rev.groupby('tic')['Invested Capital - Total'].shift(1)
df_test_rev['Invested Capital - Total_lag2'] = df_test_rev.groupby('tic')['Invested Capital - Total'].shift(2)
df_test_rev['Invested Capital - Total_lag3'] = df_test_rev.groupby('tic')['Invested Capital - Total'].shift(3)
df_test_rev['Invested Capital - Total_lag4'] = df_test_rev.groupby('tic')['Invested Capital - Total'].shift(4)

In [6]:
# Drop NA

df_train_rev = df_train_rev.dropna()
df_test_rev = df_test_rev.dropna()

In [7]:
X_rev_train = df_train_rev.drop(columns=["datacqtr", "tic", "Total Current Operating Revenue"]).copy().to_numpy()
y_rev_train = df_train_rev["Total Current Operating Revenue"].copy().to_numpy()

X_rev_test = df_test_rev.drop(columns=["datacqtr", "tic", "Total Current Operating Revenue"]).copy().to_numpy()
y_rev_test = df_test_rev["Total Current Operating Revenue"].copy().to_numpy()

| Hyperparameter         | Type          | Range / Options                              | Description                                           |
|------------------------|---------------|----------------------------------------------|-------------------------------------------------------|
| `n_layers`             | Integer       | 1 to 3                                       | Number of hidden Dense layers                        |
| `n_units_l{i}`         | Integer       | 32 to 256 (step=32)                          | Number of units in the i-th hidden layer             |
| `activation`           | Categorical   | `'relu'`, `'tanh'`                           | Activation function for all layers                   |
| `dropout_l{i}`         | Float         | 0.0 to 0.5 (step=0.1)                        | Dropout rate after the i-th hidden layer             |
| `optimizer`            | Categorical   | `'adam'`, `'rmsprop'`                        | Optimizer choice                                     |
| `lr` (learning rate)   | Float (log)   | 1e-4 to 1e-2                                  | Learning rate for the optimizer                      |
| `batch_size`           | Categorical   | 16, 32, 64, 128                              | Batch size used in training                          |

In [8]:
# Fit PCA on training data
pca = PCA(n_components=0.95)  # Keep 95% of variance (or set an integer number of components)
X_rev_train_pca = pca.fit_transform(X_rev_train)
X_rev_test_pca = pca.transform(X_rev_test)


In [9]:
print("Dimensions before PCA:", X_rev_train.shape)
print("Dimensions after PCA:", X_rev_train_pca.shape)

Dimensions before PCA: (7276, 66)
Dimensions after PCA: (7276, 24)


In [10]:
def build_model(trial):
    n_layers = trial.suggest_int('n_layers', 1, 3)
    model = keras.Sequential()
    model.add(layers.Input(shape=(X_rev_train_pca.shape[1],)))

    for i in range(n_layers):
        num_units = trial.suggest_int(f'n_units_l{i}', 32, 256, step=32)
        activation = trial.suggest_categorical('activation', ['relu', 'tanh'])
        dropout_rate = trial.suggest_float(f'dropout_l{i}', 0.0, 0.5, step=0.1)

        model.add(layers.Dense(num_units, activation=activation))
        model.add(layers.Dropout(dropout_rate))

    model.add(layers.Dense(1))  # Output layer

    # Optimizer and learning rate
    optimizer_name = trial.suggest_categorical('optimizer', ['adam', 'rmsprop'])
    learning_rate = trial.suggest_float('lr', 1e-4, 1e-2, log=True)

    if optimizer_name == 'adam':
        optimizer = keras.optimizers.Adam(learning_rate=learning_rate)
    else:
        optimizer = keras.optimizers.RMSprop(learning_rate=learning_rate)

    model.compile(
        optimizer=optimizer,
        loss='mse',
        metrics=['mae']
    )
    return model

def objective(trial):
    model = build_model(trial)

    early_stop = EarlyStopping(
        monitor='val_loss',
        patience=5,
        restore_best_weights=True
    )

    history = model.fit(
        X_rev_train_pca, y_rev_train,
        validation_split=0.2,
        epochs=50,
        batch_size=trial.suggest_categorical('batch_size', [16, 32, 64, 128]),
        callbacks=[early_stop],
        verbose=0
    )

    val_loss = min(history.history['val_loss'])
    return val_loss

# Run the Optuna study
study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=50)

print("Best trial:")
print(study.best_trial)


[I 2025-05-01 17:16:12,359] A new study created in memory with name: no-name-09945514-1e2d-4ef9-bc91-06aa722d1ec3
[I 2025-05-01 17:16:33,091] Trial 0 finished with value: 0.000783366325777024 and parameters: {'n_layers': 3, 'n_units_l0': 192, 'activation': 'relu', 'dropout_l0': 0.0, 'n_units_l1': 160, 'dropout_l1': 0.2, 'n_units_l2': 160, 'dropout_l2': 0.4, 'optimizer': 'rmsprop', 'lr': 0.0025704833829208857, 'batch_size': 32}. Best is trial 0 with value: 0.000783366325777024.
[I 2025-05-01 17:16:44,918] Trial 1 finished with value: 0.004505567252635956 and parameters: {'n_layers': 3, 'n_units_l0': 224, 'activation': 'relu', 'dropout_l0': 0.1, 'n_units_l1': 32, 'dropout_l1': 0.5, 'n_units_l2': 160, 'dropout_l2': 0.30000000000000004, 'optimizer': 'adam', 'lr': 0.00021559735254999827, 'batch_size': 32}. Best is trial 0 with value: 0.000783366325777024.
[I 2025-05-01 17:17:01,074] Trial 2 finished with value: 0.00269292457960546 and parameters: {'n_layers': 3, 'n_units_l0': 128, 'activati

Best trial:
FrozenTrial(number=22, state=1, values=[0.0004934084718115628], datetime_start=datetime.datetime(2025, 5, 1, 17, 20, 29, 398149), datetime_complete=datetime.datetime(2025, 5, 1, 17, 20, 37, 779983), params={'n_layers': 2, 'n_units_l0': 192, 'activation': 'tanh', 'dropout_l0': 0.1, 'n_units_l1': 224, 'dropout_l1': 0.30000000000000004, 'optimizer': 'rmsprop', 'lr': 0.0009280111407856223, 'batch_size': 128}, user_attrs={}, system_attrs={}, intermediate_values={}, distributions={'n_layers': IntDistribution(high=3, log=False, low=1, step=1), 'n_units_l0': IntDistribution(high=256, log=False, low=32, step=32), 'activation': CategoricalDistribution(choices=('relu', 'tanh')), 'dropout_l0': FloatDistribution(high=0.5, log=False, low=0.0, step=0.1), 'n_units_l1': IntDistribution(high=256, log=False, low=32, step=32), 'dropout_l1': FloatDistribution(high=0.5, log=False, low=0.0, step=0.1), 'optimizer': CategoricalDistribution(choices=('adam', 'rmsprop')), 'lr': FloatDistribution(high=

In [38]:
# Get the best hyperparameters
best_params = study.best_params

early_stop = EarlyStopping(
        monitor='val_loss',
        patience=10,
        restore_best_weights=True
)

# Build and train the model with the best hyperparameters
best_model = build_model(study.best_trial)
best_model.fit(X_rev_train_pca, y_rev_train, validation_split=0.2, epochs=50,
              batch_size=best_params['batch_size'], verbose=1, callbacks=[early_stop])

# Make predictions on the test set
y_rev_pred = best_model.predict(X_rev_test_pca)

# Evaluate the model
r2 = r2_score(y_rev_test, y_rev_pred)
mse = mean_squared_error(y_rev_test, y_rev_pred)
mae = mean_absolute_error(y_rev_test, y_rev_pred)

print(f"R-squared: {r2}")
print(f"Mean Squared Error: {mse}")
print(f"Mean Absolute Error: {mae}")


Epoch 1/50
[1m46/46[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 42ms/step - loss: 0.0701 - mae: 0.1885 - val_loss: 0.0062 - val_mae: 0.0599
Epoch 2/50
[1m46/46[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 4ms/step - loss: 0.0184 - mae: 0.1009 - val_loss: 0.0053 - val_mae: 0.0548
Epoch 3/50
[1m46/46[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 4ms/step - loss: 0.0108 - mae: 0.0788 - val_loss: 0.0034 - val_mae: 0.0463
Epoch 4/50
[1m46/46[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 4ms/step - loss: 0.0067 - mae: 0.0631 - val_loss: 0.0026 - val_mae: 0.0380
Epoch 5/50
[1m46/46[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 4ms/step - loss: 0.0043 - mae: 0.0494 - val_loss: 0.0024 - val_mae: 0.0376
Epoch 6/50
[1m46/46[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 4ms/step - loss: 0.0027 - mae: 0.0398 - val_loss: 0.0024 - val_mae: 0.0405
Epoch 7/50
[1m46/46[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 4ms/step - loss: 0.0022 

In [40]:
df_test_rev_predict = df_test_rev[["tic", "datacqtr"]].copy()
df_test_rev_predict["neural_network_rev_predict"] = y_rev_pred

In [42]:
df_test_rev_predict.to_csv("/content/drive/MyDrive/NUS/FT5005/training_data/neural_network_rev_predict_test.csv", index=False)

## CAR5 Prediction

In [12]:
df_train_car = pd.read_csv("/content/drive/MyDrive/NUS/FT5005/training_data/train_data_CAR5_with_text.csv")
df_test_car = pd.read_csv("/content/drive/MyDrive/NUS/FT5005/training_data/test_data_CAR5_with_text.csv")
df_train_car = df_train_car.sort_values(by=['tic', 'datacqtr']).reset_index(drop=True)
df_test_car = df_test_car.sort_values(by=['tic', 'datacqtr']).reset_index(drop=True)

In [13]:
# Create the lagged column
df_train_car['Total Current Operating Revenue_lag1'] = df_train_car.groupby('tic')['Total Current Operating Revenue'].shift(1)
# df_train_car['Total Current Operating Revenue_lag2'] = df_train_car.groupby('tic')['Total Current Operating Revenue'].shift(2)
# df_train_car['Total Current Operating Revenue_lag3'] = df_train_car.groupby('tic')['Total Current Operating Revenue'].shift(3)
# df_train_car['Total Current Operating Revenue_lag4'] = df_train_car.groupby('tic')['Total Current Operating Revenue'].shift(4)

df_train_car['Net Charge-Offs_lag1'] = df_train_car.groupby('tic')['Net Charge-Offs'].shift(1)
# df_train_car['Net Charge-Offs_lag2'] = df_train_car.groupby('tic')['Net Charge-Offs'].shift(2)
# df_train_car['Net Charge-Offs_lag3'] = df_train_car.groupby('tic')['Net Charge-Offs'].shift(3)
# df_train_car['Net Charge-Offs_lag4'] = df_train_car.groupby('tic')['Net Charge-Offs'].shift(4)

df_train_car['Invested Capital - Total_lag1'] = df_train_car.groupby('tic')['Invested Capital - Total'].shift(1)
# df_train_car['Invested Capital - Total_lag2'] = df_train_car.groupby('tic')['Invested Capital - Total'].shift(2)
# df_train_car['Invested Capital - Total_lag3'] = df_train_car.groupby('tic')['Invested Capital - Total'].shift(3)
# df_train_car['Invested Capital - Total_lag4'] = df_train_car.groupby('tic')['Invested Capital - Total'].shift(4)

df_train_car['car5_lag1'] = df_train_car.groupby('tic')['car5'].shift(1)
# df_train_car['car5_lag2'] = df_train_car.groupby('tic')['car5'].shift(2)
# df_train_car['car5_lag3'] = df_train_car.groupby('tic')['car5'].shift(3)
# df_train_car['car5_lag4'] = df_train_car.groupby('tic')['car5'].shift(4)



df_test_car['Total Current Operating Revenue_lag1'] = df_test_car.groupby('tic')['Total Current Operating Revenue'].shift(1)
# df_test_car['Total Current Operating Revenue_lag2'] = df_test_car.groupby('tic')['Total Current Operating Revenue'].shift(2)
# df_test_car['Total Current Operating Revenue_lag3'] = df_test_car.groupby('tic')['Total Current Operating Revenue'].shift(3)
# df_test_car['Total Current Operating Revenue_lag4'] = df_test_car.groupby('tic')['Total Current Operating Revenue'].shift(4)

df_test_car['Net Charge-Offs_lag1'] = df_test_car.groupby('tic')['Net Charge-Offs'].shift(1)
# df_test_car['Net Charge-Offs_lag2'] = df_test_car.groupby('tic')['Net Charge-Offs'].shift(2)
# df_test_car['Net Charge-Offs_lag3'] = df_test_car.groupby('tic')['Net Charge-Offs'].shift(3)
# df_test_car['Net Charge-Offs_lag4'] = df_test_car.groupby('tic')['Net Charge-Offs'].shift(4)

df_test_car['Invested Capital - Total_lag1'] = df_test_car.groupby('tic')['Invested Capital - Total'].shift(1)
# df_test_car['Invested Capital - Total_lag2'] = df_test_car.groupby('tic')['Invested Capital - Total'].shift(2)
# df_test_car['Invested Capital - Total_lag3'] = df_test_car.groupby('tic')['Invested Capital - Total'].shift(3)
# df_test_car['Invested Capital - Total_lag4'] = df_test_car.groupby('tic')['Invested Capital - Total'].shift(4)

df_test_car['car5_lag1'] = df_test_car.groupby('tic')['car5'].shift(1)
# df_test_car['car5_lag2'] = df_test_car.groupby('tic')['car5'].shift(2)
# df_test_car['car5_lag3'] = df_test_car.groupby('tic')['car5'].shift(3)
# df_test_car['car5_lag4'] = df_test_car.groupby('tic')['car5'].shift(4)

In [14]:
df_train_car

Unnamed: 0,datacqtr,tic,car5,GDP CHANGE (-1 to 1),UNEMPLOYMENT RATE (0 to 1),PRIME LOAN RATE (0 to 1),DEPOSITS CHANGE (-1 to 1),CONSUMER PRICE INDEX (0 to 1),SAVINGS PER GROSS INCOME (-1 to 1),Net Interest Income,...,reviews_rating,text_blob_reviews_sentiment,vader_reviews_sentiment_neg,vader_reviews_sentiment_pos,bert_reviews_label,bert_reviews_score,Total Current Operating Revenue_lag1,Net Charge-Offs_lag1,Invested Capital - Total_lag1,car5_lag1
0,2002Q3,ABVA,-0.000718,0.536682,0.226950,0.240000,0.155932,0.858035,0.630435,0.000000,...,0.500000,0.000000,0.500000,0.500000,0.500000,0.500000,,,,
1,2002Q4,ABVA,-0.035508,0.525962,0.241135,0.193016,0.207168,0.754757,0.652174,0.001665,...,0.500000,0.000000,0.500000,0.500000,0.500000,0.500000,0.018448,0.907823,0.000000,-0.000718
2,2003Q1,ABVA,0.137994,0.544421,0.241135,0.160000,0.109794,0.700878,0.554348,0.003983,...,0.500000,0.000000,0.500000,0.500000,0.500000,0.500000,0.017356,0.908680,0.000000,-0.035508
3,2003Q2,ABVA,-0.051305,0.557456,0.269504,0.158750,0.161457,0.579028,0.565217,0.007631,...,0.500000,0.000000,0.500000,0.500000,0.500000,0.500000,0.032957,0.908803,0.000000,0.137994
4,2003Q3,ABVA,0.049040,0.616419,0.269504,0.120000,0.201673,0.521375,0.554348,0.017963,...,0.500000,0.000000,0.500000,0.500000,0.500000,0.500000,0.053341,0.909171,0.003413,-0.051305
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8014,2019Q4,ZION,0.068700,0.543011,0.000000,0.253548,0.179586,0.721845,0.608696,0.633072,...,0.500000,0.000000,0.500000,0.500000,0.500000,0.500000,0.604401,0.823592,0.591243,-0.044618
8015,2020Q1,ZION,-0.056801,0.429512,0.024823,0.183226,0.188084,0.711620,0.652174,0.630463,...,0.415414,0.119543,0.079556,0.177932,0.419173,0.688792,0.601003,0.523349,0.594882,0.068700
8016,2020Q2,ZION,0.048808,0.000000,1.000000,0.000000,1.000000,0.474737,0.217391,0.633502,...,0.500000,0.000000,0.500000,0.500000,0.500000,0.500000,0.595861,0.653172,0.596922,-0.056801
8017,2020Q3,ZION,-0.007967,1.000000,0.553191,0.000000,0.271433,0.514690,0.315217,0.632209,...,0.500000,0.000000,0.500000,0.500000,0.500000,0.500000,0.588731,0.482752,0.593118,0.048808


In [15]:
df_train_car = df_train_car.dropna()
df_test_car = df_test_car.dropna()

In [16]:
X_car_train = df_train_car.drop(columns=["datacqtr", "tic", "car5"]).copy().to_numpy()
y_car_train = df_train_car["car5"].copy().to_numpy()

X_car_test = df_test_car.drop(columns=["datacqtr", "tic", "car5"]).copy().to_numpy()
y_car_test = df_test_car["car5"].copy().to_numpy()

In [17]:
# Fit PCA on training data
pca = PCA(n_components=0.95)  # Keep 95% of variance (or set an integer number of components)
X_car_train_pca = pca.fit_transform(X_car_train)
X_car_test_pca = pca.transform(X_car_test)

In [22]:
print("Dimensions before PCA:", X_car_train.shape)
print("Dimensions after PCA:", X_car_train_pca.shape)

Dimensions before PCA: (7874, 41)
Dimensions after PCA: (7874, 18)


In [20]:
def build_model2(trial):
    n_layers = trial.suggest_int('n_layers', 1, 8)
    model = keras.Sequential()
    model.add(layers.Input(shape=(X_car_train_pca.shape[1],)))

    for i in range(n_layers):
        num_units = trial.suggest_int(f'n_units_l{i}', 32, 256, step=32)
        activation = trial.suggest_categorical('activation', ['relu', 'tanh', 'sigmoid', 'gelu'])

        model.add(layers.Dense(num_units, activation=activation))

    dropout_rate = trial.suggest_float(f'dropout_l{i}', 0.0, 0.5, step=0.1)
    model.add(layers.Dropout(dropout_rate))
    model.add(layers.Dense(1))  # Output layer

    # Optimizer and learning rate
    optimizer_name = trial.suggest_categorical('optimizer', ['adam', 'rmsprop'])
    learning_rate = trial.suggest_float('lr', 1e-4, 1e-2, log=True)

    if optimizer_name == 'adam':
        optimizer = keras.optimizers.Adam(learning_rate=learning_rate)
    else:
        optimizer = keras.optimizers.RMSprop(learning_rate=learning_rate)

    model.compile(
        optimizer=optimizer,
        loss='mse',
        metrics=['mae']
    )
    return model

def objective2(trial):
    model = build_model2(trial)

    early_stop = EarlyStopping(
        monitor='val_loss',
        patience=5,
        restore_best_weights=True
    )

    history = model.fit(
        X_car_train_pca, y_car_train,
        validation_split=0.2,
        epochs=50,
        batch_size=trial.suggest_categorical('batch_size', [16, 32, 64, 128]),
        callbacks=[early_stop],
        verbose=0
    )

    val_loss = min(history.history['val_loss'])
    return val_loss

# Run the Optuna study
study2 = optuna.create_study(direction='minimize')
study2.optimize(objective2, n_trials=100)

print("Best trial:")
print(study2.best_trial)


[I 2025-05-01 17:49:17,731] A new study created in memory with name: no-name-dbd3b730-d65a-4cc9-81bf-20f251babb38
[I 2025-05-01 17:49:30,985] Trial 0 finished with value: 0.003152876626700163 and parameters: {'n_layers': 1, 'n_units_l0': 128, 'activation': 'tanh', 'dropout_l0': 0.30000000000000004, 'optimizer': 'adam', 'lr': 0.0001402475170550775, 'batch_size': 16}. Best is trial 0 with value: 0.003152876626700163.
[I 2025-05-01 17:49:58,055] Trial 1 finished with value: 0.003229738213121891 and parameters: {'n_layers': 5, 'n_units_l0': 192, 'activation': 'sigmoid', 'n_units_l1': 32, 'n_units_l2': 128, 'n_units_l3': 224, 'n_units_l4': 96, 'dropout_l4': 0.30000000000000004, 'optimizer': 'rmsprop', 'lr': 0.005348177038006766, 'batch_size': 128}. Best is trial 0 with value: 0.003152876626700163.
[I 2025-05-01 17:50:20,976] Trial 2 finished with value: 0.0031295681837946177 and parameters: {'n_layers': 5, 'n_units_l0': 96, 'activation': 'tanh', 'n_units_l1': 256, 'n_units_l2': 32, 'n_units

Best trial:
FrozenTrial(number=65, state=1, values=[0.002965005347505212], datetime_start=datetime.datetime(2025, 5, 1, 18, 16, 8, 145961), datetime_complete=datetime.datetime(2025, 5, 1, 18, 16, 49, 433348), params={'n_layers': 8, 'n_units_l0': 128, 'activation': 'gelu', 'n_units_l1': 128, 'n_units_l2': 224, 'n_units_l3': 160, 'n_units_l4': 192, 'n_units_l5': 128, 'n_units_l6': 32, 'n_units_l7': 224, 'dropout_l7': 0.1, 'optimizer': 'adam', 'lr': 0.0003945520618327073, 'batch_size': 16}, user_attrs={}, system_attrs={}, intermediate_values={}, distributions={'n_layers': IntDistribution(high=8, log=False, low=1, step=1), 'n_units_l0': IntDistribution(high=256, log=False, low=32, step=32), 'activation': CategoricalDistribution(choices=('relu', 'tanh', 'sigmoid', 'gelu')), 'n_units_l1': IntDistribution(high=256, log=False, low=32, step=32), 'n_units_l2': IntDistribution(high=256, log=False, low=32, step=32), 'n_units_l3': IntDistribution(high=256, log=False, low=32, step=32), 'n_units_l4':

In [43]:
# Get the best hyperparameters
best_params2 = study2.best_params

early_stop = EarlyStopping(
        monitor='val_loss',
        patience=10,
        restore_best_weights=True
)

# Build and train the model with the best hyperparameters
best_model2 = build_model2(study2.best_trial)
best_model2.fit(X_car_train_pca, y_car_train, epochs=50, validation_split=0.2,
              batch_size=best_params['batch_size'], verbose=1, callbacks=[early_stop])

# Make predictions on the test set
y_car_pred = best_model2.predict(X_car_test_pca)

# Evaluate the model
r2 = r2_score(y_car_test, y_car_pred)
mse = mean_squared_error(y_car_test, y_car_pred)
mae = mean_absolute_error(y_car_test, y_car_pred)

print(f"R-squared: {r2}")
print(f"Mean Squared Error: {mse}")
print(f"Mean Absolute Error: {mae}")


Epoch 1/50
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m7s[0m 45ms/step - loss: 0.0033 - mae: 0.0410 - val_loss: 0.0033 - val_mae: 0.0412
Epoch 2/50
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 4ms/step - loss: 0.0033 - mae: 0.0416 - val_loss: 0.0032 - val_mae: 0.0407
Epoch 3/50
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 4ms/step - loss: 0.0031 - mae: 0.0404 - val_loss: 0.0032 - val_mae: 0.0404
Epoch 4/50
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 5ms/step - loss: 0.0032 - mae: 0.0408 - val_loss: 0.0032 - val_mae: 0.0404
Epoch 5/50
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 5ms/step - loss: 0.0031 - mae: 0.0406 - val_loss: 0.0031 - val_mae: 0.0402
Epoch 6/50
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 4ms/step - loss: 0.0033 - mae: 0.0414 - val_loss: 0.0032 - val_mae: 0.0404
Epoch 7/50
[1m50/50[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 4ms/step - loss: 0.0032 

In [44]:
df_test_car_predict = df_test_car[["tic", "datacqtr"]].copy()
df_test_car_predict["neural_network_car_predict"] = y_car_pred

In [45]:
df_test_car_predict.to_csv("/content/drive/MyDrive/NUS/FT5005/training_data/neural_network_car_predict_test.csv", index=False)

## K Fold for Stacking

In [24]:
# Revenue

n_splits = 10
kf = KFold(n_splits=n_splits, shuffle=True, random_state=42)

# Arrays to collect out-of-fold predictions
oof_preds = np.zeros((X_rev_train_pca.shape[0],))

for train_idx, val_idx in kf.split(X_rev_train_pca):
    X_train_fold, X_val_fold = X_rev_train_pca[train_idx], X_rev_train_pca[val_idx]
    y_train_fold, y_val_fold = y_rev_train[train_idx], y_rev_train[val_idx]

    early_stop = EarlyStopping(
        monitor='val_loss',
        patience=10,
        restore_best_weights=True
    )

    # Build and train model
    model = build_model(study.best_trial)
    model.fit(X_train_fold, y_train_fold,
              epochs=50,
              validation_split=0.2,
              batch_size=best_params['batch_size'],
              verbose=0,
              callbacks=[early_stop]
    )

    # OOF predictions
    oof_preds[val_idx] = model.predict(X_val_fold).flatten()


[1m23/23[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 83ms/step
[1m23/23[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 14ms/step
[1m23/23[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 15ms/step
[1m23/23[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 14ms/step
[1m23/23[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 14ms/step
[1m23/23[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 15ms/step
[1m23/23[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 73ms/step
[1m23/23[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 15ms/step
[1m23/23[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 14ms/step
[1m23/23[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 14ms/step


In [26]:
len(oof_preds)

7276

In [27]:
len(df_train_rev)

7276

In [28]:
df_oof_rev_prediction = df_train_rev[["tic", "datacqtr"]].copy()
df_oof_rev_prediction["neural_network_rev_predict"] = oof_preds

In [30]:
df_oof_rev_prediction.to_csv("/content/drive/MyDrive/NUS/FT5005/training_data/neural_network_rev_predict.csv", index=False)

In [33]:
# Car

n_splits = 10
kf = KFold(n_splits=n_splits, shuffle=True, random_state=42)

# Arrays to collect out-of-fold predictions
oof_preds = np.zeros((X_car_train_pca.shape[0],))

for train_idx, val_idx in kf.split(X_car_train_pca):
    X_train_fold, X_val_fold = X_car_train_pca[train_idx], X_car_train_pca[val_idx]
    y_train_fold, y_val_fold = y_car_train[train_idx], y_car_train[val_idx]

    early_stop = EarlyStopping(
        monitor='val_loss',
        patience=10,
        restore_best_weights=True
    )

    # Build and train model
    model = build_model2(study2.best_trial)
    model.fit(X_train_fold, y_train_fold,
              epochs=50,
              validation_split=0.2,
              batch_size=best_params2['batch_size'],
              verbose=0,
              callbacks=[early_stop]
    )

    # OOF predictions
    oof_preds[val_idx] = model.predict(X_val_fold).flatten()


[1m25/25[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 16ms/step
[1m25/25[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 13ms/step
[1m25/25[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 13ms/step
[1m25/25[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 13ms/step
[1m25/25[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 18ms/step
[1m25/25[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 14ms/step
[1m25/25[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 13ms/step
[1m25/25[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 13ms/step
[1m25/25[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 13ms/step
[1m25/25[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 13ms/step


In [34]:
df_oof_car_prediction = df_train_car[["tic", "datacqtr"]].copy()
df_oof_car_prediction["neural_network_car_predict"] = oof_preds

In [35]:
df_oof_car_prediction.to_csv("/content/drive/MyDrive/NUS/FT5005/training_data/neural_network_car_predict.csv", index=False)

In [37]:
df_oof_car_prediction.head()

Unnamed: 0,tic,datacqtr,neural_network_car_predict
1,ABVA,2002Q4,0.015901
2,ABVA,2003Q1,0.009487
3,ABVA,2003Q2,0.01286
4,ABVA,2003Q3,0.000866
5,ABVA,2003Q4,0.001111
