## Applying SMOTE for Oversampling Part 07

**Address Class Imbalance:**

+ The dataset is imbalanced, so apply techniques designed to handle class imbalance. Common approaches:

+ `Oversampling the Minority Class`: Add minority-class examples until the classes are balanced. SMOTE (Synthetic Minority Over-sampling Technique) creates synthetic examples by interpolating between existing minority samples, while simple random oversampling duplicates existing ones.

+ `Undersampling the Majority Class`: Reduce the number of majority-class samples to balance the dataset. This discards data, so it can lose important information.

+ `Use Class Weights`: Adjust the class weights so that the model pays more attention to the minority class. Many algorithms, including neural networks such as TabNet, support some form of class weighting.
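
To make the class-weight option concrete, here is a minimal sketch of the common "balanced" heuristic, where each class gets the weight `n_samples / (n_classes * class_count)`. The class counts (5174 non-churn, 1869 churn) are an assumption inferred from the dataset shapes printed later in this notebook (7043 rows before SMOTE, 10348 after balancing), and `balanced_class_weights` is a hypothetical helper name, not part of any library.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight} using n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Assumed class counts (inferred from the shapes reported in this notebook)
labels = [0] * 5174 + [1] * 1869
weights = balanced_class_weights(labels)

# The minority class (1) receives the larger weight
for cls in sorted(weights):
    print(f"class {cls}: weight = {weights[cls]:.3f}")
```

The rarer a class is, the larger its weight, so errors on churners cost more during training without changing the data itself.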

In [1]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from imblearn.over_sampling import SMOTE
from pytorch_tabnet.tab_model import TabNetClassifier
from sklearn.metrics import classification_report

In [13]:
# Load your dataset
data = pd.read_csv('no_missing_values_customer_data.csv')

# Convert the target variable 'Churn' to numeric
data['Churn'] = data['Churn'].map({'Yes': 1, 'No': 0})

# Encode categorical variables using Label Encoding
for col in data.select_dtypes(include=['object']).columns:
    if col != 'customerID':
        data[col] = LabelEncoder().fit_transform(data[col])

# Separate features and target
features = data.drop(['Churn', 'customerID'], axis=1).values
target = data['Churn'].values

# Apply SMOTE for oversampling the minority class
smote = SMOTE(random_state=42)
features_resampled, target_resampled = smote.fit_resample(features, target)


# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(features_resampled, target_resampled, test_size=0.2, random_state=42)
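
The cell above resamples the whole dataset and only then splits it, so synthetic points interpolated from rows that end up in the test set can leak into training. A minimal sketch of the safer ordering follows: hold out the test rows first, then oversample only the training minority. It uses a toy, pure-Python interpolation in place of the real SMOTE implementation, and the data, `smote_like_oversample`, and all variable names are hypothetical. With the libraries used in this notebook, the same ordering would be `train_test_split` on the raw data followed by `SMOTE(random_state=42).fit_resample(X_train, y_train)`.

```python
import random


def smote_like_oversample(minority_rows, n_new, k=2, seed=42):
    """Toy SMOTE-style sketch: interpolate between a minority row and one of
    its k nearest minority neighbours to create n_new synthetic rows."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority_rows)
        # k nearest neighbours by squared Euclidean distance, excluding the base row
        neighbours = sorted(
            (row for row in minority_rows if row is not base),
            key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical imbalanced toy data: 40 majority rows, 10 minority rows
data_rng = random.Random(0)
majority = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(40)]
minority = [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(10)]

# 1) Hold out 20% of each class as the test set BEFORE any oversampling
test_majority, train_majority = majority[:8], majority[8:]
test_minority, train_minority = minority[:2], minority[2:]

# 2) Oversample only the training minority until the training classes match
n_new = len(train_majority) - len(train_minority)
synthetic = smote_like_oversample(train_minority, n_new)

X_train_demo = train_majority + train_minority + synthetic
y_train_demo = [0] * len(train_majority) + [1] * (len(train_minority) + len(synthetic))
X_test_demo = test_majority + test_minority  # real rows only, original imbalance kept
y_test_demo = [0] * len(test_majority) + [1] * len(test_minority)
```

The test set keeps the original class ratio here, so metrics measured on it reflect what the model would see in practice.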

In [14]:
# Initialize TabNetClassifier
tabnet_clf = TabNetClassifier()

# Train TabNet model
tabnet_clf.fit(
    X_train, y_train,
    eval_set=[(X_test, y_test)],
    eval_metric=['accuracy'],
    max_epochs=100,
    patience=10,
    batch_size=256,
    virtual_batch_size=128
)

# Make predictions
y_pred = tabnet_clf.predict(X_test)

# Evaluate the model
print("Classification Report for TabNet with SMOTE:\n", classification_report(y_test, y_pred))



epoch 0  | loss: 0.59131 | val_0_accuracy: 0.64831 |  0:00:03s
epoch 1  | loss: 0.51501 | val_0_accuracy: 0.73961 |  0:00:05s
epoch 2  | loss: 0.50643 | val_0_accuracy: 0.74831 |  0:00:08s
epoch 3  | loss: 0.4963  | val_0_accuracy: 0.74541 |  0:00:10s
epoch 4  | loss: 0.48998 | val_0_accuracy: 0.76232 |  0:00:13s
epoch 5  | loss: 0.48046 | val_0_accuracy: 0.76957 |  0:00:16s
epoch 6  | loss: 0.47429 | val_0_accuracy: 0.77391 |  0:00:19s
epoch 7  | loss: 0.47016 | val_0_accuracy: 0.76425 |  0:00:22s
epoch 8  | loss: 0.46079 | val_0_accuracy: 0.78019 |  0:00:24s
epoch 9  | loss: 0.47276 | val_0_accuracy: 0.78261 |  0:00:27s
epoch 10 | loss: 0.47308 | val_0_accuracy: 0.77101 |  0:00:30s
epoch 11 | loss: 0.46351 | val_0_accuracy: 0.78261 |  0:00:32s
epoch 12 | loss: 0.45261 | val_0_accuracy: 0.77488 |  0:00:35s
epoch 13 | loss: 0.4538  | val_0_accuracy: 0.79034 |  0:00:38s
epoch 14 | loss: 0.44513 | val_0_accuracy: 0.79517 |  0:00:41s
epoch 15 | loss: 0.44658 | val_0_accuracy: 0.78116 |  0



Classification Report for TabNet with SMOTE:
               precision    recall  f1-score   support

           0       0.82      0.79      0.80      1021
           1       0.80      0.83      0.81      1049

    accuracy                           0.81      2070
   macro avg       0.81      0.81      0.81      2070
weighted avg       0.81      0.81      0.81      2070



## Improving the SMOTE Oversampled Model

**Hyperparameter Tuning Using Optuna**

In [30]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split, StratifiedKFold, cross_val_predict
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
from imblearn.over_sampling import SMOTE
from pytorch_tabnet.tab_model import TabNetClassifier
import optuna

In [23]:
# Load your dataset
data = pd.read_csv('no_missing_values_customer_data.csv')

# Convert the target variable 'Churn' to numeric
data['Churn'] = data['Churn'].map({'Yes': 1, 'No': 0})

# Encode categorical variables using Label Encoding
for col in data.select_dtypes(include=['object']).columns:
    if col != 'customerID':
        data[col] = LabelEncoder().fit_transform(data[col])

# Separate features and target
features = data.drop(['Churn', 'customerID'], axis=1).values
target = data['Churn'].values

# Apply SMOTE for oversampling the minority class
smote = SMOTE(random_state=42)
features_resampled, target_resampled = smote.fit_resample(features, target)

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(features_resampled, target_resampled, test_size=0.2, random_state=42)

In [28]:
# Define the hyperparameter optimization function
def objective(trial):
    # Define the hyperparameters
    params = {
        'n_d': trial.suggest_int('n_d', 8, 64),
        'n_a': trial.suggest_int('n_a', 8, 64),
        'n_steps': trial.suggest_int('n_steps', 3, 10),
        'gamma': trial.suggest_float('gamma', 1.0, 2.0),
        'lambda_sparse': trial.suggest_float('lambda_sparse', 0.0001, 0.01),
        'optimizer_params': {'lr': trial.suggest_float('lr', 1e-4, 1e-2)},  # Set learning rate inside optimizer_params
        'mask_type': trial.suggest_categorical('mask_type', ['entmax', 'sparsemax']),
        'n_shared': trial.suggest_int('n_shared', 1, 3),
        'n_independent': trial.suggest_int('n_independent', 1, 3)
    }
    
    # Initialize TabNetClassifier with the parameters
    tabnet_clf = TabNetClassifier(
        n_d=params['n_d'],
        n_a=params['n_a'],
        n_steps=params['n_steps'],
        gamma=params['gamma'],
        lambda_sparse=params['lambda_sparse'],
        optimizer_params=params['optimizer_params'],
        mask_type=params['mask_type'],
        n_shared=params['n_shared'],
        n_independent=params['n_independent']
    )

    # Train the model
    tabnet_clf.fit(
        X_train, y_train,
        eval_set=[(X_test, y_test)],
        eval_metric=['accuracy'],
        max_epochs=100,
        patience=10,
        batch_size=256,
        virtual_batch_size=128
    )
    
    # Predict and calculate accuracy
    y_pred = tabnet_clf.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    
    return accuracy

# Run the hyperparameter optimization
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=10)

# Get the best parameters
best_params = study.best_params
print("Best Hyperparameters:", best_params)

[I 2024-09-07 18:45:01,389] A new study created in memory with name: no-name-7a22dd48-7f19-4ad0-a730-f346c03cd823


epoch 0  | loss: 0.8011  | val_0_accuracy: 0.64444 |  0:00:02s
epoch 1  | loss: 0.6376  | val_0_accuracy: 0.67923 |  0:00:03s
epoch 2  | loss: 0.59369 | val_0_accuracy: 0.69324 |  0:00:05s
epoch 3  | loss: 0.58041 | val_0_accuracy: 0.72754 |  0:00:08s
epoch 4  | loss: 0.56052 | val_0_accuracy: 0.72512 |  0:00:11s
epoch 5  | loss: 0.54879 | val_0_accuracy: 0.72995 |  0:00:14s
epoch 6  | loss: 0.54015 | val_0_accuracy: 0.75024 |  0:00:17s
epoch 7  | loss: 0.52615 | val_0_accuracy: 0.757   |  0:00:19s
epoch 8  | loss: 0.5218  | val_0_accuracy: 0.75797 |  0:00:22s
epoch 9  | loss: 0.52369 | val_0_accuracy: 0.77391 |  0:00:25s
epoch 10 | loss: 0.51652 | val_0_accuracy: 0.77343 |  0:00:27s
epoch 11 | loss: 0.51081 | val_0_accuracy: 0.77536 |  0:00:30s
epoch 12 | loss: 0.49798 | val_0_accuracy: 0.76908 |  0:00:32s
epoch 13 | loss: 0.50179 | val_0_accuracy: 0.77585 |  0:00:34s
epoch 14 | loss: 0.5069  | val_0_accuracy: 0.76522 |  0:00:36s
epoch 15 | loss: 0.52686 | val_0_accuracy: 0.76908 |  0

[I 2024-09-07 18:46:18,414] Trial 0 finished with value: 0.7903381642512077 and parameters: {'n_d': 38, 'n_a': 12, 'n_steps': 5, 'gamma': 1.851012759830538, 'lambda_sparse': 0.0015634995647997355, 'lr': 0.0021002571125329043, 'mask_type': 'sparsemax', 'n_shared': 1, 'n_independent': 2}. Best is trial 0 with value: 0.7903381642512077.


epoch 0  | loss: 0.95303 | val_0_accuracy: 0.63285 |  0:00:05s
epoch 1  | loss: 0.70689 | val_0_accuracy: 0.67536 |  0:00:11s
epoch 2  | loss: 0.63966 | val_0_accuracy: 0.70097 |  0:00:17s
epoch 3  | loss: 0.63358 | val_0_accuracy: 0.73237 |  0:00:22s
epoch 4  | loss: 0.59308 | val_0_accuracy: 0.7401  |  0:00:30s
epoch 5  | loss: 0.57875 | val_0_accuracy: 0.74203 |  0:00:38s
epoch 6  | loss: 0.61578 | val_0_accuracy: 0.76329 |  0:00:45s
epoch 7  | loss: 0.56759 | val_0_accuracy: 0.74348 |  0:00:50s
epoch 8  | loss: 0.55283 | val_0_accuracy: 0.74686 |  0:00:54s
epoch 9  | loss: 0.54542 | val_0_accuracy: 0.76039 |  0:00:59s
epoch 10 | loss: 0.53859 | val_0_accuracy: 0.73575 |  0:01:03s
epoch 11 | loss: 0.55914 | val_0_accuracy: 0.75411 |  0:01:07s
epoch 12 | loss: 0.53917 | val_0_accuracy: 0.74879 |  0:01:11s
epoch 13 | loss: 0.53441 | val_0_accuracy: 0.76087 |  0:01:15s
epoch 14 | loss: 0.53002 | val_0_accuracy: 0.76618 |  0:01:20s
epoch 15 | loss: 0.56033 | val_0_accuracy: 0.75797 |  0

[I 2024-09-07 18:50:26,374] Trial 1 finished with value: 0.8 and parameters: {'n_d': 62, 'n_a': 33, 'n_steps': 7, 'gamma': 1.639451683595382, 'lambda_sparse': 0.0018270698424881831, 'lr': 0.0017941163567835791, 'mask_type': 'sparsemax', 'n_shared': 2, 'n_independent': 3}. Best is trial 1 with value: 0.8.


epoch 0  | loss: 1.23755 | val_0_accuracy: 0.54251 |  0:00:01s
epoch 1  | loss: 0.71656 | val_0_accuracy: 0.61787 |  0:00:04s
epoch 2  | loss: 0.64609 | val_0_accuracy: 0.65845 |  0:00:06s
epoch 3  | loss: 0.60329 | val_0_accuracy: 0.67729 |  0:00:08s
epoch 4  | loss: 0.58689 | val_0_accuracy: 0.70097 |  0:00:10s
epoch 5  | loss: 0.56999 | val_0_accuracy: 0.72367 |  0:00:11s
epoch 6  | loss: 0.55869 | val_0_accuracy: 0.73671 |  0:00:13s
epoch 7  | loss: 0.55439 | val_0_accuracy: 0.73527 |  0:00:15s
epoch 8  | loss: 0.53998 | val_0_accuracy: 0.75797 |  0:00:17s
epoch 9  | loss: 0.53828 | val_0_accuracy: 0.75604 |  0:00:19s
epoch 10 | loss: 0.52692 | val_0_accuracy: 0.75845 |  0:00:21s
epoch 11 | loss: 0.52444 | val_0_accuracy: 0.75507 |  0:00:23s
epoch 12 | loss: 0.51561 | val_0_accuracy: 0.75845 |  0:00:25s
epoch 13 | loss: 0.52147 | val_0_accuracy: 0.76329 |  0:00:27s
epoch 14 | loss: 0.51283 | val_0_accuracy: 0.76232 |  0:00:29s
epoch 15 | loss: 0.50977 | val_0_accuracy: 0.76957 |  0

[I 2024-09-07 18:53:10,704] Trial 2 finished with value: 0.8111111111111111 and parameters: {'n_d': 25, 'n_a': 9, 'n_steps': 4, 'gamma': 1.9859411393133408, 'lambda_sparse': 0.009945299843254766, 'lr': 0.001936965926206663, 'mask_type': 'sparsemax', 'n_shared': 2, 'n_independent': 2}. Best is trial 2 with value: 0.8111111111111111.


epoch 0  | loss: 0.95383 | val_0_accuracy: 0.54976 |  0:00:06s
epoch 1  | loss: 0.65692 | val_0_accuracy: 0.69082 |  0:00:11s
epoch 2  | loss: 0.60555 | val_0_accuracy: 0.71063 |  0:00:17s
epoch 3  | loss: 0.59434 | val_0_accuracy: 0.7372  |  0:00:24s
epoch 4  | loss: 0.58334 | val_0_accuracy: 0.70725 |  0:00:31s
epoch 5  | loss: 0.56881 | val_0_accuracy: 0.7628  |  0:00:37s
epoch 6  | loss: 0.54559 | val_0_accuracy: 0.76184 |  0:00:43s
epoch 7  | loss: 0.54109 | val_0_accuracy: 0.77053 |  0:00:49s
epoch 8  | loss: 0.51494 | val_0_accuracy: 0.76763 |  0:00:55s
epoch 9  | loss: 0.53352 | val_0_accuracy: 0.77923 |  0:01:01s
epoch 10 | loss: 0.51451 | val_0_accuracy: 0.75797 |  0:01:06s
epoch 11 | loss: 0.53359 | val_0_accuracy: 0.77633 |  0:01:12s
epoch 12 | loss: 0.54065 | val_0_accuracy: 0.77778 |  0:01:18s
epoch 13 | loss: 0.52501 | val_0_accuracy: 0.77391 |  0:01:24s
epoch 14 | loss: 0.53409 | val_0_accuracy: 0.77488 |  0:01:30s
epoch 15 | loss: 0.51166 | val_0_accuracy: 0.74686 |  0

[I 2024-09-07 18:58:51,341] Trial 3 finished with value: 0.81256038647343 and parameters: {'n_d': 49, 'n_a': 48, 'n_steps': 8, 'gamma': 1.5774190160053538, 'lambda_sparse': 0.009367256371796542, 'lr': 0.002727103127337467, 'mask_type': 'entmax', 'n_shared': 3, 'n_independent': 2}. Best is trial 3 with value: 0.81256038647343.


epoch 0  | loss: 0.86813 | val_0_accuracy: 0.6715  |  0:00:02s
epoch 1  | loss: 0.59741 | val_0_accuracy: 0.71449 |  0:00:05s
epoch 2  | loss: 0.55518 | val_0_accuracy: 0.72512 |  0:00:08s
epoch 3  | loss: 0.54091 | val_0_accuracy: 0.74058 |  0:00:11s
epoch 4  | loss: 0.53224 | val_0_accuracy: 0.73285 |  0:00:14s
epoch 5  | loss: 0.52112 | val_0_accuracy: 0.74251 |  0:00:16s
epoch 6  | loss: 0.52463 | val_0_accuracy: 0.77101 |  0:00:18s
epoch 7  | loss: 0.50773 | val_0_accuracy: 0.77536 |  0:00:21s
epoch 8  | loss: 0.50778 | val_0_accuracy: 0.78164 |  0:00:24s
epoch 9  | loss: 0.50237 | val_0_accuracy: 0.77778 |  0:00:26s
epoch 10 | loss: 0.49519 | val_0_accuracy: 0.79517 |  0:00:28s
epoch 11 | loss: 0.49455 | val_0_accuracy: 0.79324 |  0:00:31s
epoch 12 | loss: 0.48476 | val_0_accuracy: 0.80531 |  0:00:34s
epoch 13 | loss: 0.49259 | val_0_accuracy: 0.79517 |  0:00:36s
epoch 14 | loss: 0.47721 | val_0_accuracy: 0.78551 |  0:00:39s
epoch 15 | loss: 0.47088 | val_0_accuracy: 0.78164 |  0

[I 2024-09-07 18:59:53,527] Trial 4 finished with value: 0.8053140096618358 and parameters: {'n_d': 40, 'n_a': 27, 'n_steps': 6, 'gamma': 1.6830263040540707, 'lambda_sparse': 0.006238563421097442, 'lr': 0.002759147054723279, 'mask_type': 'entmax', 'n_shared': 1, 'n_independent': 1}. Best is trial 3 with value: 0.81256038647343.


epoch 0  | loss: 0.95036 | val_0_accuracy: 0.68696 |  0:00:03s
epoch 1  | loss: 0.56556 | val_0_accuracy: 0.74155 |  0:00:08s
epoch 2  | loss: 0.52603 | val_0_accuracy: 0.76812 |  0:00:13s
epoch 3  | loss: 0.49962 | val_0_accuracy: 0.76667 |  0:00:19s
epoch 4  | loss: 0.48876 | val_0_accuracy: 0.77778 |  0:00:23s
epoch 5  | loss: 0.47136 | val_0_accuracy: 0.77874 |  0:00:43s
epoch 6  | loss: 0.46894 | val_0_accuracy: 0.78309 |  0:00:48s
epoch 7  | loss: 0.45701 | val_0_accuracy: 0.78454 |  0:00:54s
epoch 8  | loss: 0.43739 | val_0_accuracy: 0.78986 |  0:00:57s
epoch 9  | loss: 0.43746 | val_0_accuracy: 0.79082 |  0:01:00s
epoch 10 | loss: 0.42732 | val_0_accuracy: 0.78551 |  0:01:03s
epoch 11 | loss: 0.43008 | val_0_accuracy: 0.78841 |  0:01:06s
epoch 12 | loss: 0.41952 | val_0_accuracy: 0.79034 |  0:01:10s
epoch 13 | loss: 0.42594 | val_0_accuracy: 0.79469 |  0:01:13s
epoch 14 | loss: 0.41194 | val_0_accuracy: 0.78647 |  0:01:17s
epoch 15 | loss: 0.41338 | val_0_accuracy: 0.79614 |  0

[I 2024-09-07 19:02:39,826] Trial 5 finished with value: 0.808695652173913 and parameters: {'n_d': 56, 'n_a': 46, 'n_steps': 4, 'gamma': 1.3863947752490262, 'lambda_sparse': 0.004157997279131891, 'lr': 0.0014447417995726575, 'mask_type': 'entmax', 'n_shared': 1, 'n_independent': 3}. Best is trial 3 with value: 0.81256038647343.


epoch 0  | loss: 1.07013 | val_0_accuracy: 0.50097 |  0:00:03s
epoch 1  | loss: 0.83189 | val_0_accuracy: 0.57488 |  0:00:07s
epoch 2  | loss: 0.73361 | val_0_accuracy: 0.63188 |  0:00:10s
epoch 3  | loss: 0.6924  | val_0_accuracy: 0.6913  |  0:00:13s
epoch 4  | loss: 0.64174 | val_0_accuracy: 0.72222 |  0:00:17s
epoch 5  | loss: 0.61892 | val_0_accuracy: 0.72995 |  0:00:20s
epoch 6  | loss: 0.61087 | val_0_accuracy: 0.72657 |  0:00:24s
epoch 7  | loss: 0.58724 | val_0_accuracy: 0.73671 |  0:00:27s
epoch 8  | loss: 0.59337 | val_0_accuracy: 0.743   |  0:00:30s
epoch 9  | loss: 0.57669 | val_0_accuracy: 0.74541 |  0:00:34s
epoch 10 | loss: 0.56564 | val_0_accuracy: 0.74734 |  0:00:37s
epoch 11 | loss: 0.56132 | val_0_accuracy: 0.75749 |  0:00:41s
epoch 12 | loss: 0.5641  | val_0_accuracy: 0.76618 |  0:00:44s
epoch 13 | loss: 0.55292 | val_0_accuracy: 0.76763 |  0:00:47s
epoch 14 | loss: 0.54612 | val_0_accuracy: 0.75749 |  0:00:50s
epoch 15 | loss: 0.54369 | val_0_accuracy: 0.75845 |  0

[I 2024-09-07 19:03:59,855] Trial 6 finished with value: 0.7676328502415459 and parameters: {'n_d': 64, 'n_a': 37, 'n_steps': 4, 'gamma': 1.6496414164629933, 'lambda_sparse': 0.005485383045884183, 'lr': 0.0003006260164849362, 'mask_type': 'sparsemax', 'n_shared': 3, 'n_independent': 1}. Best is trial 3 with value: 0.81256038647343.


epoch 0  | loss: 0.69076 | val_0_accuracy: 0.67971 |  0:00:01s
epoch 1  | loss: 0.56483 | val_0_accuracy: 0.71111 |  0:00:03s
epoch 2  | loss: 0.52801 | val_0_accuracy: 0.73865 |  0:00:05s
epoch 3  | loss: 0.50787 | val_0_accuracy: 0.757   |  0:00:07s
epoch 4  | loss: 0.49639 | val_0_accuracy: 0.75652 |  0:00:09s
epoch 5  | loss: 0.48939 | val_0_accuracy: 0.76715 |  0:00:11s
epoch 6  | loss: 0.4781  | val_0_accuracy: 0.77488 |  0:00:13s
epoch 7  | loss: 0.476   | val_0_accuracy: 0.7744  |  0:00:15s
epoch 8  | loss: 0.46647 | val_0_accuracy: 0.78068 |  0:00:17s
epoch 9  | loss: 0.45708 | val_0_accuracy: 0.78406 |  0:00:19s
epoch 10 | loss: 0.45271 | val_0_accuracy: 0.77585 |  0:00:20s
epoch 11 | loss: 0.44681 | val_0_accuracy: 0.78889 |  0:00:22s
epoch 12 | loss: 0.44264 | val_0_accuracy: 0.78116 |  0:00:23s
epoch 13 | loss: 0.43543 | val_0_accuracy: 0.78213 |  0:00:25s
epoch 14 | loss: 0.42965 | val_0_accuracy: 0.79662 |  0:00:26s
epoch 15 | loss: 0.42717 | val_0_accuracy: 0.78696 |  0

[I 2024-09-07 19:05:31,423] Trial 7 finished with value: 0.808695652173913 and parameters: {'n_d': 30, 'n_a': 15, 'n_steps': 3, 'gamma': 1.6317190077030883, 'lambda_sparse': 0.004874839157031982, 'lr': 0.0018849505923335876, 'mask_type': 'entmax', 'n_shared': 2, 'n_independent': 1}. Best is trial 3 with value: 0.81256038647343.


epoch 0  | loss: 0.78311 | val_0_accuracy: 0.53382 |  0:00:04s
epoch 1  | loss: 0.59601 | val_0_accuracy: 0.68937 |  0:00:09s
epoch 2  | loss: 0.55735 | val_0_accuracy: 0.72222 |  0:00:13s
epoch 3  | loss: 0.53768 | val_0_accuracy: 0.7372  |  0:00:18s
epoch 4  | loss: 0.51262 | val_0_accuracy: 0.75362 |  0:00:23s
epoch 5  | loss: 0.51974 | val_0_accuracy: 0.76425 |  0:00:28s
epoch 6  | loss: 0.49723 | val_0_accuracy: 0.7599  |  0:00:33s
epoch 7  | loss: 0.50137 | val_0_accuracy: 0.75266 |  0:00:39s
epoch 8  | loss: 0.50238 | val_0_accuracy: 0.75749 |  0:00:44s
epoch 9  | loss: 0.49053 | val_0_accuracy: 0.78841 |  0:00:50s
epoch 10 | loss: 0.48808 | val_0_accuracy: 0.78696 |  0:00:55s
epoch 11 | loss: 0.47967 | val_0_accuracy: 0.77053 |  0:01:00s
epoch 12 | loss: 0.47303 | val_0_accuracy: 0.79469 |  0:01:05s
epoch 13 | loss: 0.46692 | val_0_accuracy: 0.78937 |  0:01:09s
epoch 14 | loss: 0.47125 | val_0_accuracy: 0.78986 |  0:01:14s
epoch 15 | loss: 0.46206 | val_0_accuracy: 0.78068 |  0

[I 2024-09-07 19:09:48,581] Trial 8 finished with value: 0.8154589371980676 and parameters: {'n_d': 25, 'n_a': 30, 'n_steps': 4, 'gamma': 1.971946952264732, 'lambda_sparse': 0.0007485402043618457, 'lr': 0.002788528289062548, 'mask_type': 'entmax', 'n_shared': 3, 'n_independent': 3}. Best is trial 8 with value: 0.8154589371980676.


epoch 0  | loss: 0.75241 | val_0_accuracy: 0.65314 |  0:00:02s
epoch 1  | loss: 0.52849 | val_0_accuracy: 0.72174 |  0:00:03s
epoch 2  | loss: 0.496   | val_0_accuracy: 0.74783 |  0:00:06s
epoch 3  | loss: 0.48075 | val_0_accuracy: 0.757   |  0:00:08s
epoch 4  | loss: 0.46747 | val_0_accuracy: 0.77295 |  0:00:10s
epoch 5  | loss: 0.45471 | val_0_accuracy: 0.78019 |  0:00:13s
epoch 6  | loss: 0.4387  | val_0_accuracy: 0.79227 |  0:00:15s
epoch 7  | loss: 0.42798 | val_0_accuracy: 0.77826 |  0:00:18s
epoch 8  | loss: 0.43432 | val_0_accuracy: 0.79275 |  0:00:20s
epoch 9  | loss: 0.42482 | val_0_accuracy: 0.79855 |  0:00:23s
epoch 10 | loss: 0.41913 | val_0_accuracy: 0.79758 |  0:00:25s
epoch 11 | loss: 0.42362 | val_0_accuracy: 0.80145 |  0:00:28s
epoch 12 | loss: 0.41952 | val_0_accuracy: 0.80628 |  0:00:30s
epoch 13 | loss: 0.41393 | val_0_accuracy: 0.80531 |  0:00:33s
epoch 14 | loss: 0.41389 | val_0_accuracy: 0.81353 |  0:00:35s
epoch 15 | loss: 0.40376 | val_0_accuracy: 0.81498 |  0

[I 2024-09-07 19:11:00,739] Trial 9 finished with value: 0.8236714975845411 and parameters: {'n_d': 56, 'n_a': 50, 'n_steps': 3, 'gamma': 1.7738473640154637, 'lambda_sparse': 0.005413185480278897, 'lr': 0.008417007144643893, 'mask_type': 'entmax', 'n_shared': 1, 'n_independent': 3}. Best is trial 9 with value: 0.8236714975845411.


Best Hyperparameters: {'n_d': 56, 'n_a': 50, 'n_steps': 3, 'gamma': 1.7738473640154637, 'lambda_sparse': 0.005413185480278897, 'lr': 0.008417007144643893, 'mask_type': 'entmax', 'n_shared': 1, 'n_independent': 3}
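
The search space above samples `lr` and `lambda_sparse` uniformly over ranges that span two orders of magnitude, which means small values are rarely tried. The sketch below uses only the standard library to compare uniform and log-uniform sampling over the same `lr` range; the helper names are hypothetical. In Optuna, log-scale sampling is available by passing `log=True` to `trial.suggest_float`, which may be worth trying here.

```python
import math
import random


def sample_uniform(rng, low, high):
    """Draw a value uniformly between low and high."""
    return rng.uniform(low, high)


def sample_log_uniform(rng, low, high):
    """Draw a value whose logarithm is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(42)
low, high = 1e-4, 1e-2  # same range as the 'lr' search space above
n_draws = 10_000

uniform_draws = [sample_uniform(rng, low, high) for _ in range(n_draws)]
log_draws = [sample_log_uniform(rng, low, high) for _ in range(n_draws)]

# Share of draws that fall in the lower decade [1e-4, 1e-3)
share_small_uniform = sum(v < 1e-3 for v in uniform_draws) / n_draws
share_small_log = sum(v < 1e-3 for v in log_draws) / n_draws

print(f"uniform:     {share_small_uniform:.1%} of draws below 1e-3")
print(f"log-uniform: {share_small_log:.1%} of draws below 1e-3")
```

Under uniform sampling only about one draw in eleven lands below 1e-3, while log-uniform sampling splits the draws roughly evenly between the two decades.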


In [32]:
# Initialize the TabNet model with the best hyperparameters
optimized_tabnet_clf = TabNetClassifier(
    n_d=best_params['n_d'],
    n_a=best_params['n_a'],
    n_steps=best_params['n_steps'],
    gamma=best_params['gamma'],
    lambda_sparse=best_params['lambda_sparse'],
    optimizer_params={'lr': best_params['lr']},  # Correctly pass the learning rate
    mask_type=best_params['mask_type'],
    n_shared=best_params['n_shared'],
    n_independent=best_params['n_independent']
)

# Perform cross-validation to evaluate model performance
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
y_cv_pred = cross_val_predict(optimized_tabnet_clf, features_resampled, target_resampled, cv=cv)

# Calculate cross-validated accuracy
cv_accuracy = accuracy_score(target_resampled, y_cv_pred)
print("Cross-Validated Accuracy:", cv_accuracy)

# Train the optimized model on the training split (X_test is used for early stopping)
optimized_tabnet_clf.fit(
    X_train, y_train,
    eval_set=[(X_test, y_test)],
    eval_metric=['accuracy'],
    max_epochs=100,
    patience=10,
    batch_size=256,
    virtual_batch_size=128
)

# Make final predictions on the test set
y_pred_optimized = optimized_tabnet_clf.predict(X_test)

# Calculate the classification report and confusion matrix
print("Classification Report for Optimized TabNet:\n", classification_report(y_test, y_pred_optimized))
print("Confusion Matrix for Optimized TabNet:\n", confusion_matrix(y_test, y_pred_optimized))



epoch 0  | loss: 1.14022 |  0:00:01s
epoch 1  | loss: 0.57973 |  0:00:02s
epoch 2  | loss: 0.52046 |  0:00:03s
epoch 3  | loss: 0.4879  |  0:00:04s
epoch 4  | loss: 0.46895 |  0:00:04s
epoch 5  | loss: 0.45676 |  0:00:05s
epoch 6  | loss: 0.45219 |  0:00:06s
epoch 7  | loss: 0.43486 |  0:00:07s
epoch 8  | loss: 0.42348 |  0:00:08s
epoch 9  | loss: 0.42023 |  0:00:09s
epoch 10 | loss: 0.41318 |  0:00:09s
epoch 11 | loss: 0.40969 |  0:00:10s
epoch 12 | loss: 0.39724 |  0:00:11s
epoch 13 | loss: 0.39585 |  0:00:12s
epoch 14 | loss: 0.38526 |  0:00:13s
epoch 15 | loss: 0.38526 |  0:00:15s
epoch 16 | loss: 0.37619 |  0:00:16s
epoch 17 | loss: 0.37946 |  0:00:17s
epoch 18 | loss: 0.36892 |  0:00:19s
epoch 19 | loss: 0.36855 |  0:00:20s
epoch 20 | loss: 0.36446 |  0:00:21s
epoch 21 | loss: 0.35831 |  0:00:22s
epoch 22 | loss: 0.35687 |  0:00:23s
epoch 23 | loss: 0.34799 |  0:00:25s
epoch 24 | loss: 0.34914 |  0:00:26s
epoch 25 | loss: 0.34383 |  0:00:27s
epoch 26 | loss: 0.32719 |  0:00:29s
e



epoch 0  | loss: 1.12742 |  0:00:01s
epoch 1  | loss: 0.58407 |  0:00:03s
epoch 2  | loss: 0.51794 |  0:00:04s
epoch 3  | loss: 0.48681 |  0:00:06s
epoch 4  | loss: 0.46947 |  0:00:07s
epoch 5  | loss: 0.45933 |  0:00:08s
epoch 6  | loss: 0.44599 |  0:00:10s
epoch 7  | loss: 0.43837 |  0:00:11s
epoch 8  | loss: 0.42354 |  0:00:13s
epoch 9  | loss: 0.42432 |  0:00:14s
epoch 10 | loss: 0.41392 |  0:00:16s
epoch 11 | loss: 0.41543 |  0:00:17s
epoch 12 | loss: 0.41205 |  0:00:18s
epoch 13 | loss: 0.41255 |  0:00:19s
epoch 14 | loss: 0.40892 |  0:00:20s
epoch 15 | loss: 0.40056 |  0:00:22s
epoch 16 | loss: 0.39638 |  0:00:23s
epoch 17 | loss: 0.38054 |  0:00:25s
epoch 18 | loss: 0.37647 |  0:00:27s
epoch 19 | loss: 0.37048 |  0:00:28s
epoch 20 | loss: 0.36521 |  0:00:30s
epoch 21 | loss: 0.35904 |  0:00:31s
epoch 22 | loss: 0.36881 |  0:00:32s
epoch 23 | loss: 0.35857 |  0:00:34s
epoch 24 | loss: 0.36185 |  0:00:35s
epoch 25 | loss: 0.3535  |  0:00:36s
epoch 26 | loss: 0.35157 |  0:00:37s
e



epoch 0  | loss: 1.12104 |  0:00:02s
epoch 1  | loss: 0.58736 |  0:00:05s
epoch 2  | loss: 0.52709 |  0:00:07s
epoch 3  | loss: 0.49788 |  0:00:09s
epoch 4  | loss: 0.47901 |  0:00:11s
epoch 5  | loss: 0.46644 |  0:00:14s
epoch 6  | loss: 0.45243 |  0:00:16s
epoch 7  | loss: 0.44479 |  0:00:19s
epoch 8  | loss: 0.43515 |  0:00:21s
epoch 9  | loss: 0.42203 |  0:00:24s
epoch 10 | loss: 0.42479 |  0:00:26s
epoch 11 | loss: 0.41894 |  0:00:28s
epoch 12 | loss: 0.42327 |  0:00:29s
epoch 13 | loss: 0.42036 |  0:00:31s
epoch 14 | loss: 0.40611 |  0:00:32s
epoch 15 | loss: 0.39827 |  0:00:33s
epoch 16 | loss: 0.40195 |  0:00:35s
epoch 17 | loss: 0.39689 |  0:00:36s
epoch 18 | loss: 0.39337 |  0:00:38s
epoch 19 | loss: 0.38474 |  0:00:39s
epoch 20 | loss: 0.38063 |  0:00:40s
epoch 21 | loss: 0.37323 |  0:00:42s
epoch 22 | loss: 0.37107 |  0:00:43s
epoch 23 | loss: 0.35884 |  0:00:44s
epoch 24 | loss: 0.35876 |  0:00:46s
epoch 25 | loss: 0.35346 |  0:00:47s
epoch 26 | loss: 0.36215 |  0:00:48s
e



epoch 0  | loss: 1.10564 |  0:00:05s
epoch 1  | loss: 0.56305 |  0:00:06s
epoch 2  | loss: 0.51644 |  0:00:08s
epoch 3  | loss: 0.48879 |  0:00:10s
epoch 4  | loss: 0.46642 |  0:00:12s
epoch 5  | loss: 0.45855 |  0:00:13s
epoch 6  | loss: 0.44481 |  0:00:15s
epoch 7  | loss: 0.44039 |  0:00:16s
epoch 8  | loss: 0.43542 |  0:00:18s
epoch 9  | loss: 0.42846 |  0:00:20s
epoch 10 | loss: 0.41297 |  0:00:21s
epoch 11 | loss: 0.41296 |  0:00:23s
epoch 12 | loss: 0.40515 |  0:00:24s
epoch 13 | loss: 0.40054 |  0:00:26s
epoch 14 | loss: 0.3939  |  0:00:28s
epoch 15 | loss: 0.3889  |  0:00:30s
epoch 16 | loss: 0.39025 |  0:00:31s
epoch 17 | loss: 0.39103 |  0:00:33s
epoch 18 | loss: 0.38946 |  0:00:35s
epoch 19 | loss: 0.38296 |  0:00:37s
epoch 20 | loss: 0.38473 |  0:00:39s
epoch 21 | loss: 0.38866 |  0:00:41s
epoch 22 | loss: 0.38651 |  0:00:42s
epoch 23 | loss: 0.3813  |  0:00:44s
epoch 24 | loss: 0.37321 |  0:00:47s
epoch 25 | loss: 0.37445 |  0:00:49s
epoch 26 | loss: 0.36517 |  0:00:51s
e



epoch 0  | loss: 1.12544 |  0:00:01s
epoch 1  | loss: 0.55245 |  0:00:02s
epoch 2  | loss: 0.50675 |  0:00:03s
epoch 3  | loss: 0.48214 |  0:00:05s
epoch 4  | loss: 0.46208 |  0:00:06s
epoch 5  | loss: 0.45412 |  0:00:08s
epoch 6  | loss: 0.43959 |  0:00:10s
epoch 7  | loss: 0.44062 |  0:00:12s
epoch 8  | loss: 0.42525 |  0:00:13s
epoch 9  | loss: 0.42373 |  0:00:15s
epoch 10 | loss: 0.41165 |  0:00:16s
epoch 11 | loss: 0.40097 |  0:00:18s
epoch 12 | loss: 0.39586 |  0:00:19s
epoch 13 | loss: 0.39224 |  0:00:21s
epoch 14 | loss: 0.39192 |  0:00:22s
epoch 15 | loss: 0.38773 |  0:00:24s
epoch 16 | loss: 0.37523 |  0:00:25s
epoch 17 | loss: 0.37765 |  0:00:27s
epoch 18 | loss: 0.36493 |  0:00:29s
epoch 19 | loss: 0.35977 |  0:00:30s
epoch 20 | loss: 0.34915 |  0:00:32s
epoch 21 | loss: 0.35324 |  0:00:34s
epoch 22 | loss: 0.35521 |  0:00:35s
epoch 23 | loss: 0.35263 |  0:00:37s
epoch 24 | loss: 0.34735 |  0:00:38s
epoch 25 | loss: 0.34313 |  0:00:40s
epoch 26 | loss: 0.33619 |  0:00:41s
e



Classification Report for Optimized TabNet:
               precision    recall  f1-score   support

           0       0.85      0.78      0.81      1021
           1       0.80      0.87      0.83      1049

    accuracy                           0.82      2070
   macro avg       0.83      0.82      0.82      2070
weighted avg       0.83      0.82      0.82      2070

Confusion Matrix for Optimized TabNet:
 [[797 224]
 [141 908]]
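
The classification report and the confusion matrix carry the same information. As a quick check, this sketch recomputes precision, recall, F1, and accuracy directly from the confusion matrix printed above, assuming scikit-learn's convention that rows are actual classes and columns are predicted classes. `per_class_metrics` is a hypothetical helper name.

```python
# Confusion matrix reported above: rows = actual class, columns = predicted class
confusion = [[797, 224],
             [141, 908]]


def per_class_metrics(matrix, cls):
    """Return (precision, recall, f1) for one class of a square confusion matrix."""
    true_pos = matrix[cls][cls]
    predicted_as_cls = sum(row[cls] for row in matrix)  # column sum
    actual_cls = sum(matrix[cls])                       # row sum
    precision = true_pos / predicted_as_cls
    recall = true_pos / actual_cls
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


total = sum(sum(row) for row in confusion)
accuracy = sum(confusion[i][i] for i in range(len(confusion))) / total

for cls in range(len(confusion)):
    precision, recall, f1 = per_class_metrics(confusion, cls)
    print(f"class {cls}: precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
print(f"accuracy={accuracy:.2f}")
```

Read this way, the matrix shows 224 non-churners flagged as churners and 141 churners missed, which is where the gap between precision and recall for each class comes from.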


### Understand the Data Processing and Model Development
The code above breaks down into the following steps for preprocessing, training, and optimizing the model:

**Data Loading and Preprocessing:**

+ Loading the dataset.
+ Converting the target variable to numeric.
+ Encoding categorical variables using Label Encoding.
+ Applying the SMOTE technique for handling class imbalance.

**Splitting the Data:**

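+ Splitting the resampled data into training and testing sets. Note that SMOTE runs before the split in this code, so synthetic samples interpolated from rows that land in the test set can appear in the training set, and the reported test scores are likely optimistic.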

**Hyperparameter Optimization:**

+ Using Optuna for hyperparameter tuning of the TabNetClassifier.

**Model Training and Evaluation:**

+ Evaluating the optimized model with 5-fold stratified cross-validation, then training it on the training split.
+ Evaluating the model's performance with a classification report and confusion matrix.

### Save the Preprocessing Objects and Model

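To deploy the model, save the preprocessing objects (the label encoders and the SMOTE object) and the trained model itself. The label encoders are needed at inference time so that new rows are encoded exactly as the training data was; SMOTE is only applied during training, so saving it mainly helps reproducibility. We use the joblib library for this purpose.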

In [19]:
## required libraries
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split, StratifiedKFold, cross_val_predict
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score, precision_score, recall_score, f1_score
from imblearn.over_sampling import SMOTE
from pytorch_tabnet.tab_model import TabNetClassifier
import optuna
import joblib

In [15]:
## Load the dataset
data = pd.read_csv('no_missing_values_customer_data.csv')

## Convert the target variable 'Churn' to numeric
data['Churn'] = data['Churn'].map({'Yes': 1, 'No': 0})

## Encode categorical variables using Label Encoding
label_encoders = {}  # Dictionary to store one fitted encoder per column
for col in data.select_dtypes(include=['object']).columns:
    if col != 'customerID':  ## skip customerID: it is an identifier with no predictive value
        le = LabelEncoder()  ## create a fresh encoder for this column
        data[col] = le.fit_transform(data[col])  ## replace the categories with integer codes
        label_encoders[col] = le  # Save the encoder for deployment

## Save the label encoders
joblib.dump(label_encoders, 'label_encoders.pkl')

## Separate features and target
features = data.drop(['Churn', 'customerID'], axis=1).values  ## drop the customerID column
target = data['Churn'].values
print("Shape of Features in Raw Data : ---",features.shape)
print("Shape of Targets in Raw Data : ---",target.shape)

##  Apply SMOTE for oversampling the minority class
## Synthetic Minority Over-sampling Technique
smote = SMOTE(random_state=42)
features_resampled, target_resampled = smote.fit_resample(features, target)
print("Shape of Features after doing the SMOTE Technique : ---",features_resampled.shape)
print("Shape of Targets after doing the SMOTE Technique : ---",target_resampled.shape)
## Save the SMOTE object
joblib.dump(smote, 'smote.pkl')

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(features_resampled, target_resampled, test_size=0.2, random_state=42)

print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)

Shape of Features in Raw Data : --- (7043, 19)
Shape of Targets in Raw Data : --- (7043,)
Shape of Features after doing the SMOTE Technique : --- (10348, 19)
Shape of Targets after doing the SMOTE Technique : --- (10348,)
(8278, 19)
(2070, 19)
(8278,)
(2070,)
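
At prediction time the saved encoders have to be loaded and applied to new rows in the same column order used for training. The sketch below shows that round trip with `joblib` on a tiny made-up example; the column names, category values, and the temporary file location are assumptions for illustration only, not taken from the dataset above.

```python
import os
import tempfile

import joblib
from sklearn.preprocessing import LabelEncoder

# Hypothetical training-time categories (illustrative values only)
train_columns = {
    "Contract": ["Month-to-month", "One year", "Two year", "One year"],
    "PaperlessBilling": ["Yes", "No", "Yes", "Yes"],
}

# Fit one encoder per column, as in the preprocessing cell above
label_encoders = {}
for col, values in train_columns.items():
    le = LabelEncoder()
    le.fit(values)
    label_encoders[col] = le

# Save and reload the encoders (a temporary directory keeps the sketch self-contained)
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "label_encoders.pkl")
    joblib.dump(label_encoders, path)
    loaded_encoders = joblib.load(path)

# Encode a new customer with the reloaded encoders, keeping the training column order
new_customer = {"Contract": "Two year", "PaperlessBilling": "No"}
encoded_row = [
    int(loaded_encoders[col].transform([new_customer[col]])[0])
    for col in train_columns
]
print(encoded_row)
```

`LabelEncoder.transform` raises an error for a category it did not see during fitting, so a deployed service would need a policy for unseen values.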


In [16]:
## Define the hyperparameter optimization function (search space and objective)
def objective(trial):
    # Define the hyperparameters
    params = {
        'n_d': trial.suggest_int('n_d', 8, 64),
        'n_a': trial.suggest_int('n_a', 8, 64),
        'n_steps': trial.suggest_int('n_steps', 3, 10),
        'gamma': trial.suggest_float('gamma', 1.0, 2.0),
        'lambda_sparse': trial.suggest_float('lambda_sparse', 0.0001, 0.01),
        'optimizer_params': {'lr': trial.suggest_float('lr', 1e-4, 1e-2)},
        'mask_type': trial.suggest_categorical('mask_type', ['entmax', 'sparsemax']),
        'n_shared': trial.suggest_int('n_shared', 1, 3),
        'n_independent': trial.suggest_int('n_independent', 1, 3)
    }
    
    # Initialize TabNetClassifier with the parameters
    tabnet_clf = TabNetClassifier(
        n_d=params['n_d'],
        n_a=params['n_a'],
        n_steps=params['n_steps'],
        gamma=params['gamma'],
        lambda_sparse=params['lambda_sparse'],
        optimizer_params=params['optimizer_params'],
        mask_type=params['mask_type'],
        n_shared=params['n_shared'],
        n_independent=params['n_independent']
    )

    # Train the model
    tabnet_clf.fit(
        X_train, y_train,
        eval_set=[(X_test, y_test)],
        eval_metric=['accuracy'],
        max_epochs=100,
        patience=10,
        batch_size=256,
        virtual_batch_size=128
    )
    
    # Predict and calculate accuracy
    y_pred = tabnet_clf.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    
    return accuracy
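
The objective above scores each trial on `X_test`, which was split off after SMOTE and so contains synthetic rows. A common alternative ordering is to split first and oversample only the training part. The sketch below illustrates that ordering on toy data with a small pure-Python stand-in for SMOTE. `simple_smote` is a hypothetical helper written for this illustration, not imblearn's implementation, and it assumes a binary target.

```python
import math
import random

def simple_smote(X, y, minority=1, k=3, seed=42):
    """Toy binary SMOTE: add interpolated minority rows until the classes are balanced."""
    rng = random.Random(seed)
    minority_rows = [x for x, label in zip(X, y) if label == minority]
    n_needed = (len(X) - len(minority_rows)) - len(minority_rows)
    X_out, y_out = list(X), list(y)
    for _ in range(max(n_needed, 0)):
        base = rng.choice(minority_rows)
        # k nearest minority neighbours of the base row (excluding the row itself)
        neighbours = sorted(
            (r for r in minority_rows if r is not base),
            key=lambda r: math.dist(base, r),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # position along the segment from base to other
        X_out.append(tuple(b + t * (o - b) for b, o in zip(base, other)))
        y_out.append(minority)
    return X_out, y_out

# Toy imbalanced data: 40 majority rows (label 0) and 10 minority rows (label 1).
data_rng = random.Random(0)
X = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(40)]
X += [(data_rng.gauss(3, 1), data_rng.gauss(3, 1)) for _ in range(10)]
y = [0] * 40 + [1] * 10

# 1) Split first (every 5th row goes to the test set, which keeps the class ratio).
test_idx = set(range(0, len(X), 5))
X_tr = [x for i, x in enumerate(X) if i not in test_idx]
y_tr = [label for i, label in enumerate(y) if i not in test_idx]
X_te = [x for i, x in enumerate(X) if i in test_idx]
y_te = [label for i, label in enumerate(y) if i in test_idx]

# 2) Oversample only the training part; the test set stays untouched.
X_tr_res, y_tr_res = simple_smote(X_tr, y_tr)

print(len(X_tr), len(X_tr_res), y_tr_res.count(0), y_tr_res.count(1))  # 40 64 32 32
print(len(X_te), y_te.count(1))                                        # 10 2
```

With this ordering the test set keeps the original class ratio and contains only real observations, so the reported metrics describe the imbalanced data the model would see in practice.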

In [20]:
## Run the hyperparameter search.
## With n_trials=1 only a single configuration is sampled, so the "best" parameters are just that one trial's; raise n_trials for a real search.
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=1)

# Get the best parameters
best_params = study.best_params
print("Best Hyperparameters:", best_params)

[I 2024-09-09 14:27:56,261] A new study created in memory with name: no-name-1ca64b1c-1e94-4ca8-a04b-759a4495ab9e


epoch 0  | loss: 0.85121 | val_0_accuracy: 0.5343  |  0:00:03s
epoch 1  | loss: 0.62698 | val_0_accuracy: 0.64976 |  0:00:10s
epoch 2  | loss: 0.57145 | val_0_accuracy: 0.72174 |  0:00:16s
epoch 3  | loss: 0.55581 | val_0_accuracy: 0.7343  |  0:00:27s
epoch 4  | loss: 0.54609 | val_0_accuracy: 0.72899 |  0:00:37s
epoch 5  | loss: 0.51952 | val_0_accuracy: 0.77005 |  0:00:43s
epoch 6  | loss: 0.52895 | val_0_accuracy: 0.78792 |  0:00:50s
epoch 7  | loss: 0.51045 | val_0_accuracy: 0.7686  |  0:00:56s
epoch 8  | loss: 0.5068  | val_0_accuracy: 0.75845 |  0:01:02s
epoch 9  | loss: 0.51035 | val_0_accuracy: 0.76329 |  0:01:08s
epoch 10 | loss: 0.50876 | val_0_accuracy: 0.77488 |  0:01:13s
epoch 11 | loss: 0.50459 | val_0_accuracy: 0.77536 |  0:01:19s
epoch 12 | loss: 0.51326 | val_0_accuracy: 0.75266 |  0:01:25s
epoch 13 | loss: 0.50742 | val_0_accuracy: 0.75797 |  0:01:31s
epoch 14 | loss: 0.50134 | val_0_accuracy: 0.77923 |  0:01:36s
epoch 15 | loss: 0.48908 | val_0_accuracy: 0.78406 |  0

[I 2024-09-09 14:31:19,956] Trial 0 finished with value: 0.797584541062802 and parameters: {'n_d': 23, 'n_a': 29, 'n_steps': 8, 'gamma': 1.9889182765540496, 'lambda_sparse': 0.004552405345846334, 'lr': 0.0043902484776940495, 'mask_type': 'entmax', 'n_shared': 2, 'n_independent': 3}. Best is trial 0 with value: 0.797584541062802.


Best Hyperparameters: {'n_d': 23, 'n_a': 29, 'n_steps': 8, 'gamma': 1.9889182765540496, 'lambda_sparse': 0.004552405345846334, 'lr': 0.0043902484776940495, 'mask_type': 'entmax', 'n_shared': 2, 'n_independent': 3}
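
The training log above is cut off after epoch 15, but the visible part still shows how `patience=10` behaves. The sketch below is generic patience-based early stopping applied to the logged validation accuracies. `early_stop_epoch` is a hypothetical helper for illustration and is not pytorch-tabnet's actual callback.

```python
# Validation accuracies for epochs 0-15, copied from the log above.
val_acc = [0.5343, 0.64976, 0.72174, 0.7343, 0.72899, 0.77005, 0.78792, 0.7686,
           0.75845, 0.76329, 0.77488, 0.77536, 0.75266, 0.75797, 0.77923, 0.78406]

def early_stop_epoch(scores, patience):
    """Return the epoch at which training would stop, or None if it keeps going."""
    best, best_epoch = float("-inf"), -1
    for epoch, score in enumerate(scores):
        if score > best:
            best, best_epoch = score, epoch      # new best: reset the counter
        elif epoch - best_epoch >= patience:
            return epoch                         # no improvement for `patience` epochs
    return None

print(early_stop_epoch(val_acc, patience=10))  # None
print(early_stop_epoch(val_acc, patience=5))   # 11
```

Within the visible log the best accuracy is at epoch 6 (0.78792), and only nine epochs have passed without improvement by epoch 15, so a patience of 10 has not yet triggered. That is consistent with the trial finishing at 0.7976, which is higher than any accuracy shown in the truncated log.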


In [21]:
## Initialize the TabNet model with the best hyperparameters found by the study
optimized_tabnet_clf = TabNetClassifier(
    n_d=best_params['n_d'],
    n_a=best_params['n_a'],
    n_steps=best_params['n_steps'],
    gamma=best_params['gamma'],
    lambda_sparse=best_params['lambda_sparse'],
    optimizer_params={'lr': best_params['lr']},
    mask_type=best_params['mask_type'],
    n_shared=best_params['n_shared'],
    n_independent=best_params['n_independent']
)



# Perform cross-validation to evaluate model performance
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
y_cv_pred = cross_val_predict(optimized_tabnet_clf, features_resampled, target_resampled, cv=cv)

# Calculate cross-validated accuracy
cv_accuracy = accuracy_score(target_resampled, y_cv_pred)
print("Cross-Validated Accuracy:", cv_accuracy)

# Calculate cross-validated precision, recall, and F1-score
cv_precision = precision_score(target_resampled, y_cv_pred)
cv_recall = recall_score(target_resampled, y_cv_pred)
cv_f1 = f1_score(target_resampled, y_cv_pred)

print(f"Cross-Validated Precision: {cv_precision:.2f}")
print(f"Cross-Validated Recall: {cv_recall:.2f}")
print(f"Cross-Validated F1 Score: {cv_f1:.2f}")

# Train the optimized model on the training split, using the test split for early stopping
optimized_tabnet_clf.fit(
    X_train, y_train,
    eval_set=[(X_test, y_test)],
    eval_metric=['accuracy'],
    max_epochs=100,
    patience=10,
    batch_size=256,
    virtual_batch_size=128
)

# Make final predictions on the test set
y_pred_optimized = optimized_tabnet_clf.predict(X_test)

# Calculate and print evaluation metrics
print("Classification Report for Optimized TabNet:\n", classification_report(y_test, y_pred_optimized))
print("Confusion Matrix for Optimized TabNet:\n", confusion_matrix(y_test, y_pred_optimized))

# Persist the trained model with joblib
joblib.dump(optimized_tabnet_clf, 'optimized_tabnet_model.pkl')



epoch 0  | loss: 0.99832 |  0:00:03s
epoch 1  | loss: 0.70475 |  0:00:07s
epoch 2  | loss: 0.64388 |  0:00:12s
epoch 3  | loss: 0.59762 |  0:00:16s
epoch 4  | loss: 0.58686 |  0:00:19s
epoch 5  | loss: 0.56716 |  0:00:22s
epoch 6  | loss: 0.54784 |  0:00:25s
epoch 7  | loss: 0.53829 |  0:00:29s
epoch 8  | loss: 0.5365  |  0:00:32s
epoch 9  | loss: 0.53782 |  0:00:36s
epoch 10 | loss: 0.52871 |  0:00:39s
epoch 11 | loss: 0.53489 |  0:00:42s
epoch 12 | loss: 0.52635 |  0:00:45s
epoch 13 | loss: 0.52194 |  0:00:48s
epoch 14 | loss: 0.52556 |  0:00:51s
epoch 15 | loss: 0.51255 |  0:00:54s
epoch 16 | loss: 0.5018  |  0:00:56s
epoch 17 | loss: 0.50491 |  0:01:00s
epoch 18 | loss: 0.49649 |  0:01:03s
epoch 19 | loss: 0.50394 |  0:01:06s
epoch 20 | loss: 0.4932  |  0:01:08s
epoch 21 | loss: 0.4793  |  0:01:11s
epoch 22 | loss: 0.48236 |  0:01:13s
epoch 23 | loss: 0.48775 |  0:01:17s
epoch 24 | loss: 0.4871  |  0:01:19s
epoch 25 | loss: 0.49325 |  0:01:22s
epoch 26 | loss: 0.49116 |  0:01:27s
e



epoch 0  | loss: 1.04078 |  0:00:02s
epoch 1  | loss: 0.75535 |  0:00:05s
epoch 2  | loss: 0.66623 |  0:00:08s
epoch 3  | loss: 0.60226 |  0:00:10s
epoch 4  | loss: 0.56409 |  0:00:13s
epoch 5  | loss: 0.55288 |  0:00:16s
epoch 6  | loss: 0.53676 |  0:00:19s
epoch 7  | loss: 0.53344 |  0:00:21s
epoch 8  | loss: 0.52739 |  0:00:22s
epoch 9  | loss: 0.5367  |  0:00:24s
epoch 10 | loss: 0.53051 |  0:00:27s
epoch 11 | loss: 0.52324 |  0:00:29s
epoch 12 | loss: 0.52898 |  0:00:31s
epoch 13 | loss: 0.53066 |  0:00:33s
epoch 14 | loss: 0.5571  |  0:00:36s
epoch 15 | loss: 0.52522 |  0:00:38s
epoch 16 | loss: 0.50748 |  0:00:40s
epoch 17 | loss: 0.49906 |  0:00:42s
epoch 18 | loss: 0.49279 |  0:00:45s
epoch 19 | loss: 0.49732 |  0:00:47s
epoch 20 | loss: 0.49256 |  0:00:50s
epoch 21 | loss: 0.49292 |  0:00:52s
epoch 22 | loss: 0.48799 |  0:00:55s
epoch 23 | loss: 0.48576 |  0:00:57s
epoch 24 | loss: 0.48565 |  0:00:59s
epoch 25 | loss: 0.47464 |  0:01:01s
epoch 26 | loss: 0.49586 |  0:01:03s
e



epoch 0  | loss: 1.10099 |  0:00:03s
epoch 1  | loss: 0.82752 |  0:00:05s
epoch 2  | loss: 0.71027 |  0:00:08s
epoch 3  | loss: 0.64346 |  0:00:11s
epoch 4  | loss: 0.6221  |  0:00:13s
epoch 5  | loss: 0.589   |  0:00:15s
epoch 6  | loss: 0.56873 |  0:00:19s
epoch 7  | loss: 0.56076 |  0:00:22s
epoch 8  | loss: 0.53911 |  0:00:24s
epoch 9  | loss: 0.5278  |  0:00:26s
epoch 10 | loss: 0.50902 |  0:00:29s
epoch 11 | loss: 0.50843 |  0:00:31s
epoch 12 | loss: 0.50458 |  0:00:34s
epoch 13 | loss: 0.50365 |  0:00:36s
epoch 14 | loss: 0.50452 |  0:00:38s
epoch 15 | loss: 0.51185 |  0:00:41s
epoch 16 | loss: 0.51004 |  0:00:44s
epoch 17 | loss: 0.49696 |  0:00:47s
epoch 18 | loss: 0.49103 |  0:00:49s
epoch 19 | loss: 0.49406 |  0:00:52s
epoch 20 | loss: 0.49722 |  0:00:54s
epoch 21 | loss: 0.48419 |  0:00:57s
epoch 22 | loss: 0.47546 |  0:00:59s
epoch 23 | loss: 0.47511 |  0:01:01s
epoch 24 | loss: 0.47721 |  0:01:04s
epoch 25 | loss: 0.4805  |  0:01:06s
epoch 26 | loss: 0.48848 |  0:01:08s
e



epoch 0  | loss: 1.01198 |  0:00:03s
epoch 1  | loss: 0.73262 |  0:00:06s
epoch 2  | loss: 0.64642 |  0:00:08s
epoch 3  | loss: 0.59773 |  0:00:11s
epoch 4  | loss: 0.56888 |  0:00:13s
epoch 5  | loss: 0.56854 |  0:00:15s
epoch 6  | loss: 0.54994 |  0:00:18s
epoch 7  | loss: 0.53959 |  0:00:21s
epoch 8  | loss: 0.53687 |  0:00:24s
epoch 9  | loss: 0.52077 |  0:00:29s
epoch 10 | loss: 0.50797 |  0:00:32s
epoch 11 | loss: 0.50689 |  0:00:35s
epoch 12 | loss: 0.50246 |  0:00:39s
epoch 13 | loss: 0.50133 |  0:00:42s
epoch 14 | loss: 0.50627 |  0:00:44s
epoch 15 | loss: 0.50131 |  0:00:47s
epoch 16 | loss: 0.5118  |  0:00:50s
epoch 17 | loss: 0.51375 |  0:00:52s
epoch 18 | loss: 0.52618 |  0:00:55s
epoch 19 | loss: 0.51036 |  0:00:58s
epoch 20 | loss: 0.49367 |  0:01:01s
epoch 21 | loss: 0.49533 |  0:01:04s
epoch 22 | loss: 0.49412 |  0:01:06s
epoch 23 | loss: 0.50875 |  0:01:09s
epoch 24 | loss: 0.51089 |  0:01:11s
epoch 25 | loss: 0.49139 |  0:01:14s
epoch 26 | loss: 0.49607 |  0:01:17s
e



epoch 0  | loss: 1.05808 |  0:00:04s
epoch 1  | loss: 0.76057 |  0:00:07s
epoch 2  | loss: 0.63957 |  0:00:10s
epoch 3  | loss: 0.59593 |  0:00:13s
epoch 4  | loss: 0.56737 |  0:00:17s
epoch 5  | loss: 0.55201 |  0:00:21s
epoch 6  | loss: 0.54401 |  0:00:25s
epoch 7  | loss: 0.53671 |  0:00:29s
epoch 8  | loss: 0.5301  |  0:00:33s
epoch 9  | loss: 0.53862 |  0:00:37s
epoch 10 | loss: 0.53064 |  0:00:40s
epoch 11 | loss: 0.51804 |  0:00:43s
epoch 12 | loss: 0.4992  |  0:00:46s
epoch 13 | loss: 0.50137 |  0:00:49s
epoch 14 | loss: 0.50721 |  0:00:52s
epoch 15 | loss: 0.50717 |  0:00:55s
epoch 16 | loss: 0.51596 |  0:00:58s
epoch 17 | loss: 0.49525 |  0:01:01s
epoch 18 | loss: 0.51578 |  0:01:03s
epoch 19 | loss: 0.5056  |  0:01:06s
epoch 20 | loss: 0.50621 |  0:01:08s
epoch 21 | loss: 0.54212 |  0:01:11s
epoch 22 | loss: 0.49725 |  0:01:14s
epoch 23 | loss: 0.49958 |  0:01:17s
epoch 24 | loss: 0.49579 |  0:01:20s
epoch 25 | loss: 0.49383 |  0:01:22s
epoch 26 | loss: 0.49778 |  0:01:26s
e



Classification Report for Optimized TabNet:
               precision    recall  f1-score   support

           0       0.81      0.77      0.79      1021
           1       0.78      0.83      0.81      1049

    accuracy                           0.80      2070
   macro avg       0.80      0.80      0.80      2070
weighted avg       0.80      0.80      0.80      2070

Confusion Matrix for Optimized TabNet:
 [[782 239]
 [180 869]]


['optimized_tabnet_model.pkl']
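
The class-1 row of the classification report can be reproduced directly from the confusion matrix above. The short sketch below assumes the usual scikit-learn layout, with rows as true classes and columns as predicted classes; this agrees with the support column (782 + 239 = 1021 and 180 + 869 = 1049).

```python
# Confusion matrix from the output above: rows = true class, columns = predicted class.
cm = [[782, 239],
      [180, 869]]

tn, fp = cm[0]
fn, tp = cm[1]

precision = tp / (tp + fp)                    # of predicted churners, how many churned
recall = tp / (tp + fn)                       # of actual churners, how many were found
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / (tn + fp + fn + tp)

print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f} accuracy={accuracy:.4f}")
# precision=0.78 recall=0.83 f1=0.81 accuracy=0.7976
```

The accuracy of 1651 / 2070 matches the value Optuna reported for trial 0, since both were measured on the same `X_test`.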

In [23]:
data = pd.read_csv('no_missing_values_customer_data.csv')
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7043 entries, 0 to 7042
Data columns (total 21 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   customerID        7043 non-null   object 
 1   gender            7043 non-null   object 
 2   SeniorCitizen     7043 non-null   int64  
 3   Partner           7043 non-null   object 
 4   Dependents        7043 non-null   object 
 5   tenure            7043 non-null   int64  
 6   PhoneService      7043 non-null   object 
 7   MultipleLines     7043 non-null   object 
 8   InternetService   7043 non-null   object 
 9   OnlineSecurity    7043 non-null   object 
 10  OnlineBackup      7043 non-null   object 
 11  DeviceProtection  7043 non-null   object 
 12  TechSupport       7043 non-null   object 
 13  StreamingTV       7043 non-null   object 
 14  StreamingMovies   7043 non-null   object 
 15  Contract          7043 non-null   object 
 16  PaperlessBilling  7043 non-null   object 
