# Kaggle Playground - Season 4 Episode 7
## Binary Classification of Insurance Cross Selling

Competion link - https://www.kaggle.com/competitions/playground-series-s4e7

### Steps
- Import the necessary libraries, packages and modules
- Unzip the zipped files
- Read the datsets as data framers

### Understand the problem

The objective of this competition is to predict which customers respond positively to an automobile insurance offer.

In [1]:
# Import the necessary libraries, packages and modules

import warnings
warnings.filterwarnings('ignore')

import dtale    # Use of a web progrm to analysis the data deeply
import keras_tuner as kt
import matplotlib.pyplot as plt
import numpy as np
import os
import pandas as pd
import pickle
import seaborn as sns
import statsmodels.api as sm
import tensorflow as tf
import zipfile

from keras import Sequential, Model
from keras.callbacks import EarlyStopping
from keras.layers import Dense, Dropout, BatchNormalization, Input, concatenate
from keras.metrics import AUC
from imblearn.over_sampling import RandomOverSampler
#from pandas_profiling import ProfileReport
from sklearn.linear_model import LogisticRegression, RidgeClassifier
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier, GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix, roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from tensorflow import keras
from xgboost import XGBClassifier

sns.set()
%matplotlib inline

In [2]:
train_df = pd.read_csv('train.csv')
test_df = pd.read_csv('test.csv')

train_df.head()

Unnamed: 0,id,Gender,Age,Driving_License,Region_Code,Previously_Insured,Vehicle_Age,Vehicle_Damage,Annual_Premium,Policy_Sales_Channel,Vintage,Response
0,0,Male,21,1,35.0,0,1-2 Year,Yes,65101.0,124.0,187,0
1,1,Male,43,1,28.0,0,> 2 Years,Yes,58911.0,26.0,288,1
2,2,Female,25,1,14.0,1,< 1 Year,No,38043.0,152.0,254,0
3,3,Female,35,1,1.0,0,1-2 Year,Yes,2630.0,156.0,76,0
4,4,Female,36,1,15.0,1,1-2 Year,No,31951.0,152.0,294,0


In [3]:
test_df.head()

Unnamed: 0,id,Gender,Age,Driving_License,Region_Code,Previously_Insured,Vehicle_Age,Vehicle_Damage,Annual_Premium,Policy_Sales_Channel,Vintage
0,11504798,Female,20,1,47.0,0,< 1 Year,No,2630.0,160.0,228
1,11504799,Male,47,1,28.0,0,1-2 Year,Yes,37483.0,124.0,123
2,11504800,Male,47,1,43.0,0,1-2 Year,Yes,2630.0,26.0,271
3,11504801,Female,22,1,47.0,1,< 1 Year,No,24502.0,152.0,115
4,11504802,Male,51,1,19.0,0,1-2 Year,No,34115.0,124.0,148


### Checking for incorrect datatypes

- There are no incorrect datatypes 
- The type of columns in both train and test are some
- Below are the observations
     0.   id                   - int64      - insignificant
     1.   Gender               - object     - categorical - change to numeric
     2.   Age                  - int64      - categorical - numeric
     3.   Driving_License      - int64      - categorical - numeric
     4.   Region_Code          - float64    - categorical - numeric
     5.   Previously_Insured   - int64      - categorical - numeric
     6.   Vehicle_Age          - object     - categorical - change to numeric
     7.   Vehicle_Damage       - object     - categorical - change to numeric
     8.   Annual_Premium       - float64    - numeric
     9.   Policy_Sales_Channel - float64    - not sure
     10.  Vintage              - int64      - not sure

In [4]:
train_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 11504798 entries, 0 to 11504797
Data columns (total 12 columns):
 #   Column                Dtype  
---  ------                -----  
 0   id                    int64  
 1   Gender                object 
 2   Age                   int64  
 3   Driving_License       int64  
 4   Region_Code           float64
 5   Previously_Insured    int64  
 6   Vehicle_Age           object 
 7   Vehicle_Damage        object 
 8   Annual_Premium        float64
 9   Policy_Sales_Channel  float64
 10  Vintage               int64  
 11  Response              int64  
dtypes: float64(3), int64(6), object(3)
memory usage: 1.0+ GB


In [5]:
test_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7669866 entries, 0 to 7669865
Data columns (total 11 columns):
 #   Column                Dtype  
---  ------                -----  
 0   id                    int64  
 1   Gender                object 
 2   Age                   int64  
 3   Driving_License       int64  
 4   Region_Code           float64
 5   Previously_Insured    int64  
 6   Vehicle_Age           object 
 7   Vehicle_Damage        object 
 8   Annual_Premium        float64
 9   Policy_Sales_Channel  float64
 10  Vintage               int64  
dtypes: float64(3), int64(5), object(3)
memory usage: 643.7+ MB


In [6]:
column_names = train_df.columns.tolist()

for i in column_names:
    print(i, train_df[i].nunique(), 'unique values')

id 11504798 unique values
Gender 2 unique values
Age 66 unique values
Driving_License 2 unique values
Region_Code 54 unique values
Previously_Insured 2 unique values
Vehicle_Age 3 unique values
Vehicle_Damage 2 unique values
Annual_Premium 51728 unique values
Policy_Sales_Channel 152 unique values
Vintage 290 unique values
Response 2 unique values


### Encoding categorical variables

- Columns needing encoding
    - Gender - Label encoder
    - Vehicle_Age - Mapped encoder
    - Vehicle_Damage - Label encoder
- All columns are now numeric, we can proceed with building the models

In [7]:
# Proceeding with encoding
# Label encoder on gender column

train_df['Gender'] = train_df['Gender'].astype('category')
train_df['Gender'] = train_df['Gender'].cat.codes

test_df['Gender'] = test_df['Gender'].astype('category')
test_df['Gender'] = test_df['Gender'].cat.codes

train_df.head()

Unnamed: 0,id,Gender,Age,Driving_License,Region_Code,Previously_Insured,Vehicle_Age,Vehicle_Damage,Annual_Premium,Policy_Sales_Channel,Vintage,Response
0,0,1,21,1,35.0,0,1-2 Year,Yes,65101.0,124.0,187,0
1,1,1,43,1,28.0,0,> 2 Years,Yes,58911.0,26.0,288,1
2,2,0,25,1,14.0,1,< 1 Year,No,38043.0,152.0,254,0
3,3,0,35,1,1.0,0,1-2 Year,Yes,2630.0,156.0,76,0
4,4,0,36,1,15.0,1,1-2 Year,No,31951.0,152.0,294,0


In [8]:
unique_veh_age = train_df['Vehicle_Age'].unique
print(unique_veh_age)

<bound method Series.unique of 0            1-2 Year
1           > 2 Years
2            < 1 Year
3            1-2 Year
4            1-2 Year
              ...    
11504793     1-2 Year
11504794     < 1 Year
11504795     < 1 Year
11504796     1-2 Year
11504797     < 1 Year
Name: Vehicle_Age, Length: 11504798, dtype: object>


In [9]:
# Define the mapping for encoding

veh_age_mapping = {
    '< 1 Year': 0,
    '1-2 Year': 1,
    '> 2 Years': 2
}

# Encode the 'Vehicle_Age' column

train_df['Vehicle_Age'] = train_df['Vehicle_Age'].map(veh_age_mapping)
test_df['Vehicle_Age'] = test_df['Vehicle_Age'].map(veh_age_mapping)

train_df.head()

Unnamed: 0,id,Gender,Age,Driving_License,Region_Code,Previously_Insured,Vehicle_Age,Vehicle_Damage,Annual_Premium,Policy_Sales_Channel,Vintage,Response
0,0,1,21,1,35.0,0,1,Yes,65101.0,124.0,187,0
1,1,1,43,1,28.0,0,2,Yes,58911.0,26.0,288,1
2,2,0,25,1,14.0,1,0,No,38043.0,152.0,254,0
3,3,0,35,1,1.0,0,1,Yes,2630.0,156.0,76,0
4,4,0,36,1,15.0,1,1,No,31951.0,152.0,294,0


In [10]:
# Encoding 'Vehicle_Damage' column - using label encoding

train_df['Vehicle_Damage'] = train_df['Vehicle_Damage'].astype('category')
train_df['Vehicle_Damage'] = train_df['Vehicle_Damage'].cat.codes

test_df['Vehicle_Damage'] = test_df['Vehicle_Damage'].astype('category')
test_df['Vehicle_Damage'] = test_df['Vehicle_Damage'].cat.codes

train_df.head()

Unnamed: 0,id,Gender,Age,Driving_License,Region_Code,Previously_Insured,Vehicle_Age,Vehicle_Damage,Annual_Premium,Policy_Sales_Channel,Vintage,Response
0,0,1,21,1,35.0,0,1,1,65101.0,124.0,187,0
1,1,1,43,1,28.0,0,2,1,58911.0,26.0,288,1
2,2,0,25,1,14.0,1,0,0,38043.0,152.0,254,0
3,3,0,35,1,1.0,0,1,1,2630.0,156.0,76,0
4,4,0,36,1,15.0,1,1,0,31951.0,152.0,294,0


In [11]:
train_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 11504798 entries, 0 to 11504797
Data columns (total 12 columns):
 #   Column                Dtype  
---  ------                -----  
 0   id                    int64  
 1   Gender                int8   
 2   Age                   int64  
 3   Driving_License       int64  
 4   Region_Code           float64
 5   Previously_Insured    int64  
 6   Vehicle_Age           int64  
 7   Vehicle_Damage        int8   
 8   Annual_Premium        float64
 9   Policy_Sales_Channel  float64
 10  Vintage               int64  
 11  Response              int64  
dtypes: float64(3), int64(7), int8(2)
memory usage: 899.7 MB


In [12]:
test_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7669866 entries, 0 to 7669865
Data columns (total 11 columns):
 #   Column                Dtype  
---  ------                -----  
 0   id                    int64  
 1   Gender                int8   
 2   Age                   int64  
 3   Driving_License       int64  
 4   Region_Code           float64
 5   Previously_Insured    int64  
 6   Vehicle_Age           int64  
 7   Vehicle_Damage        int8   
 8   Annual_Premium        float64
 9   Policy_Sales_Channel  float64
 10  Vintage               int64  
dtypes: float64(3), int64(6), int8(2)
memory usage: 541.3 MB


### Droping the insignificant columns

- Since id is insignificant we can drop that column from both test and train.


In [13]:
train_df = train_df.drop(['id'], axis = 1)
test_df = test_df.drop(['id'], axis = 1)

train_df.head(2)

Unnamed: 0,Gender,Age,Driving_License,Region_Code,Previously_Insured,Vehicle_Age,Vehicle_Damage,Annual_Premium,Policy_Sales_Channel,Vintage,Response
0,1,21,1,35.0,0,1,1,65101.0,124.0,187,0
1,1,43,1,28.0,0,2,1,58911.0,26.0,288,1


In [14]:
test_df.head(2)

Unnamed: 0,Gender,Age,Driving_License,Region_Code,Previously_Insured,Vehicle_Age,Vehicle_Damage,Annual_Premium,Policy_Sales_Channel,Vintage
0,0,20,1,47.0,0,0,0,2630.0,160.0,228
1,1,47,1,28.0,0,1,1,37483.0,124.0,123


### Train test split of the train df

In [15]:
# Since we have only one data set, spliting it into train and test (validation)

raw_train_df, validation_df = train_test_split(train_df, train_size = 0.75, random_state = 1, stratify = train_df['Response'])
raw_train_df.head(2)

Unnamed: 0,Gender,Age,Driving_License,Region_Code,Previously_Insured,Vehicle_Age,Vehicle_Damage,Annual_Premium,Policy_Sales_Channel,Vintage,Response
6400262,0,26,1,28.0,0,0,0,54497.0,26.0,234,0
8095698,0,25,1,30.0,1,0,0,38748.0,152.0,131,0


In [16]:
validation_df.head(2)

Unnamed: 0,Gender,Age,Driving_License,Region_Code,Previously_Insured,Vehicle_Age,Vehicle_Damage,Annual_Premium,Policy_Sales_Channel,Vintage,Response
6517611,1,44,1,28.0,0,1,1,2630.0,157.0,91,0
1591313,0,23,1,14.0,1,0,0,35345.0,152.0,272,0


In [17]:
raw_train_df.shape

(8628598, 11)

In [18]:
validation_df.shape

(2876200, 11)

In [19]:
raw_train_df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 8628598 entries, 6400262 to 8402201
Data columns (total 11 columns):
 #   Column                Dtype  
---  ------                -----  
 0   Gender                int8   
 1   Age                   int64  
 2   Driving_License       int64  
 3   Region_Code           float64
 4   Previously_Insured    int64  
 5   Vehicle_Age           int64  
 6   Vehicle_Damage        int8   
 7   Annual_Premium        float64
 8   Policy_Sales_Channel  float64
 9   Vintage               int64  
 10  Response              int64  
dtypes: float64(3), int64(6), int8(2)
memory usage: 674.8 MB


In [20]:
validation_df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 2876200 entries, 6517611 to 326523
Data columns (total 11 columns):
 #   Column                Dtype  
---  ------                -----  
 0   Gender                int8   
 1   Age                   int64  
 2   Driving_License       int64  
 3   Region_Code           float64
 4   Previously_Insured    int64  
 5   Vehicle_Age           int64  
 6   Vehicle_Damage        int8   
 7   Annual_Premium        float64
 8   Policy_Sales_Channel  float64
 9   Vintage               int64  
 10  Response              int64  
dtypes: float64(3), int64(6), int8(2)
memory usage: 224.9 MB


In [21]:
raw_train_df.describe()

Unnamed: 0,Gender,Age,Driving_License,Region_Code,Previously_Insured,Vehicle_Age,Vehicle_Damage,Annual_Premium,Policy_Sales_Channel,Vintage,Response
count,8628598.0,8628598.0,8628598.0,8628598.0,8628598.0,8628598.0,8628598.0,8628598.0,8628598.0,8628598.0,8628598.0
mean,0.5412746,38.389,0.9980113,26.41771,0.4630153,0.6032037,0.5027108,30461.89,112.4161,163.8887,0.1229973
std,0.4982935,14.99678,0.04455088,12.99227,0.4986303,0.5678678,0.4999927,16444.75,54.03797,79.97808,0.3284341
min,0.0,20.0,0.0,0.0,0.0,0.0,0.0,2630.0,1.0,10.0,0.0
25%,0.0,24.0,1.0,15.0,0.0,0.0,0.0,25279.0,29.0,99.0,0.0
50%,1.0,36.0,1.0,28.0,0.0,1.0,1.0,31826.0,151.0,166.0,0.0
75%,1.0,49.0,1.0,35.0,1.0,1.0,1.0,39454.0,152.0,232.0,0.0
max,1.0,85.0,1.0,52.0,1.0,2.0,1.0,540165.0,163.0,299.0,1.0


In [22]:
validation_df.describe()

Unnamed: 0,Gender,Age,Driving_License,Region_Code,Previously_Insured,Vehicle_Age,Vehicle_Damage,Annual_Premium,Policy_Sales_Channel,Vintage,Response
count,2876200.0,2876200.0,2876200.0,2876200.0,2876200.0,2876200.0,2876200.0,2876200.0,2876200.0,2876200.0,2876200.0
mean,0.5415802,38.36725,0.998054,26.42163,0.4629403,0.6028183,0.5025867,30459.83,112.4533,163.9249,0.1229974
std,0.4982682,14.98347,0.04407022,12.98954,0.4986248,0.5678204,0.4999934,16484.7,54.02893,79.98389,0.3284342
min,0.0,20.0,0.0,0.0,0.0,0.0,0.0,2630.0,1.0,10.0,0.0
25%,0.0,24.0,1.0,15.0,0.0,0.0,0.0,25272.0,29.0,99.0,0.0
50%,1.0,36.0,1.0,28.0,0.0,1.0,1.0,31817.0,151.0,166.0,0.0
75%,1.0,49.0,1.0,35.0,1.0,1.0,1.0,39443.0,152.0,232.0,0.0
max,1.0,85.0,1.0,52.0,1.0,2.0,1.0,540165.0,163.0,299.0,1.0


### Splitting dependent and independent variable

In [23]:
# Splitting dependent and independent variable

raw_x_train = raw_train_df.drop(['Response'], axis = 1)
raw_y_train = raw_train_df['Response']

raw_x_val = validation_df.drop(['Response'], axis = 1)
raw_y_val = validation_df['Response']

raw_x_train.head()

Unnamed: 0,Gender,Age,Driving_License,Region_Code,Previously_Insured,Vehicle_Age,Vehicle_Damage,Annual_Premium,Policy_Sales_Channel,Vintage
6400262,0,26,1,28.0,0,0,0,54497.0,26.0,234
8095698,0,25,1,30.0,1,0,0,38748.0,152.0,131
5898936,1,58,1,8.0,1,1,0,2630.0,26.0,142
3958879,0,54,1,28.0,0,1,1,46156.0,26.0,24
2335270,1,45,1,10.0,0,1,1,2630.0,124.0,257


### Standardisation of raw data 

In [24]:
# Using satandardisation technique

ssc = StandardScaler()
scaled_x_train = pd.DataFrame(ssc.fit_transform(raw_x_train))
scaled_y_train = raw_y_train
scaled_x_val = pd.DataFrame(ssc.fit_transform(raw_x_val))
scaled_y_val = raw_y_val

scaled_x_train.head(2)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9
0,-1.086257,-0.826111,0.04464,0.121787,-0.928574,-1.062226,-1.005436,1.461568,-1.599175,0.876632
1,-1.086257,-0.892792,0.04464,0.275725,1.07692,-1.062226,-1.005436,0.503876,0.732519,-0.411221


In [25]:
scaled_x_val.head(2)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9
0,0.920026,0.375931,0.044156,0.121511,-0.928434,0.699485,0.99484,-1.688222,0.824496,-0.911745
1,-1.086925,-1.025614,0.044156,-0.95628,1.077082,-1.061636,-1.005187,0.296346,0.731953,1.351211


In [26]:
raw_x_train.shape

(8628598, 10)

In [27]:
raw_inputs = raw_x_train.shape[1]
raw_inputs

10

In [28]:
scaled_inputs = scaled_x_train.shape[1]

In [None]:
# Designing the Model
scaled_model = Sequential()

scaled_model.add(Dense(input_dim = scaled_inputs, activation = 'relu', units = 128))
scaled_model.add(BatchNormalization())
scaled_model.add(Dense(activation = 'relu', units = 128))
scaled_model.add(BatchNormalization())
scaled_model.add(Dense(activation = 'relu', units = 64))
scaled_model.add(BatchNormalization())
scaled_model.add(Dense(activation = 'relu', units = 64))
scaled_model.add(BatchNormalization())
scaled_model.add(Dense(activation = 'relu', units = 32))
scaled_model.add(BatchNormalization())
scaled_model.add(Dense(activation = 'relu', units = 32))
scaled_model.add(BatchNormalization())
scaled_model.add(Dense(activation = 'sigmoid', units = 1))

scaled_model.summary()

In [None]:
# Compiling the model

scaled_model.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = [AUC (name = 'auroc')])

# Training the model

history_scaled = scaled_model.fit(scaled_x_train, scaled_y_train, validation_data = (scaled_x_val, scaled_y_val), epochs = 5)

In [None]:
# Designing the Model
scaled_model2 = Sequential()

scaled_model2.add(Dense(input_dim = scaled_inputs, activation = 'relu', units = 128))
scaled_model2.add(BatchNormalization())
scaled_model2.add(Dense(activation = 'relu', units = 128))
scaled_model2.add(BatchNormalization())
scaled_model2.add(Dense(activation = 'relu', units = 128))
scaled_model2.add(BatchNormalization())
scaled_model2.add(Dense(activation = 'relu', units = 64))
scaled_model2.add(BatchNormalization())
scaled_model2.add(Dense(activation = 'relu', units = 64))
scaled_model2.add(BatchNormalization())
scaled_model2.add(Dense(activation = 'relu', units = 64))
scaled_model2.add(BatchNormalization())
scaled_model2.add(Dense(activation = 'relu', units = 32))
scaled_model2.add(BatchNormalization())
scaled_model2.add(Dense(activation = 'relu', units = 32))
scaled_model2.add(BatchNormalization())
scaled_model2.add(Dense(activation = 'relu', units = 32))
scaled_model2.add(BatchNormalization())
scaled_model2.add(Dense(activation = 'sigmoid', units = 1))

scaled_model2.summary()

In [None]:
# Compiling the model

scaled_model2.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = [AUC (name = 'auroc')])

# Training the model

history_scaled2 = scaled_model2.fit(scaled_x_train, scaled_y_train, validation_data = (scaled_x_val, scaled_y_val), epochs = 5)

In [None]:
# Designing the Model
scaled_model3 = Sequential()

scaled_model3.add(Dense(input_dim = scaled_inputs, activation = 'relu', units = 128))
scaled_model3.add(BatchNormalization())
scaled_model3.add(Dense(activation = 'relu', units = 128))
scaled_model3.add(BatchNormalization())
scaled_model3.add(Dense(activation = 'relu', units = 64))
scaled_model3.add(BatchNormalization())
scaled_model3.add(Dense(activation = 'relu', units = 32))
scaled_model3.add(BatchNormalization())
scaled_model3.add(Dense(activation = 'sigmoid', units = 1))

scaled_model3.summary()

In [None]:
# Compiling the model

scaled_model3.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = [AUC (name = 'auroc')])

# Training the model

history_scaled3 = scaled_model3.fit(scaled_x_train, scaled_y_train, validation_data = (scaled_x_val, scaled_y_val), epochs = 5)

In [29]:
def transform_categorical_features(df):
    gender_map = {'Male': 0, 'Female': 1}
    vehicle_age_map = {'< 1 Year': 0, '1-2 Year': 1, '> 2 Years': 2}
    vehicle_damage_map = {'No': 0, 'Yes': 1}
    
    df['Gender'] = df['Gender'].map(gender_map)
    df['Vehicle_Age'] = df['Vehicle_Age'].map(vehicle_age_map)
    df['Vehicle_Damage'] = df['Vehicle_Damage'].map(vehicle_damage_map)
    
    return df

def adjust_data_types(df):
    df['Region_Code'] = df['Region_Code'].astype(int)
    df['Annual_Premium'] = df['Annual_Premium'].astype(int)
    df['Policy_Sales_Channel'] = df['Policy_Sales_Channel'].astype(int)
    
    return df

def create_additional_features(df):
    df['Prev_Insured_Annual_Premium'] = pd.factorize(df['Previously_Insured'].astype(str) + df['Annual_Premium'].astype(str))[0]
    df['Prev_Insured_Vehicle_Age'] = pd.factorize(df['Previously_Insured'].astype(str) + df['Vehicle_Age'].astype(str))[0]
    df['Prev_Insured_Vehicle_Damage'] = pd.factorize(df['Previously_Insured'].astype(str) + df['Vehicle_Damage'].astype(str))[0]
    df['Prev_Insured_Vintage'] = pd.factorize(df['Previously_Insured'].astype(str) + df['Vintage'].astype(str))[0]
    
    return df

def optimize_memory_usage(df):
    start_mem_usage = df.memory_usage().sum() / 1024 ** 2
    
    for col in df.columns:
        col_type = df[col].dtype
        
        if col_type.name in ['category', 'object']:
            raise ValueError(f"Column '{col}' is of type '{col_type.name}'")

        c_min = df[col].min()
        c_max = df[col].max()
        
        if str(col_type)[:3] == 'int':
            
            if c_min > np.iinfo(np.int8).min and c_max < np.iinfo(np.int8).max:
                df[col] = df[col].astype(np.int8)
                
            elif c_min > np.iinfo(np.int16).min and c_max < np.iinfo(np.int16).max:
                df[col] = df[col].astype(np.int16)
                
            elif c_min > np.iinfo(np.int32).min and c_max < np.iinfo(np.int32).max:
                df[col] = df[col].astype(np.int32)
                
            elif c_min > np.iinfo(np.int64).min and c_max < np.iinfo(np.int64).max:
                df[col] = df[col].astype(np.int64)
        
        else:
        
            if c_min > np.finfo(np.float16).min and c_max < np.finfo(np.float16).max:
                df[col] = df[col].astype(np.float16)
            
            elif c_min > np.finfo(np.float32).min and c_max < np.finfo(np.float32).max:
                df[col] = df[col].astype(np.float32)
            
            else:
                df[col] = df[col].astype(np.float64)

    end_mem_usage = df.memory_usage().sum() / 1024**2
    print(f'------ Memory usage before: {start_mem_usage:.2f} MB')
    print(f'------ Memory usage after: {end_mem_usage:.2f} MB')
    print(f'------ Reduced memory usage by {(100 * (start_mem_usage - end_mem_usage) / start_mem_usage):.1f}%')
    print('**********************' * 5)

    return df

def apply_scaling(df, scaler_type, columns):

    if scaler_type == 'S':
        scaler = StandardScaler() 
    
    elif scaler_type == 'M':
        scaler = MinMaxScaler()  
    
    elif scaler_type == 'R':
        scaler = RobustScaler()  
    
    elif scaler_type == 'A':
        scaler = MaxAbsScaler() 
    
    elif scaler_type == 'Q':
        scaler = QuantileTransformer(output_distribution='normal') 
    
    elif scaler_type == 'P':
        scaler = PowerTransformer() 
    
    else:
        raise ValueError("Invalid scaler type. Choose 'S' for StandardScaler, 'M' for MinMaxScaler, 'R' for RobustScaler, 'A' for MaxAbsScaler,'Q' for QuantileTransformer, or 'P' for PowerTransformer.")

    scaled_data = df.copy()

    for col in columns:
        scaled_data[col] = scaler.fit_transform(scaled_data[[col]])

    return scaled_data


In [30]:
featured_train_df = transform_categorical_features(train_df)
featured_validation_df = transform_categorical_features(validation_df)
featured_test_df = transform_categorical_features(test_df)

featured_train_df = adjust_data_types(featured_train_df)
featured_validation_df = adjust_data_types(featured_validation_df)
featured_test_df = adjust_data_types(featured_test_df)

featured_train_df = create_additional_features(featured_train_df)
featured_validation_df = create_additional_features(featured_validation_df)
featured_test_df = create_additional_features(featured_test_df)

featured_train_df = optimize_memory_usage(featured_train_df)
featured_validation_df = optimize_memory_usage(featured_validation_df)
featured_test_df = optimize_memory_usage(featured_test_df)

------ Memory usage before: 1184.96 MB
------ Memory usage after: 493.73 MB
------ Reduced memory usage by 58.3%
**************************************************************************************************************
------ Memory usage before: 318.18 MB
------ Memory usage after: 145.38 MB
------ Reduced memory usage by 54.3%
**************************************************************************************************************
------ Memory usage before: 731.46 MB
------ Memory usage after: 321.84 MB
------ Reduced memory usage by 56.0%
**************************************************************************************************************


In [31]:
scaler_type = 'S'
columns_to_scale_xgb = featured_test_df.columns.tolist()

# Trying with Standard Scaler

featured_train_df = apply_scaling(featured_train_df, scaler_type, columns_to_scale_xgb)
featured_validation_df = apply_scaling(featured_validation_df, scaler_type, columns_to_scale_xgb)
featured_test_df = apply_scaling(featured_test_df, scaler_type, columns_to_scale_xgb)

#logger.info(f"Data scaling completed. Time elapsed: {time.time() - start_time:.2f} seconds")

featured_train_df.head(2)

Unnamed: 0,Gender,Age,Driving_License,Region_Code,Previously_Insured,Vehicle_Age,Vehicle_Damage,Annual_Premium,Policy_Sales_Channel,Vintage,Response,Prev_Insured_Annual_Premium,Prev_Insured_Vehicle_Age,Prev_Insured_Vehicle_Damage,Prev_Insured_Vintage
0,,-1.15941,0.044519,0.660528,-0.928539,,,2.105145,0.214202,0.288852,0,-0.929348,-0.928539,-0.928539,-1.62432
1,,0.307897,0.044519,0.121718,-0.928539,,,1.728962,-1.599414,1.551675,1,-0.929292,-0.928539,-0.928539,-1.618162


In [32]:
# Splitting dependent and independent variable

featured_x_train = featured_train_df.drop(['Response'], axis = 1)
featured_y_train = featured_train_df['Response']

featured_x_val = featured_validation_df.drop(['Response'], axis = 1)
featured_y_val = featured_validation_df['Response']

featured_x_train.head()

Unnamed: 0,Gender,Age,Driving_License,Region_Code,Previously_Insured,Vehicle_Age,Vehicle_Damage,Annual_Premium,Policy_Sales_Channel,Vintage,Prev_Insured_Annual_Premium,Prev_Insured_Vehicle_Age,Prev_Insured_Vehicle_Damage,Prev_Insured_Vintage
0,,-1.15941,0.044519,0.660528,-0.928539,,,2.105145,0.214202,0.288852,-0.929348,-0.928539,-0.928539,-1.62432
1,,0.307897,0.044519,0.121718,-0.928539,,,1.728962,-1.599414,1.551675,-0.929292,-0.928539,-0.928539,-1.618162
2,,-0.892627,0.044519,-0.955902,1.07696,,,0.460756,0.732378,1.126566,-0.929236,1.07696,1.07696,-1.612004
3,,-0.225669,0.044519,-1.95655,-0.928539,,,-1.691389,0.806403,-1.099003,-0.929181,-0.928539,-0.928539,-1.605846
4,,-0.158974,0.044519,-0.878929,1.07696,,,0.090529,0.732378,1.626694,-0.929125,1.07696,1.07696,-1.599688


In [33]:
scaled_inputs1 = featured_x_train.shape[1]

In [None]:
# Designing the Model

scaled_model4 = Sequential()

scaled_model4.add(Dense(input_dim = scaled_inputs1, activation = 'relu', units = 128))
scaled_model4.add(BatchNormalization())
scaled_model4.add(Dense(activation = 'relu', units = 128))
scaled_model4.add(BatchNormalization())
scaled_model4.add(Dense(activation = 'relu', units = 64))
scaled_model4.add(BatchNormalization())
scaled_model4.add(Dense(activation = 'relu', units = 64))
scaled_model4.add(BatchNormalization())
scaled_model4.add(Dense(activation = 'relu', units = 32))
scaled_model4.add(BatchNormalization())
scaled_model4.add(Dense(activation = 'relu', units = 32))
scaled_model4.add(BatchNormalization())
scaled_model4.add(Dense(activation = 'sigmoid', units = 1))

scaled_model4.summary()

In [None]:
# Compiling the model

scaled_model4.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = [AUC (name = 'auroc')])

# Training the model

history_scaled4 = scaled_model4.fit(featured_x_train, featured_y_train, 
                                   validation_data = (featured_x_val, featured_y_val), epochs = 5)

In [None]:
# Designing the Model

scaled_model5 = Sequential()

scaled_model5.add(Dense(input_dim = scaled_inputs1, activation = 'relu', units = 128))
scaled_model5.add(BatchNormalization())
scaled_model5.add(Dense(activation = 'relu', units = 128))
scaled_model5.add(BatchNormalization())
scaled_model5.add(Dense(activation = 'relu', units = 128))
scaled_model5.add(BatchNormalization())
scaled_model5.add(Dense(activation = 'relu', units = 64))
scaled_model5.add(BatchNormalization())
scaled_model5.add(Dense(activation = 'relu', units = 64))
scaled_model5.add(BatchNormalization())
scaled_model5.add(Dense(activation = 'relu', units = 64))
scaled_model5.add(BatchNormalization())
scaled_model5.add(Dense(activation = 'relu', units = 32))
scaled_model5.add(BatchNormalization())
scaled_model5.add(Dense(activation = 'relu', units = 32))
scaled_model5.add(BatchNormalization())
scaled_model5.add(Dense(activation = 'relu', units = 32))
scaled_model5.add(BatchNormalization())
scaled_model5.add(Dense(activation = 'sigmoid', units = 1))

scaled_model5.summary()

In [None]:
# Compiling the model

scaled_model5.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = [AUC (name = 'auroc')])

# Training the model

history_scaled5 = scaled_model5.fit(featured_x_train, featured_y_train, 
                                   validation_data = (featured_x_val, featured_y_val), epochs = 5)

In [None]:
# Designing the Model

scaled_model6 = Sequential()

scaled_model6.add(Dense(input_dim = scaled_inputs1, activation = 'relu', units = 128))
scaled_model6.add(BatchNormalization())
scaled_model6.add(Dense(activation = 'relu', units = 128))
scaled_model6.add(BatchNormalization())
scaled_model6.add(Dense(activation = 'relu', units = 64))
scaled_model6.add(BatchNormalization())
scaled_model6.add(Dense(activation = 'relu', units = 32))
scaled_model6.add(BatchNormalization())
scaled_model6.add(Dense(activation = 'sigmoid', units = 1))

scaled_model6.summary()

In [None]:
# Compiling the model

scaled_model6.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = [AUC (name = 'auroc')])

# Training the model

history_scaled6 = scaled_model6.fit(featured_x_train, featured_y_train, 
                                   validation_data = (featured_x_val, featured_y_val), epochs = 5)

In [34]:
adjusted_train_df = transform_categorical_features(train_df)
adjusted_validation_df = transform_categorical_features(validation_df)
adjusted_test_df = transform_categorical_features(test_df)

adjusted_train_df = adjust_data_types(adjusted_train_df)
adjusted_validation_df = adjust_data_types(adjusted_validation_df)
adjusted_test_df = adjust_data_types(adjusted_test_df)

In [35]:
scaler_type = 'S'
columns_to_scale_xgb = adjusted_test_df.columns.tolist()

# Trying with Standard Scaler

adjusted_train_df = apply_scaling(adjusted_train_df, scaler_type, columns_to_scale_xgb)
adjusted_validation_df = apply_scaling(adjusted_validation_df, scaler_type, columns_to_scale_xgb)
adjusted_test_df = apply_scaling(adjusted_test_df, scaler_type, columns_to_scale_xgb)

# logger.info(f"Data scaling completed. Time elapsed: {time.time() - start_time:.2f} seconds")

adjusted_train_df.head(2)

Unnamed: 0,Gender,Age,Driving_License,Region_Code,Previously_Insured,Vehicle_Age,Vehicle_Damage,Annual_Premium,Policy_Sales_Channel,Vintage,Response,Prev_Insured_Annual_Premium,Prev_Insured_Vehicle_Age,Prev_Insured_Vehicle_Damage,Prev_Insured_Vintage
0,,-1.15941,0.044519,0.660528,-0.928539,,,2.105145,0.214202,0.288852,0,-0.929348,-0.928539,-0.928539,-1.62432
1,,0.307897,0.044519,0.121718,-0.928539,,,1.728962,-1.599414,1.551675,1,-0.929292,-0.928539,-0.928539,-1.618162


In [36]:
# Splitting dependent and independent variable

adjusted_x_train = adjusted_train_df.drop(['Response'], axis = 1)
adjusted_y_train = adjusted_train_df['Response']

adjusted_x_val = adjusted_validation_df.drop(['Response'], axis = 1)
adjusted_y_val = adjusted_validation_df['Response']

adjusted_x_train.head()

Unnamed: 0,Gender,Age,Driving_License,Region_Code,Previously_Insured,Vehicle_Age,Vehicle_Damage,Annual_Premium,Policy_Sales_Channel,Vintage,Prev_Insured_Annual_Premium,Prev_Insured_Vehicle_Age,Prev_Insured_Vehicle_Damage,Prev_Insured_Vintage
0,,-1.15941,0.044519,0.660528,-0.928539,,,2.105145,0.214202,0.288852,-0.929348,-0.928539,-0.928539,-1.62432
1,,0.307897,0.044519,0.121718,-0.928539,,,1.728962,-1.599414,1.551675,-0.929292,-0.928539,-0.928539,-1.618162
2,,-0.892627,0.044519,-0.955902,1.07696,,,0.460756,0.732378,1.126566,-0.929236,1.07696,1.07696,-1.612004
3,,-0.225669,0.044519,-1.95655,-0.928539,,,-1.691389,0.806403,-1.099003,-0.929181,-0.928539,-0.928539,-1.605846
4,,-0.158974,0.044519,-0.878929,1.07696,,,0.090529,0.732378,1.626694,-0.929125,1.07696,1.07696,-1.599688


In [37]:
scaled_inputs2 = adjusted_x_train.shape[1]

In [None]:
# Designing the Model

scaled_model7 = Sequential()

scaled_model7.add(Dense(input_dim = scaled_inputs2, activation = 'relu', units = 128))
scaled_model7.add(BatchNormalization())
scaled_model7.add(Dense(activation = 'relu', units = 128))
scaled_model7.add(BatchNormalization())
scaled_model7.add(Dense(activation = 'relu', units = 64))
scaled_model7.add(BatchNormalization())
scaled_model7.add(Dense(activation = 'relu', units = 64))
scaled_model7.add(BatchNormalization())
scaled_model7.add(Dense(activation = 'relu', units = 32))
scaled_model7.add(BatchNormalization())
scaled_model7.add(Dense(activation = 'relu', units = 32))
scaled_model7.add(BatchNormalization())
scaled_model7.add(Dense(activation = 'sigmoid', units = 1))

scaled_model7.summary()

In [None]:
# Compiling the model

scaled_model7.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = [AUC (name = 'auroc')])

# Training the model

history_scaled7 = scaled_model7.fit(adjusted_x_train, adjusted_y_train, 
                                   validation_data = (adjusted_x_val, adjusted_y_val), epochs = 5)

In [None]:
# Designing the Model

scaled_model8 = Sequential()

scaled_model8.add(Dense(input_dim = scaled_inputs2, activation = 'relu', units = 128))
scaled_model8.add(BatchNormalization())
scaled_model8.add(Dense(activation = 'relu', units = 128))
scaled_model8.add(BatchNormalization())
scaled_model8.add(Dense(activation = 'relu', units = 128))
scaled_model8.add(BatchNormalization())
scaled_model8.add(Dense(activation = 'relu', units = 64))
scaled_model8.add(BatchNormalization())
scaled_model8.add(Dense(activation = 'relu', units = 64))
scaled_model8.add(BatchNormalization())
scaled_model8.add(Dense(activation = 'relu', units = 64))
scaled_model8.add(BatchNormalization())
scaled_model8.add(Dense(activation = 'relu', units = 32))
scaled_model8.add(BatchNormalization())
scaled_model8.add(Dense(activation = 'relu', units = 32))
scaled_model8.add(BatchNormalization())
scaled_model8.add(Dense(activation = 'relu', units = 32))
scaled_model8.add(BatchNormalization())
scaled_model8.add(Dense(activation = 'sigmoid', units = 1))

scaled_model8.summary()

In [None]:
# Compiling the model

scaled_model8.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = [AUC (name = 'auroc')])

# Training the model

history_scaled8 = scaled_model8.fit(adjusted_x_train, adjusted_y_train, 
                                   validation_data = (adjusted_x_val, adjusted_y_val), epochs = 2)

In [None]:
# Designing the Model

scaled_model9 = Sequential()

scaled_model9.add(Dense(input_dim = scaled_inputs2, activation = 'relu', units = 128))
scaled_model9.add(BatchNormalization())
scaled_model9.add(Dense(activation = 'relu', units = 128))
scaled_model9.add(BatchNormalization())
scaled_model9.add(Dense(activation = 'relu', units = 64))
scaled_model9.add(BatchNormalization())
scaled_model9.add(Dense(activation = 'relu', units = 32))
scaled_model9.add(BatchNormalization())
scaled_model9.add(Dense(activation = 'sigmoid', units = 1))

scaled_model9.summary()

In [None]:
# Compiling the model

scaled_model9.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = [AUC (name = 'auroc')])

# Training the model

history_scaled9 = scaled_model9.fit(adjusted_x_train, adjusted_y_train, 
                                   validation_data = (adjusted_x_val, adjusted_y_val), epochs = 2)

In [38]:
optimised_train_df = transform_categorical_features(train_df)
optimised_validation_df = transform_categorical_features(validation_df)
optimised_test_df = transform_categorical_features(test_df)

optimised_train_df = adjust_data_types(optimised_train_df)
optimised_validation_df = adjust_data_types(optimised_validation_df)
optimised_test_df = adjust_data_types(optimised_test_df)

optimised_train_df = optimize_memory_usage(optimised_train_df)
optimised_validation_df = optimize_memory_usage(optimised_validation_df)
optimised_test_df = optimize_memory_usage(optimised_test_df)

------ Memory usage before: 548.59 MB
------ Memory usage after: 493.73 MB
------ Reduced memory usage by 10.0%
**************************************************************************************************************
------ Memory usage before: 159.09 MB
------ Memory usage after: 145.38 MB
------ Reduced memory usage by 8.6%
**************************************************************************************************************
------ Memory usage before: 358.41 MB
------ Memory usage after: 321.84 MB
------ Reduced memory usage by 10.2%
**************************************************************************************************************


In [39]:
scaler_type = 'S'
columns_to_scale_xgb = optimised_test_df.columns.tolist()

# Trying with Standard Scaler

optimised_train_df = apply_scaling(optimised_train_df, scaler_type, columns_to_scale_xgb)
optimised_validation_df = apply_scaling(optimised_validation_df, scaler_type, columns_to_scale_xgb)
optimised_test_df = apply_scaling(optimised_test_df, scaler_type, columns_to_scale_xgb)

# logger.info(f"Data scaling completed. Time elapsed: {time.time() - start_time:.2f} seconds")

optimised_train_df.head(2)

Unnamed: 0,Gender,Age,Driving_License,Region_Code,Previously_Insured,Vehicle_Age,Vehicle_Damage,Annual_Premium,Policy_Sales_Channel,Vintage,Response,Prev_Insured_Annual_Premium,Prev_Insured_Vehicle_Age,Prev_Insured_Vehicle_Damage,Prev_Insured_Vintage
0,,-1.15941,0.044519,0.660528,-0.928539,,,2.105145,0.214202,0.288852,0,-0.929348,-0.928539,-0.928539,-1.62432
1,,0.307897,0.044519,0.121718,-0.928539,,,1.728962,-1.599414,1.551675,1,-0.929292,-0.928539,-0.928539,-1.618162


In [40]:
# Splitting dependent and independent variable

optimised_x_train = optimised_train_df.drop(['Response'], axis = 1)
optimised_y_train = optimised_train_df['Response']

optimised_x_val = optimised_validation_df.drop(['Response'], axis = 1)
optimised_y_val = optimised_validation_df['Response']

optimised_x_train.head()

Unnamed: 0,Gender,Age,Driving_License,Region_Code,Previously_Insured,Vehicle_Age,Vehicle_Damage,Annual_Premium,Policy_Sales_Channel,Vintage,Prev_Insured_Annual_Premium,Prev_Insured_Vehicle_Age,Prev_Insured_Vehicle_Damage,Prev_Insured_Vintage
0,,-1.15941,0.044519,0.660528,-0.928539,,,2.105145,0.214202,0.288852,-0.929348,-0.928539,-0.928539,-1.62432
1,,0.307897,0.044519,0.121718,-0.928539,,,1.728962,-1.599414,1.551675,-0.929292,-0.928539,-0.928539,-1.618162
2,,-0.892627,0.044519,-0.955902,1.07696,,,0.460756,0.732378,1.126566,-0.929236,1.07696,1.07696,-1.612004
3,,-0.225669,0.044519,-1.95655,-0.928539,,,-1.691389,0.806403,-1.099003,-0.929181,-0.928539,-0.928539,-1.605846
4,,-0.158974,0.044519,-0.878929,1.07696,,,0.090529,0.732378,1.626694,-0.929125,1.07696,1.07696,-1.599688


In [41]:
scaled_inputs3 = optimised_x_train.shape[1]

In [42]:
# Designing the Model

scaled_model10 = Sequential()

scaled_model10.add(Dense(input_dim = scaled_inputs3, activation = 'relu', units = 128))
scaled_model10.add(BatchNormalization())
scaled_model10.add(Dense(activation = 'relu', units = 128))
scaled_model10.add(BatchNormalization())
scaled_model10.add(Dense(activation = 'relu', units = 64))
scaled_model10.add(BatchNormalization())
scaled_model10.add(Dense(activation = 'relu', units = 64))
scaled_model10.add(BatchNormalization())
scaled_model10.add(Dense(activation = 'relu', units = 32))
scaled_model10.add(BatchNormalization())
scaled_model10.add(Dense(activation = 'relu', units = 32))
scaled_model10.add(BatchNormalization())
scaled_model10.add(Dense(activation = 'sigmoid', units = 1))

scaled_model10.summary()

In [43]:
# Compiling the model

scaled_model10.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = [AUC (name = 'auroc')])

# Training the model

history_scaled10 = scaled_model10.fit(optimised_x_train, optimised_y_train, 
                                   validation_data = (optimised_x_val, optimised_y_val), epochs = 2)

Epoch 1/2
[1m 28995/359525[0m [32m━[0m[37m━━━━━━━━━━━━━━━━━━━[0m [1m14:30[0m 3ms/step - auroc: 0.5007 - loss: 0.3763

KeyboardInterrupt: 

In [44]:
# Designing the Model

scaled_model11 = Sequential()

scaled_model11.add(Dense(input_dim = scaled_inputs3, activation = 'relu', units = 128))
scaled_model11.add(BatchNormalization())
scaled_model11.add(Dense(activation = 'relu', units = 128))
scaled_model11.add(BatchNormalization())
scaled_model11.add(Dense(activation = 'relu', units = 128))
scaled_model11.add(BatchNormalization())
scaled_model11.add(Dense(activation = 'relu', units = 64))
scaled_model11.add(BatchNormalization())
scaled_model11.add(Dense(activation = 'relu', units = 64))
scaled_model11.add(BatchNormalization())
scaled_model11.add(Dense(activation = 'relu', units = 64))
scaled_model11.add(BatchNormalization())
scaled_model11.add(Dense(activation = 'relu', units = 32))
scaled_model11.add(BatchNormalization())
scaled_model11.add(Dense(activation = 'relu', units = 32))
scaled_model11.add(BatchNormalization())
scaled_model11.add(Dense(activation = 'relu', units = 32))
scaled_model11.add(BatchNormalization())
scaled_model11.add(Dense(activation = 'sigmoid', units = 1))

scaled_model11.summary()

In [45]:
# Compiling the model

scaled_model11.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = [AUC (name = 'auroc')])

# Training the model

history_scaled11 = scaled_model11.fit(optimised_x_train, optimised_y_train, 
                                   validation_data = (optimised_x_val, optimised_y_val), epochs = 2)

Epoch 1/2
[1m 20940/359525[0m [32m━[0m[37m━━━━━━━━━━━━━━━━━━━[0m [1m19:49[0m 4ms/step - auroc: 0.4988 - loss: 0.3800

KeyboardInterrupt: 

In [46]:
# Designing the Model

scaled_model12 = Sequential()

scaled_model12.add(Dense(input_dim = scaled_inputs3, activation = 'relu', units = 128))
scaled_model12.add(BatchNormalization())
scaled_model12.add(Dense(activation = 'relu', units = 128))
scaled_model12.add(BatchNormalization())
scaled_model12.add(Dense(activation = 'relu', units = 64))
scaled_model12.add(BatchNormalization())
scaled_model12.add(Dense(activation = 'relu', units = 32))
scaled_model12.add(BatchNormalization())
scaled_model12.add(Dense(activation = 'sigmoid', units = 1))

scaled_model12.summary()

In [47]:
# Compiling the model

scaled_model12.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = [AUC (name = 'auroc')])

# Training the model

history_scaled12 = scaled_model12.fit(optimised_x_train, optimised_y_train, 
                                   validation_data = (optimised_x_val, optimised_y_val), epochs = 2)

Epoch 1/2
[1m 57902/359525[0m [32m━━━[0m[37m━━━━━━━━━━━━━━━━━[0m [1m10:04[0m 2ms/step - auroc: 0.5008 - loss: 0.3754

KeyboardInterrupt: 