The following figure shows a typical deep learning model [1]. The convolutional part extracts features from the input, and the dense layers perform the classification. In our case the features have already been extracted, so our model consists only of dense layers.
<img src="cnn.jpg" width="400" height="50">

We import the `pandas` and `numpy` libraries to load the data from the xlsx file into a data frame and extract features from it.

In [1]:
import pandas as pd
import numpy as np

Next, we load the data into a data frame `df` and display the first few rows.

In [2]:
df = pd.read_excel("IT_3.xlsx")
df.head()

Unnamed: 0,ID,target,Gender,EngineHP,credit_history,Years_Experience,annual_claims,Marital_Status,Vehical_type,Miles_driven_annually,size_of_family,Age_bucket,EngineHP_bucket,Years_Experience_bucket,Miles_driven_annually_bucket,credit_history_bucket,State
0,1,1,F,522,656,1,0,Married,Car,14749.0,5,<18,>350,<3,<15k,Fair,IL
1,2,1,F,691,704,16,0,Married,Car,15389.0,6,28-34,>350,15-30,15k-25k,Good,NJ
2,3,1,M,133,691,15,0,Married,Van,9956.0,3,>40,90-160,15-30,<15k,Good,CT
3,4,1,M,146,720,9,0,Married,Van,77323.0,3,18-27,90-160,9-14',>25k,Good,CT
4,5,1,M,128,771,33,1,Married,Van,14183.0,4,>40,90-160,>30,<15k,Very Good,WY


In [None]:
df.shape

In [None]:
df.columns

We do not need the `ID` column, so we drop it.

In [3]:
df = df.drop(['ID'], axis=1)

In [None]:
df.columns

We select only the columns that will be used as features.

In [64]:
features = df[['Vehical_type',  'EngineHP_bucket', 'Years_Experience_bucket', 'Miles_driven_annually_bucket', 'credit_history_bucket', 'State']]

In [None]:
features.columns

Next, we one-hot encode the features with the `pandas` function `get_dummies`. Every distinct value in a categorical column becomes its own column, and each row holds 1 in the column that matches its original value and 0 in the others.

| ID | car | mileage |
| --- | --- | --- |
| 1 | Mercedes | 2000 |
| 2 | Mercedes | 50000 |
| 3 | Honda | 3000 |
| 4 | VW | 50 |

The table above is turned into the following table.

| ID | car_Mercedes | car_Honda | car_VW | mileage |
| --- | --- | --- | --- | --- |
| 1 | 1 | 0 | 0 | 2000 |
| 2 | 1 | 0 | 0 | 50000 |
| 3 | 0 | 1 | 0 | 3000 |
| 4 | 0 | 0 | 1 | 50 |
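
As a small, self-contained illustration, the toy table above can be encoded as follows. The data frame `cars` and its column names are made up for this example and are not part of the insurance data set.

```python
import pandas as pd

# Toy table from the text; names and values are illustrative only.
cars = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "car": ["Mercedes", "Mercedes", "Honda", "VW"],
    "mileage": [2000, 50000, 3000, 50],
})

# One-hot encode only the categorical column. dtype=int forces 0/1 values,
# because newer pandas versions return booleans by default.
encoded = pd.get_dummies(cars, columns=["car"], dtype=int)
print(encoded)
```

pandas names each new column `<column>_<value>`, places the new columns after the untouched ones, and sorts them by category value, so the column order can differ from a hand-drawn table.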

In [65]:
features = pd.get_dummies(features)

In [None]:
features.columns

In [66]:
features = features.to_numpy(copy = True)
features

array([[1, 0, 0, ..., 0, 0, 0],
       [1, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 1, 0, ..., 0, 0, 0],
       [0, 1, 0, ..., 0, 0, 0],
       [0, 1, 0, ..., 0, 0, 0]], dtype=uint8)

We use the `target` column as the labels.

In [7]:
labels = df[["target"]]
labels = labels.to_numpy(copy = True)
labels

array([[1],
       [1],
       [1],
       ...,
       [0],
       [1],
       [1]], dtype=int64)

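We use the `train_test_split` function from the `scikit-learn` library to split the data into one part for training the model and one part for testing the trained model. With `test_size = 0.25`, 25% of the rows are held out for testing.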

In [67]:
from sklearn.model_selection import train_test_split
train_features, test_features, train_labels, test_labels = train_test_split(features, labels, test_size = 0.25, random_state = 42)
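
To make the idea of the split concrete, here is a rough NumPy-only sketch of a shuffled 75/25 split. It is not scikit-learn's implementation; the arrays `X` and `y` are hypothetical stand-ins for the real features and labels.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed so the split is repeatable

# Hypothetical stand-ins: row i of X is [2i, 2i+1], and its label is i % 2.
X = np.arange(40).reshape(20, 2)
y = (np.arange(20) % 2).reshape(-1, 1)

test_size = 0.25
n_test = int(np.ceil(test_size * len(X)))  # 5 of the 20 rows are held out

# Shuffle the row indices once, then slice them, so that features and
# labels are reordered in exactly the same way.
order = rng.permutation(len(X))
test_idx, train_idx = order[:n_test], order[n_test:]

X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]
print(X_train.shape, X_test.shape)  # (15, 2) (5, 2)
```

The important property is that every row lands in exactly one of the two parts and that each label stays attached to its feature row.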

We import the classes needed to define the deep learning model and its layers.

In [9]:
# Use the public tensorflow.keras API; tensorflow.python.* is a private module path.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

In [None]:
len(features[0])

In [None]:
len(features)

We create a model with a few dense and dropout layers. The hidden dense layers use the `relu` activation function, and the last layer uses `sigmoid`, which produces a single value between 0 and 1 for the binary target. The first dense layer also defines the input, with an input dimension equal to the number of feature values per row. We print the model summary after defining the model.
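
For reference, the two activation functions named above can be written in a few lines of NumPy. This is only an illustration of the formulas, not the Keras implementation.

```python
import numpy as np

def relu(z):
    # Keeps positive values and replaces negative values with 0.
    return np.maximum(0.0, z)

def sigmoid(z):
    # Squashes any real number into the open interval (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-2.0, 0.0, 3.0])
print(relu(z))     # [0. 0. 3.]
print(sigmoid(z))  # all values lie between 0 and 1; sigmoid(0) is 0.5
```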

In [68]:
model = Sequential()
model.add(Dense(100, activation="relu", input_dim = len(features[0])))
model.add(Dropout(0.2))
model.add(Dense(70, activation="relu"))
model.add(Dropout(0.4))
model.add(Dense(100, activation="relu"))
model.add(Dropout(0.5))
model.add(Dense(20, activation="relu"))
model.add(Dropout(0.2))
model.add(Dense(1, activation="sigmoid"))
model.summary()

Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_20 (Dense)             (None, 100)               7200      
_________________________________________________________________
dropout_16 (Dropout)         (None, 100)               0         
_________________________________________________________________
dense_21 (Dense)             (None, 70)                7070      
_________________________________________________________________
dropout_17 (Dropout)         (None, 70)                0         
_________________________________________________________________
dense_22 (Dense)             (None, 100)               7100      
_________________________________________________________________
dropout_18 (Dropout)         (None, 100)               0         
_________________________________________________________________
dense_23 (Dense)             (None, 20)               
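
The `Param #` column can be checked by hand: a dense layer has one weight per input-unit pair plus one bias per unit, and a dropout layer has no parameters. The sketch below assumes 71 input features, which is inferred from the first row of the summary (71 × 100 + 100 = 7200); the helper `dense_params` is introduced here only for illustration.

```python
def dense_params(n_inputs, n_units):
    # One weight per (input, unit) pair plus one bias per unit.
    return n_inputs * n_units + n_units

# Layer widths of the model defined above; the leading 71 is the inferred
# number of one-hot encoded input features.
layer_sizes = [71, 100, 70, 100, 20, 1]

counts = [dense_params(n_in, n_out)
          for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
print(counts)       # [7200, 7070, 7100, 2020, 21]
print(sum(counts))  # 23411
```

The first three values match the rows shown in the summary; the last two are computed from the layer definitions.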

To train the model we use the `adam` optimizer, `binary_crossentropy` as the loss function, and `accuracy` as the metric. During training the optimizer adjusts the weights and biases to minimize the loss. The accuracy is only reported so that we can monitor it; it is not optimized directly.

In [69]:
model.compile(optimizer = "adam", loss = "binary_crossentropy", metrics = ["accuracy"] )
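
To show what these two numbers measure, here is a minimal NumPy sketch of binary cross-entropy and accuracy on four made-up predictions. The clipping constant `eps` and the 0.5 threshold are assumptions chosen for this illustration, and the function names are not Keras APIs.

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    # Clip the probabilities so that log(0) can never occur.
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    losses = y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob)
    return float(-np.mean(losses))

def accuracy(y_true, y_prob, threshold=0.5):
    # A probability above the threshold counts as a prediction of class 1.
    y_pred = (y_prob > threshold).astype(int)
    return float(np.mean(y_pred == y_true))

y_true = np.array([1, 0, 1, 0])          # made-up labels
y_prob = np.array([0.9, 0.1, 0.8, 0.4])  # made-up sigmoid outputs

print(binary_crossentropy(y_true, y_prob))
print(accuracy(y_true, y_prob))  # 1.0
```

All four predictions fall on the correct side of 0.5, so the accuracy is 1.0, yet the loss is still above zero because the less confident predictions (0.8 and 0.4) are penalized. This is why the loss curve is usually more informative than the accuracy curve.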

We have defined the model and configured how it is compiled. Now we run the actual training with the training data. It is good practice to set aside a portion of the training data for validation, so that we can see whether the model is underfitting or overfitting. To store the logs and to plot the loss and accuracy curves of training and validation, we first set up a `TensorBoard` callback. Then we call the `fit` function to run the training.

In [70]:
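# Import from tensorflow.keras so that the callback matches the model's Keras implementation.
from tensorflow.keras.callbacks import TensorBoard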
from time import time
log_dir='logs/{}'.format(time())
tensorboard = TensorBoard(log_dir=log_dir)

In [71]:
# batch_size is the number of samples processed per weight update
model.fit(train_features, train_labels, batch_size=50, epochs=50, validation_split=0.2, shuffle=True, callbacks=[tensorboard])

Epoch 1/50
Epoch 2/50
...
Epoch 50/50


<tensorflow.python.keras.callbacks.History at 0x2029300b820>
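
The effect of `validation_split` and `batch_size` on a single epoch can be estimated with a little arithmetic. Keras takes the validation samples from the end of the arrays passed to `fit`, before any shuffling. The sketch below is an approximation of that bookkeeping, and the training-set size of 1000 rows is a hypothetical number, not the size of this data set.

```python
import math

def fit_plan(n_train, validation_split, batch_size):
    # The last `validation_split` fraction of the rows is held out for validation.
    split_at = int(n_train * (1.0 - validation_split))
    n_fit = split_at
    n_val = n_train - split_at
    # Each epoch processes the remaining rows in batches; the last batch may be smaller.
    steps_per_epoch = math.ceil(n_fit / batch_size)
    return n_fit, n_val, steps_per_epoch

# 1000 training rows is a made-up figure for illustration.
print(fit_plan(1000, 0.2, 50))  # (800, 200, 16)
```

With these numbers the model would see 800 rows per epoch in 16 weight updates, while the remaining 200 rows are used only to compute the validation loss and accuracy.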

Now we plot the loss and accuracy curves in TensorBoard to check for underfitting or overfitting.

In [13]:
%load_ext tensorboard

In [74]:
%tensorboard --logdir {log_dir} --host localhost --port 8088

Reusing TensorBoard on port 8088 (pid 9924), started 0:00:59 ago. (Use '!kill 9924' to kill it.)

Finally, we evaluate the trained model with the test data.

In [72]:
loss, accuracy = model.evaluate(test_features, test_labels, batch_size=10)



We print the loss and accuracy as the final evaluation of how well the model predicts the `target` column, which indicates whether the policyholder is going to make a claim.

In [None]:
print("loss: {}, accuracy: {}".format(loss,accuracy))

References:
1. Daniel Weimer, Bernd Scholz-Reiter, Moshe Shpitalni,
Design of deep convolutional neural network architectures for automated feature extraction in industrial inspection,
CIRP Annals,
Volume 65, Issue 1,
2016,
Pages 417-420,
ISSN 0007-8506,