Cross Vadation: K-Fold CV, stratified Fold CV, Time Series CV

## Part 1: Data preprocessing


### Importing the libraries


In [18]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

### Importing the dataset

In [19]:
df = pd.read_csv('Churn_Modelling.csv')

### Review data

In [20]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 14 columns):
RowNumber          10000 non-null int64
CustomerId         10000 non-null int64
Surname            10000 non-null object
CreditScore        10000 non-null int64
Geography          10000 non-null object
Gender             10000 non-null object
Age                10000 non-null int64
Tenure             10000 non-null int64
Balance            10000 non-null float64
NumOfProducts      10000 non-null int64
HasCrCard          10000 non-null int64
IsActiveMember     10000 non-null int64
EstimatedSalary    10000 non-null float64
Exited             10000 non-null int64
dtypes: float64(2), int64(9), object(3)
memory usage: 1.1+ MB


In [21]:
df.head()

Unnamed: 0,RowNumber,CustomerId,Surname,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
0,1,15634602,Hargrave,619,France,Female,42,2,0.0,1,1,1,101348.88,1
1,2,15647311,Hill,608,Spain,Female,41,1,83807.86,1,0,1,112542.58,0
2,3,15619304,Onio,502,France,Female,42,8,159660.8,3,1,0,113931.57,1
3,4,15701354,Boni,699,France,Female,39,1,0.0,2,0,0,93826.63,0
4,5,15737888,Mitchell,850,Spain,Female,43,2,125510.82,1,1,1,79084.1,0


In [22]:
df.tail()

Unnamed: 0,RowNumber,CustomerId,Surname,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
9995,9996,15606229,Obijiaku,771,France,Male,39,5,0.0,2,1,0,96270.64,0
9996,9997,15569892,Johnstone,516,France,Male,35,10,57369.61,1,1,1,101699.77,0
9997,9998,15584532,Liu,709,France,Female,36,7,0.0,1,0,1,42085.58,1
9998,9999,15682355,Sabbatini,772,Germany,Male,42,3,75075.31,2,1,0,92888.52,1
9999,10000,15628319,Walker,792,France,Female,28,4,130142.79,1,1,0,38190.78,0


### Split data into the independent vs dependent variables

In [28]:
X = df.iloc[:,3:13].values
y = df.iloc[:,-1].values

### Encoding categorical data


In [24]:
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

For Geography,

In [29]:
labelencoder_X_1 = LabelEncoder() 

In [30]:
X[:, 1] = labelencoder_X_1.fit_transform(X[:, 1])# column [1] for Geography

For gender,

In [31]:
labelencoder_X_2 = LabelEncoder()

In [33]:
X[:, 2] = labelencoder_X_2.fit_transform(X[:, 2])

In [34]:
# create dummy variable for countries column:
onehotencoder = OneHotEncoder(categorical_features = [1])
X = onehotencoder.fit_transform(X).toarray()
#remove the 1st column to avoid dummy variable trap:
X = X[:,1:] 


### Split data into train and test sets

In [38]:
from sklearn.model_selection import train_test_split

In [39]:
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.25, random_state = 0)

### Feature Scaling

In [35]:
from sklearn.preprocessing import StandardScaler

In [36]:
sc_X = StandardScaler()

In [40]:
X_train = sc_X.fit_transform(X_train)

In [41]:
X_test = sc_X.fit_transform(X_test)

## Part 2: Making the ANN


### Import the Keras libraries and packages:

In [43]:
import keras 
from keras.models import Sequential  #initialize the neural network
from keras.layers import Dense # bulid the layers of ANN

### Initialising the ANN:

In [44]:
classifier = Sequential()

### Adding the input layer and the first hidden layer:

In [45]:
classifier.add(Dense(6, kernel_initializer = 'uniform', activation = 'relu', input_dim = 11)) 

Note:
    - The 1st hidden layer has 6 nodes
    - 'uniform': initialize the weights randomly and close to zero
    - 'relu': rectifier activation function
    - rectifier for hidden layers  (sigmoid func for output layer)

### Adding the 2nd hidden layer:

In [46]:
classifier.add(Dense(6, kernel_initializer = 'uniform', activation = 'relu')) 

### Adding the output layer:

In [47]:
classifier.add(Dense(1, kernel_initializer = 'uniform', activation = 'sigmoid')) 

Note:
    - For the output more 2 categories (ex:3), we have to change the number of units and the activation func by 3 and 'softmax'.
    - Softmax is the sigmoid function to three or more categories output. 

 ### Compiling the ANN:

In [48]:
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'] )

Here,
    - _optimizer_: algorithm to find the optimal set of weights
    - _loss_: for the output more 2 categories (ex:3), we change to 'categorical_crossentropy'.
    - _metrics_: criterion to evaluate our model.


### Fitting the ANN to the Training set:


In [49]:
classifier.fit(X_train, y_train, batch_size = 10, epochs = 100)

Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36/100
Epoch 37/100
Epoch 38/100
Epoch 39/100
Epoch 40/100
Epoch 41/100
Epoch 42/100
Epoch 43/100
Epoch 44/100
Epoch 45/100
Epoch 46/100
Epoch 47/100
Epoch 48/100
Epoch 49/100
Epoch 50/100
Epoch 51/100
Epoch 52/100
Epoch 53/100
Epoch 54/100
Epoch 55/100
Epoch 56/100
Epoch 57/100
Epoch 58/100
Epoch 59/100
Epoch 60/100
Epoch 61/100
Epoch 62/100
Epoch 63/100
Epoch 64/100
Epoch 65/100
Epoch 66/100
Epoch 67/100
Epoch 68/100
Epoch 69/100
Epoch 70/100
Epoch 71/100
Epoch 72/100
Epoch 73/100
Epoch 74/100
Epoch 75/100
Epoch 76/100
Epoch 77/100
Epoch 78

<keras.callbacks.History at 0x1a3348be80>

Here,             
    - bacth_size: the number of observations after  which you want to update the weights
     - epochs : number of rounds that the whole training set pass through the ANN

### Making the predictions and evaluating the model:


#### Predicting the Test set results:

In [50]:
y_pred = classifier.predict(X_test)
y_pred = (y_pred > 0.5)

#### Making the confusion matrix

In [51]:
from sklearn.metrics import confusion_matrix

In [52]:
cm = confusion_matrix(y_test,y_pred)

In [53]:
cm

array([[1940,   51],
       [ 342,  167]])