# Deep Learning On Poker Hand Dataset
In this notebook, we apply deep learning to improve on the prediction accuracy that we could not reach with classical ML models.

In [1]:
import warnings
warnings.filterwarnings('ignore')

## Helper functions

In [44]:
import pandas as pd

def load_data(path):
    df = pd.read_csv(path)
    return df

In [29]:
from sklearn.preprocessing import StandardScaler

def scale_data(x):
    scaler = StandardScaler()
    x = scaler.fit_transform(x)
    return x
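
Note that `scale_data` fits the scaler on whatever it is given; in `get_prep_data` that is the full dataset before the split, so the held-out rows influence the scaling statistics. A minimal sketch of the alternative, fitting on the training rows only and reusing the fitted scaler, could look like this. The helper name `scale_train_test` and the toy arrays are assumptions for illustration, not part of the notebook.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

def scale_train_test(x_train, x_test):
    """Fit the scaler on the training rows only, then apply it to both sets."""
    scaler = StandardScaler()
    x_train_scaled = scaler.fit_transform(x_train)  # learns mean/std from train only
    x_test_scaled = scaler.transform(x_test)        # reuses the training statistics
    return x_train_scaled, x_test_scaled, scaler

# Toy data standing in for the ten suit/rank columns (hypothetical values).
rng = np.random.default_rng(0)
toy_train = rng.integers(1, 14, size=(100, 10)).astype(float)
toy_test = rng.integers(1, 14, size=(20, 10)).astype(float)

toy_train_scaled, toy_test_scaled, fitted_scaler = scale_train_test(toy_train, toy_test)
print(toy_train_scaled.mean(axis=0).round(6))  # approximately zero per column
```

Returning the fitted scaler also makes it possible to apply the same transformation to new hands at prediction time.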

In [62]:
from sklearn.model_selection import train_test_split

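def get_prep_data():
    df_train = load_data('../../dataset/poker-hand-traintest')
    df_test = load_data('../../dataset/poker-hand-test')

    # Drop the first column and keep only the first 25,009 test rows.
    df_train = df_train.iloc[:, 1:]
    df_test = df_test.iloc[:25009, 1:]

    # Pool both files into one frame before splitting again.
    df = pd.concat([df_train, df_test])

    # The first ten columns are the features; 'Hand' is the class label.
    x = df.iloc[:, 0:10]
    x = scale_data(x)
    y = df['Hand']
    # Random 80/20 split (unseeded, so the selected rows differ between runs).
    data_splits = train_test_split(x, y, test_size=0.2)
    return data_splits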

In [63]:
x_train, x_test, y_train, y_test = get_prep_data()
print(f'x_train: {x_train.shape} \ny_train: {y_train.shape} \nx_test: {x_test.shape} \ny_test: {y_test.shape}')

x_train: (40014, 10) 
y_train: (40014,) 
x_test: (10004, 10) 
y_test: (10004,)
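
The split above is unseeded and unstratified. Poker hand labels are heavily skewed toward the low classes, so a stratified, seeded split may give more stable comparisons between runs. The sketch below uses made-up labels rather than the notebook's data; the variable names and the seed are assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced labels: 90 samples of class 0, 10 of class 1.
toy_x = np.arange(200).reshape(100, 2)
toy_y = np.array([0] * 90 + [1] * 10)

# stratify keeps the class ratio in both parts; random_state makes it repeatable.
xs_train, xs_test, ys_train, ys_test = train_test_split(
    toy_x, toy_y, test_size=0.2, stratify=toy_y, random_state=42
)

# Class counts in each part of the split.
print(np.bincount(ys_train), np.bincount(ys_test))
```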


## Building DFFNN V1

In [84]:
from keras.models import Sequential
from keras.layers import Dense

model_v1 = Sequential()

#Layers (includes input, hidden, and output layers).
model_v1.add(Dense(units=128, input_dim=10, activation='relu'))
model_v1.add(Dense(units=128, activation='tanh'))
model_v1.add(Dense(units=128, activation='tanh'))
model_v1.add(Dense(units=128, activation='tanh'))
model_v1.add(Dense(units=128, activation='tanh'))
model_v1.add(Dense(units=128, activation='tanh'))
model_v1.add(Dense(units=128, activation='tanh'))
model_v1.add(Dense(units=10, activation='softmax'))
             
#Compiling the model.
model_v1.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

#Summary of the model.
model_v1.summary()

Model: "sequential_15"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_68 (Dense)             (None, 128)               1408      
_________________________________________________________________
dense_69 (Dense)             (None, 128)               16512     
_________________________________________________________________
dense_70 (Dense)             (None, 128)               16512     
_________________________________________________________________
dense_71 (Dense)             (None, 128)               16512     
_________________________________________________________________
dense_72 (Dense)             (None, 128)               16512     
_________________________________________________________________
dense_73 (Dense)             (None, 128)               16512     
_________________________________________________________________
dense_74 (Dense)             (None, 128)             
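
The `Param #` column can be checked by hand: a dense layer has one weight per input-unit pair plus one bias per unit. A small sketch of that arithmetic for the layer sizes defined above (the helper `dense_params` is a hypothetical name):

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units

# Input width, seven hidden layers of 128 units, and the 10-class output.
layer_sizes = [10, 128, 128, 128, 128, 128, 128, 128, 10]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(per_layer)       # [1408, 16512, 16512, 16512, 16512, 16512, 16512, 1290]
print(sum(per_layer))  # 101770
```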

In [85]:
model_v1.fit(x_train, y_train, batch_size=500, epochs=150, validation_data=(x_test, y_test))

Epoch 1/150
...
Epoch 132/150
...

<tensorflow.python.keras.callbacks.History at 0x279b307b730>

In [86]:
testing_result = model_v1.evaluate(x_test, y_test)
training_evaluation = model_v1.evaluate(x_train, y_train)
print(f'Training Accuracy = {training_evaluation[1]*100:.2f}%\nTesting Accuracy = {testing_result[1]*100:.2f}%')

Training Accuracy = 99.95%
Testing Accuracy = 99.62%


## Building DFFNN V2
Here, we will use sklearn's multilayer perceptron.

In [91]:
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

model_v2 = MLPClassifier(
            hidden_layer_sizes=(128,128),
            activation='tanh',
            solver='adam',
            batch_size=500,
            learning_rate='constant',
            max_iter=150,
            verbose=True
           )

model_v2.fit(x_train, y_train)

Iteration 1, loss = 1.18062621
Iteration 2, loss = 0.98338842
Iteration 3, loss = 0.96989284
Iteration 4, loss = 0.96372402
Iteration 5, loss = 0.95835276
Iteration 6, loss = 0.95769834
Iteration 7, loss = 0.95742972
Iteration 8, loss = 0.95496665
Iteration 9, loss = 0.95464886
Iteration 10, loss = 0.95226437
Iteration 11, loss = 0.95129767
Iteration 12, loss = 0.95146385
Iteration 13, loss = 0.95066333
Iteration 14, loss = 0.94896713
Iteration 15, loss = 0.94792651
Iteration 16, loss = 0.94667833
Iteration 17, loss = 0.94578998
Iteration 18, loss = 0.94618540
Iteration 19, loss = 0.94357678
Iteration 20, loss = 0.94247971
Iteration 21, loss = 0.94140274
Iteration 22, loss = 0.94058101
Iteration 23, loss = 0.93818483
Iteration 24, loss = 0.93762343
Iteration 25, loss = 0.93556430
Iteration 26, loss = 0.93232824
Iteration 27, loss = 0.93228984
Iteration 28, loss = 0.92878958
Iteration 29, loss = 0.92592463
Iteration 30, loss = 0.92567307
Iteration 31, loss = 0.91856443
...

MLPClassifier(activation='tanh', batch_size=500, hidden_layer_sizes=(128, 128),
              max_iter=150, verbose=True)

In [96]:
y_pred = model_v2.predict(x_test)
print(f'Testing Accuracy = {accuracy_score(y_test, y_pred)*100:.2f}%')

Testing Accuracy = 97.71%
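
Overall accuracy can hide mistakes on rare classes, so it may be worth looking at per-class recall as well. The sketch below uses made-up labels, not the notebook's predictions, to show how a rare class can be missed entirely while accuracy stays high.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical labels: class 0 is common, class 2 is rare and always missed.
toy_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 2, 2])
toy_pred = np.array([0, 0, 0, 0, 0, 0, 1, 1, 0, 1])

print(accuracy_score(toy_true, toy_pred))  # 0.8

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(toy_true, toy_pred, labels=[0, 1, 2])
per_class_recall = cm.diagonal() / cm.sum(axis=1)
print(per_class_recall)  # recall for classes 0, 1, 2
```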


## Conclusion

Comparing the two versions, v1 is clearly the better model: it reaches 99.62% test accuracy against 97.71% for v2. The two are not configured alike, though: v1 has seven hidden layers of 128 units, while v2 has only two. With more hidden layers or more training iterations, sklearn's model may well achieve similar results. We could also evaluate on more test data. Here we took as many rows from the test file as the training file contains (25,009), pooled the two, and held out 20%, and both models performed very well on that held-out part. Hence, we will save both of them for a comparative study.
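
As one possible way to "increase some parameters", the sketch below configures `MLPClassifier` with seven hidden layers of 128 units, mirroring the shape of v1. It is fitted only on tiny random data as a smoke test, so it says nothing about the accuracy such a model would reach on the poker data; the name `deeper_mlp` and the toy arrays are assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Seven hidden layers of 128 units, mirroring the shape of v1 (tanh throughout,
# whereas v1 uses relu on its first layer).
deeper_mlp = MLPClassifier(
    hidden_layer_sizes=(128,) * 7,
    activation='tanh',
    solver='adam',
    batch_size=500,
    max_iter=150,
)

# Tiny random data, only to show that the configuration fits without error.
rng = np.random.default_rng(0)
toy_x = rng.normal(size=(200, 10))
toy_y = rng.integers(0, 10, size=200)

# Shrink the run for the smoke test; expect a convergence warning here.
deeper_mlp.set_params(max_iter=3, batch_size=50)
deeper_mlp.fit(toy_x, toy_y)
print(deeper_mlp.n_layers_)  # input + 7 hidden + output = 9
```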

## Saving the model

In [94]:
model_v1.save('dl_v1.h5')

import pickle
# Use a context manager so the file is closed after writing.
with open('dl_v2.pkl', 'wb') as f:
    pickle.dump(model_v2, f)
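
Both models were trained on standardized features, but the scaler created inside `scale_data` is discarded, so it cannot be reapplied to new hands later. A minimal sketch of persisting a fitted scaler next to the models is shown below; the file name `scaler.pkl` and the toy data are assumptions, and the notebook itself does not save a scaler.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical training features standing in for the suit/rank columns.
toy_train = np.array([[1.0, 10.0], [2.0, 12.0], [3.0, 14.0]])
scaler = StandardScaler().fit(toy_train)

# 'scaler.pkl' is an assumed file name, written to a temp directory here.
scaler_path = os.path.join(tempfile.mkdtemp(), 'scaler.pkl')
with open(scaler_path, 'wb') as f:
    pickle.dump(scaler, f)

with open(scaler_path, 'rb') as f:
    restored = pickle.load(f)

# The restored scaler applies the same transformation to a new raw row.
new_row = np.array([[2.0, 12.0]])
print(restored.transform(new_row))  # [[0. 0.]]
```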

## Using the models

In [2]:
from keras.models import load_model
import pickle

model_1 = load_model('dl_v1.h5')
# Use a context manager so the file is closed after reading.
with open('dl_v2.pkl', 'rb') as f:
    model_2 = pickle.load(f)

In [6]:
import numpy as np

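# One hand: every card has suit 2, with ranks 11, 13, 10, 12 and 1.
# Note: the hand is passed unscaled, although both models were trained on scaled features.
hand = np.array([2,11,2,13,2,10,2,12,2,1]).reshape(1,-1)
pred_1 = model_1.predict(hand).argmax()
pred_2 = model_2.predict(hand)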

print(pred_1, pred_2)

2 [9]
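
To judge such predictions, the expected label of a hand can be computed by rule. The sketch below assumes the standard UCI Poker Hand encoding (alternating suit 1-4 and rank 1-13 with 1 = Ace, and classes 0-9 running from "nothing" to "royal flush"); `label_hand` is a hypothetical helper, not part of the notebook.

```python
from collections import Counter

def label_hand(features):
    """Return the class 0-9 for [S1, C1, ..., S5, C5], assuming UCI encoding."""
    suits = features[0::2]
    ranks = sorted(features[1::2])
    counts = sorted(Counter(ranks).values(), reverse=True)

    is_flush = len(set(suits)) == 1
    is_royal = ranks == [1, 10, 11, 12, 13]  # A, 10, J, Q, K
    is_run = len(set(ranks)) == 5 and ranks[4] - ranks[0] == 4
    is_straight = is_run or is_royal         # the ace may be low or high

    if is_flush and is_royal:
        return 9  # royal flush
    if is_flush and is_straight:
        return 8  # straight flush
    if counts[0] == 4:
        return 7  # four of a kind
    if counts == [3, 2]:
        return 6  # full house
    if is_flush:
        return 5  # flush
    if is_straight:
        return 4  # straight
    if counts[0] == 3:
        return 3  # three of a kind
    if counts == [2, 2, 1]:
        return 2  # two pairs
    if counts[0] == 2:
        return 1  # one pair
    return 0      # nothing

print(label_hand([2, 11, 2, 13, 2, 10, 2, 12, 2, 1]))  # 9
```

Under that assumption, the hand above is a royal flush (class 9), which matches the second model's prediction but not the first's.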


In [10]:
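# Note: the two models receive different hands in this cell,
# so the two outputs are not directly comparable.
pred_1 = model_1.predict(np.array([4,1,4,13,4,12,4,11,4,10]).reshape(1,-1)).argmax()
pred_2 = model_2.predict(np.array([1,2,1,4,1,5,1,3,1,6]).reshape(1,-1))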

print(pred_1, pred_2)

2 [4]
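
The two models return predictions in different forms, which is why `.argmax()` appears only on the Keras call: a softmax output layer yields one probability per class, while `MLPClassifier.predict` returns the class label directly. A NumPy-only sketch with made-up probabilities illustrates the conversion:

```python
import numpy as np

# Made-up softmax output for one sample: shape (1, 10), one probability per class.
toy_probs = np.array([[0.01, 0.02, 0.70, 0.05, 0.05, 0.05, 0.04, 0.04, 0.02, 0.02]])

flat_class = toy_probs.argmax()         # index into the flattened array; fine for one row
row_classes = toy_probs.argmax(axis=1)  # one class per row; safe for batches

print(flat_class, row_classes)  # 2 [2]
```

For more than one input row, `argmax(axis=1)` is the safer choice, since the bare `argmax()` indexes the flattened array.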


This concludes our deep learning session.