# Yahtzee

__Fully Connected Networks__

_By Marnick van der Arend & Jeroen Smienk_

![Yahtzee](yahtzee.png)

In [1]:
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf

%matplotlib inline

__MODEL_PATH = 'yahtzee-models'
__TENSOR_LOG_DIR = 'yahtzee-logs'
__DATASET = 'yahtzee-dataset.csv'

## Dataset

Let's start by looking at the provided dataset.

It consists of 5832 'dice throws', each with a label that describes the type of throw.

The labels in the dataset are: `nothing`, `small-straight`, `three-of-a-kind`, `large-straight`, `full-house`, `four-of-a-kind`, and `yathzee` (the dataset's own spelling of Yahtzee).

Name | Description
--- | ---
3-of-a-kind | Three dice the same
4-of-a-kind | Four dice the same
Full-house | Three of one number and two of another
Small-straight | Four sequential dice
Large-straight | Five sequential dice
Yahtzee | All five dice the same
Nothing | None of the above combinations

In [2]:
df = pd.read_csv(__DATASET)
df.head(5)

Unnamed: 0,dice1,dice2,dice3,dice4,dice5,label
0,3,6,6,2,5,nothing
1,3,6,1,3,4,nothing
2,2,2,5,5,3,nothing
3,1,3,6,6,1,nothing
4,1,4,6,3,5,small-straight


Let's see how the labels are distributed in the dataset:

In [3]:
df.label.value_counts() / len(df.index)

nothing            0.663237
three-of-a-kind    0.153635
small-straight     0.094136
full-house         0.037894
large-straight     0.030521
four-of-a-kind     0.019890
yathzee            0.000686
Name: label, dtype: float64

We see that 66.3% of all data is `nothing` throws. This makes sense, of course: the harder a throw is to get, the less likely it is to occur.

The most difficult throw (`yahtzee`) makes up only 0.069% of the rows! This might make it hard for any of our models to really learn what a `yahtzee` throw is.
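
As a rough sanity check on that rarity, the chance of rolling a Yahtzee with five dice can be worked out exactly. The sketch below is plain arithmetic added for illustration; it assumes fair six-sided dice and reuses the 5832 rows and the 0.000686 share reported above.

```python
from fractions import Fraction

DICE, SIDES = 5, 6
ROWS = 5832  # dataset size reported above

# Of the 6**5 ordered throws, exactly 6 show the same face on every die.
p_yahtzee = Fraction(SIDES, SIDES ** DICE)

expected_rows = float(p_yahtzee * ROWS)  # rows expected under fair dice
observed_rows = round(0.000686 * ROWS)   # share printed by value_counts()

print(p_yahtzee, expected_rows, observed_rows)  # 1/1296 4.5 4
```

So about four of the 5832 rows are Yahtzees, close to the 4.5 one would expect from fair dice.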

### One-Hot Encoding

In order to classify these categorical labels, we have to 'one-hot encode' them:

In [4]:
def one_hot_encode(df):
    return pd.get_dummies(df, prefix=['label'])

one_hot_df = one_hot_encode(df)
one_hot_df.head(10)

Unnamed: 0,dice1,dice2,dice3,dice4,dice5,label_four-of-a-kind,label_full-house,label_large-straight,label_nothing,label_small-straight,label_three-of-a-kind,label_yathzee
0,3,6,6,2,5,0,0,0,1,0,0,0
1,3,6,1,3,4,0,0,0,1,0,0,0
2,2,2,5,5,3,0,0,0,1,0,0,0
3,1,3,6,6,1,0,0,0,1,0,0,0
4,1,4,6,3,5,0,0,0,0,1,0,0
5,4,1,4,3,1,0,0,0,1,0,0,0
6,4,4,4,6,2,0,0,0,0,0,1,0
7,3,2,5,6,3,0,0,0,1,0,0,0
8,3,4,3,6,2,0,0,0,1,0,0,0
9,3,3,1,5,4,0,0,0,1,0,0,0


In [5]:
# Save the labels
LABELS = one_hot_df.columns[5:].tolist()
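
To make the encoding concrete, here is a minimal pure-Python sketch of what one-hot encoding does to a single label, and how taking the position of the largest value turns a row back into a name. The helpers `one_hot` and `decode` are hypothetical names used only here; the class list mirrors the `label_` columns in the table above.

```python
# Class names in the same (alphabetical) order as the label_ columns above.
CLASSES = ['four-of-a-kind', 'full-house', 'large-straight', 'nothing',
           'small-straight', 'three-of-a-kind', 'yathzee']


def one_hot(label):
    """Return a list with a 1 at the label's position and 0 elsewhere."""
    return [1 if name == label else 0 for name in CLASSES]


def decode(row):
    """Recover the class name from the position of the largest value."""
    return CLASSES[max(range(len(row)), key=row.__getitem__)]


encoded = one_hot('small-straight')
print(encoded)          # [0, 0, 0, 0, 1, 0, 0]
print(decode(encoded))  # small-straight
```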

Before we can train any model, we have to split the data and the labels into X and Y after shuffling them:

In [6]:
def get_shuffled_xy(one_hot_df):
    shuffled = one_hot_df.sample(frac=1.)
    return shuffled.iloc[:,:5].copy(), shuffled.iloc[:,5:].copy()

X, Y = get_shuffled_xy(one_hot_df)
X.head(5)

Unnamed: 0,dice1,dice2,dice3,dice4,dice5
3675,2,1,5,3,3
3382,5,3,2,4,1
5481,6,2,1,6,6
1286,3,5,6,5,2
1311,3,5,1,1,2


We also split the dataset 85:15 twice: first to hold out a validation set, and then again to divide the remainder into a training set and a test set:

In [7]:
def get_split(x, y, frac):
    split = int(len(x.index) * frac)
    return x.iloc[:split], y.iloc[:split], x.iloc[split:], y.iloc[split:]

X_train, Y_train, X_valid, Y_valid = get_split(X, Y, .85)
X_train, Y_train, X_test, Y_test = get_split(X_train, Y_train, .85)

print('Split X (train, test, validation):', X_train.shape, X_test.shape, X_valid.shape)
print('Split Y (train, test, validation):', Y_train.shape, Y_test.shape, Y_valid.shape)

Split X (train, test, validation): (4213, 5) (744, 5) (875, 5)
Split Y (train, test, validation): (4213, 7) (744, 7) (875, 7)
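
The sizes printed above follow from applying the 0.85 fraction twice with integer truncation. A small sketch of that arithmetic (the helper `split_sizes` is a hypothetical name, for illustration only):

```python
def split_sizes(n_rows, frac):
    """Mirror get_split: the first int(n_rows * frac) rows go left, the rest right."""
    left = int(n_rows * frac)
    return left, n_rows - left


rest, n_valid = split_sizes(5832, .85)    # first split: validation set held out
n_train, n_test = split_sizes(rest, .85)  # second split: test set held out

print(n_train, n_test, n_valid)  # 4213 744 875
```

That is roughly 72% / 13% / 15% of the data for training, testing and validation.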


We define a function that returns a random batch of a certain size. Training on one small batch per iteration is faster than feeding the full training set every step.

In [8]:
def get_batch(data, labels, batch_size):
    x_batch = data.sample(frac=batch_size / len(data.index))
    return x_batch, labels.loc[x_batch.index]

We also define a function that does all the above so we can load an arbitrary dataset:

In [9]:
def get_csv_xy(file):
    return get_shuffled_xy(one_hot_encode(pd.read_csv(file)))

## Models

We designed several models:

_Note: all models end in a layer without an activation function, because the SoftMax is applied inside the cross entropy calculation._

rank | name | layers | score
--- | --- | --- | ---
5 | model_8 | (64, Tanh, Drop=.2) (128, Tanh, Drop=.3) (256, Tanh, Drop=.4) (512, Tanh, Drop=.5) (64, Tanh) | 0.93714285
4 | model_7 | (64, ReLU) (128, ReLU) (256, ReLU) (512, ReLU, Drop=.3) (64, ReLU) | 0.94057140
3 | model_6 | (200, Tanh) (300, Tanh) (600, Tanh) | ±0.97
1 | model_5_2 | (600, Tanh, Drop=.3) (300, Tanh, Drop=.3) (200, Tanh, Drop=.3) | ±0.992
2 | model_5 | (600, Tanh) (300, Tanh) (200, Tanh) | 0.97942860
6 | model_4 | (128, Tanh) (64, Tanh) (32, Tanh) | 0.87771430
8 | model_3 | (12, ReLU) (24, ReLU) (48, ReLU, Drop=.1) (96, ReLU) | 0.73028570
7 | model_2 | (128, ReLU) (64, ReLU) (32, ReLU) | 0.82400000
9 | model_1 | (128, Sigmoid) | ±0.65

The best model (model_5_2) uses dropout to prevent overfitting and the TanH activation function, which proved to perform better.

### Metric Graphs

__Legend:__

- Model 1: BLUE
- Model 2: RED
- Model 3: LIGHT BLUE
- Model 4: PINK
- Model 5: GREEN
- Model 6: GRAY
- Model 7: ORANGE
- Model 8: ORANGE

#### Batch Accuracy Graph

![Batch Accuracy](yahtzee-acc.png)

#### Batch Loss Graph

![Batch Loss](yahtzee-loss.png)

In [10]:
def model_1(x, output_shape):
    """
    Single hidden layer with 128 neurons and Sigmoid activation function.
    """
    l_1 = tf.layers.dense(x, units=128, activation=tf.nn.sigmoid)
    return tf.layers.dense(l_1, units=output_shape, activation=None)

In [11]:
def model_2(x, output_shape):
    """
    Three hidden layers with different amounts of neurons and relu activation functions.
    """
    l_1 = tf.layers.dense(x, units=128, activation=tf.nn.relu)
    l_2 = tf.layers.dense(l_1, units=64, activation=tf.nn.relu)
    l_3 = tf.layers.dense(l_2, units=32, activation=tf.nn.relu)
    return tf.layers.dense(l_3, units=output_shape, activation=None)

In [12]:
def model_3(x, output_shape):
    """
    Four hidden layers with increasing numbers of neurons,
    ReLU activation functions and one dropout layer.
    """
    l_1 = tf.layers.dense(x, units=12, activation=tf.nn.relu)
    l_2 = tf.layers.dense(l_1, units=24, activation=tf.nn.relu)
    l_3 = tf.layers.dense(l_2, units=48, activation=tf.nn.relu)
    d_3 = tf.layers.dropout(l_3, rate=.1)
    l_4 = tf.layers.dense(d_3, units=96, activation=tf.nn.relu)
    return tf.layers.dense(l_4, units=output_shape, activation=None)

In [13]:
def model_4(x, output_shape):
    """
    Three hidden layers with different amounts of neurons and TanH activation functions.
    """
    l_1 = tf.layers.dense(x, units=128, activation=tf.nn.tanh)
    l_2 = tf.layers.dense(l_1, units=64, activation=tf.nn.tanh)
    l_3 = tf.layers.dense(l_2, units=32, activation=tf.nn.tanh)
    return tf.layers.dense(l_3, units=output_shape, activation=None)

In [14]:
def model_5(x, output_shape):
    """
    High number of neurons in layers, decreasing per layer; no dropout; TanH activation.
    """
    l_1 = tf.layers.dense(x, units=600, activation=tf.nn.tanh)
    l_2 = tf.layers.dense(l_1, units=300, activation=tf.nn.tanh)
    l_3 = tf.layers.dense(l_2, units=200, activation=tf.nn.tanh)
    return tf.layers.dense(l_3, units=output_shape, activation=None)

In [15]:
def model_5_2(x, output_shape):
    """
    High number of neurons in layers, decreasing per layer; consistent dropout; TanH activation.
    """
    l_1 = tf.layers.dense(x, units=600, activation=tf.nn.tanh)
    d_1 = tf.layers.dropout(l_1, rate=.3)
    l_2 = tf.layers.dense(d_1, units=300, activation=tf.nn.tanh)
    d_2 = tf.layers.dropout(l_2, rate=.3)
    l_3 = tf.layers.dense(d_2, units=200, activation=tf.nn.tanh)
    d_3 = tf.layers.dropout(l_3, rate=.3)
    return tf.layers.dense(d_3, units=output_shape, activation=None)
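
One thing to watch with dropout in this API: `tf.layers.dropout` takes a `training` argument that defaults to `False`, and in that mode the layer returns its input untouched. As the models here call it without that argument, the dropout layers act as pass-through unless `training=True` is supplied (for example through a boolean placeholder, which would be an addition to this notebook). The pure-Python sketch below illustrates the two modes of inverted dropout; `dropout` is a hypothetical stand-in, not TensorFlow's implementation.

```python
import random


def dropout(values, rate, training=False, rng=random):
    """Inverted dropout: while training, zero each value with probability
    `rate` and scale the survivors by 1 / (1 - rate); otherwise pass through."""
    if not training:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


activations = [0.5, -1.0, 2.0, 0.25]
print(dropout(activations, rate=.3))                 # inference: unchanged
print(dropout(activations, rate=.3, training=True))  # training: some values zeroed
```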

In [16]:
def model_6(x, output_shape):
    """
    High number of neurons in layers, increasing per layer; no dropout; TanH activation.
    """
    l_1 = tf.layers.dense(x, units=200, activation=tf.nn.tanh)
    l_2 = tf.layers.dense(l_1, units=300, activation=tf.nn.tanh)
    l_3 = tf.layers.dense(l_2, units=600, activation=tf.nn.tanh)
    return tf.layers.dense(l_3, units=output_shape, activation=None)

In [17]:
def model_7(x, output_shape):
    """
    High number of neurons in layers; ReLU activation; single dropout layer.
    """
    l_1 = tf.layers.dense(x, units=64, activation=tf.nn.relu)
    l_2 = tf.layers.dense(l_1, units=128, activation=tf.nn.relu)
    l_3 = tf.layers.dense(l_2, units=256, activation=tf.nn.relu)
    l_4 = tf.layers.dense(l_3, units=512, activation=tf.nn.relu)
    d_4 = tf.layers.dropout(l_4, rate=.3)
    l_5 = tf.layers.dense(d_4, units=64, activation=tf.nn.relu)
    return tf.layers.dense(l_5, units=output_shape, activation=None)

In [18]:
def model_8(x, output_shape):
    """
    More layers; TanH activation; increasing amount of dropout.
    """
    l_1 = tf.layers.dense(x, units=64, activation=tf.nn.tanh)
    d_1 = tf.layers.dropout(l_1, rate=.2)
    l_2 = tf.layers.dense(d_1, units=128, activation=tf.nn.tanh)
    d_2 = tf.layers.dropout(l_2, rate=.3)
    l_3 = tf.layers.dense(d_2, units=256, activation=tf.nn.tanh)
    d_3 = tf.layers.dropout(l_3, rate=.4)
    l_4 = tf.layers.dense(d_3, units=512, activation=tf.nn.tanh)
    d_4 = tf.layers.dropout(l_4, rate=.5)
    l_5 = tf.layers.dense(d_4, units=64, activation=tf.nn.tanh)
    return tf.layers.dense(l_5, units=output_shape, activation=None)

We start with the placeholders for our 5-dice input and 7-class output and choose a model:

In [19]:
def setup_model(model):
    tf.reset_default_graph()
    x = tf.placeholder(tf.float32, shape=[None, X.shape[1]], name='x')
    y = tf.placeholder(tf.float32, shape=[None, Y.shape[1]], name='y')
    return x, y, model(x, Y.shape[1])

We choose an optimizer and define a loss function and the accuracy metric:

In [20]:
def setup_ops(y, y_pred):
    # Loss function
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits_v2(labels=y, logits=y_pred)
    loss_fn = tf.reduce_mean(cross_entropy)

    # Optimizer minimizes the loss
    optimizer = tf.train.AdamOptimizer(learning_rate=.001).minimize(loss_fn)

    # Accuracy metric
    #   checks if the indices of the highest values in the real 
    #   and predicted arrays are equal
    prediction = tf.argmax(y_pred, axis=1)
    real_label = tf.argmax(y, axis=1)
    correct = tf.equal(real_label, prediction)
    accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
    return loss_fn, optimizer, prediction, real_label, correct, accuracy
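
To show what these two ops compute, here is a minimal pure-Python sketch of softmax cross entropy on raw logits and of the argmax-based accuracy. It is an illustration under our own assumptions (three classes instead of seven, made-up logits), not TensorFlow's implementation.

```python
import math


def softmax_cross_entropy(labels, logits):
    """Cross entropy between a one-hot label row and softmax(logits)."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return -sum(y * (z - log_sum) for y, z in zip(labels, logits))


def argmax(row):
    """Index of the largest value (the first one in case of a tie)."""
    return max(range(len(row)), key=row.__getitem__)


# Two hypothetical examples with 3 classes (the notebook uses 7).
y_true = [[0, 1, 0], [1, 0, 0]]
y_logits = [[0.0, 0.0, 0.0], [2.0, 0.5, -1.0]]

losses = [softmax_cross_entropy(y, z) for y, z in zip(y_true, y_logits)]
loss = sum(losses) / len(losses)  # counterpart of tf.reduce_mean
accuracy = sum(argmax(y) == argmax(z) for y, z in zip(y_true, y_logits)) / len(y_true)

print(loss, accuracy)
```

Because the softmax is folded into the loss, the models only need to return raw logits, and `argmax` gives the same predicted class with or without it.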

## Train, Test, Validate

We train the model using a certain batch size and for a number of iterations while posting scalars to TensorBoard:

In [21]:
def train_model(model, data, split_frac=.85, verbose=2, save=False):
    X, Y = data
    x, y, y_pred = setup_model(model)
    loss_fn, optimizer, _, _, _, accuracy = setup_ops(y, y_pred)

    iters = 5000
    train_batch_size = 100
    
    X_train, Y_train, X_valid, Y_valid = get_split(X, Y, split_frac)    
    X_train, Y_train, X_test, Y_test = get_split(X_train, Y_train, split_frac)

    init_op = tf.global_variables_initializer()
    saver = tf.train.Saver()
    session = tf.Session()
    with session:
        session.run(init_op)

        # Defining the metrics we want to log in TensorBoard
        sum_loss_train = tf.summary.scalar('loss_train', loss_fn)
        sum_loss_test = tf.summary.scalar('loss_test', loss_fn)
        sum_acc_train = tf.summary.scalar('acc_train', accuracy)
        sum_acc_test = tf.summary.scalar('acc_test', accuracy)
        tf.summary.merge_all()
        writer = tf.summary.FileWriter(os.path.join(__TENSOR_LOG_DIR, model.__name__), session.graph)

        # Start training for a certain number of iterations
        for i in range(iters):
            # Every iteration we get a random batch of the training data
            x_batch, y_batch = get_batch(X_train, Y_train, train_batch_size)
            # We train the model by providing the 'optimizer' variable to the run function.
            # We also want to calculate the accuracy and loss TensorBoard metrics
            loss_val, _, acc_val, sum_1, sum_2 = session.run([loss_fn, optimizer, accuracy, 
                                                              sum_loss_train, sum_acc_train], 
                                                             feed_dict={x: x_batch, y: y_batch})
            # Write the metrics to TensorBoard
            writer.add_summary(sum_1, global_step=i)
            writer.add_summary(sum_2, global_step=i)

            # Test every 50 iterations
            if i % 50 == 0:
                # DO NOT PROVIDE THE 'optimizer' VARIABLE HERE
                # ELSE THE MODEL WILL TRAIN ON THE TEST DATA
                acc_val, sum_1, sum_2 = session.run([accuracy, sum_loss_test, sum_acc_test], 
                                                    feed_dict={x: X_test, y: Y_test})
                # Write the metrics to TensorBoard
                writer.add_summary(sum_1, global_step=i)
                writer.add_summary(sum_2, global_step=i)
                if verbose >= 2:
                    print('Testing - i:', i+1, ' Accuracy:', acc_val)

        # Validate the model with unseen data
        acc_ = session.run([accuracy], feed_dict={x: X_valid, y: Y_valid})
        if verbose >= 1:
            print('Validation accuracy:', acc_)

        # Save the model
        if save:
            path = saver.save(session, '{}.ckpt'.format(os.path.join(__MODEL_PATH, model.__name__, model.__name__)))
            if verbose >= 1:
                print('Model saved at: {}'.format(path))

        return acc_

In [22]:
train_model(model_5_2, (X, Y), verbose=2, save=False)

Testing - i: 1  Accuracy: 0.66129035
Testing - i: 51  Accuracy: 0.66129035
Testing - i: 101  Accuracy: 0.66129035
Testing - i: 151  Accuracy: 0.6586022
Testing - i: 201  Accuracy: 0.65994626
Testing - i: 251  Accuracy: 0.66263443
Testing - i: 301  Accuracy: 0.65994626
Testing - i: 351  Accuracy: 0.6639785
Testing - i: 401  Accuracy: 0.6639785
Testing - i: 451  Accuracy: 0.6747312
Testing - i: 501  Accuracy: 0.6532258
Testing - i: 551  Accuracy: 0.65456986
Testing - i: 601  Accuracy: 0.672043
Testing - i: 651  Accuracy: 0.67876345
Testing - i: 701  Accuracy: 0.672043
Testing - i: 751  Accuracy: 0.67741936
Testing - i: 801  Accuracy: 0.69489247
Testing - i: 851  Accuracy: 0.6935484
Testing - i: 901  Accuracy: 0.702957
Testing - i: 951  Accuracy: 0.7258065
Testing - i: 1001  Accuracy: 0.766129
Testing - i: 1051  Accuracy: 0.7486559
Testing - i: 1101  Accuracy: 0.7983871
Testing - i: 1151  Accuracy: 0.82123655
Testing - i: 1201  Accuracy: 0.827957
Testing - i: 1251  Accuracy: 0.85215056
...

[0.99314284]
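
The plateau at roughly 0.661 during the first few hundred iterations is what a classifier that always answers `nothing` would score: 0.66129035 times 744 is almost exactly 492, so the early network appears to predict the majority class for every test row. The sketch below computes such a majority-class baseline; the label counts are inferred from that accuracy and the test-set size, not read from the data.

```python
from collections import Counter

# Hypothetical test labels: 492 'nothing' rows out of 744 (inferred, see above).
# The remaining 252 rows are lumped together because only the majority matters here.
test_labels = ['nothing'] * 492 + ['other'] * 252

majority_label, majority_count = Counter(test_labels).most_common(1)[0]
baseline_accuracy = majority_count / len(test_labels)

print(majority_label, round(baseline_accuracy, 4))  # nothing 0.6613
```

Any model worth keeping has to beat this number, which the log above shows happening clearly from around iteration 800 onward.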

### Cross Validation

We shuffle and train the model multiple times to rule out luck:

In [23]:
def cross_validate_train(model, file, cv=3, split_frac=.85):
    accuracies = []
    for i in range(cv):
        print('{}/{}'.format(i, cv))
        accuracies.append(train_model(model, get_csv_xy(file), split_frac=split_frac, verbose=0, save=False))
    return cv, np.mean(accuracies)

In [24]:
cv, mean_acc = cross_validate_train(model_5_2, __DATASET)
print('Cross validation ({}x): {}'.format(cv, mean_acc))

0/3
1/3
2/3
Cross validation (3x): 0.9817142486572266
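
Strictly speaking, the procedure above is repeated random sub-sampling: each run reshuffles and re-splits the data. For comparison, a k-fold scheme would give every row exactly one turn in the held-out fold. A minimal sketch of the index bookkeeping, with a hypothetical helper `k_fold_indices` that is not used elsewhere in this notebook:

```python
import random


def k_fold_indices(n_rows, k, seed=0):
    """Yield (train_idx, held_out_idx) pairs covering every row exactly once."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i, held_out in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, held_out


for train_idx, held_out_idx in k_fold_indices(n_rows=10, k=3):
    print(len(train_idx), len(held_out_idx))
```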


## Predicting with the Best Model

In [25]:
"""
EDIT THIS TO RUN BEST MODEL
"""
__TEACHER_VALIDATION_SET = 'yahtzee-other.csv'

We load the model and the dataset and run just the accuracy metric:

In [26]:
model_to_load = model_5_2

load_path = '{}.ckpt'.format(os.path.join(__MODEL_PATH, 
                                          model_to_load.__name__, 
                                          model_to_load.__name__))

X_teacher, Y_teacher = get_csv_xy(__TEACHER_VALIDATION_SET)

x, y, y_pred = setup_model(model_to_load)
loss_fn, optimizer, prediction, real_label, correct, accuracy = setup_ops(y, y_pred)

with tf.Session() as saved_session:
    tf.train.Saver().restore(saved_session, load_path)

    # Validate the model with unseen data
    acc_, correct_, prediction_, real_label_ = saved_session.run([accuracy, correct, prediction, real_label], 
                                                     feed_dict={x: X_teacher, y: Y_teacher})
    print('Validation accuracy imported {}: {}'.format(model_to_load.__name__, acc_))

INFO:tensorflow:Restoring parameters from yahtzee-models/model_5_2/model_5_2.ckpt
Validation accuracy imported model_5_2: 0.8781294226646423


### Incorrect Predictions

In [27]:
def get_label_name(argmax):
    return LABELS[argmax][6:]


# Predictions
print('a b c d e - real label     - prediction     - is correct')
for i in range(len(correct_)):
    row = X_teacher.iloc[i]
    if not correct_[i]:
        print('{} {} {} {} {} - {} - {} - {}'.format(row[0], row[1], row[2], row[3], row[4],
                                                     get_label_name(real_label_[i]),
                                                     get_label_name(prediction_[i]), correct_[i]))

a b c d e - real label     - prediction     - is correct
5 3 4 2 3 - small-straight - nothing - False
3 5 2 5 4 - small-straight - nothing - False
4 3 2 5 2 - small-straight - nothing - False
2 4 4 5 3 - small-straight - nothing - False
4 2 3 4 5 - small-straight - nothing - False
2 5 3 4 5 - small-straight - nothing - False
5 2 5 4 3 - small-straight - nothing - False
3 3 5 2 4 - small-straight - nothing - False
4 2 3 3 5 - small-straight - nothing - False
5 3 2 3 4 - small-straight - nothing - False
5 3 4 4 2 - small-straight - nothing - False
2 5 4 3 4 - small-straight - nothing - False
4 5 5 2 3 - small-straight - nothing - False
4 5 2 2 3 - small-straight - nothing - False
2 5 3 2 4 - small-straight - nothing - False
4 4 5 3 2 - small-straight - nothing - False
2 5 5 3 4 - small-straight - nothing - False
2 4 2 5 3 - small-straight - nothing - False
5 3 3 4 2 - small-straight - nothing - False
4 3 2 5 5 - small-straight - nothing - False
4 2 3 5 3 - small-straight - nothing - False
...
3 4 2 1 4 - small-straight - nothing - False
5 4 3 2 3 - small-straight - nothing - False
4 3 4 5 2 - small-straight - nothing - False
4 3 2 5 3 - small-straight - nothing - False
3 5 2 4 2 - small-straight - nothing - False
5 4 5 2 3 - small-straight - nothing - False
3 4 2 5 4 - small-straight - nothing - False
4 5 3 2 4 - small-straight - nothing - False
2 2 4 3 5 - small-straight - nothing - False
2 4 5 2 3 - small-straight - nothing - False
2 3 4 4 5 - small-straight - nothing - False
2 3 5 3 4 - small-straight - nothing - False
2 4 4 3 5 - small-straight - nothing - False


Okay, so what we noticed is that the wrong predictions listed above are all `small-straight` throws that the model labelled as `nothing`. It has a hard time predicting this label correctly.
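
Reading a long list of rows by eye is error-prone; tallying the mistakes per (real, predicted) pair gives a quicker overview. A minimal sketch, using three rows copied from the output above as stand-in data (in the notebook the pairs would come from `real_label_` and `prediction_`):

```python
from collections import Counter

# A few (real label, predicted label) pairs copied from the output above.
mistakes = [
    ('small-straight', 'nothing'),  # 5 3 4 2 3
    ('small-straight', 'nothing'),  # 3 5 2 5 4
    ('small-straight', 'nothing'),  # 3 4 2 1 4
]

for (real, predicted), count in Counter(mistakes).most_common():
    print('{} -> {}: {}'.format(real, predicted, count))
```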

## Conclusion & Findings

### Dataset

- This type of data is not appropriate for machine learning. A simple rule-based algorithm could determine the labels with 100% accuracy.
- We could 'cheat' the system by generating the other roughly 25% of dice combinations and training on those, because there is only a finite number of 'dice throw' combinations (the dataset has 5832 rows, while there are 6^5 = 7776 possible throws), but we didn't. We just used those to see how well we really did.
- ~~The dataset rows containing number 2, 3, 4 and 5 in any order are incorrectly labelled as `nothing` instead of `small-straight`.~~
- The dataset contains 5832 rows, which is enough for this deep learning exercise.
- The distribution of the labels may cause the model to not optimally learn the very rare throws like `yahtzee`.

### Model

We tested with smaller and larger numbers of neurons, more and fewer layers, and with and without dropout. In the end the best models had a large number of neurons in just three hidden layers: model_5 without dropout (0.979) and model_5_2 with dropout layers (±0.992). You would normally expect such wide layers to overfit badly, but according to the validation accuracy they do not.



#### Activation Functions

We looked at Sigmoid, ReLU and TanH. Sigmoid performed less well than ReLU and TanH. To our surprise TanH worked best. We were surprised because we were simply 'used to' ReLU working best.

#### Predictions

- We noticed that our model has a hard time predicting the labels `small-straight` and `large-straight`. It mixes them up or sometimes predicts `nothing` instead. These are the only mistakes the model makes.
- So it predicts the `yahtzee`s correctly, despite our expectation that it wouldn't because of the rarity of `yahtzee` in the training data.


## Extra: 100% Success

_Note: This requires 'Predicting with the Best Model' to be executed._

Creating rules to 'predict' the labels gives 100% accuracy.

In [28]:
DICE = 5
EYES = 6


def inc_dice_cnt(i):
    """
    Moves to the next dice combination
    """
    if i == DICE:
        return
    dice_count[i] += 1
    if dice_count[i] == EYES:
        dice_count[i] = 0
        inc_dice_cnt(i+1)


def get_cnt(arr):
    """
    Get an occurrences array with the count of a value at the index of that value
    
    get_cnt([0, 1, 1, 3, 3]) == [1, 2, 0, 2, 0, 0]
    """
    cnt = np.zeros(EYES)
    for i in arr:
        cnt[i] += 1
    return cnt


def get_seq(arr):
    """
    Get the largest number of sequential numbers in the array
    
    get_seq([0, 5, 3, 2, 4]) == 4
    """
    sort = np.unique(sorted(arr))
    seq, top_seq = 0, 0
    for i in range(1, len(sort)):
        if sort[i] - 1 == sort[i-1]:
            seq += 1
        else:
            top_seq = max(seq, top_seq)
            seq = 0
    return max(seq, top_seq)+1


def get_label(arr):
    """
    Get the Yahtzee label for a given dice combination array
    """
    cnt = get_cnt(arr)
    seq = get_seq(arr)
    if 5 in cnt:
        return 'yathzee'  # intentional typo because teacher supplied dataset also includes this typo
    if seq == 5:
        return 'large-straight'
    if seq == 4:
        return 'small-straight'
    if 3 in cnt and 2 in cnt:
        return 'full-house'
    if 4 in cnt:
        return 'four-of-a-kind'
    if 3 in cnt:
        return 'three-of-a-kind'
    return 'nothing'
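
A quick way to gain confidence in rules like these is to run them over every possible throw and count the labels. The sketch below is a compact, self-contained variant that works directly on face values 1 to 6; `longest_run` and `classify` are hypothetical names, written to follow the same precedence as `get_label`.

```python
from collections import Counter
from itertools import product


def longest_run(throw):
    """Length of the longest run of consecutive face values in the throw."""
    faces = sorted(set(throw))
    best = run = 1
    for prev, cur in zip(faces, faces[1:]):
        run = run + 1 if cur == prev + 1 else 1
        best = max(best, run)
    return best


def classify(throw):
    """Label a throw of five dice (values 1-6), same precedence as get_label."""
    counts = sorted(Counter(throw).values(), reverse=True)
    run = longest_run(throw)
    if counts[0] == 5:
        return 'yathzee'  # spelled as in the dataset
    if run == 5:
        return 'large-straight'
    if run == 4:
        return 'small-straight'
    if counts[:2] == [3, 2]:
        return 'full-house'
    if counts[0] == 4:
        return 'four-of-a-kind'
    if counts[0] == 3:
        return 'three-of-a-kind'
    return 'nothing'


# Tally the labels over all 6**5 ordered throws.
tally = Counter(classify(t) for t in product(range(1, 7), repeat=5))
print(tally.most_common())
```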

In [29]:
print('a b c d e - real label     - prediction     - is correct')

nr_correct = 0
for i, (_, row) in enumerate(X_teacher.iterrows()):
    man_prediction = get_label(row-1)
    real_label_val = get_label_name(real_label_[i])
    is_correct = man_prediction == real_label_val
    if not is_correct:
        print('{} {} {} {} {} - {} - {} - {}'.format(row[0], row[1], row[2], row[3], row[4],
                                                     real_label_val, man_prediction, is_correct))
    else:
        nr_correct += 1
    
print('Rule based accuracy: {}'.format(nr_correct / len(X_teacher.index)))

a b c d e - real label     - prediction     - is correct
Rule based accuracy: 1.0


### Generating All Combinations

In [30]:
all_throws = np.zeros((EYES**DICE, DICE), dtype=int)
dice_count = np.zeros(DICE, dtype=int)

labels = []
for i in range(all_throws.shape[0]):
    all_throws[i] = dice_count
    labels.append(get_label(dice_count))
    inc_dice_cnt(0)

df_all = pd.DataFrame(all_throws+1, columns=['dice1', 'dice2', 'dice3', 'dice4', 'dice5'])
df_all['label'] = labels
# df_all.to_csv('yahtzee-all.csv', index=False)
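
The manual counter can also be expressed with the standard library. A minimal alternative sketch, assuming only that the same 1 to 6 face values and the same row order are wanted:

```python
from itertools import product

DICE, EYES = 5, 6

# product() varies the last position fastest; reversing each tuple reproduces
# the order of the manual counter, where the first die changes fastest.
all_throws = [tuple(reversed(t)) for t in product(range(1, EYES + 1), repeat=DICE)]

print(len(all_throws))  # 7776
print(all_throws[:3])   # [(1, 1, 1, 1, 1), (2, 1, 1, 1, 1), (3, 1, 1, 1, 1)]
```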

In [31]:
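# Rows that occur in both frames are dropped entirely (keep=False),
# so only the rows that appear in just one of the two frames remain
df_other = pd.concat([df, df_all]).drop_duplicates(keep=False)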
# df_other.to_csv('yahtzee-other.csv', index=False)