# Dino Game model fitting

## Summary
* Initial go at the dino game
* Read in Spencer's data and do some very minimal pre-processing to make things play nicely with popular ML packages like sklearn, keras, and PyTorch
* Fit a basic Softmax Regression model -- performs WAY better than I expected
* This nice performance could be super misleading, though -- it could be a result of the data structure I mentioned above, or of the fact that these samples aren't IID (one frame of the game depends on the ones before it), so we are inherently violating the assumptions of a LogReg model. For this, I suggest a "thinning" type of approach across even more dino games if we want to learn an actual mapping
* Will likely need a lot more data to train a deep net, especially since LogReg performs so well.
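
A minimal sketch of what the "thinning" idea could look like, assuming the frames are stored in capture order. The helper name `thin_frames` and the `step` value are hypothetical, and the toy arrays only stand in for the real screenshots and key codes:

```python
import numpy as np

def thin_frames(images, labels, step=5):
    """Keep every `step`-th frame so the retained samples are further apart in time."""
    keep = np.arange(0, len(labels), step)
    return images[keep], labels[keep]

# Toy stand-in for the real data: 20 tiny "frames" with key-code labels.
rng = np.random.default_rng(0)
toy_images = rng.integers(0, 256, size=(20, 4, 6))
toy_labels = rng.choice([-1, 38, 40], size=20)

thin_images, thin_labels = thin_frames(toy_images, toy_labels, step=5)
print(thin_images.shape, thin_labels.shape)
```

Thinning only reduces the dependence between neighbouring frames; it does not make them independent, and a larger `step` throws away more data.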

### Inputs

* Spencer went all-out and sent massive image files in RGB channels (lol) so for now, I will just flatten these, drop that third channel (will mess up scaling of values in matrix though), and train on these vectors.
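
As an alternative to keeping a single raw channel, here is a rough sketch of collapsing RGB frames to grayscale before flattening. It assumes the frames arrive as an `(n, height, width, 3)` array in RGB order (OpenCV captures are often BGR, so the weights would need reversing in that case); `to_gray_flat` is a made-up helper name:

```python
import numpy as np

def to_gray_flat(frames_rgb):
    """Collapse (n, height, width, 3) RGB frames to (n, height*width) grayscale rows."""
    weights = np.array([0.299, 0.587, 0.114])  # ITU-R BT.601 luma weights for R, G, B
    gray = frames_rgb.astype(np.float32) @ weights  # weighted sum over the channel axis
    return gray.reshape(gray.shape[0], -1)

# Toy frames: two all-white 3x5 RGB images.
toy_rgb = np.full((2, 3, 5, 3), 255, dtype=np.uint8)
flat = to_gray_flat(toy_rgb)
print(flat.shape)
```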

### Outputs

* A label prediction for the key to press (or what action the model should take given the pixel values in the image?)

### Modeling task
* Given an input image $X$, output a label for the action to be taken by the model (jump, duck, nothing)

### Evaluation metric
* Classification accuracy 
* Cross-entropy loss for training
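
For reference, a small sketch of how the two metrics relate, written in plain NumPy on made-up probabilities (class indices 0/1/2 here are placeholders, not the real key codes):

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of frames where the predicted class matches the true class."""
    return float(np.mean(y_true == y_pred))

def cross_entropy(y_true_idx, probs, eps=1e-12):
    """Mean negative log-probability the model assigns to the true class."""
    picked = probs[np.arange(len(y_true_idx)), y_true_idx]
    return float(-np.mean(np.log(np.clip(picked, eps, 1.0))))

# Toy predicted probabilities for 4 frames over 3 classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.5, 0.4, 0.1]])
y_true = np.array([0, 1, 2, 1])
y_pred = probs.argmax(axis=1)

acc = accuracy(y_true, y_pred)
ce = cross_entropy(y_true, probs)
print(acc, ce)
```

Accuracy only looks at the arg-max class, while cross-entropy also penalises a correct prediction made with low confidence, which is why it is the smoother quantity to train on.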

### Models
* Multinomial logistic/Softmax regression
* ConvNets (1D and 2D) -- Spencer suggested ResNet, but I have so little knowledge in this field that I kind of want to do a "survey" first
* SVM models?

### To-do
* Data pre-processing to make this task more "learnable" is needed.
    * Images should be converted to grayscale -- models not learning well
    * Perhaps move from predicting the integer value of the key press to a one-hot encoding (I can write a utility function to go back and forth between the two to make the actual dinosaur move)

* More modeling
    * Move to Google Colab for ConvNets -- LogReg was painfully slow to fit on my local machine
    * Flat-out try fancy models next!
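
The one-hot utility from the to-do list could look roughly like this. It assumes the three key codes that show up in the label counts further down (-1, 38, 40); `KEY_CODES`, `keys_to_one_hot`, and `one_hot_to_keys` are hypothetical names:

```python
import numpy as np

# Assumed label set, kept sorted so np.searchsorted can map codes to positions.
KEY_CODES = np.array([-1, 38, 40])  # nothing, up, down

def keys_to_one_hot(keys):
    """Map key codes to one-hot rows; every code must be present in KEY_CODES."""
    idx = np.searchsorted(KEY_CODES, keys)
    return np.eye(len(KEY_CODES), dtype=np.float32)[idx]

def one_hot_to_keys(one_hot):
    """Map one-hot (or probability) rows back to key codes via arg-max."""
    return KEY_CODES[np.argmax(one_hot, axis=1)]

keys = np.array([38, -1, 40, -1])
encoded = keys_to_one_hot(keys)
decoded = one_hot_to_keys(encoded)
print(encoded)
print(decoded)
```

Because the decoder uses arg-max, it also works directly on a model's predicted class probabilities.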

In [1]:
import random
import pickle
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import os
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, classification_report, accuracy_score
from pathlib import Path
from keras.utils import to_categorical
from typing import List
import cv2 as cv
# Using SMOTE for the over sampling portion.
from imblearn.over_sampling import SMOTE
%matplotlib inline

In [2]:
ta1 = os.path.join(Path(os.getcwd()).parent,'Window_capture\\Data\\command_keys_act.npy')
sa1 = os.path.join(Path(os.getcwd()).parent,'Window_capture\\Data\\cleaned_data_act.npy')

In [3]:
labels1 = np.load(ta1)
images1 = np.load(sa1, allow_pickle = True)
print("Length Command Keys Shape: ",labels1.shape)
print("Length Screenshot Shape: ",images1.shape)
print("Screenshot Shape: ",images1[0].shape)

Length Command Keys Shape:  (1823,)
Length Screenshot Shape:  (1823, 288, 1450)
Screenshot Shape:  (288, 1450)


In [4]:
target_address = os.path.join(Path(os.getcwd()).parent,'Window_capture\\Data\\command_keys.npy')
# screenshot_address = os.path.join(Path(os.getcwd()).parent,'Window_capture\\Data\\screenshots.npy')
screenshot_address = os.path.join(Path(os.getcwd()).parent,'Window_capture\\Data\\cleaned_data.npy')

labels = np.load(target_address)
images = np.load(screenshot_address, allow_pickle = True)


print("Length Command Keys Shape: ",labels.shape)
print("Length Screenshot Shape: ",images.shape)
print("Screenshot Shape: ",images[0].shape)

Length Command Keys Shape:  (12563,)
Length Screenshot Shape:  (12563, 288, 1450)
Screenshot Shape:  (288, 1450)


In [5]:
# cv.imshow('Canny Image',images[0]) # We will use this.
# cv.waitKey(0) # press any key to quit.
# cv.destroyWindow('Canny Image')

In [6]:
labels0 = np.concatenate((labels, labels1))
labels0.shape # Total # of labels.

(14386,)

In [7]:
np.unique(labels0, return_counts = True) # We see quite a bit of imbalance among the do nothing / jump / duck

(array([-1, 38, 40]), array([12070,  1132,  1184], dtype=int64))

In [8]:
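res_list = [i for i, value in enumerate(labels0) if value == -1]  # indices of the "do nothing" (-1) frames
# NOTE: this samples 11000 positions from range(len(res_list)), not from the entries of res_list,
# so the rows dropped below can also include jump/duck frames (compare the class counts in In [7] and In [15]).
idx = np.random.choice(len(res_list), 11000, replace=False)
idx # row positions to remove from both the images and the labels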

array([10494,  7110,  2044, ...,  7381,  6506, 10714])
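
Since the cell above draws positions from `range(len(res_list))` rather than from `res_list` itself, here is a sketch of how the undersampling could be restricted to "do nothing" frames only. The toy labels are made up; with the real data, the sampled indices would also need to stay within the rows of the array being thinned:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy labels: 12 "do nothing" frames, 3 jumps, 3 ducks, in shuffled order.
toy_labels = np.array([-1] * 12 + [38] * 3 + [40] * 3)
rng.shuffle(toy_labels)

# Sample from the actual indices of the -1 frames, not from range(len(...)).
nothing_idx = np.flatnonzero(toy_labels == -1)
drop_idx = rng.choice(nothing_idx, size=8, replace=False)

kept_labels = np.delete(toy_labels, drop_idx)
values, counts = np.unique(kept_labels, return_counts=True)
print(values, counts)
```

With this version the jump and duck counts are untouched and only the majority class shrinks.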

In [9]:
images.shape # Just taking a look at the number of observations in the images

(12563, 288, 1450)

In [10]:
# Flatten each 288x1450 frame into a 417600-long row; a DataFrame makes it easy to drop rows by position
images_flat = pd.DataFrame(images.reshape(images.shape[0], -1))
images_flat = images_flat.drop(images_flat.index[idx])

In [11]:
images_flat.shape # Result: 12563 total rows minus the 11000 dropped rows

(1563, 417600)

In [12]:
images_flat1 = pd.DataFrame(images1.reshape(images1.shape[0], -1)) # Jumps and ducks
images_flat1.shape

(1823, 417600)

In [13]:
imgs = np.vstack((images_flat, images_flat1))
imgs.shape # Final Shape

(3386, 417600)

In [14]:
labels0 = np.delete(labels0, idx)
labels0.shape

(3386,)

In [15]:
np.unique(labels0, return_counts = True)

(array([-1, 38, 40]), array([1492,  921,  973], dtype=int64))

In [16]:
X_train, X_test, y_train, y_test = train_test_split(imgs, labels0, test_size = 0.25)

In [17]:
np.unique(y_train, return_counts = True)

(array([-1, 38, 40]), array([1140,  689,  710], dtype=int64))

In [18]:
# Oversample the minority classes (training split only)
smote = SMOTE(random_state = 101)
X_train_samp, y_train_samp = smote.fit_resample(X_train, y_train)

In [19]:
np.unique(y_train_samp, return_counts = True) # Classes are balanced after oversampling

(array([-1, 38, 40]), array([1140, 1140, 1140], dtype=int64))

In [20]:
# Fit LogReg model
log_reg = LogisticRegression()
log_reg.fit(X_train_samp,y_train_samp)

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression


LogisticRegression()
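
The warning above points at two remedies: scale the inputs or raise `max_iter`. A minimal sketch of both on synthetic data, assuming pixel intensities lie in the 0-255 range (the notebook does not confirm this for the cleaned frames) and using an arbitrary `max_iter=1000`:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for flattened frames: 120 samples, 50 "pixels" each.
y = rng.choice([-1, 38, 40], size=120)
X_raw = rng.integers(0, 200, size=(120, 50)).astype(np.float32)
X_raw[y == 38, :10] += 50    # give each action class a weak signal
X_raw[y == 40, 10:20] += 50

# Assumption: intensities are in [0, 255], so dividing by 255 maps them to [0, 1].
X_scaled = X_raw / 255.0

clf = LogisticRegression(max_iter=1000)
clf.fit(X_scaled, y)
print(clf.n_iter_, clf.score(X_scaled, y))
```

Whatever scaling is applied at training time has to be applied to the test frames as well before calling `predict`.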

In [21]:
y_hat = log_reg.predict(X_test)
print(f'LogReg accuracy on held-out frames = {round(accuracy_score(y_test, y_hat),4)}')

LogReg accuracy on held-out frames = 0.9764


In [22]:
confusion_matrix(y_test, y_hat, labels=[-1, 38, 40])
target_names = ['nothing', 'up', 'down']
print(classification_report(y_test, y_hat, target_names=target_names))

              precision    recall  f1-score   support

     nothing       0.97      0.99      0.98       352
          up       0.96      0.96      0.96       232
        down       0.99      0.98      0.98       263

    accuracy                           0.98       847
   macro avg       0.98      0.97      0.98       847
weighted avg       0.98      0.98      0.98       847
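
The `confusion_matrix` call in the cell above is not the last expression, so its result is never displayed. A small sketch of one way to show it with readable row and column names, using made-up predictions rather than the model's actual output:

```python
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix

label_ids = [-1, 38, 40]
label_names = ['nothing', 'up', 'down']

# Toy labels and predictions, only to illustrate the layout.
y_true_toy = np.array([-1, -1, 38, 40, 40, 38, -1])
y_pred_toy = np.array([-1, 38, 38, 40, -1, 38, -1])

cm = confusion_matrix(y_true_toy, y_pred_toy, labels=label_ids)
cm_df = pd.DataFrame(cm,
                     index=[f'true {name}' for name in label_names],
                     columns=[f'pred {name}' for name in label_names])
print(cm_df)
```

Rows are the true actions and columns the predicted ones, so off-diagonal cells show which actions get confused with each other.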



In [23]:
# Use a context manager so the file handle is closed after writing
with open('Existing_Models/log-reg.pkl', 'wb') as f:
    pickle.dump(log_reg, f)

In [None]:
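# Leftover from the earlier RGB capture (3760 frames, 4-D array): keep channel 0 only, then flatten.
# The cleaned_data arrays loaded above are already 3-D, so this indexing does not apply to them.
images_flat = images[:, :, :, 0].reshape(images.shape[0], -1)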

In [None]:
# We want an input vector of 3760x(1450x288)
print(f'Shape of input vector: {images_flat.shape}')
print(f'Shape of targets: {labels.shape}')

#### View the actual frames from the screen capture, and the corresponding label

In [None]:
# Take a look at the idx'th frame
idx = 10
plt.rcParams['figure.figsize'] = [25, 10]
print(f'Label= {labels[idx]}')
plt.imshow(images[idx], interpolation='nearest')
plt.show()

#### Split into train-test splits for now (will do train-dev-test next week when I get into hyperparameter tuning)

In [None]:
X_train, X_test, y_train, y_test = train_test_split(images_flat, labels, test_size = 0.25)

In [None]:
print(f'{len(X_train)} training examples')
print(f'{len(X_test)} testing examples')

In [None]:
# Fit LogReg model
log_reg = LogisticRegression()
log_reg.fit(X_train, y_train)

In [None]:
y_hat = log_reg.predict(X_test)

In [None]:
print(f'LogReg accuracy on held-out frames = {round(accuracy_score(y_test, y_hat),4)}')

In [None]:
confusion_matrix(y_test, y_hat, labels=[32, 38, 40])

In [None]:
target_names = ['start', 'up', 'down']
print(classification_report(y_test, y_hat, target_names=target_names))

In [None]:
# Use a context manager so the file handle is closed after writing
with open('log-reg.pkl', 'wb') as f:
    pickle.dump(log_reg, f)