### This notebook chronicles the development of a myopic time-series model for running your vehicle on autopilot in a game such as GTA 5 or Forza.

# THEORY:

### Recurrent LSTM networks are fairly good at predicting time-series data. However, I've made a slight mathematical adjustment to the recurrent training data: instead of training the model on a series of discrete data points indicating net displacement, I've augmented the data to represent a series of Gaussian distributions centred on the ground-truth values from the original dataset.
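
A minimal sketch of that augmentation, assuming NumPy and a fixed standard deviation. The function name `augment_with_gaussians` and the default `sigma` are illustrative choices, not something defined elsewhere in this notebook:

```python
import numpy as np

def augment_with_gaussians(series, sigma=0.1, num_noise_pts=100, seed=0):
    """Return (X, y): X[k] holds noisy samples around series[k], y[k] is series[k + 1]."""
    rng = np.random.default_rng(seed)
    series = np.asarray(series, dtype=float)
    centres = series[:-1]
    # One cloud of samples per time step, shaped for an LSTM: (steps, samples, 1).
    X = rng.normal(loc=centres[:, None, None], scale=sigma,
                   size=(len(centres), num_noise_pts, 1))
    y = series[1:]
    return X, y

# Hypothetical displacement readings, just to show the shapes.
displacement = np.array([0.0, 0.1, 0.3, 0.6, 1.0])
X, y = augment_with_gaussians(displacement)
print(X.shape, y.shape)
```

Each ground-truth point becomes a cloud of samples whose mean sits close to the original value, and the target is the next point in the series.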

### This has a few properties: 

### (1) The predictions Y_n are a function of a randomly sampled value Y_{n-1}; in the limit as the number of such recurrent samples approaches infinity, the distribution of the training data approaches a Gaussian distribution. Through integral calculus, it can be shown that the expectation value of each distribution is equal to the value before the one being predicted. This allows us to model our time series data as a martingale, which is fitting if the noise in the data is modelled as driftless Brownian motion, and it enables us to regularise the data to many possible outcomes as a result, i.e. order from chaos.

### (2) Since the data is both a martingale AND a Gaussian process, applying Bayes' theorem to the joint probabilities shows that the logarithm of the probability of predicting a certain value is, up to an additive constant, proportional to the negative of the mean-squared-error loss used in linear regression models. Maximising that log-probability is therefore equivalent to minimising the mean squared error.
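
The two properties can be written out as a short derivation. This is a sketch under the assumption that each value is Gaussian around the previous one with a fixed variance \sigma^2 and that prediction errors are independent; \hat{y}_n (the model's prediction) and N (the number of samples) are notation introduced here:

```latex
% Assumption: Y_n given Y_{n-1} is Gaussian with fixed variance \sigma^2.
\begin{aligned}
Y_n \mid Y_{n-1} &\sim \mathcal{N}(Y_{n-1}, \sigma^2) \\
\mathbb{E}[Y_n \mid Y_{n-1}]
  &= \int_{-\infty}^{\infty} y \, \frac{1}{\sqrt{2\pi\sigma^2}}
     \exp\!\left(-\frac{(y - Y_{n-1})^2}{2\sigma^2}\right) dy = Y_{n-1} \\
\log p(y_1, \dots, y_N \mid \hat{y}_1, \dots, \hat{y}_N)
  &= -\frac{1}{2\sigma^2} \sum_{n=1}^{N} (y_n - \hat{y}_n)^2
     - \frac{N}{2} \log(2\pi\sigma^2) \\
  &= -\frac{N}{2\sigma^2} \, \mathrm{MSE} + \text{const}
\end{aligned}
```

The second line is the martingale property; the last two lines show that the log-likelihood and the mean squared error differ only by a negative scale factor and a constant.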

### TL;DR Applying a regularization technique from variational autoencoders to an LSTM model lets us take advantage of the recurrent nature of the model to basically train it on a martingale. Because of how the math works out (pen and paper wise), the problem can then be treated as a regular regression problem just by using a mean-squared-error loss function. Additionally, because Gaussians are continuous and infinitely differentiable (although they don't have the nicest derivatives), I'm led to believe that backpropagation will train the network using information from these Gaussians and hence generalize the model to adapt to them more in the future.

# APPLICATION: 

### What we have is a time series model to predict the behaviour of a car in 2 dimensions. Assuming that a video game obeys most of the laws of classical mechanics, it should suffice to calculate things like velocity and acceleration by computing first and second order derivatives of the position in 2D.
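
As a rough illustration of the derivative step, the sketch below uses `np.gradient` on evenly spaced position samples. The function name `velocity_and_acceleration`, the sample spacing `dt` and the test trajectory are assumptions made for this example:

```python
import numpy as np

def velocity_and_acceleration(positions, dt):
    """Finite-difference first and second derivatives of (x, y) positions sampled every dt seconds."""
    positions = np.asarray(positions, dtype=float)
    velocity = np.gradient(positions, dt, axis=0)      # first derivative of position
    acceleration = np.gradient(velocity, dt, axis=0)   # second derivative of position
    return velocity, acceleration

# Hypothetical trajectory: x = t^2 (constant acceleration), y = 3t (constant velocity).
t = np.arange(0.0, 1.0, 0.1)
positions = np.stack([t ** 2, 3.0 * t], axis=1)
v, a = velocity_and_acceleration(positions, dt=0.1)
print(v[5], a[5])
```

Away from the two ends of the array the central differences are exact for this quadratic path; the edge samples use one-sided differences and are less accurate.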

### The ML model predicts the components of acceleration in the plane of the road when activated. While 'inactive', the model trains in the background on keylogger data. The two run simultaneously by launching both the model and a keylogger with Python's multiprocessing module. The force is assumed constant, and when the user leaves their chair to accept a pizza at the front door, the keylogger detects that there has been no user input for long enough and autopilot kicks in!
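
One way the "no input for long enough" check could look, as a sketch only. The class name `IdleMonitor`, the five-second threshold and the injectable `clock` are all assumptions for illustration; the keylogger callback would call `record_input()` on every keypress:

```python
import time

class IdleMonitor:
    """Tracks the time of the last user input and reports when autopilot should take over."""

    def __init__(self, idle_threshold_s=5.0, clock=time.monotonic):
        self.idle_threshold_s = idle_threshold_s
        self.clock = clock                  # injectable so the logic can be tested without waiting
        self.last_input_time = clock()

    def record_input(self):
        # Call this from the keylogger's on_press callback.
        self.last_input_time = self.clock()

    def autopilot_should_engage(self):
        return self.clock() - self.last_input_time >= self.idle_threshold_s

# Simulated clock: the user stops typing at t = 0 and is still away at t = 6.
now = [0.0]
monitor = IdleMonitor(idle_threshold_s=5.0, clock=lambda: now[0])
now[0] = 6.0
print(monitor.autopilot_should_engage())
```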

### Based on Gaussians centred on recent driving data, the network determines which keys to press, i.e. hit the brakes or gas, or hold a right or left turn. Granted, in this naive form the car is able to predict human driving patterns, but it is completely blind. The assumption is that you have enough time to answer the door and return, and hopefully you haven't crashed!
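
The key-selection step might be sketched as a pure function like the one below. The W/A/S/D layout matches the keys used later in this notebook; the function name `keys_for_prediction` and the `deadzone` parameter are hypothetical additions to avoid reacting to near-zero predictions:

```python
def keys_for_prediction(predicted_x, predicted_y, deadzone=0.05):
    """Map predicted acceleration components to the W/A/S/D keys that should be pressed."""
    keys = []
    if predicted_x > deadzone:
        keys.append('d')      # steer right
    elif predicted_x < -deadzone:
        keys.append('a')      # steer left
    if predicted_y > deadzone:
        keys.append('w')      # gas
    elif predicted_y < -deadzone:
        keys.append('s')      # brake / reverse
    return keys

print(keys_for_prediction(0.3, -0.2))
```

Keeping this separate from the call that actually presses keys makes the decision logic easy to check without a game running.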

In [19]:
#Data pipeline has 3 parts:

#2 Model training threads

#Keylogging thread 

#Model deployment thread 

In [72]:
import warnings
import numpy as np
import pandas as pd
import pyautogui

ModuleNotFoundError: No module named 'pyautogui'

In [73]:
pip install pyautogui

Collecting pyautogui
  Using cached PyAutoGUI-0.9.53.tar.gz (59 kB)
Collecting pymsgbox
  Using cached PyMsgBox-1.0.9.tar.gz (18 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Collecting PyTweening>=1.0.1
  Using cached pytweening-1.0.4.tar.gz (14 kB)
Collecting pyscreeze>=0.1.21
  Using cached PyScreeze-0.1.28.tar.gz (25 kB)
^C
  Installing build dependencies ... canceled
ERROR: Operation cancelled by user
Note: you may need to restart the kernel to use updated packages.


In [None]:
#Data Preprocessing
from sklearn import model_selection
from sklearn.metrics import confusion_matrix
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split 
from sklearn.preprocessing import MinMaxScaler
#Time Series Machine Learning Based
import keras
from keras.models import Sequential
from keras.layers import Dense, LSTM, Dropout
import tensorflow as tf
#Discrete

In [55]:
from multiprocessing import Process
import time 
import os
import random
from pynput.keyboard import Key, Listener, KeyCode
import logging

In [21]:
logdir = os.getcwd() + '/'
logdir

'/home/dolan/Desktop/DataScience/selfdriving/'

In [70]:
array = [] #every key seen by the listener, in order


def onPress(key):
    a = KeyCode.from_char("a")
    logging.info(str(key))
    array.append(key)
    start = time.time()
    if key == Key.esc:
        return False #returning False stops the listener
    elif key == a:
        print('Hooray!')
    else:
        print(key)
    print(time.time() - start) #time spent inside this handler, not the gap between keypresses


#Blocks until Esc is pressed (or the kernel is interrupted)
with Listener(on_press = onPress) as listener:
    listener.join()


Key.down
0.0013761520385742188
Key.up
0.0001418590545654297
Key.down
0.00020575523376464844
Key.up
0.0001533031463623047
Key.up
0.00017499923706054688
Key.down
0.0001316070556640625
Key.up
0.000171661376953125


KeyboardInterrupt: 

In [49]:
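val = array[-1]
#str() of a pynput KeyCode gives the quoted form "'a'", so this comparison never matches and nothing is printed
if str(val) == 'a':
    print('yay')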

In [65]:
test = KeyCode.from_char("a")
if array[-1] == test:
    print('Hooray')

Hooray


In [13]:
logging.info()

TypeError: info() missing 1 required positional argument: 'msg'

In [28]:
#Multiprocessing practice example (each Process gets its own copy of counter)

counter = 0

def fizz():
    global counter 
    while counter < 10:
        print('fizz', counter)
        counter += 1
def buzz():
    global counter 
    while counter < 10:
        print('buzz', counter)
        counter += 1
p1 = Process(target = fizz)
p1.start()
p2 = Process(target = buzz)
p2.start()


fizz buzz0 0
fizz 
buzz1 
fizz1 
2
buzzfizz 3 2

fizz buzz 43
fizz 5
fizz 6

buzzfizz 4
buzz 5
buzz  67

fizzbuzz  87

fizzbuzz 8 
9buzz
 9
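
In the run above, each `Process` works on its own copy of `counter`, which is why both `fizz` and `buzz` count from 0 to 9 and their prints interleave. If the intent is for the two workers to share one counter, a thread-based sketch with a lock is one option. The helper name `run_shared_counter` is made up for this example, and using threads instead of processes is a substitution, not what the cell above does:

```python
import threading

def run_shared_counter(limit=10):
    """Two threads increment one shared counter; the lock makes each step atomic."""
    counter = 0
    lock = threading.Lock()
    log = []  # (worker name, counter value) in the order the increments happened

    def worker(name):
        nonlocal counter
        while True:
            with lock:
                if counter >= limit:
                    return
                log.append((name, counter))
                counter += 1

    threads = [threading.Thread(target=worker, args=(name,)) for name in ('fizz', 'buzz')]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log

log = run_shared_counter()
print([value for _, value in log])
```

Which thread handles which value varies from run to run, but every value from 0 to 9 appears exactly once.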


In [77]:
class Vehicle:
    '''In order to have up-to-date training data, the two arrays must be QUEUES.
    The oldest information needs to be forgotten first, so pop from the front of each every time
    a new value is added to the velocity data for the graph.

    The assumption is that by cumulatively adding increments of the acceleration with each keypress,
    the sum in the onPress function effectively numerically integrates the acceleration wrt time,
    i.e. += 1.0 * alpha, our incremental change in time (although we should adjust this by finding differences
    in Unix time!)
    '''

    def __init__(self, memory_window_size, model):
        self.latest_vx = 0.0
        self.latest_vy = 0.0 #2D velocity components
        self.model = model
        self.history = None
        self.key = None #most recent key seen by onPress
        self.vx = [] #Update this parameter with new values each time
        self.vy = []

        self.memory_window_size = memory_window_size

        #Training windows start zero-filled so the queues always hold memory_window_size entries
        self.X_horizontal, self.y_horizontal = [0] * memory_window_size, [0] * memory_window_size
        self.X_vertical, self.y_vertical = [0] * memory_window_size, [0] * memory_window_size

        self.time_elapsed = 0.0
        
    def onPress(self, key, alpha=0.1):
        logging.info(str(key))
        w = KeyCode.from_char("w")
        s = KeyCode.from_char("s")
        d = KeyCode.from_char("d")
        a = KeyCode.from_char("a")
        #training_thread = process(target=self.train)
        #Generate velocity-time data by numerically integrating acceleration over each keypress timestep.
        #alpha is a fixed delta_t for now; measuring it from differences in time.time() between
        #keypresses is still to do.
        self.key = key
        if self.key == w:
            self.latest_vy += 1.0 * alpha #dv = a dt
            self.vy.append(self.latest_vy)
        elif self.key == s:
            self.latest_vy -= 1.0 * alpha
            self.vy.append(self.latest_vy)
        elif self.key == d:
            self.latest_vx += 1.0 * alpha
            self.vx.append(self.latest_vx)
        elif self.key == a:
            self.latest_vx -= 1.0 * alpha
            self.vx.append(self.latest_vx)
        elif self.key == Key.esc:
            self.time_elapsed = 0.0
            return False #returning False stops the listener
        
    def listen(self, called=False):
        #Blocks until onPress returns False (Esc)
        if called:
            with Listener(on_press = self.onPress) as listener:
                listener.join()
    
    
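    def gaussian_noise(self, value, num_noise_pts=100):
        '''Sample num_noise_pts points from a Gaussian centred on value.'''
        sigma = random.uniform(0, 1) #standard deviation can be obtained more accu
        centre = float(np.mean(value)) #assumption: accept a scalar or an array of earlier samples and use its mean
        return np.random.normal(centre, sigma, (num_noise_pts, 1)) #shape matches the LSTM input (100, 1)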

    def generate_gaussian_training_data(self, memory_window_size=None):
        '''Rebuild both training windows from the most recent velocity samples.'''
        n = memory_window_size or self.memory_window_size

        def build_window(v):
            recent = v[-(n + 1):] #older samples fall out of the window first
            if len(recent) < 2:
                return None, None #need at least one (previous, next) pair
            #Each input is a Gaussian cloud centred on v[i-1]; the target is v[i]
            X = np.array([self.gaussian_noise(prev) for prev in recent[:-1]])
            y = np.array(recent[1:])
            #Normalise by the largest magnitude, guarding against division by zero
            return X / (np.max(np.abs(X)) or 1.0), y / (np.max(np.abs(y)) or 1.0)

        X_h, y_h = build_window(self.vx)
        if X_h is not None:
            self.X_horizontal, self.y_horizontal = X_h, y_h

        X_v, y_v = build_window(self.vy)
        if X_v is not None:
            self.X_vertical, self.y_vertical = X_v, y_v
    
    def autopilot(self, timestep=0.0015):
        '''Generate a Gaussian every timestep centred around the
        previously predicted value if evaluating in real time.
        '''
        X_horiz_eval = self.gaussian_noise(self.X_horizontal[-1])
        X_vert_eval = self.gaussian_noise(self.X_vertical[-1])

        #model.predict expects a leading batch axis, so add one before predicting
        predicted_x = float(self.model.predict(np.expand_dims(X_horiz_eval, axis=0))[0, 0])
        predicted_y = float(self.model.predict(np.expand_dims(X_vert_eval, axis=0))[0, 0])

        if predicted_x > 0.0:
            pyautogui.press('d')
        elif predicted_x < 0.0:
            pyautogui.press('a')
        if predicted_y > 0.0:
            pyautogui.press('w')
        elif predicted_y < 0.0:
            pyautogui.press('s')
        
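    def go_gadget_go(self): #Activate the operational loop of the model using
                            #multiprocessing...
        key = getattr(self, 'key', None) #last key seen by onPress, if any

        if key == KeyCode.from_char("p"):
            #Keep the handles on self so a later call can stop the processes
            self.keylogger = Process(target=self.listen, args=(True,))
            self.neural_net = Process(target=self.autopilot)

            self.keylogger.start()
            self.neural_net.start()
        elif key == KeyCode.from_char("o"):
            neural_net = getattr(self, 'neural_net', None)
            if neural_net is not None and neural_net.is_alive():
                neural_net.terminate() #Terminate the AI-prediction-based process
                                       #Sounded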
        

TypeError: __init__() missing 1 required positional argument: 'model'

In [79]:
#The input data should have shape (car.memory_window_size, num_noise_pts=100, 1)
'''For unseen future data, a Gaussian distribution function should be used to iteratively generate the next 
Gaussian distribution as a function of the value of the last prediction, i.e. training data as a function of 
time should be considered a martingale. That is for each timestep, generate the next distribution as a 
Gaussian centred around the previous data point.

#Maybe also just pretrain the recurrent LSTM and deploy it live only if needed... we don't need this portfolio 
project to turn into a PhD thesis :)

#If the model is pretrained on a sufficiently large time window, simply use the above generator function to create 
the next Gaussian distribution of random data, set X_test = np.expand_dims(X_test, axis=0) to reshape it for 
single unit data inputs (for one unit of training data in this case the Gaussian distr) and then use model.predict

'''
model = Sequential()
#100 for the 100 noise points attributed to each Gaussian distr
model.add(LSTM(units=100, return_sequences=True,input_shape=(100, 1)))#X_train.shape[1:]
model.add(Dropout(0.2))
model.add(LSTM(150, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(150, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(150, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(150, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(150))
model.add(Dropout(0.2))
model.add(Dense(1))
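
The iterative scheme described in the note at the top of this cell can be sketched without Keras by passing in any prediction function. Here `rollout`, `predict_fn`, `sigma` and the stand-in `mean_predictor` are hypothetical names; with a trained network, `model.predict` would be passed as `predict_fn`:

```python
import numpy as np

def rollout(predict_fn, start_value, steps, sigma=0.1, num_noise_pts=100, seed=0):
    """Repeatedly sample a Gaussian around the last prediction and predict the next value."""
    rng = np.random.default_rng(seed)
    value = float(start_value)
    trajectory = []
    for _ in range(steps):
        window = rng.normal(value, sigma, size=(num_noise_pts, 1))
        batch = np.expand_dims(window, axis=0)   # (1, num_noise_pts, 1): one sample for the model
        value = float(np.asarray(predict_fn(batch)).reshape(-1)[0])
        trajectory.append(value)
    return trajectory

# Stand-in predictor that returns the mean of each window, in place of a trained model.
def mean_predictor(batch):
    return batch.mean(axis=1)

print(rollout(mean_predictor, start_value=1.0, steps=3))
```

With the stand-in predictor the trajectory just drifts slowly around the start value, which is the martingale behaviour the note describes.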

In [80]:
model.compile(optimizer = 'adam', loss = 'mean_squared_error')

In [81]:
#To train two models concurrently while running the keylogger simultaneously, 

#we likely need to train the model using the GradientTape method.

#Write a training loop from scratch and update all parameters, train the models and if AFK, 

#Run a model.predict function on the future data. #Ofc, because it is trained on a series of Gaussians, 

#we could just generate random Gaussians with random means (itself a Gaussian process!) and simply give them

#extremely wide variances so that we can predict these random clusters for input to the model as a function of 

#time.
car = Vehicle(100, model)

In [82]:
car.model.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 lstm_6 (LSTM)               (None, 100, 100)          40800     
                                                                 
 dropout_6 (Dropout)         (None, 100, 100)          0         
                                                                 
 lstm_7 (LSTM)               (None, 100, 150)          150600    
                                                                 
 dropout_7 (Dropout)         (None, 100, 150)          0         
                                                                 
 lstm_8 (LSTM)               (None, 100, 150)          180600    
                                                                 
 dropout_8 (Dropout)         (None, 100, 150)          0         
                                                                 
 lstm_9 (LSTM)               (None, 100, 150)         