# Welcome!

PyMouseGesture shows how to build a simple recurrent neural network (RNN) in Keras, using the interpretation of mouse movements as gestures as the running example.

## Organization
1. [Collecting data](#Collecting_data)
1. [Data Visualization](#data_vis)
1. [Cleaning and Structuring data](#Process_data)
1. [Building an RNN in Keras](#build)
1. [Training the model](#train)
1. [Live Inference](#Live_Inference)

<a id = "Collecting_data"></a>

## Collecting data


The data was collected using the Python library [pynput](https://pypi.org/project/pynput/). The helper script `mouse_data_collector.py` collects and labels the mouse data at the same time. Note that when you exit the script, the collected data is written to `data.csv`, overwriting any existing file of that name in the directory.

The logic behind the script is simple:
- Create a mouse event listener.
- Define the callbacks to run when the mouse is moved and when it is clicked.
- While the mouse is moving, collect data for the current gesture.
- When the mouse is clicked, stop collecting data for the current gesture.
- Ask the user to label the gesture.
- Save the collected data as lists of x and y coordinates.
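
The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual `mouse_data_collector.py`. It assumes pynput's `mouse.Listener` with `on_move` and `on_click` callbacks; the `GestureRecorder` class, the `record` function, the `ask_label` parameter and the `gestures_sketch.csv` file name are hypothetical names introduced here. The column names follow the ones shown for `data.csv` further down. Stopping the recording with Ctrl+C is also an assumption of this sketch.

```python
import csv


class GestureRecorder:
    """Collects mouse positions and labels each gesture when the mouse is clicked."""

    def __init__(self, ask_label=input):
        # ask_label is a hypothetical hook: any callable that takes a prompt
        # and returns the label for the gesture that was just drawn.
        self.ask_label = ask_label
        self.xs, self.ys = [], []   # coordinates of the gesture in progress
        self.rows = []              # finished gestures: (label, xs, ys)

    def on_move(self, x, y):
        # Called for every mouse movement: extend the current gesture.
        self.xs.append(int(x))
        self.ys.append(int(y))

    def on_click(self, x, y, button, pressed):
        # Only a button press (not the release) ends a gesture,
        # and empty gestures are ignored.
        if not pressed or not self.xs:
            return
        label = self.ask_label("Label for this gesture: ")
        self.rows.append((label, self.xs, self.ys))
        self.xs, self.ys = [], []

    def save(self, path):
        # Overwrites any existing file at `path`.
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["sequence", "x_coordinates", "y_coordinates"])
            for label, xs, ys in self.rows:
                writer.writerow([label, str(xs), str(ys)])


def record(path="gestures_sketch.csv"):
    # pynput is imported here so the class above can be used without it.
    from pynput import mouse

    recorder = GestureRecorder()
    try:
        with mouse.Listener(on_move=recorder.on_move,
                            on_click=recorder.on_click) as listener:
            listener.join()          # block until interrupted
    except KeyboardInterrupt:
        pass
    finally:
        recorder.save(path)
```

Calling `input` inside a listener callback blocks further mouse events until the label is typed. That is acceptable for a sketch, but a more careful script might hand the finished gesture to the main thread instead.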

In [2]:
# Import the necessary packages and finish setup
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Set the NumPy random seed for reproducible results
np.random.seed(0)

%matplotlib inline

<a id = 'data_vis'></a>

## Data Visualization

Let us first visualize our collected data before doing anything else. We are using [pandas](https://pandas.pydata.org/) to display the data and [matplotlib's pyplot](https://matplotlib.org/users/pyplot_tutorial.html) to plot the examples.

In [86]:
#read data.csv using pd.read_csv
data = pd.read_csv('data.csv')
print(data.columns)
data.head()

Index(['sequence', 'x_coordinates', 'y_coordinates'], dtype='object')


|   | sequence | x_coordinates | y_coordinates |
|---|----------|---------------|---------------|
| 0 | 0 | [305, 305, 303, 299, 295, 289, 281, 272, 264, ... | [141, 141, 141, 141, 141, 139, 138, 137, 136, ... |
| 1 | 0 | [238, 234, 226, 214, 200, 175, 149, 124, 103, ... | [166, 165, 165, 165, 167, 174, 183, 194, 206, ... |
| 2 | 0 | [162, 157, 146, 134, 102, 69, 17, -37, -66, -5... | [257, 256, 253, 252, 250, 250, 255, 267, 293, ... |
| 3 | 0 | [239, 237, 234, 230, 223, 212, 199, 181, 162, ... | [205, 204, 204, 204, 204, 205, 209, 214, 219, ... |
| 4 | 0 | [248, 247, 245, 240, 226, 214, 188, 160, 120, ... | [221, 220, 219, 218, 217, 215, 214, 214, 214, ... |


Let's plot the different sequences to see how they are represented. First, check how the coordinates are stored:

In [87]:
type(data.at[0,'x_coordinates'])

str

The coordinates are read back from the CSV as strings, so we need a helper that converts them into lists of integers.

In [83]:
def string_to_list(string_list):
    """
    Converts a list that was read from the CSV as a string back into a list of integers.
    Raises ValueError if the string contains non-numeric entries.
    
    Args:
    ----
    string_list - A list of integers read as a string literal
    
    Returns:
    -------
    int_list - A list of integers corresponding to the input string list
    
    ##Example
    z = string_to_list(data.at[0,'x_coordinates']) where data is a pandas DataFrame
        
    """
    # Strip the surrounding brackets, split on commas and convert each entry to int
    return list(map(int, string_list.strip('[]').split(',')))

In [106]:
string_to_list('[1,2,3,4]')

[1, 2, 3, 4]

In [107]:
# A float array cannot hold a whole list in a single cell: assigning one raises
# "ValueError: setting an array element with a sequence". Use dtype=object instead.
X = np.empty(data.shape, dtype=object)

In [109]:
X[:,0] = data['sequence']
for i in range(data.shape[0]):
    X[i,1] = string_to_list(data.at[i,'x_coordinates'])
    X[i,2] = string_to_list(data.at[i,'y_coordinates'])
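
Once the coordinates are available as lists of integers, a single gesture can be plotted. The following is a rough sketch rather than the notebook's own plotting code: `to_plot_coords` and `plot_gesture` are hypothetical helpers, and flipping the y values assumes that screen coordinates grow downwards while the plot's y axis grows upwards.

```python
def to_plot_coords(xs, ys):
    """Return x values unchanged and y values negated for plotting.

    Assumption: the recorded y coordinate grows towards the bottom of the
    screen, so negating it keeps the plotted gesture the right way up.
    """
    if len(xs) != len(ys):
        raise ValueError("x and y sequences must have the same length")
    return list(xs), [-y for y in ys]


def plot_gesture(xs, ys, label=None):
    """Draw one gesture as a connected path and mark its starting point."""
    # Imported here so the helper above has no plotting dependency.
    import matplotlib.pyplot as plt

    px, py = to_plot_coords(xs, ys)
    fig, ax = plt.subplots()
    ax.plot(px, py, marker="o", markersize=3)
    ax.scatter(px[:1], py[:1], color="green", zorder=3, label="start")
    ax.set_aspect("equal")   # keep the gesture's proportions
    ax.set_title("Gesture" if label is None else f"Gesture {label}")
    ax.legend()
    return ax
```

In the notebook this could be called as `plot_gesture(string_to_list(data.at[0, 'x_coordinates']), string_to_list(data.at[0, 'y_coordinates']), label=data.at[0, 'sequence'])`.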

## Live Inference <a id = 'Live_Inference' ></a> 