# SIXT33N version C
## Phase 2: Principal Component Analysis

### EE 16B: Designing Information Devices and Systems II, Fall 2015

**Name 1**:

**Login**: ee16b-


**Name 2**:

**Login**: ee16b-

## Table of Contents

* [Introduction](#intro)
* [Part 1: Data Collection](#part1)
* [Part 2: Principal Component Analysis](#part2)
* [Part 3: Classification](#part3)
* [Part 4: Audio Link](#part4)

<a id='intro'></a>
## Introduction

In this phase we will work out the mechanics of turning a gesture into a command that the Launchpad will execute.

For this version of the project you will generate commands by drawing patterns using a mouse on a PC. There are five different commands that you will draw, at minimum. (If you want to implement more commands then go for it!)

- Straight vertical line from bottom to top = Fast
- Straight vertical line from top to bottom = Slow
- Straight horizontal line from right to left = Turn Left
- Straight horizontal line from left to right = Turn Right
- Clockwise circle starting at 12 o'clock = Party Mode

You will draw these shapes in a Python canvas on your computer, so all of your PCA training and classification will run on your PC. Based on the classifier result, the PC will then transmit an audio signal, which will be picked up by the Launchpad.

The goals of this phase are as follows:
- Collect gesture data
- PCA + Classifier (PC, 5 commands)
- Check accuracy
- Send resulting command through audio link

<a id='part1'></a>
## <span style="color:blue">Part 1: Data Collection</span>

To start our classifier training, we need to first gather some data. We have developed a simple script that allows you to collect the xy-coordinates as you draw on a canvas. To use this script, run

<b>`python capture.py log.csv`</b>

on the command line. This will bring up a white canvas where you can draw a pattern. Start by drawing a line from left to right 20 times in the box. The recorded data is saved to a file called <b>`log.csv`</b>. If this is the data you want to use for your "Right" command, rename the file to <b>`right.csv`</b>.

Repeat the process for <b>`left.csv`</b>, <b>`up.csv`</b>, <b>`down.csv`</b> and <b>`circle.csv`</b>. Because the script appends to <b>`log.csv`</b>, rename the file before running the script again, or delete it if you want to discard the last run.

Make sure you have 20 recordings of each of the 5 gestures: right, left, up, down and clockwise circle. To get a robust training set, vary your gestures a bit by drawing them in different parts of the canvas and at different sizes.

<a id='part2'></a>
## <span style="color:blue">Part 2: Principal Component Analysis</span>

Now that you have some data, you can apply PCA to classify the different gestures. Run through the cells below to set iPython up.

In [None]:
import numpy as np
import numpy.linalg
import matplotlib.pyplot as plt
import csv
%matplotlib inline

In [None]:
def read_csv(filename):
    """
    Reads a csv file and returns the first 20 recordings from the file
    Input:
        filename: csv filename
    Output:
        data: a 20x66 matrix corresponding to the first 20 readings in the csv file. Each row corresponds
            to a reading; the first 33 values are x-coordinates while the second 33 values are y-coordinates
    """
    data = []
    with open(filename, 'r') as f:
        reader = csv.reader(f)
        for row in reader:
            # Each csv row holds the x- or y-coordinates of one reading
            data.append([float(value) for value in row])
    data = np.array(data)
    data = np.hstack((data[::2,:], data[1::2,:]))
    data = data[:20,:] # Take only first 20 readings
    return data

Using the <b>`read_csv`</b> function above, build the <b>`A`</b> matrix for PCA. The function <a href="http://docs.scipy.org/doc/numpy/reference/generated/numpy.vstack.html"><b>`np.vstack`</b></a> might be helpful here, since each recording should stay on its own row. Then plot the data using the <b>`plot_data`</b> function.
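As a minimal sketch of the stacking step, here are small made-up arrays standing in for the real 20x66 recordings. The names `fake_up` and `fake_down` are placeholders and not part of the lab code. Stacking vertically keeps one recording per row:

```python
import numpy as np

# Hypothetical stand-ins for two gesture files: 2 recordings each, 6 values per
# recording (3 x-coordinates followed by 3 y-coordinates).
fake_up = np.arange(12, dtype=float).reshape(2, 6)
fake_down = 100 + np.arange(12, dtype=float).reshape(2, 6)

# Stack the recordings row-wise so that every row is still one recording.
stacked = np.vstack((fake_up, fake_down))
print(stacked.shape)  # (4, 6)
```

With five real files of 20 recordings each, the same pattern would give a 100x66 array.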

In [None]:
# Load your 5 csv files and stack them into a 100x66 numpy array using the read_csv function
data = read_csv('up.csv')
data = np.vstack((data, read_csv('down.csv')))
data = np.vstack((data, read_csv('left.csv')))
data = np.vstack((data, read_csv('right.csv')))
data = np.vstack((data, read_csv('circle.csv')))

In [None]:
def plot_data(d):
    """
    Plots data in the right canvas coordinates
    Input:
        d: Nx66 data array
    Output:
        plots N curves on the same graph
    """
    plt.gca().invert_yaxis()
    for line in d:
        half = len(line) // 2  # integer division so the slice indices are ints
        plt.plot(line[:half], line[half:])

In [None]:
# See the gesture paths you recorded
plot_data(data)

Looking at the plot above, PCA might have some trouble classifying since the data is not normalized in any way. To solve this issue, implement a 'normalization' scheme where you center each gesture recording to (0,0). To do this, first find the mean of the x-coordinates and y-coordinates, then simply subtract that mean from each point. Remember that the first 33 elements of each row are the x-coordinates and the second 33 elements of each row are the y-coordinates; you have to normalize them separately. Your data should look similar to the plot below.

<center>
<img width='400px' src="http://inst.eecs.berkeley.edu/~ee16b/fa15/lab_pics/proj-gesture-norm.png">
</center>
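One possible way to read the centering step is sketched below on a single toy recording. The helper name `center_recording` and the toy values are assumptions made for illustration. The sketch only shows the idea of subtracting the x-mean and y-mean separately:

```python
import numpy as np

def center_recording(row):
    """Hypothetical helper: subtract the x-mean and y-mean from one recording."""
    half = len(row) // 2
    centered = np.array(row, dtype=float)
    centered[:half] -= centered[:half].mean()   # x-coordinates
    centered[half:] -= centered[half:].mean()   # y-coordinates
    return centered

# Toy recording with 3 points: x = (10, 20, 30), y = (5, 5, 8)
toy = np.array([10.0, 20.0, 30.0, 5.0, 5.0, 8.0])
print(center_recording(toy))  # x becomes (-10, 0, 10), y becomes (-1, -1, 2)
```

After centering, both halves of the row average to zero, so the gesture's position on the canvas no longer matters.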

In [None]:
# Normalize your data so it is more suitable for PCA
data_norm = np.copy(data)
# YOUR CODE HERE
for i in range(len(data_norm)):
    data_norm[i,:] =
    

In [None]:
# Plot normalized data
plot_data(data_norm)

In [None]:
# Plot each component of normalized data separately
plt.plot(data_norm[:,:33].T)
plt.title('x-coordinate data points')
plt.figure()
plt.plot(data_norm[:,33:].T)
plt.title('y-coordinate data points')
plt.show()

Now you can try using SVD to retrieve the principal components. After you have done so, plot the sigma values. If they are not satisfactory, think of other ways you can normalize the data and modify the cells above until you are satisfied.
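The sketch below shows what the sigma values can reveal. It uses synthetic data, not your recordings, and assumes the matrix is built so that nearly all of its variation lies along one direction. `np.linalg.svd` returns the singular values sorted from largest to smallest:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: 10 rows that are (almost) multiples of a single direction in R^4
direction = np.array([1.0, 2.0, 0.0, -1.0])
toy = np.outer(rng.normal(size=10), direction) + 0.01 * rng.normal(size=(10, 4))

# full_matrices=False returns the compact factorization toy = u @ diag(s) @ vt
u, s, vt = np.linalg.svd(toy, full_matrices=False)
print(s)  # one large singular value followed by much smaller ones
```

A sharp drop after the first few sigma values suggests that only those components carry most of the structure in the data.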

In [None]:
# Call SVD on the normalized data matrix
[u,s,v] = 

In [None]:
# Plot the sigma values
plt.stem(s)
plt.xlim([-0.5,10])
plt.title('Sigma values')

**<span style="color:red">How many principal components are significant?</span>**

YOUR ANSWER HERE

Now plot out the significant principal components you found above and project the data on the new space.
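As a rough illustration of projecting onto a new basis, the following sketch uses random toy data and an arbitrary choice of `k = 2` components. Both are assumptions made for the example. It takes the first `k` rows of the third matrix returned by `np.linalg.svd` as the principal directions:

```python
import numpy as np

rng = np.random.default_rng(1)
toy = rng.normal(size=(8, 5))          # 8 recordings, 5 features each (made up)
_, _, vt = np.linalg.svd(toy, full_matrices=False)

k = 2                                  # number of components kept (arbitrary here)
basis = vt[:k, :]                      # k x 5: each row is one principal direction
coords = np.dot(toy, basis.T)          # 8 x k: coordinates of every recording
print(coords.shape)  # (8, 2)
```

Each row of `coords` holds the coordinates of one recording in the reduced space, which is what the scatter plots in this part display.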

In [None]:
# Plot the significant principal components
# YOUR CODE HERE


In [None]:
# Project the data matrix on to the first 3 principal components
# YOUR CODE HERE
proj = 


Let's plot the data with only 2 principal components.

In [None]:
# Plot the projection on the first 2 principal components
n = 20 # Number of recordings of each gesture
plt.scatter(proj[:n,0], proj[:n,1], c=['red'], edgecolor='None')
plt.scatter(proj[n:2*n,0], proj[n:2*n,1], c=['blue'], edgecolor='None')
plt.scatter(proj[2*n:3*n,0], proj[2*n:3*n,1], c=['green'], edgecolor='None')
plt.scatter(proj[3*n:4*n,0], proj[3*n:4*n,1], c=['cyan'], edgecolor='None')
plt.scatter(proj[4*n:5*n,0], proj[4*n:5*n,1], c=['black'], edgecolor='None')
plt.legend(['up', 'down', 'left', 'right', 'circle'],loc='center left', bbox_to_anchor=(1, 0.5))

Now plot with all 3 principal components we calculated.

In [None]:
# Plot the projection on the first 3 principal components
from mpl_toolkits.mplot3d import Axes3D
n=20
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(proj[:n,0], proj[:n,1], proj[:n,2], c=['red'], edgecolor='None')
ax.scatter(proj[n:2*n,0], proj[n:2*n,1], proj[n:2*n,2], c=['blue'], edgecolor='None')
ax.scatter(proj[2*n:3*n,0], proj[2*n:3*n,1], proj[2*n:3*n,2], c=['green'], edgecolor='None')
ax.scatter(proj[3*n:4*n,0], proj[3*n:4*n,1], proj[3*n:4*n,2], c=['cyan'], edgecolor='None')
ax.scatter(proj[4*n:5*n,0], proj[4*n:5*n,1], proj[4*n:5*n,2], c=['black'], edgecolor='None')

# Point of view - modify this to move around the camera position
ax.view_init(elev=30, azim=45) 

ax.legend(['up', 'down', 'left', 'right', 'circle'],loc='center left', bbox_to_anchor=(1, 0.5))

Try looking at the data from another angle by replacing the ax.view_init line with: <b>`ax.view_init(elev=0, azim=45)`</b>. This provides a different camera position so you can get a better idea of how multiple principal components separate your data.



**<span style="color:red">After looking at the plots above, how many principal components would you choose to make your classification easier?</span>**

YOUR ANSWER HERE

The plots above should be very easy to classify. If you do not see nice clustering, try to re-record the data with straighter lines. If it still fails, seek help from a GSI.

<a id='part3'></a>
## <span style="color:blue">Part 3: Classification</span>

Using the plots above, we will define a way of classifying the different gestures. Fill in the skeleton code below to determine the gesture of a new reading vector, then try out the classification function. Because we are feeding in raw data, don't forget to apply the same normalization to the vector that you applied to the training data.

Note the colors for each gesture in the legend of the plots above.
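The lab leaves the decision rule up to you. Purely as an illustration, one rule that sometimes works for well-separated clusters is to pick the nearest cluster centre in the projected space. The helper `nearest_centroid` and the centre coordinates below are hypothetical and are not taken from real recordings:

```python
import numpy as np

def nearest_centroid(point, centroids):
    """Hypothetical helper: return the label whose centroid is closest to point."""
    labels = list(centroids)
    dists = [np.linalg.norm(point - centroids[name]) for name in labels]
    return labels[int(np.argmin(dists))]

# Made-up 2-D cluster centres in the projected space (not measured values)
toy_centroids = {
    'up': np.array([0.0, 5.0]),
    'down': np.array([0.0, -5.0]),
    'circle': np.array([4.0, 0.0]),
}
print(nearest_centroid(np.array([0.5, 4.0]), toy_centroids))  # up
```

Simple thresholds on one or two projected coordinates, read off the scatter plots, may work just as well for your data.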

In [None]:
def classify(vector, new_basis):
    """
    Classifies a new reading vector into one of the 5 gestures.
    Inputs:
        vector: 1x66 reading vector - first 33 elements correspond to x-coordinates
            and second 33 elements correspond to y-coordinates
        new_basis: Nx66 matrix with the basis of the new space, where N is the number
            of principal components used
    Output:
        String with the name of the classified gesture
    """
    # YOUR CODE HERE
    proj = 
    if (...):
        return 'up'
    if (...):
        return 'down'
    if (...):
        return 'left'
    if (...):
        return 'right'
    if (...):
        return 'circle'

In [None]:
# Try out the classification function
print(classify(data[0,:], ...)) # Modify to use other vectors

<a id='part4'></a>
## <span style="color:blue">Part 4: Audio Link</span>

### Materials
- Microphone front-end circuit
- Launchpad + USB

Remember from last time that, to send commands from the PC to the Launchpad, we will create a very basic audio channel using the PC speakers and the microphone circuit you just built. Just like On-Off Keying (OOK) in the wireless module in EE16A, we will modulate our command onto a sinusoid. In our case, we will use a 1 kHz sinusoid since it is in the best frequency range of the microphone. The command is encoded in the time between two pulses. For example, a sinusoid for 0.1 s, followed by silence for 0.1 s, and then another sinusoid for 0.1 s encodes one command. If the silence between the two pulses lasts 0.3 s instead, it encodes a different command.
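To make the encoding concrete, the sketch below computes the silent gap implied by an on-off mask. It assumes each pulse occupies exactly one slot, and the helper name `gap_seconds` is made up for this example:

```python
def gap_seconds(mask, slot_length):
    """Hypothetical helper: silence in seconds between the first two pulses of a mask."""
    first_on = mask.index(1)
    second_on = mask.index(1, first_on + 1)
    # Slots strictly between the two pulses are silent
    return (second_on - first_on - 1) * slot_length

print(gap_seconds([1, 0, 1, 0], 0.1))        # one silent slot between the pulses
print(gap_seconds([1, 0, 0, 0, 1, 0], 0.1))  # three silent slots between the pulses
```

With 0.1 s slots, these two masks correspond to gaps of about 0.1 s and 0.3 s, matching the two examples described above.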

Now we will incorporate the PCA classification with the audio link. The code from the last phase is reproduced below with a little modification. Upload the sketch <b>`classify.ino`</b> to the Launchpad. 

Make sure that the microphone circuit from the last phase is still working. Remember that the first op-amp is powered from the voltage regulator while the second op-amp is powered by the Launchpad's 3.3V pin. Probe the output voltage of the front end circuit and make sure the DC level is around 1.6V and the signal saturates at 0V and 3.3V.

Once everything is set up, open the Serial Monitor in Energia and run the cells below. The Launchpad will print out the gesture if it is able to recognize the audio command. Try calling your classify function on one of the test vectors that you created. The audio code will output the appropriate pulses and send the command to the Launchpad.

In [None]:
%pylab inline
import numpy as np
import matplotlib.pyplot as plt
from scipy import io
import warnings
warnings.filterwarnings('ignore')
import pyaudio
import wave

def play_audio( data, p, fs):
    """
    Plays audio using pyAudio
    Parameters:
        data: audio data array
        p: pyAudio object
        fs: sampling rate
    Returns: None
    """
    ostream = p.open(format=pyaudio.paFloat32, channels=1, rate=fs,output=True)
    ostream.write(data.astype(np.float32).tobytes())  # tobytes() replaces the deprecated tostring()

def generate_pulses(mask, fs, f, length):
    """ 
    Generate audio encoding
    Parameters:
        mask: List containing audio mask (on-off)
        fs: sampling frequency
        f: carrier frequency
        length: length of each pulse/entry in mask
    Returns:
        Numpy array containing encoded data
    """
    end = len(mask)*length
    # np.repeat needs an integer repeat count, so convert samples-per-slot to int
    mask = np.repeat(mask, int(fs*length))
    x = np.linspace(0,end,len(mask))
    data = np.sin(2*np.pi*f*x)
    return data * mask

In [None]:
fs = 44100
f = 1000

one = generate_pulses([1, 0, 1, 0], fs, f, 0.1)  # Pulses 0.1 s apart
two = generate_pulses([1, 0, 0, 1, 0], fs, f, 0.1)  # Pulses 0.2 s apart
three = generate_pulses([1, 0, 0, 0, 1, 0], fs, f, 0.1)  # Pulses 0.3 s apart
four = generate_pulses([1, 0, 0, 0, 0, 1, 0], fs, f, 0.1)  # Pulses 0.4 s apart
five = generate_pulses([1, 0, 0, 0, 0, 0, 1, 0], fs, f, 0.1)  # Pulses 0.5 s apart

In [None]:
p = pyaudio.PyAudio()

# Call your classify function with data that you want to test
gesture = classify(...)

if (gesture == 'up'):
    play_audio(one, p, fs )
if (gesture == 'down'):
    play_audio(two, p, fs )
if (gesture == 'left'):
    play_audio(three, p, fs )
if (gesture == 'right'):
    play_audio(four, p, fs )
if (gesture == 'circle'):
    play_audio(five, p, fs )

p.terminate()