## Computer Vision

Let's do some very basic computer vision. We're going to import the MNIST handwritten digits data and $k$NN to predict values (i.e. "see/read").

1. To load the data, run the following code in a chunk:
```
from keras.datasets import mnist
df = mnist.load_data('minst.db')
train,test = df
X_train, y_train = train
X_test, y_test = test
```
The `y_test` and `y_train` vectors, for each index `i`, tell you want number is written in the corresponding index in `X_train[i]` and `X_test[i]`. The value of `X_train[i]` and `X_test[i]`, however, is a 28$\times$28 array whose entries contain values between 0 and 256. Each element of the matrix is essentially a "pixel" and the matrix encodes a representation of a number. To visualize this, run the following code to see the first ten numbers:
```
import matplotlib.pyplot as plt
import numpy as np
np.set_printoptions(edgeitems=30, linewidth=100000)
for i in range(5): 
    print(y_test[i],'\n') # Print the label
    print(X_test[i],'\n') # Print the matrix of values
    plt.contourf(np.rot90(X_test[i].transpose())) # Make a contour plot of the matrix values
    plt.show()
```
OK, those are the data: Labels attached to handwritten digits encoded as a matrix.

2. What is the shape of `X_train` and `X_test`? What is the shape of `X_train[i]` and `X_test[i]` for each index `i`? What is the shape of `y_train` and `y_test`?
3. Use Numpy's `.reshape()` method to covert the training and testing data from a matrix into an vector of features. So, `X_test[index].reshape((1,784))` will convert the $index$-th element of `X_test` into a $28\times 28=784$-length row vector of values, rather than a matrix. Turn `X_train` into an $N \times 784$ matrix $X$ that is suitable for scikit-learn's kNN classifier where $N$ is the number of observations and $784=28*28$ (you could use, for example, a `for` loop).
4. Use the reshaped `X_test` and `y_test` data to create a $k$-nearest neighbor classifier of digit. What is the optimal number of neighbors $k$? If you can't determine this, play around with different values of $k$ for your classifier.
5. For the optimal number of neighbors, how well does your predictor perform on the test set? Use a confusion matrix and compute accuracy.
6. For your confusion matrix, which mistakes are most likely? Do you find any interesting patterns?
7. So, this is how computers "see." They convert an image into a matrix of values, that matrix becomes a vector in a dataset, and then we deploy ML tools on it as if it was any other kind of tabular data. To make sure you follow this, invent a way to represent a color photo in matrix form, and then describe how you could convert it into tabular data. (Hint: RGB color codes provide a method of encoding a numeric value that represents a color.)

In [7]:
import pandas as pd
import numpy as np

Z_train = pd.read_parquet('./data/Z_train.parquet')
Z_test = pd.read_parquet('./data/Z_test.parquet')
y_train = pd.read_parquet('./data/y_train.parquet')
y_test = pd.read_parquet('./data/y_test.parquet')

print(Z_test)


      0    1    2    3    4    5    6    7    8    9    ...  774  775  776  \
0       0    0    0    0    0    0    0    0    0    0  ...    0    0    0   
1       0    0    0    0    0    0    0    0    0    0  ...    0    0    0   
2       0    0    0    0    0    0    0    0    0    0  ...    0    0    0   
3       0    0    0    0    0    0    0    0    0    0  ...    0    0    0   
4       0    0    0    0    0    0    0    0    0    0  ...    0    0    0   
...   ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...   
9995    0    0    0    0    0    0    0    0    0    0  ...    0    0    0   
9996    0    0    0    0    0    0    0    0    0    0  ...    0    0    0   
9997    0    0    0    0    0    0    0    0    0    0  ...    0    0    0   
9998    0    0    0    0    0    0    0    0    0    0  ...    0    0    0   
9999    0    0    0    0    0    0    0    0    0    0  ...    0    0    0   

      777  778  779  780  781  782  783  
0       0    0    0  

In [10]:
from sklearn.neighbors import KNeighborsClassifier
import matplotlib as plt

# determine the optimal k:
k_bar = 50
k_grid = np.arange(2,k_bar) # the range of k's to consider
accuracy = np.zeros(k_bar) 

for k in range(k_bar):
    knn = KNeighborsClassifier(n_neighbors=k+1)
    predictor = knn.fit(Z_train.values,y_train) 
    accuracy[k] = knn.score(Z_test.values,y_test)

accuracy_max = np.max(accuracy) # highest recorded accuracy
max_index = np.where(accuracy==accuracy_max) 
k_star = k_grid[max_index] # find the optimal value of k
print(k_star)

plt.plot(np.arange(0,k_bar),accuracy) # plot accuracy by k
plt.xlabel("k")
plt.title("optimal k:"+str(k_star))
plt.ylabel('Accuracy')
plt.show()

'''
The code ran for 30 minutes and still didn't finish running...
'''

  return self._fit(X, y)
  return self._fit(X, y)
  return self._fit(X, y)
  return self._fit(X, y)
  return self._fit(X, y)
  return self._fit(X, y)
  return self._fit(X, y)
  return self._fit(X, y)
  return self._fit(X, y)
  return self._fit(X, y)
  return self._fit(X, y)
  return self._fit(X, y)
  return self._fit(X, y)
  return self._fit(X, y)
  return self._fit(X, y)
  return self._fit(X, y)
  return self._fit(X, y)
  return self._fit(X, y)
  return self._fit(X, y)
  return self._fit(X, y)
  return self._fit(X, y)
  return self._fit(X, y)


KeyboardInterrupt: 

In [12]:
knn = KNeighborsClassifier(n_neighbors=1)
predictor = knn.fit(Z_train.values,y_train) 
y_hat = predictor.predict(Z_test.values) 

accuracy = knn.score(Z_test.values,y_test)
print('Accuracy: ', accuracy)

pd.crosstab(y_test, y_hat)

  return self._fit(X, y)


KeyboardInterrupt: 

In [None]:
'''
Because I could not get the code chunks to finish running on a 
timely manner the following observations are based on the results on
solution notebook.
With k=3, the kNN classifier achieves ~90% accuracy on the test 
set. When mistakes occur, they tend to involve confusions 
between digits that are visually similar, such as mistaking an 8 for 
a 3, or a 7 for a 1.
To represent a color photo, instead of a single 28x28 grid of 
pixel intensities (for grayscale), we would use three 28x28 
matrices—one for the intensity of red, green, and blue (RGB). 
These matrices could then be flattened into long vectors and 
combined side by side into a single row for each image, creating 
a structure similar to tabular data for machine learning.
'''