## Computer Vision

Let's do some very basic computer vision. We're going to import the MNIST handwritten digits data and use $k$NN to predict values (i.e. "see/read").

1. To load the data, run the following code in a chunk:
```
import pickle
with open('data/minst.pkl', 'rb') as f:
    data = pickle.load(f)
X_train, y_train = data['X_train'], data['y_train']
X_test, y_test = data['X_test'], data['y_test']
```
The `y_train` and `y_test` vectors, for each index `i`, tell you what number is written in the corresponding `X_train[i]` and `X_test[i]`. Each `X_train[i]` and `X_test[i]`, in turn, is a 28$\times$28 array whose entries take values between 0 and 255. Each element of the matrix is essentially a "pixel" and the matrix encodes a representation of a number. To visualize this, run the following code to see the first five numbers in the test set:
```
import matplotlib.pyplot as plt
import numpy as np
np.set_printoptions(edgeitems=30, linewidth=100000)
for i in range(5):
    print(y_test[i],'\n') # Print the label
    print(X_test[i],'\n') # Print the matrix of values
    plt.contourf(np.rot90(X_test[i].transpose())) # Make a contour plot of the matrix values
    plt.show()
```
OK, those are the data: labels attached to handwritten digits encoded as matrices. Our answers are below in code chunks.

2. What is the shape of `X_train` and `X_test`? What is the shape of `X_train[i]` and `X_test[i]` for each index `i`? What is the shape of `y_train` and `y_test`?
3. Use Numpy's `.reshape()` method to convert the training and testing data from a matrix into a vector of features. So, `X_test[index].reshape((1,784))` will convert the $index$-th element of `X_test` into a $28\times 28=784$-length row vector of values, rather than a matrix. Turn `X_train` into an $N \times 784$ matrix $X$ that is suitable for scikit-learn's $k$NN classifier, where $N$ is the number of observations and $784=28 \times 28$ (you could use, for example, a `for` loop).
4. Use the reshaped `X_train` and `y_train` data to create a $k$-nearest neighbor classifier of digits. What is the optimal number of neighbors $k$? If you can't determine this, play around with different values of $k$ for your classifier.
5. For the optimal number of neighbors, how well does your predictor perform on the test set? Report the accuracy, compute a confusion matrix, and explain your findings.
6. For your confusion matrix, which mistakes are most likely? Do you find any interesting patterns?
7. So, this is how computers "see." They convert an image into a matrix of values, that matrix becomes a vector in a dataset, and then we deploy ML tools on it as if it were any other kind of tabular data. To make sure you follow this, invent a way to represent a color photo in matrix form, and then describe how you could convert it into tabular data. (Hint: RGB color codes provide a method of encoding a numeric value that represents a color.)

In [None]:
# Q1:

import pickle
with open('/content/minst.pkl', 'rb') as f:
    data = pickle.load(f)
X_train, y_train = data['X_train'], data['y_train']
X_test, y_test = data['X_test'], data['y_test']

FileNotFoundError: [Errno 2] No such file or directory: '/content/minst.pkl'

In [None]:
import matplotlib.pyplot as plt
import numpy as np
np.set_printoptions(edgeitems=30, linewidth=100000)
for i in range(5):
    print(y_test[i],'\n') # Print the label
    print(X_test[i],'\n') # Print the matrix of values
    plt.contourf(np.rot90(X_test[i].transpose())) # Make a contour plot of the matrix values
    plt.show()

In [None]:
# Q2:
# What is the shape of X_train and X_test?
# What is the shape of X_train[i] and X_test[i] for each index i?
# What is the shape of y_train and y_test?

print(X_train.shape , X_test.shape)

print(X_train[3].shape , X_test[3].shape) # Same shape for all index i

print(y_train.shape , y_test.shape)


In [None]:
# Q3:
# Use Numpy's .reshape() method to convert the training and testing data from a matrix into a vector of features.
# So, X_test[index].reshape((1,784)) will convert the index-th element of X_test into a 28x28=784-length row vector of values, rather than a matrix.
# Turn X_train into an N x 784 matrix X that is suitable for scikit-learn's kNN classifier, where N is the number of observations and 784=28*28 (you could use, for example, a for loop).

# Flatten each 28x28 image into a 784-length row; -1 lets NumPy infer the feature count
X_train = X_train.reshape((X_train.shape[0], -1))
X_test = X_test.reshape((X_test.shape[0], -1))
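The question suggests a `for` loop, while the cell above reshapes the whole array in one call. A minimal sketch, assuming a small random array as a hypothetical stand-in for `X_train` (the names `images`, `X_loop`, and `X_vec` are illustrative only), suggests the two approaches give the same $N \times 784$ matrix:

```python
import numpy as np

# Hypothetical stand-in for X_train: 6 random "images" of 28x28 pixel values in 0..255
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(6, 28, 28))

# Approach 1: flatten one image at a time in a for loop, then stack the rows
rows = [images[i].reshape((1, 784)) for i in range(images.shape[0])]
X_loop = np.vstack(rows)

# Approach 2: reshape the whole array at once; -1 infers 784 from the data
X_vec = images.reshape((images.shape[0], -1))

print(X_loop.shape, X_vec.shape)
print(np.array_equal(X_loop, X_vec))
```

Both versions keep the pixels in row-major order, so pixel `(r, c)` of image `i` ends up in column `28*r + c` of row `i`.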

In [None]:
# Q4:
# Use the reshaped X_train and y_train data to create a k-nearest neighbor classifier of digits.
# What is the optimal number of neighbors k?
# If you can't determine this, play around with different values of k for your classifier.

from sklearn.neighbors import KNeighborsClassifier
import numpy as np

N_train = len(y_train)
N_test = len(y_test)

## Solve for k that maximizes accuracy:
k_bar = 10 # Number of k's to try
Acc = [] # We'll store the accuracy here

for k in range(k_bar):
    model = KNeighborsClassifier(n_neighbors=k+1) # Create a sk model for k
    fitted_model = model.fit(X_train,y_train) # Train the model on our data
    y_hat = fitted_model.predict(X_test) # Predict values for test set
    Acc.append( np.sum( y_hat == y_test )/N_test ) # Accuracy on testing data

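Acc = np.array(Acc) # Convert the list to an array so the comparison below is elementwise
Acc_max = np.max(Acc) # Find highest recorded accuracy
max_index = np.where(Acc==Acc_max) # Find the indices that equal the maximum
k_star = max_index[0]+1 # Optimal value(s) of k; index+1 because index 0 corresponds to k=1
print(k_star)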
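The loop above scores each $k$ on the test set, so the test accuracy reported afterwards for the winning $k$ may be slightly optimistic. A common alternative is to choose $k$ on a validation set held out from the training data. The sketch below illustrates that idea under stated assumptions: it uses scikit-learn's small built-in `load_digits` data (8$\times$8 images) as a stand-in for the MNIST pickle, and the split sizes and `random_state` values are arbitrary choices, not part of the assignment.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in data (assumption): scikit-learn's small 8x8 digits set, not the MNIST pickle
X, y = load_digits(return_X_y=True)

# Split off a test set, then carve a validation set out of the remaining training data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
X_fit, X_val, y_fit, y_val = train_test_split(X_tr, y_tr, test_size=0.25, random_state=0)

k_grid = list(range(1, 11))
val_acc = []
for k in k_grid:
    model = KNeighborsClassifier(n_neighbors=k).fit(X_fit, y_fit)
    val_acc.append(model.score(X_val, y_val))  # accuracy on the validation set

k_best = k_grid[int(np.argmax(val_acc))]  # argmax returns the smallest k among ties

# Refit on all training data with the chosen k and use the test set only once
final_model = KNeighborsClassifier(n_neighbors=k_best).fit(X_tr, y_tr)
test_acc = final_model.score(X_te, y_te)
print(k_best, round(test_acc, 4))
```

The same pattern would apply to the reshaped `X_train` and `y_train` from this notebook, though fitting ten models on 60,000 images takes noticeably longer.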

In [None]:
# Q5:
# For the optimal number of neighbors, how well does your predictor perform on the test set?
# Report the accuracy, compute a confusion matrix, and explain your findings.

## Fit optimal model:
model = KNeighborsClassifier(n_neighbors=k_star[0]) # Create a sk model for k
fitted_model = model.fit(X_train,y_train) # Train the model on our data
y_hat = fitted_model.predict(X_test) # Predict values for test set


In [None]:
from sklearn.metrics import accuracy_score
acc = accuracy_score(y_test, y_hat)
print(acc)

import pandas as pd
pd.crosstab(y_test,y_hat)

The predictor performs very well on the test set, with an accuracy of 97.05%. The confusion matrix shows that the model usually makes the correct prediction: the diagonal entries (correct classifications) are far larger than the off-diagonal entries (incorrect classifications).

6. For your confusion matrix, which mistakes are most likely? Do you find any interesting patterns?

The most common confusion is mistaking a 7 for a 1, with 21 incorrect predictions, followed by categorizing a 4 as a 9, with 19 wrong cases. Also, it seems that 8 and 9 are the digits most often mistaken for other numbers.



7. So, this is how computers "see." They convert an image into a matrix of values, that matrix becomes a vector in a dataset, and then we deploy ML tools on it as if it were any other kind of tabular data. To make sure you follow this, invent a way to represent a color photo in matrix form, and then describe how you could convert it into tabular data. (Hint: RGB color codes provide a method of encoding a numeric value that represents a color.)

One way to represent a color photo in matrix form is to record each pixel's RGB values (0-255), so that every pixel becomes a vector of three color intensities and the photo becomes a height $\times$ width $\times$ 3 array. Turning this into tabular data is relatively straightforward. We could make each pixel in the photo a row and have "Red", "Green", and "Blue" be the columns. Each mix of RGB values corresponds to a different color. For example, one pixel may have the RGB mix (23,230,165) and another pixel the RGB mix (145,46,67), which corresponds to a different color.

In [None]:
# For example: a 10x10 color image (100 pixels)
import numpy as np
image = np.random.randint(0, 256, (10, 10, 3))

print("Original shape:", image.shape)      # (10, 10, 3)

# Flatten to one row (vector)
flat = image.reshape(1, 10*10*3)
print("Flattened shape:", flat.shape)      # (1, 300)

# Or tabular form: 100 rows, 3 columns (each pixel as row)
tabular = image.reshape(100, 3)
print("Tabular shape:", tabular.shape)     # (100, 3)

Original shape: (10, 10, 3)
Flattened shape: (1, 300)
Tabular shape: (100, 3)
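To use color photos the way the MNIST digits were used above, each photo would become one row of a feature matrix. The sketch below assumes a hypothetical stack of five random 10$\times$10 RGB images (the names `photos` and `X_color` are illustrative) and checks that flattening loses no information.

```python
import numpy as np

# Hypothetical dataset: 5 random 10x10 RGB images with values in 0..255
rng = np.random.default_rng(0)
photos = rng.integers(0, 256, size=(5, 10, 10, 3))

# One row per photo, one column per (pixel, channel) value: 10*10*3 = 300 features
X_color = photos.reshape((photos.shape[0], -1))
print(X_color.shape)

# Reshaping back recovers the original images exactly
restored = X_color.reshape(photos.shape)
print(np.array_equal(restored, photos))

# Column 47 of X_color corresponds to one (row, col, channel) position in the image
row, col, channel = np.unravel_index(47, (10, 10, 3))
print(row, col, channel, X_color[0, 47] == photos[0, row, col, channel])
```

A matrix like `X_color` has the same layout as the reshaped `X_train`, so in principle it could be passed to the same $k$NN classifier.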
