## Lab 7 Classifier Tools

<hr>

### 1. VisionX documentation reminder

1. The traditional way to obtain documentation for a Linux command is `man <command-name>`; this also works for VisionX commands.
2. For anything else, including command documentation, consider a Google search on `visionx v4 <anything>`, for example "visionx v4 multichannel", and ignore the ads.

<hr>
    
### 2. bash scripts for processing a set of files

Example: to convert a set of VisionX files with a .vx extension to PNG files with the same root name and a .png extension:
```sh
#!/bin/sh
for i in *.vx
do
  root=`echo $i| sed -e 's/\(.*\)\..*/\1/'`
  echo vxport  -png if=$i of=$root.png
done
```
Given a directory containing the files im1.vx and im2.vx, the above script prints the following:
```sh
vxport -png if=im1.vx of=im1.png
vxport -png if=im2.vx of=im2.png
```
To make the script execute the commands rather than just print them, remove the word "echo" (the shell print command) from the script.
<p>

For a slightly fancier example, suppose you have already created a .csv file "labels.csv" with the content:
```
tst0.png,7
tst1.png,2
```
then consider the script:
```sh
#!/bin/sh
for i in $(cut -d, -f1 labels.csv)
do
  root=`echo $i| sed -e 's/\(.*\)\..*/\1/'`
  echo vxport  -png if=$root.vx  of=$root.png
done
```
This produces the output:
```sh
vxport -png if=tst0.vx of=tst0.png
vxport -png if=tst1.vx of=tst1.png
```

Of course, equivalent scripts may be written in Python if you prefer.
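As a rough illustration, the two shell scripts above might be written in Python along the following lines. This is only a sketch: the function names `commands_from_directory` and `commands_from_labels` are made up for this example, and the only `vxport` usage assumed is the `-png if=... of=...` form shown above. Like the shell versions, it prints the commands rather than running them.

```python
import csv
from pathlib import Path


def commands_from_directory(directory="."):
    """Build one vxport command per .vx file found in `directory`."""
    commands = []
    for vx_file in sorted(Path(directory).glob("*.vx")):
        # with_suffix swaps the last extension: im1.vx -> im1.png
        png_name = vx_file.with_suffix(".png").name
        commands.append(["vxport", "-png", f"if={vx_file.name}", f"of={png_name}"])
    return commands


def commands_from_labels(csv_path):
    """Build one vxport command per file name in the first CSV column."""
    commands = []
    with open(csv_path, newline="") as handle:
        for row in csv.reader(handle):
            if not row:
                continue  # skip blank lines
            root = Path(row[0]).stem  # tst0.png -> tst0
            commands.append(["vxport", "-png", f"if={root}.vx", f"of={root}.png"])
    return commands


# Print the commands for the current directory (the "echo" behaviour).
for command in commands_from_directory("."):
    print(" ".join(command))
```

To execute the commands instead of printing them, each list could be passed to `subprocess.run(command, check=True)` after adding `import subprocess`.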


### 3. Print the times for training and testing

```python
import time
from sklearn.neighbors import KNeighborsClassifier

# X, y (training data) and X_test, y_test (test data) are assumed to be loaded already

# Classifier declaration
KNN = KNeighborsClassifier(n_neighbors=3)

# Train the classifier
start_time = time.time()
KNN.fit(X, y)
train_time = time.time() - start_time
print("Training time %.3f seconds" % train_time)

# Evaluate the result
start_time = time.time()
score = KNN.score(X_test, y_test)
test_time = time.time() - start_time
print("Test time %.3f seconds" % test_time)
print("Test score with 3NN is: %.4f" % score)
```
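If several classifiers need to be timed, the pattern above can be wrapped in a small helper. The sketch below makes a few assumptions that are not part of the lab: the helper name `timed_fit_and_score` is hypothetical, scikit-learn's small built-in digits dataset (8x8 images) stands in for MNIST so the example runs on its own, and `time.perf_counter()` is used in place of `time.time()` because it is intended for measuring short intervals.

```python
import time

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def timed_fit_and_score(classifier, X_train, y_train, X_test, y_test):
    """Return (train_seconds, test_seconds, accuracy) for one classifier."""
    start = time.perf_counter()
    classifier.fit(X_train, y_train)
    train_seconds = time.perf_counter() - start

    start = time.perf_counter()
    accuracy = classifier.score(X_test, y_test)
    test_seconds = time.perf_counter() - start
    return train_seconds, test_seconds, accuracy


# Assumption: the small digits dataset stands in for MNIST.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

train_s, test_s, acc = timed_fit_and_score(
    KNeighborsClassifier(n_neighbors=3), X_train, y_train, X_test, y_test)
print("Training time %.3f seconds" % train_s)
print("Test time %.3f seconds" % test_s)
print("Test score with 3NN is: %.4f" % acc)
```

The same helper can be called with any of the other scikit-learn classifiers, since they all provide `fit` and `score`.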

### 4. Example Classifier parameters that can be used with MNIST

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

### Nearest Neighbor Classifier
KNN = KNeighborsClassifier(n_neighbors=3)

### Multi-layer perceptron
# Single hidden layer
mlp = MLPClassifier(hidden_layer_sizes=(50,), max_iter=10, alpha=1e-4,
                    solver='sgd', verbose=10, tol=1e-4, random_state=1,
                    learning_rate_init=.1)
# Note: with max_iter=10 the iteration limit will be reached!

# Two hidden layers
mlp_2 = MLPClassifier(hidden_layer_sizes=(50, 50), max_iter=10, alpha=1e-4,
                      solver='sgd', verbose=10, tol=1e-4, random_state=1,
                      learning_rate_init=.1)

### Support Vector Machines
# The three SVC lines below are alternatives: each assignment to clf
# replaces the previous one, so keep only the kernel you want to use.
## Linear
clf = SVC(kernel='linear', C=1)
## Cubic polynomial
clf = SVC(kernel='poly', degree=3, C=1)
## Radial basis functions
clf = SVC(kernel='rbf', C=1, gamma=0.5)
```
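To see how these configurations behave side by side, one option is to loop over them. The sketch below is an illustration under stated assumptions, not the lab procedure: it uses scikit-learn's small digits dataset instead of MNIST, standardizes the features with `StandardScaler`, omits `verbose=10` to keep the output short, and stores the results in a dictionary named `scores` (a name chosen for this example). The MLP will likely emit a convergence warning because of `max_iter=10`, and the scores on this small dataset say nothing about what to expect on MNIST.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: the small digits dataset stands in for MNIST.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Standardize using statistics from the training data only.
scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

# Same parameter choices as listed above (verbose output omitted).
classifiers = {
    "3NN": KNeighborsClassifier(n_neighbors=3),
    "MLP (50,)": MLPClassifier(hidden_layer_sizes=(50,), max_iter=10,
                               alpha=1e-4, solver='sgd', tol=1e-4,
                               random_state=1, learning_rate_init=.1),
    "SVM linear": SVC(kernel='linear', C=1),
    "SVM poly": SVC(kernel='poly', degree=3, C=1),
    "SVM rbf": SVC(kernel='rbf', C=1, gamma=0.5),
}

scores = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    scores[name] = clf.score(X_test, y_test)
    print("%-12s test score: %.4f" % (name, scores[name]))
```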


### 5. Principal Component Analysis (PCA)

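```python
from sklearn.decomposition import PCA

# Apply PCA after data splitting and standardization.
# Choose n_components wisely: replace <value> with either the number of
# components to keep (an int) or the fraction of variance to retain
# (a float between 0 and 1).
pca = PCA(n_components=<value>)
pca.fit(X_train)  # fit on the training data only
X_train = pca.transform(X_train)
X_test = pca.transform(X_test)
```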
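The sketch below shows one way to check how much variance a given choice retains. It rests on assumptions that are not part of the lab: the digits dataset stands in for MNIST, standardization is done with `StandardScaler`, and the 0.95 variance target is picked purely for illustration. Passing a float between 0 and 1 as `n_components` asks scikit-learn to keep just enough components to exceed that fraction of the variance.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Assumption: the small digits dataset stands in for MNIST.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Standardize first, using statistics from the training data only.
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)

# Illustrative choice: retain at least 95% of the variance.
pca = PCA(n_components=0.95)
pca.fit(X_train_std)
X_train_pca = pca.transform(X_train_std)
X_test_pca = pca.transform(X_test_std)

retained = pca.explained_variance_ratio_.sum()
print("Components kept: %d of %d" % (pca.n_components_, X_train_std.shape[1]))
print("Variance retained: %.3f" % retained)
```

Inspecting `pca.explained_variance_ratio_` for a few candidate values is a simple way to trade off dimensionality against retained variance before training a classifier.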

### 6. K-Nearest Neighbor Visualization of Decision Boundary
The following elegant code shows the decision boundary in a two-dimensional graph for
two features. Unfortunately, many more features are required for Lab 7 MNIST, so
it is not suitable for that task. Visualizing decision boundaries in multiple dimensions
remains an unsolved task.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from sklearn.neighbors import KNeighborsClassifier

# X_train and y_train are assumed to be loaded already
cmap_light = ListedColormap(['orange', 'c', 'cornflowerblue', 'b', 'r',
                             'g', 'm', 'y', 'w', 'lightsalmon'])
h = .02  # step size in the mesh

X_train1 = X_train[:200, :2]  # Visualizing 2 features of first 200 images
y_train1 = y_train[:200]

KNN = KNeighborsClassifier(n_neighbors=3)
KNN.fit(X_train1, y_train1)

# Build a grid covering the range of both features and classify every grid point
x_min, x_max = X_train1[:, 0].min() - 1, X_train1[:, 0].max() + 1
y_min, y_max = X_train1[:, 1].min() - 1, X_train1[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))
Z = KNN.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

# Plot the decision boundary
plt.figure()
plt.pcolormesh(xx, yy, Z, cmap=cmap_light)

# Plot also the training points
plt.scatter(X_train1[:, 0], X_train1[:, 1], c=y_train1, cmap=cmap_light,
            edgecolor='k', s=20)
plt.xlim(xx.min(), xx.max())
plt.ylim(yy.min(), yy.max())
plt.title("10-Class classification (k = 3)")
plt.show()
```