# Chapter 3 
# Starting with Machine Learning 2/3

# Contents #
***
- [ ] Linear regression
- [ ] The MNIST dataset
- [x] Classifiers
- [x] The nearest neighbor algorithm
- [ ] Data clustering
- [ ] The k-means algorithm

# Classifiers
***
* A classifier assigns each new input datum (instance) to one of the possible categories (classes).
* Classification can be binary (two classes) or multiclass (more than two classes).
* Classification belongs to the supervised learning category.



**The basic steps to solve a supervised classification problem are as follows:**
1. Build the training examples so that they represent the actual context and application in which the classification will be performed.
2. Choose the classifier and the corresponding algorithm implementation.
3. Train the algorithm on the training set and set any control parameters through validation.
4. Evaluate the accuracy and performance of the classifier by applying it to a set of new instances (the test set).
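
The data handling behind steps 1, 3, and 4 can be sketched in NumPy as follows. This is only an illustration: the helper names `split_dataset` and `fraction_correct`, the 60/20/20 split, and the toy data are assumptions, not part of the original notebook.

```python
import numpy as np

def split_dataset(features, labels, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle the data once and cut it into training, validation, and test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))
    n_test = int(len(features) * test_fraction)
    n_val = int(len(features) * val_fraction)
    test_idx = order[:n_test]
    val_idx = order[n_test:n_test + n_val]
    train_idx = order[n_test + n_val:]
    return ((features[train_idx], labels[train_idx]),
            (features[val_idx], labels[val_idx]),
            (features[test_idx], labels[test_idx]))

def fraction_correct(predicted, expected):
    """Share of predictions that match the expected labels."""
    return float(np.mean(np.asarray(predicted) == np.asarray(expected)))

# Toy data (hypothetical): 100 one-dimensional points, class 1 when the value is >= 50.
features = np.arange(100, dtype=float).reshape(100, 1)
labels = (features[:, 0] >= 50).astype(int)

train, val, test = split_dataset(features, labels)
print(len(train[0]), len(val[0]), len(test[0]))  # 60 20 20
print(fraction_correct(np.array([1, 0, 1, 1]), np.array([1, 1, 1, 1])))  # 0.75
```

The validation set is used to pick control parameters (for example K in KNN), while the test set is kept aside until the final evaluation.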


# The nearest neighbor algorithm
***
The K-nearest neighbor (KNN) algorithm is a supervised learning algorithm for both classification and regression. It assigns a class to the tested sample according to its distance from the objects stored in memory: the sample receives the class that is most common among its K closest stored objects.

<center><img src="imgs\knn_mean_of_intensity.gif" width="50%" height="50%"></center>
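
A minimal NumPy sketch of this idea, assuming Euclidean distance and a simple majority vote. The function name `knn_classify` and the toy points are hypothetical and only serve to illustrate the procedure.

```python
from collections import Counter

import numpy as np

def knn_classify(train_points, train_labels, query, k=3):
    """Return the majority label among the k stored points closest to query."""
    # Euclidean distance from the query to every stored point.
    distances = np.sqrt(np.sum((train_points - query) ** 2, axis=1))
    # Indices of the k smallest distances.
    nearest = np.argsort(distances)[:k]
    # Majority vote among the labels of those neighbors.
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Two small, well-separated groups of 2-D points (made-up values).
train_points = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.0],
                         [6.0, 6.0], [6.5, 7.0], [7.0, 6.0]])
train_labels = np.array(["a", "a", "a", "b", "b", "b"])

print(knn_classify(train_points, train_labels, np.array([1.2, 1.4])))  # a
print(knn_classify(train_points, train_labels, np.array([6.4, 6.2])))  # b
```

With `k=1` this reduces to the plain nearest neighbor rule used in the MNIST example later in this chapter.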

# Euclidean Distance
***

The distance, d, is defined as the Euclidean distance between two points:
<center><img src="imgs\knn_equation.png" width="40%" height="40%"></center>

* n is the dimension of the space.
* The advantage of KNN is its ability to classify objects whose classes are not linearly separable.
* It is a stable classifier: small perturbations of the training data do not significantly affect the results obtained.
* Every new classification requires adding the new datum to all the initial instances and repeating the calculation procedure for the selected K value.
* It requires a fairly large amount of data to make realistic predictions and is sensitive to noise in the analyzed data.
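
As a worked example, the distance can be computed in a few lines of Python. This sketch assumes the equation image shows the standard form, d(p, q) = sqrt((p_1 - q_1)^2 + ... + (p_n - q_n)^2); the function name `euclidean_distance` is hypothetical.

```python
import math

def euclidean_distance(p, q):
    """Euclidean distance between two points given as equal-length sequences."""
    if len(p) != len(q):
        raise ValueError("both points must have the same dimension n")
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

print(euclidean_distance([0, 0], [3, 4]))        # 5.0  (n = 2)
print(euclidean_distance([1, 2, 3], [4, 6, 3]))  # 5.0  (n = 3)
```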

# Different ways to calculate the distance in KNN
***
<center><img src="imgs\distan5.gif" width="80%" height="80%"></center>
<center><img src="imgs\distan6.gif" width="80%" height="80%"></center>
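
The figures above are not reproduced in code here. Assuming they cover commonly used alternatives to the Euclidean distance, such as the Manhattan, Chebyshev, and Minkowski distances, a rough NumPy sketch looks like this (the function names are hypothetical):

```python
import numpy as np

def manhattan(p, q):
    """Sum of absolute coordinate differences (L1 distance)."""
    return float(np.sum(np.abs(p - q)))

def chebyshev(p, q):
    """Largest absolute coordinate difference (L-infinity distance)."""
    return float(np.max(np.abs(p - q)))

def minkowski(p, q, r):
    """Generalized distance: r=1 gives Manhattan, r=2 gives Euclidean."""
    return float(np.sum(np.abs(p - q) ** r) ** (1.0 / r))

p = np.array([0.0, 0.0])
q = np.array([3.0, 4.0])

print(manhattan(p, q))     # 7.0
print(chebyshev(p, q))     # 4.0
print(minkowski(p, q, 2))  # 5.0
```

The Manhattan distance is the one used by the MNIST example in this chapter; the Euclidean variant appears there as a commented-out alternative.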

# Building the training set
***

In [1]:
# Let's start by importing the libraries needed for the simulation:
import numpy as np
import tensorflow as tf
import input_data

# To construct the data model for the training set, use the input_data.read_data_sets function, introduced earlier:
# original code: mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
mnist = input_data.read_data_sets("MNIST_data", one_hot=True)

# In our example, the training phase uses 100 MNIST images:
train_pixels, train_list_values = mnist.train.next_batch(100)

# We test the algorithm on 10 images:
test_pixels, test_list_of_values = mnist.test.next_batch(10)

# Finally, we define the tensors train_pixel_tensor and test_pixel_tensor that we use to construct our classifier:
train_pixel_tensor = tf.placeholder("float", [None, 784])
test_pixel_tensor = tf.placeholder("float", [784])


Extracting MNIST_data\train-images-idx3-ubyte.gz
Extracting MNIST_data\train-labels-idx1-ubyte.gz
Extracting MNIST_data\t10k-images-idx3-ubyte.gz
Extracting MNIST_data\t10k-labels-idx1-ubyte.gz
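
Because the data is loaded with `one_hot=True`, each label is a 10-element vector with a single 1 at the position of the digit, and each image is a flat vector of 784 = 28 × 28 pixel values. The sketch below illustrates the label encoding with a hypothetical helper `one_hot`, which is not part of `input_data`, and shows why `np.argmax` is used later to recover the digit.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Encode integer class labels as one-hot rows (illustrative helper)."""
    encoded = np.zeros((len(labels), num_classes), dtype=np.float32)
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

digits = np.array([7, 2, 1])
encoded = one_hot(digits)

print(encoded.shape)                        # (3, 10)
print(np.argmax(encoded, axis=1).tolist())  # [7, 2, 1]
```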


# Cost function and optimization 
***

### The cost function is represented by the distance in terms of pixels:
 

In [6]:
# older API (tf.neg was renamed to tf.negative): distance = tf.reduce_sum(tf.abs(tf.add(train_pixel_tensor, tf.neg(test_pixel_tensor))), reduction_indices=1)
distance = tf.reduce_sum(tf.abs(tf.add(train_pixel_tensor, tf.negative(test_pixel_tensor))), 1)  # Manhattan
# distance = tf.sqrt(tf.reduce_sum(tf.pow(tf.add(train_pixel_tensor, tf.negative(test_pixel_tensor)), 2), 1))  # Euclidean

# Element-wise difference between every training image and the test image (not used below):
test_add = tf.add(train_pixel_tensor, tf.negative(test_pixel_tensor))

# Finally, to minimize the distance function, we use tf.argmin, which returns the index of the smallest distance (the nearest neighbor):
# older spelling: pred = tf.arg_min(distance, 0)
pred = tf.argmin(distance, 0)
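
The same computation can be checked without a TensorFlow session. The following NumPy sketch mirrors the two distance expressions and the `argmin` step on made-up 3-pixel "images"; the array values are purely illustrative.

```python
import numpy as np

# Three stored training vectors and one test vector (toy values).
train = np.array([[0.0, 0.0, 0.0],
                  [1.0, 1.0, 1.0],
                  [1.0, 0.5, 1.0]])
test = np.array([1.0, 0.0, 1.0])

# Broadcasting subtracts the test vector from every training row.
manhattan = np.sum(np.abs(train - test), axis=1)
euclidean = np.sqrt(np.sum((train - test) ** 2, axis=1))

# Index of the stored vector with the smallest distance.
nearest = int(np.argmin(manhattan))

print(manhattan.tolist())  # [2.0, 1.0, 0.5]
print(nearest)             # 2
```

Summing along axis 1 collapses the pixel dimension, which leaves one distance per training image.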


The tf.reduce_sum function computes the sum of elements across the dimensions of a tensor. For example (from the TensorFlow online manual):

    # 'x' is [[1, 1, 1]    
    #         [1, 1, 1]]    
    tf.reduce_sum(x) ==> 6    
    tf.reduce_sum(x, 0) ==> [2, 2, 2]    
    tf.reduce_sum(x, 1) ==> [3, 3]    
    tf.reduce_sum(x, 1, keep_dims=True) ==> [[3], [3]]    
    tf.reduce_sum(x, [0, 1]) ==> 6 
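
The same reductions can be tried directly in NumPy, whose `np.sum` follows the same axis semantics. This is an analogy for experimentation, not TensorFlow code; note that NumPy spells the keyword `keepdims`.

```python
import numpy as np

x = np.ones((2, 3), dtype=int)  # [[1, 1, 1], [1, 1, 1]]

print(int(np.sum(x)))                             # 6
print(np.sum(x, axis=0).tolist())                 # [2, 2, 2]
print(np.sum(x, axis=1).tolist())                 # [3, 3]
print(np.sum(x, axis=1, keepdims=True).tolist())  # [[3], [3]]
print(int(np.sum(x, axis=(0, 1))))                # 6
```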

# Testing and algorithm evaluation
***

In [7]:
# accuracy accumulates the fraction of correctly classified test images:
accuracy = 0

# Initialize the variables:
# older spelling: init = tf.initialize_all_variables()
init = tf.global_variables_initializer()


In [8]:
# Start the simulation:
with tf.Session() as sess:
    sess.run(init)
    for i in range(len(test_list_of_values)):
        # Evaluate the nearest neighbor index using the pred op defined earlier:
        nn_index = sess.run(pred, feed_dict={train_pixel_tensor: train_pixels, test_pixel_tensor: test_pixels[i, :]})

        # Find the nearest neighbor's class label and compare it to the true label:
        print("Test N° ", i, "Predicted Class: ", np.argmax(train_list_values[nn_index]), "True Class: ", np.argmax(test_list_of_values[i]))

        if np.argmax(train_list_values[nn_index]) == np.argmax(test_list_of_values[i]):
            # Each correct prediction adds 1/len(test_pixels) to the accuracy:
            accuracy += 1./len(test_pixels)

    # Report the accuracy of the classifier:
    print("Result = ", accuracy)

Test N°  0 Predicted Class:  7 True Class:  7
Test N°  1 Predicted Class:  2 True Class:  2
Test N°  2 Predicted Class:  1 True Class:  1
Test N°  3 Predicted Class:  0 True Class:  0
Test N°  4 Predicted Class:  4 True Class:  4
Test N°  5 Predicted Class:  1 True Class:  1
Test N°  6 Predicted Class:  4 True Class:  4
Test N°  7 Predicted Class:  9 True Class:  9
Test N°  8 Predicted Class:  6 True Class:  5
Test N°  9 Predicted Class:  9 True Class:  9
Result =  0.8999999999999999
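
The result prints as 0.8999999999999999 rather than 0.9 because the accuracy is built up by adding 0.1 nine times, and 0.1 has no exact binary floating-point representation. A small sketch of the effect, using a made-up list of outcomes that mirrors the run above (only test 8 is wrong), and an alternative that counts the hits and divides once:

```python
num_tests = 10
# True where the prediction matched the true class; only test 8 failed.
outcomes = [True] * 8 + [False] + [True]

# Accumulating 1/num_tests per hit, as in the cell above.
accumulated = 0
for correct in outcomes:
    if correct:
        accumulated += 1. / num_tests

# Counting hits first and dividing once avoids the accumulated rounding error.
counted = sum(outcomes) / num_tests

print(accumulated)  # 0.8999999999999999
print(counted)      # 0.9
```

Either way, the classifier labels 9 of the 10 test images correctly.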
