# Neural-Networks Project

## Introduction
An essential part of the behavior of humans is their ability to recognize objects. Humans are able to recognize
large numbers of other humans, letters, digits, and so on.
The object recognition problem can be defined as a labeling problem based on models of known objects. Formally,
given an image containing one or more objects of interest (and background) and a set of labels corresponding to a set
of models known to the system, the system should assign correct labels to regions, or a set of regions, in the image.

## Objective
The goal of this project is to build an object recognition system that can pick out and identify objects in an
input camera image, as shown in Figure 1, based on a set of registered objects.

## System Architecture

Input -> Feature Extraction -> Classifier -> Voting -> Output
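
The voting stage is not spelled out in this section. As a minimal sketch, assuming the classifier emits one class index per keypoint and that a simple majority decides the image label, it could look like the following. The function name `majority_vote` and the sample predictions are hypothetical.

```python
from collections import Counter

def majority_vote(keypoint_predictions):
    """Return the most frequent class index and its share of the votes.

    Assumption: each entry is the class index predicted for one keypoint.
    """
    if not keypoint_predictions:
        raise ValueError("no predictions to vote on")
    counts = Counter(keypoint_predictions)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / float(len(keypoint_predictions))

# Hypothetical per-keypoint predictions for one image
label, share = majority_vote([0, 3, 3, 4, 3, 0, 3])
print("winning class: {}, vote share: {:.2f}".format(label, share))
```

A single winner only suits images with one object; images with several objects need a rule that can return more than one class.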

## Feature Extraction

- Use the Scale-Invariant Feature Transform (SIFT) algorithm [1] to extract features from an image.
- SIFT features have many properties that make them suitable for matching different images of an object or scene.
- The features are invariant to image scaling and rotation, and partially invariant to changes in illumination and 3D camera viewpoint.
- An important aspect of this approach is that it generates large numbers of features that densely cover the image over the full range of scales and locations.
- As shown in Figure 3, given an image, SIFT generates a set of keypoints. Each keypoint consists of its location, scale, orientation, and a 128-element descriptor vector.
- The keypoints are the samples, and the features of each sample are the 128 elements of its descriptor.
- You can use the SIFT implementation in the VLFeat library [2] or in OpenCV [3].
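
To make the "keypoints are the samples" idea concrete, here is a small sketch of how per-image descriptor arrays could be stacked into one training matrix, with each keypoint inheriting the label of its image. It uses random arrays in place of real SIFT output, and `stack_descriptors` is a hypothetical helper, not part of OpenCV or VLFeat.

```python
import numpy as np

def stack_descriptors(descriptors_per_image, image_labels):
    """Stack (n_i, 128) descriptor arrays into X and repeat each image label n_i times."""
    X = np.vstack(descriptors_per_image)
    y = np.concatenate([np.full(len(d), label, dtype=int)
                        for d, label in zip(descriptors_per_image, image_labels)])
    return X, y

# Stand-ins for SIFT output: one image with 5 keypoints, one with 3
rng = np.random.RandomState(0)
fake_descs = [rng.rand(5, 128), rng.rand(3, 128)]
X, y = stack_descriptors(fake_descs, [0, 2])
print("X shape: {}, y: {}".format(X.shape, y.tolist()))
```

Each row of `X` is one keypoint, so an image with many keypoints contributes many training samples.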

## Requirements

- The user must be able to provide an input image to the application, and the application must identify the objects in that image.
- Test your classifier on the test images, and report its performance using the Overall Accuracy (OA) and the confusion matrix.
- Provide a comparative study showing the differences between the three classification algorithms based on the six evaluation measures mentioned above. A report template will be provided for you to fill in.
- The report must also show the different NN architectures and parameters you used, and their effect on the training and testing results.
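
As an illustration of the two measures named above, the sketch below builds a confusion matrix from integer class labels and derives the Overall Accuracy from it. The helper names and the toy labels are hypothetical; rows are taken to be true classes and columns predicted classes.

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(true_labels, predicted_labels):
        cm[t, p] += 1
    return cm

def overall_accuracy(cm):
    """Fraction of samples that lie on the diagonal of the confusion matrix."""
    return np.trace(cm) / float(cm.sum())

# Toy labels for a 3-class problem
y_true = [0, 0, 1, 2, 2, 2]
y_pred = [0, 1, 1, 2, 2, 0]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
oa = overall_accuracy(cm)
print(cm)
print("OA = {:.3f}".format(oa))
```

scikit-learn, which the notebook already imports, offers `sklearn.metrics.confusion_matrix` and `sklearn.metrics.accuracy_score` for the same purpose.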

### Bonus
Conduct a comparative study of different feature extraction algorithms, such as SIFT, PCA-SIFT [5], and
SURF [6], to show their effect on the classification performance for the project's objective, based
on the six evaluation measures mentioned above. (A report template will be provided for this.)

## How to use ipython-notebook?
The IPython notebook acts as a client to the server that is running in your console right now.

The notebook is made of cells, just like this one. Click a cell to edit it. After editing a cell, whether it is a "markdown" cell or a "code" cell, you can evaluate (run) it by pressing "shift+enter" or by using the play button in the icon bar above.

You can see and change the type of a cell using the dropdown menu in the icon bar above.

You can evaluate cells in any order you wish. A cell that has not been evaluated shows empty brackets, "[]", on its left side. While the code in the cell is running or waiting to be executed, the indicator is "[\*]". After the cell finishes executing, the indicator shows a number giving its order of execution: "[1]" for the first cell, "[2]" for the second, and so on.

You can use the *TAB* key to auto-complete code.

You can append *?* to a name and evaluate the cell to get its documentation, for example: "np.array?"

### Step 01:
List the training and testing files.

In [4]:
import os,sys
import numpy as np
import cv2
from os import listdir
from os.path import isfile, join
from sklearn import svm
from sklearn.cluster import KMeans
import random

import warnings
warnings.filterwarnings("ignore")


TRAINING_PATH = "Data set/Training/"
TESTING_PATH = "Data set/Testing/"
TRAINING_FILES = [f for f in listdir(TRAINING_PATH) if isfile(join(TRAINING_PATH, f))]
TESTING_FILES = [f for f in listdir(TESTING_PATH) if isfile(join(TESTING_PATH, f))]

### Step 02:
Load the images and labels into dictionaries.

In [59]:
Labels_Index = {
                "Cat":        np.array([1.0,0.0,0.0,0.0,0.0]),
                "Laptop":     np.array([0.0,1.0,0.0,0.0,0.0]),
                "Apple":      np.array([0.0,0.0,1.0,0.0,0.0]),
                "Car":        np.array([0.0,0.0,0.0,1.0,0.0]),
                "Helicopter": np.array([0.0,0.0,0.0,0.0,1.0])
                }

#given the word this function returns the onehot vector label
def getLabelIndex(word):
    if word in Labels_Index:
        return Labels_Index[word]
    else:
        return np.zeros(5)

#given the onehot vector label this function returns the word
def getLabelWord(index):
    for key in Labels_Index.keys():
        if np.argmax(index) == np.argmax(Labels_Index[key]):
            return key
    return ""

def getIndexWord(index):
    one_hot = np.array([0,0,0,0,0])
    one_hot[int(index)] = 1.0
    return getLabelWord(one_hot)

In [6]:
#given a filename, this function returns the list of labels associated with the file
#it relies on the filename containing the names of the classes present in the image
def getImageLabels(image_filename):
    result = []
    if image_filename.find("Cat") != -1:
        result.append(Labels_Index["Cat"])
    
    if image_filename.find("Laptop") != -1:
        result.append(Labels_Index["Laptop"])
        
    if image_filename.find("Apple") != -1:
        result.append(Labels_Index["Apple"])
        
    if image_filename.find("Car") != -1:
        result.append(Labels_Index["Car"])
        
    if image_filename.find("Helicopter") != -1:
        result.append(Labels_Index["Helicopter"])
        
    return result
        

### Data Layout:
- Data = Dictionary or map of data
- Key = File name
- Value = Dictionary {"image": numpy array of image data, "labels": list of one-hot vectors representing the labels associated with this image}

### How to Iterate over Data?
```
for filename in TrainingData:
    image = TrainingData[filename]["image"]
    labels = TrainingData[filename]["labels"]
```

In [7]:
#dictionary that contains the training data
TrainingData = {}
for image in TRAINING_FILES:
    image_filename = join(TRAINING_PATH, image)
    TrainingData[image_filename] = {"image": cv2.imread(image_filename),
                                     "labels": getImageLabels(image_filename)}
#dictionary that contains the testing data
TestingData = {}
for image in TESTING_FILES:
    image_filename = join(TESTING_PATH, image)
    TestingData[image_filename] = {"image": cv2.imread(image_filename),
                                    "labels": getImageLabels(image_filename)}

Function to view an image given its image matrix/data

In [8]:
def view_image(img):
    cv2.startWindowThread()
    cv2.namedWindow("preview")
    cv2.imshow("preview" ,img)

Get SIFT Features and a function to draw and view the features on the image

In [9]:
# OpenCV 2.4 API; newer OpenCV releases expose the detector as cv2.SIFT_create()
SIFT = cv2.SIFT()
def getKeyPoints(img):
    return SIFT.detect(img,None)

def getKeyDescPoints(img):
    return SIFT.detectAndCompute(img, None)

def viewSIFTPoints(img,points):
    view_image(cv2.drawKeypoints(img,points))

In [10]:
def logical_or_labels(labels):
    result = None
    for label in labels:
        if result is None:
            result = label
        else:
            result = np.logical_or(result, label).astype(float)
    return result
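
As a short usage sketch of the idea behind `logical_or_labels`, the snippet below merges two one-hot vectors into a single multi-hot vector. It uses `np.logical_or.reduce` as a compact stand-in for the loop above, and the example labels are chosen for illustration only.

```python
import numpy as np

# One-hot vectors in the same order as Labels_Index: Cat, Laptop, Apple, Car, Helicopter
cat = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
car = np.array([0.0, 0.0, 0.0, 1.0, 0.0])

# Element-wise OR over all label vectors of one image
multi_hot = np.logical_or.reduce([cat, car]).astype(float)
print(multi_hot)

# argmax returns the first position holding the maximum, so only one class survives
first_class = int(np.argmax(multi_hot))
print(first_class)
```

Because `np.argmax` returns the first index that is set, any code that applies it to a multi-hot vector keeps a single class per image and drops the others.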

In [11]:
SVM = svm.NuSVC(probability=True,decision_function_shape="ovr")
def train_SVM():
    all_descriptions = []
    all_labels = []
    for key in TrainingData.keys():
        img = TrainingData[key]["image"]
        img = cv2.Canny(img, 100, 200)
        points, descriptions = getKeyDescPoints(img)
        label = logical_or_labels(TrainingData[key]["labels"])
        descriptions = np.array(descriptions)
        all_descriptions.append(np.mean(descriptions, axis = 0))
        all_labels.append(np.argmax(label))
    
    SVM.fit(all_descriptions,all_labels)
    
    kmeans = KMeans(n_clusters=3)
    for key in TestingData.keys():
        org_img = TestingData[key]["image"]
        img = cv2.Canny(org_img, 100, 200)
        points, descriptions = getKeyDescPoints(img)
        all_points = [point.pt for point in points]
        kmeans.fit(all_points)
        clusters_indices = kmeans.predict(all_points)
        cluster_pts0 = []
        cluster_pts1 = []
        cluster_pts2 = []
        points0 = []
        points1 = []
        points2 = []
        desc0 = []
        desc1 = []
        desc2 = []
        clusters = [0,0,0]
        for i in xrange(len(clusters_indices)):
            clusters[clusters_indices[i]] += 1
            if clusters_indices[i] == 0:
                cluster_pts0.append(points[i])
                desc0.append(descriptions[i])
                points0.append(points[i].pt)
            elif clusters_indices[i] == 1:
                cluster_pts1.append(points[i])
                desc1.append(descriptions[i])
                points1.append(points[i].pt)
            elif clusters_indices[i] == 2:
                cluster_pts2.append(points[i])
                desc2.append(descriptions[i])
                points2.append(points[i].pt)
        
        points0 = np.array(points0)
        points1 = np.array(points1)
        points2 = np.array(points2)
        
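        # predict expects a 2D array of shape (n_samples, n_features)
        c1 = SVM.predict(np.mean(np.array(desc0), axis = 0).reshape(1, -1))

        c2 = SVM.predict(np.mean(np.array(desc1), axis = 0).reshape(1, -1))

        c3 = SVM.predict(np.mean(np.array(desc2), axis = 0).reshape(1, -1))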
        
        answer = {}
        answer[getIndexWord(c1[0])] = points0
        answer[getIndexWord(c2[0])] = points1
        answer[getIndexWord(c3[0])] = points2
        
        answer_image = org_img
        
        colors = [(255,0,0), (0,255,0), (0,0,255)]
        color_ix = 0
        print "answers: ", len(answer)
        for ans in answer.keys():
            # bounding box: per-axis minimum and maximum over the cluster's points
            min_pt = answer[ans].min(axis=0)
            max_pt = answer[ans].max(axis=0)
            cv2.rectangle(answer_image,(int(min_pt[0]), int(min_pt[1])), (int(max_pt[0]), int(max_pt[1])),colors[color_ix])
            cv2.putText(answer_image, ans, (int(min_pt[0]), int(min_pt[1])), cv2.FONT_HERSHEY_DUPLEX, 0.5, colors[color_ix])
            if color_ix == 0:
                print "Blue: ", ans
            elif color_ix == 1:
                print "Green: ", ans
            elif color_ix == 2:
                print "Red: ", ans
                    
            color_ix += 1
            if color_ix > 2:
                color_ix = 0
        view_image(answer_image)
        raw_input("wait")
        

In [15]:
train_SVM()



4
4
3
[4] [4] [3]
answers:  2
Blue:  Helicopter
Green:  Car
wait




0
0
2
[0] [0] [2]
answers:  2
Blue:  Apple
Green:  Cat
wait




2
2
4
[2] [2] [4]
answers:  2
Blue:  Helicopter
Green:  Apple
wait




1
4
2
[1] [4] [2]
answers:  3
Blue:  Helicopter
Green:  Laptop
Red:  Apple
wait




2
2
2
[2] [2] [2]
answers:  1
Blue:  Apple
wait




1
1
2
[1] [1] [2]
answers:  2
Blue:  Laptop
Green:  Apple


KeyboardInterrupt: 

## Back-Propagation
### Network Class

In [41]:
class Network(object):

    def __init__(self, sizes):
        """The list ``sizes`` contains the number of neurons in the
        respective layers of the network.  For example, if the list
        was [2, 3, 1] then it would be a three-layer network, with the
        first layer containing 2 neurons, the second layer 3 neurons,
        and the third layer 1 neuron.  The biases and weights for the
        network are initialized randomly, using a Gaussian
        distribution with mean 0, and variance 1.  Note that the first
        layer is assumed to be an input layer, and by convention we
        won't set any biases for those neurons, since biases are only
        ever used in computing the outputs from later layers."""
        self.num_layers = len(sizes)
        self.sizes = sizes
        self.biases = [np.random.randn(y, 1) for y in sizes[1:]]
        self.weights = [np.random.randn(y, x)
                        for x, y in zip(sizes[:-1], sizes[1:])]

    def feedforward(self, a):
        """Return the output of the network if ``a`` is input."""
        for b, w in zip(self.biases, self.weights):
            a = sigmoid(np.dot(w, a)+b)
        return a

    def SGD(self, training_data, epochs, mini_batch_size, eta,
            test_data=None):
        """Train the neural network using mini-batch stochastic
        gradient descent.  The ``training_data`` is a list of tuples
        ``(x, y)`` representing the training inputs and the desired
        outputs.  The other non-optional parameters are
        self-explanatory.  If ``test_data`` is provided then the
        network will be evaluated against the test data after each
        epoch, and partial progress printed out.  This is useful for
        tracking progress, but slows things down substantially."""
        if test_data: n_test = len(test_data)
        n = len(training_data)
        for j in xrange(epochs):
            random.shuffle(training_data)
            mini_batches = [
                training_data[k:k+mini_batch_size]
                for k in xrange(0, n, mini_batch_size)]
            for mini_batch in mini_batches:
                self.update_mini_batch(mini_batch, eta)
            if test_data:
                print "Epoch {0}: {1} / {2}".format(
                    j, self.evaluate(test_data), n_test)
            else:
                print "Epoch {0} complete".format(j)

    def update_mini_batch(self, mini_batch, eta):
        """Update the network's weights and biases by applying
        gradient descent using backpropagation to a single mini batch.
        The ``mini_batch`` is a list of tuples ``(x, y)``, and ``eta``
        is the learning rate."""
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        for x, y in mini_batch:
            delta_nabla_b, delta_nabla_w = self.backprop(x, y)
            nabla_b = [nb+dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
            nabla_w = [nw+dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
        self.weights = [w-(eta/len(mini_batch))*nw
                        for w, nw in zip(self.weights, nabla_w)]
        self.biases = [b-(eta/len(mini_batch))*nb
                       for b, nb in zip(self.biases, nabla_b)]

    def backprop(self, x, y):
        """Return a tuple ``(nabla_b, nabla_w)`` representing the
        gradient for the cost function C_x.  ``nabla_b`` and
        ``nabla_w`` are layer-by-layer lists of numpy arrays, similar
        to ``self.biases`` and ``self.weights``."""
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        # feedforward
        activation = x
        activations = [x] # list to store all the activations, layer by layer
        zs = [] # list to store all the z vectors, layer by layer
        for b, w in zip(self.biases, self.weights):
            z = np.dot(w, activation)+b
            zs.append(z)
            activation = sigmoid(z)
            activations.append(activation)
        # backward pass
        sg_prime = sigmoid_prime(zs[-1])
        cost_dev = self.cost_derivative(activations[-1], y)
        delta = cost_dev * sg_prime
        nabla_b[-1] = delta
        #print delta.shape, activations[-2].transpose().shape
        nabla_w[-1] = np.dot(delta, activations[-2].transpose())
        # Note that the variable l in the loop below is used a little
        # differently to the notation in Chapter 2 of the book.  Here,
        # l = 1 means the last layer of neurons, l = 2 is the
        # second-last layer, and so on.  It's a renumbering of the
        # scheme in the book, used here to take advantage of the fact
        # that Python can use negative indices in lists.
        for l in xrange(2, self.num_layers):
            z = zs[-l]
            sp = sigmoid_prime(z)
            delta = np.dot(self.weights[-l+1].transpose(), delta) * sp
            nabla_b[-l] = delta
            nabla_w[-l] = np.dot(delta, activations[-l-1].transpose())
        return (nabla_b, nabla_w)

    def evaluate(self, test_data):
        """Return the number of test inputs for which the neural
        network outputs the correct result. Note that the neural
        network's output is assumed to be the index of whichever
        neuron in the final layer has the highest activation."""
        test_results = [(np.argmax(self.feedforward(x)), y)
                        for (x, y) in test_data]

        return sum(int(x == y) for (x, y) in test_results)
    
    def evaluate_test(self, test_data):
        """Return the number of test inputs for which the neural
        network outputs the correct result. Note that the neural
        network's output is assumed to be the index of whichever
        neuron in the final layer has the highest activation."""
        test_results = [(np.argmax(self.feedforward(x)), y)
                        for (x, y) in test_data]

        return test_results

    def cost_derivative(self, output_activations, y):
        """Return the vector of partial derivatives \partial C_x /
        \partial a for the output activations."""
        return (output_activations-y)
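
For reference, the quantities computed by `backprop` and `update_mini_batch` can be written out as follows. This is a summary sketch rather than a derivation: it assumes the quadratic cost $C_x = \tfrac{1}{2}\lVert a^{L} - y \rVert^{2}$, which is consistent with `cost_derivative` returning `output_activations-y`. Here $L$ is the last layer, $a^{0} = x$ is the input, $\odot$ is the element-wise product, and $m$ is the mini-batch size.

```latex
\begin{aligned}
\delta^{L} &= (a^{L} - y) \odot \sigma'(z^{L}) \\
\delta^{l} &= \left( (W^{l+1})^{\mathsf{T}} \delta^{l+1} \right) \odot \sigma'(z^{l}) \\
\frac{\partial C_x}{\partial b^{l}} &= \delta^{l} \\
\frac{\partial C_x}{\partial W^{l}} &= \delta^{l} \, (a^{l-1})^{\mathsf{T}} \\
W^{l} &\leftarrow W^{l} - \frac{\eta}{m} \sum_{x} \frac{\partial C_x}{\partial W^{l}}, \qquad
b^{l} \leftarrow b^{l} - \frac{\eta}{m} \sum_{x} \frac{\partial C_x}{\partial b^{l}}
\end{aligned}
```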

#### Miscellaneous functions
def sigmoid(z):
    """The sigmoid function."""
    return 1.0/(1.0+np.exp(-z))

def sigmoid_prime(z):
    """Derivative of the sigmoid function."""
    return sigmoid(z)*(1-sigmoid(z))
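
A quick way to gain confidence in `sigmoid_prime` is to compare it with a central finite difference of `sigmoid`. The sketch below repeats the two functions so that it runs on its own; the step size `h` and the sample points are arbitrary choices.

```python
import numpy as np

def sigmoid(z):
    """The sigmoid function."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    """Derivative of the sigmoid function."""
    return sigmoid(z) * (1 - sigmoid(z))

z = np.linspace(-5.0, 5.0, 11)
h = 1e-5
# Central difference approximation of the derivative
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)
max_error = np.max(np.abs(numeric - sigmoid_prime(z)))
print("largest deviation: {:.2e}".format(max_error))
```

The deviation should be tiny; a large value would point to a mistake in the analytic derivative.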

### Training & Testing

In [42]:
NET = Network([128, 25, 5])
def devectorize_result(arr):
    return np.argmax(arr)
def backPropagation():
    
    def vectorized_result(j):
        e = np.zeros((5, 1))
        e[j] = 1.0
        return e
    
    def devectorize_result(arr):
        return np.argmax(arr)

    training_samples = []
    training_labels = []
    for key in TrainingData.keys():
        kp, desc = getKeyDescPoints(TrainingData[key]["image"])
        
        for i in xrange(len(desc)):
            training_samples.append(np.reshape(desc[i], (128,1)))
            label = TrainingData[key]["labels"]
            training_labels.append(vectorized_result(np.argmax(label[0])))
    training_data = zip(training_samples, training_labels)
    
    testing_samples = []
    testing_labels = []
    
    for key in TestingData.keys():
        dp, desc = getKeyDescPoints(TestingData[key]["image"])
        
        for i in xrange(len(desc)):
            testing_samples.append(np.reshape(desc[i], (128,1)))
            testing_labels.append(devectorize_result(TestingData[key]["labels"][0]))

    testing_data = zip(testing_samples, testing_labels)
    
    NET.SGD(training_data, 30, 100, 0.5, test_data=None)
        

In [43]:
backPropagation()

Epoch 0 complete
Epoch 1 complete
Epoch 2 complete
Epoch 3 complete
Epoch 4 complete
Epoch 5 complete
Epoch 6 complete
Epoch 7 complete
Epoch 8 complete
Epoch 9 complete
Epoch 10 complete
Epoch 11 complete
Epoch 12 complete
Epoch 13 complete
Epoch 14 complete
Epoch 15 complete
Epoch 16 complete
Epoch 17 complete
Epoch 18 complete
Epoch 19 complete
Epoch 20 complete
Epoch 21 complete
Epoch 22 complete
Epoch 23 complete
Epoch 24 complete
Epoch 25 complete
Epoch 26 complete
Epoch 27 complete
Epoch 28 complete
Epoch 29 complete


In [60]:
testing_samples = []
testing_labels = []

for key in TestingData.keys():
    dp, desc = getKeyDescPoints(TestingData[key]["image"])

    for i in xrange(len(desc)):
        testing_samples.append(np.reshape(desc[i], (128,1)))
        testing_labels.append(devectorize_result(TestingData[key]["labels"][0]))
        
    result = NET.evaluate_test(zip(testing_samples, testing_labels))
    result_classes = {}
    for (x, y) in result:
        if x in result_classes:
            result_classes[x] += 1
        else:
            result_classes[x] = 1
    for label in TestingData[key]["labels"]:
        ix = np.argmax(label)
        print ix, getIndexWord(ix),
    print result_classes
            
    
        
    testing_samples = []
    testing_labels = []

testing_data = zip(testing_samples, testing_labels)

3 Car 4 Helicopter {0: 453, 1: 104, 2: 4, 3: 207, 4: 201}
0 Cat {0: 786, 1: 106, 2: 2, 3: 252, 4: 264}
0 Cat 2 Apple 4 Helicopter {0: 620, 1: 85, 2: 6, 3: 199, 4: 256}
3 Car {0: 494, 1: 76, 2: 2, 3: 273, 4: 232}
0 Cat 1 Laptop {0: 912, 1: 653, 2: 3, 3: 276, 4: 205}
1 Laptop {0: 350, 1: 97, 3: 148, 4: 141}
0 Cat 1 Laptop {0: 341, 1: 42, 2: 1, 3: 120, 4: 99}
2 Apple {0: 52, 1: 6, 2: 1, 3: 18, 4: 14}
0 Cat 1 Laptop {0: 380, 1: 98, 2: 1, 3: 190, 4: 197}
1 Laptop 3 Car {0: 408, 1: 87, 2: 4, 3: 186, 4: 194}
1 Laptop 2 Apple {0: 250, 1: 65, 3: 108, 4: 95}
0 Cat 3 Car {0: 134, 1: 26, 2: 1, 3: 95, 4: 68}
4 Helicopter {0: 169, 1: 32, 3: 53, 4: 44}
0 Cat 3 Car {0: 544, 1: 122, 2: 3, 3: 237, 4: 221}
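
The per-image histograms above count how many keypoints the network assigned to each class. One possible way to turn such a histogram into image-level labels, sketched here under the assumption that a class is reported when it receives at least a fixed share of the votes, is shown below. The helper `labels_from_votes` and the 0.15 threshold are hypothetical choices, not part of the notebook.

```python
def labels_from_votes(vote_counts, min_share=0.15):
    """Return the sorted class indices whose share of the keypoint votes reaches min_share."""
    total = float(sum(vote_counts.values()))
    shares = {cls: n / total for cls, n in vote_counts.items()}
    return sorted(cls for cls, share in shares.items() if share >= min_share)

# First histogram of the output above; the true labels are Car (3) and Helicopter (4)
votes = {0: 453, 1: 104, 2: 4, 3: 207, 4: 201}
predicted = labels_from_votes(votes)
print(predicted)
```

On this histogram the rule reports classes 0, 3 and 4, so Car and Helicopter are found but Cat is reported as well. Class 0 receives the largest share in every row of the output above, which suggests the trained network is biased toward it.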


In [44]:
print NET.evaluate_test(testing_data)

[(1, 3), (1, 3), (0, 3), (3, 3), (0, 3), (1, 3), (1, 3), (1, 3), (3, 3), (0, 3), (0, 3), (0, 3), (0, 3), (4, 3), (4, 3), (0, 3), (1, 3), (3, 3), (3, 3), (1, 3), (4, 3), (4, 3), (4, 3), (3, 3), (3, 3), (0, 3), (0, 3), (0, 3), (0, 3), (4, 3), (3, 3), (0, 3), (0, 3), (4, 3), (0, 3), (4, 3), (4, 3), (0, 3), (4, 3), (0, 3), (0, 3), (3, 3), (4, 3), (0, 3), (0, 3), (0, 3), (0, 3), (1, 3), (4, 3), (0, 3), (0, 3), (1, 3), (0, 3), (0, 3), (0, 3), (4, 3), (3, 3), (0, 3), (4, 3), (0, 3), (3, 3), (0, 3), (0, 3), (1, 3), (1, 3), (3, 3), (1, 3), (0, 3), (0, 3), (0, 3), (3, 3), (0, 3), (4, 3), (4, 3), (3, 3), (4, 3), (4, 3), (0, 3), (3, 3), (3, 3), (1, 3), (0, 3), (0, 3), (0, 3), (3, 3), (0, 3), (0, 3), (3, 3), (0, 3), (1, 3), (0, 3), (3, 3), (0, 3), (1, 3), (0, 3), (0, 3), (3, 3), (1, 3), (1, 3), (1, 3), (4, 3), (0, 3), (0, 3), (3, 3), (3, 3), (4, 3), (0, 3), (3, 3), (0, 3), (0, 3), (4, 3), (0, 3), (0, 3), (4, 3), (1, 3), (0, 3), (0, 3), (0, 3), (0, 3), (0, 3), (0, 3), (0, 3), (3, 3), (0, 3), (1, 3),