> **Jupyter slideshow:** This notebook can be displayed as slides. To view it as a slideshow in your browser, type the following in the console:


> `> jupyter nbconvert [this_notebook.ipynb] --to slides --post serve`


> To toggle off the slideshow cell formatting, click the `CellToolbar` button, then `View --> Cell Toolbar --> None`

In [1]:
from __future__ import division, print_function

from IPython.core.display import Image
from IPython.display import YouTubeVideo

import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
%matplotlib inline

<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">


# Computer Vision via Neural Nets (w/ GraphLab Create)

_Author: Alex Combs (NYC)_

---

<a id="learning-objectives"></a>
### Learning Objectives
*After this lesson, you will be able to:*
- Understand how perceptrons work
- Understand how logistic regressors are continuous perceptrons
- Use GraphLab to train a neural network on images

**Note: This lesson's exercises require registration at GraphLab Create, which offers a [one-year student trial](https://turi.com/download/academic.html).**

### Lesson Guide

- [Opening](#opening)
	- [Neural Networks](#neural-networks)
	- [The perceptron](#the-perceptron)
	- [The perceptron as a decision process](#the-perceptron-as-a-decision-process)
	- [Perceptron update rule](#perceptron-update-rule)
	- [The perceptron as the fundamental building block of neural nets](#the-perceptron-as-the-fundamental-building-block-of-neural-nets)
	- [The continuous perceptron: Logistic Regression](#the-continuous-perceptron-logistic-regression)


- [Image similarity with deep features](#image-similarity-with-deep-features)
- [GraphLab Create Exercise](#getting-started-with-graphlab-create)
- [Independent Exercise](#independent-exercise)


<a id="opening"></a>
## Opening
---

Today we're going to talk about neural networks and their applications in computer vision.

<a id="neural-networks"></a>
### Neural Networks

Deep learning is one of the most popular and cutting-edge incarnations of neural networks. To understand deep learning, we will start from the basics of neural networks. One of the original building blocks of neural networks is the perceptron.

Perceptrons date back to the 1950s. They were an attempt to adapt the biological model of the brain to computers. Their first application was, naturally, image recognition.

<a id="the-perceptron"></a>
### The Perceptron

You may recall from your high school biology that neurons work in the following way:

- Each neuron is connected in a network to many other neurons.
- These neurons both send signals to and receive signals from connected neurons.
- When a neuron receives a signal it can either fire or not, depending on whether the incoming signal is above some threshold.

A single perceptron, like a neuron, can be thought of as a decision-making unit. If the weight of the incoming signals is above a threshold, the perceptron fires; if not, it doesn't. In this case, firing equals outputting a value of 1 and not firing equals outputting a value of 0.

<img src="images/ann-perceptron.png" width=500>
<br>
<img src="images/ann-perceptron-math.png" width=500>

<a id="the-perceptron-as-a-decision-process"></a>
### The perceptron as a decision process

As an example, imagine you are considering taking a new job. There are several factors that go into the decision:
- The location.
- The pay.
- The type of work.
- How trendy the office is.

Obviously, not all of these will carry the same weight when making your decision. In the perceptron, each input then is modified by a weighting factor. 

If the weighted incoming values are greater than the threshold the unit outputs a 1; if not, it outputs a 0.
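
As a rough sketch of this decision process, the snippet below scores a job offer with a single perceptron. The factor scores, weights, and threshold are made-up values for illustration, and `perceptron_output` is a hypothetical helper name, not part of any library.

```python
import numpy as np

def perceptron_output(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs exceeds the threshold, else 0."""
    weighted_sum = np.dot(weights, inputs)
    return int(weighted_sum > threshold)

# Hypothetical scores (0 to 1) for location, pay, type of work, office trendiness.
job_offer = np.array([0.9, 0.6, 0.8, 0.2])
# Hypothetical weights: pay and type of work matter most, trendiness least.
weights = np.array([0.3, 0.5, 0.5, 0.1])
threshold = 0.8

decision = perceptron_output(job_offer, weights, threshold)
print(decision)  # 1 means "take the job", 0 means "pass"
```

Changing the weights changes which factors can tip the decision, which is exactly what learning will adjust.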

#### Example

Let's now see how learning can take place with a perceptron. We will train a perceptron with two inputs to learn the `AND` rule. That is, we want the perceptron to output a 1 when both inputs are 1, and a 0 in all other cases.

Say our $x_1$ input is equal to 1 and our $x_2$ input is equal to -1. $w_1$ is randomly set to 0.8, and $w_2$ is randomly set to 0.4. Because both inputs must be positive for our output to be 1, our threshold is any value greater than 1.

Therefore, we have:

### $$w_1x_1 + w_2x_2 = 0.8 \cdot 1 + 0.4 \cdot (-1) = 0.8 - 0.4 = 0.4$$

Now, since we expect our output to be 0, we can say that we have an error of 0.4. We will now attempt to improve our model by pushing these errors back down to the inputs in order to update the weights.
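
The arithmetic above can be checked in a few lines. This is a minimal sketch that follows the text in treating the raw weighted sum as the perceptron's actual outcome; the variable names are our own.

```python
import numpy as np

x = np.array([1.0, -1.0])   # inputs x_1, x_2 from the example
w = np.array([0.8, 0.4])    # initial weights w_1, w_2 from the example
target = 0.0                # AND should output 0 for this input pair

weighted_sum = np.dot(w, x)     # w_1*x_1 + w_2*x_2
error = weighted_sum - target   # how far the outcome is from the target

print(round(weighted_sum, 2), round(error, 2))
```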

<a id="perceptron-update-rule"></a>
### Perceptron update rule

<img src="images/ann-perceptron-update.png">

Here, $w_i$ is the weight of the $i$th input, $t$ is the target outcome, and $o$ is the actual outcome. $\eta$ is the learning rate, which determines how large or small our updates should be. For now, we will assume it is set to 1.

In our example case on the previous slide, our target outcome is 0, and our actual outcome is 0.4. 


**Updating the $x_1$ weight**

Let's look at $x_1$ to update its weight. We have $1 \cdot (0 - 0.4) \cdot 1$, which equals -0.4. That is our $w$ delta; therefore, updating $w_1$, we have $0.8 - 0.4$, which gives us our new weight $w_1 = 0.4$.

**Updating the $x_2$ weight**

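The weight for $x_1$ has come down. What about the weight for $x_2$? Let's take a look. That one is $1 \cdot (0 - 0.4) \cdot (-1)$, which equals 0.4. Updating the weight, we have $0.4 + 0.4 = 0.8$, so our new weight is $w_2 = 0.8$. Notice that the update lowered $w_1$ and raised $w_2$: because $x_2$ is negative, both changes reduce the weighted sum for this input (it is now $0.4 - 0.8 = -0.4$), which is the direction we need for an output of 0.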
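
A minimal sketch of the update rule, written here as $\Delta w_i = \eta (t - o) x_i$ to match the arithmetic above, applied to the example's starting values ($w_1 = 0.8$, $w_2 = 0.4$, inputs $1$ and $-1$, $\eta = 1$). The function name `update_weights` is a hypothetical helper, and, as in the text, the raw weighted sum stands in for the actual outcome.

```python
import numpy as np

def update_weights(weights, inputs, target, learning_rate=1.0):
    """Apply one perceptron update: w_i <- w_i + eta * (t - o) * x_i."""
    outcome = np.dot(weights, inputs)   # raw weighted sum, as in the text
    delta = learning_rate * (target - outcome) * inputs
    return weights + delta

w = np.array([0.8, 0.4])    # starting weights from the example
x = np.array([1.0, -1.0])   # inputs from the example

new_w = update_weights(w, x, target=0.0)
print(np.round(new_w, 2))
```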


<a id="the-perceptron-as-the-fundamental-building-block-of-neural-nets"></a>
### The perceptron as the fundamental building block of neural nets

This ultra-simple model is conceptually the building block of today's fancier neural networks. Of course, many advancements have been applied over the years. For example, the mathematics of neurons today are continuous rather than discrete, many activation functions have been tested (including our familiar logistic function), and different network structures have been experimented with. The phrase "deep learning" indicates the network contains "hidden layers" -- layers between the input layers and output layers.

<img src="images/ann-layers.png">

We are actually already familiar with one of the most frequently used continuous neuron models -- logistic regression! Recall that a logistic regressor is the logistic (sigmoid) function applied to a linear function, just like our perceptron is a "transfer function" ("sign" in this case) applied to a linear function.

<a id="the-continuous-perceptron-logistic-regression"></a>
### The continuous perceptron: Logistic Regression 

Let's see how the discrete perceptron model relates to the continuous logistic regression model in more detail. Let coefficient $i$ be $\beta_i$, feature $i$ be $x_i$, and the logistic (sigmoid) function be $\sigma(t)$. Then our familiar logistic regressor is:

### $$p(y|\mathbf{x}) = \frac{1}{1 + e^{-(\beta_0 + \beta_1x_1 + ... + \beta_nx_n)}} = \sigma(\beta_0 + \beta_1x_1 + ... + \beta_nx_n)$$
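
For concreteness, here is a small sketch that evaluates the formula above with NumPy. The coefficient and feature values are arbitrary numbers chosen for illustration, and `sigmoid` and `logistic_regressor` are hypothetical helper names.

```python
import numpy as np

def sigmoid(t):
    """sigma(t) = 1 / (1 + e^(-t)), which maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-t))

def logistic_regressor(beta_0, betas, x):
    """Return sigma(beta_0 + beta_1*x_1 + ... + beta_n*x_n)."""
    return sigmoid(beta_0 + np.dot(betas, x))

# Arbitrary illustrative values, not fitted to any data.
beta_0 = -1.0
betas = np.array([0.8, 0.4])
x = np.array([1.0, 1.0])

p = logistic_regressor(beta_0, betas, x)
print(round(p, 3))
```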

Now let's relate this to the perceptron. Define the coefficient $\beta_i$ to be weight $w_i$. Define the threshold as $-\beta_0$, so that the intercept becomes a subtracted threshold. Define the transfer function as our logistic function $\sigma(x)$. After making these replacements, we get a very similar formula to the perceptron that is continuous and easily differentiable (necessary for training via backpropagation):

### $$output_{logreg} = \sigma(w_1x_1 + ... + w_nx_n - threshold)$$

**A fundamental neuron used in modern-day deep learning is a logistic regressor!**

For clarity's sake, let's directly compare this to our perceptron transfer function -- it is nearly identical. (Verify the use of sgn() below is equivalent to the definition given earlier -- just subtract the threshold from both sides of the inequalities.)

### $$output_{perceptron} = sgn(w_1x_1 + ... + w_nx_n - threshold)$$
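
To make the comparison concrete, the sketch below feeds the same weighted sums through both transfer functions. It assumes the perceptron outputs 1 or 0 as defined earlier (a step at the threshold); the weights, threshold, and inputs are illustrative values only, and `step_neuron` and `sigmoid_neuron` are hypothetical names.

```python
import numpy as np

def step_neuron(x, w, threshold):
    """Discrete neuron: fire (1) only if the weighted sum exceeds the threshold."""
    return int(np.dot(w, x) - threshold > 0)

def sigmoid_neuron(x, w, threshold):
    """Continuous neuron: pass the same quantity through the sigmoid."""
    return 1.0 / (1.0 + np.exp(-(np.dot(w, x) - threshold)))

w = np.array([0.8, 0.4])
threshold = 1.0

for x in ([0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [3.0, 3.0]):
    x = np.array(x)
    print(x, step_neuron(x, w, threshold), round(sigmoid_neuron(x, w, threshold), 3))
```

The step output jumps from 0 to 1 as the weighted sum crosses the threshold, while the sigmoid output rises smoothly through 0.5 at the same point.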


<a id="image-similarity-with-deep-features"></a>
## Image similarity with deep features
---

Now that we've learned a bit about the building blocks of deep learning, let's use a pre-trained deep learning network to find the image most similar to ourselves in the CIFAR-10 image database.

We will first need to make sure GraphLab Create is installed. This is a platform that makes large-scale (distributed) learning possible with a very simple API. Fortunately for us, it is available free for a one-year period for students (bootcamps included).

Follow these directions to get it installed: https://turi.com/download/install-graphlab-create-command-line.html

Once that is completed, we can begin our code in our Jupyter notebook.

<a id="getting-started-with-graphlab-create"></a>
### Getting started with GraphLab Create

In [2]:
import graphlab
graphlab.canvas.set_target('ipynb')

# Load the training images, which come with precomputed deep features
gl_img = graphlab.SFrame('http://s3.amazonaws.com/dato-datasets/coursera/deep_learning/image_train_data')
gl_img

[INFO] graphlab.cython.cy_server: GraphLab Create v2.1 started. Logging: /tmp/graphlab_server_1493391769.log


| id | image | label | deep_features | image_array |
|---|---|---|---|---|
| 24 | Height: 32 Width: 32 | bird | [0.242871761322, 1.09545373917, 0.0, ... | [73.0, 77.0, 58.0, 71.0, 68.0, 50.0, 77.0, 69.0, ... |
| 33 | Height: 32 Width: 32 | cat | [0.525087952614, 0.0, 0.0, 0.0, 0.0, 0.0, ... | [7.0, 5.0, 8.0, 7.0, 5.0, 8.0, 5.0, 4.0, 6.0, 7.0, ... |
| 36 | Height: 32 Width: 32 | cat | [0.566015958786, 0.0, 0.0, 0.0, 0.0, 0.0, ... | [169.0, 122.0, 65.0, 131.0, 108.0, 75.0, ... |
| 70 | Height: 32 Width: 32 | dog | [1.12979578972, 0.0, 0.0, 0.778194487095, 0.0, ... | [154.0, 179.0, 152.0, 159.0, 183.0, 157.0, ... |
| 90 | Height: 32 Width: 32 | bird | [1.71786928177, 0.0, 0.0, 0.0, 0.0, 0.0, ... | [216.0, 195.0, 180.0, 201.0, 178.0, 160.0, ... |
| 97 | Height: 32 Width: 32 | automobile | [1.57818555832, 0.0, 0.0, 0.0, 0.0, 0.0, ... | [33.0, 44.0, 27.0, 29.0, 44.0, 31.0, 32.0, 45.0, ... |
| 107 | Height: 32 Width: 32 | dog | [0.0, 0.0, 0.220677852631, 0.0, ... | [97.0, 51.0, 31.0, 104.0, 58.0, 38.0, 107.0, 61.0, ... |
| 121 | Height: 32 Width: 32 | bird | [0.0, 0.23753464222, 0.0, 0.0, 0.0, 0.0, ... | [93.0, 96.0, 88.0, 102.0, 106.0, 97.0, 117.0, ... |
| 136 | Height: 32 Width: 32 | automobile | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 7.5737862587, 0.0, ... | [35.0, 59.0, 53.0, 36.0, 56.0, 56.0, 42.0, 62.0, ... |
| 138 | Height: 32 Width: 32 | bird | [0.658935725689, 0.0, 0.0, 0.0, 0.0, 0.0, ... | [205.0, 193.0, 195.0, 200.0, 187.0, 193.0, ... |


In this SFrame we have a number of columns describing the images, but we are primarily concerned with the `deep_features` column.

These are features that were learned by a deep neural network. The individual values are not human-interpretable, but they are the learned features that discriminate among the various classes of images in the CIFAR-10 data set. We are going to use something called transfer learning to allow us to use these deep features for our own purposes.

Because each layer of a network can be thought of as a progressively more holistic representation of the trained object, we can 'cut off the head' of the network (so to speak) and use it to extract lower-level features to apply on our own images.

<img src="http://i.imgur.com/p3vKsAo.png" width=700>
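
As a toy illustration of "cutting off the head", the sketch below builds a tiny two-layer network with random weights in NumPy and returns the hidden-layer activations instead of the class scores. Everything here (the layer sizes, the random weights, the ReLU activation, and the names `forward` and `deep_features`) is an assumption made for illustration; it is not the pre-trained network behind the `deep_features` column, and GraphLab's internals may differ.

```python
import numpy as np

rng = np.random.RandomState(0)

# A tiny, untrained stand-in network:
# 3072 inputs (32 * 32 * 3 pixel values) -> 16 hidden units -> 4 class scores.
W_hidden = rng.normal(scale=0.01, size=(16, 3072))
W_head = rng.normal(scale=0.1, size=(4, 16))

def deep_features(pixels):
    """Run only the layer below the head and return its activations."""
    return np.maximum(0.0, W_hidden.dot(pixels))   # ReLU clips negatives to 0

def forward(pixels):
    """Full network: hidden features followed by the classification head."""
    return W_head.dot(deep_features(pixels))

image = rng.uniform(0, 255, size=3072)   # a fake flattened 32x32 RGB image
features = deep_features(image)
print(features.shape, forward(image).shape)
```

The feature vector can then be reused for a different task, such as comparing images by distance, without ever calling the head.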

<a id="we-can-look-at-the-first--in-the-dataset"></a>
## We can look at the first 5 images in the dataset

In [3]:
graphlab.image_analysis.resize(gl_img['image'][:5], 128,128).show()

<a id="next-we-can-pull-in-our-own-image"></a>
## Next we can pull in our own image

In [4]:
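# Load our own image from a URL and wrap it in an SArray
img = graphlab.Image('https://pbs.twimg.com/profile_images/627283912075702272/gsmHwYrT.jpg')
ppsf = graphlab.SArray([img])
# Shrink to 32x32 to match the dataset images, then enlarge a copy for display
ppsf = graphlab.image_analysis.resize(ppsf, 32,32)
graphlab.image_analysis.resize(ppsf, 128,128).show()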

In [5]:
ppsf = graphlab.SFrame(ppsf).rename({'X1': 'image'})
ppsf

| image |
|---|
| Height: 32 Width: 32 |


<a id="next-we-need-to-extract-the-deep-features-from-our-image"></a>
## Next we need to extract the deep features from our image

In [6]:
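# Extract deep features from the 'image' column using a pre-trained model
extractor = graphlab.feature_engineering.DeepFeatureExtractor(features='image', model='auto')
extractor = extractor.fit(ppsf)
# The transformed SFrame holds the features in the 'deep_features.image' column
ppsf['deep_features'] = extractor.transform(ppsf)['deep_features.image']
ppsf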

| image | deep_features |
|---|---|
| Height: 32 Width: 32 | [1.57483696938, 0.0, 0.0, 0.0, 0.0, 0.0, ... |


<a id="then-we-append-it-to-our-sframe"></a>
## Then we append it to our SFrame

In [7]:
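ppsf['label'] = 'me'
# Check the largest existing id so we can pick an unused one for our image
gl_img['id'].max()

ppsf['id'] = 50000
# Keep only the columns both SFrames share, then append our row to the end
labels = ['id', 'image', 'label', 'deep_features']
part_train = gl_img[labels]
new_train = part_train.append(ppsf[labels])
new_train.tail()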

| id | image | label | deep_features |
|---|---|---|---|
| 49840 | Height: 32 Width: 32 | cat | [0.0344250798225, 0.0, 0.0, 0.0, 0.0, 0.0, ... |
| 49845 | Height: 32 Width: 32 | dog | [0.125242829323, 0.0, 0.0, 0.0, 0.0, 0.0, ... |
| 49881 | Height: 32 Width: 32 | bird | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 9.30746841431, 0.0, ... |
| 49896 | Height: 32 Width: 32 | cat | [0.0, 0.0, 0.592454433441, 0.0, ... |
| 49913 | Height: 32 Width: 32 | automobile | [1.2023819685, 0.342965483665, 0.0, ... |
| 49919 | Height: 32 Width: 32 | automobile | [0.0, 0.0, 0.0, 0.769036352634, 0.0, ... |
| 49927 | Height: 32 Width: 32 | dog | [0.558163285255, 0.0, 1.05110442638, 0.0, 0.0, ... |
| 49958 | Height: 32 Width: 32 | cat | [0.674960494041, 0.0, 0.0, 1.9640891552, ... |
| 49970 | Height: 32 Width: 32 | cat | [1.07501864433, 0.0, 0.0, 0.0, 0.0, 0.0, ... |
| 50000 | Height: 32 Width: 32 | me | [1.57483696938, 0.0, 0.0, 0.0, 0.0, 0.0, ... |


<a id="now-we-use-knn-to-find-our-spirit-animal"></a>
## Now we use knn to find our spirit animal

In [8]:
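# Build a nearest-neighbors model on the deep features, labeling each row by its id
knn_model = graphlab.nearest_neighbors.create(new_train, features=['deep_features'], label='id')

# Our own image is the last row of the combined SFrame
me_test = new_train[-1:]
def reveal_my_twin(x):
    # Look up the matched ids in the original image SFrame
    return gl_img.filter_by(x['reference_label'],'id')

spirit_animal = reveal_my_twin(knn_model.query(me_test))
graphlab.image_analysis.resize(spirit_animal['image'], 128,128).show()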
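
Conceptually, the nearest-neighbor query compares our feature vector against the other feature vectors and returns the closest ones. The NumPy sketch below does this by brute force with Euclidean distance on made-up vectors; the distance metric and search method GraphLab actually uses may differ, and `nearest_ids` is a hypothetical helper.

```python
import numpy as np

def nearest_ids(query, features, ids, k=3):
    """Return the ids of the k rows of `features` closest to `query` (Euclidean distance)."""
    distances = np.linalg.norm(features - query, axis=1)
    closest = np.argsort(distances)[:k]
    return [ids[i] for i in closest]

# Made-up 3-dimensional "deep features" for four reference images.
ids = [24, 33, 36, 70]
features = np.array([
    [0.2, 1.1, 0.0],
    [0.5, 0.0, 0.0],
    [0.6, 0.0, 0.1],
    [1.1, 0.0, 0.8],
])
me = np.array([0.58, 0.0, 0.05])   # made-up features for our own image

print(nearest_ids(me, features, ids, k=2))
```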

<a id="independent-exercise"></a>
### Independent Exercise

Use the deep features on your own images to find the most similar images.