## linear boundaries ##

### boundary: a line ###

$w_1x_1 + w_2x_2 + b = 0$  

$Wx + b = 0$  

$W = (w_1, w_2)$  

$x = (x_1, x_2)$  

$y =$ label: 0 or 1  

<img src="p1.png" width=500>

### prediction ###

$$
\hat{y} = \left\{ \begin{array}{rl}
 1&\mbox{if $Wx + b \geq 0$} \\
  0&\mbox{if $Wx + b < 0$}
       \end{array} \right.
$$

### goal ###
to model $\hat{y}$ as close to $y$ as possible

### boundary: a plane ###

$w_1x_1 + w_2x_2 + w_3x_3 + b = 0$  

$Wx + b = 0$  

$W = (w_1, w_2, w_3)$  

$x = (x_1, x_2, x_3)$  

$y =$ label: 0 or 1  

<img src="p2.png" width=500>

### prediction ###

$$
\hat{y} = \left\{ \begin{array}{rl}
 1&\mbox{if $Wx + b \geq 0$} \\
  0&\mbox{if $Wx + b < 0$}
       \end{array} \right.
$$

### boundary: an (n-1)-dimensional hyperplane ###

$w_1x_1 + w_2x_2 + ... + w_nx_n + b = 0$  

$Wx + b = 0$  

$W = (w_1, w_2, ..., w_n)$  

$x = (x_1, x_2, ..., x_n)$  

$y =$ label: 0 or 1  

<img src="p3.png" width=500>

### prediction ###

$$
\hat{y} = \left\{ \begin{array}{rl}
 1&\mbox{if $Wx + b \geq 0$} \\
  0&\mbox{if $Wx + b < 0$}
       \end{array} \right.
$$
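
The prediction rule can be sketched in NumPy for any number of dimensions. This is a minimal illustration, not part of the original lesson: the function name `predict` and the sample weights are assumptions, and a score of exactly 0 is assigned to class 1 here to match the step function used in the code later in these notes.

```python
import numpy as np

def predict(W, x, b):
    """Return 1 if x lies on the non-negative side of the boundary Wx + b = 0, else 0."""
    score = np.dot(W, x) + b
    return int(score >= 0)

# Hypothetical 2D boundary: 3*x1 + 4*x2 - 10 = 0
W = np.array([3.0, 4.0])
b = -10.0

print(predict(W, np.array([1.0, 1.0]), b))  # score = -3, prints 0
print(predict(W, np.array([2.0, 2.0]), b))  # score = 4, prints 1
```

The same function works unchanged for a plane or a higher-dimensional boundary, because `np.dot` handles weight and input vectors of any matching length.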

---

<img src="p4.png" width=500>

<img src="p5.png" width=500>

---

<img src="p6.png" width=500>  

<img src="p7.png" width=500>  

<img src="p8.png" width=500>  

<img src="p9.png" width=500>  

<img src="q1.png" width=500>

---

### Perceptrons as Logical Operators ###  

In this lesson, we'll see one of the many great applications of perceptrons: as logical operators! You'll have the chance to create the perceptrons for the most common of these, the AND, OR, and NOT operators. And then, we'll see what to do about the elusive XOR operator. Let's dive in!

In [8]:
from IPython.display import HTML
HTML("""<div align="middle"><video width="100%" controls><source src="v1.mp4" type="video/mp4"></video></div>""")

### AND logical operator perceptron ###

In [8]:
import pandas as pd

# TODO: Set weight1, weight2, and bias
weight1 = 1.0
weight2 = 1.0
bias = -1.1

# Inputs and outputs
test_inputs = [(1, 1), (1, 0), (0, 1), (0, 0)]
correct_outputs = [True, False, False, False]
outputs = []

# Generate and check output
for test_input, correct_output in zip(test_inputs, correct_outputs):
    linear_combination = weight1 * test_input[0] + weight2 * test_input[1] + bias
    output = int(linear_combination >= 0)
    is_correct_string = 'Yes' if output == correct_output else 'No'
    outputs.append([test_input[0], test_input[1], linear_combination, output, is_correct_string])

# Print output
num_wrong = len([output[4] for output in outputs if output[4] == 'No'])
output_frame = pd.DataFrame(outputs, columns=['Input 1', '  Input 2', '  Linear Combination', '  Activation Output', '  Is Correct'])
if not num_wrong:
    print('Nice!  You got it all correct.\n')
else:
    print('You got {} wrong.  Keep trying!\n'.format(num_wrong))
print(output_frame.to_string(index=False))

Nice!  You got it all correct.

Input 1    Input 2    Linear Combination    Activation Output   Is Correct
      1          1                   0.9                    1          Yes
      1          0                  -0.1                    0          Yes
      0          1                  -0.1                    0          Yes
      0          0                  -1.1                    0          Yes


<img src="q2.png" width=600>

### OR perceptron ###

In [9]:
import pandas as pd

# TODO: Set weight1, weight2, and bias
weight1 = 1.0
weight2 = 1.0
bias = -0.9

# Inputs and outputs
test_inputs = [(1, 1), (1, 0), (0, 1), (0, 0)]
correct_outputs = [True, True, True, False]
outputs = []

# Generate and check output
for test_input, correct_output in zip(test_inputs, correct_outputs):
    linear_combination = weight1 * test_input[0] + weight2 * test_input[1] + bias
    output = int(linear_combination >= 0)
    is_correct_string = 'Yes' if output == correct_output else 'No'
    outputs.append([test_input[0], test_input[1], linear_combination, output, is_correct_string])

# Print output
num_wrong = len([output[4] for output in outputs if output[4] == 'No'])
output_frame = pd.DataFrame(outputs, columns=['Input 1', '  Input 2', '  Linear Combination', '  Activation Output', '  Is Correct'])
if not num_wrong:
    print('Nice!  You got it all correct.\n')
else:
    print('You got {} wrong.  Keep trying!\n'.format(num_wrong))
print(output_frame.to_string(index=False))


Nice!  You got it all correct.

Input 1    Input 2    Linear Combination    Activation Output   Is Correct
      1          1                   1.1                    1          Yes
      1          0                   0.1                    1          Yes
      0          1                   0.1                    1          Yes
      0          0                  -0.9                    0          Yes


### NOT perceptron ###

In [11]:
import pandas as pd

# TODO: Set weight1, weight2, and bias.
# NOT here acts on the second input only; the expected outputs ignore the first input.
weight1 = -1.1
weight2 = -1.2
bias = 1.1

# Inputs and outputs
test_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
correct_outputs = [True, False, True, False]
outputs = []

# Generate and check output
for test_input, correct_output in zip(test_inputs, correct_outputs):
    linear_combination = weight1 * test_input[0] + weight2 * test_input[1] + bias
    output = int(linear_combination >= 0)
    is_correct_string = 'Yes' if output == correct_output else 'No'
    outputs.append([test_input[0], test_input[1], linear_combination, output, is_correct_string])

# Print output
num_wrong = len([output[4] for output in outputs if output[4] == 'No'])
output_frame = pd.DataFrame(outputs, columns=['Input 1', '  Input 2', '  Linear Combination', '  Activation Output', '  Is Correct'])
if not num_wrong:
    print('Nice!  You got it all correct.\n')
else:
    print('You got {} wrong.  Keep trying!\n'.format(num_wrong))
print(output_frame.to_string(index=False))

Nice!  You got it all correct.

Input 1    Input 2    Linear Combination    Activation Output   Is Correct
      0          0                   1.1                    1          Yes
      0          1                  -0.1                    0          Yes
      1          0                   0.0                    1          Yes
      1          1                  -1.2                    0          Yes
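
Note that the input (1, 0) above lands exactly on the boundary (linear combination 0.0), so that solution only works because the step function treats 0 as positive. A possibly more robust alternative, sketched below with hypothetical weights that are not from the original quiz, gives the first input zero weight and keeps every point half a unit away from the boundary.

```python
# Hypothetical weights: ignore the first input, negate the second.
weight1, weight2, bias = 0.0, -1.0, 0.5

test_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
not_outputs = [int(weight1 * x1 + weight2 * x2 + bias >= 0) for x1, x2 in test_inputs]

print(not_outputs)  # [1, 0, 1, 0]
```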


### XOR perceptron ###

In [7]:
from IPython.display import HTML
HTML("""<div align="middle"><video width="100%" controls><source src="v2.mp4" type="video/mp4"></video></div>""")

Build an XOR Multi-Layer Perceptron  

Now, let's build a multi-layer perceptron from the AND, NOT, and OR perceptrons to create XOR logic!  

The neural network below contains 3 perceptrons, A, B, and C. The last one (AND) has been given to you. The input to the neural network is from the first node. The output comes out of the last node.  

The multi-layer perceptron below calculates XOR. Each perceptron is a logic operation of AND, OR, and NOT. However, the perceptrons A, B, and C don't indicate their operation. In the following quiz, set the correct operations for the perceptrons to calculate XOR.

<img src="q3.png" width=600>

And if we introduce the NAND operator as the combination of AND and NOT, then we get the following two-layer perceptron that will model XOR. That's our first neural network!

<img src="q4.png" width=600>
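
As a rough sketch of this two-layer network, the code below assumes the wiring XOR = AND(NAND(x1, x2), OR(x1, x2)). The helper `perceptron` and the specific weights are illustrative assumptions, chosen so that no input falls on a boundary; they are not the values used in the quizzes above.

```python
def perceptron(w1, w2, bias):
    """Build a two-input perceptron with a step activation (hypothetical helper)."""
    def unit(x1, x2):
        return int(w1 * x1 + w2 * x2 + bias >= 0)
    return unit

# First layer: NAND and OR. Second layer: AND.
nand_gate = perceptron(-1.0, -1.0, 1.5)
or_gate = perceptron(1.0, 1.0, -0.5)
and_gate = perceptron(1.0, 1.0, -1.5)

def xor(x1, x2):
    # The outputs of the first layer become the inputs of the second layer.
    return and_gate(nand_gate(x1, x2), or_gate(x1, x2))

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, xor(x1, x2))
```

No single perceptron can produce this truth table, because the XOR outputs are not linearly separable; stacking two layers is what makes it possible.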

### Perceptron Trick ###  

In the last section you used your logic and your mathematical knowledge to create perceptrons for some of the most common logical operators. In real life, though, we can't be building these perceptrons ourselves. The idea is that we give them the result, and they build themselves. For this, here's a pretty neat trick that will help us.

In [14]:
from IPython.display import HTML
HTML("""<div align="middle"><video width="100%" controls><source src="v3.mp4" type="video/mp4"></video></div>""")

Answer: CLOSER!

### Time for some math! ###  

Now that we've learned that misclassified points want the line to move closer to them, let's do some math. The following video shows a mathematical trick that modifies the equation of the line so that it comes closer to a particular point.

In [15]:
from IPython.display import HTML
HTML("""<div align="middle"><video width="100%" controls><source src="v4.mp4" type="video/mp4"></video></div>""")

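For the second example, where the line is described by $3x_1 + 4x_2 - 10 = 0$, if the learning rate is set to 0.1, how many times would you have to apply the perceptron trick to move the line to a position where the blue point, at $(1, 1)$, is correctly classified?

The score at $(1, 1)$ starts at $3 + 4 - 10 = -3$. Each application adds 0.1 to $w_1$, $w_2$, and $b$, so the score rises by 0.3 per step:

1) 3.1 + 4.1 - 9.9 = -2.7  
2) 3.2 + 4.2 - 9.8 = -2.4  
3) 3.3 + 4.3 - 9.7 = -2.1  
4) 3.4 + 4.4 - 9.6 = -1.8  
5) 3.5 + 4.5 - 9.5 = -1.5  
6) 3.6 + 4.6 - 9.4 = -1.2  
7) 3.7 + 4.7 - 9.3 = -0.9  
8) 3.8 + 4.8 - 9.2 = -0.6  
9) 3.9 + 4.9 - 9.1 = -0.3  
10) 4.0 + 5.0 - 9.0 = 0  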

### answer: 10 !! ###
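
The count can be checked with a short loop. This is a sketch under two assumptions: the blue point has a positive label, and a score of 0 counts as correctly classified, as in the step function used elsewhere in these notes. The function name `count_perceptron_steps` is hypothetical.

```python
from fractions import Fraction

def count_perceptron_steps(w1, w2, b, point, learn_rate):
    """Count how many perceptron-trick updates a positive-label point needs."""
    p, q = point
    steps = 0
    # The point is misclassified while its score is negative.
    while w1 * p + w2 * q + b < 0:
        w1 += learn_rate * p
        w2 += learn_rate * q
        b += learn_rate
        steps += 1
    return steps

# Fractions keep the arithmetic exact, so the score reaches exactly 0
# rather than a tiny negative value caused by floating-point rounding.
steps = count_perceptron_steps(
    Fraction(3), Fraction(4), Fraction(-10), (1, 1), Fraction(1, 10)
)
print(steps)  # 10
```

With plain floats the repeated additions of 0.1 may leave the final score slightly off from 0, which is why exact fractions are used for this check.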

In [16]:
from IPython.display import HTML
HTML("""<div align="middle"><video width="100%" controls><source src="v5.mp4" type="video/mp4"></video></div>""")

### Coding the Perceptron Algorithm ###  

Time to code! In this quiz, you'll have the chance to implement the perceptron algorithm to separate the following data (given in the file data.csv).

Recall that the perceptron step works as follows. For a point with coordinates $(p,q)$, label $y$, and prediction given by the equation  

$\hat{y} = step(w_1x_1 + w_2x_2 + b)$:

If the point is correctly classified, do nothing.  

If the point is classified positive, but it has a negative label, __subtract__ $\alpha p$, $\alpha q$, and $\alpha$ from $w_1$, $w_2$, and $b$, respectively.  

If the point is classified negative, but it has a positive label, __add__ $\alpha p$, $\alpha q$, and $\alpha$ to $w_1$, $w_2$, and $b$, respectively.  

Then click on test run to graph the solution that the perceptron algorithm gives you. It'll actually draw a set of dotted lines that show how the algorithm approaches the best solution, given by the black solid line.


Feel free to play with the parameters of the algorithm (number of epochs, learning rate, and even the randomizing of the initial parameters) to see how your initial conditions can affect the solution!

In [19]:
import numpy as np
# Setting the random seed, feel free to change it and see different solutions.
np.random.seed(42)

def stepFunction(t):
    if t >= 0:
        return 1
    return 0

def prediction(X, W, b):
    return stepFunction((np.matmul(X,W)+b)[0])

# The code below implements the perceptron trick.
# The function receives as inputs the data X, the labels y, the weights W (as an array), and the bias b,
# updates W and b according to the perceptron algorithm, and returns W and b.
def perceptronStep(X, y, W, b, learn_rate = 0.01):
    for i in range(len(X)):
        y_hat = prediction(X[i],W,b)
        if y[i]-y_hat == 1:
            W[0] += X[i][0]*learn_rate
            W[1] += X[i][1]*learn_rate
            b += learn_rate
        elif y[i]-y_hat == -1:
            W[0] -= X[i][0]*learn_rate
            W[1] -= X[i][1]*learn_rate
            b -= learn_rate
    return W, b

    
# This function runs the perceptron algorithm repeatedly on the dataset,
# and returns a few of the boundary lines obtained in the iterations,
# for plotting purposes.
# Feel free to play with the learning rate and the num_epochs,
# and see your results plotted below.
def trainPerceptronAlgorithm(X, y, learn_rate = 0.01, num_epochs = 25):
    x_min, x_max = min(X.T[0]), max(X.T[0])
    y_min, y_max = min(X.T[1]), max(X.T[1])
    W = np.array(np.random.rand(2,1))
    b = np.random.rand(1)[0] + x_max
    # These are the solution lines that get plotted below.
    boundary_lines = []
    for i in range(num_epochs):
        # In each epoch, we apply the perceptron step.
        W, b = perceptronStep(X, y, W, b, learn_rate)
        boundary_lines.append((-W[0]/W[1], -b/W[1]))
    return boundary_lines


<img src="q6.png" width=600>

### non-linear regions ###

In [6]:
from IPython.display import HTML
HTML("""<div align="middle"><video width="100%" controls><source src="v6.mp4" type="video/mp4"></video></div>""")

### log-loss error function :: Gradient Descent ###

In [5]:
from IPython.display import HTML
HTML("""<div align="middle"><video width="100%" controls><source src="v7.mp4" type="video/mp4"></video></div>""")

### Discrete vs Continuous Predictions ###  

In the last few videos, we learned that continuous error functions are better than discrete error functions, when it comes to optimizing. For this, we need to switch from discrete to continuous predictions. The next two videos will guide us in doing that.

In [4]:
from IPython.display import HTML
HTML("""<div align="middle"><video width="100%" controls><source src="v8.mp4" type="video/mp4"></video></div>""")

In [3]:
from IPython.display import HTML
HTML("""<div align="middle"><video width="100%" controls><source src="v9.mp4" type="video/mp4"></video></div>""")

### Sigmoid Function ###

The sigmoid function is defined as: 

# $\sigma(x) = \frac{1}{1+e^{-x}}$ #
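
A direct NumPy translation of this definition might look like the sketch below. The function name `sigmoid` is an assumption; the notes themselves do not give an implementation.

```python
import numpy as np

def sigmoid(x):
    """Map any real score (or array of scores) into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(0.0))                          # 0.5: a point on the boundary
print(sigmoid(np.array([-5.0, 0.0, 5.0])))   # small, 0.5, and close to 1
```

Unlike the step function, the output changes smoothly with the score, so large positive scores give predictions near 1 and large negative scores give predictions near 0.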

### Multi-Class Classification and Softmax ###

The Softmax Function  

In the next video, we'll learn about the softmax function, which is the equivalent of the sigmoid activation function, but when the problem has 3 or more classes.

In [2]:
from IPython.display import HTML
HTML("""<div align="middle"><video width="100%" controls><source src="v10.mp4" type="video/mp4"></video></div>""")

In [1]:
from IPython.display import HTML
HTML("""<div align="middle"><video width="100%" controls><source src="v11.mp4" type="video/mp4"></video></div>""")

### Softmax Function ###

(exponentiates the input values and normalizes them so they all add up to 1)

Given the linear function scores $Z_1, ..., Z_n$, the softmax function is defined as:

# $P($class $i) = \frac{e^{Z_i}}{e^{Z_1}+ ... + e^{Z_n}}$ #

In [38]:
# Softmax in Python

import numpy as np

def softmax(L):
    expL = np.exp(L)
    sumExpL = sum(expL)
    result = []
    for i in expL:
        result.append(i*1.0/sumExpL)
    return result
    
    # Note: The function np.divide can also be used here, as follows:
    # def softmax(L):
    #     expL = np.exp(L)
    #     return np.divide(expL, expL.sum())
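
For very large scores, `np.exp` can overflow. A common variation, sketched below under the hypothetical name `softmax_stable`, subtracts the largest score before exponentiating. This leaves the result unchanged, because the common factor cancels in the ratio, but keeps every exponent at or below 0.

```python
import numpy as np

def softmax_stable(scores):
    """Softmax that shifts the scores by their maximum to avoid overflow."""
    scores = np.asarray(scores, dtype=float)
    shifted = scores - np.max(scores)   # largest entry becomes 0
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum()

print(softmax_stable([2.0, 1.0, 0.0]))
print(softmax_stable([1000.0, 1000.0]))  # [0.5 0.5], no overflow
```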
