## Topics

## 1. Many Choices for the Activation Function
## 2. Simple Multi-layer Neural Network

<br>
<br>


## 1. Comparison of the different sigmoids:

## (https://en.wikipedia.org/wiki/Activation_function, and there are more)

Logistic (a.k.a Soft step)	Activation logistic:	$f(x)=\frac{1}{1+e^{-x}}$, 	$\,f'(x)=f(x)(1-f(x))$,    (0,1)	

TanH	Activation tanh:	$f(x)=\tanh(x)=\frac{2}{1+e^{-2x}}-1$,	$\,f'(x)=1-f(x)^2$,   (-1,1)

In [1]:
%matplotlib inline
# All imports
from random import choice
import numpy as np
import matplotlib.pyplot as plt

np.set_printoptions(formatter={'float': '{:.4f}'.format})


## 2. Simple Multi-layer Neural Network

(see lecture slides for architecture)

In [2]:
def sigmoid(x):
    '''The logistic function as the sigmoid'''

    return 1.0/(1.0 + np.exp(-x))

def sigmoid_pr(z):
    '''derivative of the logistic function'''
    return z * (1 - z)

## The inner (dot) product and the outer product of two vectors

In [3]:
a = np.atleast_2d(np.array([1, 2]))
b = np.atleast_2d(np.array([3, 4]))

inner_prod = np.dot(a, b.T)
outer_prod = np.dot(a.T, b)
print('inner product:\n', inner_prod)
print('outer product:\n', outer_prod)

inner product:
 [[11]]
outer product:
 [[3 4]
 [6 8]]


In [4]:
print('a:\n', a)
print()
print('a.T:\n', a.T)
print()
print('b:\n', b)
print()
print('b.T transposed:\n', b.T)

a:
 [[1 2]]

a.T:
 [[1]
 [2]]

b:
 [[3 4]]

b.T transposed:
 [[3]
 [4]]


## Breakout Excercise -- Write a function simple_nn(X, w1, w2) that performs the forward propagation

   - ## X is the input array
   - ## w1 are the weights in the first layer (it should be a 2x2 array...and think dot product)
   - ## w2 are the weights for the second layer
   - ## It should return z, b, A, a
   - ## Test it with the input values and weights given in lecture.  And then print out the following:
   
 
            w1:
             [[0.1000 0.4000]
             [0.8000 0.6000]]
            w2:
             [0.3000 0.9000]
            a: [0.7550 0.6800]
            A: [0.6803 0.6637]
            b: 0.801444986674
            output (z) of nn: 0.690283492908 

In [5]:
def simple_nn(X, w1new, w2new):
    #return z, b, A, a
#     a0=(w1[0][0] * X[0]) + ( w1[1][0] * X[1] )
#     a1=(w1[0][1] * X[0]) + ( w1[1][1] * X[1] )
#     a=[a0, a1]
    a = np.dot(X, w1new)
    
#     A0=sigmoid(a0) 
#     A1=sigmoid(a1)
#     A=[A0, A1]
    A = sigmoid(a)
    
#     b=(A0*w2[0])+(A1*w2[1])
    b = np.dot(A, w2new)
    
    z=sigmoid(b)
    
    return z, b, A, a
    
X = np.array([0.35, 0.9])   
w1 = np.array([[0.1000, 0.4000], [0.8000, 0.6000]])
print('w1:\n', w1)
w2 = np.array([0.3000, 0.9000])
print('w2:\n', w2)

z, b, A, a = simple_nn(X, w1, w2)
print('a: ', a)
print('A: ', A)
print('b: ', b)
print('output (z) of nn: ', z)


w1:
 [[0.1000 0.4000]
 [0.8000 0.6000]]
w2:
 [0.3000 0.9000]
a:  [0.7550 0.6800]
A:  [0.6803 0.6637]
b:  0.8014449866735119
output (z) of nn:  0.6902834929076443


In [6]:
#Transpose

M = np.random.rand(2, 2)
print(M)
print('Transposed:\n', M.T)

[[0.9981 0.8353]
 [0.7423 0.9987]]
Transposed:
 [[0.9981 0.7423]
 [0.8353 0.9987]]


In [7]:
#np.atleast_2d()
aa = np.array([5, 8])
AA = np.atleast_2d(aa)
print('Vector:\n', aa, aa.shape)
print('Matrix:\n', AA, AA.shape)
print('Matrix transposed:\n', AA.T, AA.T.shape)
print("For a 1-D objecte, transpose doesn't change anything:", aa.T)

Vector:
 [5 8] (2,)
Matrix:
 [[5 8]] (1, 2)
Matrix transposed:
 [[5]
 [8]] (2, 1)
For a 1-D objecte, transpose doesn't change anything: [5 8]


In [8]:
#the inner (dot) product and the outer product of two vectors
AA = np.atleast_2d(np.array([1, 2]))
BB = np.atleast_2d(np.array([3, 4]))

print(AA)
print(BB.T)

inner_prod = np.dot(AA, BB.T)
outer_prod = np.dot(AA.T, BB)
print('inner product: \n', inner_prod)

print(AA.T)
print(BB)
print('outer product: \n', outer_prod)

[[1 2]]
[[3]
 [4]]
inner product: 
 [[11]]
[[1]
 [2]]
[[3 4]]
outer product: 
 [[3 4]
 [6 8]]


## Breakout Excercise -- Write a function training_nn(X, y, z, w1, w2) that performs the backward propagation

   - ## y is the target
   - ## Define delta2 as we talked about in lecture
   - ## Also for convenience and clarity, define delta_w2 = delta2 * A.  This is essentially how w2 should be adjusted (with alpha = 1).
   - ## Then define delta1 as we talked about in class -- think about the best way to do it 
   - ## Define a delta_w1.   This is essentially how w1 should be adjusted (with alpha = 1). (*Hint*: Think matrix multiplication.)
   - ## Run it and you should get this:
 

            w1:
             [[0.1000 0.4000]
             [0.8000 0.6000]]
            w2:
             [0.3000 0.9000]
            output (z) of nn: 0.690283492908
            
            delta2: -0.04068112511233903 
            delta_w2: [-0.0277 -0.0270] 
            w2: [0.2723 0.8730]  
            sigmoid_pr(A): [0.6803 0.6637] 
            delta1: [[-0.0024 -0.0079]] 
            delta_w1: 
            [[-0.0008 -0.0028] 
            [-0.0022 -0.0071]] 
            w1: 
            [[0.0992 0.3972] 
            [0.7978 0.5929]] 

In [52]:
def training_nn(X, y, z, w1, w2):
    alpha=1
    error = y-z
    delta2 = error * sigmoid_pr(z)
    delta_w2 = np.dot(delta2, A)
#     print('delta2: ', delta2)
#     print('delta_w2: ', delta_w2)

    delta1 = np.atleast_2d(delta2 * sigmoid_pr(A) *w2)
    X = np.atleast_2d(X)
    delta_w1 = np.dot(X.T, delta1)
    
    w2 += (alpha * delta_w2)
#     print('w2: ', w2)
#     print('\n')
    
#     print('sigmoid_pr(A): ', (A))
#     delta1_0 = delta2 * w2[0] * A[0]*(1-A[0])
#     delta1_1 = delta2 * w2[1] * A[1]*(1-A[1])
#     delta1 = np.atleast_2d(np.array([delta1_0, delta1_1]))
    
#     print('delta1: ', delta1)
#     print('delta_w1: \n', delta_w1)
    w1 += (alpha * delta_w1)
    #print('w1: \n', w1)
    
    return w2, w1

y=0.5
X = np.array([0.35, 0.9]) 

w1 = np.array([[0.1000, 0.4000], [0.8000, 0.6000]])
print('w1:\n', w1)
w2 = np.array([0.3000, 0.9000])
print('w2:\n', w2)


w2new, w1new = training_nn(X, y, z, w1, w2)


w1:
 [[0.1000 0.4000]
 [0.8000 0.6000]]
w2:
 [0.3000 0.9000]


In [53]:
#Training neural net
w1 = np.array([[0.1000, 0.4000], [0.8000, 0.6000]])
print('w1_start:\n', w1)
w2 = np.array([0.3000, 0.9000])
print('w2_start:\n', w2)

training_steps = 100
for i in range(training_steps):
    #forward propagation
    z, b, A, a = simple_nn(X, w1new, w2new)
    #backward propagation
    w2new, w1new = training_nn(X, y, z, w1, w2)

    #printing z value, remember y=0.5
    print('z: ', z)

w1_start:
 [[0.1000 0.4000]
 [0.8000 0.6000]]
w2_start:
 [0.3000 0.9000]
z:  0.6902436813789282
z:  0.6820324115901653
z:  0.6739604747521266
z:  0.666087216261871
z:  0.6584281605999255
z:  0.6509960787563582
z:  0.6438010722624493
z:  0.6368507070599206
z:  0.63015018513052
z:  0.6237025423825647
z:  0.6175088623849497
z:  0.6115684969534533
z:  0.6058792861557863
z:  0.6004377718732407
z:  0.5952394005383426
z:  0.5902787119952019
z:  0.5855495125676786
z:  0.5810450313587252
z:  0.5767580595476445
z:  0.5726810730166964
z:  0.5688063390467185
z:  0.5651260080978882
z:  0.5616321918610335
z:  0.5583170288499256
z:  0.5551727388260017
z:  0.552191667321298
z:  0.5493663214672256
z:  0.5466893982576013
z:  0.5441538062829072
z:  0.5417526818757601
z:  0.5394794005099176
z:  0.5373275842002292
z:  0.535291105561065
z:  0.5333640890972847
z:  0.5315409102255151
z:  0.5298161924546301
z:  0.5281848030928074
z:  0.5266418477940651
z:  0.5251826642093262
z:  0.5238028149652744
z:  0.522498

# End of Week 6-2