## Neural network - 1 input, 0 hidden, 1 output 

Here's the very simple neural network that we'll be implementing:

![Network Diagram](1-0-1_nnet.PNG)

Let's start by stepping through the calculation for one training example:

In [94]:
import numpy as np
from math import e

In [95]:
# Create some sample data 
X = np.array([4,7,1,6,3])
Y = np.array([0,0,1,1,0])

In [96]:
# init weights and biases randomly 
w1 = np.random.random()
b1 = np.random.random()

In [97]:
# Feedforward step 
i = 0 
a0 = X[i]
y = Y[i]

# Define activation function 
def relu(x): return x if x > 0 else 0 

# Calculate output 
net = a0 * w1 + b1
a1 = relu(net) 

# Calculate cost 
cost = (a1 - y)**2
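
To make the numbers concrete, here is a minimal sketch of the same feedforward step with fixed parameters. The values `w1 = 0.5` and `b1 = 0.1` are assumptions chosen for illustration; the notebook itself draws them at random, so its numbers will differ.

```python
# Hypothetical fixed parameters (the notebook draws these at random)
w1, b1 = 0.5, 0.1
a0, y = 4, 0          # first training example: X[0], Y[0]

def relu(x):
    return x if x > 0 else 0

net = a0 * w1 + b1    # 4 * 0.5 + 0.1 = 2.1
a1 = relu(net)        # net is positive, so ReLU passes it through unchanged
cost = (a1 - y) ** 2  # (2.1 - 0)^2, roughly 4.41

print(net, a1, cost)
```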


The diagram on the left shows the network as a computational graph. The diagram on the right shows the repeated application of the chain rule that backpropagation performs. This assumes that we are using a ReLU activation function.
![](1-0-1_nnet_backprop_combined.PNG)
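
Written out, the chain rule for this network gives the derivatives below. This is a sketch using the notebook's variable names, with $C$ standing for `cost` and $[net > 0]$ equal to 1 when `net` is positive and 0 otherwise (the ReLU derivative).

$$
\begin{aligned}
net &= a_0 w_1 + b_1, \qquad a_1 = \mathrm{relu}(net), \qquad C = (a_1 - y)^2 \\
\frac{\partial C}{\partial a_1} &= 2\,(a_1 - y) \\
\frac{\partial C}{\partial net} &= \frac{\partial C}{\partial a_1} \cdot [net > 0] \\
\frac{\partial C}{\partial w_1} &= \frac{\partial C}{\partial net} \cdot a_0 \\
\frac{\partial C}{\partial b_1} &= \frac{\partial C}{\partial net} \cdot 1 \\
\frac{\partial C}{\partial a_0} &= \frac{\partial C}{\partial net} \cdot w_1
\end{aligned}
$$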

In [98]:
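# Backpropagation 
dCost_dCost = 1 
dCost_da1 = 2*(a1 - y) * dCost_dCost
dCost_dnet = 1 * dCost_da1 if net > 0 else 0   # ReLU derivative: 1 when net > 0, else 0
dCost_da0 = w1 * dCost_dnet   # not used for the update, since we can't directly influence a0
dCost_dw1 = a0 * dCost_dnet
dCost_db1 = 1 * dCost_dnet    # d(net)/d(b1) = 1

# Adjust the weights and biases 
step_size = 0.0001 # how far we move with gradient descent 
w1 += -1 * step_size * dCost_dw1 # the minus 1 is because we want to step downhill
b1 += -1 * step_size * dCost_db1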
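
One way to sanity-check hand-derived gradients like these is to compare them against central finite differences. The sketch below does that for a single example; the fixed values `w1 = 0.5`, `b1 = 0.1`, the helper `cost_fn`, and the step `eps` are assumptions introduced for illustration and are not part of the original notebook.

```python
def relu(x):
    return x if x > 0 else 0

def cost_fn(w1, b1, a0, y):
    """Squared-error cost of the 1-input, 1-output network for one example."""
    return (relu(a0 * w1 + b1) - y) ** 2

# Hypothetical fixed values so the check is reproducible
w1, b1 = 0.5, 0.1
a0, y = 4, 0

# Analytic gradients via the chain rule
net = a0 * w1 + b1
a1 = relu(net)
dCost_da1 = 2 * (a1 - y)
dCost_dnet = dCost_da1 if net > 0 else 0
dCost_dw1 = a0 * dCost_dnet   # d(net)/d(w1) = a0
dCost_db1 = 1 * dCost_dnet    # d(net)/d(b1) = 1

# Central finite differences: nudge one parameter at a time
eps = 1e-6
num_dw1 = (cost_fn(w1 + eps, b1, a0, y) - cost_fn(w1 - eps, b1, a0, y)) / (2 * eps)
num_db1 = (cost_fn(w1, b1 + eps, a0, y) - cost_fn(w1, b1 - eps, a0, y)) / (2 * eps)

print("dCost_dw1:", dCost_dw1, "numeric:", num_dw1)
print("dCost_db1:", dCost_db1, "numeric:", num_db1)
```

If the analytic and numeric values disagree by more than a small tolerance, the derivative for that parameter is probably wrong.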

That was just one example. Let's put it all together in a loop now, for all our training data, and keep running until the cost stabilises.

In [123]:
# Define activation functions 
def relu(x): return max(x,0)
def sigmoid(x): return (e**x / (e**x + 1))

# Create some sample data 
X = np.array([1,2,3,4,5])
Y = np.array([0,0,1,1,1])

# init weights and biases randomly 
w1 = np.random.random()
b1 = np.random.random()

# Training settings 
n_epochs = 5000
activation_fun = 'relu'
step_size = 0.0001 # how far we move with gradient descent 
for n in range(n_epochs):
    c = 0 
    for i in range(len(X)):
        # determine input and output 
        a0 = X[i]
        y = Y[i]

        # Calculate output 
        net = a0 * w1 + b1
        if   activation_fun == 'relu':      a1 = relu(net) 
        elif activation_fun == 'sigmoid':   a1 = sigmoid(net) 
            
        # Calculate cost 
        cost = (a1 - y)**2

        # Backpropagation 
        dCost_dCost = 1 
        dCost_da1 = 2*(a1 - y) * dCost_dCost
        if   activation_fun == 'relu':      dCost_dnet = 1 * dCost_da1 if net > 0 else 0 
        elif activation_fun == 'sigmoid':   dCost_dnet = sigmoid(net) * ( 1 - sigmoid(net)) * dCost_da1
        dCost_da0 = w1 * dCost_dnet   # this one is pretty useless, since we can't directly influence a0
        dCost_dw1 = a0 * dCost_dnet
        dCost_db1 = 1 * dCost_dnet

        # Adjust the weights and biases 
        w1 += -1 * step_size * dCost_dw1 # the minus 1 is because we want to step downhill
        b1 += -1 * step_size * dCost_db1

        # Print our cost function and change in weights 
        #print("a0:", a0,"w1:", w1, "b1:", b1,   "net:", net, "a1:", a1, "y:",y)
        #print(dCost_da1, net, dCost_dnet, dCost_da0, dCost_dw1, dCost_db1 )
        #print ("cost:", cost, "w1Change:", dCost_dw1, "b1Change:", dCost_db1 )
        #print(cost, dCost_dw1, dCost_db1)
        c += cost
    if n % 100 == 0 : print(round(c,2))

38.38
4.66
1.48
1.16
1.1
1.07
1.05
1.02
1.0
0.97
0.95
0.93
0.91
0.89
0.87
0.85
0.83
0.81
0.8
0.78
0.76
0.75
0.73
0.72
0.71
0.69
0.68
0.67
0.65
0.64
0.63
0.62
0.61
0.6
0.59
0.58
0.57
0.56
0.55
0.54
0.54
0.53
0.52
0.51
0.51
0.5
0.49
0.49
0.48
0.47
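
The sigmoid branch of the backpropagation step relies on the identity sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)). As a quick, hedged check, the sketch below compares that formula with a central finite difference at a few arbitrarily chosen points; `sigmoid_prime` and `eps` are names introduced here for illustration.

```python
from math import e

def sigmoid(x):
    return e**x / (e**x + 1)

def sigmoid_prime(x):
    # Closed-form derivative used in the backpropagation step
    s = sigmoid(x)
    return s * (1 - s)

eps = 1e-6
for x in [-2.0, 0.0, 0.5, 3.0]:
    # Central finite-difference estimate of the slope at x
    numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
    print(x, round(sigmoid_prime(x), 6), round(numeric, 6))
```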


In [100]:
# Predict with the learned w1 and b1; note that this applies the sigmoid activation 
preds = np.array([sigmoid(x * w1 + b1) for x in X ]).round(3)

In [101]:
preds

array([ 0.   ,  0.069,  0.946,  1.   ,  1.   ])

In [102]:
Y

array([0, 0, 1, 1, 1])
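
To compare the predictions with the targets directly, one option is to threshold them into class labels. The sketch below copies the two arrays from the outputs above; the 0.5 cut-off is an assumed, conventional choice rather than something the notebook specifies.

```python
import numpy as np

# Values copied from the outputs above
preds = np.array([0.0, 0.069, 0.946, 1.0, 1.0])
Y = np.array([0, 0, 1, 1, 1])

# Threshold at 0.5 (an assumed cut-off) to turn outputs into class labels
labels = (preds > 0.5).astype(int)
accuracy = (labels == Y).mean()

print(labels, accuracy)
```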