# Basics of Neural Networks

We start with a manual implementation of a simple neural network, based on an example by Andrew Trask (University of Oxford), iamtrask.github.io.

In [23]:
import numpy as np

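def nonlin(x, deriv=False):
    # Sigmoid activation. With deriv=True, x is expected to already be a
    # sigmoid output, so the derivative is simply x*(1-x).
    if deriv:
        return x * (1 - x)

    return 1 / (1 + np.exp(-x))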

First we define our activation function and its derivative.
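
Note that `nonlin(x, deriv=True)` returns `x*(1-x)`, so it expects the sigmoid *output* rather than the raw input. A minimal sketch that checks this against a central finite difference; the sample points `z` and the step size `h` are illustrative choices, not part of the original example:

```python
import numpy as np

def nonlin(x, deriv=False):
    # Same definition as above; with deriv=True, x must already be a sigmoid output.
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

z = np.linspace(-3, 3, 7)   # illustrative sample points: -3, -2, ..., 3
h = 1e-5                    # illustrative finite-difference step

# Analytic derivative: apply deriv=True to the sigmoid output, not to z itself.
analytic = nonlin(nonlin(z), deriv=True)

# Numerical derivative of the sigmoid with respect to z (central difference).
numeric = (nonlin(z + h) - nonlin(z - h)) / (2 * h)

max_diff = np.max(np.abs(analytic - numeric))
print(max_diff)
```

The two derivatives should agree up to numerical error. Calling `nonlin(z, deriv=True)` on the raw input would instead compute `z*(1-z)`, which is not the sigmoid derivative.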
## Task 1
- Make a plot of the activation function and its derivative for x = -10, ..., 10

Define the input training data (X) and the expected output (y). What pattern connects the two?

In [24]:
X = np.array([[0,0,1],
            [0,1,1],
            [1,0,1],
            [1,1,1]])
                
y = np.array([[0], [1], [1], [0]])

Randomly initialize the weights for a 3-layer network. We have 3 input neurons, 1 hidden layer with 6 neurons, and an output layer with a single neuron. Set the learning rate, mu.

In [25]:
np.random.seed(1)
syn0 = 2*np.random.random((3,6)) - 1
syn1 = 2*np.random.random((6,1)) - 1
mu = 1.0
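
One way to check the architecture described above is to follow the array shapes through a single forward pass. The sketch below repeats the definitions from the earlier cells so that it runs on its own; the shape comments are the point, not the values.

```python
import numpy as np

def nonlin(x):
    # Sigmoid activation, as defined earlier.
    return 1 / (1 + np.exp(-x))

X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]])

np.random.seed(1)
syn0 = 2 * np.random.random((3, 6)) - 1   # input -> hidden weights, values in [-1, 1)
syn1 = 2 * np.random.random((6, 1)) - 1   # hidden -> output weights, values in [-1, 1)

l1 = nonlin(np.dot(X, syn0))    # (4, 3) @ (3, 6) -> (4, 6): 6 hidden activations per sample
l2 = nonlin(np.dot(l1, syn1))   # (4, 6) @ (6, 1) -> (4, 1): 1 output per sample

print(X.shape, syn0.shape, l1.shape, syn1.shape, l2.shape)
```

All four training samples are pushed through the network at once, one per row.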

Train the neural network for 60000 epochs (iterations).

In [26]:
for j in range(60000):

    # Feed forward through layers 0, 1, and 2
    l0 = X
    l1 = nonlin(np.dot(l0, syn0))
    l2 = nonlin(np.dot(l1, syn1))

    # how much did we miss the target value?
    l2_error = y - l2

    if (j % 10000) == 0:
        print("Error:" + str(np.mean(np.abs(l2_error))))

    # in what direction is the target value?
    l2_delta = l2_error * nonlin(l2, deriv=True)

    # how much did each l1 value contribute to the l2 error (according to the weights)?
    l1_error = l2_delta.dot(syn1.T)

    # in what direction is the target l1?
    l1_delta = l1_error * nonlin(l1, deriv=True)

    # update the weights, scaling each update once by the learning rate mu
    syn1 += mu * l1.T.dot(l2_delta)
    syn0 += mu * l0.T.dot(l1_delta)

# show the output after training
print(l2)

Error:0.49972699398089837
Error:0.01037957354915965
Error:0.007053181402197165
Error:0.005657851297518694
Error:0.004847339430731188
Error:0.004303099786337031
[[0.00324144]
 [0.99807094]
 [0.99478933]
 [0.00524269]]
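
For reference, the update steps in the loop can be read as gradient descent on a squared-error loss. This is a sketch under that assumption (the original example does not name a loss); here $W_0$ and $W_1$ stand for `syn0` and `syn1`, $\odot$ is the element-wise product, and $\mu$ is the learning rate:

```latex
\begin{aligned}
E        &= \tfrac{1}{2} \sum_i \left( y_i - l_{2,i} \right)^2 \\
\delta_2 &= (y - l_2) \odot l_2 \odot (1 - l_2) \\
\delta_1 &= \left( \delta_2 W_1^{\top} \right) \odot l_1 \odot (1 - l_1) \\
W_1      &\leftarrow W_1 + \mu \, l_1^{\top} \delta_2 \\
W_0      &\leftarrow W_0 + \mu \, l_0^{\top} \delta_1
\end{aligned}
```

With $\mu = 1$, as set above, these expressions coincide with the updates of `syn1` and `syn0` in the training loop.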


Now the same neural network implemented in Keras/TensorFlow.

In [22]:
from tensorflow import keras
from tensorflow.keras import layers

training_data = X
outputs = y

 
model = keras.Sequential()
model.add(keras.Input(shape=(3,)))
model.add(layers.Dense(6, activation="sigmoid"))
model.add(layers.Dense(1, activation="sigmoid"))

model.summary()

# Configure the optimizer and loss, then train for 100 epochs
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
              loss=keras.losses.MeanSquaredError())
h = model.fit(training_data, outputs, epochs=100)

Model: "sequential_5"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_10 (Dense)             (None, 6)                 24        
_________________________________________________________________
dense_11 (Dense)             (None, 1)                 7         
Total params: 31
Trainable params: 31
Non-trainable params: 0
_________________________________________________________________
Epoch 1/100
...
Epoch 100/100
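
The parameter counts in the summary above can be reproduced by hand. A `Dense` layer holds one weight per input-output pair plus, by default, one bias per output neuron. The helper `dense_params` below is a hypothetical name introduced only for this illustration:

```python
def dense_params(n_in, n_out, use_bias=True):
    # Hypothetical helper: number of trainable parameters in a fully connected layer.
    return n_in * n_out + (n_out if use_bias else 0)

hidden = dense_params(3, 6)   # 3*6 weights + 6 biases
output = dense_params(6, 1)   # 6*1 weights + 1 bias
total = hidden + output
print(hidden, output, total)

# The manual network only has the weight matrices syn0 (3x6) and syn1 (6x1).
manual = dense_params(3, 6, use_bias=False) + dense_params(6, 1, use_bias=False)
print(manual)
```

The first line of output matches the `Param #` column and the total of 31 in the summary; the manual network has fewer parameters because it has no separate bias terms.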


In [20]:
print(model.predict(X))

[[1.1797477e-07]
 [9.8162055e-01]
 [9.9498057e-01]
 [3.7617207e-02]]


## Tasks
- Compare the manual implementation with the Keras version. What are the similarities and differences?
- Do they show the same performance? If there are differences, what is the reason? What needs to be changed?
- Change the activation function to ReLU (code below). What happens? Do you have an explanation?
- What is the prediction for an input where the last element is 0?

In [None]:
def relu(x, deriv=False):
    if deriv:
        return np.array(x > 0).astype(int)
    return np.maximum(0, x)
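
A short usage sketch for `relu`, with illustrative input values. As with `nonlin`, the training loop calls the derivative on a layer's *output*; for ReLU this gives the same result as calling it on the input, because the output is positive exactly where the input is positive.

```python
import numpy as np

def relu(x, deriv=False):
    # Same definition as above.
    if deriv:
        return np.array(x > 0).astype(int)
    return np.maximum(0, x)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])   # illustrative pre-activation values

a = relu(z)               # negative inputs are clipped to 0
g = relu(a, deriv=True)   # gradient is 1 where the unit is active, 0 elsewhere

print(a)
print(g)
```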