# Neural network for addition of 2 numbers
### by Börge Göbel

In [1]:
import numpy as np
import matplotlib.pyplot as plt

In [2]:
import random

## 1. Prepare training and test data (typically loaded from file)

- Here we generate the data ourselves

In [3]:
rangeData = 20                             # Numbers from [-rangeData,+rangeData]
lenData = 1000                             # How many pairs of numbers do we generate
testProportion = 0.3                       # 30% testing, 70% training 
testEnd = round(lenData * testProportion)  # How many pairs of numbers are used for testing

- Generate 1000 pairs of numbers as 1000 separate inputs for our network

In [4]:
dataIn = np.random.randint(-rangeData, rangeData+1, size=(lenData, 2))

- Generate the corresponding 1000 output values. These will be the sum of the two inputs.
- We do not tell the network that the output is the sum. The network has to learn this by itself.

In [5]:
dataOut = dataIn[:,0] + dataIn[:,1]

- Prepend a constant '1' to each input pair. This entry is multiplied by the bias weight (more on this later)

In [6]:
dataIn = np.concatenate([np.ones([lenData,1]), dataIn], axis=1)

- The final data sets, with one example from each

In [7]:
testingIn   = dataIn[0:testEnd]
testingOut  = dataOut[0:testEnd]
trainingIn  = dataIn[testEnd:]
trainingOut = dataOut[testEnd:]

In [8]:
print( testingIn[0] )
print( testingOut[0] )
print( trainingIn[0] )
print( trainingOut[0] )

[  1.   3. -17.]
-14
[ 1. -6.  5.]
-1


## 2. Setting up neural network

![Addition_network.png](Addition_network.png)

Input layer length: 3 (1 bias + 2 numbers)

Output layer length: 1 (result)

### 2.1 Initialize weights: Numbers in the range from -2 to 2

- We need a starting point for our weights. Let's select them randomly.
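
A minimal sketch of this initialization, assuming NumPy's random generator is used. The name `weights` and the fixed seed are illustrative choices, not taken from the original notebook.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed only for reproducibility (assumption)

# One weight per entry of the input layer: bias, first number, second number
weights = rng.uniform(-2.0, 2.0, size=3)
print(weights)
```

Any other starting values would work as well. The training later moves the weights away from this random starting point.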

### 2.2 Activation function

- Typically a monotonic function that rescales a value to the range [0,1]
- Here it is not necessary, because the sum of two numbers is not restricted to [0,1] (activation functions appear in the other examples)
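
For comparison, the sketch below shows the sigmoid as one common activation function next to the identity that this network effectively uses. Both function names are hypothetical; the notebook itself does not define them.

```python
import numpy as np

def sigmoid(value):
    """Common activation function: squashes any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-value))

def identity(value):
    """No activation: the neuron value is passed on unchanged."""
    return value

print(sigmoid(0.0))     # 0.5
print(identity(-14.0))  # -14.0
```

With a sigmoid at the output, the network could never return a sum such as -14, which is why the identity is the appropriate choice here.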

### 2.3 Calculate output of our neural network

The value of a neuron is given as the dot product of the two vectors: 
- weights 
- value of the neurons in the previous layer (including bias: value 1)

\\( y = w_0 + w_1x_1 + w_2x_2 \\)

- After training (at the end of this notebook), we expect the weights to be close to

\\( w_0 = 0, w_1 = 1, w_2 = 1 \\)

because then the output is

\\( y = x_1 + x_2 \\)
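
The output calculation can be sketched as a single dot product. The function name `calculateOutput` is an assumption for illustration; the input is the first test example printed above.

```python
import numpy as np

def calculateOutput(weights, inputVector):
    """Neuron value: dot product of the weights and the input (bias entry included)."""
    return np.dot(weights, inputVector)

idealWeights = np.array([0.0, 1.0, 1.0])  # w0 = 0, w1 = 1, w2 = 1
example = np.array([1.0, 3.0, -17.0])     # [bias, x1, x2]

print(calculateOutput(idealWeights, example))  # -14.0
```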

### 2.4 Functions: Calculate accuracy and individual error

### - Accuracy: 
The fraction of samples for which the output is predicted correctly (a prediction counts only as right or wrong).

- Before training, the weights are random, so the output is random as well

### - Error (better for learning): 
- For a pair of numbers we calculate:

\\( \Delta = (y-Y)^2 \\)

\\( y \\): Predicted result by the neural network

\\( Y \\): Correct result (what we have calculated in the beginning)

- Here we have only a single output neuron, but in general:

\\( \Delta = (\vec{y}-\vec{Y})^2=\sum_j (y_j-Y_j)^2 \\)
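
Both measures can be sketched as follows. The function names are hypothetical, and rounding the prediction to the nearest integer before comparing is an assumption about how "correct" is decided.

```python
import numpy as np

def squaredError(weights, inputVector, target):
    """Error for one sample: (y - Y)^2."""
    y = np.dot(weights, inputVector)
    return (y - target) ** 2

def accuracy(weights, inputs, targets):
    """Fraction of samples whose rounded prediction equals the target (rounding is an assumption)."""
    predictions = np.round(inputs @ weights)
    return np.mean(predictions == targets)

inputs = np.array([[1.0, 3.0, -17.0], [1.0, -6.0, 5.0]])  # the two examples printed above
targets = np.array([-14, -1])

wrongWeights = np.array([1.0, 1.0, 1.0])  # bias weight off by one
idealWeights = np.array([0.0, 1.0, 1.0])

print(squaredError(wrongWeights, inputs[0], targets[0]))  # 1.0
print(accuracy(wrongWeights, inputs, targets))            # 0.0
print(accuracy(idealWeights, inputs, targets))            # 1.0
```

The accuracy jumps between 0 and 1 per sample, while the squared error changes smoothly with the weights. That smoothness is what makes the error the better quantity for learning.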

### 2.5 Function: Calculate gradient (d Error / d weight)

- The derivatives of the error with respect to the individual weights follow from the chain rule, with \\( x_0 = 1 \\) for the bias weight

\\( \frac{\partial }{\partial w_i}\Delta = 2(y-Y)\cdot x_i\\)
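
A short sketch of this gradient for a single sample; the function name `gradient` is an assumption. Because the input vector already contains the bias entry 1, one expression covers all three weights.

```python
import numpy as np

def gradient(weights, inputVector, target):
    """d(error)/d(w_i) = 2 (y - Y) x_i for every weight, returned as one vector."""
    y = np.dot(weights, inputVector)
    return 2.0 * (y - target) * inputVector

weights = np.array([1.0, 1.0, 1.0])
example = np.array([1.0, 3.0, -17.0])  # [bias, x1, x2]
target = -14

print(gradient(weights, example, target))  # [  2.   6. -34.]
```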

## 3. Training: Use gradient descent to change weights to minimize the error

Repeat the following process many times:
- Select an input pair (index)
- Calculate the gradient of the error 
- Change weights according to 

\\( w_\mathrm{new} = w_\mathrm{old} - \mathrm{learningRate}\cdot \mathrm{gradient}\\)
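
The loop above can be sketched as follows. The block regenerates its own training data so that it runs on its own. The learning rate, the number of steps and the seed are assumed values, not taken from the original notebook.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed only for reproducibility (assumption)

# Training data in the same layout as above: [1, x1, x2] -> x1 + x2
numbers = rng.integers(-20, 21, size=(700, 2))
trainingIn = np.concatenate([np.ones([700, 1]), numbers], axis=1)
trainingOut = numbers[:, 0] + numbers[:, 1]

weights = rng.uniform(-2.0, 2.0, size=3)  # random starting point
learningRate = 0.0005                     # assumed value; kept small because inputs reach +-20
steps = 20000                             # assumed number of update steps

for _ in range(steps):
    index = rng.integers(len(trainingIn))        # select a random input pair
    x, Y = trainingIn[index], trainingOut[index]
    y = np.dot(weights, x)                       # prediction with the current weights
    gradient = 2.0 * (y - Y) * x                 # d(error)/d(weights)
    weights = weights - learningRate * gradient  # gradient descent step

print(weights)  # expected to end up close to [0, 1, 1]
```

A much larger learning rate makes the updates overshoot and diverge for inputs of this size, so rescaling the inputs or lowering the rate are the usual remedies.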

## 4. Application to test data set (new data)
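
A minimal sketch of this evaluation step. To stay runnable on its own, the block generates fresh test data and uses stand-in weights that are merely close to the ideal ones; they are an assumption and not the result of an actual training run. Rounding before the comparison is also an assumption.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed only for reproducibility (assumption)

# Test data in the same layout as above: [1, x1, x2] -> x1 + x2
numbers = rng.integers(-20, 21, size=(300, 2))
testingIn = np.concatenate([np.ones([300, 1]), numbers], axis=1)
testingOut = numbers[:, 0] + numbers[:, 1]

# Stand-in for trained weights (assumption): close to, but not exactly, [0, 1, 1]
weights = np.array([0.001, 0.999, 1.001])

predictions = testingIn @ weights
accuracy = np.mean(np.round(predictions) == testingOut)  # fraction of correct sums
meanError = np.mean((predictions - testingOut) ** 2)     # average squared error

print(accuracy, meanError)
```

Since none of these pairs were used for training, a high accuracy here indicates that the network has learned the addition rule rather than memorized individual examples.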