# Implementing 2- and 3-Layer Neural Networks

The code in this chapter is based on the article from http://iamtrask.github.io.

We are now going to implement basic 2-layer and 3-layer neural networks. The complete 3-layer version fits within 11 lines of code (not counting the NumPy import):

```python
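import numpy as np

X = np.array([ [0,0,1],[0,1,1],[1,0,1],[1,1,1] ])   # inputs, one example per row
y = np.array([[0,1,1,0]]).T                          # targets as a column
syn0 = 2*np.random.random((3,4)) - 1                 # weights: input -> hidden
syn1 = 2*np.random.random((4,1)) - 1                 # weights: hidden -> output
for j in range(60000):
    l1 = 1/(1+np.exp(-(np.dot(X,syn0))))             # hidden layer (sigmoid)
    l2 = 1/(1+np.exp(-(np.dot(l1,syn1))))            # output layer (sigmoid)
    l2_delta = (y - l2)*(l2*(1-l2))                  # output error * sigmoid slope
    l1_delta = l2_delta.dot(syn1.T) * (l1 * (1-l1))  # error propagated back to l1
    syn1 += l1.T.dot(l2_delta)
    syn0 += X.T.dot(l1_delta)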
```
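To make the matrix shapes in the listing above easier to follow, here is a minimal sketch that runs one forward and backward pass and prints the shape of every array. The `np.random.seed(1)` call and the `shapes` dictionary are assumptions added for this illustration; they are not part of the 11-line listing.

```python
import numpy as np

np.random.seed(1)  # assumption: seeded only so the sketch is reproducible

X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
y = np.array([[0, 1, 1, 0]]).T
syn0 = 2 * np.random.random((3, 4)) - 1
syn1 = 2 * np.random.random((4, 1)) - 1

# One pass, using the same expressions as the loop body above.
l1 = 1 / (1 + np.exp(-np.dot(X, syn0)))
l2 = 1 / (1 + np.exp(-np.dot(l1, syn1)))
l2_delta = (y - l2) * (l2 * (1 - l2))
l1_delta = l2_delta.dot(syn1.T) * (l1 * (1 - l1))

# Hypothetical helper dictionary, used only to print the shapes.
shapes = {
    "X": X.shape, "y": y.shape, "syn0": syn0.shape, "syn1": syn1.shape,
    "l1": l1.shape, "l2": l2.shape,
    "l2_delta": l2_delta.shape, "l1_delta": l1_delta.shape,
    "syn1 update": l1.T.dot(l2_delta).shape,
    "syn0 update": X.T.dot(l1_delta).shape,
}
for name in sorted(shapes):
    print("{}: {}".format(name, shapes[name]))
```

Each weight update has the same shape as the weight matrix it is added to, which is why the in-place `+=` in the listing works.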

## 2 Layer Neural Network

<table class="tg">
  <tbody><tr>
    <th class="tg-5rcs" colspan="3">Inputs</th>
    <th class="tg-5rcs">Output</th>
  </tr>
  <tr>
    <td class="tg-4kyz">0</td>
    <td class="tg-4kyz">0</td>
    <td class="tg-4kyz">1</td>
    <td class="tg-4kyz">0</td>
  </tr>
  <tr>
    <td class="tg-4kyz">1</td>
    <td class="tg-4kyz">1</td>
    <td class="tg-4kyz">1</td>
    <td class="tg-4kyz">1</td>
  </tr>
  <tr>
    <td class="tg-4kyz">1</td>
    <td class="tg-4kyz">0</td>
    <td class="tg-4kyz">1</td>
    <td class="tg-4kyz">1</td>
  </tr>
  <tr>
    <td class="tg-4kyz">0</td>
    <td class="tg-4kyz">1</td>
    <td class="tg-4kyz">1</td>
    <td class="tg-4kyz">0</td>
  </tr>
</tbody></table>
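Reading the table above, the output column appears to equal the first input column, and the third input is always 1. The short sketch below checks both observations; the array name `table` is an assumption introduced for this example.

```python
import numpy as np

# Rows copied from the table above: three inputs followed by one output.
table = np.array([[0, 0, 1, 0],
                  [1, 1, 1, 1],
                  [1, 0, 1, 1],
                  [0, 1, 1, 0]])

X = table[:, :3]   # input columns, shape (4, 3)
y = table[:, 3:]   # output column, shape (4, 1)

first_col_matches = np.array_equal(X[:, [0]], y)
third_col_is_constant = bool(np.all(X[:, 2] == 1))

print(first_col_matches)       # True
print(third_col_is_constant)   # True
```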

<table class="tg">
  <tbody><tr>
    <th class="tg-5rcs">Variable</th>
    <th class="tg-5rcs">Definition</th>
  </tr>
  <tr>
    <td class="tg-4kyx">X</td>
    <td class="tg-4kyz">Input dataset matrix where each row is a training example</td>
  </tr>
  <tr>
    <td class="tg-4kyx">y</td>
    <td class="tg-4kyz">Output dataset matrix where each row is a training example</td>
  </tr>
  <tr>
    <td class="tg-4kyx">l0</td>
    <td class="tg-4kyz">First Layer of the Network, specified by the input data</td>
  </tr>
  <tr>
    <td class="tg-4kyx">l1</td>
    <td class="tg-4kyz">Second Layer of the Network, otherwise known as the hidden layer</td>
  </tr>
  <tr>
    <td class="tg-4kyx">syn0</td>
    <td class="tg-4kyz">First layer of weights, Synapse 0, connecting l0 to l1.</td>
  </tr>
  <tr>
    <td class="tg-4kyx">*</td>
    <td class="tg-4kyz">Elementwise multiplication: two vectors of equal size multiply corresponding values 1-to-1 to produce a vector of the same size.</td>
  </tr>
  <tr>
    <td class="tg-4kyx">-</td>
    <td class="tg-4kyz">Elementwise subtraction: two vectors of equal size subtract corresponding values 1-to-1 to produce a vector of the same size.</td>
  </tr>
  <tr>
    <td class="tg-4kyx">x.dot(y)</td>
    <td class="tg-4kyz">If x and y are vectors, this is a dot product. If both are matrices, it's a matrix-matrix multiplication. If only one is a matrix, then it's vector matrix multiplication.</td>
  </tr>
  </tbody>
</table>
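The three operators in the table can be tried directly in NumPy. This is a small sketch; the arrays `a`, `b`, and `M` are made-up values chosen only for illustration.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

prod = a * b       # elementwise multiplication: 4, 10, 18
diff = a - b       # elementwise subtraction: -3, -3, -3
dot_vv = a.dot(b)  # two vectors: dot product, 1*4 + 2*5 + 3*6 = 32

M = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 3.0]])   # shape (2, 3)

dot_mv = M.dot(a)     # matrix and vector: result has shape (2,)
dot_mm = M.dot(M.T)   # two matrices: matrix product with shape (2, 2)

print(prod)
print(diff)
print(dot_vv)
print(dot_mv)
print(dot_mm)
```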

```python
import numpy as np

# Sigmoid function: maps any real value into the range (0, 1).
# With deriv=True, x must already be a sigmoid output; the function
# then returns the slope of the sigmoid at that output, x*(1-x).
def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

# input dataset
X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]])

# output dataset
y = np.array([[0, 0, 1, 1]]).T

# seed random numbers to make the calculation
# deterministic (just a good practice)
np.random.seed(1)

# initialize weights randomly with mean 0
syn0 = 2 * np.random.random((3, 1)) - 1
print("Init of syn0:")
print(syn0)

for i in range(10000):

    # forward propagation
    l0 = X
    l1 = nonlin(np.dot(l0, syn0))

    # how much did we miss?
    l1_error = y - l1

    # multiply how much we missed by the
    # slope of the sigmoid at the values in l1
    l1_delta = l1_error * nonlin(l1, True)

    # update weights
    syn0 += np.dot(l0.T, l1_delta)

    # log the first iteration and one near the end of training
    if i == 0 or i == 9998:
        print(" ")
        print("iter:" + str(i))
        print("2layer Output in iter:" + str(i))
        print(l1)
        print("Error in iter:" + str(i))
        print(l1_error)
        print("Delta in iter:" + str(i))
        print(l1_delta)
        print("Weights in iter:" + str(i))
        print(syn0)

print("")
print("Output After Training: ")
print("Final output L1:")
print(l1)
print("Final weights:")
print(syn0)
print("Final Error:")
print(l1_error)
print("Final Delta:")
print(l1_delta)
```


```
Init of syn0:
[[-0.16595599]
 [ 0.44064899]
 [-0.99977125]]

iter:0
2layer Output in iter:0
[[ 0.2689864 ]
 [ 0.36375058]
 [ 0.23762817]
 [ 0.3262757 ]]
Error in iter:0
[[-0.2689864 ]
 [-0.36375058]
 [ 0.76237183]
 [ 0.6737243 ]]
Weights in iter:0
[[ 0.12025406]
 [ 0.50456196]
 [-0.85063774]]

iter:9998
2layer Output in iter:9998
[[ 0.00966498]
 [ 0.00786546]
 [ 0.99358866]
 [ 0.99211917]]
Error in iter:9998
[[-0.00966498]
 [-0.00786546]
 [ 0.00641134]
 [ 0.00788083]]
Weights in iter:9998
[[ 9.67289058]
 [-0.20784374]
 [-4.62958526]]

Output After Training: 
Final output L1:
[[ 0.00966449]
 [ 0.00786506]
 [ 0.99358898]
 [ 0.99211957]]
Final weights:
[[ 9.67299303]
 [-0.2078435 ]
 [-4.62963669]]
Final Error:
[[-0.00966449]
 [-0.00786506]
 [ 0.00641102]
 [ 0.00788043]]
Final Delta:
[[ -9.24997129e-05]
 [ -6.13726376e-05]
 [  4.08376707e-05]
 [  6.16117429e-05]]
```
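The logged numbers above can be checked by hand. The sketch below replays the first iteration from the printed initial weights, then applies the printed final weights to the inputs. The function name `sigmoid` and the variables `syn0_after`, `final_w`, and `pred` are assumptions introduced for this example. The weights are copied from the log, so they are rounded to eight decimals and the results agree only to about that precision.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
y = np.array([[0, 0, 1, 1]]).T

# Initial weights copied from the "Init of syn0" log above.
syn0 = np.array([[-0.16595599], [0.44064899], [-0.99977125]])

# Replay iteration 0: forward pass, error, delta, weight update.
l1 = sigmoid(X.dot(syn0))
l1_error = y - l1
l1_delta = l1_error * l1 * (1 - l1)   # error scaled by the sigmoid slope
syn0_after = syn0 + X.T.dot(l1_delta)

print(l1.ravel())          # compare with "2layer Output in iter:0"
print(syn0_after.ravel())  # compare with "Weights in iter:0"

# Final weights copied from the "Final weights" log above.
final_w = np.array([[9.67299303], [-0.2078435], [-4.62963669]])
pred = sigmoid(X.dot(final_w))
print(np.round(pred).ravel())  # rounded predictions match the targets in y
```

The large positive weight on the first input and the negative weight on the constant third input push the sigmoid close to 1 when the first input is 1 and close to 0 otherwise.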
