## Neural Networks 
- This was adapted from the PyTorch Tutorials.
- http://pytorch.org/tutorials/beginner/pytorch_with_examples.html

- Neural networks are the foundation of deep learning, which has revolutionized the 

> In the mathematical theory of artificial neural networks, the universal approximation theorem states[1] that a feed-forward network with a single hidden layer containing a finite number of neurons (i.e., a multilayer perceptron) can approximate continuous functions on compact subsets of $\mathbb{R}^n$, under mild assumptions on the activation function.
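
The statement above can be written more concretely. What follows is a sketch of the classical single-hidden-layer form; the symbols $f$, $K$, $\varphi$, $N$, $v_i$, $w_i$, $b_i$, and $\varepsilon$ are notation introduced here for illustration and do not come from the quoted text. For a continuous function $f$ on a compact set $K \subset \mathbb{R}^n$ and any $\varepsilon > 0$, and assuming the activation $\varphi$ satisfies the theorem's conditions, there exist an integer $N$, scalars $v_i, b_i \in \mathbb{R}$, and vectors $w_i \in \mathbb{R}^n$ such that

$$
F(x) = \sum_{i=1}^{N} v_i \, \varphi\!\left(w_i^{\top} x + b_i\right)
\quad \text{satisfies} \quad
\left| F(x) - f(x) \right| < \varepsilon \quad \text{for all } x \in K .
$$

The theorem guarantees that such weights exist; it does not say how to find them or how large $N$ must be.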

### Generate Fake Data
- `D_in` is the number of dimensions of an input variable.
- `D_out` is the number of dimensions of an output variable.
- Here we are learning some special "fake" data modeled on the XOR problem.
- Here, the dependent variable (dv) is 1 if either the first or second variable is 1. The sample does not include the case where both are 1; its fourth row repeats `[0,0,0]`.
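
For comparison, the full XOR truth table on the first two inputs can be generated programmatically. This is only an illustrative sketch: `x_full` and `y_full` are hypothetical names that are not used elsewhere in this notebook, and the third input column is held at 0 as an assumption that mirrors the notebook's data.

```python
import numpy as np

# Hypothetical full XOR inputs: every combination of the first two columns,
# with the third column fixed at 0 (an assumption made for this sketch).
x_full = np.array([[0, 0, 0],
                   [1, 0, 0],
                   [0, 1, 0],
                   [1, 1, 0]])

# XOR label: 1 when exactly one of the first two inputs is 1.
y_full = np.logical_xor(x_full[:, 0], x_full[:, 1]).astype(int).reshape(-1, 1)

print("Inputs:\n", x_full, "\nXOR labels:\n", y_full)
```

The labels come out as a column vector, the same shape convention as `y` in this notebook.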


In [5]:
# -*- coding: utf-8 -*-
import numpy as np

# These are our independent (x) and dependent (y) variables.
x = np.array([ [0,0,0],[1,0,0],[0,1,0],[0,0,0] ])
y = np.array([[0,1,1,0]]).T
print("Input data:\n",x,"\n Output data:\n",y)

Input data:
 [[0 0 0]
 [1 0 0]
 [0 1 0]
 [0 0 0]] 
 Output data:
 [[0]
 [1]
 [1]
 [0]]


### A Simple Neural Network 
- Here we are going to build a neural network with one hidden layer, i.e., two weight matrices (`w1` and `w2`).

In [26]:
np.random.seed(seed=83832)
# D_in is the number of input variables.
# H is the hidden dimension.
# D_out is the number of dimensions of the output.
D_in, H, D_out = 3, 3, 1

# Randomly initialize the weights of our network (one hidden layer, two weight matrices).
w1 = np.random.randn(D_in, H)
w2 = np.random.randn(H, D_out)
# Note: this bias is initialized but not used in the training loop below.
bias = np.random.randn(H, 1)
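
To make the shapes concrete, here is a minimal sketch of how data flows through a network of this size. All `*_demo` names are hypothetical and the values are random. The hidden bias `b1_demo` is an assumption added for illustration: it has shape `(H,)` so that NumPy broadcasting adds it to every row, unlike the `(H, 1)` bias above.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
N, D_in, H, D_out = 4, 3, 3, 1  # N is the number of rows (samples)

x_demo = rng.standard_normal((N, D_in))
w1_demo = rng.standard_normal((D_in, H))
w2_demo = rng.standard_normal((H, D_out))
b1_demo = rng.standard_normal(H)  # hypothetical hidden-layer bias, shape (H,)

# (N, D_in) @ (D_in, H) -> (N, H); the bias broadcasts across the N rows.
h_demo = x_demo.dot(w1_demo) + b1_demo
# ReLU keeps the shape; (N, H) @ (H, D_out) -> (N, D_out).
y_demo = np.maximum(h_demo, 0).dot(w2_demo)

print(h_demo.shape, y_demo.shape)
```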

### Learn the Appropriate Weights via Backpropagation
- The learning rate controls how quickly the model adjusts its parameters.
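
The effect of the learning rate is easiest to see on a toy problem. The sketch below is not part of the original notebook: it runs plain gradient descent on the one-dimensional function f(w) = w², and `descend` is a hypothetical helper written for this illustration.

```python
def descend(lr, steps=20, w=1.0):
    """Run `steps` gradient-descent updates on f(w) = w**2, starting from w."""
    for _ in range(steps):
        grad = 2.0 * w      # derivative of w**2
        w -= lr * grad      # same update rule as the weight updates in this notebook
    return w

w_small = descend(lr=1e-3)  # tiny steps: barely moves toward the minimum at 0
w_large = descend(lr=1e-1)  # larger steps: gets much closer to 0
w_huge = descend(lr=1.1)    # too large: overshoots and moves away from 0

print(w_small, w_large, w_huge)
```

A rate that is too small converges slowly, while one that is too large can overshoot the minimum and diverge.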

In [27]:
# -*- coding: utf-8 -*-

learning_rate = 1e-3
for t in range(500):
    # Forward pass: compute predicted y
    h = x.dot(w1)

    # ReLU is the activation function: it clamps negative values to zero.
    h_relu = np.maximum(h, 0)
    y_pred = h_relu.dot(w2)

    # Compute and print loss
    loss = np.square(y_pred - y).sum()
    print(t, loss)

    # Backprop to compute gradients of the loss with respect to w1 and w2
    grad_y_pred = 2.0 * (y_pred - y)        # d(loss)/d(y_pred)
    grad_w2 = h_relu.T.dot(grad_y_pred)     # d(loss)/d(w2)
    grad_h_relu = grad_y_pred.dot(w2.T)     # d(loss)/d(h_relu)
    grad_h = grad_h_relu.copy()
    grad_h[h < 0] = 0                       # ReLU passes no gradient where h < 0
    grad_w1 = x.T.dot(grad_h)               # d(loss)/d(w1)

    # Update weights
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2

0 0.160537076323
1 0.153595453821
2 0.146975821295
3 0.140662873332
4 0.134642046866
5 0.128899484146
6 0.123421997627
7 0.118197036687
8 0.11321265606
9 0.108457485896
10 0.103920703363
11 0.0995920057013
12 0.0954615846471
13 0.0915201021691
14 0.0877586674284
15 0.0841688149077
16 0.0807424836442
17 0.0774719975089
18 0.074350046475
19 0.0713696688267
20 0.0685242342555
21 0.0658074278016
22 0.0632132345924
23 0.0607359253406
24 0.0583700425598
25 0.0561103874624
26 0.0539520075041
27 0.0518901845415
28 0.0499204235731
29 0.0480384420318
30 0.0462401596028
31 0.0445216885394
32 0.042879324452
33 0.0413095375454
34 0.0398089642841
35 0.0383743994617
36 0.0370027886556
37 0.0356912210476
38 0.034436922592
39 0.0332372495143
40 0.032089682123
41 0.030991818921
42 0.0299413709995
43 0.0289361567028
44 0.0279740965484
45 0.0270532083915
46 0.0261716028219
47 0.0253274787804
48 0.0245191193855
49 0.0237448879602
50 0.0230032242474
51 0.022292640807
52 0.021611719584
53 0.0209591086412
...
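
One way to gain confidence in hand-written backpropagation is to compare it against a finite-difference estimate. The sketch below repeats the notebook's forward and backward formulas inside hypothetical helper functions (`loss_fn`, `analytic_grads`, `numeric_grad`) and uses freshly sampled weights with an arbitrary seed, so it does not reproduce the losses printed above.

```python
import numpy as np

def loss_fn(x, y, w1, w2):
    """Sum-of-squares loss of the one-hidden-layer ReLU network."""
    h_relu = np.maximum(x.dot(w1), 0)
    return np.square(h_relu.dot(w2) - y).sum()

def analytic_grads(x, y, w1, w2):
    """Same backprop steps as the training loop in this notebook."""
    h = x.dot(w1)
    h_relu = np.maximum(h, 0)
    grad_y_pred = 2.0 * (h_relu.dot(w2) - y)
    grad_w2 = h_relu.T.dot(grad_y_pred)
    grad_h = grad_y_pred.dot(w2.T)
    grad_h[h < 0] = 0
    return x.T.dot(grad_h), grad_w2

def numeric_grad(f, w, eps=1e-6):
    """Central-difference estimate of d f / d w, perturbing w in place."""
    grad = np.zeros_like(w)
    for idx in np.ndindex(*w.shape):
        old = w[idx]
        w[idx] = old + eps
        plus = f()
        w[idx] = old - eps
        minus = f()
        w[idx] = old  # restore the original value
        grad[idx] = (plus - minus) / (2 * eps)
    return grad

rng = np.random.default_rng(0)  # arbitrary seed for this sketch
x = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=float)
y = np.array([[0, 1, 1, 0]], dtype=float).T
w1 = rng.standard_normal((3, 3))
w2 = rng.standard_normal((3, 1))

g1, g2 = analytic_grads(x, y, w1, w2)
n1 = numeric_grad(lambda: loss_fn(x, y, w1, w2), w1)
n2 = numeric_grad(lambda: loss_fn(x, y, w1, w2), w2)
max_err = max(np.abs(g1 - n1).max(), np.abs(g2 - n2).max())
print("largest gradient mismatch:", max_err)
```

A mismatch close to zero suggests the analytic gradients are consistent with the loss; the check can be unreliable when a hidden pre-activation sits exactly at the ReLU kink.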

### Fully Connected

In [34]:

# Forward pass with the learned weights: ReLU(x.dot(w1)).dot(w2)
pred = np.maximum(x.dot(w1), 0).dot(w2)

print(pred, "\n", y)


[[ 0.        ]
 [ 0.98919177]
 [ 1.00082623]
 [ 0.        ]] 
 [[0]
 [1]
 [1]
 [0]]
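
The raw predictions are continuous values, so comparing them with the 0/1 targets requires a cutoff. The sketch below copies the predictions printed above into a hypothetical array `pred_printed` and applies a 0.5 threshold; the threshold is an assumption made for illustration, not something the notebook specifies.

```python
import numpy as np

# Values copied from the printed output of the prediction cell.
pred_printed = np.array([[0.0], [0.98919177], [1.00082623], [0.0]])
y_true = np.array([[0, 1, 1, 0]]).T

# Assumed decision rule: predict 1 when the output exceeds 0.5.
labels = (pred_printed > 0.5).astype(int)
accuracy = (labels == y_true).mean()

print(labels.ravel(), accuracy)
```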


### Hidden Layers are Often Viewed as Unknown
- Each layer is just a weight matrix.

In [8]:
# However, the learned weights are ordinary arrays that can be inspected directly.
w1

array([[ 0.74015787, -0.16474176, -0.48135933],
       [ 1.23746379, -0.00148781,  0.31587215],
       [-0.49991311, -0.18841141, -0.16494133]])

In [9]:
w2

array([[ 1.16346749],
       [-2.15990232],
       [-1.30731163]])

In [49]:
# ReLU replaces the negative values with zero.
h_relu

array([[ 0.        ,  0.        ,  0.        ],
       [ 0.72108356,  0.        ,  0.        ],
       [ 0.72753913,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ]])
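
As a small standalone illustration of what ReLU does in the forward and backward passes, the sketch below applies it to a made-up array. `h_demo` and `upstream` are hypothetical values chosen for this example and are unrelated to the trained network.

```python
import numpy as np

# Made-up hidden pre-activations with negative, zero, and positive entries.
h_demo = np.array([[-1.5, 0.0, 2.0],
                   [0.3, -0.2, -0.7]])

# Forward: negative entries become 0, everything else is unchanged.
relu_out = np.maximum(h_demo, 0)

# Backward: the incoming gradient is zeroed wherever the input was negative,
# following the same `h < 0` rule as the training loop in this notebook.
upstream = np.ones_like(h_demo)
grad_h_demo = upstream.copy()
grad_h_demo[h_demo < 0] = 0

print(relu_out)
print(grad_h_demo)
```

Hidden units whose inputs are negative for every row, like the all-zero columns of `h_relu` above, contribute nothing to the prediction.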