![](images/calc.png)

### Let's first understand how backpropagation works

![alt text](https://i.ytimg.com/vi/An5z8lR8asY/maxresdefault.jpg "Logo Title Text 1")

In 1986, Rumelhart, Hinton, and Williams published [this](http://www.cs.toronto.edu/~hinton/absps/naturebp.pdf) paper, which popularized 'backpropagation' as a way to train multi-layer neural networks by propagating error gradients backward through the layers. It is one of the foundations of the current Deep Learning boom.

3 concepts behind Backpropagation (from Calculus)

1. Derivative
![alt text](https://i.imgur.com/eRF9pXu.jpg "Logo Title Text 1")

2. Partial Derivative

![alt text](https://i.imgur.com/Rergqbt.jpg "Logo Title Text 1")

3. Chain Rule

![alt text](https://i.imgur.com/HFmGQyH.jpg "Logo Title Text 1")

![](images/bp1.png)

![](images/bp2.png)
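The three calculus ideas above can be checked numerically. The snippet below is a minimal sketch, not part of the original notebook: the helper `numerical_derivative` and the example functions are assumptions chosen only for illustration.

```python
# Hypothetical helper: central-difference estimate of df/dx at x.
def numerical_derivative(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2 * h)

# 1. Derivative: f(x) = x**2 has f'(x) = 2x, so f'(3) is about 6.
f = lambda x: x ** 2
d_f = numerical_derivative(f, 3.0)

# 2. Partial derivative: g(x, y) = x * y. Hold y fixed at 4 and vary x,
#    so dg/dx = y, which is about 4.
g = lambda x, y: x * y
dg_dx = numerical_derivative(lambda x: g(x, 4.0), 2.0)

# 3. Chain rule: h(x) = (3x + 1)**2 with u = 3x + 1.
#    dh/dx = dh/du * du/dx = 2u * 3
x = 2.0
u = 3 * x + 1
chain_rule_grad = 2 * u * 3
numeric_grad = numerical_derivative(lambda x: (3 * x + 1) ** 2, x)

print(d_f, dg_dx, chain_rule_grad, numeric_grad)
```

Backpropagation is the chain rule applied layer by layer: each layer multiplies the gradient it receives by its own local derivative and passes the result back.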

In [1]:
# l1 = 0.05
# l2 = 0.10
# o1 = 0.90
# o2 = 0.78

In [2]:
# A single Neuron
inputs = [1.2,5.6,2.1]
weights = [3.1,2.1,9.8]
bias = 1.0

In [3]:
output = inputs[0]*weights[0] + inputs[1]*weights[1] + inputs[2]*weights[2] + bias

In [4]:
output

37.06
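To connect this neuron to backpropagation, here is a rough sketch of one gradient-descent step on it. The `target`, the squared-error loss, and the `learning_rate` are assumptions added for illustration; the notebook itself does not define them.

```python
inputs = [1.2, 5.6, 2.1]
weights = [3.1, 2.1, 9.8]
bias = 1.0
target = 30.0          # assumed target value, for illustration only
learning_rate = 0.01   # assumed step size

# Forward pass: the same weighted sum as the single neuron above.
output = sum(x * w for x, w in zip(inputs, weights)) + bias

# Assumed loss = (output - target)**2, so dLoss/dOutput = 2 * (output - target).
d_loss_d_output = 2 * (output - target)

# Chain rule: dOutput/dw_i = x_i and dOutput/dbias = 1.
grad_weights = [d_loss_d_output * x for x in inputs]
grad_bias = d_loss_d_output * 1.0

# One gradient-descent step on every parameter.
new_weights = [w - learning_rate * g for w, g in zip(weights, grad_weights)]
new_bias = bias - learning_rate * grad_bias

# Forward pass again: the output should now sit closer to the target.
new_output = sum(x * w for x, w in zip(inputs, new_weights)) + new_bias
print(output, new_output)
```

The gradient for each weight is just the upstream gradient times the input that weight multiplies, which is why larger inputs move their weights more.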

In [27]:
# A single Layer
inputs = [1.2,5.6,2.1,2.5]
weights1 = [3.1,2.1,9.8,1.0]
weights2 = [5.1,6.1,4.8,7.0]
weights3 = [6.1,4.1,5.8,6.0]
bias1 = 1.0
bias2 = 2.0
bias3 = 0.5

In [28]:
output = [inputs[0]*weights1[0] + inputs[1]*weights1[1] + inputs[2]*weights1[2] + inputs[3]* weights1[3] + bias1,
         inputs[0]*weights2[0] + inputs[1]*weights2[1] + inputs[2]*weights2[2] + inputs[3]* weights2[3] + bias2,
         inputs[0]*weights3[0] + inputs[1]*weights3[1] + inputs[2]*weights3[2] + inputs[3]* weights3[3] + bias3]

In [29]:
output

[39.56, 69.85999999999999, 57.959999999999994]

In [7]:
import numpy as np

In [8]:
inputs = [1.2,5.6,2.1,2.5]
weights = [[3.1,2.1,9.8,1.0],
          [5.1,6.1,4.8,7.0],
          [6.1,4.1,5.8,6.0]]

bias = [1.0,2.0,0.5]

In [9]:
outputs = np.dot(weights,inputs) + bias
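As a quick sanity check, the matrix form should give the same three values as the hand-written layer above. The sketch below compares `np.dot` against an explicit loop; the name `manual` is introduced here only for the comparison.

```python
import numpy as np

inputs = [1.2, 5.6, 2.1, 2.5]
weights = [[3.1, 2.1, 9.8, 1.0],
           [5.1, 6.1, 4.8, 7.0],
           [6.1, 4.1, 5.8, 6.0]]
bias = [1.0, 2.0, 0.5]

# Matrix form: (3, 4) weights times (4,) inputs gives one value per neuron.
outputs = np.dot(weights, inputs) + bias

# Explicit form: one weighted sum plus bias per neuron.
manual = [sum(x * w for x, w in zip(inputs, row)) + b
          for row, b in zip(weights, bias)]

print(outputs.shape)                 # (3,)
print(np.allclose(outputs, manual))  # True
```

Each row of `weights` holds the weights of one neuron, so the dot product computes all three neurons in a single call.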

In [10]:
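def create_bias_and_weights(inputs,neurons):
    # inputs = number of input features, neurons = number of neurons in the layer
    # Small random weights, shape (inputs, neurons)
    weights = 0.10*(np.random.randn(inputs,neurons))
    # One bias per neuron, initialized to zero, shape (1, neurons)
    biases = np.zeros((1,neurons))
    print(weights)
    print(biases)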

In [30]:
create_bias_and_weights(6,128)

[[ 1.23369231e-02  1.48732726e-01 -1.47231522e-01 -2.45400033e-02
  -5.82388308e-02  7.37313447e-02 -1.36764884e-01 -1.99994852e-02
  -3.45012562e-02  3.27411311e-02 -1.24379263e-02  9.95091459e-02
   5.84661652e-02 -3.19422668e-02  3.55581915e-02 -1.69456914e-02
  -1.01487495e-01 -1.18388353e-01 -3.52350005e-02 -9.29561613e-02
   1.95958603e-01  1.84947977e-01  1.61249874e-02  4.16377566e-04
   1.32785295e-01 -9.45944896e-02 -1.55866338e-01 -7.92354078e-02
   1.06658675e-01  1.47139948e-01  1.73918711e-01 -6.21492092e-02
  -4.59484971e-02  4.74428580e-02 -5.94946271e-02 -1.14002993e-02
   7.01280888e-02 -1.27643691e-01  7.09636066e-02 -1.49851904e-01
   9.32289675e-02  7.12097224e-03 -3.86660259e-02 -3.16910754e-01
  -1.89150075e-01  1.40418232e-01 -1.98262137e-01  2.41051226e-02
   1.30451792e-01 -3.91056360e-04  4.54836167e-02 -1.06557206e-01
  -2.21001900e-02  4.31564359e-02 -1.31428512e-02  3.97618589e-02
  -1.46842770e-01 -6.08110734e-03  7.07789081e-02 -1.70350731e-01
   7.89855

# How do artificial & biological neural nets compare?

Artificial Neural Networks are inspired by the hierarchical structure of the brain's neural networks.

![alt text](https://appliedgo.net/media/perceptron/neuron.png "Logo Title Text 1")

The brain has
- Roughly 100 billion neurons
- Each neuron has
   - A cell body with connections
   - Numerous dendrites
   - A single axon
- Parallel chaining (each neuron is connected to 10,000+ others)
- Great at connecting different concepts

Computers have
- Not neurons, but transistors made in silicon!
- Serially chained (each connected to 2-3 others (logic gates))
- Great at storage and recall

Some key differences
- All sensory or motor systems in the brain are recurrent
- Sensory systems tend to have lots of lateral inhibition (neurons inhibiting other neurons in the same layer)
- There is no such thing as a fully connected layer in the brain, connectivity is usually sparse (though not random).
- Brains are born pre-wired to learn without supervision.
- The brain is low power. The distributed version of AlphaGo ran on 1,202 CPUs and 176 GPUs, not to train, but just to run. The brain's power consumption is ~20 W.

![alt text](https://images.gr-assets.com/books/1348246481l/5080355.jpg "Logo Title Text 1")

"the brain is not a blank slate of neuronal layers 
waiting to be pieced together and wired-up; 
we are born with brains already structured 
for unsupervised learning in a dozen cognitive 
domains, some of which already work pretty well 
without any learning at all." - Steven Pinker


In [12]:
import numpy as np

In [13]:
X =  [[3.1,2.1,9.8,1.0],
          [5.1,6.1,4.8,7.0],
          [6.1,4.1,5.8,6.0]]

In [14]:
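class Dense():
    def __init__(self,inputs,neurons):
        # Weights have shape (inputs, neurons), so no transpose is needed in forward()
        self.weights = 0.10*np.random.randn(inputs,neurons)
        # One bias per neuron, broadcast across every row of the batch
        self.biases = np.zeros((1,neurons))
    def forward(self,inputs):
        # (batch, inputs) dot (inputs, neurons) -> (batch, neurons)
        self.output = np.dot(inputs,self.weights) + self.biases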

In [15]:
layer1 = Dense(4,5)

In [16]:
layer1

<__main__.Dense at 0x2904ae54608>

In [17]:
layer1.forward(X)

In [18]:
print(layer1.output)

[[-1.10152064  0.1017865   0.30105313 -1.49946945  0.84348851]
 [-1.10396292 -0.6766822   0.67612415 -0.29111452  0.16715133]
 [-1.25728793 -0.38877619  0.52298024 -0.44106244  0.23215512]]


In [19]:
layer2 = Dense(5,2)

In [20]:
layer2.forward(layer1.output)

In [21]:
print(layer2.output)

[[-0.27329422 -0.19157428]
 [-0.24756747 -0.1221705 ]
 [-0.25344906 -0.09829976]]


In [22]:
class Relu():
    def forward(self,inputs):
        self.output = np.maximum(0,inputs)

In [23]:
activation = Relu()

In [24]:
activation.forward(layer1.output)

In [25]:
print(activation.output)

[[0.         0.1017865  0.30105313 0.         0.84348851]
 [0.         0.         0.67612415 0.         0.16715133]
 [0.         0.         0.52298024 0.         0.23215512]]
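The cells above only run the forward pass. The sketch below shows one way a backward pass could be added so that gradients flow back through `Dense -> Relu -> Dense`. The class names `DenseWithBackward` and `ReluWithBackward`, the `backward` methods, the fixed seed, and the "sum of outputs" loss are all assumptions made for illustration; they are not part of the original notebook.

```python
import numpy as np

class DenseWithBackward:
    """Hypothetical variant of the Dense layer that also caches inputs."""
    def __init__(self, inputs, neurons):
        self.weights = 0.10 * np.random.randn(inputs, neurons)
        self.biases = np.zeros((1, neurons))

    def forward(self, inputs):
        self.inputs = np.asarray(inputs)  # cached for the backward pass
        self.output = np.dot(self.inputs, self.weights) + self.biases

    def backward(self, d_output):
        # d_output: gradient of the loss w.r.t. this layer's output
        self.d_weights = np.dot(self.inputs.T, d_output)
        self.d_biases = np.sum(d_output, axis=0, keepdims=True)
        self.d_inputs = np.dot(d_output, self.weights.T)

class ReluWithBackward:
    """Hypothetical variant of Relu with a backward pass."""
    def forward(self, inputs):
        self.inputs = inputs
        self.output = np.maximum(0, inputs)

    def backward(self, d_output):
        # Gradient passes through only where the input was positive.
        self.d_inputs = d_output * (self.inputs > 0)

np.random.seed(0)  # assumed seed, only to make the sketch repeatable
X = np.array([[3.1, 2.1, 9.8, 1.0],
              [5.1, 6.1, 4.8, 7.0],
              [6.1, 4.1, 5.8, 6.0]])

layer1 = DenseWithBackward(4, 5)
activation = ReluWithBackward()
layer2 = DenseWithBackward(5, 2)

# Forward: Dense -> Relu -> Dense
layer1.forward(X)
activation.forward(layer1.output)
layer2.forward(activation.output)

# Assumed loss: sum of all outputs, so dLoss/dOutput is all ones.
d_loss = np.ones_like(layer2.output)

# Backward: the same chain in reverse order.
layer2.backward(d_loss)
activation.backward(layer2.d_inputs)
layer1.backward(activation.d_inputs)

print(layer1.d_weights.shape)  # (4, 5)
print(layer2.d_weights.shape)  # (5, 2)
```

Each gradient has the same shape as the parameter it belongs to, which is what lets an optimizer subtract a scaled copy of it from the weights.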


# 5 Research Directions

## 1. Bayesian Deep Learning (smarter backprop)

## 2. Spike-Timing-Dependent Plasticity

## 3. Deep Reinforcement Learning

## 4. Evolutionary Strategies

## 5. Better Hardware

![](images/karpathy.png)