# From Neurons to Neural Networks

### Introduction

In the previous lessons, we learned about the hypothesis function and training procedure for a single neuron.  Here are a couple of diagrams to jog your memory.

This is an illustration of the hypothesis function for a single neuron.

<img src="sigmoid-neuron.png" width="40%">

And this is an illustration of our training procedure, gradient descent.

<img src="./cost-curve-slopes.png" width="40%">

Using the above hypothesis function and training procedure, we found a hypothesis function that could predict whether or not cells were cancerous by looking at a single feature: the average cell area in a photo of cells.  Unfortunately, a single neuron falls short when we try to use it for even basic image recognition, like identifying handwritten digits.

<img src="mnist.png" width="30%">

### More Neurons

So how will we solve a problem like identifying our handwritten digits above?  Well, by using more neurons.

<img src="./mit-neurons.jpg" width="50%">

But, of course, it's a little more complicated than that.

Take a look at a representation of an artificial neural network below.  The blue dots in the diagram represent different neurons, and the lines are the connections between the neurons.

<img src="./artificial-network.png" width="50%">

The takeaway from the diagram above is that our neural network operates in layers.  And as we can see, there are multiple neurons in each layer.

What's the point of the layers?  Well, let's consider the problem of using our neural network to identify handwritten digits.  In this case, we could imagine that the first layer just identifies the regions where a number is drawn.  Then perhaps the second layer identifies whether there are horizontal lines, vertical lines, or curved lines in the image.  And then the third layer could use this information to associate the image with a single digit.

<img src="./mnist.png" width="30%">

In other words, information flows from one layer to the next, and each layer is responsible for making a determination at a higher level of abstraction than the previous layer.  Finally, the last layer makes the final determination, like which digit an image represents.
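To make the flow from layer to layer concrete, here is a minimal sketch in plain Python.  It is only an illustration under stated assumptions: the helper names `layer` and `random_layer`, the tiny layer sizes (4 inputs, 3 hidden neurons, 2 output neurons), and the random weights are all made up for this example and are not part of the lesson's network.

```python
import math
import random


def sigmoid(z):
    """The same squashing function used by our single neuron."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """One layer: each neuron takes a weighted sum of ALL inputs, then applies sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


def random_layer(n_inputs, n_neurons, rng):
    """Hypothetical helper: random (untrained) weights and biases for one layer."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_neurons)]
    biases = [rng.uniform(-1, 1) for _ in range(n_neurons)]
    return weights, biases


rng = random.Random(0)
pixels = [rng.random() for _ in range(4)]  # stand-in for a tiny 4-pixel image

w1, b1 = random_layer(4, 3, rng)  # first layer: 4 inputs -> 3 neurons
w2, b2 = random_layer(3, 2, rng)  # second layer: 3 inputs -> 2 neurons

hidden = layer(pixels, w1, b1)   # the first layer sees the raw pixels
output = layer(hidden, w2, b2)   # the second layer sees only the first layer's outputs

print(hidden)
print(output)
```

Notice that the second layer never looks at the pixels directly.  It only works with what the first layer hands it, which is the sense in which information flows from one layer to the next.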

Building a neural network with multiple layers, each containing multiple neurons, looks pretty similar to what we saw previously:

```python
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(784, 64),   # first layer: 784 inputs feed 64 neurons
    nn.Sigmoid(),
    nn.Linear(64, 10),    # second layer: 64 inputs feed 10 neurons
    nn.Softmax(dim=1),
)

net
```

```
Sequential(
  (0): Linear(in_features=784, out_features=64, bias=True)
  (1): Sigmoid()
  (2): Linear(in_features=64, out_features=10, bias=True)
  (3): Softmax(dim=1)
)
```
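As a quick check of what a network like this produces, here is a small sketch that passes data through it.  The input is random noise standing in for two flattened images (an assumption for illustration, not real digit data), and the variable names `fake_images` and `predictions` are ours.

```python
import torch
import torch.nn as nn

# Same architecture as above, rebuilt so this snippet runs on its own.
net = nn.Sequential(
    nn.Linear(784, 64),
    nn.Sigmoid(),
    nn.Linear(64, 10),
    nn.Softmax(dim=1),
)

# Two random stand-in "images", each flattened to 784 values (hypothetical data).
fake_images = torch.rand(2, 784)

# no_grad: we are only predicting here, not training.
with torch.no_grad():
    predictions = net(fake_images)

print(predictions.shape)       # one row per image, ten values per row
print(predictions.sum(dim=1))  # softmax makes each row sum to (approximately) 1
```

Because the weights are untrained, the ten values in each row are not meaningful yet.  The point is only the shape: 784 numbers go in, and ten numbers come out.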

But underneath, each layer now has many different parameters.

```python
net[0].weight
```

```
Parameter containing:
tensor([[-0.0045,  0.0154,  0.0091,  ...,  0.0069,  0.0312, -0.0038],
        [ 0.0217, -0.0055, -0.0326,  ...,  0.0126, -0.0239,  0.0093],
        [-0.0134, -0.0300,  0.0282,  ..., -0.0345,  0.0124, -0.0024],
        ...,
        [ 0.0099, -0.0183,  0.0087,  ..., -0.0331,  0.0355,  0.0346],
        [ 0.0336, -0.0242,  0.0086,  ...,  0.0208,  0.0261, -0.0288],
        [ 0.0056,  0.0161, -0.0326,  ..., -0.0190, -0.0087, -0.0329]],
       requires_grad=True)
```

And remember, that is just one of multiple layers in our neural network above.  So how does something like this work in code, or mathematically?
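To get a feel for the scale involved, we can tally the parameters by hand.  The sketch below assumes each linear layer holds one weight per input for every neuron plus one bias per neuron, which matches the `in_features`, `out_features`, and `bias=True` shown in the printout above.  The function name `count_parameters` is ours, chosen for illustration.

```python
# Layer widths taken from the network above: 784 inputs -> 64 neurons -> 10 neurons.
layer_sizes = [784, 64, 10]


def count_parameters(sizes):
    """Sum weights and biases over each pair of consecutive layer widths."""
    total = 0
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = n_in * n_out  # one weight per input, for every neuron
        biases = n_out          # one bias per neuron
        print(f"{n_in} -> {n_out}: {weights} weights + {biases} biases")
        total += weights + biases
    return total


total = count_parameters(layer_sizes)
print(total)  # 50890 for the sizes above
```

So even this small network has tens of thousands of numbers to learn, compared with the handful we trained for a single neuron.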

Over the next four lessons, we'll learn more about what it means for a neural network to operate with layers of neurons.  Specifically, we'll learn how we can use Python and some math to understand our neural network.

### Resources

[Visualizing Neural Nets](https://ml4a.github.io/ml4a/looking_inside_neural_nets/)