# PyTorch Prediction Function

### Reviewing our Hypothesis Function

So far, we have learned about a neuron's hypothesis function.  Let's start by reviewing what we learned so far.

We saw that we represented our neuron as taking in inputs, and based on those inputs, combined with the weights and bias term -- the neuron fired or not. 

<img src="neuron-general-2.png" width="40%">

We saw that this hypothesis function really consists of two components: the linear layer, which can return any positive or negative number, and our activation function, which translates that output to a number between 0 and 1 (to represent firing or not).

> Mathematically, we represented the neuron's linear function and activation function as the following:

$z(x) = w_1x_1 + w_2x_2 + b = w \cdot x + b $

$ \sigma(z) = \frac{1}{1 + e^{-z}} $

And we can represent this as code like so:  

In [3]:
import torch  # needed by every cell in this section

def linear_layer(x):
    return w.dot(x) + b  # w and b are the globals defined below

In [4]:
def sigmoid_activation(z):
    return 1/(1 + torch.exp(-z.float()))

Then we can define some initial weights and a bias.

In [8]:
# weight cell_area 2, weight for cell_concavities 1
w = torch.tensor([2., 1.])
b = torch.tensor(-10.)

And pass through a vector $x$, which represents the features of an observation.

In [9]:
import torch
# cell area is 2, and cell concavities is 4
x = torch.tensor([2., 4.])

So to get the output from the linear layer, we pass our data through the linear layer.

In [10]:
z = linear_layer(x)
z

tensor(-2.)

And then pass that output to the activation layer, which returns a value between 0 and 1.

In [11]:
sigmoid_activation(z)

tensor(0.1192)
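As a sanity check, the same two steps can be worked through by hand. The sketch below uses only Python's standard library and reuses the weights, bias, and feature values from the cells above; it is meant to make the arithmetic visible, not to replace the tensor code.

```python
import math

# Values taken from the cells above.
w = [2.0, 1.0]   # weights for cell_area and cell_concavities
b = -10.0        # bias term
x = [2.0, 4.0]   # features of one observation

# Linear layer: z = w1*x1 + w2*x2 + b
z = w[0] * x[0] + w[1] * x[1] + b

# Sigmoid activation: 1 / (1 + e^(-z))
prediction = 1 / (1 + math.exp(-z))

print(z)                     # -2.0
print(round(prediction, 4))  # 0.1192
```

Both numbers agree with the tensor outputs shown above.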

### Making it Real

Ok, so above we reviewed our prediction function for our neuron.  Now below, let's re-implement that same prediction function -- the linear layer and the sigmoid layer.  But this time we'll use all of the tools that a PyTorch professional would use.

Here's how we do it.

In [14]:
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(2, 1),
    nn.Sigmoid()
)

net

Sequential(
  (0): Linear(in_features=2, out_features=1, bias=True)
  (1): Sigmoid()
)

> So above, we created a neural network in PyTorch.  It's a pretty simple neural network, as it only consists of a single neuron that takes in two features.  We'll eventually build a more complex neural network, but for now this is fine.

To have the neural network above make a prediction, we just pass through a feature vector like we did above.

In [15]:
# cell area is 2, and cell concavities is 4
x = torch.tensor([2., 4.])

So $x$ represents the features of a single observation.  And we can see our neural network's predictions with the following:

In [16]:
net(x)

tensor([0.0642], grad_fn=<SigmoidBackward>)

> So just like before, the linear layer outputs a positive or negative number, which our sigmoid activation translates to a number between 0 and 1.

We'll break down the code above in a moment, but for now notice it largely consists of what we saw above: `nn.Linear` represents our linear layer and `nn.Sigmoid` represents our sigmoid activation function, and the `nn.Sequential` simply creates a neural network that passes the output from the linear layer to the sigmoid function, just like we saw above.  

$z(x) = w \cdot x + b$

$ \sigma(z) = \frac{1}{1 + e^{-z}} $
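To make the "pass the output of one layer to the next" idea concrete, here is a minimal sketch that applies the two layers one at a time and compares the result with calling the network directly. The variable names `linear`, `sigmoid`, and `by_hand` are chosen for this example only.

```python
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(2, 1),
    nn.Sigmoid()
)
x = torch.tensor([2., 4.])

# nn.Sequential supports integer indexing into its layers.
linear, sigmoid = net[0], net[1]

z = linear(x)         # step 1: w . x + b
by_hand = sigmoid(z)  # step 2: squash z into the range (0, 1)

# Calling net(x) runs the same two steps in order.
print(torch.allclose(net(x), by_hand))  # True
```

The weights are random, so the prediction itself changes from run to run, but the two ways of computing it should always agree.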

### Understanding the Components

Ok, so we just saw how we can create a neural network in PyTorch.

In [43]:
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(2, 1),
    nn.Sigmoid()
)

net

Sequential(
  (0): Linear(in_features=2, out_features=1, bias=True)
  (1): Sigmoid()
)

Now let's understand these components a bit better.

The first, and main, component to understand is `nn.Linear`, which creates a linear layer.  Here it represents the linear layer of a single neuron.

In the linear layer above, we passed the arguments `(2, 1)`.  The `2` tells the linear layer to create a weight vector with two weights, one per input feature, and the `1` specifies that we want to create a single neuron.

In [17]:
ll = nn.Linear(2, 1)

We can see what our linear layer looks like under the hood by inspecting its `_parameters` attribute.

In [23]:
ll._parameters

OrderedDict([('weight',
              Parameter containing:
              tensor([[-0.2562,  0.5871]], requires_grad=True)),
             ('bias',
              Parameter containing:
              tensor([-0.5983], requires_grad=True))])

So just like the linear function we defined above, here we have a tensor, our weight vector, and a bias term.

In [21]:
ll.weight

Parameter containing:
tensor([[-0.2562,  0.5871]], requires_grad=True)

In [22]:
ll.bias

Parameter containing:
tensor([-0.5983], requires_grad=True)

Why is the weight vector of length 2?  Because when we created our linear layer, we specified a `2` -- which created a weight vector of length 2.

$z(x) = w_1x_1 + w_2x_2 + b$
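A short sketch can confirm how the arguments map onto the stored parameters. It checks the shapes of `weight` and `bias`, then recomputes $z(x)$ by hand from those parameters. The name `z_by_hand` and the tolerance passed to `torch.allclose` are choices made for this example.

```python
import torch
import torch.nn as nn

ll = nn.Linear(2, 1)
x = torch.tensor([2., 4.])

print(ll.weight.shape)  # torch.Size([1, 2]): one neuron, two weights
print(ll.bias.shape)    # torch.Size([1]): one bias for that neuron

# Pull out the single weight vector and bias, then apply z = w . x + b by hand.
w = ll.weight[0]
b = ll.bias[0]
z_by_hand = w.dot(x) + b

# A small absolute tolerance allows for float32 rounding differences.
print(torch.allclose(ll(x)[0], z_by_hand, atol=1e-6))  # True
```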

### Creating Random Weights

Now where did those numbers come from?  Well they were randomly created.  So this means that each time we create the linear layer, new parameters would be created.  You can check this by repeatedly pressing `shift + return` on the cell below.

In [33]:
ll = nn.Linear(2, 1)
ll.weight

Parameter containing:
tensor([[-0.6047,  0.2253]], requires_grad=True)

> Notice the weights change.

Now this is largely ok, because we'll soon see that we can teach our neural network to learn the correct weights.  But sometimes, for teaching purposes, it's easier if everyone is working with the same weights.  So to remove this randomness, we'll sometimes call `torch.manual_seed` just before creating the neural network or layer.  As long as we pass the same number into `manual_seed`, the same weights will show up each time.  Let's see this.

In [34]:
torch.manual_seed(5)
ll = nn.Linear(2, 1)

ll.weight

Parameter containing:
tensor([[ 0.4670, -0.5288]], requires_grad=True)

In [35]:
torch.manual_seed(5)
ll = nn.Linear(2, 1)

ll.weight

Parameter containing:
tensor([[ 0.4670, -0.5288]], requires_grad=True)

See, this time we ensured the same initial weights show up each time.
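Rather than comparing printed weights by eye, the check can be done in code. The sketch below wraps the seed-then-create pattern in a helper called `make_layer`, which is a hypothetical name used only for this illustration.

```python
import torch
import torch.nn as nn

def make_layer(seed):
    # Hypothetical helper: seed the random number generator, then build the layer.
    torch.manual_seed(seed)
    return nn.Linear(2, 1)

first = make_layer(5)
second = make_layer(5)

# Same seed immediately before creation, so the parameters match exactly.
print(torch.equal(first.weight, second.weight))  # True
print(torch.equal(first.bias, second.bias))      # True
```

Note that the seed has to be set right before each layer is created. Creating two layers after a single call to `manual_seed` would give the second layer different weights, because the first layer has already consumed random numbers.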

### Creating Multiple Neurons

So we just saw the different components of a single neuron with a weight vector of length 2.  We did so with `nn.Linear(2, 1)`.  But eventually, we'll want to create a neural network with more than one neuron.  So what does it even mean to create a neural network with more than one neuron?  Well, as we'll see later, each neuron we create will get its own weight vector and bias term.  So let's create a linear layer with 3 neurons, each with a weight vector of length 2.

In [49]:
ll_3 = nn.Linear(2, 3) # 3 neurons of length 2

In [50]:
ll_3._parameters

OrderedDict([('weight',
              Parameter containing:
              tensor([[ 0.6720, -0.5793],
                      [ 0.0386,  0.2537],
                      [-0.3339, -0.1547]], requires_grad=True)),
             ('bias',
              Parameter containing:
              tensor([-0.4722, -0.3343, -0.6446], requires_grad=True))])

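So it looks like we now have three weight vectors each of length 2, one row per neuron.  (The two cells below evidently come from a different run than the one above, so their random values differ; the shapes are what matter here.)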

In [40]:
ll_3.weight

Parameter containing:
tensor([[-0.6047,  0.2253],
        [ 0.3041,  0.1122],
        [ 0.6801,  0.2124]], requires_grad=True)

And three bias terms.

In [41]:
ll_3.bias

Parameter containing:
tensor([-0.6270,  0.5941,  0.2401], requires_grad=True)
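As a small preview, and only as a sketch, we can check that each of the three outputs comes from its own row of weights and its own bias term. The loop variable `z_i` and the tolerance are choices made for this example.

```python
import torch
import torch.nn as nn

ll_3 = nn.Linear(2, 3)  # 3 neurons, each with 2 weights
x = torch.tensor([2., 4.])

out = ll_3(x)
print(out.shape)  # torch.Size([3]): one output per neuron

# Recompute each neuron's output from its own weight row and bias.
for i in range(3):
    z_i = ll_3.weight[i].dot(x) + ll_3.bias[i]
    # A small absolute tolerance allows for float32 rounding differences.
    print(torch.allclose(out[i], z_i, atol=1e-6))  # True for every neuron
```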

Now we'll talk about working with multiple neurons later on.  But hopefully the above allows us to understand what `nn.Linear(2, 1)` does -- it created a single weight vector of length 2 representing our single neuron.

In [42]:
ll = nn.Linear(2, 1)
ll._parameters

OrderedDict([('weight',
              Parameter containing:
              tensor([[-0.3373, -0.6495]], requires_grad=True)),
             ('bias',
              Parameter containing:
              tensor([0.4031], requires_grad=True))])

In other words, the code above simply created our weight vector and our bias.  And if we pass through a feature vector, it will apply our linear function of $z(x) = w \cdot x + b$ 

In [51]:
x

tensor([2., 4.])

In [52]:
ll(x)

tensor([-2.8697], grad_fn=<AddBackward0>)

There we go.

Finally, the sigmoid activation takes the output from our linear function and translates it to a number between 0 and 1.

In [47]:
sigmoid = nn.Sigmoid()

In [48]:
z = ll(x)
sigmoid(z)

tensor([0.0537], grad_fn=<SigmoidBackward>)
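To connect `nn.Sigmoid` back to the formula $\sigma(z) = \frac{1}{1 + e^{-z}}$, the sketch below applies both to the same input. The value of `z` is copied from the linear layer output shown earlier, and it is entered as a plain tensor, so no `grad_fn` appears in the output.

```python
import torch
import torch.nn as nn

z = torch.tensor([-2.8697])  # linear-layer output copied from the cell above

sigmoid = nn.Sigmoid()
from_module = sigmoid(z)                # PyTorch's built-in sigmoid layer
from_formula = 1 / (1 + torch.exp(-z))  # the formula written out by hand

print(from_module)                                # tensor([0.0537])
print(torch.allclose(from_module, from_formula))  # True
```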

And then `nn.Sequential` packages up our two layers, and passes the output from one layer into the next layer.

In [61]:
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(2, 1),
    nn.Sigmoid()
)

net

Sequential(
  (0): Linear(in_features=2, out_features=1, bias=True)
  (1): Sigmoid()
)

So that's what it looks like to create a neural network with one neuron that takes in an observation with two features.
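One way to convince ourselves that this network is the same prediction function we reviewed at the start is to overwrite its random parameters with the hand-picked values from the review section (weights `[2., 1.]` and bias `-10.`). This is a sketch for checking the idea, not something we would normally do when training a network.

```python
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(2, 1),
    nn.Sigmoid()
)

# Overwrite the random parameters with the review section's values.
# torch.no_grad() keeps these in-place edits from being tracked by autograd.
with torch.no_grad():
    net[0].weight.copy_(torch.tensor([[2., 1.]]))  # shape (1, 2): one neuron, two weights
    net[0].bias.fill_(-10.)

x = torch.tensor([2., 4.])
prediction = net(x)
print(round(prediction.item(), 4))  # 0.1192
```

This matches the `0.1192` we computed earlier with `sigmoid_activation(z)`.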

### Summary

In this lesson, we saw how to create a neural network -- with a single neuron -- in PyTorch.

In [64]:
net = nn.Sequential(
    nn.Linear(2, 1),
    nn.Sigmoid()
)

We saw that with the linear layer, we specify the number of input features, and the number of neurons -- where each neuron consists of a weight vector and a bias term.

In [65]:
layer = nn.Linear(2, 1)
layer._parameters

OrderedDict([('weight',
              Parameter containing:
              tensor([[-0.2977,  0.1892]], requires_grad=True)),
             ('bias',
              Parameter containing:
              tensor([-0.3397], requires_grad=True))])

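And we saw that we can pass a feature vector to a linear layer, and it will apply the linear function $z(x) = w \cdot x + b$.  (Note that the cell below calls `ll`, a linear layer from an earlier cell, so its output reflects that layer's weights rather than those of `layer` printed above.)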

In [66]:
x

tensor([2., 4.])

In [67]:
ll(x)

tensor([0.1087], grad_fn=<AddBackward0>)

And finally, we saw that if we pass the feature vector to the neural net, it will pass the feature vector through the linear layer, and then pass that output to the sigmoid activation function to produce a prediction between 0 and 1 expressing the strength of the neuron firing.

In [68]:
net(x)

tensor([0.2172], grad_fn=<SigmoidBackward>)