# Using PyTorch Linear Layers

### Introduction

In the previous lesson, we learned about PyTorch tensors and saw that they support many of the same operations as NumPy arrays.  When constructing a neural network, we can use tensors to create our weight matrices and bias vectors:

In [15]:
import torch 

W1 = torch.randn(8, 15)
b1 = torch.randn(8)
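As a reminder of what these two tensors are for, here is a minimal sketch of applying them by hand. The observation `x` and the seed are assumptions made up for illustration; they are not part of the lesson's data.

```python
import torch

torch.manual_seed(0)      # seed chosen only to make the sketch repeatable

W1 = torch.randn(8, 15)   # weight matrix: 8 outputs, 15 input features
b1 = torch.randn(8)       # bias vector: one entry per output

x = torch.randn(15)       # a hypothetical observation with 15 features

z = W1 @ x + b1           # matrix-vector product, then add the bias
print(z.shape)            # torch.Size([8])
```

We have to keep track of `W1` and `b1` separately here, which is the bookkeeping the rest of the lesson removes.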

In this lesson, we'll see how we can use a Linear object to work with both a weight matrix and bias vector simultaneously. 

### Linear Layers

Previously, we initialized a linear layer of our neural network by creating a weight matrix and a bias vector ourselves.  Going forward, we'll use the `Linear` object instead.

In [16]:
from torch import nn
W1 = nn.Linear(15, 8)
W1

Linear(in_features=15, out_features=8, bias=True)

As we can see, we just initialized a linear layer that takes in 15 features, and outputs a vector of length 8 for each observation.
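The values printed in that representation can also be read off the layer as attributes. The sketch below uses a fresh layer named `layer` (a name chosen for illustration) and shows what the `bias=True` flag refers to by building a second layer without one.

```python
from torch import nn

layer = nn.Linear(15, 8)                 # 15 input features, 8 output features
print(layer.in_features)                 # 15
print(layer.out_features)                # 8

no_bias = nn.Linear(15, 8, bias=False)   # same shapes, but no bias vector
print(no_bias.bias is None)              # True
```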

This linear object contains both the weight matrix and bias vector that we saw before.  Let's take a look.

In [22]:
W1.weight, W1.bias

(Parameter containing:
 tensor([[ 0.0153, -0.2517,  0.0039,  0.2195,  0.0455, -0.0423,  0.1498, -0.0034,
           0.1061, -0.0602,  0.0272, -0.1746, -0.2355,  0.2385,  0.1183],
         [ 0.0416,  0.0378,  0.0023,  0.1865,  0.1587,  0.1989, -0.1091, -0.1594,
           0.1042, -0.1967,  0.0451, -0.2548,  0.0320, -0.0221, -0.2207],
         [-0.0980, -0.1517, -0.0681,  0.1166, -0.1324,  0.2280, -0.0969, -0.1797,
           0.0771, -0.1517, -0.0594,  0.0892, -0.1748, -0.2192,  0.0030],
         [-0.2560, -0.0397, -0.2001, -0.2446,  0.1498, -0.0142,  0.1464,  0.0254,
           0.0066,  0.1346,  0.0430, -0.0201,  0.1950, -0.0395, -0.0901],
         [ 0.1232,  0.2515, -0.2305,  0.0939, -0.2218,  0.0135, -0.2104,  0.2278,
          -0.2412, -0.2580,  0.0965,  0.0917,  0.0793, -0.0584,  0.1782],
         [ 0.1335,  0.0241, -0.0353, -0.0671,  0.1532, -0.2241, -0.2580, -0.0652,
           0.1999, -0.1292,  0.2364,  0.1334, -0.0924, -0.0818, -0.1487],
         [ 0.0585, -0.1632,  0.1573,  0.1

And if we want to pass an observation through this weight matrix and bias vector, we can do so with the matrix operations we saw previously.

In [19]:
X_1 = torch.randn(1, 15)
X_1

tensor([[-1.1694,  2.0995, -1.4558,  0.2007,  0.5987, -0.4140, -1.5270,  1.2951,
         -0.5993,  0.0120, -1.4823,  1.6493, -0.7202,  1.8320,  1.3714]])

In [34]:
W1.weight.shape, X_1.T.shape

(torch.Size([8, 15]), torch.Size([15, 1]))

In [35]:
W1.weight @ X_1.T
# the bias is not added here; append `+ W1.bias.view(-1, 1)` to include it

tensor([[-0.3203],
        [-0.8808],
        [-0.4242],
        [-0.0724],
        [ 1.4462],
        [-0.1124],
        [-1.9090],
        [-0.6849]], grad_fn=<MmBackward>)

Or we can just pass our data directly through our linear layer.

In [36]:
X_1

tensor([[-1.1694,  2.0995, -1.4558,  0.2007,  0.5987, -0.4140, -1.5270,  1.2951,
         -0.5993,  0.0120, -1.4823,  1.6493, -0.7202,  1.8320,  1.3714]])

In [37]:
Z1 = W1(X_1)
Z1

tensor([[-0.5513, -1.1182, -0.5681, -0.1491,  1.5722, -0.0787, -2.1548, -0.6259]],
       grad_fn=<AddmmBackward>)

So as we can see, this performs the calculations of our linear layer for us -- both the matrix multiplication and the addition of the bias vector.
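To convince ourselves of this, we can compare the layer's output against the same calculation written out by hand. This is a sketch with a freshly initialized layer and a made-up observation (the names `layer` and `x` and the seed are assumptions), so the numbers will differ from the ones above.

```python
import torch
from torch import nn

torch.manual_seed(0)                  # only so the sketch is repeatable

layer = nn.Linear(15, 8)
x = torch.randn(1, 15)                # one observation, stored as a row

direct = layer(x)                     # shape (1, 8)

# Row form: multiply by the transposed weights, then add the bias.
manual_row = x @ layer.weight.T + layer.bias

# Column form, as in the cell above, with the bias included.
manual_col = layer.weight @ x.T + layer.bias.view(-1, 1)   # shape (8, 1)

print(torch.allclose(direct, manual_row, atol=1e-6))
print(torch.allclose(direct, manual_col.T, atol=1e-6))
```

Both comparisons should print `True`, up to floating point tolerance. The column form gives the same numbers, just transposed.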

### Completing the Activation Layer

So at this point we know how to both initialize and pass data through a linear layer.

In [42]:
W1 = nn.Linear(15, 8)
W1

Linear(in_features=15, out_features=8, bias=True)

In [47]:
Z1 = W1(X_1)

Z1.data

tensor([[-1.5759,  0.7659, -0.0972,  0.0187,  0.9738, -0.1467,  0.2360, -0.0631]])

What about an activation layer?  Remember that an activation layer simply passes our linear outputs through a non-linear function, such as the sigmoid function.

In [48]:
import numpy as np

def sigmoid(z):
    return 1/(1 + np.exp(-z))

sigmoid(Z1.data)

tensor([[0.1714, 0.6826, 0.4757, 0.5047, 0.7259, 0.4634, 0.5587, 0.4842]])

Of course, PyTorch has a built-in function for this.

In [54]:
torch.sigmoid(Z1)

tensor([[0.1714, 0.6826, 0.4757, 0.5047, 0.7259, 0.4634, 0.5587, 0.4842]],
       grad_fn=<SigmoidBackward>)
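As a quick check that the hand-written formula and the built-in agree, here is a small sketch. It reuses the first three values of `Z1` printed above, writes the formula with `torch.exp` rather than NumPy, and also tries the module form `nn.Sigmoid`, which the lesson does not otherwise use.

```python
import torch
from torch import nn

# First three linear outputs from the Z1 shown above.
z = torch.tensor([[-1.5759, 0.7659, -0.0972]])

manual = 1 / (1 + torch.exp(-z))      # the sigmoid formula written out
builtin = torch.sigmoid(z)            # functional form
module = nn.Sigmoid()(z)              # module form of the same function

print(torch.allclose(manual, builtin))
print(torch.allclose(builtin, module))
```

Both lines should print `True`.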

### Summary

In this lesson we learned how to construct our linear and activation layers in PyTorch.  We saw that we can initialize a linear layer through the `Linear` constructor.

In [57]:
from torch.nn import Linear

W1 = Linear(15, 8)

And that our linear layer consists of both a weight matrix and a bias vector.

In [59]:
W1.weight.shape, W1.bias.shape

(torch.Size([8, 15]), torch.Size([8]))

And that to multiply our data by our weight matrix and then add our bias vector, we simply pass our data through the instance of the linear layer.

In [62]:
X_1 = torch.randn(1, 15)
W1(X_1)

tensor([[ 1.2126,  0.0518,  0.6054, -0.0200,  0.4632,  0.4092,  0.1312,  0.4435]],
       grad_fn=<AddmmBackward>)
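The same call works when we pass several observations at once. In the sketch below, `X_batch` is a hypothetical batch of four observations, and each row of the output is the linear output for the matching row of the input.

```python
import torch
from torch import nn

torch.manual_seed(0)                   # only so the sketch is repeatable

layer = nn.Linear(15, 8)
X_batch = torch.randn(4, 15)           # 4 observations, 15 features each

Z_batch = layer(X_batch)
print(Z_batch.shape)                   # torch.Size([4, 8])

# Passing the first observation on its own gives the first row of the batch output.
first = layer(X_batch[:1])
print(torch.allclose(first, Z_batch[:1], atol=1e-6))
```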

We then moved on to the activation layer, which simply applies a non-linear function to the linear outputs.  If we use the sigmoid function as our activation, PyTorch provides it as `torch.sigmoid`.

In [63]:
torch.sigmoid(Z1)

tensor([[0.1714, 0.6826, 0.4757, 0.5047, 0.7259, 0.4634, 0.5587, 0.4842]],
       grad_fn=<SigmoidBackward>)
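Putting the two pieces together, one way to sketch a full layer is a small helper that applies the linear step and then the sigmoid. The helper name `linear_then_sigmoid` is hypothetical, chosen for illustration, and is not a PyTorch function.

```python
import torch
from torch import nn

def linear_then_sigmoid(layer, X):
    """Apply a linear layer, then the sigmoid activation (illustrative helper)."""
    Z = layer(X)               # weights times inputs, plus bias
    return torch.sigmoid(Z)    # squash each output into the range (0, 1)

torch.manual_seed(0)           # only so the sketch is repeatable
W1 = nn.Linear(15, 8)
X_1 = torch.randn(1, 15)

A1 = linear_then_sigmoid(W1, X_1)
print(A1.shape)                # torch.Size([1, 8])
```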