# Numpy to Pytorch

### Introduction

In this lesson, we'll begin moving towards the Pytorch library.  We'll see how we can build linear layers in Pytorch as well as make use of non-linear layers.  Let's get started.

### Initializing a Model

Previously, we've initialized a new model with something like the following.

In [8]:
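import numpy as np

def init_model(n_features, neur_l1, neur_out):
    # Weights: random normal values scaled by 1 / sqrt(number of inputs to the layer)
    W1 = np.random.randn(n_features, neur_l1) / np.sqrt(n_features)
    # Biases: one per neuron, initialized to zero
    b1 = np.zeros((1, neur_l1))
    W2 = np.random.randn(neur_l1, neur_out) / np.sqrt(neur_l1)
    b2 = np.zeros((1, neur_out))
    model = {'W1': W1, 'b1': b1, 'W2': W2, 'b2': b2}
    return model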

We initialized a set of weight matrices (`W1` and `W2`) and bias vectors (`b1` and `b2`), which we later tuned through our training procedure.  Once these weights and biases are trained, we can use them to make a prediction with our model.

With Pytorch, we can store the same kind of weights in a Pytorch tensor instead of a numpy array.

In [37]:
import torch

W1 = torch.randn(8, 784)

In [40]:
W1

tensor([[-1.1513, -0.0768, -0.3163,  ..., -1.3084, -0.2952, -1.2212],
        [ 0.5460,  0.0042,  1.0023,  ...,  1.1528,  0.1197,  0.0430],
        [ 0.4627, -1.6576, -2.6115,  ...,  0.6369, -1.8958,  1.0517],
        ...,
        [ 0.4963, -0.0175,  0.3728,  ..., -2.6316,  0.1751,  0.4416],
        [ 0.6346,  1.1451,  0.7053,  ...,  0.6085,  0.0940, -1.5013],
        [-0.2043,  1.0081,  0.3093,  ..., -1.3962,  0.3015,  0.8347]])

In [39]:
W1.shape

torch.Size([8, 784])

As we can see, the Pytorch library provides us with much of the same interface as numpy.  We can initialize data through the `randn` function, the `randint` function, or by passing our own values to `torch.tensor`, just as we would with `np.array` in numpy.

In [41]:
torch.tensor([1, 2, 3])

tensor([1, 2, 3])
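As a minimal sketch of the constructors mentioned above, the following builds tensors with `randn`, `randint`, and `torch.tensor`, and also converts a numpy array to a tensor and back.  The variable names (`a` through `e`, `arr`) are only illustrative and do not appear elsewhere in this lesson.

```python
import numpy as np
import torch

# Standard-normal samples with shape (2, 3), like np.random.randn(2, 3)
a = torch.randn(2, 3)

# Random integers in [0, 10) with shape (2, 3); the size is passed as a tuple
b = torch.randint(0, 10, (2, 3))

# Build a tensor from a Python list, like np.array([1, 2, 3])
c = torch.tensor([1, 2, 3])

# Convert an existing numpy array to a tensor, and back again
arr = np.array([[1.0, 2.0], [3.0, 4.0]])
d = torch.from_numpy(arr)   # shares memory with arr
e = d.numpy()               # back to a numpy array

print(a.shape, b.shape, c.shape, d.dtype)
```

Note that `torch.from_numpy` keeps the numpy array's dtype, so `d` here holds 64-bit floats, while `torch.randn` produces 32-bit floats by default.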

And we can select our data by using the same bracket operators as we saw in numpy.

In [43]:
W1[:2, :3]

tensor([[-1.1513, -0.0768, -0.3163],
        [ 0.5460,  0.0042,  1.0023]])

The main difference between a Pytorch tensor and a numpy array that we need to worry about at this point is how we reshape our data.  In Pytorch, the method we'll use for changing the shape of the data is called `view` (tensors also have a `reshape` method, as in numpy).

In [47]:
W1.view(-1)

tensor([-1.1513, -0.0768, -0.3163,  ..., -1.3962,  0.3015,  0.8347])
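To make the `-1` argument more concrete, here is a small sketch on a 3 x 4 tensor.  The tensor `W` and the names `flat`, `wide`, and `same` are assumptions made for this example only.

```python
import torch

# Twelve values arranged as 3 rows and 4 columns
W = torch.arange(12).view(3, 4)

flat = W.view(-1)        # -1 asks torch to infer the size: shape (12,)
wide = W.view(2, -1)     # 2 rows, number of columns inferred: shape (2, 6)
same = W.reshape(4, 3)   # reshape accepts the same shape arguments

print(flat.shape, wide.shape, same.shape)
```

A `view` does not copy the data, so it needs a compatible memory layout; `reshape` returns a view when it can and copies the data otherwise.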

### Matrix Operations in Pytorch

For the purposes of neural networks, perhaps the most important thing to know is how to perform matrix algebra operations.  Remember that the whole point of initializing a weight matrix and a bias vector is to use them to calculate the linear output of each neuron.

Let's initialize some data.

In [55]:
import numpy as np

X = torch.randn(3, 784)
X.shape

torch.Size([3, 784])

In [67]:
W1.shape, X.T.shape

(torch.Size([8, 784]), torch.Size([784, 3]))

In [81]:
W1 @ X.T 


tensor([[ -8.4259, -11.8295,  23.4698],
        [-39.4412,  36.8929, -61.9721],
        [-23.1415,  42.7091,  25.7792],
        [ 33.3887, -11.9642,  13.4949],
        [ 22.8652,   8.9591,  -3.2612],
        [ -9.7971,  17.3862, -10.0517],
        [ 12.3032, -52.7073,  32.4848],
        [ 19.9870,  23.3461,  -0.5489]])

So we can see above that the result has shape (8, 3): each column holds the eight neuron outputs for one of our three observations, just like we saw before.

To include a bias, we initialize a column vector with eight rows, one bias per neuron.

In [82]:
b1 = torch.randn(8, 1)
b1

tensor([[-2.0904],
        [-1.2224],
        [-0.6987],
        [ 0.1317],
        [ 1.6392],
        [ 1.4074],
        [ 0.4002],
        [-0.1969]])

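And then we add it.  Because `b1` has shape (8, 1), it is broadcast across the three columns, so each neuron's bias is added to its output for every observation.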

In [87]:
Z1 = W1 @ X.T + b1
Z1

tensor([[-10.5163, -13.9199,  21.3794],
        [-40.6635,  35.6705, -63.1945],
        [-23.8402,  42.0104,  25.0805],
        [ 33.5204, -11.8325,  13.6266],
        [ 24.5044,  10.5983,  -1.6220],
        [ -8.3896,  18.7937,  -8.6443],
        [ 12.7034, -52.3072,  32.8849],
        [ 19.7900,  23.1491,  -0.7459]])
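The addition above works even though the two shapes differ, (8, 3) and (8, 1).  The sketch below, which re-creates random `W1`, `X`, and `b1` with the same shapes (so the values will not match the output above), checks that the broadcast sum equals the sum with the bias copied across the columns explicitly.  The name `Z1_explicit` is introduced only for this comparison.

```python
import torch

W1 = torch.randn(8, 784)   # 8 neurons, 784 features
X = torch.randn(3, 784)    # 3 observations
b1 = torch.randn(8, 1)     # one bias per neuron

# (8, 3) + (8, 1): the single bias column is broadcast across all 3 columns
Z1 = W1 @ X.T + b1

# The same sum with the bias column expanded to shape (8, 3) by hand
Z1_explicit = W1 @ X.T + b1.expand(8, 3)

print(Z1.shape)
print(torch.allclose(Z1, Z1_explicit))
```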

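So at this point, we can translate our `init_model` function to use Pytorch tensors.  Note that, as in the numpy version, the weight matrices here have shape (`n_features`, `neur_l1`), which is the transpose of the `W1` we worked with above.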

In [90]:
def init_model(n_features, neur_l1, neur_out):
    W1 = torch.randn(n_features, neur_l1) / np.sqrt(n_features)
    b1 = torch.zeros((1, neur_l1))
    W2 = torch.randn(neur_l1, neur_out) / np.sqrt(neur_l1)
    b2 = torch.zeros((1, neur_out))
    model = {'W1': W1, 'b1': b1, 'W2': W2, 'b2': b2}
    return model

In [91]:
init_model(5, 3, 2)

{'W1': tensor([[ 0.2883,  0.5991,  0.0466],
         [-0.5802, -0.3250, -0.8185],
         [ 0.4495, -0.1099,  0.4779],
         [ 0.0704, -0.6322,  0.9711],
         [ 0.7418, -0.4575, -0.1500]]),
 'b1': tensor([[0., 0., 0.]]),
 'W2': tensor([[-0.0307, -0.4393],
         [-0.5518, -0.3604],
         [-0.3678, -0.2677]]),
 'b2': tensor([[0., 0.]])}
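Since `init_model` stores its weights with shape (`n_features`, `neur_l1`), the linear output is computed with the observations on the left and no transpose.  The following is a minimal sketch under that assumption; the helper `linear_l1` and the layer sizes are hypothetical and chosen only for illustration.

```python
import numpy as np
import torch

def init_model(n_features, neur_l1, neur_out):
    # Same initialization as in the lesson
    W1 = torch.randn(n_features, neur_l1) / np.sqrt(n_features)
    b1 = torch.zeros((1, neur_l1))
    W2 = torch.randn(neur_l1, neur_out) / np.sqrt(neur_l1)
    b2 = torch.zeros((1, neur_out))
    return {'W1': W1, 'b1': b1, 'W2': W2, 'b2': b2}

def linear_l1(model, X):
    # Hypothetical helper: linear output of the first layer only.
    # (n_obs, n_features) @ (n_features, neur_l1) + (1, neur_l1)
    return X @ model['W1'] + model['b1']

model = init_model(784, 8, 10)
X = torch.randn(3, 784)      # 3 observations with 784 features each

Z1 = linear_l1(model, X)
print(Z1.shape)              # torch.Size([3, 8])
```

With this orientation each row of `Z1` corresponds to one observation, so the result is the transpose of the (8, 3) layout produced by `W1 @ X.T + b1` earlier.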

### Summary

In this lesson, we learned about initializing the weight matrices and bias vectors of our neural network with Pytorch tensors.  As we saw, Pytorch tensors are similar to numpy arrays both in their construction and the operations we can perform with them.

We can initialize a Pytorch tensor in much the same way as a numpy array.

In [94]:
import torch

W1 = torch.randn(8, 784)

In [95]:
W1.shape

torch.Size([8, 784])

We can use the `view` method to reshape the tensor.

In [96]:
W1.view(-1)

tensor([-0.2966, -0.7510,  0.9930,  ...,  1.9472, -0.3612,  0.0994])

And we can perform matrix algebra operations with our tensors to pass our data through a linear layer.

In [97]:
Z1 = W1 @ X.T + b1
Z1

tensor([[ 7.1926e+00,  2.8639e-03, -8.8633e+00],
        [ 4.1558e+01, -3.9941e+01, -3.0199e+01],
        [-1.3246e+01,  1.8456e+01,  2.3438e+01],
        [-3.0452e+01, -2.8885e+01, -2.9598e+01],
        [ 5.8378e+00, -4.2085e+01,  1.7439e+01],
        [-1.4112e+01, -2.5177e+00,  3.6517e+01],
        [-1.4192e+00, -3.1555e+01, -7.2901e+00],
        [ 2.5637e+01,  5.2545e+01,  1.7124e+01]])