# Intro
PyTorch is a powerful machine learning framework. Central to PyTorch are tensors, a generalization of matrices to higher ranks. One intuitive example of a tensor is an image with three color channels: a 3-channel (red, green, blue) image which is 64 pixels wide and 64 pixels tall is a 3×64×64 tensor. You can access the PyTorch framework by writing import torch near the top of your code, along with all of your other import statements.
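
As a quick illustration of the image example, the sketch below builds a placeholder tensor with that shape. The name image and the use of torch.zeros are assumptions made only for illustration; a real image would be loaded from a file.

```python
import torch

# A placeholder "image": 3 color channels, 64 pixels tall, 64 pixels wide.
# torch.zeros is used only to get a tensor of the right shape.
image = torch.zeros(3, 64, 64)

print(image.shape)   # torch.Size([3, 64, 64])
print(image.dim())   # 3 dimensions: channel, height, width
```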

This guide will help introduce you to the functionality of PyTorch, but don't worry too much about memorizing it: the assignments will link to relevant documentation where necessary.

In [2]:
import torch

# **Why PyTorch?**

---


One important question worth asking is: why is PyTorch being used for this course? There is a great breakdown by the Gradient looking at the state of machine learning frameworks today. In part, as highlighted by the article, PyTorch is generally more pythonic than alternative frameworks, easier to debug, and is the most-used framework in machine learning research by a large and growing margin. While PyTorch's primary alternative, TensorFlow, has attempted to integrate many of PyTorch's features, TensorFlow's implementations come with some inherent limitations highlighted in the article.

Notably, while PyTorch's industry usage has grown, TensorFlow is still (for now) a slight favorite in industry. In practice, the features that make PyTorch attractive for research also make it attractive for education, and the general trend of machine learning research and practice toward PyTorch makes it the more forward-looking choice.

In simple terms, a tensor is a way to store data as a multi-dimensional array (a generalization of a matrix) so that machine learning code can process it efficiently.

In [3]:
example_tensor = torch.Tensor(
    [
        [[1,2], [3,4]],
        [[5,6], [7,8]],
        [[9,0], [1,2]]
    ]
)

In [4]:
example_tensor

tensor([[[1., 2.],
         [3., 4.]],

        [[5., 6.],
         [7., 8.]],

        [[9., 0.],
         [1., 2.]]])

# Tensor Properties: Device

In [5]:
example_tensor.device

device(type='cpu')
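
A tensor can also be moved to another device. The following is a minimal sketch, assuming you want to use a GPU when one is available and fall back to the CPU otherwise; the names example and moved are hypothetical.

```python
import torch

# Pick a GPU if one is visible to PyTorch, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

example = torch.ones(2, 2)    # created on the CPU by default
moved = example.to(device)    # returns a tensor on the chosen device

print(example.device)  # cpu
print(moved.device)    # a cuda device if a GPU is available, otherwise cpu
```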

# Tensor Properties: Shape

In [6]:
example_tensor.shape

torch.Size([3, 2, 2])

In [7]:
n=1
example_tensor.shape[n-1] , example_tensor.shape[n], example_tensor.shape[n+1]

(3, 2, 2)

The cell above prints the size of each dimension: example_tensor.shape[n] is the size of dimension n, where n ranges from 0 to 2 for this 3-dimensional tensor.
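
A few related ways to inspect a tensor's shape are sketched below. The tensor t is a hypothetical stand-in with the same shape as example_tensor.

```python
import torch

t = torch.zeros(3, 2, 2)

print(t.shape)       # torch.Size([3, 2, 2])
print(t.shape[0])    # 3, size of the first dimension
print(t.size(1))     # 2, same information through the size() method
print(t.dim())       # 3, the number of dimensions
print(t.numel())     # 12, the total number of elements (3 * 2 * 2)
```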

# Indexing Tensors

To access a specific element along the first dimension, write example_tensor[n], where n ranges from 0 to example_tensor.shape[0] - 1 (here, 0 to 2).

In [8]:
example_tensor[1]

tensor([[5., 6.],
        [7., 8.]])

In [9]:
example_tensor[0]

tensor([[1., 2.],
        [3., 4.]])

In [10]:
example_tensor[2]

tensor([[9., 0.],
        [1., 2.]])

In addition, if you want to access the j-th dimension of the i-th example, you can write example_tensor[i, j]. Adding a third index, as in example_tensor[i, j, k], selects a single element:

In [11]:
example_tensor[2,0,0]


tensor(9.)

In [12]:
example_tensor[2,0,1]


tensor(0.)

In [13]:
example_tensor[2,1,0]


tensor(1.)

In [14]:
example_tensor[2,1,1]

tensor(2.)


Note that if you'd like to get a Python scalar value from a tensor, you can use example_scalar.item()

In [15]:
example_tensor[2,0,0].item()

9.0
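
To make the difference between a zero-dimensional tensor and a Python scalar concrete, here is a small sketch. The names t, element, and value are hypothetical.

```python
import torch

t = torch.Tensor([[1, 2], [3, 4]])

element = t[1, 0]        # still a tensor, with zero dimensions
value = element.item()   # a plain Python float

print(type(element), element.dim())
print(type(value), value)   # <class 'float'> 3.0
print(t.tolist())           # nested Python lists: [[1.0, 2.0], [3.0, 4.0]]
```

Note that item() only works on tensors holding a single value; tolist() is the counterpart for tensors with more elements.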

In addition, you can index into the ith element of a column by using x[:, i]. For example, if you want the top-right element of each matrix in example_tensor, which is the 0, 1 element of each matrix, you can write example_tensor[:, 0, 1]:

In [16]:
example_tensor

tensor([[[1., 2.],
         [3., 4.]],

        [[5., 6.],
         [7., 8.]],

        [[9., 0.],
         [1., 2.]]])

In [17]:
example_tensor[:,0,1]

tensor([2., 6., 0.])
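
A few more slicing patterns are sketched below, using a hypothetical tensor t with the same values as example_tensor. The colon keeps every index along that dimension.

```python
import torch

t = torch.Tensor([
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
    [[9, 0], [1, 2]],
])

top_left = t[:, 0, 0]     # element (0, 0) of every matrix
first_rows = t[:, 0]      # first row of every matrix, shape (3, 2)
first_two = t[:2]         # first two matrices, shape (2, 2, 2)

print(top_left)           # tensor([1., 5., 9.])
print(first_rows.shape)   # torch.Size([3, 2])
print(first_two.shape)    # torch.Size([2, 2, 2])
```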

# Initializing Tensors
Just as there are several ways to initialize an ordinary variable (for example n = 0 or n = 1), there are several ways to initialize a tensor. Common ones include:
1. torch.ones_like() -> creates a tensor of all ones with the same shape and device as another tensor
2. torch.zeros_like() -> creates a tensor of all zeros with the same shape and device as another tensor
3. torch.randn() -> creates a tensor of the given shape filled with samples from a standard normal distribution

In [18]:
torch.zeros_like(example_tensor)

tensor([[[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]]])

In [19]:
torch.ones_like(example_tensor)

tensor([[[1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.]]])

In [20]:
torch.randn(2, 2, device='cpu') # Alternatively, for a GPU tensor, you'd use device='cuda'

tensor([[ 0.2093, -1.2174],
        [ 0.1111, -1.1797]])
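
The initializers can be compared side by side, as in the sketch below. The reference tensor and the seed value are assumptions for illustration; torch.randn_like is the shape-matching counterpart of torch.randn.

```python
import torch

torch.manual_seed(0)  # makes the random values repeatable on a given setup

reference = torch.zeros(3, 2, 2)

zeros = torch.zeros_like(reference)   # same shape, dtype, and device as reference
ones = torch.ones_like(reference)
noise = torch.randn_like(reference)   # standard normal samples, same shape

print(zeros.shape, ones.shape, noise.shape)
print(ones.sum())    # tensor(12.), one per element
```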

# Basic Functions
There are a number of basic functions that you should know to use PyTorch. If you're familiar with NumPy, most commonly used functions exist in PyTorch, usually with the same name. You can perform element-wise multiplication or division by a scalar c by writing c * example_tensor or example_tensor / c, and element-wise addition or subtraction by a scalar by writing example_tensor + c or example_tensor - c.

In [21]:
(example_tensor -5) *2

tensor([[[ -8.,  -6.],
         [ -4.,  -2.]],

        [[  0.,   2.],
         [  4.,   6.]],

        [[  8., -10.],
         [ -8.,  -6.]]])
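
Each scalar operation can be tried separately on a small tensor, as sketched below. The tensor t and the scalar c are hypothetical; the comments list the resulting values rather than the exact printed layout.

```python
import torch

t = torch.Tensor([[1, 2], [3, 4]])
c = 2

print(c * t)     # values: [[2, 4], [6, 8]]
print(t / c)     # values: [[0.5, 1.0], [1.5, 2.0]]
print(t + c)     # values: [[3, 4], [5, 6]]
print(t - c)     # values: [[-1, 0], [1, 2]]
print(t * t)     # element-wise product of two tensors: [[1, 4], [9, 16]]
```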


You can calculate the mean or standard deviation of a tensor using example_tensor.mean() or example_tensor.std().

In [22]:
print("Mean:", example_tensor.mean())
print("Stdev:", example_tensor.std())

Mean: tensor(4.)
Stdev: tensor(2.9848)


In [23]:
example_tensor.mean(0) # for a specific dimension

tensor([[5.0000, 2.6667],
        [3.6667, 4.6667]])
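
The sketch below shows how the dimension argument changes the result, and checks that std() uses the sample standard deviation (dividing by N - 1) by default. The tensor t and the manual computation are illustrative only.

```python
import torch

t = torch.Tensor([[1, 2], [3, 4]])

print(t.mean())                        # tensor(2.5000), mean over all elements
print(t.mean(0))                       # tensor([2., 3.]), mean down each column
print(t.mean(1))                       # tensor([1.5000, 3.5000]), mean across each row
print(t.mean(0, keepdim=True).shape)   # torch.Size([1, 2])

# By default std() applies Bessel's correction (divides by N - 1).
manual_std = ((t - t.mean()) ** 2).sum().div(t.numel() - 1).sqrt()
print(torch.allclose(t.std(), manual_std))   # True
```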

# PyTorch Neural Network Module (torch.nn)
PyTorch has a lot of powerful classes in its torch.nn module (usually imported simply as nn). These classes allow you to create a new function which transforms a tensor in a specific way, often retaining information when called multiple times.

In [24]:

import torch.nn as nn

# nn.Linear

To create a linear layer, you need to pass it the number of input dimensions and the number of output dimensions. The linear object initialized as nn.Linear(10, 2) will take in an n×10 matrix and return an n×2 matrix, where all n elements have had the same linear transformation performed. For example, you can initialize a linear layer which performs the operation Ax+b, where A and b are initialized randomly when you generate the nn.Linear() object.

In [27]:
linear = nn.Linear(10,2)
example_input = torch.randn(3,10)
print(example_input)
print(linear)
example_output = linear(example_input)
example_output

tensor([[-0.2082,  0.7283,  2.2764, -0.5144,  2.1596, -0.5977,  0.4211, -0.4971,
         -0.3024,  1.4633],
        [ 0.0695, -0.3264, -0.7732, -0.1609, -3.0675, -0.5476,  0.0873, -1.1022,
          1.5734, -0.4639],
        [-2.2093,  1.6308,  0.5078, -0.4286, -0.9090, -0.0185, -1.7203,  0.9237,
          1.6211, -0.7965]])
Linear(in_features=10, out_features=2, bias=True)


tensor([[-1.0690, -0.4122],
        [ 0.8470, -0.7256],
        [ 0.2006, -0.0790]], grad_fn=<AddmmBackward0>)
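
To see that the layer really applies Ax+b to each row, you can reproduce its output by hand from its weight and bias attributes. This is a minimal sketch; the seed and the names x, manual, and builtin are assumptions for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

linear = nn.Linear(10, 2)
x = torch.randn(3, 10)

# weight has shape (out_features, in_features) = (2, 10); bias has shape (2,)
manual = x @ linear.weight.T + linear.bias
builtin = linear(x)

print(linear.weight.shape, linear.bias.shape)
print(torch.allclose(manual, builtin, atol=1e-5))   # True
```

Because the inputs are stored as rows, the matrix A appears transposed in the manual computation.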

# nn.ReLU
nn.ReLU() will create an object that, when receiving a tensor, will perform a ReLU activation function. This will be reviewed further in lecture, but in essence, a ReLU non-linearity sets all negative numbers in a tensor to zero. In general, the simplest neural networks are composed of a series of linear transformations, each followed by an activation function.

In [28]:
relu = nn.ReLU()
relu_output = relu(example_output)
relu_output

tensor([[0.0000, 0.0000],
        [0.8470, 0.0000],
        [0.2006, 0.0000]], grad_fn=<ReluBackward0>)
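
The effect of ReLU is easy to check on hand-picked values. The sketch below uses a hypothetical input x and compares the result with an element-wise clamp at zero, which computes the same thing.

```python
import torch
import torch.nn as nn

relu = nn.ReLU()
x = torch.Tensor([[-1.5, 0.0, 2.0], [3.0, -0.1, -4.0]])

out = relu(x)
print(out)    # negatives become 0: [[0, 0, 2], [3, 0, 0]]

# The same result via an element-wise clamp at zero.
print(torch.equal(out, torch.clamp(x, min=0)))   # True
```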

# nn.BatchNorm1d
nn.BatchNorm1d is a normalization technique that will rescale a batch of n inputs to have a consistent mean and standard deviation between batches.

As indicated by the 1d in its name, this is for situations where you expect a set of inputs, where each of them is a flat list of numbers. In other words, each input is a vector, not a matrix or higher-dimensional tensor. For a set of images, each of which is a higher-dimensional tensor, you'd use nn.BatchNorm2d, discussed later on this page.

nn.BatchNorm1d takes an argument of the number of input dimensions of each object in the batch (the size of each example vector).

In [31]:
batchnorm = nn.BatchNorm1d(2)
batchnorm_output = batchnorm(relu_output)
batchnorm_output

tensor([[-0.5624,  0.0000],
        [ 1.4049,  0.0000],
        [-0.8425,  0.0000]], grad_fn=<NativeBatchNormBackward0>)
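
The rescaling can be reproduced by hand. The sketch below assumes a freshly created layer in training mode (so batch statistics are used and the learnable scale and shift are still 1 and 0); the input x and the seed are hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

batchnorm = nn.BatchNorm1d(2)     # a new module starts in training mode
x = torch.randn(4, 2) * 3 + 5     # batch of 4 vectors with 2 features each

out = batchnorm(x)

# Manual version: per-feature batch mean and biased variance, plus eps.
mean = x.mean(0)
var = x.var(0, unbiased=False)
manual = (x - mean) / torch.sqrt(var + batchnorm.eps)

print(torch.allclose(out, manual, atol=1e-5))   # True
print(out.mean(0))                  # close to 0 for each feature
print(out.std(0, unbiased=False))   # close to 1 for each feature
```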

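# nn.Sequential
nn.Sequential creates a single operation that performs a sequence of operations. For example, you can write a neural network layer with batch normalization as:

In [32]: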
mlp_layer = nn.Sequential(
    nn.Linear(5, 2),
    nn.BatchNorm1d(2),
    nn.ReLU()
)

test_example = torch.randn(5,5) + 1
print("input: ")
print(test_example)
print("output: ")
print(mlp_layer(test_example))

input: 
tensor([[ 1.3337, -0.5931,  0.6257,  0.9361, -0.0082],
        [ 2.2895,  1.6730,  0.0726, -1.7070,  2.4218],
        [ 0.9048,  2.3590,  2.4978, -0.2834,  0.3359],
        [ 1.9263, -0.2548, -0.4634,  2.3502,  2.1302],
        [ 1.5787,  1.0540,  1.2046,  0.6980, -0.8799]])
output: 
tensor([[0.0000, 0.0626],
        [1.5731, 0.0000],
        [0.0000, 0.0000],
        [0.7319, 1.3158],
        [0.0000, 0.7405]], grad_fn=<ReluBackward0>)
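
Calling the combined layer is equivalent to calling its sub-modules one after another. The sketch below checks this on a hypothetical layer with the same structure as mlp_layer; the names layer, x, and step_by_step are assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

layer = nn.Sequential(
    nn.Linear(5, 2),
    nn.BatchNorm1d(2),
    nn.ReLU(),
)

x = torch.randn(4, 5)

# Apply the three sub-modules one at a time, in order.
step_by_step = x
for module in layer:
    step_by_step = module(step_by_step)

print(torch.allclose(layer(x), step_by_step))   # True
print(layer[0])   # Linear(in_features=5, out_features=2, bias=True)
```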


# Optimizers
To create an optimizer in PyTorch, you'll need to use the torch.optim module, often imported as optim. optim.Adam corresponds to the Adam optimizer. To create an optimizer object, you'll need to pass it the parameters to be optimized and the learning rate, lr, as well as any other parameters specific to the optimizer.

In [33]:
import torch.optim as optim
adam_opt = optim.Adam(mlp_layer.parameters(), lr=1e-1)
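
As an illustration of optimizer-specific parameters, the sketch below creates two optimizers for a hypothetical stand-in model. The learning rates and momentum value are arbitrary choices, not recommendations.

```python
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(5, 2)   # hypothetical stand-in model

# Adam with its beta coefficients set explicitly (these are its defaults).
adam = optim.Adam(model.parameters(), lr=1e-1, betas=(0.9, 0.999))

# SGD with momentum, an optimizer-specific parameter.
sgd = optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

print(adam.param_groups[0]["lr"])        # 0.1
print(sgd.param_groups[0]["momentum"])   # 0.9
print(len(list(model.parameters())))     # 2: the weight matrix and the bias vector
```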

A (basic) training step in PyTorch consists of four basic parts:

1. Set all of the gradients to zero using opt.zero_grad()
2. Calculate the loss, loss
3. Calculate the gradients with respect to the loss using loss.backward()
4. Update the parameters being optimized using opt.step()

That might look like the following code (and you'll notice that if you run it several times, the loss goes down):

In [34]:
train_example = torch.randn(100,5) + 1
adam_opt.zero_grad()

# We'll use a simple loss function of mean distance from 1
# torch.abs takes the absolute value of a tensor
cur_loss = torch.abs(1 - mlp_layer(train_example)).mean()

cur_loss.backward()
adam_opt.step()
print(cur_loss)

tensor(0.7768, grad_fn=<MeanBackward0>)
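
Rather than re-running the cell by hand, the same four steps can be placed in a loop. This is a minimal sketch under a few assumptions: a fresh model with the same structure as mlp_layer, a single fixed batch so that losses are comparable across steps, and an arbitrary choice of 50 steps.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

model = nn.Sequential(nn.Linear(5, 2), nn.BatchNorm1d(2), nn.ReLU())
opt = optim.Adam(model.parameters(), lr=1e-1)

# A fixed batch, so that losses from different steps are comparable.
batch = torch.randn(100, 5) + 1

losses = []
for step in range(50):
    opt.zero_grad()                             # 1. reset gradients
    loss = torch.abs(1 - model(batch)).mean()   # 2. compute the loss
    loss.backward()                             # 3. compute gradients
    opt.step()                                  # 4. update parameters
    losses.append(loss.item())                  # store a plain float for logging

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

Storing loss.item() instead of the loss tensor keeps only a Python float, so the computation graph of each step is not kept alive by the list.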
