<h1 style="text-align: center;">PyTorch Tutorial</h1>

**WHAT IS PYTORCH?**

>It’s a Python-based scientific computing package targeted at two sets of audiences:
> 1. A replacement for NumPy to use the power of GPUs
> 2. A deep learning research platform that provides maximum flexibility and speed

**1.1 Tensors**

In [None]:
! pip install torch

In [None]:
from __future__ import print_function
import torch

In [None]:
# Task-01 Construct a 5x4 matrix, uninitialized:
x = torch.empty(5, 4)
print(x)

In [None]:
# Task-02 Construct a randomly initialized matrix:

x = torch.rand(5,4)
print(x)

In [None]:
# Task-03 Construct a matrix filled with zeros and of dtype long:

x = torch.zeros(5, 4, dtype=torch.long)
print(x)

In [None]:
# Task-04 Construct a tensor directly from data:

x = torch.tensor([5.5, 3, 4., -1.])
print(x)

In [None]:
# Task-05 These methods will reuse properties of the 
# input tensor, e.g. dtype, unless new values are provided by user

x = x.new_ones(5, 4, dtype=torch.double)      # new_* methods take in sizes
print(x)
print(x.dtype)                                # torch.float64

x = torch.randn_like(x, dtype=torch.float)    # override dtype!
print(x)
print(x.dtype)                                # torch.float32, same size as before

In [None]:
# Task-06 Get its size:

print(x.size())

**1.2 Operations**

In [None]:
# Task-07 Addition using +

y = torch.rand(5, 4)
print(x + y)

In [None]:
# Task-08 Addition using torch.add()

print(torch.add(x, y))

In [None]:
# Task-09 Addition: providing an output tensor as argument

result = torch.empty(5, 4)
torch.add(x, y, out=result)
print(result)

In [None]:
# Task-10 Addition: in-place
# Any operation that mutates a tensor in-place is post-fixed with an _. 
# For example: x.copy_(y), x.t_(), will change x.

# adds x to y
y.add_(x)
print(y)

In [None]:
# Task-11 You can use standard NumPy-like indexing with all bells and whistles!

print(x[:, 1])

In [None]:
# Task-12 Resizing: If you want to resize/reshape a tensor, you can use torch.Tensor.view

x = torch.randn(4, 4)
y = x.view(16)
z = x.view(-1, 8)  # the size -1 is inferred from other dimensions
print(x.size(), y.size(), z.size())
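
A view does not copy data: it shares storage with the tensor it came from. The sketch below is illustrative only (the names `base`, `flat`, `transposed`, and `reshaped` are not part of the tutorial) and also shows `reshape`, which falls back to a copy when the memory layout does not allow a view.

```python
import torch

base = torch.zeros(4, 4)
flat = base.view(16)        # same storage, different shape
flat[0] = 7.0               # write through the view
print(base[0, 0].item())    # the original tensor sees the change

# A transposed tensor is not contiguous, so view(16) is not possible on it;
# reshape(16) handles this case by copying when it has to.
transposed = base.t()
print(transposed.is_contiguous())
reshaped = transposed.reshape(16)
print(reshaped.size())
```

If you need a result that is guaranteed to be independent of the original, calling `.clone()` first is the safer choice.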

In [None]:
# Task-12 Get numerical value: 
# If you have a one element tensor, use .item() 
# to get the value as a Python number

x = torch.randn(1)
print(x)
print(x.item())
print(y[0].item())

**1.3 Converting a Torch Tensor to a NumPy Array**

In [None]:
# Task-13 Converting a Torch Tensor to a NumPy Array
a = torch.ones(5)
print(a)
b = a.numpy()
print(b)

In [None]:
# Task-14 See how the numpy array changed in value.

a.add_(1)
print(a)
print(b)

# They share the same memory
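
Because the array and the tensor share memory, an in-place update on one shows up in the other. If that is not what you want, one option is to clone the tensor before converting. The following is a minimal sketch with made-up names (`shared`, `view_np`, `copy_np`), assuming a CPU tensor.

```python
import torch

shared = torch.ones(3)
view_np = shared.numpy()           # shares memory with `shared`
copy_np = shared.clone().numpy()   # clone() first, so this array is independent

shared.add_(1)                     # in-place update on the tensor
print(view_np)                     # follows the update
print(copy_np)                     # keeps the original values
```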

In [None]:
# Task-15 Converting NumPy Array to Torch Tensor

import numpy as np
a = np.ones(5)
b = torch.from_numpy(a)
np.add(a, 1, out=a)
print(a)
print(b)

In [None]:
# Task-16 Tensors can be moved onto any device using the .to method.

# let us run this cell only if CUDA is available
# We will use ``torch.device`` objects to move tensors in and out of GPU
if torch.cuda.is_available():
    device = torch.device("cuda")          # a CUDA device object
    y = torch.ones_like(x, device=device)  # directly create a tensor on GPU
    x = x.to(device)                       # or just use strings ``.to("cuda")``
    z = x + y
    print(z)
    print(z.to("cpu", torch.double))       # ``.to`` can also change dtype together!
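
A common way to avoid the `if` guard is to pick the device once and pass it along, so the same code runs with or without a GPU. This is a sketch of that pattern, not the only way to do it; the names `dev`, `t`, `u`, and `total` are placeholders.

```python
import torch

# Fall back to the CPU when CUDA is not available
dev = torch.device("cuda" if torch.cuda.is_available() else "cpu")

t = torch.ones(2, 3, device=dev)   # create directly on the chosen device
u = torch.rand(2, 3).to(dev)       # or move an existing tensor
total = (t + u).sum()
print(total.item(), total.device)
```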

**1.4 Autograd: automatic differentiation**

Central to all neural networks in PyTorch is the *autograd* package. Let’s first briefly visit this, and we will then go to training our first neural network.

The autograd package provides automatic differentiation for all operations on Tensors. It is a define-by-run framework, which means that your backprop is defined by how your code is run, and that every single iteration can be different.
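
To make "define-by-run" concrete, here is a small sketch in which ordinary Python control flow decides how many operations are recorded, so the gradient differs from one call to the next. It uses `requires_grad` and `backward()`, which the cells below introduce; the helper `repeated_double` and the other names are hypothetical.

```python
import torch

def repeated_double(t, steps):
    # The loop length decides how many multiply nodes autograd records.
    result = t
    for _ in range(steps):
        result = result * 2
    return result.sum()

grads = {}
for steps in (1, 3):
    w = torch.ones(2, requires_grad=True)
    repeated_double(w, steps).backward()
    grads[steps] = w.grad.clone()
    print(steps, w.grad)   # each entry equals 2 ** steps
```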

In [None]:
# Task-17 Create a tensor and set requires_grad=True to track computation with it

x = torch.ones(2, 2, requires_grad=True)
print(x)

In [None]:
y = x + 2
print(y)

In [None]:
print(y.grad_fn)

In [None]:
# Task-18 Do more operations on y

z = y * y * 3
out = z.mean()
print(z, out)

In [None]:
a = torch.randn(2, 2)
a = ((a * 3) / (a - 1))
print(a.requires_grad)    # False by default
a.requires_grad_(True)    # .requires_grad_() changes the flag in-place
print(a.requires_grad)    # True
b = (a * a).sum()
print(b.grad_fn)

**1.5 Gradients**

Let’s backprop now. Because out contains a single scalar, 
out.backward() is equivalent to out.backward(torch.tensor(1.)).

In [None]:
# Task-19 do the backprop

out.backward()

In [None]:
print(x.grad)
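
The printed gradient can be checked by hand. With out = mean(3 * (x + 2)^2) over four elements, the derivative with respect to each element is (3/2) * (x_i + 2), which is 4.5 when x_i = 1. The sketch below repeats the computation with separate names (`w`, `scalar_out`, `expected`) so it does not disturb the notebook state.

```python
import torch

w = torch.ones(2, 2, requires_grad=True)
scalar_out = (3 * (w + 2) ** 2).mean()
scalar_out.backward()

# Analytic gradient: d(out)/dx_i = (3/2) * (x_i + 2)
expected = 1.5 * (w.detach() + 2)
print(w.grad)
print(torch.allclose(w.grad, expected))
```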

In [None]:
# Task-20 Autograd operations can be more complicated

x = torch.randn(3, requires_grad=True)
y = x * 2
while y.data.norm() < 1000:
    y = y * 2
print(y)

In [None]:
gradients = torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float)
y.backward(gradients)
print(x.grad)
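
When `y` is not a scalar, `backward` needs a vector `v` and computes the vector-Jacobian product v^T J instead of a plain derivative. A small sketch with a known Jacobian makes this easier to see; the names `p`, `q`, and `v` are illustrative.

```python
import torch

p = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
q = p * 4                                  # Jacobian dq/dp is 4 * I
v = torch.tensor([0.1, 1.0, 0.0001])

q.backward(v)                              # accumulates v^T J into p.grad
print(p.grad)                              # equals 4 * v
```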

In [None]:
# Task-21 Use torch.no_grad()
# You can stop autograd from tracking history on tensors that have
# .requires_grad=True by wrapping the code block in `with torch.no_grad():`

print(x.requires_grad)
print((x ** 2).requires_grad)

with torch.no_grad():
    print((x ** 2).requires_grad)
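
Another way to get the same effect for a single tensor is `.detach()`, which returns a tensor with the same values that is cut off from the graph. A minimal sketch, with `s` and `d` as placeholder names:

```python
import torch

s = torch.ones(3, requires_grad=True)
d = s.detach()                 # same values, no gradient tracking

print(s.requires_grad)
print(d.requires_grad)
print(torch.equal(s, d))
```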

In [None]:
# Task-22 Create a neural network model

# A typical training procedure for a neural network is as follows:
#   1. Define the neural network that has some learnable parameters (or weights)
#   2. Iterate over a dataset of inputs
#   3. Process input through the network
#   4. Compute the loss (how far the output is from being correct)
#   5. Propagate gradients back into the network’s parameters
#   6. Update the weights of the network, typically using a simple update rule:
#      weight = weight - learning_rate * gradient

import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        # 1 input image channel, 6 output channels, 5x5 square convolution
        # kernel
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # an affine operation: y = Wx + b
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # If the window is square, you can specify just a single number
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def num_flat_features(self, x):
        size = x.size()[1:]  # all dimensions except the batch dimension
        num_features = 1
        for s in size:
            num_features *= s
        return num_features


net = Net()
print(net)
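
The `16 * 5 * 5` input size of `fc1` follows from the shapes produced by the convolution and pooling layers when the input is 32x32. The sketch below traces those shapes using standalone layers with the same settings as `Net`; the names `conv_a`, `conv_b`, and `t` are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

conv_a = nn.Conv2d(1, 6, 5)     # same settings as conv1
conv_b = nn.Conv2d(6, 16, 5)    # same settings as conv2

t = torch.randn(1, 1, 32, 32)
t = F.max_pool2d(F.relu(conv_a(t)), 2)
print(t.shape)                  # 32 -> 28 after the 5x5 conv, -> 14 after pooling
t = F.max_pool2d(F.relu(conv_b(t)), 2)
print(t.shape)                  # 14 -> 10 after the 5x5 conv, -> 5 after pooling
flat = t.flatten(1)
print(flat.shape)               # 16 * 5 * 5 = 400 features per sample
```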

In [None]:
# Task-23 The learnable parameters of a model are returned by net.parameters()

params = list(net.parameters())
print(len(params))
print(params[0].size())  # conv1's .weight

In [None]:
# Task-24 Run a forward pass on a random 32x32 input
input = torch.randn(1, 1, 32, 32)
out = net(input)
print(out)

In [None]:
# Task-25 Zero the gradient buffers of all parameters and backprop with random gradients:

net.zero_grad()
out.backward(torch.randn(1, 10))

In [None]:
# Task-26 Loss function 

output = net(input)
target = torch.randn(10)  # a dummy target, for example
target = target.view(1, -1)  # make it the same shape as output
criterion = nn.MSELoss()

loss = criterion(output, target)
print(loss)
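
With its default settings, `nn.MSELoss` averages the squared differences over all elements. The sketch below compares it with the same computation written out by hand on small made-up tensors (`pred` and `truth` are not from the tutorial).

```python
import torch
import torch.nn as nn

pred = torch.tensor([[1.0, 2.0, 3.0]])
truth = torch.tensor([[1.5, 2.0, 1.0]])

builtin = nn.MSELoss()(pred, truth)
manual = ((pred - truth) ** 2).mean()      # (0.25 + 0 + 4) / 3
print(builtin.item(), manual.item())
```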


In [None]:
print(loss.grad_fn)  # MSELoss
print(loss.grad_fn.next_functions[0][0])  # Linear
print(loss.grad_fn.next_functions[0][0].next_functions[0][0])  # ReLU

In [None]:
# Task-27 Backprop

net.zero_grad()     # zeroes the gradient buffers of all parameters

print('conv1.bias.grad before backward')
print(net.conv1.bias.grad)

loss.backward()

print('conv1.bias.grad after backward')
print(net.conv1.bias.grad)

In [None]:
# Update the weights

learning_rate = 0.01
for f in net.parameters():
    f.data.sub_(f.grad.data * learning_rate)
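
The same update rule can be written without `.data` by doing the in-place subtraction inside `torch.no_grad()`, which is the style more recent PyTorch code tends to use. This is a sketch on a toy parameter, assuming a simple quadratic objective; `weight`, `objective`, and `lr` are placeholder names.

```python
import torch

weight = torch.tensor([1.0, -2.0], requires_grad=True)
objective = (weight ** 2).sum()
objective.backward()                 # gradient is 2 * weight

lr = 0.1
with torch.no_grad():
    weight -= lr * weight.grad       # weight = weight - learning_rate * gradient
weight.grad.zero_()                  # clear the buffer before the next step
print(weight)
```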

In [None]:
import torch.optim as optim

# create your optimizer
optimizer = optim.SGD(net.parameters(), lr=0.01)

# in your training loop:
optimizer.zero_grad()   # zero the gradient buffers
output = net(input)
loss = criterion(output, target)
loss.backward()
optimizer.step()    # Does the update
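
In practice these five lines sit inside a loop. The sketch below runs that loop on a tiny synthetic regression problem so the loss can be seen going down; the model, data, and hyperparameters (`model`, `inputs`, `targets`, `lr=0.05`, 100 steps) are assumptions chosen for illustration, not part of the tutorial.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
model = nn.Linear(3, 1)
inputs = torch.randn(8, 3)
targets = inputs.sum(dim=1, keepdim=True)   # a target a linear model can fit

mse = nn.MSELoss()
opt = optim.SGD(model.parameters(), lr=0.05)

losses = []
for step in range(100):
    opt.zero_grad()                  # zero the gradient buffers
    step_loss = mse(model(inputs), targets)
    step_loss.backward()
    opt.step()                       # apply the update
    losses.append(step_loss.item())

print(losses[0], losses[-1])
```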

In [None]:
x = torch.tensor(-2.0, requires_grad=True)
y = torch.tensor(5.0, requires_grad=True)
z = torch.tensor(-4.0, requires_grad=True)
f = (x+y)*z # Define the computation graph
f.backward() # PyTorch’s internal backward gradient computation
print('Gradients after backpropagation:', x.grad, y.grad, z.grad)
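
These gradients are easy to verify by hand: for f = (x + y) * z, the partial derivatives are df/dx = z, df/dy = z, and df/dz = x + y. The sketch below repeats the computation with different names (`u`, `v`, `w`, `g`) and compares against those expressions.

```python
import torch

u = torch.tensor(-2.0, requires_grad=True)
v = torch.tensor(5.0, requires_grad=True)
w = torch.tensor(-4.0, requires_grad=True)
g = (u + v) * w
g.backward()

# Analytic partials: dg/du = w, dg/dv = w, dg/dw = u + v
print(u.grad.item(), v.grad.item(), w.grad.item())   # -4.0 -4.0 3.0
```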