<center> <img src="https://pytorch.org/tutorials/_static/img/thumbnails/cropped/profiler.png" width="150" height="150"/></center> <center> <h1><b><i>Introduction to PyTorch</i></b></h1></center>
<center> <h3><i>Facebook's Deep Learning & Machine Learning Framework</i></h3></center>

# *Introduction*

<p style= "font-size:120%"><i><b>Tensor</b></i> - A tensor is a multi-dimensional array: it generalizes vectors (rank 1) and matrices (rank 2) to any rank.<br>For example, a 3-channel (red, green, blue) image that is 64 pixels wide and 64 pixels tall is a 3×64×64 tensor.</p>
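
As a small illustration of the image example above, the sketch below builds a placeholder tensor of that shape. The name `image` and the all-zero contents are assumptions made for illustration; a real image would be loaded from a file.

```python
import torch

# Hypothetical stand-in for an RGB image: 3 channels, 64 x 64 pixels.
image = torch.zeros(3, 64, 64)

print(image.shape)    # torch.Size([3, 64, 64])
print(image.dim())    # 3 (the rank)
print(image.numel())  # 12288 = 3 * 64 * 64
```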


In [1]:
# Import libraries
import torch

## *Tensor creation & properties*

In [2]:
# Create example tensor

extensor = torch.Tensor(
    [
        [[1,2],[3,4]],
        [[5,6],[7,8]], 
        [[9,0],[1,2]]
    ]
)

extensor

tensor([[[1., 2.],
         [3., 4.]],

        [[5., 6.],
         [7., 8.]],

        [[9., 0.],
         [1., 2.]]])

In [3]:
# Tensor properties

print("Tensor device: ",extensor.device)
print("Tensor shape: ",extensor.shape)
print("Rank: ",len(extensor.shape))
print("Number of coefficients: ",extensor.numel())

Tensor device:  cpu
Tensor shape:  torch.Size([3, 2, 2])
Rank:  3
Number of coefficients:  12


In [4]:
# Accessing specific elements & their types

print("Element at (1,1) indices - ",extensor[1][1])
print("\nElement at the (1,1,1) index - ", extensor[1][1][1], "\nType - ",type(extensor[1][1][1]))
print("\nElement at the (1,1,1) index - ", extensor[1][1][1].item(), "\nType - ",type(extensor[1][1][1].item()))

Element at (1,1) indices -  tensor([7., 8.])

Element at the (1,1,1) index -  tensor(8.) 
Type -  <class 'torch.Tensor'>

Element at the (1,1,1) index -  8.0 
Type -  <class 'float'>


In [5]:
# Get the element at row 0, column 1 of every sub-matrix in the tensor
# The ":" selects all sub-matrices; (0, 1) is the index inside each one
print("Example tensor - \n",extensor,"\n\n")
extensor[:, 0,1]

Example tensor - 
 tensor([[[1., 2.],
         [3., 4.]],

        [[5., 6.],
         [7., 8.]],

        [[9., 0.],
         [1., 2.]]]) 




tensor([2., 6., 0.])
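
The indexing shown above can be extended to whole rows and columns. The sketch below rebuilds the same example values under the assumed name `t` and shows that chained indexing (`t[1][1][1]`) and tuple-style indexing (`t[1, 1, 1]`) select the same element.

```python
import torch

t = torch.tensor([[[1., 2.], [3., 4.]],
                  [[5., 6.], [7., 8.]],
                  [[9., 0.], [1., 2.]]])

# Chained and tuple-style indexing pick the same element.
same_element = torch.equal(t[1][1][1], t[1, 1, 1])
print(same_element)

first_rows = t[:, 0, :]   # first row of every sub-matrix, shape (3, 2)
column_one = t[:, :, 1]   # second column of every sub-matrix, shape (3, 2)
print(first_rows)
print(column_one)
```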

## *Methods to create tensors in PyTorch*

In [6]:
# Method 1 - Tensor of all ones with the same shape as extensor
print("\n\nMethod 1 -\n",torch.ones_like(extensor))

# Method 2 - Tensor of all zeros with the same shape as extensor
print("\n\nMethod 2 -\n",torch.zeros_like(extensor))

# Method 3 - Tensor of random numbers drawn from a standard normal distribution, with the same shape as extensor
print("\n\nMethod 3 -\n",torch.randn_like(extensor))

# Method 4 - Random tensor created from only a shape and a device
print("\n\nMethod 4 -\n",torch.randn(3,3, device="cpu"))



Method 1 -
 tensor([[[1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.]]])


Method 2 -
 tensor([[[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]]])


Method 3 -
 tensor([[[-0.0400, -0.7003],
         [-0.2335,  2.7244]],

        [[ 0.1507, -2.1696],
         [-0.9754, -1.3152]],

        [[ 0.5193, -0.3705],
         [-0.2675, -0.4955]]])


Method 4 -
 tensor([[ 0.3844,  2.7677, -0.7787],
        [ 0.0523, -0.6698, -1.7853],
        [-1.1561, -0.2203, -1.5102]])
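
A few other constructors follow the same pattern of taking a shape directly. This is a minimal sketch rather than a complete list; the variable names are illustrative, and the seed is fixed only so the random values are reproducible.

```python
import torch

torch.manual_seed(0)  # fix the seed so random values repeat between runs

zeros = torch.zeros(2, 3)                        # all zeros, shape given directly
ones_int = torch.ones(2, 3, dtype=torch.int64)   # all ones, integer dtype
steps = torch.arange(0, 10, 2)                   # evenly spaced values: 0, 2, 4, 6, 8
uniform = torch.rand(2, 3)                       # uniform samples in [0, 1)

print(zeros.shape, ones_int.dtype)
print(steps)
print(uniform)
```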


<center> <img src="https://pytorch.org/tutorials/_static/img/thumbnails/cropped/profiler.png" width="150" height="150"/></center> <center> <h1><b><i>PyTorch - As a Deep Learning framework</i></b></h1></center>

In [36]:
# Neural-network layers (nn) and optimizers from PyTorch
import torch.nn as nn
import torch.optim as optimizers

### `nn.Linear`

To create a linear layer, pass it the number of input features and the number of output features. A layer initialized as `nn.Linear(10, 2)` takes an $n\times10$ matrix and returns an $n\times2$ matrix, applying the same linear transformation to each of the $n$ rows. The layer performs the operation $Ax + b$ on each row $x$, where the weight matrix $A$ and the bias $b$ are initialized randomly when you create the [`nn.Linear()`](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html) object. 

In [8]:
linear = nn.Linear(10,2)
# Initialize a random tensor
exinput = torch.randn(3,10)
exout = linear(exinput)
print("Input: ",exinput,"\n\n")
exout

Input:  tensor([[ 0.5780,  0.9117, -1.9039, -0.4185, -1.2281,  1.2157, -2.7131, -0.1461,
         -0.0768, -0.8076],
        [-0.0731,  0.5010,  0.6021, -1.3893,  1.3945, -1.8512, -0.6994,  1.0924,
          2.2604,  0.4381],
        [-0.5080, -0.1629,  2.4953, -0.4880,  1.5919,  0.0756,  0.3597,  0.1017,
         -0.3851, -1.5134]]) 




tensor([[-0.2617,  0.0816],
        [ 0.3904, -0.1394],
        [ 0.0401, -0.5929]], grad_fn=<AddmmBackward0>)
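
To make the $Ax + b$ description concrete, the sketch below reproduces the layer's output by hand from its `weight` and `bias` attributes. Because each input is stored as a row, the batched form is `x @ weight.T + bias`. The seed and the names `x` and `manual` are assumptions for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
linear = nn.Linear(10, 2)
x = torch.randn(3, 10)

# weight has shape (out_features, in_features) = (2, 10); bias has shape (2,)
manual = x @ linear.weight.T + linear.bias

matches = torch.allclose(linear(x), manual, atol=1e-6)
print(linear.weight.shape, linear.bias.shape)
print(matches)
```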

### `nn.ReLU`

[`nn.ReLU()`](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html) creates an object that applies the ReLU activation function to any tensor it receives. This will be reviewed further in lecture, but in essence, a ReLU non-linearity sets all negative numbers in a tensor to zero and leaves the rest unchanged. In general, the simplest neural networks are composed of a series of linear transformations, each followed by an activation function. 

In [9]:
relu = nn.ReLU()
relu_out = relu(exout)
relu_out

tensor([[0.0000, 0.0816],
        [0.3904, 0.0000],
        [0.0401, 0.0000]], grad_fn=<ReluBackward0>)
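
As a quick check of that description, the sketch below applies ReLU to a small hand-written tensor and compares it with clamping at zero, which computes the same thing. The input values are made up for illustration.

```python
import torch
import torch.nn as nn

x = torch.tensor([[-1.5, 0.0, 2.0],
                  [3.0, -0.1, 0.5]])

relu = nn.ReLU()
out = relu(x)

# ReLU is max(0, x) element-wise, so clamping at zero gives the same result.
same = torch.equal(out, torch.clamp(x, min=0))
print(out)
print(same)
```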

### `nn.BatchNorm1d`

[`nn.BatchNorm1d`](https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm1d.html) is a normalization technique that rescales a batch of $n$ inputs so that each feature has a consistent mean and standard deviation from batch to batch.  

As indicated by the `1d` in its name, this is for situations where you expect a set of inputs, each of which is a flat list of numbers. In other words, each input is a vector, not a matrix or higher-dimensional tensor. For a set of images, each of which is a higher-dimensional tensor, you'd use [`nn.BatchNorm2d`](https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html), discussed later on this page.

`nn.BatchNorm1d` takes one required argument: the number of features in each example (the length of each example vector).

In [12]:
print("Number of features: ",exout.shape[1])

Number of features:  2


In [16]:
# Batch Normalization (num of features)
batchnorm = nn.BatchNorm1d(2)
batchnorm_out = batchnorm(relu_out)
batchnorm_out

tensor([[-0.8182,  1.4095],
        [ 1.4078, -0.7047],
        [-0.5896, -0.7047]], grad_fn=<NativeBatchNormBackward0>)
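
The rescaling can be reproduced by hand. The sketch below assumes a freshly created layer in training mode (the default), where the learnable scale and shift start at 1 and 0, so the output is each feature minus its batch mean, divided by the batch standard deviation (computed with the biased variance plus a small `eps`). The input data is random and only illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(8, 2) * 3 + 5   # 8 examples, 2 features, shifted and scaled
bn = nn.BatchNorm1d(2)          # new modules start in training mode

mean = x.mean(dim=0)                    # per-feature batch mean
var = x.var(dim=0, unbiased=False)      # per-feature biased batch variance
manual = (x - mean) / torch.sqrt(var + bn.eps)

out = bn(x)
matches = torch.allclose(out, manual, atol=1e-5)
print(matches)
print(out.mean(dim=0))  # close to zero for each feature
```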

### `nn.Sequential`

[`nn.Sequential`](https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html) creates a single operation that performs a sequence of operations. For example, you can write a neural network layer with batch normalization as follows:

In [32]:
layer = nn.Sequential(
    # Same as Dense layer in TF
    nn.Linear(in_features=5, out_features=2),
    nn.BatchNorm1d(2),
    nn.ReLU()
)

# Call layer
data = torch.randn(4,5)+1
print("Input: \n",data)
print("\nLayer Output: \n", layer(data))

Input: 
 tensor([[ 1.0595,  1.0661,  1.4449, -0.1592,  1.2165],
        [ 1.5318,  0.6325,  0.8639,  0.3034, -0.5190],
        [ 2.6771,  0.4743,  1.2824, -0.3809,  1.7856],
        [ 2.6985,  0.4521,  0.4706,  0.6911,  1.0830]])

Layer Output: 
 tensor([[0.0000, 0.0000],
        [0.9691, 1.1057],
        [0.0000, 0.0000],
        [0.9309, 0.8260]], grad_fn=<ReluBackward0>)
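
One way to see what `nn.Sequential` does is to apply its sub-modules one at a time and compare the result with a single call. This is a sketch with its own seed and data, not the exact cell above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Sequential(
    nn.Linear(in_features=5, out_features=2),
    nn.BatchNorm1d(2),
    nn.ReLU(),
)
data = torch.randn(4, 5)

# Iterating over a Sequential yields its sub-modules in order.
step_by_step = data
for module in layer:
    step_by_step = module(step_by_step)

out = layer(data)
matches = torch.allclose(out, step_by_step)
print(matches)
print(layer[0])  # sub-modules can also be accessed by index
```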


### `Optimizers`

To create an optimizer in PyTorch, use the `torch.optim` module, often imported as `optim` (this notebook imports it as `optimizers`). [`optim.Adam`](https://pytorch.org/docs/stable/optim.html#torch.optim.Adam) corresponds to the Adam optimizer. To create an optimizer object, pass it the parameters to be optimized and the learning rate, `lr`, as well as any other arguments specific to the optimizer.

For all `nn` objects, you can access their parameters through the `parameters()` method, which returns an iterator over them, as follows:

In [37]:
# Pass the layer parameters to the optimizer

adam_opt = optimizers.Adam(layer.parameters(), lr=1e-1)
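
To see what is actually handed to the optimizer, the sketch below lists the parameters of a layer built like the one above, using `named_parameters()`. The names are generated by `nn.Sequential` from each sub-module's position.

```python
import torch.nn as nn

layer = nn.Sequential(
    nn.Linear(in_features=5, out_features=2),
    nn.BatchNorm1d(2),
    nn.ReLU(),
)

names = []
for name, param in layer.named_parameters():
    names.append(name)
    print(name, tuple(param.shape))

# Linear: 2*5 weights + 2 biases; BatchNorm1d: 2 scales + 2 shifts; ReLU: none.
total = sum(p.numel() for p in layer.parameters())
print(total)  # 16
```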

### `Training in PyTorch`

A basic training step in PyTorch consists of four parts:


1.   Set all of the gradients to zero using `opt.zero_grad()`
2.   Calculate the loss, `loss`
3.   Calculate the gradients of the loss with respect to the parameters using `loss.backward()`
4.   Update the parameters being optimized using `opt.step()`

In [54]:
data = torch.randn(100,5)+1

# Step 1
adam_opt.zero_grad()
# Step 2
loss = torch.abs(1-layer(data)).mean()
# Step 3: Compute gradients of the loss with respect to the parameters
loss.backward()
# Step 4: Update the parameters
adam_opt.step()

print("Loss: ",loss.item())

Loss:  0.20308691263198853
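
In practice the four steps are repeated many times. The sketch below runs them in a loop on a toy regression problem; the model, the target (the sum of the input features), the mean-squared-error loss, the learning rate, and the number of iterations are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(5, 1)
opt = torch.optim.Adam(model.parameters(), lr=0.05)

x = torch.randn(100, 5)
y = x.sum(dim=1, keepdim=True)   # toy target: sum of the five features

losses = []
for _ in range(200):
    opt.zero_grad()                        # Step 1: reset gradients
    loss = ((model(x) - y) ** 2).mean()    # Step 2: compute the loss
    loss.backward()                        # Step 3: compute gradients
    opt.step()                             # Step 4: update the parameters
    losses.append(loss.item())

print("first loss:", losses[0])
print("last loss: ", losses[-1])
```

The loss recorded on the last iteration should be lower than the first one, which is the basic signal that the parameters are being updated in a useful direction.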


#### `requires_grad_()`
You can also tell PyTorch to calculate the gradient with respect to a tensor that you created by calling `example_tensor.requires_grad_()`, which changes the tensor in place. PyTorch will then store a gradient for that tensor even if it wouldn't normally do so. <br><br>
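
A minimal sketch of this, using a made-up tensor `x` and the function $y = \sum x_i^2$, whose gradient is $2x$:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0])
print(x.requires_grad)   # False: plain tensors are not tracked by default

x.requires_grad_()       # in-place: start tracking gradients for x
y = (x ** 2).sum()
y.backward()

print(x.grad)            # tensor([2., 4., 6.])
```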


#### `with torch.no_grad():`
PyTorch usually tracks operations on tensors so that it can calculate gradients later. This tracking can cost unnecessary computation and memory, especially during evaluation. You can wrap code in `with torch.no_grad():` to disable gradient tracking inside that block. <br><br>
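
The sketch below compares the same forward pass with and without `torch.no_grad()`. The small model and random input are placeholders for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 1)
x = torch.randn(8, 4)

tracked = model(x)            # normal call: the result is part of the autograd graph
with torch.no_grad():
    untracked = model(x)      # same values, but no graph is recorded

print(tracked.requires_grad, tracked.grad_fn is not None)
print(untracked.requires_grad, untracked.grad_fn is None)
```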


#### `detach():`
Sometimes, you want to calculate and use a tensor's value without calculating its gradients. For example, if you have two models, A and B, and you want to directly optimize the parameters of A with respect to the output of B, without calculating the gradients through B, then you could feed the detached output of B to A. There are many reasons you might want to do this, including efficiency or cyclical dependencies (i.e., A depends on B, which depends on A).
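
The A and B scenario can be sketched with two small linear layers. The names `model_a` and `model_b`, the shapes, and the mean used as a loss are hypothetical; the point is only that gradients stop at the detached tensor.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model_b = nn.Linear(3, 3)   # hypothetical model B
model_a = nn.Linear(3, 1)   # hypothetical model A
x = torch.randn(5, 3)

b_out = model_b(x).detach()       # cut the graph: B is treated as a constant
loss = model_a(b_out).mean()
loss.backward()

print(model_a.weight.grad is not None)   # A received gradients
print(model_b.weight.grad is None)       # B did not
```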

# New `nn` Classes

You can also create new classes that extend `nn.Module`. For these classes, all instance attributes, as in `self.layer` or `self.param`, will automatically be treated as parameters if they are themselves `nn` objects or if they are tensors wrapped in `nn.Parameter` and assigned when the class is initialized. 

The `__init__` function defines what happens when the object is created. It must call the parent constructor before assigning any layers or parameters; for a class named `WellNamedClass`, for example, the first line is `super(WellNamedClass, self).__init__()` (in Python 3, `super().__init__()` is equivalent). 

The `forward` function defines what runs when you create an object `model` and pass it a tensor `x`, as in `model(x)`. If you choose the function signature `(self, x)`, then each call of the forward function gets two pieces of information: `self`, a reference to the object through which you can access all of its parameters, and `x`, the current tensor for which you'd like to return `y`.

One class might look like the following:

In [56]:
class ExampleModule(nn.Module):
    # Constructor: runs when the object is created
    def __init__(self, input_dims, output_dims):
        super(ExampleModule, self).__init__()
        # nn objects and nn.Parameter tensors assigned to self are registered as parameters
        self.linear = nn.Linear(input_dims, output_dims)
        self.exponent = nn.Parameter(torch.tensor(1.))

    def forward(self, x):
        x = self.linear(x)
        x = x ** self.exponent
        return x

In [58]:
# Calling the Module
# Create new object
model = ExampleModule(10,2)

# Create new data
# (named example_input to avoid shadowing Python's built-in input function)
example_input = torch.randn(4,10)
model(example_input)

tensor([[ 0.6112, -0.3306],
        [ 0.3569,  0.6215],
        [-0.1654, -0.6504],
        [ 1.0556,  0.7813]], grad_fn=<PowBackward1>)
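
To check which attributes end up registered as parameters, the sketch below redefines the module and lists its `named_parameters()`. The extra attribute `self.scale` is a hypothetical addition, included only to show that a plain tensor not wrapped in `nn.Parameter` is not registered.

```python
import torch
import torch.nn as nn

class ExampleModule(nn.Module):
    def __init__(self, input_dims, output_dims):
        super().__init__()
        self.linear = nn.Linear(input_dims, output_dims)   # nn object: registered
        self.exponent = nn.Parameter(torch.tensor(1.))     # nn.Parameter: registered
        self.scale = torch.tensor(2.)                      # plain tensor (hypothetical): not registered

    def forward(self, x):
        return self.linear(x) ** self.exponent

model = ExampleModule(10, 2)
names = [name for name, _ in model.named_parameters()]
print(names)  # includes exponent, linear.weight, linear.bias, but not scale
```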

## `2D Operations`

You won't need these for the first lesson, and the theory behind each of these will be reviewed more in later lectures, but here is a quick reference: 


*   2D convolutions: [`nn.Conv2d`](https://pytorch.org/docs/master/generated/torch.nn.Conv2d.html) requires the number of input and output channels, as well as the kernel size.
*   2D transposed convolutions (aka deconvolutions): [`nn.ConvTranspose2d`](https://pytorch.org/docs/master/generated/torch.nn.ConvTranspose2d.html) also requires the number of input and output channels, as well as the kernel size.
*   2D batch normalization: [`nn.BatchNorm2d`](https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html) requires the number of input channels (`num_features`).
*   Resizing images: [`nn.Upsample`](https://pytorch.org/docs/master/generated/torch.nn.Upsample.html) requires the final size or a scale factor. Alternatively, [`nn.functional.interpolate`](https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.interpolate) takes the same arguments. 
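
The quick reference above can be sketched as a shape walkthrough. The channel counts, kernel size, scale factor, and target size below are arbitrary choices for illustration, and the input is a random stand-in for a batch containing one 3×64×64 image.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

images = torch.randn(1, 3, 64, 64)   # (batch, channels, height, width)

conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3)
features = conv(images)              # no padding, so each side shrinks by 2

batchnorm = nn.BatchNorm2d(8)        # argument is the number of channels
normed = batchnorm(features)         # shape is unchanged

deconv = nn.ConvTranspose2d(in_channels=8, out_channels=3, kernel_size=3)
restored = deconv(features)          # each side grows by 2 again

upsample = nn.Upsample(scale_factor=2)
upsampled = upsample(features)       # height and width doubled

resized = F.interpolate(features, size=(32, 32))   # resize to an explicit size

for name, t in [("features", features), ("normed", normed), ("restored", restored),
                ("upsampled", upsampled), ("resized", resized)]:
    print(name, tuple(t.shape))
```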