# PyTorch

In PyTorch, the computational graph is built as you execute the code (define-by-run), as opposed to TensorFlow 1.x, where you define the graph first and then run it (TensorFlow 2.x also executes eagerly by default).
In PyTorch, you create your graph by running it.
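
To make the define-by-run idea concrete, here is a minimal sketch. It uses autograd (`requires_grad` and `backward()`), which this notebook does not otherwise cover, so treat it as a preview. The helper name `f` is made up for illustration. An ordinary Python `if` decides which operations are recorded, so the graph can differ from one run to the next.

```python
import torch

def f(x):
    # Ordinary Python control flow chooses which operations are recorded.
    if x.item() > 1:
        return x ** 2      # graph for this run: power
    return 3 * x           # graph for this run: multiplication

x = torch.tensor(2.0, requires_grad=True)
y = f(x)
y.backward()               # d(x^2)/dx = 2x
print(x.grad)              # tensor(4.)

x2 = torch.tensor(0.5, requires_grad=True)
f(x2).backward()           # d(3x)/dx = 3
print(x2.grad)             # tensor(3.)
```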

# Tensors

Tensors are similar to NumPy’s ndarrays, with the addition that Tensors can also be used on a GPU to accelerate computing.

In [0]:
import torch
import numpy as np

Construct a matrix filled with zeros and of dtype long:

![alt text](https://drive.google.com/uc?id=1mTiCspjN9hYj-PoiRpBiFLnb9izOVIqq)


In [2]:
x = torch.zeros(5, 3, dtype = torch.long)
print(x)
y = torch.zeros((5, 3), dtype = torch.long) # the shape can also be passed as a tuple; x and y are identical
print(y)

tensor([[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]])
tensor([[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]])


*Get* size of `x` using either `x.size()` or `x.shape`:

In [3]:
print(x.size())

torch.Size([5, 3])


**Note:** `torch.Size` is in fact a tuple, so it supports all tuple operations.

In [4]:
a, b = x.size()
print(a)
print(b)

5
3
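
As a small sketch of what "supports all tuple operations" means in practice, the snippet below checks a few of them on a fresh tensor. The variable name `size` is only for illustration.

```python
import torch

size = torch.zeros(5, 3).size()

print(isinstance(size, tuple))  # True
print(len(size))                # 2
print(size[0], size[-1])        # 5 3
print(size == (5, 3))           # True
print(size.numel())             # 15, the total number of elements
```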


In [5]:
print(x.shape)
print(x.size())
print(y.shape)
print(x == y)

torch.Size([5, 3])
torch.Size([5, 3])
torch.Size([5, 3])
tensor([[True, True, True],
        [True, True, True],
        [True, True, True],
        [True, True, True],
        [True, True, True]])


An uninitialized matrix is declared, but does not contain definite known values before it is used. When an uninitialized matrix is created, whatever values were in the allocated memory at the time will appear as the initial values.

Construct a 5x3 matrix, uninitialized:

In [6]:
x = torch.empty(5, 3)
print(x)

tensor([[5.0173e-36, 0.0000e+00, 3.3631e-44],
        [0.0000e+00,        nan, 0.0000e+00],
        [1.1578e+27, 1.1362e+30, 7.1547e+22],
        [4.5828e+30, 1.2121e+04, 7.1846e+22],
        [9.2198e-39, 7.0374e+22, 0.0000e+00]])


Construct a randomly initialized matrix:

In [7]:
x = torch.rand(5, 3)
print(x)
y = torch.rand((5, 3))
print(y)
print(x.shape)
print(x.size())
print(y.shape)

tensor([[0.0097, 0.3371, 0.2936],
        [0.0634, 0.1718, 0.2978],
        [0.1646, 0.4859, 0.8145],
        [0.9422, 0.7910, 0.9739],
        [0.3337, 0.4996, 0.0296]])
tensor([[0.0417, 0.4654, 0.8836],
        [0.4664, 0.6991, 0.9959],
        [0.5893, 0.8863, 0.7308],
        [0.6408, 0.1174, 0.1627],
        [0.1935, 0.9692, 0.0025]])
torch.Size([5, 3])
torch.Size([5, 3])
torch.Size([5, 3])


Construct an identity matrix:

In [8]:
x = torch.eye(5, 5)
print(x)

tensor([[1., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 0., 1., 0., 0.],
        [0., 0., 0., 1., 0.],
        [0., 0., 0., 0., 1.]])


Construct a tensor directly from data:

`torch.tensor(data, dtype=None, device=None, requires_grad=False, pin_memory=False)` → Tensor

Constructs a tensor with `data`.

![alt text](https://drive.google.com/uc?id=1YzeBactU2MQXI9q0dNJ3Q39cd6qD0oPb)


In [9]:
x = torch.tensor([[1, 2, 3], [4, 5, 6]])
print(x)
print(x.size())

tensor([[1, 2, 3],
        [4, 5, 6]])
torch.Size([2, 3])


Or create a tensor based on an existing tensor. These methods reuse properties of the input tensor, e.g. dtype, unless new values are provided by the user.

In [10]:
x = x.new_ones(5, 3, dtype=torch.double)      # new_* methods take in sizes
print(x)

x = torch.randn_like(x, dtype=torch.float)    # override dtype!
print(x)                                      # result has the same size

# randn_like inherits the size, dtype and device of its argument unless they are overridden.
# This is true in general for all *_like() functions.
# Other useful ones are torch.ones_like() and torch.zeros_like().

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]], dtype=torch.float64)
tensor([[-1.4660, -1.6301, -1.4368],
        [-0.2555,  0.4440, -0.7693],
        [-0.4597, -1.6205,  0.2796],
        [-1.5583,  0.1140,  1.6616],
        [ 1.5929, -1.5526,  0.0489]])
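
The following sketch checks which properties are inherited and which are overridden. The tensor names (`base`, `a`, `b`, `c`) are made up for illustration.

```python
import torch

base = torch.zeros(2, 3, dtype=torch.float64)

a = base.new_ones(4, 2)                        # new shape, inherits dtype and device
b = torch.zeros_like(base)                     # same shape, dtype and device
c = torch.ones_like(base, dtype=torch.int32)   # same shape, dtype overridden

print(a.shape, a.dtype)   # torch.Size([4, 2]) torch.float64
print(b.shape, b.dtype)   # torch.Size([2, 3]) torch.float64
print(c.shape, c.dtype)   # torch.Size([2, 3]) torch.int32
```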


## Operations
There are multiple syntaxes for operations. In the following example, we will take a look at the addition operation.

Addition: syntax 1

In [11]:
y = torch.rand(5, 3)
print(x + y)

tensor([[-1.4152, -1.4416, -0.7250],
        [ 0.4058,  0.7322, -0.1084],
        [-0.2204, -0.8886,  0.8981],
        [-0.8824,  0.8038,  2.0714],
        [ 2.4004, -0.8462,  0.2737]])


Addition: syntax 2

In [12]:
print(torch.add(x, y))

tensor([[-1.4152, -1.4416, -0.7250],
        [ 0.4058,  0.7322, -0.1084],
        [-0.2204, -0.8886,  0.8981],
        [-0.8824,  0.8038,  2.0714],
        [ 2.4004, -0.8462,  0.2737]])


Addition: providing an output tensor as argument

In [13]:
result = torch.empty(5, 3)
torch.add(x, y, out=result)
print(result)

tensor([[-1.4152, -1.4416, -0.7250],
        [ 0.4058,  0.7322, -0.1084],
        [-0.2204, -0.8886,  0.8981],
        [-0.8824,  0.8038,  2.0714],
        [ 2.4004, -0.8462,  0.2737]])


Addition: syntax 3

In [14]:
# adds 1 to y
print(y.add(1)) # also an example of "broadcasting" in PyTorch

tensor([[1.0507, 1.1885, 1.7117],
        [1.6613, 1.2881, 1.6609],
        [1.2393, 1.7319, 1.6185],
        [1.6760, 1.6898, 1.4098],
        [1.8075, 1.7065, 1.2247]])


Addition: in-place

In [15]:
# adds x to y
y.add_(x)
print(y)

tensor([[-1.4152, -1.4416, -0.7250],
        [ 0.4058,  0.7322, -0.1084],
        [-0.2204, -0.8886,  0.8981],
        [-0.8824,  0.8038,  2.0714],
        [ 2.4004, -0.8462,  0.2737]])


**Note:** Any operation that mutates a tensor in-place is post-fixed with an `_`. For example, `x.copy_(y)` and `x.t_()` will change `x`.
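
A short sketch of the difference between the two variants, using `add` and `add_` on a small made-up tensor:

```python
import torch

x = torch.ones(3)

y = x.add(1)      # out-of-place: returns a new tensor, x is unchanged
print(x)          # tensor([1., 1., 1.])
print(y is x)     # False

z = x.add_(1)     # in-place: modifies x and returns it
print(x)          # tensor([2., 2., 2.])
print(z is x)     # True
```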

Matrix multiplication, transpose and inverse are similar to NumPy, with slight variations.

In [16]:
m1 = torch.randn((5, 3))
m2 = torch.randn((5, 3))
print(m2.t() @ m1) # In NumPy this is equivalent to print(m2.T @ m1)
print(torch.inverse(m2.t() @ m1)) # In NumPy this is equivalent to print(inv(m2.T @ m1))
print((m2.t()).mm(m1)) # We can also use the mm() method to do matrix multiplications

tensor([[ 1.5106,  0.7047,  0.1931],
        [-1.3575,  0.3428,  0.8860],
        [-2.6322,  0.3797, -0.1784]])
tensor([[ 0.1699, -0.0851, -0.2385],
        [ 1.1002, -0.1020,  0.6840],
        [-0.1653,  1.0378, -0.6301]])
tensor([[ 1.5106,  0.7047,  0.1931],
        [-1.3575,  0.3428,  0.8860],
        [-2.6322,  0.3797, -0.1784]])
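
As a sanity check, the sketch below multiplies a small, hand-picked invertible matrix by its inverse. It assumes a reasonably recent PyTorch (1.8 or later) where `torch.linalg.inv` is available; `torch.inverse` computes the same thing.

```python
import torch

A = torch.tensor([[4.0, 7.0],
                  [2.0, 6.0]])

A_inv = torch.linalg.inv(A)          # same computation as torch.inverse(A)
identity = A @ A_inv                 # should be close to the 2x2 identity

print(torch.allclose(identity, torch.eye(2), atol=1e-5))   # True
print(torch.equal(A @ A, torch.matmul(A, A)))              # True: @ is matmul
print(torch.equal(A.t(), A.T))                             # True: two spellings of transpose
```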


You can use standard NumPy-like indexing with all bells and whistles!

In [17]:
print(x[:, 1])

tensor([-1.6301,  0.4440, -1.6205,  0.1140, -1.5526])
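
A few more indexing patterns, sketched on a small tensor with known values so the results are easy to check by eye:

```python
import torch

x = torch.arange(12).reshape(3, 4)
# tensor([[ 0,  1,  2,  3],
#         [ 4,  5,  6,  7],
#         [ 8,  9, 10, 11]])

print(x[0])               # first row: 0, 1, 2, 3
print(x[:, 1])            # second column: 1, 5, 9
print(x[1:, :2])          # sub-matrix: rows 1-2, columns 0-1 -> [[4, 5], [8, 9]]
print(x[x > 8])           # boolean mask: 9, 10, 11
print(x[[0, 2], [1, 3]])  # elements (0, 1) and (2, 3): 1, 11
```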


Resizing: If you want to resize/reshape a tensor, you can use the `view` method (`Tensor.view`):

In [18]:
x = torch.randn(4, 4)
y = x.view(16)
z = x.view(-1, 8)  # the size -1 is inferred from other dimensions
print(x.size(), y.size(), z.size())

torch.Size([4, 4]) torch.Size([16]) torch.Size([2, 8])
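
One detail worth knowing is that `view` does not copy data: the result shares storage with the original tensor, and it only works when the memory layout allows it. The sketch below illustrates both points and uses `reshape` as a fallback; the variable names are made up.

```python
import torch

x = torch.arange(6)
v = x.view(2, 3)          # no copy: v is another view of x's storage
v[0, 0] = 100
print(x[0].item())        # 100

t = v.t()                 # transposing makes the tensor non-contiguous
print(t.is_contiguous())  # False
try:
    t.view(6)
except RuntimeError as err:
    print("view failed:", type(err).__name__)

r = t.reshape(6)          # reshape falls back to a copy when needed
print(r.tolist())         # [100, 3, 1, 4, 2, 5]
```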


If you have a one-element tensor, use `.item()` to get the value as a Python number.

In [19]:
x = torch.randn(1)
print(x)
print(x.item())
# temp = torch.rand(5, 3)
# print(temp)
# print(temp.item()) -> Throws this error : ValueError: only one element tensors can be converted to Python scalars

tensor([0.3653])
0.365324467420578
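
For tensors with more than one element, `.tolist()` converts to nested Python lists, and `.item()` still works on any single element or scalar result. A minimal sketch with made-up values:

```python
import torch

t = torch.tensor([[1, 2], [3, 4]])

print(t.tolist())        # [[1, 2], [3, 4]]
print(t[0, 1].item())    # 2
print(t.sum().item())    # 10
```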


### Broadcasting in PyTorch

Broadcasting in PyTorch follows the same rules as broadcasting in NumPy.

In [20]:
x = torch.ones((5, 3))
print(x)
print("*" * 65)

print(x + 1)
print(x.add(1))
print("*" * 65)

print(x * 3)
print("*" * 65)

print((x + 1) ** 2) 
print("*" * 65)

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
*****************************************************************
tensor([[2., 2., 2.],
        [2., 2., 2.],
        [2., 2., 2.],
        [2., 2., 2.],
        [2., 2., 2.]])
tensor([[2., 2., 2.],
        [2., 2., 2.],
        [2., 2., 2.],
        [2., 2., 2.],
        [2., 2., 2.]])
*****************************************************************
tensor([[3., 3., 3.],
        [3., 3., 3.],
        [3., 3., 3.],
        [3., 3., 3.],
        [3., 3., 3.]])
*****************************************************************
tensor([[4., 4., 4.],
        [4., 4., 4.],
        [4., 4., 4.],
        [4., 4., 4.],
        [4., 4., 4.]])
*****************************************************************
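
Broadcasting is not limited to scalars. The sketch below combines a column and a row of different shapes, and shows what happens when shapes are incompatible. The names `col`, `row` and `grid` are made up for illustration.

```python
import torch

col = torch.arange(3).view(3, 1)   # shape (3, 1)
row = torch.arange(4).view(1, 4)   # shape (1, 4)

grid = col + row                   # both are stretched to shape (3, 4)
print(grid.shape)                  # torch.Size([3, 4])
print(grid)
# tensor([[0, 1, 2, 3],
#         [1, 2, 3, 4],
#         [2, 3, 4, 5]])

# Aligned dimensions must be equal or 1; otherwise PyTorch raises an error.
try:
    torch.ones(3) + torch.ones(4)
except RuntimeError:
    print("shapes (3,) and (4,) do not broadcast")
```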


**Read Later:** 100+ Tensor operations, including transposing, indexing, slicing, mathematical operations, linear algebra, random numbers, etc., are described [here](https://pytorch.org/docs/torch).

### Some more useful methods of `torch`

### `torch.gather`

Gathers values along an axis specified by dim.

For a 3-D tensor the output is specified by:

```
out[i][j][k] = input[index[i][j][k]][j][k]  # if dim == 0
out[i][j][k] = input[i][index[i][j][k]][k]  # if dim == 1
out[i][j][k] = input[i][j][index[i][j][k]]  # if dim == 2
```

*Parameters*

* **input** (`Tensor`) – the source tensor

* **dim** (`int`) – the axis along which to index

* **index** (`LongTensor`) – the indices of elements to gather

* **out** (`Tensor`, optional) – the destination tensor

* **sparse_grad** (`bool`, optional) – If `True`, gradient w.r.t. `input` will be a sparse tensor.

In [21]:
t = torch.tensor([[1,2],[3,4]])
print(torch.gather(t, 1, torch.tensor([[0,0],[1,0]])))

tensor([[1, 1],
        [4, 3]])


`torch.gather` creates a new tensor by picking values from the input tensor along the dimension `dim`. The entries of the `torch.LongTensor` passed as `index` specify which value to take at each position. The output tensor has the same shape as the `index` tensor. The following illustration from the official docs explains it more clearly:

![alt text](https://i.stack.imgur.com/nudGq.png)

(Note: In the illustration, indexing starts from 1 and not 0).

In the first example, `dim` runs along the rows (top to bottom), so each entry of `index` supplies a row number in `src`. At position (1,1) of `result`, the index value is `1`, so the value at (1,1) in `src`, which is `1`, is written to (1,1) in `result`. Similarly, at (2,2) the index value is `3`, so the value at (3,2) in `src`, which is `8`, is written to (2,2), and so on.

In the second example, `dim` runs along the columns. At position (2,2) of `result`, the index value is `3`, so the value at (2,3) in `src`, which is `6`, is written to (2,2) in `result`.

Source of this text cell : https://stackoverflow.com/a/54706716/6644968



In [22]:
src = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
index = torch.LongTensor([[0, 1, 2], [1, 2, 0]])
result = torch.gather(input = src, dim = 0, index = index)
print(result)
# The position of elements of result w.r.t src matrix can be written as follows
# result = [[(0, 0), (1, 1), (2, 2)] [(1, 0), (2, 1), (0, 2)]]

tensor([[1, 5, 9],
        [4, 8, 3]])
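
To connect the indexing formula to actual code, here is a loop-based sketch of `gather` restricted to 2-D tensors. The helper `gather_2d` is hypothetical and written only for illustration; it is compared against `torch.gather` for both values of `dim`.

```python
import torch

def gather_2d(src, dim, index):
    """Loop-based reference for torch.gather on 2-D tensors (illustration only)."""
    out = torch.empty_like(index, dtype=src.dtype)
    for i in range(index.size(0)):
        for j in range(index.size(1)):
            k = index[i, j].item()
            if dim == 0:
                out[i, j] = src[k, j]   # out[i][j] = src[index[i][j]][j]
            else:
                out[i, j] = src[i, k]   # out[i][j] = src[i][index[i][j]]
    return out

src = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
index = torch.tensor([[0, 1, 2], [1, 2, 0]])

for dim in (0, 1):
    expected = torch.gather(src, dim, index)
    print(dim, torch.equal(gather_2d(src, dim, index), expected))
# 0 True
# 1 True
```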


**NOTE:**

`dim` in PyTorch is similar to `axis` in NumPy. It's actually a bit tricky to visualize these concepts when first learning about them. The following resources will help in gaining intuition:
* [Resource 1](https://www.youtube.com/watch?v=nS0oKBbNjWY)
* [Resource 2](https://www.sharpsightlabs.com/blog/numpy-axes-explained/)
* [Resource 3](https://towardsdatascience.com/understanding-dimensions-in-pytorch-6edf9972d3be)
* [Resource 4](https://medium.com/@aerinykim/numpy-sum-axis-intuition-6eb94926a5d1)

### `torch.squeeze`

![alt text](https://drive.google.com/uc?id=1ohtcSu0VKZF8b68mXRGXvciEln8ycuXH)

In [23]:
x = torch.zeros(2, 1, 2, 1, 2)
print(x.size())

y = torch.squeeze(x)
print(y.size())

y = torch.squeeze(x, 0)
print(y.size())

y = torch.squeeze(x, 1)
print(y.size())

torch.Size([2, 1, 2, 1, 2])
torch.Size([2, 2, 2])
torch.Size([2, 1, 2, 1, 2])
torch.Size([2, 2, 1, 2])


### `torch.unsqueeze`

![alt text](https://drive.google.com/uc?id=1cQ7qLk3gtWZB6O70Un5VNJBH7L5_41Wp)

In [24]:
x = torch.tensor([1, 2, 3, 4])
print(x.size())
print("*" * 65)

print(torch.unsqueeze(x, 0))
print((torch.unsqueeze(x, 0)).size())
print("*" * 65)

print(torch.unsqueeze(x, 1))
print((torch.unsqueeze(x, 1)).size())
print("*" * 65)

torch.Size([4])
*****************************************************************
tensor([[1, 2, 3, 4]])
torch.Size([1, 4])
*****************************************************************
tensor([[1],
        [2],
        [3],
        [4]])
torch.Size([4, 1])
*****************************************************************
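
`unsqueeze` is equivalent to indexing with `None`, and `squeeze` on the same dimension undoes it. A minimal sketch of both facts:

```python
import torch

x = torch.tensor([1, 2, 3, 4])

print(torch.equal(x.unsqueeze(0), x[None, :]))    # True, shape (1, 4)
print(torch.equal(x.unsqueeze(1), x[:, None]))    # True, shape (4, 1)
print(x.unsqueeze(-1).shape)                      # torch.Size([4, 1])

# squeeze undoes unsqueeze on the same dimension
print(torch.equal(x.unsqueeze(1).squeeze(1), x))  # True
```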


### Putting it together (`gather` and `squeeze`)

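A common task is selecting one entry from each row of a matrix, which in NumPy can be written as `s[np.arange(N), y]`. In PyTorch you can perform the same operation using the `gather()` method. If `s` is a PyTorch Tensor of shape `(N, C)` and `y` is a PyTorch Tensor of shape `(N,)` containing longs in the range `0 <= y[i] < C`, then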

`s.gather(1, y.view(-1, 1)).squeeze()`

will be a PyTorch Tensor of shape `(N,)` containing one entry from each row of `s`, selected according to the indices in `y`.

Run the following cell to see an example.

In [25]:
# Example of using gather to select one entry from each row in PyTorch
def gather_example():
    N, C = 4, 5
    s = torch.randn(N, C)
    y = torch.LongTensor([1, 2, 1, 3])
    print(s)
    print(y)
    print(s.gather(1, y.view(-1, 1)).squeeze())
gather_example()

tensor([[ 0.5562, -2.3713, -0.7690, -0.5957,  0.3673],
        [-0.1252, -0.4316, -0.1369,  0.2187,  2.6047],
        [ 0.2975,  1.2903,  0.2209,  0.4667,  0.6130],
        [-1.0303,  0.0174,  0.1998,  0.8787,  1.5550]])
tensor([1, 2, 1, 3])
tensor([-2.3713, -0.1369,  1.2903,  0.8787])
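
The sketch below uses a tensor with known values to check that the `gather` expression matches NumPy-style integer indexing. It calls `squeeze(1)` rather than `squeeze()`, a small variation on the cell above that keeps the result 1-D even when `N` is 1.

```python
import torch

N, C = 4, 5
s = torch.arange(N * C, dtype=torch.float32).view(N, C)
y = torch.tensor([1, 2, 1, 3])

picked = s.gather(1, y.view(-1, 1)).squeeze(1)   # squeeze only the gathered dim
indexed = s[torch.arange(N), y]                   # NumPy-style equivalent

print(picked.tolist())                # [1.0, 7.0, 11.0, 18.0]
print(torch.equal(picked, indexed))   # True
```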


### `torch.max`

![alt text](https://drive.google.com/uc?id=1BWyunoD1xey8MRuADi9iSpBBrkS3TUL1)

In [26]:
a = torch.randn(1, 3)
print(a)

print(torch.max(a))

tensor([[-0.2067, -0.7156,  0.7224]])
tensor(0.7224)


![alt text](https://drive.google.com/uc?id=184CwB5REsibX8RpXNfN_xVnXVhwG-rjj)

In [27]:
a = torch.randn(4, 4)
print(a)
print("*" * 65)

print(torch.max(a, dim = 1))
print("*" * 65)

values, indices = torch.max(a, 1)
print(values)
print(indices)
print("*" * 65)

tensor([[ 0.0527,  0.0126, -0.2024,  0.9487],
        [ 0.5723, -1.2748, -0.7612, -0.8935],
        [ 0.4131,  0.7309,  0.1183, -0.4827],
        [ 0.1957,  0.8081, -0.9771, -0.0841]])
*****************************************************************
torch.return_types.max(
values=tensor([0.9487, 0.5723, 0.7309, 0.8081]),
indices=tensor([3, 0, 1, 1]))
*****************************************************************
tensor([0.9487, 0.5723, 0.7309, 0.8081])
tensor([3, 0, 1, 1])
*****************************************************************


![alt text](https://drive.google.com/uc?id=13aDh2cQ0pw8zROcV1rZIqdAMs4uwUmQ6)

In [28]:
a = torch.randn(4)
print(a)
b = torch.randn(4)
print(b)
print(torch.max(a, b))

tensor([ 0.0737, -0.9813,  0.1793, -0.1010])
tensor([ 0.0576, -0.8531,  0.4598, -0.3327])
tensor([ 0.0737, -0.8531,  0.4598, -0.1010])
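
The three forms of `torch.max` relate to other functions as sketched below, using fixed values so the results can be checked by hand. This assumes PyTorch 1.7 or later, where `torch.maximum` is available.

```python
import torch

a = torch.tensor([[1.0, 5.0, 2.0],
                  [7.0, 0.0, 3.0]])

values, indices = torch.max(a, dim=1)
print(values.tolist())                               # [5.0, 7.0]
print(indices.tolist())                              # [1, 0]
print(torch.equal(indices, torch.argmax(a, dim=1)))  # True

# The two-tensor form is an element-wise maximum.
b = torch.full_like(a, 2.5)
print(torch.equal(torch.max(a, b), torch.maximum(a, b)))  # True

# keepdim=True keeps the reduced dimension with size 1.
print(torch.max(a, dim=1, keepdim=True).values.shape)     # torch.Size([2, 1])
```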


### `torch.abs()`

![alt text](https://drive.google.com/uc?id=1-XwicMO-HPvMDhYKxIHGxi8qgctUDNUl)
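
`torch.abs` computes the element-wise absolute value. A minimal sketch with made-up values, showing the function, method and in-place forms:

```python
import torch

x = torch.tensor([-1.5, 0.0, 2.0])

print(torch.abs(x).tolist())   # [1.5, 0.0, 2.0]
print(x.abs().tolist())        # method form, same result
x.abs_()                       # in-place variant, overwrites x
print(x.tolist())              # [1.5, 0.0, 2.0]
```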

### `torch.cat`

![alt text](https://drive.google.com/uc?id=1gN9UzU_zBoG36dkV-r-iSyPil3AmPX09)

In [29]:
x = torch.randn(2, 3)
print(x)
print(x.size())
print("*" * 65)

print(torch.cat((x, x, x), 0))
print((torch.cat((x, x, x), 0)).size())
print("*" * 65)

print(torch.cat((x, x, x), 1))
print((torch.cat((x, x, x), 1)).size())
print("*" * 65)

tensor([[-1.1953,  1.7962,  0.3583],
        [ 0.4639, -0.8959, -0.2181]])
torch.Size([2, 3])
*****************************************************************
tensor([[-1.1953,  1.7962,  0.3583],
        [ 0.4639, -0.8959, -0.2181],
        [-1.1953,  1.7962,  0.3583],
        [ 0.4639, -0.8959, -0.2181],
        [-1.1953,  1.7962,  0.3583],
        [ 0.4639, -0.8959, -0.2181]])
torch.Size([6, 3])
*****************************************************************
tensor([[-1.1953,  1.7962,  0.3583, -1.1953,  1.7962,  0.3583, -1.1953,  1.7962,
          0.3583],
        [ 0.4639, -0.8959, -0.2181,  0.4639, -0.8959, -0.2181,  0.4639, -0.8959,
         -0.2181]])
torch.Size([2, 9])
*****************************************************************
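
`torch.cat` joins tensors along an existing dimension, whereas `torch.stack` (not covered above) creates a new one. The sketch below contrasts the resulting shapes; the tensor names are made up.

```python
import torch

x = torch.zeros(2, 3)
y = torch.ones(2, 3)

print(torch.cat((x, y), dim=0).shape)    # torch.Size([4, 3])
print(torch.cat((x, y), dim=1).shape)    # torch.Size([2, 6])
print(torch.stack((x, y), dim=0).shape)  # torch.Size([2, 2, 3]), new leading dim

# cat only needs the non-concatenated dimensions to match
z = torch.ones(5, 3)
print(torch.cat((x, z), dim=0).shape)    # torch.Size([7, 3])
```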


## NumPy Bridge
Converting a Torch Tensor to a NumPy array and vice versa is a breeze.

The Torch Tensor and NumPy array will share their underlying memory locations (if the Torch Tensor is on CPU), and changing one will change the other.

### Converting a Torch Tensor to a NumPy Array

In [30]:
a = torch.ones(5)
print(a)
print(a.size())

tensor([1., 1., 1., 1., 1.])
torch.Size([5])


In [31]:
b = a.numpy()
print(b)

[1. 1. 1. 1. 1.]


See how the NumPy array changes in value when the tensor is modified in place.

In [32]:
a.add_(1)
print(a)
print(b)

tensor([2., 2., 2., 2., 2.])
[2. 2. 2. 2. 2.]


### Converting NumPy Array to Torch Tensor
See how changing the NumPy array changes the Torch Tensor automatically.

In [33]:
import numpy as np
a = np.ones(5)
b = torch.from_numpy(a)
np.add(a, 1, out=a)
print(a)
print(b)

[2. 2. 2. 2. 2.]
tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
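
When the shared memory is not wanted, make a copy. The sketch below contrasts `torch.from_numpy` (shares memory) with `torch.tensor` (copies), and uses `clone()` to detach a NumPy result from its source. Variable names are made up for illustration.

```python
import numpy as np
import torch

a = np.ones(3)

shared = torch.from_numpy(a)   # shares memory with a
copied = torch.tensor(a)       # always copies the data

a[0] = 10.0
print(shared[0].item())        # 10.0
print(copied[0].item())        # 1.0

independent = shared.clone().numpy()   # clone first to break the link
a[1] = 20.0
print(independent[1])                  # 1.0
print(np.shares_memory(a, shared.numpy()))  # True
```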


All the Tensors on the CPU except a CharTensor support converting to NumPy and back.

## CUDA Tensors

Tensors can be moved onto any device using the `.to` method.

In [0]:
# let us run this cell only if CUDA is available
# We will use ``torch.device`` objects to move tensors in and out of GPU
if torch.cuda.is_available():
    device = torch.device("cuda")          # a CUDA device object
    y = torch.ones_like(x, device=device)  # directly create a tensor on GPU
    x = x.to(device)                       # or just use strings ``.to("cuda")``
    z = x + y
    print(z)
    print(z.to("cpu", torch.double))       # ``.to`` can also change dtype together!

Another, more common, way to use the GPU in PyTorch is as follows:

In [35]:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)

cpu


You can move data to the GPU by doing `.to(device)` 

In [0]:
data = torch.eye(3)
data = data.to(device)
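
Putting the pieces together, the following device-agnostic sketch runs unchanged on a machine with or without a GPU. It also shows the step back to the CPU that is needed before converting to NumPy.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = torch.eye(3, device=device)       # created directly on the chosen device
b = torch.ones(3, 3).to(device)       # created on CPU, then moved
c = a + b                             # operands must live on the same device

print(c.device.type == device.type)   # True
c_np = c.cpu().numpy()                # NumPy needs a CPU tensor
print(c_np.sum())                     # 12.0
```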

The bottom line is that you can do pretty much everything using only Tensors!

# References

1. [Deep Learning with PyTorch: A 60 minute blitz, Soumith Chintala](https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html)

2. [TORCH](https://pytorch.org/docs/stable/torch.html)

3. [Stefan Otte: Deep Neural Networks with PyTorch | PyData Berlin 2018](https://www.youtube.com/watch?v=_H3aw6wkCv0&t=821s)

4. [CS231n: Convolutional Neural Networks for Visual Recognition](http://cs231n.stanford.edu/)



