## What is PyTorch?

PyTorch is a Python-based scientific computing package targeted at two sets of
audiences:
- those who want a replacement for NumPy that can use the power of GPUs
- those who want a deep learning research platform that provides maximum flexibility and speed

#### Tensors
Tensors are similar to NumPy’s ndarrays, with the addition that they can also be used on a GPU to accelerate computing.

In [1]:
# import PyTorch
import torch
print(torch.__version__)

0.4.0


Construct a 5x3 matrix, uninitialized (the values are whatever was already in the allocated memory):

In [2]:
x = torch.empty(5, 3)
print(x)

tensor([[ 1.2303e-37,  0.0000e+00,  5.7453e-44],
        [ 0.0000e+00,         nan,  6.4893e-07],
        [ 1.3733e-14,  6.4076e+07,  2.0706e-19],
        [ 7.3909e+22,  2.4176e-12,  1.1625e+33],
        [ 8.9605e-01,  1.1632e+33,  5.6003e-02]])


Construct a randomly initialized matrix:

In [3]:
x = torch.rand(5, 3)
print(x)

tensor([[ 0.9453,  0.5786,  0.8760],
        [ 0.6387,  0.5484,  0.3137],
        [ 0.2800,  0.0729,  0.2273],
        [ 0.0281,  0.9334,  0.3736],
        [ 0.7685,  0.6841,  0.3241]])


Get its size:

In [4]:
x.size()

torch.Size([5, 3])
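
As a small aside on the result above: `torch.Size` behaves like a Python tuple, so it can be unpacked and indexed. The sketch below illustrates this; the variable names (`rows`, `cols`, `n_elements`) are chosen for illustration only.

```python
import torch

x = torch.rand(5, 3)

size = x.size()
rows, cols = size                    # torch.Size unpacks like a tuple
is_tuple = isinstance(size, tuple)   # torch.Size is a tuple subclass
n_elements = x.numel()               # total number of elements (rows * cols)

print(rows, cols, is_tuple, n_elements)
```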

Operations can be written in more than one way. Addition, for example, works both as an operator and as a function call:

In [5]:
y = torch.rand(5, 3)
print(x + y)
print(torch.add(x, y))

tensor([[ 0.9513,  1.1290,  0.9448],
        [ 0.7734,  1.2357,  1.0777],
        [ 0.4542,  0.3850,  0.9336],
        [ 0.1764,  0.9566,  1.2332],
        [ 1.0841,  0.6899,  0.8097]])
tensor([[ 0.9513,  1.1290,  0.9448],
        [ 0.7734,  1.2357,  1.0777],
        [ 0.4542,  0.3850,  0.9336],
        [ 0.1764,  0.9566,  1.2332],
        [ 1.0841,  0.6899,  0.8097]])


In-place operation:

In [6]:
y.add_(x)
print(y)

tensor([[ 0.9513,  1.1290,  0.9448],
        [ 0.7734,  1.2357,  1.0777],
        [ 0.4542,  0.3850,  0.9336],
        [ 0.1764,  0.9566,  1.2332],
        [ 1.0841,  0.6899,  0.8097]])


**Note:** Any operation that mutates a tensor in place is suffixed with an underscore (```_```). For example, ```y.add_(x)``` changes ```y```.
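
To make the difference concrete, here is a minimal sketch contrasting the out-of-place ```add``` with the in-place ```add_```. The tensors and variable names are made up for this illustration.

```python
import torch

x = torch.ones(2, 3)
y = torch.zeros(2, 3)

# Out-of-place: returns a new tensor and leaves y untouched.
z = y.add(x)
y_unchanged = bool((y == 0).all())

# In-place: mutates y itself and returns that same tensor object.
w = y.add_(x)
same_object = w is y

print(y_unchanged, same_object)
print(y)
```

In-place operations save memory, but they overwrite the original values, so use them only when the old contents are no longer needed.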

More than 100 Tensor operations, including transposing, indexing, slicing, mathematical operations, linear algebra, and random numbers, are described [here](https://pytorch.org/docs/stable/tensors.html).
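
A few of these operations are sketched below on a small tensor. This is only a sampler, and the variable names are illustrative.

```python
import torch

# A 2x3 matrix: [[0., 1., 2.], [3., 4., 5.]]
m = torch.arange(6, dtype=torch.float32).reshape(2, 3)

first_row = m[0]           # indexing: first row
last_col = m[:, -1]        # slicing: last column
m_t = m.t()                # transposing: shape becomes (3, 2)
gram = torch.mm(m, m_t)    # linear algebra: (2x3) @ (3x2) matrix product
total = m.sum()            # mathematical reduction over all elements

print(first_row, last_col)
print(m_t.size(), gram, total)
```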

#### NumPy Bridge

Converting a torch Tensor to a NumPy array and vice versa is a breeze.

**Note:** The torch Tensor and NumPy array share their underlying memory locations (as long as the tensor lives on the CPU), so changing one will change the other.

In [7]:
# Converting torch Tensor to numpy Array
a = torch.ones(5)
print(a)

tensor([ 1.,  1.,  1.,  1.,  1.])


In [8]:
b = a.numpy()
print(b)

[1. 1. 1. 1. 1.]


Now change the values of ```a``` in place and see how ```b``` changes as well:

In [9]:
a.add_(1)
print(a)
print(b)

tensor([ 2.,  2.,  2.,  2.,  2.])
[2. 2. 2. 2. 2.]


Converting NumPy arrays to torch tensors:

In [10]:
import numpy as np
a = np.ones(5)
b = torch.from_numpy(a)
np.add(a, 1, out=a)
print(a)
print(b)

[2. 2. 2. 2. 2.]
tensor([ 2.,  2.,  2.,  2.,  2.], dtype=torch.float64)
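
If the shared memory is not what you want, you can make an explicit copy. The sketch below shows two ways to do this: calling `clone()` before `numpy()`, and using `torch.tensor()` (which copies its input) instead of `torch.from_numpy()`. The variable names are illustrative.

```python
import numpy as np
import torch

# Tensor -> NumPy
a = torch.ones(3)
shared = a.numpy()               # shares memory with a
independent = a.clone().numpy()  # clone() copies first, so nothing is shared

a.add_(1)
print(shared)       # follows the change to a
print(independent)  # keeps the old values

# NumPy -> Tensor
n = np.zeros(3)
t_shared = torch.from_numpy(n)   # shares memory with n
t_copy = torch.tensor(n)         # copies the data

n += 1
print(t_shared)     # follows the change to n
print(t_copy)       # keeps the old values
```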


#### CUDA Tensors (Using GPU)

In [11]:
# Run this cell only if CUDA is available
if torch.cuda.is_available():
    x = x.to('cuda')
    y = y.to('cuda')
    print(x + y)
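
A common way to write code that runs with or without a GPU is to pick a device once and reuse it. The following is a minimal sketch of that pattern, with illustrative variable names; it falls back to the CPU when CUDA is not available.

```python
import torch

# Pick the GPU when it is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.rand(5, 3, device=device)   # create the tensor directly on the device
y = torch.rand(5, 3).to(device)       # or move an existing tensor to it
z = x + y                             # both operands must be on the same device

# Move the result back to the CPU before converting it to NumPy.
z_cpu = z.to("cpu")
print(z.device, z_cpu.numpy().shape)
```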

## Autograd: automatic differentiation
Central to all neural networks in PyTorch is the autograd package. The autograd package provides automatic differentiation for all operations on Tensors. It is a define-by-run framework, which means that your backprop is defined by how your code is run, and that every single iteration can be different.
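
As a rough illustration of define-by-run, the sketch below uses an ordinary Python loop whose length changes between calls. The recorded graph, and therefore the gradient, follows whatever code actually ran. The function `scaled` and its argument `n_doublings` are hypothetical names made up for this example.

```python
import torch

def scaled(x, n_doublings):
    # The number of multiplications recorded depends on a plain Python argument.
    y = x
    for _ in range(n_doublings):
        y = y * 2
    return y.sum()

x = torch.ones(3, requires_grad=True)

scaled(x, 1).backward()
grad_one = x.grad.clone()     # d/dx of sum(2 * x) is 2

x.grad.zero_()                # clear the stored gradient before the second pass
scaled(x, 3).backward()
grad_three = x.grad.clone()   # d/dx of sum(8 * x) is 8

print(grad_one, grad_three)
```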

### requires_grad parameter
If you set a tensor's ```.requires_grad``` attribute to ```True```, autograd starts to track all operations on it. When you finish your computation, you can call ```.backward()``` and have all the gradients computed automatically. The gradient for this tensor is accumulated into its ```.grad``` attribute.
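
The word "accumulated" matters: each call to ```.backward()``` adds to ```.grad``` rather than overwriting it. The sketch below, with illustrative variable names, shows the accumulation and how to reset it.

```python
import torch

x = torch.ones(2, requires_grad=True)

(x * 3).sum().backward()
first = x.grad.clone()        # gradient of sum(3 * x) is 3 per element

(x * 3).sum().backward()
second = x.grad.clone()       # the new gradient is added on top: 3 + 3 = 6

x.grad.zero_()                # reset the accumulated gradient in place
(x * 3).sum().backward()
after_reset = x.grad.clone()  # back to 3 per element

print(first, second, after_reset)
```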

In [12]:
x = torch.ones(2, 2, requires_grad=True)
print(x)

tensor([[ 1.,  1.],
        [ 1.,  1.]])


In [13]:
y = x + 2
print(y)

tensor([[ 3.,  3.],
        [ 3.,  3.]])


```y``` was created as a result of an operation, so it has a ```grad_fn```.

In [14]:
print(y.grad_fn)

<AddBackward0 object at 0x7fe40c7e20b8>


In [15]:
z = y * y * 3
out = z.mean()
print(z, out)

tensor([[ 27.,  27.],
        [ 27.,  27.]]) tensor(27.)


### Gradients
Let’s backprop now. Because ```out``` contains a single scalar, ```out.backward()``` is equivalent to ```out.backward(torch.tensor(1.))```.

In [16]:
out.backward()
print(x.grad)

tensor([[ 4.5000,  4.5000],
        [ 4.5000,  4.5000]])


We got a matrix filled with ```4.5```. Let's call the ```out``` tensor $O$. We have $O = \frac{1}{4} \sum_i z_i$ with $z_i = 3(x_i+2)^2$, so $z_i |_{x_i=1} = 27$. Therefore $\frac{\partial O}{\partial x_i} = \frac{3}{2}(x_i+2)$, and hence $\frac{\partial O}{\partial x_i} |_{x_i=1} = \frac{9}{2} = 4.5$.
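
As a quick sanity check, the hand-derived gradient can be compared with what autograd computes. The sketch below rebuilds the same computation and evaluates the closed form $\frac{3}{2}(x_i+2)$; the name `analytic` is introduced here for illustration.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)

# Same computation as above: z = 3 * (x + 2)^2, out = mean(z)
out = (3 * (x + 2) ** 2).mean()
out.backward()

# Closed-form gradient derived by hand: dO/dx_i = 1.5 * (x_i + 2)
analytic = 1.5 * (x.detach() + 2)

print(x.grad)
print(torch.allclose(x.grad, analytic))
```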

[For the documentation, read here.](https://pytorch.org/docs/stable/autograd.html)