# PyTorch

In this Jupyter Notebook, we cover the parts of PyTorch that are needed for this class. Tutorials and further examples are available at the following URLs:
1. PyTorch: https://pytorch.org/
2. Tutorials: https://pytorch.org/tutorials/index.html and https://github.com/yunjey/pytorch-tutorial
3. Examples: https://pytorch.org/tutorials/beginner/pytorch_with_examples.html
4. Tensor tutorial: https://pytorch.org/tutorials/beginner/former_torchies/tensor_tutorial.html#sphx-glr-beginner-former-torchies-tensor-tutorial-py

## What is PyTorch?
PyTorch provides two main features:
1. An n-dimensional Tensor, similar to a NumPy array, that can also run on GPUs
2. Automatic differentiation for building and training neural networks

In [1]:
import torch
import numpy as np

In [2]:
a = torch.empty(3, 2, dtype=torch.float)  # create a tensor of size (3 x 2) with uninitialized memory
print(a)
b = a*0
print(b)

tensor([[0.0000e+00, 0.0000e+00],
        [2.0250e+18, 1.0845e-19],
        [9.1836e-40, 4.8752e-10]])
tensor([[0., 0.],
        [0., 0.],
        [0., 0.]])


In [3]:
a = torch.randn(3, 2, dtype=torch.double)
print(a)
print(a.size())

tensor([[ 1.9837,  1.5332],
        [-0.8643, -0.5868],
        [ 1.3702,  0.0401]], dtype=torch.float64)
torch.Size([3, 2])


In [4]:
print(a[0,0])
print(a[0,0].item())  # item() can be called only on a single-element tensor
# print(a.item())  # this would fail because a has more than one element

tensor(1.9837, dtype=torch.float64)
1.98371892180179


# In-place / Out-of-place operations

In [7]:
a = torch.empty(3, 2, dtype=torch.float)
print(a)
a.fill_(3.5)
print(a)

tensor([[0.0000, 0.0000],
        [0.0000, 0.0000],
        [0.0000, 0.0000]])
tensor([[3.5000, 3.5000],
        [3.5000, 3.5000],
        [3.5000, 3.5000]])


In [8]:
b = a.add(4.0) # a new tensor "b" is created
print(b)
# b[0,0]=9

tensor([[7.5000, 7.5000],
        [7.5000, 7.5000],
        [7.5000, 7.5000]])


In [11]:
print(a)
a.add_(b)  # the trailing underscore means this happens in place
print(a)

tensor([[3.5000, 3.5000],
        [3.5000, 3.5000],
        [3.5000, 3.5000]])
tensor([[11., 11.],
        [11., 11.],
        [11., 11.]])

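The difference between the two styles can be checked directly. The following is a minimal sketch, assuming that comparing `data_ptr()` values is an acceptable way to tell whether two tensors use the same storage; the variable names are illustrative only.

```python
import torch

a = torch.full((3, 2), 3.5)

# Out-of-place: add() allocates a new tensor and leaves `a` unchanged.
b = a.add(4.0)
out_of_place_shares_memory = b.data_ptr() == a.data_ptr()

# In-place: add_() writes into the existing storage and returns `a` itself.
ptr_before = a.data_ptr()
c = a.add_(b)
in_place_returns_same_object = c is a
in_place_keeps_storage = a.data_ptr() == ptr_before

print(out_of_place_shares_memory)    # False
print(in_place_returns_same_object)  # True
print(in_place_keeps_storage)        # True
print(a[0, 0].item())                # 11.0
```

In PyTorch, methods whose names end in an underscore (such as `fill_` and `add_`) modify the tensor they are called on.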

# Accessing elements

In [12]:
a = torch.randn(3, 3, dtype=torch.double)
print(a)
b = a[0:2,1:3]
print(b)

tensor([[-0.1404,  1.3053, -0.0946],
        [-0.6335, -0.6155, -0.1908],
        [ 0.8496,  1.0392,  2.0897]], dtype=torch.float64)
tensor([[ 1.3053, -0.0946],
        [-0.6155, -0.1908]], dtype=torch.float64)


In [13]:
x = torch.ones(2, 2, 3)
print(x)

tensor([[[1., 1., 1.],
         [1., 1., 1.]],

        [[1., 1., 1.],
         [1., 1., 1.]]])


In [14]:
z = torch.empty(5, 2)
z[:, 0] = 10
z[:, 1] = 100
print(z)

tensor([[ 10., 100.],
        [ 10., 100.],
        [ 10., 100.],
        [ 10., 100.],
        [ 10., 100.]])
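One point worth keeping in mind when slicing is that basic slices are views: they share memory with the original tensor. The sketch below illustrates this with a small deterministic tensor (chosen here only for readability) and uses `clone()` when an independent copy is wanted.

```python
import torch

a = torch.arange(9, dtype=torch.double).reshape(3, 3)

b = a[0:2, 1:3]          # basic slicing returns a view of `a`
b[0, 0] = -1.0           # writes through to a[0, 1]

c = a[0:2, 1:3].clone()  # clone() makes an independent copy
c[0, 0] = 100.0          # does not affect `a`

print(a[0, 1].item())    # -1.0
print(c[0, 0].item())    # 100.0
```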


# Converting a torch Tensor to a NumPy array

In [16]:
a = torch.ones(5)
print("pytorch:",a) 
b = a.numpy()
print("numpy:",b) 

pytorch: tensor([1., 1., 1., 1., 1.])
numpy: [1. 1. 1. 1. 1.]


In [17]:
print("a",a)
print("b",b)
a.add_(1)
print("a",a)
print("b",b)  # the NumPy array changed too, because a and b share the same underlying memory

a tensor([1., 1., 1., 1., 1.])
b [1. 1. 1. 1. 1.]
a tensor([2., 2., 2., 2., 2.])
b [2. 2. 2. 2. 2.]


In [None]:
# Converting a NumPy array to a torch Tensor
import numpy as np
a = np.ones(5)
b = torch.from_numpy(a)
print(a)
print(b)
np.add(a, 1, out=a)  # in-place NumPy addition
print(a)
print(b)  # changing the NumPy array also changed the torch Tensor, because they share memory

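If the shared memory is not wanted, the data can be copied instead. A minimal sketch, contrasting `torch.from_numpy` (shares memory) with `torch.tensor` (copies); the names `shared` and `copied` are illustrative:

```python
import numpy as np
import torch

a = np.ones(5)
shared = torch.from_numpy(a)  # shares memory with `a`
copied = torch.tensor(a)      # copies the data into a new tensor

np.add(a, 1, out=a)           # in-place update of the NumPy array

print(shared)  # follows the change made to `a`
print(copied)  # keeps the original values
```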
In [None]:
# Matrix Multiplication
A = torch.randn(3, 10, dtype=torch.double)
B = torch.randn(3, 10, dtype=torch.double)
C = torch.mm(A,B.t())
print(C)

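As a short sketch of how the shapes work out, the example below compares `torch.mm` with the `@` operator and with the element-wise product `*`, which is easy to confuse with matrix multiplication. The variable names are illustrative.

```python
import torch

A = torch.randn(3, 10, dtype=torch.double)
B = torch.randn(3, 10, dtype=torch.double)

C_mm = torch.mm(A, B.t())  # (3 x 10) times (10 x 3) gives (3 x 3)
C_op = A @ B.t()           # the @ operator performs the same matrix product
C_elementwise = A * B      # element-wise product, shape stays (3 x 10)

print(C_mm.shape)
print(C_elementwise.shape)
print(torch.allclose(C_mm, C_op))
```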
# Autograd
1. Autograd is the core torch package for automatic differentiation. It uses a tape-based system.
2. In the forward phase, the autograd tape records every operation it executes; in the backward phase, it replays those operations in reverse to compute gradients.

See the following for more details:
1. https://en.wikipedia.org/wiki/Automatic_differentiation
2. https://pytorch.org/tutorials/beginner/former_torchies/autograd_tutorial.html#sphx-glr-beginner-former-torchies-autograd-tutorial-py
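The forward and backward phases described above can be sketched on a single scalar. This is a minimal illustration, assuming the function $y = x^2 + 2x$, which is chosen here only because its derivative is easy to check by hand.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)  # leaf tensor created by the user
y = x ** 2 + 2 * x                          # forward pass: operations are recorded

print(x.grad_fn)              # None, because x was not produced by an operation
print(y.grad_fn is not None)  # y remembers the operation that produced it

y.backward()                  # backward pass: gradients flow back to the leaves
print(x.grad)                 # dy/dx = 2*x + 2 = 8 at x = 3
```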

In [18]:
x = torch.ones(2, 2, requires_grad=True)
print(x)
print(x.data)
print(x.grad)
print(x.grad_fn)  # we've created x ourselves

tensor([[1., 1.],
        [1., 1.]], requires_grad=True)
tensor([[1., 1.],
        [1., 1.]])
None
None


In [19]:
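y = np.sin(x.data)  # x.data is detached from the graph, so autograd does not track this operation
print(y)
print(y.grad_fn)  # None, because y has no recorded history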

tensor([[0.8415, 0.8415],
        [0.8415, 0.8415]])
None


In [20]:
z = y * y * 3
out = z.mean()
print(z, out)

tensor([[2.1242, 2.1242],
        [2.1242, 2.1242]]) tensor(2.1242)


In [21]:
out.backward()  # this should compute gradients, but it fails here: out was built from x.data, which autograd does not track
print(x.grad)

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
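The error arises because `y` was computed from `x.data` with a NumPy function, so nothing connects `out` back to `x`. A minimal sketch of a working variant, assuming the intent was to differentiate $\text{mean}(3 \sin^2 x)$, replaces that step with `torch.sin(x)` and compares the result with the analytic gradient:

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = torch.sin(x)  # torch.sin applied to x itself keeps the graph intact
z = y * y * 3
out = z.mean()
out.backward()    # gradients are accumulated into x.grad

# Analytic gradient of mean(3 * sin(x)^2) over 4 elements: (6 / 4) * sin(x) * cos(x)
expected = 1.5 * torch.sin(x.detach()) * torch.cos(x.detach())

print(x.grad)
print(torch.allclose(x.grad, expected))
```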

# Exercises
1. Create two PyTorch tensors $a \in \mathbb{R}^{n}$ and $b \in \mathbb{R}^{n}$. Compute $y = a \cdot b$.
2. Create two input matrices $A$, $B$ and define some complicated function $y = f(A,B)$ such that $y \in \mathbb{R}$. Then compute the gradients $\nabla_A f(A,B)$ and $\nabla_B f(A,B)$.

# GPU Support

In [24]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
if device.type == 'cuda':
    torch.cuda.set_device(6)  # index of the GPU to use; please try to negotiate how to split GPUs ;)
print(device)

In [None]:
import time

In [None]:
N = 20000
A = torch.randn(N,N)
B = torch.randn(N,N)
C = torch.mm(A,B)

AGPU = A.to(device)
BGPU = B.to(device)
CGPU = torch.mm(AGPU,BGPU)

In [None]:
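start = time.time()
C = torch.mm(A, B)
print(time.time() - start)

start = time.time()
CGPU = torch.mm(AGPU, BGPU)
if device.type == 'cuda':
    torch.cuda.synchronize()  # CUDA kernels run asynchronously; wait for the result before stopping the clock
print(time.time() - start)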

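Timing GPU code needs some care, because CUDA kernels are launched asynchronously. The following is a rough sketch of a timing helper; the function name `timed_mm`, the `repeats` parameter, and the small matrix size are assumptions made for illustration, and the code falls back to the CPU when no GPU is available.

```python
import time
import torch

def timed_mm(A, B, repeats=3):
    """Return the average wall-clock time of torch.mm(A, B) in seconds."""
    if A.is_cuda:
        torch.cuda.synchronize()  # finish pending GPU work before starting the clock
    start = time.perf_counter()
    for _ in range(repeats):
        C = torch.mm(A, B)
    if A.is_cuda:
        torch.cuda.synchronize()  # wait for the asynchronous kernels to finish
    return (time.perf_counter() - start) / repeats

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
N = 500  # hypothetical size, kept small so the sketch also runs quickly on a CPU
A = torch.randn(N, N)
B = torch.randn(N, N)

cpu_seconds = timed_mm(A, B)
device_seconds = timed_mm(A.to(device), B.to(device))
print(cpu_seconds, device_seconds)
```

Averaging over a few repetitions reduces the effect of one-time costs such as GPU initialization, although the first call may still be slower than the rest.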