> ### EEE4423: Deep Learning Lab

# Lab \# 1: PyTorch Basics

In [1]:
import datetime
print("This code is written at " + str(datetime.datetime.now()))

This code is written at 2022-03-04 11:16:59.254867


# 1. Matrices with PyTorch 

## Matrices


### Matrices Brief Introduction
- Basic definition: rectangular array of numbers.
- Tensors (PyTorch)
- Ndarrays (NumPy)

- **2 x 2 Matrix (R x C)**

1 | 1 
--- | ---
1 | 1 

- **2 x 3 Matrix**

1 | 1 | 1
--- | ---| ---
1 | 1 | 1 

### Creating Matrices

- **Create list**

In [5]:
# Creating a 2x2 array
arr = [[1, 2], [3, 4]]
print(arr)
print(type(arr))

[[1, 2], [3, 4]]
<class 'list'>


- **Create numpy array via list**

In [7]:
import numpy as np

# Convert to NumPy (np.array returns a new array; `arr` itself stays a list)
np.array(arr)
print(arr)
print(type(arr))
print(type(np.array(arr)))

[[1, 2], [3, 4]]
<class 'list'>
<class 'numpy.ndarray'>
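
The cell above shows that `np.array(arr)` does not modify `arr`. A minimal sketch of keeping the converted array, assuming a new variable name `np_arr` that is not part of the original lab:

```python
import numpy as np

arr = [[1, 2], [3, 4]]

# np.array builds a new ndarray; assign it to keep the result
np_arr = np.array(arr)

print(type(arr))      # the original list is untouched
print(type(np_arr))   # the converted array
print(np_arr.shape)   # (2, 2)
```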


- **Convert list to PyTorch tensor**

`arr` is still the Python list created above; `torch.Tensor` accepts it directly.

In [4]:
import torch

# Convert to PyTorch Tensor
torch.Tensor(arr)

tensor([[1., 2.],
        [3., 4.]])
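
Note that the integers in `arr` came back as floats. As a hedged side-by-side sketch (the names `t_float` and `t_int` are illustrative only), `torch.Tensor` builds a tensor of the default floating type, while the lowercase factory `torch.tensor` infers the dtype from the data:

```python
import torch

arr = [[1, 2], [3, 4]]

t_float = torch.Tensor(arr)  # constructor: uses the default dtype
t_int = torch.tensor(arr)    # factory function: infers dtype from the Python ints

print(t_float.dtype)  # torch.float32
print(t_int.dtype)    # torch.int64
```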

### Creating Matrices  with Default Values

- **Create 2x2 numpy array of 1's**

In [9]:
np.ones((2, 2))

array([[1., 1.],
       [1., 1.]])

- **Create 2x2 torch tensor of 1's**

In [10]:
torch.ones((2, 2))

tensor([[1., 1.],
        [1., 1.]])

- **Create 2x2 numpy array of random numbers**

In [11]:
np.random.rand(2, 2)

array([[0.4236548 , 0.64589411],
       [0.43758721, 0.891773  ]])

- **Create 2x2 PyTorch tensor of random numbers**

In [12]:
torch.rand(2, 2)

tensor([[0.4171, 0.7779],
        [0.7806, 0.4943]])

### Seeds for Reproducibility

> **Why do we need seeds?** 
>
> We need seeds to make experimental results reproducible. This becomes critical later on, when you want other people to reproduce your code's output exactly as you produced it.

- **Create seed to enable fixed numbers for random number generation**

In [13]:
# Seed
np.random.seed(0)
np.random.rand(2, 2)

array([[0.5488135 , 0.71518937],
       [0.60276338, 0.54488318]])

- **Repeat random array generation to check**

Without resetting the seed, you would not get the same set of numbers as above.

In [14]:
# Seed
np.random.seed(0)
np.random.rand(2, 2)

array([[0.5488135 , 0.71518937],
       [0.60276338, 0.54488318]])

- **Create a numpy array without seed**

Notice how you get different numbers compared to the first 2 tries?

In [15]:
# No seed
np.random.rand(2, 2)

array([[0.4236548 , 0.64589411],
       [0.43758721, 0.891773  ]])

- **Repeat numpy array generation without seed**

You get the point now: every unseeded call gives a totally different set of numbers.

In [16]:
# No seed
np.random.rand(2, 2)

array([[0.96366276, 0.38344152],
       [0.79172504, 0.52889492]])

- **Create a PyTorch tensor with a fixed seed**

In [17]:
# Torch Seed
torch.manual_seed(0)
torch.rand(2, 2)

tensor([[0.4963, 0.7682],
        [0.0885, 0.1320]])

- **Repeat creating a PyTorch fixed seed tensor**

In [18]:
# Torch Seed
torch.manual_seed(0)
torch.rand(2, 2)

tensor([[0.4963, 0.7682],
        [0.0885, 0.1320]])

- **Creating a PyTorch tensor without seed**

Like with a numpy array of random numbers without seed, you will not get the same results as above.

In [19]:
# Torch No Seed
torch.rand(2, 2)

tensor([[0.3074, 0.6341],
        [0.4901, 0.8964]])

- **Repeat creating a PyTorch tensor without seed**

Notice how these are different numbers again?

In [20]:
# Torch No Seed
torch.rand(2, 2)

tensor([[0.4556, 0.6323],
        [0.3489, 0.4017]])

- **Seed for GPU is different: Fix a seed for GPU tensors**

Deep learning experiments typically run on GPUs to accelerate computation, and fixing the seed for tensors on GPUs is done differently from the CPU case above.

In [22]:
# Set the same random seed on all GPUs
if torch.cuda.is_available(): # check whether a GPU is available
    torch.cuda.manual_seed_all(0)
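
The seeding calls shown so far can be gathered into one place. This is a sketch only: `set_seed` is a hypothetical helper name, not a PyTorch function, and seeding Python's own `random` module is an extra step the lab does not cover.

```python
import random

import numpy as np
import torch


def set_seed(seed: int) -> None:
    """Hypothetical helper: seed every generator used in this lab."""
    random.seed(seed)                     # Python's built-in generator (extra, not in the lab)
    np.random.seed(seed)                  # NumPy
    torch.manual_seed(seed)               # PyTorch
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)  # every visible GPU


set_seed(0)
first = torch.rand(2, 2)
set_seed(0)
second = torch.rand(2, 2)
print(torch.equal(first, second))  # True
```

Calling the helper once at the top of a notebook keeps the individual seeding calls from drifting apart.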

In [23]:
!nvidia-smi # check GPU usage; there is currently one GPU
# This matters because variables stay allocated until the kernel is restarted,
# so monitoring usage helps you avoid running out of GPU memory

Fri Mar  4 11:23:00 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla K80           Off  | 00000000:00:04.0 Off |                    0 |
| N/A   51C    P8    30W / 149W |      3MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [24]:
tmp = torch.rand(32,2048,128,128)

In [27]:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
tmp_gpu = tmp.to(device) # move to the GPU

In [28]:
!nvidia-smi # check GPU usage; there is currently one GPU
# This matters because variables stay allocated until the kernel is restarted,
# so monitoring usage helps you avoid running out of GPU memory

Fri Mar  4 11:27:55 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla K80           Off  | 00000000:00:04.0 Off |                    0 |
| N/A   53C    P0    70W / 149W |   4610MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [29]:
# A case that triggers an out-of-memory error
tmp = torch.rand(32,2048,128,11100000, device=device)

RuntimeError: ignored
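
The memory figures above can be estimated before allocating anything. The sketch below assumes float32 tensors (4 bytes per element); `tensor_bytes` is a hypothetical helper name, not a PyTorch function.

```python
import math


def tensor_bytes(shape, bytes_per_element=4):
    """Hypothetical helper: raw storage needed for a dense tensor (float32 by default)."""
    return math.prod(shape) * bytes_per_element


small = tensor_bytes((32, 2048, 128, 128))
huge = tensor_bytes((32, 2048, 128, 11100000))

print(small / 2**20)  # 4096.0 MiB
print(huge / 2**40)   # size in TiB, far beyond the 11441 MiB card
```

The first tensor needs 4096 MiB of raw storage, which is consistent with the jump from 3 MiB to 4610 MiB reported by `nvidia-smi`; the remainder is presumably overhead from the CUDA runtime. The second request is several hundred TiB, so it cannot fit on the card.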

### NumPy and Torch Bridge

#### NumPy to Torch

- **Create a numpy array of 1's**

In [30]:
# Numpy array
np_array = np.ones((2, 2))

print(np_array)

[[1. 1.]
 [1. 1.]]


- **Get the type of class for the numpy array**

In [31]:
print(type(np_array))

<class 'numpy.ndarray'>


In [32]:
print(np_array.dtype)

float64


- **Convert numpy array to PyTorch tensor**

In [35]:
# Convert to Torch Tensor
torch_tensor = torch.from_numpy(np_array)

print(torch_tensor)

tensor([[1., 1.],
        [1., 1.]], dtype=torch.float64)
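
One detail worth knowing about this bridge: `torch.from_numpy` shares memory with the source array rather than copying it. A small sketch, with `shared` and `copied` as illustrative names:

```python
import numpy as np
import torch

np_array = np.ones((2, 2))

shared = torch.from_numpy(np_array)  # shares the NumPy buffer
copied = torch.tensor(np_array)      # makes an independent copy

np_array[0, 0] = 5.0                 # modify the NumPy array in place

print(shared[0, 0].item())  # 5.0, the tensor sees the change
print(copied[0, 0].item())  # 1.0, the copy does not
```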


In [36]:
# Note that torch defaults to float32
torch.rand(3,3).dtype

torch.float32

- **Get type of class for PyTorch tensor**

Notice how it reports `torch.float64`? Tensors have several data types, and the one you get here depends on the NumPy data type.

In [37]:
print(torch_tensor.dtype)

torch.float64


- **Create PyTorch tensor from a different numpy datatype**

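This example was originally written as an intentional error, because PyTorch tensors do not support every NumPy datatype. In the PyTorch version used here, however, `int8` is supported and the conversion succeeds, as the output shows.

In [38]:
# Data types matter: int8 converts without error in this PyTorch version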
np_array_new = np.ones((2, 2), dtype=np.int8)
torch.from_numpy(np_array_new)

tensor([[1, 1],
        [1, 1]], dtype=torch.int8)

> **Which conversions does the NumPy to PyTorch tensor bridge support?**
> - `double` (`float64`)
> - `float` (`float32`)
> - `int64`, `int32`, `int8`, `uint8`
>
> [TORCH.DTYPE](https://pytorch.org/docs/stable/tensor_attributes.html#torch.torch.dtype)

- **Create PyTorch long tensor**

See how an int64 NumPy array gives you a PyTorch long tensor?

In [39]:
# Data types matter
np_array_new = np.ones((2, 2), dtype=np.int64)
torch_tensor = torch.from_numpy(np_array_new)

In [40]:
print(torch_tensor.dtype)

torch.int64


- **Create PyTorch int tensor**

In [41]:
# Data types matter
np_array_new = np.ones((2, 2), dtype=np.int32)
torch.from_numpy(np_array_new)

tensor([[1, 1],
        [1, 1]], dtype=torch.int32)

- **Create PyTorch byte tensor**

In [42]:
# Data types matter
np_array_new = np.ones((2, 2), dtype=np.uint8)
torch.from_numpy(np_array_new)

tensor([[1, 1],
        [1, 1]], dtype=torch.uint8)

- **Create PyTorch Double Tensor**

In [43]:
# Data types matter
np_array_new = np.ones((2, 2), dtype=np.float64)
torch.from_numpy(np_array_new)

tensor([[1., 1.],
        [1., 1.]], dtype=torch.float64)

Alternatively you can do this too via `np.double`

In [44]:
# Data types matter
np_array_new = np.ones((2, 2), dtype=np.double)
torch.from_numpy(np_array_new)

tensor([[1., 1.],
        [1., 1.]], dtype=torch.float64)

- **Create PyTorch Float Tensor**

In [45]:
# Data types matter
np_array_new = np.ones((2, 2), dtype=np.float32)
torch.from_numpy(np_array_new)

tensor([[1., 1.],
        [1., 1.]])
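
Because NumPy defaults to `float64` and PyTorch to `float32`, it is common to cast right after conversion. A minimal sketch, where `weights` is a stand-in for any float32 tensor such as a layer's parameters:

```python
import numpy as np
import torch

np_array = np.ones((2, 2))        # NumPy default: float64

t64 = torch.from_numpy(np_array)  # keeps float64
t32 = t64.float()                 # cast to float32; .to(torch.float32) is equivalent

weights = torch.ones(2, 2)        # PyTorch default: float32
out = t32 @ weights               # dtypes match, so this works

print(t64.dtype, t32.dtype)
print(out)
```

Mixing `float64` and `float32` operands in operations such as matrix multiplication typically raises a dtype error, which is why the cast is done up front.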

#### Torch to NumPy 

- **Create PyTorch tensor of 1's**

Notice that this creates a float tensor by default.

In [46]:
torch_tensor = torch.ones(2, 2)

print(torch_tensor.dtype)

torch.float32


- **Convert tensor to numpy**

It's as simple as this.

In [47]:
torch_to_numpy = torch_tensor.numpy()

print(type(torch_to_numpy))

<class 'numpy.ndarray'>


In [48]:
print(torch_to_numpy.dtype)

float32


### Tensors on CPU vs GPU

- **Move tensor from CPU to GPU and back**

Tensors are created on the CPU by default. You do not need to do anything.

In [50]:
# CPU
tensor_cpu = torch.ones(2, 2)

If you would like to send a tensor to your GPU, call `.to(device)` (or the shorthand `.cuda()`).

In [57]:
# CPU to GPU
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
tensor_gpu = tensor_cpu.to(device)
print(tensor_gpu)

tensor([[1., 1.],
        [1., 1.]], device='cuda:0')


In [58]:
torch.zeros(1,1, device = torch.device("cuda:0"))
# The tensor is placed on a single GPU, where it can be used in computation.

tensor([[0.]], device='cuda:0')

And if you want to move that tensor on the GPU back to the CPU, just do the following.

In [59]:
# GPU to CPU
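tensor_cpu = tensor_gpu.cpu() # bring the GPU tensor back down to the CPU
print(tensor_cpu)
# Reasons to move back to the CPU: sequential operations favor the CPU,
# and values such as the loss are printed from the CPU, so they must be brought down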

tensor([[1., 1.],
        [1., 1.]])


In [60]:
# Adding a CPU tensor and a GPU tensor is not possible
tensor_cpu + tensor_gpu

RuntimeError: ignored
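
The usual way around this error is to move both operands to the same device before combining them. A device-agnostic sketch that also runs on a machine without a GPU (the variable names are illustrative):

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = torch.ones(2, 2)                 # created on the CPU
b = torch.ones(2, 2, device=device)  # created on the target device

total = a.to(device) + b             # both operands now live on `device`
result = total.cpu()                 # bring the result back to the CPU

print(result)
print(result.device)
```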

- **More examples**

In [63]:
x = torch.randn(2,4)
print(x)

tensor([[ 0.6543, -0.5107,  0.6086, -0.2044],
        [-1.0294,  1.1573,  0.7635, -0.6948]])


In [64]:
if torch.cuda.is_available():
    device = torch.device("cuda")          # a CUDA device object
    y = torch.ones_like(x, device=device)  # directly create a tensor on GPU
    x = x.to(device)                       # or just use strings ``.to("cuda")``
    z = x + y
    print(z)
    print(z.to("cpu", torch.double))       # ``.to`` can also change dtype together!

tensor([[ 1.6543,  0.4893,  1.6086,  0.7956],
        [-0.0294,  2.1573,  1.7635,  0.3052]], device='cuda:0')
tensor([[ 1.6543,  0.4893,  1.6086,  0.7956],
        [-0.0294,  2.1573,  1.7635,  0.3052]], dtype=torch.float64)


### Tensor Operations

#### Resizing Tensor

- **Creating a 2x2 tensor**

In [95]:
a = torch.ones(2, 2)
print(a)

tensor([[1., 1.],
        [1., 1.]])


- **Getting size of tensor**

In [96]:
print(a.size())

torch.Size([2, 2])


In [97]:
print(a.shape)

torch.Size([2, 2])


In [98]:
tmp = torch.ones(16,256,32,32) # N,C,H,W

In [99]:
tmp.size(0)

16

In [100]:
tmp.shape[0]

16

- **Resize tensor to a 1-D tensor of 4 elements**

In [101]:
a.view(4) # flattens the tensor into a single row

tensor([1., 1., 1., 1.])

In [102]:
tmp.view(16*256*32*32)

tensor([1., 1., 1.,  ..., 1., 1., 1.])

In [103]:
tmp.view(16*256*32*32).shape

torch.Size([4194304])

In [105]:
tmp.permute(1,0,2,3).shape # (256, 16 ,32 ,32 ) 

torch.Size([256, 16, 32, 32])

- **Get size of resized tensor**

In [106]:
a.view(4).size()

torch.Size([4])
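
Two properties of `view` and `permute` are easy to miss: `view` returns a tensor that shares storage with the original, and a permuted tensor is usually no longer contiguous, so `view` may refuse to flatten it. A small sketch with illustrative names:

```python
import torch

tmp = torch.arange(6).view(2, 3)

flat = tmp.view(-1)
flat[0] = 100                     # writes through to tmp: same storage
print(tmp[0, 0].item())           # 100

permuted = tmp.permute(1, 0)      # shape (3, 2), but memory layout is unchanged
print(permuted.is_contiguous())   # False

try:
    permuted.view(-1)             # view needs a compatible memory layout
except RuntimeError as err:
    print("view failed:", type(err).__name__)

flat2 = permuted.reshape(-1)      # reshape copies when it has to
print(flat2.tolist())             # [100, 3, 1, 4, 2, 5]
```

`permuted.contiguous().view(-1)` gives the same result as `reshape` here.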

#### Element-wise Addition

- **Creating first 2x2 tensor**

In [107]:
a = torch.ones(2, 2)
print(a)

tensor([[1., 1.],
        [1., 1.]])


- **Creating second 2x2 tensor**

In [108]:
b = torch.ones(2, 2)
print(b)

tensor([[1., 1.],
        [1., 1.]])


- **Element-wise addition of 2 tensors**

In [109]:
# Element-wise addition
c = a + b
print(c)

tensor([[2., 2.],
        [2., 2.]])


- **Alternative element-wise addition of 2 tensors**

In [110]:
# Element-wise addition
c = torch.add(a, b)
print(c)

tensor([[2., 2.],
        [2., 2.]])


- **In-place element-wise addition**

This would replace the c tensor values with the new addition.

In [111]:
# In-place addition
print('Old c tensor')
print(c)

c.add_(a)

print('-'*60)
print('New c tensor')
print(c)

Old c tensor
tensor([[2., 2.],
        [2., 2.]])
------------------------------------------------------------
New c tensor
tensor([[3., 3.],
        [3., 3.]])


#### Element-wise Subtraction

- **Check values of tensor a and b**

Take note that you've created tensor a and b of sizes 2x2 filled with 1's each above.

In [112]:
print(a)
print(b)

tensor([[1., 1.],
        [1., 1.]])
tensor([[1., 1.],
        [1., 1.]])


- **Element-wise subtraction: method 1**

In [113]:
a - b

tensor([[0., 0.],
        [0., 0.]])

- **Element-wise subtraction: method 2**

In [114]:
# Not in-place
print(a.sub(b))
print(a)

tensor([[0., 0.],
        [0., 0.]])
tensor([[1., 1.],
        [1., 1.]])


- **Element-wise subtraction: method 3**

In [77]:
# Inplace
print(a.sub_(b))
print(a)

tensor([[0., 0.],
        [0., 0.]])
tensor([[0., 0.],
        [0., 0.]])


#### Element-Wise Multiplication

- **Create tensor a and b of sizes 2x2 filled with 1's and 0's**

In [78]:
a = torch.ones(2, 2)
print(a)
b = torch.zeros(2, 2)
print(b)

tensor([[1., 1.],
        [1., 1.]])
tensor([[0., 0.],
        [0., 0.]])


- **Element-wise multiplication: method 1**

In [79]:
a * b

tensor([[0., 0.],
        [0., 0.]])

- **Element-wise multiplication: method 2**

In [80]:
# Not in-place
print(torch.mul(a, b))
print(a)

tensor([[0., 0.],
        [0., 0.]])
tensor([[1., 1.],
        [1., 1.]])


- **Element-wise multiplication: method 3**

In [81]:
# In-place
print(a.mul_(b))
print(a)

tensor([[0., 0.],
        [0., 0.]])
tensor([[0., 0.],
        [0., 0.]])


#### Element-Wise Division

- **Create tensor a and b of sizes 2x2 filled with 3's and 1's**

In [82]:
a = 3 * torch.ones(2, 2)
print(a)
b = torch.ones(2, 2)
print(b)

tensor([[3., 3.],
        [3., 3.]])
tensor([[1., 1.],
        [1., 1.]])


- **Element-wise division: method 1**

In [83]:
b / a

tensor([[0.3333, 0.3333],
        [0.3333, 0.3333]])

In [84]:
print(a)
print(b)

tensor([[3., 3.],
        [3., 3.]])
tensor([[1., 1.],
        [1., 1.]])


- **Element-wise division: method 2**

In [85]:
torch.div(b, a)

tensor([[0.3333, 0.3333],
        [0.3333, 0.3333]])

In [86]:
print(a)
print(b)

tensor([[3., 3.],
        [3., 3.]])
tensor([[1., 1.],
        [1., 1.]])


- **Element-wise division: method 3**

In [87]:
# Inplace
b.div_(a)

tensor([[0.3333, 0.3333],
        [0.3333, 0.3333]])

In [88]:
print(a)
print(b)

tensor([[3., 3.],
        [3., 3.]])
tensor([[0.3333, 0.3333],
        [0.3333, 0.3333]])


#### Tensor Mean

$$1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 = 55$$

$$ mean = 55 /10 = 5.5 $$


- **Create tensor of size 10 filled from 1 to 10**

In [115]:
a = torch.Tensor([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
a.size()

torch.Size([10])

- **Get tensor mean**

Here we get 5.5 as we've calculated manually above.

In [116]:
a.mean(dim=0) # without dim, the mean is taken over all values

tensor(5.5000)

- **Get tensor mean on second dimension**

Here we get an error because the tensor is of size 10 and not 10x1 so there's no second dimension to calculate.

In [118]:
a.mean(dim=1) # when dim is not valid

IndexError: ignored

- **Create a 2x10 tensor with the numbers 1 to 10 in each row**

In [119]:
a = torch.Tensor([[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]])

In [129]:
print(a)
print(a.shape)

tensor([[ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
        [ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]])
torch.Size([2, 10])


In [120]:
a.size()

torch.Size([2, 10])

In [125]:
a.mean()

tensor(5.5000)

In [126]:
a.mean(dim=1)

tensor([5.5000, 5.5000])

In [127]:
a.mean(dim = 0)

tensor([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])

In [130]:
a.sum()

tensor(110.)

In [131]:
a.std()

tensor(2.9469)

- **Get tensor mean on second dimension**

Here we won't get an error like previously because we've a tensor of size 2x10

In [132]:
a.mean(dim=1)

tensor([5.5000, 5.5000])

#### Tensor Standard Deviation

- **Get standard deviation of tensor**

In [133]:
a = torch.Tensor([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
a.std(dim=0)

tensor(3.0277)
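
The value 3.0277 is the sample standard deviation, which divides by $n - 1$ rather than $n$. The sketch below recomputes it by hand to make the formula explicit; the variable names are illustrative.

```python
import torch

a = torch.tensor([1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])
n = a.numel()

squared_dev = ((a - a.mean()) ** 2).sum()     # sum of squared deviations: 82.5

sample_std = (squared_dev / (n - 1)).sqrt()   # divide by n - 1
population_std = (squared_dev / n).sqrt()     # divide by n

print(sample_std)      # approximately 3.0277, same as a.std()
print(population_std)  # approximately 2.8723
```

The same formula accounts for the 2.9469 reported earlier for the 2x10 tensor: with 20 elements the squared deviations sum to 165, and $\sqrt{165/19} \approx 2.9469$.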

#### Broadcasting, Reshape, Repeat, Concatenate, Advanced Indexing

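- **Create a tensor without initializing its values**

`torch.Tensor(5, 3)` allocates memory without filling it, so the values below are whatever happened to be in memory rather than samples from a distribution.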

In [134]:
torch.Tensor(5, 3)

tensor([[-1.1298e+20,  3.0796e-41,  0.0000e+00],
        [ 2.3438e+00,  2.3694e-38,  6.0000e+00],
        [ 7.0000e+00,  8.0000e+00,  9.0000e+00],
        [ 1.0000e+01,  6.3015e+34,  6.2688e+22],
        [ 4.7428e+30,  2.8369e+22,  7.8118e+01]])

- **From a uniform distribution**

In [135]:
torch.Tensor(5, 3).uniform_(-1, 1)

tensor([[ 0.0644,  0.4726, -0.4397],
        [-0.8171,  0.7886,  0.2015],
        [-0.0764, -0.3520,  0.2811],
        [ 0.8405,  0.6719,  0.2503],
        [-0.6614, -0.0564,  0.7881]])

- **Get its shape**

In [136]:
x = torch.Tensor(5, 3).uniform_(-1, 1)

In [137]:
print(x.size())

torch.Size([5, 3])


- **Broadcasting**

In [138]:
print (x.size())
y = x + torch.randn(5, 1) # randn samples from a Gaussian distribution; x is (5, 3)
print(y)
# If broadcasting feels unclear, stack the tensors to matching shapes and then add

torch.Size([5, 3])
tensor([[ 0.8280, -0.7212,  0.1523],
        [ 0.9884,  0.1066,  1.2412],
        [-1.5215, -1.4578, -1.9776],
        [-0.2887,  0.0483,  1.0702],
        [ 0.4599, -0.5514, -0.3427]])
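
To see what broadcasting did in the cell above, it helps to use values that are easy to read. In this sketch (with illustrative tensors `x` and `col`), the `(5, 1)` column is stretched across the 3 columns of `x`, which is the same as expanding it explicitly first:

```python
import torch

x = torch.zeros(5, 3)
col = torch.arange(5.0).view(5, 1)  # column vector 0..4, shape (5, 1)

y = x + col                         # (5, 3) + (5, 1) -> (5, 3)
y_explicit = x + col.expand(5, 3)   # same thing, written out

print(y.shape)
print(torch.equal(y, y_explicit))   # True
print(y)
```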


- **Reshape**

In [139]:
y = torch.randn(5, 10, 15)

In [140]:
print(y.size())

torch.Size([5, 10, 15])


In [141]:
print(y.view(-1, 15).size())  # Same as doing y.view(50, 15)

torch.Size([50, 15])


In [142]:
print(y.view(-1, 15).unsqueeze(1).size()) # Adds a dimension at index 1.

torch.Size([50, 1, 15])


In [143]:
print(y.view(-1, 15).unsqueeze(0).size()) # Adds a dimension at index 0

torch.Size([1, 50, 15])


In [144]:
print(y.view(-1, 15).unsqueeze(1).squeeze().size()) # squeeze() removes dimensions of size 1
# If the input has shape (A x 1 x B x C x 1 x D), the output tensor has shape (A x B x C x D)

torch.Size([50, 15])


In [145]:
print(y.transpose(0, 1).size()) # transpose swaps two dimensions

torch.Size([10, 5, 15])


In [146]:
print(y.transpose(1, 2).size())

torch.Size([5, 15, 10])


In [147]:
print(y.transpose(0, 1).transpose(1, 2).size())

torch.Size([10, 15, 5])


In [148]:
print(y.permute(1, 2, 0).size()) # permute reorders all dimensions at once

torch.Size([10, 15, 5])


- **Repeat**

In [149]:
print(y.view(-1, 15).unsqueeze(1).expand(50, 100, 15).size())

torch.Size([50, 100, 15])


In [150]:
print(y.view(-1, 15).unsqueeze(1).expand_as(torch.randn(50, 100, 15)).size())

torch.Size([50, 100, 15])


In [151]:
y = torch.randn(1, 2, 3)

In [152]:
y

tensor([[[ 0.7646, -0.4540, -1.0457],
         [ 0.2154, -1.1476,  0.0038]]])

In [155]:
y.size()

torch.Size([1, 2, 3])

In [156]:
y = y.view(-1, 3)

In [157]:
y

tensor([[ 0.7646, -0.4540, -1.0457],
        [ 0.2154, -1.1476,  0.0038]])

In [158]:
y.size()

torch.Size([2, 3])

In [159]:
y = y.unsqueeze(1)

In [160]:
y

tensor([[[ 0.7646, -0.4540, -1.0457]],

        [[ 0.2154, -1.1476,  0.0038]]])

In [161]:
y.size()

torch.Size([2, 1, 3])

In [162]:
y = y.expand(2,5,3)

In [163]:
y

tensor([[[ 0.7646, -0.4540, -1.0457],
         [ 0.7646, -0.4540, -1.0457],
         [ 0.7646, -0.4540, -1.0457],
         [ 0.7646, -0.4540, -1.0457],
         [ 0.7646, -0.4540, -1.0457]],

        [[ 0.2154, -1.1476,  0.0038],
         [ 0.2154, -1.1476,  0.0038],
         [ 0.2154, -1.1476,  0.0038],
         [ 0.2154, -1.1476,  0.0038],
         [ 0.2154, -1.1476,  0.0038]]])

In [164]:
y.size()

torch.Size([2, 5, 3])
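
`expand` produces the repeated rows above without copying any data, which is why it is cheap. The sketch below contrasts it with `repeat`, which does copy; `row`, `expanded` and `repeated` are illustrative names.

```python
import torch

row = torch.tensor([[1.0, 2.0, 3.0]])  # shape (1, 3)

expanded = row.expand(4, 3)   # a view: no new memory
repeated = row.repeat(4, 1)   # a real copy, 4 times along dim 0

print(expanded.stride())      # (0, 1): every row points at the same data

row[0, 0] = 9.0               # modify the original
print(expanded[3, 0].item())  # 9.0, the view follows the change
print(repeated[3, 0].item())  # 1.0, the copy does not
```

Because the expanded rows alias the same memory, in-place writes to an expanded tensor are best avoided; call `.clone()` first if an independent copy is needed.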

- **Concatenate**

In [165]:
# 2 is the dimension over which the tensors are concatenated
print(torch.cat([y, y], 2).size())

torch.Size([2, 5, 6])


In [166]:
# stack concatenates the sequence of tensors along a new dimension.
print(torch.stack([y, y], 0).size())

torch.Size([2, 2, 5, 3])
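
One way to keep `cat` and `stack` apart: `stack` behaves like adding a new dimension to each tensor and then concatenating along it. A short sketch on a tensor of the same shape as above:

```python
import torch

y = torch.randn(2, 5, 3)

stacked = torch.stack([y, y], 0)                           # new leading dimension
via_cat = torch.cat([y.unsqueeze(0), y.unsqueeze(0)], 0)   # same result, spelled out

print(stacked.shape)                  # torch.Size([2, 2, 5, 3])
print(torch.equal(stacked, via_cat))  # True
print(torch.cat([y, y], 0).shape)     # torch.Size([4, 5, 3]): no new dimension
```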


- **Advanced Indexing**

In [167]:
y = torch.randn(2, 3, 4)
print(y[[1, 0, 1, 1]].size())

torch.Size([4, 3, 4])


In [168]:
# PyTorch doesn't support negative strides yet so ::-1 does not work.
rev_idx = torch.arange(1, -1, -1).long()
print(y[rev_idx].size())

torch.Size([2, 3, 4])
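
As an alternative to building a reversed index by hand, `torch.flip` reverses a tensor along the given dimensions. A sketch comparing the two approaches:

```python
import torch

y = torch.randn(2, 3, 4)

rev_idx = torch.arange(1, -1, -1).long()  # tensor([1, 0])
by_index = y[rev_idx]                     # reverse dim 0 via index tensor
by_flip = torch.flip(y, dims=[0])         # reverse dim 0 directly

print(torch.equal(by_index, by_flip))     # True
```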


In [201]:
tmp = torch.randn(3,3)
print(tmp)

tensor([[-0.4018,  0.0197,  1.0013],
        [ 0.2447,  0.1708, -0.5727],
        [-1.0964, -1.0571, -0.2981]])


In [202]:
tmp[1,0]

tensor(0.2447)

In [203]:
tmp[:,0]

tensor([-0.4018,  0.2447, -1.0964])

In [204]:
tmp[:,0:1]

tensor([[-0.4018],
        [ 0.2447],
        [-1.0964]])

In [205]:
tmp[:,-1]

tensor([ 1.0013, -0.5727, -0.2981])

In [206]:
label = torch.ones(3,3)

In [207]:
label[1,1] = 0
print(label)

tensor([[1., 1., 1.],
        [1., 0., 1.],
        [1., 1., 1.]])


In [208]:
tmp[label == 1] 

tensor([-0.4018,  0.0197,  1.0013,  0.2447, -0.5727, -1.0964, -1.0571, -0.2981])

In [209]:
tmp[label == 1] =2.0

In [210]:
tmp

tensor([[2.0000, 2.0000, 2.0000],
        [2.0000, 0.1708, 2.0000],
        [2.0000, 2.0000, 2.0000]])

## Summary
We've learnt to...

- Create Matrices
- Create Matrices with Default Initialization Values
    - Zeros 
    - Ones
- Initialize Seeds for Reproducibility on GPU and CPU
- Convert Matrices: NumPy to Torch and Torch to NumPy
- Move Tensors: CPU to GPU and GPU to CPU
- Run Important Tensor Operations
    - Element-wise addition, subtraction, multiplication and division
    - Resize
    - Calculate mean 
    - Calculate standard deviation
    - Broadcasting
    - Reshape
    - Repeat
    - Concatenate
    - Advanced Indexing

# 2. Gradients with PyTorch 

## Tensors with Gradients


### Creating Tensors with Gradients
- Allows accumulation of gradients

- **Method 1: Create tensor with gradients**

It is very similar to creating a tensor; all you need to do is add an additional argument.

In [212]:
a = torch.ones((2, 2), requires_grad=True)
b = torch.ones((2,2))

- **Check if tensor requires gradients**

This should return `True`; otherwise you've not done it right.

In [213]:
a.requires_grad

True

In [214]:
b.requires_grad # the default is False

False

In [216]:
b.requires_grad = True 

In [217]:
b

tensor([[1., 1.],
        [1., 1.]], requires_grad=True)

- **Method 2: Create tensor with gradients**

This lets you create a tensor as usual, then add one line to allow it to accumulate gradients.

In [218]:
# Normal way of creating tensors
a = torch.ones(2, 2)

# Requires gradient
a.requires_grad_()

# Check if requires gradient
a.requires_grad

True

- **A tensor without gradients just for comparison**

If you use neither of the methods above, checking for gradients returns `False`.

In [219]:
# Not a variable
no_gradient = torch.ones(2, 2)

In [220]:
no_gradient.requires_grad

False

- **Tensor with gradients addition operation**

In [221]:
# Behaves similarly to tensors
b = torch.ones((2, 2), requires_grad=True)
print(a + b)
print(torch.add(a, b))

tensor([[2., 2.],
        [2., 2.]], grad_fn=<AddBackward0>)
tensor([[2., 2.],
        [2., 2.]], grad_fn=<AddBackward0>)


- **Tensor with gradients multiplication operation**

As usual, the operations we learnt previously for tensors apply for tensors with gradients. Feel free to try divisions, mean or standard deviation!

In [222]:
print(a * b)
print(torch.mul(a, b))

tensor([[1., 1.],
        [1., 1.]], grad_fn=<MulBackward0>)
tensor([[1., 1.],
        [1., 1.]], grad_fn=<MulBackward0>)


### Creating Tensors with Gradients

> **What exactly is `requires_grad`?** 
>
> It enables calculation of gradients w.r.t. the tensor, and it allows those gradients to accumulate.

$$y_i = 5(x_i+1)^2$$

- **Create tensor of size 2 with values 1 and 2 that requires gradient**

In [232]:
x = torch.ones(2)
print(x.shape)

torch.Size([2])


In [233]:
print(x)
x[1] = 2
print(x)

tensor([1., 1.])
tensor([1., 2.])


In [235]:
x.requires_grad=True
print(x)

tensor([1., 2.], requires_grad=True)


- **Simple equation with x tensor created**

$$y_i\bigr\rvert_{x_i=1} = 5(1 + 1)^2 = 5(2)^2 = 5(4) = 20$$

We should get 20 for $x_i = 1$ by replicating this simple equation, and likewise $5(2 + 1)^2 = 45$ for $x_i = 2$.

In [236]:
y = 5 * (x + 1) ** 2

In [237]:
y

tensor([20., 45.], grad_fn=<MulBackward0>)

- **Simple equation with y tensor**

Backward should be called only on a scalar (i.e. 1-element tensor) or with gradient w.r.t. the variable

Let's reduce y to a scalar then...

$$o = \frac{1}{2}\sum_i y_i$$

As you can see above, `y` holds 20 and 45, so averaging them returns 32.5.

In [238]:
o = (1/2) * torch.sum(y)

In [239]:
o

tensor(32.5000, grad_fn=<MulBackward0>)

- **Calculating first derivative**

**Recap `y` equation**: $y_i = 5(x_i+1)^2$
    
**Recap `o` equation**: $o = \frac{1}{2}\sum_i y_i$
    
**Substitute `y` into `o` equation**: $o = \frac{1}{2} \sum_i 5(x_i+1)^2$
    
$$\frac{\partial o}{\partial x_i} = \frac{1}{2}[10(x_i+1)]$$
    
$$\frac{\partial o}{\partial x_i}\bigr\rvert_{x_i=1} = \frac{1}{2}[10(1 + 1)] = \frac{10}{2}(2) = 10$$
    

We should expect to get 10 for $x_i = 1$ (and $\frac{1}{2}[10(2 + 1)] = 15$ for $x_i = 2$), and it's simple to do this with PyTorch with the following line...
    
Get first derivative:

In [240]:
o.backward()

Print out first derivative:

In [241]:
x.grad

tensor([10., 15.])
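
The result can be checked against the hand-derived formula $\frac{\partial o}{\partial x_i} = 5(x_i + 1)$. The sketch below rebuilds the same computation in a self-contained form and also shows what "accumulation" means in practice: a second `backward()` call adds to `x.grad` instead of replacing it. Names such as `first_grad` and `accumulated` are illustrative.

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)

o = 0.5 * (5 * (x + 1) ** 2).sum()
o.backward()
first_grad = x.grad.clone()

analytic = 5 * (x.detach() + 1)   # hand-derived gradient
print(first_grad)                 # tensor([10., 15.])
print(analytic)                   # tensor([10., 15.])

# A second backward pass adds to the stored gradient
o2 = 0.5 * (5 * (x + 1) ** 2).sum()
o2.backward()
accumulated = x.grad.clone()
print(accumulated)                # tensor([20., 30.])

x.grad.zero_()                    # reset before the next pass
```

This accumulation is why gradients are normally reset between training steps.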

- **If x requires gradient and you create new objects with it, you get all gradients**

In [242]:
print(x.requires_grad)
print(y.requires_grad)
print(o.requires_grad)

True
True
True


- **Stop autograd from tracking history on Tensors**

You can also stop autograd from tracking history on Tensors with `.requires_grad=True` by wrapping the code block in `with torch.no_grad()`:

In [245]:
# Gradients are not needed during evaluation; disabling them there saves memory.
print(x.requires_grad)
print((x ** 2).requires_grad)

True
True


In [247]:
# Disables the gradient-tracking context, which reduces memory use and speeds up computation.
with torch.no_grad():
    print((x ** 2).requires_grad)

False
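
A short sketch of how this looks for a single value computed both ways. It also shows `.detach()`, which gives an untracked tensor from one that was already computed with tracking; the variable names are illustrative.

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)

with torch.no_grad():
    y_eval = x ** 2          # computed without building a graph

y_train = x ** 2             # computed with tracking
detached = y_train.detach()  # same values, cut off from the graph

print(y_eval.requires_grad, y_eval.grad_fn)  # False None
print(y_train.requires_grad)                 # True
print(detached.requires_grad)                # False
```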


## Summary
We've learnt to...

- Tensor with Gradients
    - Wraps a tensor for gradient accumulation
- Gradients
    - Define original equation
    - Substitute equation with `x` values
    - Reduce to scalar output, `o` through `mean`
    - Calculate gradients with `o.backward()`
    - Then access gradients of the `x` tensor with `requires_grad` through `x.grad`

### *References*
[1] [DOI](https://zenodo.org/badge/latestdoi/139945544)

[2] https://github.com/mila-udem/welcome_tutorials/tree/master/pytorch

[3] https://pytorch.org/tutorials/beginner/blitz/tensor_tutorial.html#sphx-glr-beginner-blitz-tensor-tutorial-py

[4] https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html#sphx-glr-beginner-blitz-autograd-tutorial-py