Copyright (C) 2020 Software Platform Lab, Seoul National University

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

# 1. PyTorch Operations

## Tensor

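Let's create a Tensor in PyTorch. PyTorch Tensors are similar to NumPy ndarrays. To create an uninitialized Tensor object, we use the `torch.empty(*size)` API. The values of an uninitialized Tensor are whatever happened to be in the allocated memory, so they differ from run to run.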

In [23]:
import torch

# Create 2X3 2-dimensional Tensor (Matrix)
x = torch.empty(2, 3)

# Create 2X3X4 3-dimensional Tensor
y = torch.empty(2, 3, 4)

print(x)
print(y)

tensor([[2.4836e-35, 0.0000e+00, 1.6255e-43],
        [6.4460e-44, 1.3733e-43, 1.3593e-43]])
tensor([[[2.3338e-35, 0.0000e+00, 7.0065e-44, 6.7262e-44],
         [6.3058e-44, 6.8664e-44, 6.8664e-44, 6.3058e-44],
         [6.8664e-44, 7.7071e-44, 1.1771e-43, 6.8664e-44]],

        [[7.4269e-44, 8.1275e-44, 7.2868e-44, 7.4269e-44],
         [8.1275e-44, 7.0065e-44, 7.2868e-44, 6.4460e-44],
         [7.1466e-44, 7.7071e-44, 6.7262e-44, 7.8473e-44]]])
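
Because the contents of an uninitialized Tensor are arbitrary, only its shape and dtype are meaningful until it is filled. The following is a minimal sketch, assuming PyTorch's default floating-point dtype has not been changed; the fill value `7.0` is chosen only for illustration.

```python
import torch

# torch.empty allocates memory without writing to it, so the values are arbitrary.
x = torch.empty(2, 3)

# The shape and dtype are well defined even though the contents are not.
print(x.shape)  # torch.Size([2, 3])
print(x.dtype)  # torch.float32 under the default settings

# Fill the Tensor in place before reading from it.
x.fill_(7.0)
print(x)
```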


We can also create initialized Tensor objects using the following APIs.

* `torch.rand(*size)` : Initialize with random numbers from a uniform distribution on the interval [0, 1). 
* `torch.randn(*size)` : Initialize with random numbers from a normal distribution N(0, 1).
* `torch.zeros(*size)` : Initialize with zeros.
* `torch.ones(*size)` : Initialize with ones.

You can also construct a Tensor directly from data using the `torch.tensor()` API.

In [24]:
x_rand = torch.rand(2, 3)
x_zeros = torch.zeros(2, 3)
x_ones = torch.ones(2, 3)
x = torch.tensor([[1, 2], [3, 4]])

print(x_rand)
print(x_zeros)
print(x_ones)
print(x)

tensor([[0.8971, 0.1654, 0.9918],
        [0.7657, 0.0728, 0.6054]])
tensor([[0., 0., 0.],
        [0., 0., 0.]])
tensor([[1., 1., 1.],
        [1., 1., 1.]])
tensor([[1, 2],
        [3, 4]])
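
`torch.randn` is listed above but not exercised in the cell, and the printed results show that `torch.tensor()` keeps integer data as integers. The sketch below illustrates both points; the seed and the sample count of 10000 are arbitrary choices made for this illustration.

```python
import torch

torch.manual_seed(0)  # fix the seed so the random draws are repeatable

# torch.randn samples from N(0, 1); with many samples the sample mean is
# close to 0 and the sample standard deviation is close to 1.
samples = torch.randn(10000)
print(samples.mean(), samples.std())

# torch.tensor infers the dtype from the data it is given.
ints = torch.tensor([[1, 2], [3, 4]])        # integer data -> torch.int64
floats = torch.tensor([[1., 2.], [3., 4.]])  # float data   -> torch.float32
print(ints.dtype, floats.dtype)
```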


The size of a Tensor object can be retrieved using the `size()` API.

In [25]:
x = torch.empty(2, 3)
print(x.size())

torch.Size([2, 3])
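
A few related accessors are often used together with `size()`. The following short sketch shows them on a 3-dimensional Tensor; the variable names `d0`, `d1` and `d2` are chosen only for this example.

```python
import torch

x = torch.empty(2, 3, 4)

print(x.size())   # torch.Size([2, 3, 4])
print(x.shape)    # the same information, exposed as an attribute
print(x.size(0))  # 2, the size of the first dimension
print(x.dim())    # 3, the number of dimensions
print(x.numel())  # 24, the total number of elements

# torch.Size is a tuple subclass, so it can be unpacked like a tuple.
d0, d1, d2 = x.size()
```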


## Math Operations
PyTorch supports multiple syntaxes for math operations. In this example, let's take a look at the division operation. You can also refer to the following link for the full list of PyTorch operations: https://pytorch.org/docs/stable/torch.html 

In [26]:
x = torch.tensor([[10, 20], [30, 40]])
y = torch.tensor([[1, 2], [3, 4]])
div = torch.div(x, y)

print('x: \n', x)
print('y: \n', y)
print('div: \n', div)

x: 
 tensor([[10, 20],
        [30, 40]])
y: 
 tensor([[1, 2],
        [3, 4]])
div: 
 tensor([[10., 10.],
        [10., 10.]])


Some useful operations are overloaded for simplicity.

In [27]:
# This will produce the same result as the above.
x = torch.tensor([[10, 20], [30, 40]])
y = torch.tensor([[1, 2], [3, 4]])
div = x/y

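# The / operator is overloaded through the Tensor's __truediv__ method
# (operator overloading), so x/y gives the same result as torch.div(x, y).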

print('div: \n', div)

div: 
 tensor([[10., 10.],
        [10., 10.]])
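
Division is not the only operation with more than one spelling. The sketch below pairs several operators with their function forms; it uses float Tensors (an assumption made here so that every operation, including the matrix product, runs on the same inputs).

```python
import torch

x = torch.tensor([[10., 20.], [30., 40.]])
y = torch.tensor([[1., 2.], [3., 4.]])

# Each operator has a function form that gives the same result.
assert torch.equal(x + y, torch.add(x, y))
assert torch.equal(x - y, torch.sub(x, y))
assert torch.equal(x * y, torch.mul(x, y))     # element-wise product
assert torch.equal(x / y, torch.div(x, y))     # element-wise division
assert torch.equal(x @ y, torch.matmul(x, y))  # matrix product

# Division is also available as a Tensor method.
print(x.div(y))
print(x @ y)
```

Note that `*` is element-wise, while `@` is the matrix product.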


## Quiz 1
**Define a function that takes two 2X2 Python lists as inputs and computes their matrix product. Return the result as a Tensor. Please note that the inputs are given as Python lists, NOT Tensor objects. (HINT: use `torch.matmul`)**

In [28]:
import torch

def matmul(x: list, y: list) -> torch.Tensor:
    ############# Write here. ################
    # Convert the Python lists to float Tensors before multiplying.
    x_tensor = torch.tensor(x, dtype=torch.float)
    y_tensor = torch.tensor(y, dtype=torch.float)
    return torch.matmul(x_tensor, y_tensor)
    ##########################################

x = [[1, 2], [3, 4]]
y = [[5, 6], [7, 8]]
z = matmul(x, y)
print(z)

tensor([[19., 22.],
        [43., 50.]])


# 2. AutoGrad

AutoGrad provides automatic differentiation for all operations on PyTorch Tensors. Let's see how it works with some examples below.

1) To use AutoGrad, create a Tensor object with its `requires_grad` attribute set to `True`. This enables all operations on the Tensor to be tracked so that its gradient can be computed automatically through backpropagation.

In [29]:
# Note the difference between x and y
x = torch.ones(2, 2)
y = torch.ones(2, 2, requires_grad=True)

print('x with requires_grad==False:\n', x)
print('y with requires_grad==True:\n', y)

x with requires_grad==False:
 tensor([[1., 1.],
        [1., 1.]])
y with requires_grad==True:
 tensor([[1., 1.],
        [1., 1.]], requires_grad=True)


2) After constructing Tensors with their `requires_grad` attribute set to `True`, do some operations on them.

In [30]:
x = torch.ones(2, 2, requires_grad=True)
y = x + 2 # This is equivalent to y = x + torch.tensor([[2, 2], [2, 2]])
out = y.mean()

print(out)
print(out.grad) # None: out is a non-leaf Tensor, so its .grad is not populated (PyTorch may also emit a warning)

tensor(3., grad_fn=<MeanBackward0>)
None
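
The `None` printed for `out.grad` comes from `out` being the result of an operation rather than a Tensor created directly by the user. The sketch below illustrates that distinction with `is_leaf`, and uses `retain_grad()` to keep the gradient of an intermediate Tensor; it relies on `backward()`, which step 3 introduces.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = x + 2
out = y.mean()

print(x.is_leaf)    # True: x was created directly by the user
print(out.is_leaf)  # False: out is the result of an operation

# Ask AutoGrad to keep the gradient of the intermediate Tensor y as well.
y.retain_grad()
out.backward()

print(x.grad)  # populated because x is a leaf that requires grad
print(y.grad)  # populated only because retain_grad() was called
```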


  


3) Finally, run backpropagation from the final output Tensor `out` by calling `out.backward()`. The gradient `d(out)/dx` is computed automatically and stored in the `grad` attribute of the Tensor `x`.

In [31]:
x = torch.ones(2, 2, requires_grad=True)
y = x + 2 
out = y.mean()
out.backward()

# Let's check the gradient value d(out)/d(x)
print(x.grad)

tensor([[0.2500, 0.2500],
        [0.2500, 0.2500]])


Since $out = 0.25 \sum_{i}{(x_i + 2)}$, $\frac{\partial{out}}{\partial{x_i}}=0.25$, so the result is reasonable. Let's also take a look at a more complex example.


In [32]:
x = torch.tensor([[1., 1.,], [1., 1.]], requires_grad=True)
y = torch.tensor([[2., 2.,], [2., 2.]], requires_grad=True)
z = x + y
z = z * z
out = z.mean()
out.backward()

print('x.grad: \n', x.grad)
print('y.grad: \n', y.grad)

x.grad: 
 tensor([[1.5000, 1.5000],
        [1.5000, 1.5000]])
y.grad: 
 tensor([[1.5000, 1.5000],
        [1.5000, 1.5000]])


$out = 0.25 \sum_{i}{(x_i + y_i)^2}$ and, therefore, $\frac{\partial{out}}{\partial{x_i}} = 0.5 (x_i + y_i) = 0.5 (1 + 2) = 1.5$, which matches the printed result.

You can also check the gradient value `d(out)/dy` from `y.grad`, since `y` is also a Tensor that requires gradient. 
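
The closed-form gradient $0.5 (x_i + y_i)$ can be compared against what AutoGrad computes. The following is a small sketch of such a check; the name `expected` is introduced only for this illustration.

```python
import torch

x = torch.tensor([[1., 1.], [1., 1.]], requires_grad=True)
y = torch.tensor([[2., 2.], [2., 2.]], requires_grad=True)

out = ((x + y) ** 2).mean()
out.backward()

# Closed-form gradient from the formula: 0.5 * (x_i + y_i).
# detach() returns Tensors that AutoGrad does not track.
expected = 0.5 * (x.detach() + y.detach())

print(torch.allclose(x.grad, expected))  # True
print(torch.allclose(y.grad, expected))  # True
```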

To stop AutoGrad from tracking Tensor operations, you can wrap the code block in `with torch.no_grad():`. Tensor operations inside the wrapped block are not tracked, so their results do not require gradient.

In [33]:
x = torch.tensor([[1., 1.,], [1., 1.]], requires_grad=True)
print(x.requires_grad)
y = (x * x)
print(y.requires_grad)

with torch.no_grad():
    y = (x * x)
    print(y.requires_grad)

True
True
False
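
Two consequences of `torch.no_grad()` are worth seeing directly: the input Tensor keeps its `requires_grad` flag, and a result computed inside the block cannot be backpropagated through. The sketch below illustrates both; the error message text is printed rather than assumed.

```python
import torch

x = torch.tensor([[1., 1.], [1., 1.]], requires_grad=True)

with torch.no_grad():
    y = (x * x).sum()

print(x.requires_grad)  # True: the input itself is unchanged
print(y.grad_fn)        # None: no graph was recorded for y

# Without a recorded graph there is nothing to backpropagate through.
try:
    y.backward()
except RuntimeError as err:
    print('backward failed:', err)

# Outside the block, tracking resumes as usual.
z = (x * x).sum()
z.backward()
print(x.grad)  # d(sum(x^2))/dx = 2x
```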


## Quiz 2

1. Define three 2X2 matrices (`X`, `Y` and `Z`) with random initial values.
2. Compute `out = mean(XY + Z)` (use matrix multiplication, not element-wise multiplication)
3. Compute and print gradient values of `out` with respect to `X`, `Y` and `Z`.


In [34]:
import torch

############# Write here. ################
def generate_tensor():
    return torch.rand(2, 2, requires_grad=True)

X = generate_tensor()
Y = generate_tensor()
Z = generate_tensor()

out = (torch.matmul(X,Y)+Z).mean()
out.backward()

print('X.grad: \n', X.grad)
print('Y.grad: \n', Y.grad)
print('Z.grad: \n', Z.grad)
##########################################

X.grad: 
 tensor([[0.2917, 0.4294],
        [0.2917, 0.4294]])
Y.grad: 
 tensor([[0.1880, 0.1880],
        [0.1312, 0.1312]])
Z.grad: 
 tensor([[0.2500, 0.2500],
        [0.2500, 0.2500]])
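
The pattern in the printed gradients (identical rows in `X.grad`, identical columns in `Y.grad`) follows from the formula. Since $out = 0.25 \sum_{i,j}{(\sum_{k}{X_{ik} Y_{kj}} + Z_{ij})}$, we get $\frac{\partial{out}}{\partial{X_{ik}}} = 0.25 \sum_{j}{Y_{kj}}$, $\frac{\partial{out}}{\partial{Y_{kj}}} = 0.25 \sum_{i}{X_{ik}}$ and $\frac{\partial{out}}{\partial{Z_{ij}}} = 0.25$. The sketch below checks these expressions against AutoGrad; the seed and the `expected_*` names are assumptions made for this illustration.

```python
import torch

torch.manual_seed(0)  # arbitrary seed so the check is repeatable
X = torch.rand(2, 2, requires_grad=True)
Y = torch.rand(2, 2, requires_grad=True)
Z = torch.rand(2, 2, requires_grad=True)

out = (torch.matmul(X, Y) + Z).mean()
out.backward()

ones = torch.ones(2, 2)
with torch.no_grad():
    expected_X = 0.25 * ones @ Y.t()  # entry (i, k) is 0.25 * sum_j Y[k, j]
    expected_Y = 0.25 * X.t() @ ones  # entry (k, j) is 0.25 * sum_i X[i, k]
    expected_Z = 0.25 * ones

print(torch.allclose(X.grad, expected_X))
print(torch.allclose(Y.grad, expected_Y))
print(torch.allclose(Z.grad, expected_Z))
```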


# 3. Dataset and DataLoader

Data preparation is one of the main tasks in machine learning. Fortunately, PyTorch provides efficient tools that make data loading easy while keeping the code readable. In this section, let's learn about PyTorch's `torch.utils.data.Dataset` and `torch.utils.data.DataLoader` APIs for data preparation.

## Dataset

`torch.utils.data.Dataset` is the abstract class representing a dataset. To make your own dataset class, you should inherit from the `Dataset` class and then override the following methods. 

* `__len__` : it will enable `len(dataset)` to return the size of your custom dataset.
* `__getitem__` : it will enable `dataset[i]` to index ith sample of your custom dataset.

Let's take a look at the following example to see how it works. Before moving on, let's first prepare a toy dataset file that consists of 8 samples, where sample `i` has `i%2` as its label and the vector `[i, i+1, i+2]` as its image, and store it in `.csv` format. This toy dataset will be used throughout the example. 

In [35]:
import csv    

# .csv file will be stored at ./dataset.csv
filename = 'dataset.csv'
# Write the samples; the with-block closes the file automatically.
with open(filename, 'w', encoding='utf-8', newline='') as f:
    wr = csv.writer(f)
    for i in range(8):
        wr.writerow([i%2] + [i, i+1, i+2]) # (label, image)

# Read the file back and print each row.
with open(filename, 'r', encoding='utf-8') as f:
    rdr = csv.reader(f)
    for line in rdr:
        print(line)

['0', '0', '1', '2']
['1', '1', '2', '3']
['0', '2', '3', '4']
['1', '3', '4', '5']
['0', '4', '5', '6']
['1', '5', '6', '7']
['0', '6', '7', '8']
['1', '7', '8', '9']


Let's make a custom Dataset class that reads our `dataset.csv` dataset file. 

In [36]:
import torch 
import pandas as pd
from torch.utils.data import Dataset

class CustomDataset(Dataset):
    # read input .csv dataset file and store in self.data as a pandas object
    def __init__(self, file_path):
        self.data = pd.read_csv(file_path, header=None)
    
    # data length is simply the length of self.data
    def __len__(self):
        return len(self.data)
    
    # return a tuple of image and label at (index)th row of our self.data
    def __getitem__(self, index):
        label = self.data.iloc[index, 0] # First element is the label
        image = self.data.iloc[index, 1:] # Remaining elements are the image vector
        # convert the image into torch Tensor type
        image = torch.tensor(image.tolist(), dtype=torch.float)
        return image, label

Let us then instantiate our `CustomDataset` class and iterate through the data samples. Since we have overridden the `__len__` and `__getitem__` methods, we can get the length of the dataset with `len(dataset)` and access the i-th sample with `dataset[i]`.

In [37]:
dataset = CustomDataset('dataset.csv')
for i in range(len(dataset)): # works because we defined __len__
    print('%dth sample' % i)
    image = dataset[i][0] # works because we defined __getitem__
    label = dataset[i][1]
    # image, label = dataset[i] # Same
    print('image:', image, ', label:', label)

0th sample
image: tensor([0., 1., 2.]) , label: 0
1th sample
image: tensor([1., 2., 3.]) , label: 1
2th sample
image: tensor([2., 3., 4.]) , label: 0
3th sample
image: tensor([3., 4., 5.]) , label: 1
4th sample
image: tensor([4., 5., 6.]) , label: 0
5th sample
image: tensor([5., 6., 7.]) , label: 1
6th sample
image: tensor([6., 7., 8.]) , label: 0
7th sample
image: tensor([7., 8., 9.]) , label: 1
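
The same two methods are enough for a dataset that lives entirely in memory. The following is a minimal sketch; `RangeDataset` is a hypothetical class written for this illustration that generates the same samples as `dataset.csv` without reading a file.

```python
import torch
from torch.utils.data import Dataset

class RangeDataset(Dataset):
    """Hypothetical in-memory dataset with the same samples as dataset.csv."""
    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, index):
        # Raising IndexError lets a plain `for sample in dataset` loop terminate.
        if not 0 <= index < self.n:
            raise IndexError(index)
        image = torch.tensor([index, index + 1, index + 2], dtype=torch.float)
        label = index % 2
        return image, label

toy = RangeDataset(8)
print(len(toy))
image, label = toy[3]
print(image, label)
```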


## Transforms

In many cases, we need to apply transformations to our sample images (e.g. normalization, image reshaping, etc.). We can easily implement such transforms as callable classes: we just need to implement the `__call__` method and, if required, the `__init__` method. Then, we can transform each sample of the dataset like this:
```
transform = Transform(params)
transformed_sample = transform(sample)
```
As an example, let's implement two toy transforms: negation and addition. `__call__` method takes a sample of the dataset as an input and returns transformed sample as an output. Note that we can additionally define `__init__` method if required.  

In [38]:
class Negation(object):
    def __call__(self, sample): # Callable
        image = sample[0]
        label = sample[1]
        negative_image = -1. * image
        return negative_image, label

# negation_op = Negation() 
# negative_sample, label = negation_op(sample)

class Addition(object):
    def __init__(self, v):
        self._v = v
        
    def __call__(self, sample):
        image = sample[0]
        label = sample[1]
        added_image = image + self._v
        return added_image, label

# add_1_op = Addition(tensor_of_value_one)
# add_2_op = Addition(tensor_of_value_two)

Then, instantiate the transform classes. We can compose multiple transforms using the `torchvision.transforms.Compose` API, which applies them in the order given. 

In [39]:
from torchvision import transforms
added_one = Addition(1)
negation = Negation()
compose = transforms.Compose([added_one, negation])

Let's check how our transform works on a sample image.

In [40]:
sample = dataset[0]
transformed_sample = compose(sample)
other_transformed_sample = negation(added_one(sample))

print('image before transformation:', sample[0])
print('image after transformation:', transformed_sample[0])

image before transformation: tensor([0., 1., 2.])
image after transformation: tensor([-1., -2., -3.])
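
Conceptually, composing transforms means calling them one after another, which is why `compose(sample)` and `negation(added_one(sample))` agree. The sketch below illustrates the idea with a hypothetical `SimpleCompose` class and plain Python lists; it is not torchvision's implementation, only a stand-in for this illustration.

```python
class SimpleCompose:
    """Hypothetical stand-in that applies a list of callables in order."""
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, sample):
        for t in self.transforms:
            sample = t(sample)
        return sample

# Toy transforms on (image, label) tuples, using plain lists for the image.
def add_one(sample):
    image, label = sample
    return [v + 1 for v in image], label

def negate(sample):
    image, label = sample
    return [-v for v in image], label

pipeline = SimpleCompose([add_one, negate])
print(pipeline(([0., 1., 2.], 0)))  # ([-1.0, -2.0, -3.0], 0)

# The order matters: negating first and then adding gives a different image.
reversed_pipeline = SimpleCompose([negate, add_one])
print(reversed_pipeline(([0., 1., 2.], 0)))
```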


Finally, we can redefine our custom Dataset to apply the transform.

In [41]:
class CustomDataset(Dataset):
    def __init__(self, file_path, transform=None):
        self.data = pd.read_csv(file_path, header=None)
        self.transform = transform
    
    def __len__(self):
        return len(self.data)
    
    def __getitem__(self, index):
        label = self.data.iloc[index, 0] 
        image = self.data.iloc[index, 1:] 
        image = torch.tensor(image.tolist(), dtype=torch.float)
        sample = (image, label)
        
        # transform the image
        if self.transform is not None:
            sample = self.transform(sample) 
        
        return sample

transformed_dataset = CustomDataset('dataset.csv', transform=compose)

for i in range(len(transformed_dataset)):
    print('%dth sample' % i)
    image = transformed_dataset[i][0]
    label = transformed_dataset[i][1]
    print('image:', image, ', label:', label) # every value has had 1 added and has then been negated

0th sample
image: tensor([-1., -2., -3.]) , label: 0
1th sample
image: tensor([-2., -3., -4.]) , label: 1
2th sample
image: tensor([-3., -4., -5.]) , label: 0
3th sample
image: tensor([-4., -5., -6.]) , label: 1
4th sample
image: tensor([-5., -6., -7.]) , label: 0
5th sample
image: tensor([-6., -7., -8.]) , label: 1
6th sample
image: tensor([-7., -8., -9.]) , label: 0
7th sample
image: tensor([ -8.,  -9., -10.]) , label: 1


## DataLoader

So far, we have learned how to construct a Dataset class from a `.csv` file and apply transforms over the dataset. However, simply iterating over the dataset with a `for` loop has limitations because it does not support the following features:
* Batching data
* Shuffling data
* Loading data in parallel on multiple workers

Luckily, PyTorch provides `torch.utils.data.DataLoader`, an iterable that supports all of these features. The following example creates a DataLoader that shuffles the data and batches it with `batch_size=4`.

In [42]:
from torch.utils.data import DataLoader

transformed_dataset = CustomDataset('dataset.csv', transform=compose)
dataloader = DataLoader(dataset=transformed_dataset,
                        shuffle=True,
                        batch_size=4)
# dataloader = DataLoader(dataset=transformed_dataset,
#                         shuffle=False,
#                         batch_size=4)

Finally, we can iterate over the dataset using the dataloader. Note that the data is reshuffled in each epoch and batched with a size of 4.

In [43]:
for epoch in range(2):
    print('Epoch %d\n' % epoch)
    for i, data in enumerate(dataloader):
        images, labels = data
        print('Batch %d' % i)
        print('batched images:\n', images)
        print('batched labels:\n', labels)
        print('')
    print('='*30)

Epoch 0

Batch 0
batched images:
 tensor([[ -6.,  -7.,  -8.],
        [ -7.,  -8.,  -9.],
        [ -8.,  -9., -10.],
        [ -4.,  -5.,  -6.]])
batched labels:
 tensor([1, 0, 1, 1])

Batch 1
batched images:
 tensor([[-2., -3., -4.],
        [-3., -4., -5.],
        [-1., -2., -3.],
        [-5., -6., -7.]])
batched labels:
 tensor([1, 0, 0, 0])

Epoch 1

Batch 0
batched images:
 tensor([[-1., -2., -3.],
        [-7., -8., -9.],
        [-6., -7., -8.],
        [-5., -6., -7.]])
batched labels:
 tensor([0, 0, 1, 0])

Batch 1
batched images:
 tensor([[ -3.,  -4.,  -5.],
        [ -2.,  -3.,  -4.],
        [ -4.,  -5.,  -6.],
        [ -8.,  -9., -10.]])
batched labels:
 tensor([0, 1, 1, 1])
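
Two details of batching and shuffling are easy to miss: the last batch can be smaller when the dataset size is not a multiple of `batch_size`, and the shuffled order can be made repeatable. The sketch below illustrates both with a small `TensorDataset`; the toy data, the helper `first_epoch_order` and the seed are assumptions made for this illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

images = torch.arange(8, dtype=torch.float).unsqueeze(1)  # 8 samples, 1 feature each
labels = torch.arange(8) % 2
toy = TensorDataset(images, labels)

# 8 samples with batch_size=3 give batches of 3, 3 and 2.
loader = DataLoader(toy, batch_size=3, shuffle=False)
sizes = [len(batch_labels) for _, batch_labels in loader]
print(sizes)  # [3, 3, 2]

# drop_last=True discards the final, smaller batch.
loader_drop = DataLoader(toy, batch_size=3, shuffle=False, drop_last=True)
print(len(loader_drop))  # 2

# A seeded generator makes the shuffled order repeatable.
def first_epoch_order(seed):
    g = torch.Generator().manual_seed(seed)
    shuffled = DataLoader(toy, batch_size=4, shuffle=True, generator=g)
    return torch.cat([x.flatten() for x, _ in shuffled])

print(torch.equal(first_epoch_order(0), first_epoch_order(0)))  # True
```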



## Quiz 3

Create a dataloader that satisfies the following conditions.
1. Read the `dataset.csv` file.
2. Apply a vector normalization transform (i.e. rescale the image vector `v` so that its norm `||v||` equals 1).
3. Shuffle the data and batch it with a batch size of 2.
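
The normalization in condition 2 can be checked on a single vector before it is wired into a transform. The sketch below compares a manual computation of the L2 norm with `Tensor.norm()` and `torch.nn.functional.normalize`; the example vector is chosen here only because its norm is the round number 13.

```python
import torch

v = torch.tensor([3., 4., 12.])  # norm is sqrt(9 + 16 + 144) = 13

# Three equivalent ways to rescale v to unit length.
manual = v / (v * v).sum() ** 0.5
builtin = v / v.norm()
functional = torch.nn.functional.normalize(v, dim=0)  # L2 norm along dim 0

print(manual)
print(manual.norm())  # close to 1
print(torch.allclose(manual, builtin), torch.allclose(manual, functional))
```

The first two forms divide by the norm directly, so they are only safe when the vector is non-zero, as is the case for every row of `dataset.csv`.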

In [44]:
class CustomDataset(Dataset):
    def __init__(self, file_path, transform=None):
        self.data = pd.read_csv(file_path, header=None)
        self.transform = transform
    
    def __len__(self):
        return len(self.data)
    
    def __getitem__(self, index):
        label = self.data.iloc[index, 0] 
        image = self.data.iloc[index, 1:] 
        image = torch.tensor(image.tolist(), dtype=torch.float)
        sample = (image, label)
        
        # transform the image
        if self.transform is not None:
            sample = self.transform(sample) 
        
        return sample

class Normalization(object):
    def __call__(self, sample):
        ############# Write here. ################
        image, label = sample
        size = (image*image).sum() ** 0.5
        return image/size, label
        ##########################################

############# Write here. ################
normalization = Normalization()
dataset = CustomDataset('dataset.csv', transform = normalization)
dataloader = DataLoader(dataset=dataset,
                        shuffle=True,
                        batch_size=2)
##########################################

for epoch in range(2):
    print('Epoch %d\n' % epoch)
    for i, data in enumerate(dataloader):
        images, labels = data
        print('Batch %d' % i)
        print('batched images:\n', images)
        print('batched labels:\n', labels)
        print('')
    print('='*30)

Epoch 0

Batch 0
batched images:
 tensor([[0.4767, 0.5721, 0.6674],
        [0.4915, 0.5735, 0.6554]])
batched labels:
 tensor([1, 0])

Batch 1
batched images:
 tensor([[0.4558, 0.5698, 0.6838],
        [0.3714, 0.5571, 0.7428]])
batched labels:
 tensor([0, 0])

Batch 2
batched images:
 tensor([[0.2673, 0.5345, 0.8018],
        [0.0000, 0.4472, 0.8944]])
batched labels:
 tensor([1, 0])

Batch 3
batched images:
 tensor([[0.4243, 0.5657, 0.7071],
        [0.5026, 0.5744, 0.6462]])
batched labels:
 tensor([1, 1])

Epoch 1

Batch 0
batched images:
 tensor([[0.4558, 0.5698, 0.6838],
        [0.4915, 0.5735, 0.6554]])
batched labels:
 tensor([0, 0])

Batch 1
batched images:
 tensor([[0.2673, 0.5345, 0.8018],
        [0.4767, 0.5721, 0.6674]])
batched labels:
 tensor([1, 1])

Batch 2
batched images:
 tensor([[0.5026, 0.5744, 0.6462],
        [0.3714, 0.5571, 0.7428]])
batched labels:
 tensor([1, 0])

Batch 3
batched images:
 tensor([[0.4243, 0.5657, 0.7071],
        [0.0000, 0.4472, 0.8944]])