
# PyTorch Basics

In [None]:
import torch
import numpy as np
torch.manual_seed(1234)

<torch._C.Generator at 0x7d478818c510>

# Tensors
* A scalar is a single number.
* A vector is a 1-D array of numbers.
* A matrix is a 2-D array of numbers.
* A tensor is an N-D array of numbers.
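As a small sketch of the list above, the following builds one object of each kind and prints its number of dimensions and its shape. The variable names are chosen here only for illustration.

```python
import torch

scalar = torch.tensor(3.0)                 # 0-D: a single number
vector = torch.tensor([1.0, 2.0, 3.0])     # 1-D: an array of numbers
matrix = torch.tensor([[1.0, 2.0],
                       [3.0, 4.0]])        # 2-D: rows and columns
tensor3d = torch.zeros(2, 3, 4)            # 3-D: a stack of two 3x4 matrices

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("tensor3d", tensor3d)]:
    # dim() is the number of axes, shape lists the size along each axis
    print(name, t.dim(), tuple(t.shape))
```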

## Creating Tensors
You can create a tensor by passing its shape as arguments. Here is a tensor with 2 rows and 3 columns. `torch.Tensor(2,3)` allocates memory without initializing it, so the values printed are arbitrary.

In [None]:
def describe(x):
  print("Type: {}".format(x.type()))
  print("Shape/size: {}".format(x.shape))
  print("Values:\n{}".format(x))

In [None]:
describe(torch.Tensor(2,3))

Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[8.7393e+21, 3.2561e-41, 8.7389e+21],
        [3.2561e-41, 0.0000e+00, 0.0000e+00]])
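The odd-looking values in the output above are whatever happened to be in memory. A minimal sketch of the distinction, assuming you want a tensor with known contents, compares `torch.empty` (uninitialized) with `torch.zeros` (initialized):

```python
import torch

uninit = torch.empty(2, 3)   # memory is allocated but not initialized; contents are arbitrary
zeros = torch.zeros(2, 3)    # every element is explicitly set to 0

print(tuple(uninit.shape))   # the shape is well defined even though the values are not
print(zeros.sum().item())    # sums to 0.0 because all elements are zero
```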


### Creating a randomly initialized tensor

In [None]:
import torch
describe(torch.rand(2,3)) # uniform random
describe(torch.randn(2,3)) # random normal

Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[0.0518, 0.4681, 0.6738],
        [0.3315, 0.7837, 0.5631]])
Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[ 0.2310,  0.6931, -0.2669],
        [ 2.1785,  0.1021, -0.2590]])


### Creating a filled tensor

In [None]:
describe(torch.zeros(2,3))
x = torch.ones(2,3)
describe(x)
x.fill_(5)
describe(x)

Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[0., 0., 0.],
        [0., 0., 0.]])
Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[1., 1., 1.],
        [1., 1., 1.]])
Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[5., 5., 5.],
        [5., 5., 5.]])


Tensors can be initialized and then filled in place, as `x.fill_(5)` does above.

Note: operations whose names end in an underscore (`_`) are in-place operations.
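To make the underscore convention concrete, here is a short sketch contrasting `add` (returns a new tensor) with `add_` (modifies the tensor it is called on). The variable `unchanged` is introduced here only to record the intermediate state.

```python
import torch

x = torch.ones(2, 3)
y = x.add(1)                                  # out-of-place: returns a new tensor
unchanged = torch.equal(x, torch.ones(2, 3))  # x still holds ones at this point
print(unchanged)

x.add_(1)                                     # in-place: x itself is modified
print(torch.equal(x, y))                      # x now matches y
```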

Tensors can also be initialized from a list of lists.

### Creating and initializing a tensor from lists

In [None]:
x = torch.Tensor([[1,2,3],
                  [4,5,6]])
describe(x)

Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[1., 2., 3.],
        [4., 5., 6.]])


### Creating and initializing a tensor from NumPy

In [None]:
import torch
import numpy as np
npy = np.random.rand(2,3)
describe(torch.from_numpy(npy))

Type: torch.DoubleTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[0.6390, 0.9942, 0.8277],
        [0.8480, 0.4551, 0.0487]], dtype=torch.float64)


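`torch.from_numpy` keeps the dtype of the array. NumPy random arrays default to float64, so the result is a DoubleTensor (`torch.float64`) rather than the usual FloatTensor.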
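The sketch below illustrates two consequences of this: the dtype carries over from NumPy, and the tensor returned by `torch.from_numpy` shares memory with the source array. The names `t64` and `t32` are illustrative.

```python
import numpy as np
import torch

npy = np.random.rand(2, 3)     # NumPy produces float64 here
t64 = torch.from_numpy(npy)    # dtype is preserved: torch.float64
t32 = t64.float()              # explicit cast to torch.float32, which makes a copy

# from_numpy shares memory with the array, so this write is visible through t64
npy[0, 0] = -1.0
print(t64.dtype, t32.dtype)
print(t64[0, 0].item())        # reflects the change made through the NumPy array
```

If the rest of a model uses float32, casting once with `.float()` right after conversion avoids dtype mismatch errors later.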

## Tensor Types and Size

### Tensor properties

In [None]:
x = torch.FloatTensor([[1,2,3],
                       [4,5,6]])
describe(x)

Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[1., 2., 3.],
        [4., 5., 6.]])


In [None]:
x = x.long()
describe(x)

Type: torch.LongTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[1, 2, 3],
        [4, 5, 6]])


A LongTensor holds 64-bit integers (`torch.int64`).

In [None]:
x = torch.tensor([[1,2,3],
                  [4,5,6]],dtype=torch.int64)
describe(x)

Type: torch.LongTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[1, 2, 3],
        [4, 5, 6]])


In [None]:
x = x.float()
describe(x)

Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[1., 2., 3.],
        [4., 5., 6.]])


## Tensor Operations

### Tensor operations: addition

In [None]:
import torch
x = torch.randn(2,3)
describe(x)

Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[-0.1549, -1.3706, -0.1319],
        [ 0.8848, -0.2611,  0.6104]])


In [None]:
describe(torch.add(x,x))

Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[-0.3098, -2.7412, -0.2638],
        [ 1.7697, -0.5222,  1.2208]])


In [None]:
describe(x+x)

Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[-0.3098, -2.7412, -0.2638],
        [ 1.7697, -0.5222,  1.2208]])


### Dimension-based tensor operations

In [None]:
import torch
x = torch.arange(6)
describe(x)

Type: torch.LongTensor
Shape/size: torch.Size([6])
Values:
tensor([0, 1, 2, 3, 4, 5])


In [None]:
x = x.view(2,3)
describe(x)

Type: torch.LongTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[0, 1, 2],
        [3, 4, 5]])


In [None]:
describe(torch.sum(x,dim=0))

Type: torch.LongTensor
Shape/size: torch.Size([3])
Values:
tensor([3, 5, 7])


In [None]:
describe(torch.sum(x,dim=1))

Type: torch.LongTensor
Shape/size: torch.Size([2])
Values:
tensor([ 3, 12])


In [None]:
describe(torch.transpose(x,0,1))

Type: torch.LongTensor
Shape/size: torch.Size([3, 2])
Values:
tensor([[0, 3],
        [1, 4],
        [2, 5]])


## Indexing, Slicing and Joining

### Slicing and indexing a tensor

In [None]:
import torch
x = torch.arange(6).view(2,3)
describe(x)

Type: torch.LongTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[0, 1, 2],
        [3, 4, 5]])


In [None]:
describe(x[:1,:2])

Type: torch.LongTensor
Shape/size: torch.Size([1, 2])
Values:
tensor([[0, 1]])


In [None]:
describe(x[0,1])

Type: torch.LongTensor
Shape/size: torch.Size([])
Values:
1


### Complex indexing: noncontiguous indexing of a tensor

In [None]:
indices = torch.LongTensor([0,2])
describe(torch.index_select(x,dim=1, index=indices))

Type: torch.LongTensor
Shape/size: torch.Size([2, 2])
Values:
tensor([[0, 2],
        [3, 5]])


In [None]:
indices = torch.LongTensor([0,0])
describe(torch.index_select(x,dim=0,index=indices))

Type: torch.LongTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[0, 1, 2],
        [0, 1, 2]])


In [None]:
row_indices = torch.arange(2).long()
col_indices = torch.LongTensor([0,1])
describe(x[row_indices,col_indices])

Type: torch.LongTensor
Shape/size: torch.Size([2])
Values:
tensor([0, 4])


* `row_indices = [0, 1]` and `col_indices = [0, 1]` hold the row indices and the column indices, respectively.
* `x[row_indices, col_indices]` selects the values at `x[0, 0]` and `x[1, 1]`, so the output is `[0, 4]`.
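The pairing behavior can be checked with a short sketch. It compares paired indexing against an explicit loop over (row, column) pairs, and against chained indexing, which selects a whole sub-block instead. The names `paired`, `manual`, and `block` are illustrative.

```python
import torch

x = torch.arange(6).view(2, 3)       # [[0, 1, 2], [3, 4, 5]]
row_indices = torch.tensor([0, 1])
col_indices = torch.tensor([0, 1])

# Paired indexing: element i uses (row_indices[i], col_indices[i])
paired = x[row_indices, col_indices]

# The same selection written as an explicit loop over index pairs
manual = torch.stack([x[r, c] for r, c in
                      zip(row_indices.tolist(), col_indices.tolist())])

# Chained indexing instead takes rows 0-1 crossed with columns 0-1
block = x[row_indices][:, col_indices]

print(paired)   # tensor([0, 4])
print(block)    # tensor([[0, 1], [3, 4]])
```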

### Concatenating tensors

In [None]:
import torch
x = torch.arange(6).view(2,3)
describe(x)

Type: torch.LongTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[0, 1, 2],
        [3, 4, 5]])


In [None]:
describe(torch.cat([x,x],dim=0))

Type: torch.LongTensor
Shape/size: torch.Size([4, 3])
Values:
tensor([[0, 1, 2],
        [3, 4, 5],
        [0, 1, 2],
        [3, 4, 5]])


In [None]:
describe(torch.cat([x,x],dim=1))

Type: torch.LongTensor
Shape/size: torch.Size([2, 6])
Values:
tensor([[0, 1, 2, 0, 1, 2],
        [3, 4, 5, 3, 4, 5]])


In [None]:
describe(torch.stack([x,x]))

Type: torch.LongTensor
Shape/size: torch.Size([2, 2, 3])
Values:
tensor([[[0, 1, 2],
         [3, 4, 5]],

        [[0, 1, 2],
         [3, 4, 5]]])


### Linear algebra on tensors: multiplication

In [None]:
import torch
x1 = torch.arange(6).view(2,3)
x1 = x1.float()
describe(x1)

Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[0., 1., 2.],
        [3., 4., 5.]])


In [None]:
x2 = torch.ones(3,2)
x2[:,1] += 1
describe(x2)

Type: torch.FloatTensor
Shape/size: torch.Size([3, 2])
Values:
tensor([[1., 2.],
        [1., 2.],
        [1., 2.]])


In [None]:
describe(torch.mm(x1,x2)) # matrix product of x1 (2x3) and x2 (3x2)

Type: torch.FloatTensor
Shape/size: torch.Size([2, 2])
Values:
tensor([[ 3.,  6.],
        [12., 24.]])


## Tensors and Computational Graphs

### Creating tensors for gradient bookkeeping

In [None]:
import torch
x = torch.ones(2,2,requires_grad = True)
describe(x)
print(x.grad is None)

Type: torch.FloatTensor
Shape/size: torch.Size([2, 2])
Values:
tensor([[1., 1.],
        [1., 1.]], requires_grad=True)
True


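Setting `requires_grad=True`:

* Enables the bookkeeping operations that track the tensor's gradient and its gradient function.
* PyTorch keeps track of the values of the forward pass, and a single scalar at the end of the computation is used to compute the backward pass.
* The backward pass is initiated by calling `backward()` on a tensor that results from evaluating a loss function.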

In [None]:
y = (x+2)*(x+5)+3
describe(y)
print(x.grad is None)

Type: torch.FloatTensor
Shape/size: torch.Size([2, 2])
Values:
tensor([[21., 21.],
        [21., 21.]], grad_fn=<AddBackward0>)
True


In [None]:
z = y.mean()
describe(z)
z.backward()
print(x.grad is None)

Type: torch.FloatTensor
Shape/size: torch.Size([])
Values:
21.0
False
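The gradient that `backward()` stores can be checked by hand. For `y = (x+2)*(x+5)+3`, the derivative is `2x + 7`, and taking the mean over four elements divides it by 4, so every entry of `x.grad` should be `9/4 = 2.25` at `x = 1`. A minimal sketch of that check, where `expected` is a name introduced here:

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = (x + 2) * (x + 5) + 3
z = y.mean()
z.backward()                          # populates x.grad with dz/dx

# Hand-derived gradient: d/dx[(x+2)(x+5)+3] = 2x + 7, divided by 4 for the mean
expected = (2 * x.detach() + 7) / 4

print(x.grad)                         # every entry is 2.25
print(torch.allclose(x.grad, expected))
```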


## CUDA Tensors
* The CUDA API is available only on NVIDIA GPUs.
* PyTorch provides CUDA tensor objects, and a tensor can be moved from the CPU to the GPU.

### Creating CUDA tensors

In [None]:
import torch
print(torch.cuda.is_available())

True


In [None]:
# preferred method: device agnostic tensor instantiation
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

cuda


In [None]:
x = torch.rand(3,3).to(device)
describe(x)

Type: torch.cuda.FloatTensor
Shape/size: torch.Size([3, 3])
Values:
tensor([[0.9822, 0.7940, 0.0843],
        [0.9132, 0.2309, 0.3524],
        [0.9786, 0.9198, 0.4529]], device='cuda:0')


### Mixing CUDA tensors with CPU-bound tensors
* When operating on CUDA and non-CUDA objects together, make sure they are on the same device. Otherwise, the computation breaks.
* The example below raises this error.

In [None]:
y = torch.rand(3,3)
x+y

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

In [None]:
cpu_device = torch.device("cpu")
y = y.to(cpu_device)
x = x.to(cpu_device)
x+y

tensor([[1.6848, 0.9671, 0.6043],
        [1.5971, 0.5178, 0.8131],
        [1.2236, 1.1894, 0.7935]])

Moving data back and forth between the CPU and the GPU is expensive. The typical procedure is therefore to do as many of the parallelizable computations as possible on the GPU and then transfer only the final result back to the CPU, which lets you fully utilize the GPU. If you have several CUDA-visible devices (i.e., multiple GPUs), the best practice is to use the `CUDA_VISIBLE_DEVICES` environment variable when executing the program:

`CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py`
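As a rough sketch of that procedure, the code below creates both operands directly on the chosen device, chains several operations there, and copies only the final scalar back to the CPU. The particular computation is arbitrary and only stands in for real work. The code also runs on a machine without a GPU, in which case everything stays on the CPU.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create both operands directly on the target device to avoid extra copies
a = torch.rand(3, 3, device=device)
b = torch.rand(3, 3, device=device)

# Chain several operations on the device ...
result = (a @ b + a).relu().sum()

# ... and move only the final scalar back to the CPU
result_cpu = result.cpu()
print(result_cpu.device, result_cpu.item())
```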

# Exercises

1. Create a 2D tensor and then add a dimension of size 1 inserted at dimension 0

In [None]:
a = torch.rand(3,3)
a.unsqueeze(0)

tensor([[[0.2448, 0.9430, 0.4632],
         [0.4011, 0.7563, 0.8257],
         [0.4887, 0.1293, 0.6006]]])

`unsqueeze()` adds a dimension of size 1. You must specify the position at which the new dimension is inserted. It returns a new tensor and leaves `a` itself unchanged.

2. Remove the extra dimension you just added to the previous tensor.

In [None]:
# a was not reassigned above, so re-add the dimension before removing it
a.unsqueeze(0).squeeze(0)

tensor([[0.2448, 0.9430, 0.4632],
        [0.4011, 0.7563, 0.8257],
        [0.4887, 0.1293, 0.6006]])

`squeeze()` removes dimensions of size 1. With no argument it removes all of them. If you pass a dimension, only that dimension is removed, and only if its size is 1.
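The shape changes in exercises 1 and 2 are easier to follow when the shapes are printed. This sketch, with illustrative variable names, also covers the two edge cases: squeezing a dimension whose size is not 1, and squeezing with no argument.

```python
import torch

a = torch.rand(3, 3)
b = a.unsqueeze(0)                  # new tensor of shape (1, 3, 3); a is unchanged
c = b.squeeze(0)                    # back to (3, 3)
d = b.squeeze(1)                    # dimension 1 has size 3, so nothing is removed
e = torch.rand(1, 3, 1).squeeze()   # no argument: drops every size-1 dimension

print(a.shape, b.shape, c.shape, d.shape, e.shape)
```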

3. Create a random tensor of shape 5x3 in the interval [3,7)

In [None]:
a = 3+torch.rand(5,3)*(7-3)
print(a)

tensor([[4.9205, 5.5030, 4.2642],
        [3.8410, 5.4200, 4.5893],
        [6.6695, 4.1409, 4.6291],
        [6.6977, 4.1457, 5.5588],
        [5.7260, 3.9335, 5.5922]])


4. Create a tensor with values from a normal distribution (mean=0, std=1)

In [None]:
a = torch.randn(3,3)
a

tensor([[-0.1236, -0.9187, -0.9060],
        [-1.3344,  1.6519,  0.0409],
        [ 0.8689,  0.1418,  0.4556]])

5. Retrieve the indices of all the nonzero elements in the tensor `torch.Tensor([1,1,1,0,1])`

In [None]:
a = torch.Tensor([1,1,1,0,1])
torch.nonzero(a)

tensor([[0],
        [1],
        [2],
        [4]])

6. Create a random tensor of size (3,1) and then horizontally stack four copies together

In [None]:
a = torch.rand(3,1)
a.expand(3,4)

tensor([[0.7891, 0.7891, 0.7891, 0.7891],
        [0.6819, 0.6819, 0.6819, 0.6819],
        [0.3602, 0.3602, 0.3602, 0.3602]])
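`expand` gives the right values, but it returns a view rather than four real copies. If actual copies are wanted, `torch.cat` or `repeat` are alternatives. The sketch below compares the three. The variable names are illustrative.

```python
import torch

a = torch.rand(3, 1)
stacked = torch.cat([a, a, a, a], dim=1)   # copies the data into a new (3, 4) tensor
repeated = a.repeat(1, 4)                  # same result, also a real copy
expanded = a.expand(3, 4)                  # a view: no data is copied

print(torch.equal(stacked, repeated), torch.equal(stacked, expanded))
print(expanded.stride())                   # stride 0 along dim 1: the single column is reused
```

Because the expanded view reuses the same memory for every column, writing to it in place is unsafe. Use `cat` or `repeat` when the result will be modified.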

7. Return the batch matrix-matrix product of two three-dimensional matrices

In [None]:
a = torch.rand(3,4,5)
b = torch.rand(3,5,4)
torch.bmm(a,b)

tensor([[[1.9668, 0.9266, 1.8186, 1.6429],
         [1.3943, 0.3852, 0.8399, 1.3677],
         [1.5733, 0.7475, 1.3582, 1.3025],
         [2.2510, 0.9459, 1.5703, 1.9471]],

        [[0.9535, 0.1922, 0.7647, 0.6638],
         [1.7112, 0.6822, 1.0698, 0.7731],
         [1.7918, 0.5267, 1.1256, 0.8851],
         [1.4233, 0.7244, 1.0528, 0.7182]],

        [[0.9511, 0.8278, 1.7722, 0.6934],
         [1.6896, 1.3719, 2.5818, 1.5306],
         [1.7271, 1.3471, 1.9680, 1.2615],
         [1.5085, 1.0828, 1.8817, 1.1842]]])

* `torch.mm`: matrix product of two 2-D tensors
* `torch.bmm`: batched matrix product of two 3-D tensors
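A short sketch of the shape rules for the two functions, with arbitrary sizes chosen for illustration: `torch.mm` maps `(n, k) x (k, m)` to `(n, m)`, while `torch.bmm` maps `(b, n, k) x (b, k, m)` to `(b, n, m)` by multiplying matching pairs in the batch.

```python
import torch

m1 = torch.rand(2, 3)
m2 = torch.rand(3, 4)
print(torch.mm(m1, m2).shape)      # (2, 3) x (3, 4) -> (2, 4)

b1 = torch.rand(10, 2, 3)          # a batch of 10 matrices
b2 = torch.rand(10, 3, 4)
out = torch.bmm(b1, b2)            # multiplies matching pairs -> (10, 2, 4)
print(out.shape)

# Each slice of the batched result equals an ordinary matrix product
print(torch.allclose(out[0], torch.mm(b1[0], b2[0])))
```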

8. Return the batch matrix-matrix product of a 3D matrix and a 2D matrix

In [None]:
a = torch.rand(3,4,5)
b = torch.rand(5,4)
torch.bmm(a,b.unsqueeze(0).expand(a.size(0),*b.size()))

tensor([[[0.9125, 1.5914, 0.8697, 1.1377],
         [0.9872, 1.4219, 0.7271, 0.9546],
         [0.7157, 1.4188, 0.6221, 0.9352],
         [1.0608, 1.6024, 0.8291, 1.1706]],

        [[0.7113, 1.5876, 0.5427, 0.8651],
         [1.0125, 1.4226, 0.5279, 0.7074],
         [0.8450, 1.2126, 0.4889, 0.6681],
         [0.9564, 1.2175, 0.7119, 0.8840]],

        [[1.2962, 2.2125, 0.9435, 1.1462],
         [0.9216, 1.7465, 0.8049, 1.1075],
         [0.2254, 0.6343, 0.1720, 0.3904],
         [1.4691, 2.4132, 1.2177, 1.5431]]])
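As an alternative to expanding `b` by hand, `torch.matmul` broadcasts a 2-D matrix across the batch dimension of a 3-D tensor. The sketch below checks that both routes give the same result. The names `via_bmm` and `via_matmul` are illustrative.

```python
import torch

a = torch.rand(3, 4, 5)
b = torch.rand(5, 4)

# Explicit route: give b a batch dimension, then use bmm
via_bmm = torch.bmm(a, b.unsqueeze(0).expand(a.size(0), *b.size()))

# Broadcasting route: matmul applies b to every matrix in the batch
via_matmul = torch.matmul(a, b)

print(via_matmul.shape)
print(torch.allclose(via_bmm, via_matmul))
```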