# PyTorch Basics  
![](_2023-04-17-13-59-20.png)  

Writer: KukJin Kim kukjinkim@korea.ac.kr  



In [None]:
!pip3 install torch --index-url https://download.pytorch.org/whl/cu117

![](_2023-04-17-13-37-23.png)  

Is ChatGPT really implemented in PyTorch?


# Contents
- Class Review 
- Data representation with tensor <-
- Tensor Manipulation <-
- Dataset, DataLoader
- Regression
- Classification

Since 2016, deep learning and reinforcement learning have developed at a very fast pace, resulting in innovative products such as ChatGPT.  
In this session, we will practice in earnest with PyTorch, a very widely used deep learning framework.

### This exercise assumes a GPU. If your laptop does not have one, replace every device argument in the code with cpu, or practice on Colab and change the runtime type to GPU.

![img](2023-04-25-21-42-12.png)

![](2023-04-25-21-43-14.png)
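
To avoid editing every cell by hand on a machine without a GPU, one option is to choose the device once and pass it around. This is only a sketch: the variable name `device` is an assumption and is not used by the cells in this notebook.

```python
import torch

# Pick the GPU when one is visible to PyTorch, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Passing `device` instead of a hard-coded 'cuda' keeps the cell runnable everywhere.
x = torch.ones(2, 2, device=device)
print(x.device)
```

With this pattern, `device='cuda'` in the later cells could be replaced by `device=device`.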

# 2. Data representation with tensor  
In linear algebra, you learn about the concepts of scalars, vectors, matrices, and tensors.  
Linear algebra books and many machine learning books denote them with the symbols listed below.  
Let's see how to create and manipulate them as tensors in PyTorch through several modules and methods.  

$$\text{Scalar} : x \quad 0 \ \text{(0 dimensions)}$$
$$\text{Vector} : \mathbf x \quad n \ \text{(1 dimension)}$$
$$\text{Matrix} : \mathbf X \quad m \times n \ \text{(2 dimensions)}$$
$$\text{Tensor} : \mathcal{X} \quad l \times m \times n \times \dots \ \text{(3 or more dimensions)}$$

![](2023-04-25-21-44-38.png)

(https://furkangulsen.medium.com/what-is-a-tensor-ce8e78835d08)
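
As a quick illustration of these four cases, the sketch below builds one tensor of each kind and prints its number of dimensions and shape. The variable names and sizes are made up for the example.

```python
import torch

scalar = torch.tensor(3.0)               # 0 dimensions
vector = torch.tensor([1.0, 2.0, 3.0])   # 1 dimension, n = 3
matrix = torch.zeros(2, 3)               # 2 dimensions, m x n = 2 x 3
tensor3 = torch.zeros(4, 2, 3)           # 3 dimensions, l x m x n = 4 x 2 x 3

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("tensor", tensor3)]:
    # .ndim is the number of axes, .shape is the size along each axis
    print(name, t.ndim, tuple(t.shape))
```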

Let's recall the data types introduced at the beginning of programming courses.
Typical examples include:
1. Int  
2. Double  
3. Float  

Similarly, you can create tensors with each of these data types through the tensor classes of torch.

In [3]:
import torch
from torch import tensor
from torch import FloatTensor as ftensor
from torch import DoubleTensor as dtensor
from torch import IntTensor as itensor

#### 2.1 Scalar
`torch.tensor(data)` takes data (a number or an array-like object) as input and returns a PyTorch tensor object.  

Tensor objects are created in the following form.  
`tensor = torch.tensor(data, dtype=torch.float32, device='cpu')`  

The arguments for creating a tensor object include data, dtype, and device.  
data can be a list or a NumPy array; device can be `cpu` or a GPU such as `cuda:0`.

If dtype and device are not given, the tensor is placed on the cpu and the data type is inferred from the data: float32 for Python floats and int64 for Python ints.
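
A minimal CPU-only sketch of these defaults, assuming PyTorch's default dtype and default device have not been changed elsewhere in the session:

```python
import torch

a = torch.tensor(1.0)                          # Python float -> torch.float32
b = torch.tensor(2)                            # Python int   -> torch.int64
c = torch.tensor([1, 2], dtype=torch.float64)  # an explicit dtype overrides the inference

print(a.dtype, b.dtype, c.dtype)
print(a.device)                                # no device argument -> cpu
```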

In [4]:
tensor1 = torch.tensor(1.0)
print(tensor1)
print(tensor1.device)
print(tensor1.dtype)

tensor2 = torch.tensor(2, device='cuda')
print(tensor2)
print(tensor2.device)
print(tensor2.dtype)

# .item() extracts the value of a single-element tensor as a Python number
print(tensor1.item())
print(tensor2.item())




tensor(1.)
cpu
torch.float32
tensor(2, device='cuda:0')
cuda:0
torch.int64
1.0
2


Things to remember here are:
- Both the dtype and the device of tensors used in the same operation should match.
- That is, the input tensor and the model weight tensor should have the same data type.
- In addition, the input tensor and the model weight tensor must be on the same device.

If the dtypes differ, PyTorch promotes the result to the wider type (int64 + float32 gives float32). A device mismatch normally raises a RuntimeError; the `tensor1 + tensor2` cell in this section works only because a 0-dimensional CPU tensor is treated like a plain scalar.
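
As a hedged sketch of how mixed dtypes and devices behave with tensors that are not 0-dimensional: the names `i`, `f`, and `s` are made up, and the GPU part only runs when CUDA is available.

```python
import torch

i = torch.tensor([1, 2, 3])          # int64
f = torch.tensor([0.5, 0.5, 0.5])    # float32
s = i + f                            # the result is promoted to float32
print(s.dtype)

# torch.result_type reports the dtype an operation on the two tensors would produce.
print(torch.result_type(i, f))

# 1-D tensors on different devices are not converted automatically.
if torch.cuda.is_available():
    try:
        _ = i.to("cuda") + f
    except RuntimeError as err:
        print("device mismatch:", type(err).__name__)
```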

In [5]:
tensor3 = tensor1 + tensor2
print(tensor3)
print(tensor3.device)
print(tensor3.dtype)

tensor(3., device='cuda:0')
cuda:0
torch.float32


#### 2.2 Vector

In [6]:
import numpy as np
vec = np.array([1, 2, 3, 4])

In NumPy, you represent vectors or tensors by passing sequence data (lists, tuples) to `np.array`.  
In `torch.tensor(data)`, the data container can be a list, a tuple, or a NumPy array.  
Creating tensor objects in PyTorch is very similar to creating arrays in NumPy.  

In [7]:
vec1 = ftensor([1.0, -1.1, 3.9]) # list input
vec2 = dtensor((1.0, -1.1, 3.9)) # tuple input
vec3 = tensor(np.array([1, 2, 3, 4])) # NumPy array input (keeps the array's dtype)
vec4 = tensor([1.0, -1.1, 3.9], dtype=int) # values are truncated toward zero when cast to int

print(vec1)
print(vec1.dtype)
print(vec2)
print(vec3)
print(vec4)

tensor([ 1.0000, -1.1000,  3.9000])
torch.float32
tensor([ 1.0000, -1.1000,  3.9000], dtype=torch.float64)
tensor([1, 2, 3, 4], dtype=torch.int32)
tensor([ 1, -1,  3])


#### 2.3 Matrix  
Now let's create a matrix tensor.  
Once you start handling data as matrices, it is a good idea to set the device to a GPU for parallel processing.  
Since it is tedious to write out the data manually, we will use `rand`.  
Let's fix the seed so that `rand` returns the same results on every run. 
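
A small CPU-only sketch of what fixing the seed means: resetting the same seed before each call reproduces the same random values. The names `first` and `second` are made up for the example.

```python
import torch

torch.manual_seed(1)
first = torch.rand(2, 2)

torch.manual_seed(1)      # reset to the same seed
second = torch.rand(2, 2)

# Both draws start from the same generator state, so they match exactly.
print(torch.equal(first, second))
```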

In [8]:
torch.manual_seed(1) # torch.manual_seed(seed) fixes the random seed

<torch._C.Generator at 0x2a5832a96f0>

In [9]:
mat = torch.rand([2, 2], device='cuda')
print(mat)
print(mat.device)


tensor([[0.8903, 0.0275],
        [0.9031, 0.5386]], device='cuda:0')
cuda:0


#### 2.4 Tensor
Now let's create a tensor with three or more dimensions.  
`to(dtype=dtype, device=device)`: a tensor method that returns the tensor converted to the desired data type and placed on the desired device.


In [10]:
torch.manual_seed(2)
# T = torch.rand([64, 64, 3], device='cuda:0')
T = torch.rand([64, 64, 3], device='cpu')
T = T * 256
print(T[0:1])
print(T.dtype)

T = T.to(int) 
T = T.to(device='cuda:0')
print(T[0:1])
print(T.dtype)

tensor([[[157.3619,  97.5393, 163.1013],
         [121.4620, 182.6800, 158.4733],
         [113.2884,  24.5168, 157.2242],
         [ 14.6759, 144.8228, 136.5066],
         [ 99.8530, 232.6652, 136.5423],
         [181.0788, 182.1689,  52.4888],
         [ 78.7924, 251.0999,   2.6269],
         [119.3065, 117.8560, 218.7916],
         [115.8311, 161.7047, 121.8555],
         [ 56.3268,  55.4509,  65.8133],
         [ 11.7242,  44.9303, 158.1247],
         [212.2427, 134.3056,  69.3261],
         [184.2539,  78.8643,  99.6456],
         [ 57.8295,  87.8077,   9.3952],
         [182.6163, 177.7710, 153.4251],
         [190.8456, 182.2505, 133.6631],
         [141.5559, 137.7742, 196.3005],
         [213.9836, 219.9226, 202.1895],
         [ 96.7924, 122.3017, 101.9875],
         [202.4617, 142.2154, 246.4687],
         [192.9206,  18.6036, 165.4441],
         [250.9919, 241.6952, 125.9885],
         [170.4645,   7.9294,  87.1935],
         [190.4178,  11.3967, 239.5140],
         [ 43.82

# Pop Quiz 1 (Easy)  
1) Check the memory capacity and current usage of the GPU by running `!nvidia-smi`
2) Create a four-dimensional random tensor of size (1000, 64, 64, 3) and place it in GPU memory
3) Run `!nvidia-smi` again and calculate how much GPU memory usage has increased

#### If your laptop doesn't have a GPU, set device to the CPU and check the memory usage through the task manager.
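
As a cross-check for the number you read from `nvidia-smi` or the task manager, the size of a tensor's data can be estimated from its element size and element count. The sketch below uses a smaller, made-up shape on the CPU; the usage reported by the tools may be larger because of allocator and runtime overhead.

```python
import torch

t = torch.rand(10, 64, 64, 3)               # float32 by default, 4 bytes per element
n_bytes = t.element_size() * t.nelement()   # bytes per element x number of elements
print(n_bytes)                              # 10 * 64 * 64 * 3 * 4 = 491520
```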

In [None]:
!nvidia-smi

In [4]:
# Your code here. A Python name cannot start with a digit, so use e.g. tensor_4d.
tensor_4d = None

In [None]:
!nvidia-smi

# 3. Tensor manipulation

In linear algebra, we learn how to add and subtract two vectors and how to compute their inner and outer products, and we learn matrix operations such as addition, subtraction, multiplication, and inversion. All of these vector, matrix, and tensor operations can be performed in PyTorch in a similar way. 
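
Addition, subtraction, and inversion are not shown in the cells of this section, so here is a minimal sketch of them. The matrices `A` and `B` are made-up examples, and `torch.linalg.inv` is used for the inverse.

```python
import torch

A = torch.tensor([[2.0, 0.0],
                  [0.0, 4.0]])
B = torch.tensor([[1.0, 1.0],
                  [1.0, 1.0]])

print(A + B)   # elementwise addition
print(A - B)   # elementwise subtraction

# The inverse needs a floating-point, square, invertible matrix.
A_inv = torch.linalg.inv(A)
print(A_inv)
print(A @ A_inv)   # approximately the identity matrix
```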

#### 3.1 Vector, Matrix, Tensor Operation

In [11]:
# Vector Inner Product
vec1 = tensor([1, 2, 3])
vec2 = tensor([4, 5, 6])
# For dot product of vectors, use the torch.dot method or the tensor.dot method.
vec3 = vec1.dot(vec2) # 1 * 4 + 2 * 5 + 3 * 6
print(vec3)




tensor(32)


In [12]:
# Vector Cross Product
# Both vectors must have length 3. dim is passed explicitly because omitting it is deprecated in recent PyTorch versions.
cross_vec = torch.cross(vec1, vec2, dim=0)
print(cross_vec)

tensor([-3,  6, -3])


In [13]:
# Vector Outer Product
outer_vec = torch.outer(vec1, vec2) # Calculates the outer product of the two vectors. The output is a matrix.
print(outer_vec)


tensor([[ 4,  5,  6],
        [ 8, 10, 12],
        [12, 15, 18]])


In [14]:
# The asterisk (*) performs the elementwise product.
vec4 = vec1 * vec2
print(vec4) # [4, 10, 18]



tensor([ 4, 10, 18])


In [15]:
# Tensors of the same (or broadcastable) shape can be multiplied elementwise with the asterisk.
mat1 = tensor([[1, 2], 
              [3, 4]])
mat2 = tensor([[5, 6], 
              [7, 8]])
print(mat1 * mat2) # 5, 12, 21, 32

tensor([[ 5, 12],
        [21, 32]])


In [16]:
# The at sign (@) performs matrix multiplication.
# Matrix multiplication can also be calculated using the matmul method.
# For tensors with 3 or more dimensions, the leading axes are treated as a batch.

mat4 = mat1 @ mat2
mat5 = mat1.matmul(mat2)
print(mat4)
print(mat5)

ten1 = tensor([[[1, 2], 
              [3, 4]],
                [[5, 6], 
              [7, 8]]])
ten2 = tensor([[[1, 2], 
              [3, 4]],
                [[5, 6], 
              [7, 8]]])

a = ten1.matmul(ten2)
b = ten1 @ ten2
print(a)
print(b)


tensor([[19, 22],
        [43, 50]])
tensor([[19, 22],
        [43, 50]])
tensor([[[  7,  10],
         [ 15,  22]],

        [[ 67,  78],
         [ 91, 106]]])
tensor([[[  7,  10],
         [ 15,  22]],

        [[ 67,  78],
         [ 91, 106]]])


In [19]:
# The n x n identity matrix can be created with torch.eye(n).
eye_tensor = torch.eye(2)
print(eye_tensor.shape)

# torch.diag(matrix) extracts the diagonal elements as a vector.
v = torch.randn([4,4])
print(v)
diag_vector = torch.diag(v)
print(diag_vector)

torch.Size([2, 2])
tensor([[-2.3838, -1.8098, -1.3820, -2.1522],
        [ 0.1541, -1.0625,  0.1794, -0.5829],
        [ 0.1726,  0.1184, -0.5380,  0.1871],
        [ 0.5071,  0.0385,  1.2226, -0.0821]])
tensor([-2.3838, -1.0625, -0.5380, -0.0821])


In [20]:
# tensordot with the default dims=2 contracts the last two axes of ten1 with the first two axes of ten2.
print(torch.tensordot(ten1, ten2))

tensor([[ 50,  60],
        [114, 140]])


#### 3.2 **Tensor dimension manipulation** (Very important!)  
To feed data into a deep learning model, we constantly need to rearrange the dimensions of tensors.  
The methods for manipulating dimensions are as follows.  
1) Dimension reduction, expansion: `squeeze(), unsqueeze()` remove or add an axis of size 1.  
2) Dimension exchange: `transpose(), permute()` change the order of the dimensions. They are usually used to handle tensors with three or more dimensions.  
3) Changing tensor shape: `flatten(), view(), reshape()`  
4) Tensor concatenation and stacking: `cat(), stack()`  

#### 3.2.1 Dimension reduction, extension
`tensor.squeeze(axis)` or `torch.squeeze(tensor, axis)` removes the given axis if its size is 1; without an argument, every axis of size 1 is removed.  
`tensor.unsqueeze(axis)` or `torch.unsqueeze(tensor, axis)` inserts a new axis of size 1 at the given position.  
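
One detail worth checking by hand is that `squeeze` leaves an axis alone when its size is not 1. A minimal sketch with a made-up shape:

```python
import torch

x = torch.zeros(3, 1, 4)

print(x.squeeze(1).shape)     # axis 1 has size 1, so it is removed
print(x.squeeze(0).shape)     # axis 0 has size 3, so nothing changes
print(x.unsqueeze(-1).shape)  # a new size-1 axis is appended at the end
```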

In [18]:
vec1 = tensor([1, 2, 3])
print(vec1.shape)
unsq_vec1 = vec1.unsqueeze(1)
print(unsq_vec1.shape)
unsq_vec1 = unsq_vec1.unsqueeze(1)
print(unsq_vec1.shape)
sq_vec1 = unsq_vec1.squeeze(2)
print(sq_vec1.shape)
sq_vec1 = sq_vec1.squeeze(1)
print(sq_vec1.shape)
sq_vec1 = sq_vec1.squeeze()
print(sq_vec1.shape)

torch.Size([3])
torch.Size([3, 1])
torch.Size([3, 1, 1])
torch.Size([3, 1])
torch.Size([3])
torch.Size([3])


In [89]:
# torch.ones(shape) creates a tensor filled with 1; torch.zeros(shape) creates one filled with 0.
tensor_7d = torch.ones([10, 32, 1, 1, 64, 64, 3])
tensor_7d = torch.zeros([10, 32, 1, 1, 64, 64, 3])


# Use `ones_like` / `zeros_like` to reuse the shape of an existing tensor instead of typing it out.
tensor_7d2 = torch.ones_like(tensor_7d)
tensor_7d3 = torch.zeros_like(tensor_7d)
print(tensor_7d.shape)
print(tensor_7d2.shape)

tensor_6d = tensor_7d.squeeze(axis=2) # Remove the axis at index 2 of the shape (its size is 1).
print(tensor_6d.shape) # [10, 32, 1, 64, 64, 3]
tensor_5d = tensor_7d.squeeze()
print(tensor_5d.shape) # [10, 32, 64, 64, 3]



torch.Size([10, 32, 1, 1, 64, 64, 3])
torch.Size([10, 32, 1, 1, 64, 64, 3])
torch.Size([10, 32, 1, 64, 64, 3])
torch.Size([10, 32, 64, 64, 3])


In [87]:
tensor_4d = torch.ones([32, 3, 64, 64])
print(tensor_4d.shape)

tensor_5d = tensor_4d.unsqueeze(0)
tensor_5d2 = tensor_4d.unsqueeze(4) # equivalent to unsqueeze(-1)
print(tensor_5d.shape)
print(tensor_5d2.shape)


torch.Size([32, 3, 64, 64])
torch.Size([1, 32, 3, 64, 64])
torch.Size([32, 3, 64, 64, 1])


#### 3.2.2 Dimension exchange: ` transpose(), permute()  `
The methods above are used to swap dimensions.


In [23]:
matrix = torch.ones([2, 3])
print(matrix.shape)

torch.Size([2, 3])


In [25]:
# torch.transpose(input, dim0, dim1) swaps two dimensions of the input tensor.
# Available by calling tensor.transpose() or via torch.transpose().
transposed = torch.transpose(matrix, 0, 1)
print(transposed.shape)
transposed2 = matrix.transpose(0, 1)
print(transposed2.shape)

tensor_4d = torch.ones([4, 2, 6, 11])
transposed_4d = tensor_4d.transpose(0, 2)
print(transposed_4d.shape) # 6, 2, 4, 11


torch.Size([3, 2])
torch.Size([3, 2])
torch.Size([6, 2, 4, 11])


In [26]:
# torch.permute() generalizes transpose: list all dimension axes in the order you want.

permuted_4d = torch.permute(tensor_4d, [3, 2, 0, 1])
print(permuted_4d.shape)
permuted_4d2 = torch.permute(tensor_4d, (1, 0, 2, 3))
print(permuted_4d2.shape)

torch.Size([11, 6, 4, 2])
torch.Size([2, 4, 6, 11])
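
A common use of `permute()` is converting an image between a height-width-channel layout and a channel-height-width layout. The sketch below assumes a made-up 64 x 64 RGB image; the names `image_hwc` and `image_chw` are hypothetical.

```python
import torch

image_hwc = torch.rand(64, 64, 3)        # hypothetical image: height, width, channels
image_chw = image_hwc.permute(2, 0, 1)   # reorder to channels, height, width
print(image_chw.shape)

# permute only reorders the axes, so each channel keeps the same values.
print(torch.equal(image_chw[1], image_hwc[:, :, 1]))
```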


# Pop Quiz 2 (Easy)   
Add a singleton dimension (size 1) at the 0th axis of `tensor_4d` and swap the 0th and 2nd dimensions using `permute()`

In [2]:
# Code implementation
pass

#### 3.2.3 Changing the shape of a tensor: `flatten(), reshape(), view()`  
These three methods are mainly used when passing data between a CNN layer and an FC layer.  
They are also often used to change the shape of an input tensor.


`torch.flatten()` converts the entire tensor into a 1D vector.


In [27]:
t = tensor([[[1, 2],
            [3, 4]],
           [[5, 6],
           [7, 8]]])
print(t.shape)

torch.Size([2, 2, 2])


In [28]:
flattened_t = torch.flatten(t)
print(flattened_t)
print(flattened_t.shape)

tensor([1, 2, 3, 4, 5, 6, 7, 8])
torch.Size([8])


In [29]:
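# If start_dim (and end_dim) are given, only the dimensions from start_dim to end_dim are flattened.
# Here start_dim=2 is already the last axis, so the shape stays the same.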
flattened_t2 = torch.flatten(t, start_dim=2)
print(flattened_t2)
print(flattened_t2.shape)

tensor([[[1, 2],
         [3, 4]],

        [[5, 6],
         [7, 8]]])
torch.Size([2, 2, 2])
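
For the CNN-to-FC case, a frequently used pattern is `start_dim=1`, which keeps the batch axis and merges everything after it. This is a sketch with a made-up batch shape (N, C, H, W):

```python
import torch

batch = torch.zeros(32, 3, 8, 8)           # hypothetical batch: N, C, H, W
flat = torch.flatten(batch, start_dim=1)   # keep axis 0, merge the rest: 3 * 8 * 8 = 192
print(flat.shape)
```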


`reshape()` rearranges the tensor into the desired shape.  
The product of the new dimension sizes must equal the total number of elements in the original tensor, e.g. 8 -> (2, 2, 2) or (4, 2).  
If -1 is given for one dimension, its size is calculated automatically.



In [108]:
t2 = torch.arange(36)
t2.shape

torch.Size([36])

In [109]:
reshaped_t = t2.reshape(3, 3, 4)
print(reshaped_t.shape)

torch.Size([3, 3, 4])


In [107]:
reshaped_t = t2.reshape(-1, 2) 
print(reshaped_t.shape)

torch.Size([18, 2])


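`tensor.view(m, ...)`: It does the same thing as reshape, but behaves differently from a memory perspective. `view` always shares memory with the original tensor and fails if the memory layout does not allow it, while `reshape` copies the data when it has to. See the documentation for details.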
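
The memory difference can be seen on a transposed tensor, which is no longer contiguous. A minimal sketch with made-up names (`base`, `transposed`, `flat`, `shared`):

```python
import torch

base = torch.arange(6).reshape(2, 3)
transposed = base.t()                 # shape (3, 2), no longer contiguous in memory
print(transposed.is_contiguous())

try:
    transposed.view(6)                # view cannot reinterpret this memory layout
except RuntimeError:
    print("view failed on a non-contiguous tensor")

flat = transposed.reshape(6)          # reshape copies the data in this case
print(flat)

shared = base.view(3, 2)              # view on a contiguous tensor shares memory
shared[0, 0] = 100
print(base[0, 0])                     # the original tensor sees the change
```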


In [116]:
t3 = torch.arange(24940)
viewed_t1 = t3.view(5, 4, -1)
viewed_t2 = t3.reshape(10, -1, 1)
print(viewed_t1.shape)
print(viewed_t2.shape)


torch.Size([5, 4, 1247])
torch.Size([10, 2494, 1])


#### 3.2.4 Stacking and Concatenation: `cat(), stack()` 
Finally, we use `cat()` and `stack()` to join tensors.  
`cat()` merges tensors along an existing axis. It is mainly used inside a model to combine tensors that go into the next hidden layer.  
`stack()` joins tensors along a new axis. It is mainly used to build mini-batches by stacking data.


In [30]:
# The torch.cat() function merges a sequence of two or more tensors along an existing axis.
# The tensors must have the same shape, except along the axis being concatenated.
vec1 = tensor([1, 2, 3])
vec2 = tensor([4, 5, 6])
vec3 = tensor([7, 8, 9])
vec4 = torch.cat([vec1, vec2, vec3])
print(vec4)

tensor([1, 2, 3, 4, 5, 6, 7, 8, 9])


In [31]:
# The stack() method stacks two or more tensors of the same shape along a new axis.
batch_vec = torch.stack([vec1, vec2, vec3], dim=0)
print(batch_vec)
print(batch_vec.shape)

batch_vec = torch.stack([vec1, vec2, vec3], dim=1)
print(batch_vec)
print(batch_vec.shape)

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
torch.Size([3, 3])
tensor([[1, 4, 7],
        [2, 5, 8],
        [3, 6, 9]])
torch.Size([3, 3])
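
The difference between the two functions is easiest to see from the resulting shapes on 2D inputs. A short sketch with made-up matrices `m1` and `m2`:

```python
import torch

m1 = torch.zeros(2, 3)
m2 = torch.ones(2, 3)

print(torch.cat([m1, m2], dim=0).shape)    # existing axis 0 grows: (4, 3)
print(torch.cat([m1, m2], dim=1).shape)    # existing axis 1 grows: (2, 6)
print(torch.stack([m1, m2], dim=0).shape)  # a new axis 0 is created: (2, 2, 3)
```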


# Pop Quiz 3
1) Create a vector of length 4096 using `arange()`
2) Reshape the vector into a square (64x64) matrix using `reshape()` or `view()`
3) Repeat steps 1 and 2 to create three square matrices, then add a singleton dimension (size 1) at the 0th axis of each tensor using `unsqueeze()`
4) Swap the 0th axis and the last axis of each tensor using `transpose()`
5) Put the three tensors in a list and concatenate them along the last axis using `cat()`; you should get a 64x64x3 tensor
6) Repeat the above process 4 times and stack the resulting 64x64x3 tensors using `stack()` to create a 4 x 64 x 64 x 3 tensor

In [32]:
torch.manual_seed(1030)
# code : 
pass

# Pop Quiz 4
1. Re-implement the Policy class from the Pop Quiz in Class.ipynb using torch components.

#### Policy class requirements
- `__init__(self, state_dim, act_dim)`:
   - Function: Receives the state dimension and the action dimension as input and stores them as member variables.
   - A random matrix with the shape [act_dim, state_dim] is stored in self.weight. (Hint: use the `torch.randn` method)
- `__call__(self, state)`:
   - Function: Internally calls get_dist() and returns the result.

- `get_dist(self, state)`:
   - Function: Takes state as input and returns a distribution.
   - Obtain a linearly transformed vector through matrix multiplication of self.weight and state. Matrix multiplication is computed in the same way as `mat3 = mat1 @ mat2`.
   - Pass the linearly transformed vector through a nonlinear function, as in `nonlinear_vector = relu(vector)`.
   - Apply the `softmax` function to the resulting vector to obtain the probability distribution `probs`.
   - Get the categorical distribution `dist` via `dist=Categorical(probs)` and return it.
  
- `get_action(self, dist)`:
   - Function: Takes a distribution as input, samples action_index, and returns it.
   - Return action_index through `dist.sample()`.
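
The last three steps (`relu`, `softmax`, `Categorical`) can be tried in isolation before writing the class. The sketch below is not the quiz solution: the input vector `scores` is a made-up stand-in for the output of the matrix multiplication.

```python
import torch
from torch.nn.functional import softmax, relu
from torch.distributions import Categorical

torch.manual_seed(0)

scores = torch.tensor([-1.0, 0.0, 2.0, 0.5])   # made-up values, one per action
activated = relu(scores)                       # negative entries become 0
probs = softmax(activated, dim=0)              # non-negative values that sum to 1
dist = Categorical(probs)                      # categorical distribution over 4 actions
action_index = dist.sample()                   # a 0-dimensional integer tensor

print(probs, probs.sum())
print(action_index)
```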

In [33]:
import torch
from torch.nn.functional import softmax, relu
from torch.distributions import Categorical

class Policy:
    def __init__(self, obs_dim, act_dim):
        pass
    
    def get_dist(self, state):
        pass
    
    def __call__(self, state):
        pass
    
    def get_action(self, dist):
        pass

In [34]:
obs_dim = 8
act_dim = 4
policy = Policy(obs_dim, act_dim)

state = torch.randn(8)
dist = policy(state)
action = policy.get_action(dist)
print(action)

None
