# 1. AI, ML and DL
AI = a technology that mimics human intelligence so that computers can do the work people do  
Machine Learning (ML) and Deep Learning (DL) are its representative approaches  

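Classical machine learning typically needs a manual preprocessing step on the given data (feature extraction); deep learning largely skips that step because the network learns the features itself (CNN, RNN, ...)  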

Otherwise deep learning is not very different from machine learning (training (train, learning) -> prediction)

## Classifying AI
1. Supervised Learning
   1. Classification
      1. KNN
      2. SVM
      3. DT
      4. Logistic Regression
      5. ...
   2. Regression
      1. Linear Regression
2. Unsupervised Learning
   1. Clustering
      1. K-means
      2. DBSCAN
      3. ...
   2. Dimensionality Reduction
      1. PCA
      2. LDA
      3. ...
3. Reinforcement Learning
   1. MDP (Markov Decision Process)
   2. Monte Carlo method

## Deep Learning?
A kind of **machine learning** method built on deep neural network theory, which is modeled on how neural networks work  
AI > Machine Learning > Deep Learning

Data -> define model -> compile model (optimizer, loss function, ...) -> train -> predict

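### len(data)==1000, batch==20, epochs==10?
With 1000 samples and a batch size of 20, the weights are updated 50 times per epoch (1000 / 20); repeating that for 10 epochs gives 500 weight updates in total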
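
The arithmetic can be checked in a few lines. This is only a sketch: the variable names are made up for illustration, and `math.ceil` is used on the assumption that a final partial batch would still trigger one update.

```python
import math

num_samples = 1000  # len(data)
batch_size = 20
epochs = 10

# One weight update per batch; a trailing partial batch would count as one more.
updates_per_epoch = math.ceil(num_samples / batch_size)
total_updates = updates_per_epoch * epochs

print(updates_per_epoch, total_updates)
```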

Deep learning is a branch of machine learning, but it clearly differs in that it uses a deep neural network (Deep Neural Network, DNN)  
-> Deep neural network? It can teach itself which features of the dataset are important  
-> [**Backpropagation**](https://ko.wikipedia.org/wiki/%EC%97%AD%EC%A0%84%ED%8C%8C)  
-> [**Vanishing gradient**](https://ko.wikipedia.org/wiki/%EA%B8%B0%EC%9A%B8%EA%B8%B0_%EC%86%8C%EB%A9%B8_%EB%AC%B8%EC%A0%9C)  
-> [**Transfer learning & fine-tuning**](https://dacon.io/forum/405988)

# 2. PyTorch
> A framework for tensor operations on the GPU and for building dynamic neural networks, ~~differentiation machine~~

### Dynamic neural network?
A neural network whose structure can change on every training iteration
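
A minimal sketch of what "dynamic" can mean in practice, assuming a hypothetical `DynamicNet` whose depth is chosen per call. The class name, layer sizes, and the `depth` argument are illustrative and not part of these notes. Because PyTorch builds the computation graph while `forward` runs, ordinary Python control flow is enough to change the network between iterations.

```python
import torch
import torch.nn as nn

class DynamicNet(nn.Module):  # hypothetical example module
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(4, 4)
        self.out = nn.Linear(4, 1)

    def forward(self, x, depth):
        # The hidden layer is reused `depth` times; the graph is rebuilt on every call.
        for _ in range(depth):
            x = torch.relu(self.hidden(x))
        return self.out(x)

net = DynamicNet()
x = torch.randn(2, 4)
for depth in (1, 3):
    y = net(x, depth)
    y.sum().backward()  # autograd follows whichever graph was just built
    print(depth, y.shape)
```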

In [24]:
# 1D -> Vector
# 2D -> Matrix
# 3D or higher -> Tensor
import torch
import numpy as np

vector = np.array([1, 2, 3])
matrix = np.array([
    [1, 2, 3],
    [4, 5, 6]
])
tensor = torch.tensor(
    [[[1, 1, 1],
      [2, 2, 2]],
     [[3, 3, 3],
     [4, 4, 4]]]
)

In [None]:
torch # tensor package with GPU support
torch.autograd # automatic differentiation package
torch.nn # neural networks
torch.multiprocessing # multiprocessing
torch.utils # utilities

In [None]:
from torch.utils.data import Dataset, DataLoader
from torch.utils import checkpoint

### offset and stride

In [33]:
A = torch.tensor([
    [1, 2, 3],
    [4, 5, 6]
])

print(A) # storage: 1 2 3 4 5 6 / offset=0, stride=(3, 1)
print(A.T) # same storage, read as 1 4 2 5 3 6 / offset=0, stride=(1, 3)

tensor([[1, 2, 3],
        [4, 5, 6]])
tensor([[1, 4],
        [2, 5],
        [3, 6]])
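
Offset and stride can be read directly from a tensor. The sketch below uses `Tensor.stride()`, `Tensor.storage_offset()`, `Tensor.is_contiguous()`, and `Tensor.data_ptr()`; the `row` variable is added only for illustration.

```python
import torch

A = torch.tensor([
    [1, 2, 3],
    [4, 5, 6]
])

# Row-major storage: step 3 elements to reach the next row, 1 for the next column.
print(A.stride(), A.storage_offset())

# The transpose is a view on the same storage with the strides swapped.
print(A.T.stride(), A.T.storage_offset())
print(A.T.is_contiguous())
print(A.data_ptr() == A.T.data_ptr())

# Indexing the second row gives a view that starts 3 elements into the storage.
row = A[1]
print(row.stride(), row.storage_offset())
```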


### Basic

In [49]:
torch.cuda.is_available() # check whether a GPU is available
device = "cuda" if torch.cuda.is_available() else "cpu"

In [45]:
torch.cuda.get_device_name()

'NVIDIA GeForce RTX 4080'

In [50]:
# create tensors
print(torch.tensor([[1, 2], [3, 4]]))
print(torch.tensor([[1, 2], [3, 4]], device=device))
print(torch.tensor([[1, 2], [3, 4]], dtype=torch.float64))

tensor([[1, 2],
        [3, 4]])
tensor([[1, 2],
        [3, 4]], device='cuda:0')
tensor([[1., 2.],
        [3., 4.]], dtype=torch.float64)


In [51]:
# tensor -> NumPy array
temp = torch.tensor([[1, 2], [3, 4]])
print(temp.numpy())

[[1 2]
 [3 4]]


In [54]:
# GPU tensor -> CPU tensor -> NumPy array
temp = torch.tensor([[1, 2], [3, 4]], device=device)
print(temp)
print(temp.to("cpu"))
print(temp.to("cpu").numpy())

tensor([[1, 2],
        [3, 4]], device='cuda:0')
tensor([[1, 2],
        [3, 4]])
[[1 2]
 [3 4]]


In [55]:
# indexing and slicing tensors
temp = torch.tensor([1, 2, 3, 4, 5, 6])
print(temp[0], temp[1], temp[2])
print(temp[:3])

tensor(1) tensor(2) tensor(3)
tensor([1, 2, 3])


In [59]:
v = torch.tensor([1, 2, 3])
w = torch.tensor([4, 5, 6])

print(v.dot(w))

tensor(32)


In [60]:
# reshaping dimensions
temp = torch.tensor([
    [1, 2],
    [3, 4]
])

temp

tensor([[1, 2],
        [3, 4]])

In [62]:
temp.shape

torch.Size([2, 2])

In [64]:
print(temp.view(4, 1).shape)
temp.view(4, 1)

torch.Size([4, 1])


tensor([[1],
        [2],
        [3],
        [4]])

In [65]:
print(temp.view(-1).shape)
temp.view(-1)

torch.Size([4])


tensor([1, 2, 3, 4])

In [66]:
print(temp.view(1, -1).shape)
temp.view(1, -1)

torch.Size([1, 4])


tensor([[1, 2, 3, 4]])

In [73]:
torch.stack([v, w], dim=0), torch.stack([v, w], dim=1)

(tensor([[1, 2, 3],
         [4, 5, 6]]),
 tensor([[1, 4],
         [2, 5],
         [3, 6]]))

In [84]:
x = torch.randn(2, 3)
print(x.shape)
x

torch.Size([2, 3])


tensor([[-0.1265,  1.2245,  0.3920],
        [-0.8166,  1.1322,  0.2424]])

In [85]:
print(torch.cat((x, x, x), dim=0).shape)
torch.cat((x, x, x), dim=0)

torch.Size([6, 3])


tensor([[-0.1265,  1.2245,  0.3920],
        [-0.8166,  1.1322,  0.2424],
        [-0.1265,  1.2245,  0.3920],
        [-0.8166,  1.1322,  0.2424],
        [-0.1265,  1.2245,  0.3920],
        [-0.8166,  1.1322,  0.2424]])

In [86]:
print(torch.cat((x, x, x), dim=1).shape)
torch.cat((x, x, x), dim=1)

torch.Size([2, 9])


tensor([[-0.1265,  1.2245,  0.3920, -0.1265,  1.2245,  0.3920, -0.1265,  1.2245,
          0.3920],
        [-0.8166,  1.1322,  0.2424, -0.8166,  1.1322,  0.2424, -0.8166,  1.1322,
          0.2424]])

In [90]:
torch.transpose(x, 0, 1) # transpose swaps two dimensions; permute can reorder all dimensions at once

tensor([[-0.1265, -0.8166],
        [ 1.2245,  1.1322],
        [ 0.3920,  0.2424]])
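
The difference between `transpose` and `permute` shows up with three or more dimensions. A small sketch, assuming a hypothetical tensor `t` of shape (2, 3, 4):

```python
import torch

t = torch.zeros(2, 3, 4)

# transpose swaps exactly two dimensions (here dim 0 and dim 2).
swapped = torch.transpose(t, 0, 2)
print(swapped.shape)  # torch.Size([4, 3, 2])

# permute takes the full new order of dimensions.
reordered = t.permute(2, 0, 1)
print(reordered.shape)  # torch.Size([4, 2, 3])
```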

### Data

In [101]:
import pandas as pd

random_xy_data = pd.read_csv("../data/random_xy_data.csv")

print(random_xy_data.shape)
random_xy_data

(10, 2)


Unnamed: 0,x,y
0,5,8
1,41,90
2,73,32
3,21,65
4,27,12
5,36,39
6,16,65
7,45,51
8,5,14
9,97,76


In [119]:
print(torch.from_numpy(data.values).shape)
print(torch.from_numpy(data.values).unsqueeze(dim=0).shape)
print(torch.from_numpy(data.values).unsqueeze(dim=1).shape)
print(torch.from_numpy(data.values).unsqueeze(dim=2).shape)

torch.Size([10, 2])
torch.Size([1, 10, 2])
torch.Size([10, 1, 2])
torch.Size([10, 2, 1])


In [106]:
data = random_xy_data.copy()

x = torch.from_numpy(data["x"].values).unsqueeze(dim=1).float()
y = torch.from_numpy(data["y"].values).unsqueeze(dim=1).float()

In [125]:
pd.read_csv("../data/Spaceship_Titanic_train.csv").head(3)

Unnamed: 0,PassengerId,HomePlanet,CryoSleep,Cabin,Destination,Age,VIP,RoomService,FoodCourt,ShoppingMall,Spa,VRDeck,Name,Transported
0,0001_01,Europa,False,B/0/P,TRAPPIST-1e,39.0,False,0.0,0.0,0.0,0.0,0.0,Maham Ofracculy,False
1,0002_01,Earth,False,F/0/S,TRAPPIST-1e,24.0,False,109.0,9.0,25.0,549.0,44.0,Juanna Vines,True
2,0003_01,Europa,False,A/0/S,TRAPPIST-1e,58.0,True,43.0,3576.0,0.0,6715.0,49.0,Altark Susent,False


In [147]:
temp = pd.read_csv("../data/Spaceship_Titanic_train.csv")
temp[temp.dtypes[temp.dtypes!="object"].index]

Unnamed: 0,Age,RoomService,FoodCourt,ShoppingMall,Spa,VRDeck,Transported
0,39.0,0.0,0.0,0.0,0.0,0.0,False
1,24.0,109.0,9.0,25.0,549.0,44.0,True
2,58.0,43.0,3576.0,0.0,6715.0,49.0,False
3,33.0,0.0,1283.0,371.0,3329.0,193.0,False
4,16.0,303.0,70.0,151.0,565.0,2.0,True
...,...,...,...,...,...,...,...
8688,41.0,0.0,6819.0,0.0,1643.0,74.0,False
8689,18.0,0.0,0.0,0.0,0.0,0.0,False
8690,26.0,0.0,0.0,1872.0,1.0,0.0,True
8691,32.0,0.0,1049.0,0.0,353.0,3235.0,False


In [159]:
# CustomDataset
from torch.utils.data import Dataset
from torch.utils.data import DataLoader

class CustomDataset(Dataset):
    def __init__(self, file_path): # declare variables, preprocess data, etc.
        self.label = pd.read_csv(file_path).iloc[:10]
        self.label = self.label[self.label.dtypes[self.label.dtypes!="object"].index]

    def __len__(self): # length -> total number of samples
        return len(self.label)

    def __getitem__(self, idx): # return one sample and its label as tensors
        sample = torch.tensor(self.label.iloc[idx, :-1])
        label = torch.tensor(self.label.iloc[idx, -1])
        return sample, label

In [160]:
Spaceship_Titanic_data = CustomDataset("../data/Spaceship_Titanic_train.csv")
loader = DataLoader(Spaceship_Titanic_data, batch_size=4, shuffle=True)

In [161]:
for idx, data in enumerate(loader, 0):
    print(f"{idx} {data[0].size()}")

0 torch.Size([4, 6])
1 torch.Size([4, 6])
2 torch.Size([2, 6])




In [167]:
# torchvision <- dataset package provided alongside torch
# https://pytorch.org/vision/stable/datasets.html
from torchvision.datasets import MNIST
import torchvision.transforms as transforms

transformers = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((.5, ), (1., ))
])

download_root = "./"
train = MNIST(download_root, transform=transformers, train=True, download=True)
# NOTE: MNIST ships only train/test splits, so valid and test below load the same held-out set
valid = MNIST(download_root, transform=transformers, train=False, download=True)
test = MNIST(download_root, transform=transformers, train=False, download=True)

In [170]:
train_loader = DataLoader(train, batch_size=16, shuffle=True)
valid_loader = DataLoader(valid, batch_size=16, shuffle=True)
test_loader = DataLoader(test, batch_size=16, shuffle=True)

### Models
- Layer: CNN, Linear, ...
- Module: one or more layers
- Model: the final network (neural network)

In [176]:
import torch.nn as nn

# simple
model = nn.Linear(in_features=1, out_features=1, bias=True)

model

Linear(in_features=1, out_features=1, bias=True)

In [182]:
# inheritance
class MLP(nn.Module):
    def __init__(self, inputs): # modules, activation functions, ...
        super(MLP, self).__init__()
        self.layer = nn.Linear(inputs, 1)
        self.activation = nn.Sigmoid()
    
    def forward(self, X): # define the computation
        X = self.layer(X)
        X = self.activation(X)
        return X

In [185]:
# inheritance with Seq
class MLP(nn.Module):
    def __init__(self):
        super(MLP, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(in_channels=3, out_channels=64, kernel_size=5),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2)
        )
        self.layer2 = nn.Sequential(
            nn.Conv2d(in_channels=64, out_channels=30, kernel_size=5),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2)
        )
        self.layer3 = nn.Sequential(
            nn.Linear(in_features=30*5*5, out_features=10, bias=True),
            nn.ReLU(inplace=True)
        )
        
    def forward(self, x):
        x = self.layer1(x)
        x = self.layer2(x)
        x = x.view(x.shape[0], -1)
        x = self.layer3(x)
        return x
    
model = MLP()
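
The `in_features=30*5*5` in `layer3` only matches one particular input size. The sketch below rebuilds just the two convolutional stages and assumes 32x32 RGB input. That size is an assumption; the notes do not state it. With that input, each 5x5 convolution trims 4 pixels and each pooling step halves the result: 32 -> 28 -> 14 -> 10 -> 5.

```python
import torch
import torch.nn as nn

# Same conv/pool stack as layer1 and layer2, collapsed into one Sequential.
features = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=64, kernel_size=5),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(in_channels=64, out_channels=30, kernel_size=5),
    nn.ReLU(),
    nn.MaxPool2d(2),
)

x = torch.randn(8, 3, 32, 32)  # assumed input: batch of 8 RGB images, 32x32
feat = features(x)
print(feat.shape)  # torch.Size([8, 30, 5, 5])

flat = feat.view(feat.shape[0], -1)  # flatten everything except the batch dimension
print(flat.shape)  # torch.Size([8, 750])
```

A different input resolution would change the flattened size, and `nn.Linear` would then raise a shape mismatch error.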

In [189]:
model.children # same level

<bound method Module.children of MLP(
  (layer1): Sequential(
    (0): Conv2d(3, 64, kernel_size=(5, 5), stride=(1, 1))
    (1): ReLU(inplace=True)
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (layer2): Sequential(
    (0): Conv2d(64, 30, kernel_size=(5, 5), stride=(1, 1))
    (1): ReLU(inplace=True)
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (layer3): Sequential(
    (0): Linear(in_features=750, out_features=10, bias=True)
    (1): ReLU(inplace=True)
  )
)>

In [188]:
model.modules # all

<bound method Module.modules of MLP(
  (layer1): Sequential(
    (0): Conv2d(3, 64, kernel_size=(5, 5), stride=(1, 1))
    (1): ReLU(inplace=True)
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (layer2): Sequential(
    (0): Conv2d(64, 30, kernel_size=(5, 5), stride=(1, 1))
    (1): ReLU(inplace=True)
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (layer3): Sequential(
    (0): Linear(in_features=750, out_features=10, bias=True)
    (1): ReLU(inplace=True)
  )
)>
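
`model.children` and `model.modules` above are referenced without being called, so both cells print the same bound-method repr. A sketch on a small hypothetical network (`net` is not the model from these notes) shows what the calls return: `children()` yields only the direct submodules, while `modules()` walks the whole tree, including the network itself.

```python
import torch.nn as nn

net = nn.Sequential(
    nn.Sequential(nn.Linear(4, 8), nn.ReLU()),
    nn.Linear(8, 1),
)

children = list(net.children())  # direct submodules only
modules = list(net.modules())    # net itself plus every nested module

print(len(children))  # 2: inner Sequential, Linear
print(len(modules))   # 5: net, inner Sequential, Linear, ReLU, Linear
```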

### Parameters
- Loss function: the difference (error) between the true and predicted values during training
- Optimizer: decides how the parameters are updated
  - updates via `step`
  - `torch.optim.Optimizer`
  - `zero_grad` resets the gradients of the parameters held by the optimizer
  - the learning rate can be adjusted with `torch.optim.lr_scheduler`
- Learning rate scheduler
- Metrics
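
A minimal sketch of how these pieces fit together, assuming `MSELoss`, `SGD`, and `StepLR` as stand-ins; the notes do not say which loss, optimizer, or scheduler is actually used. The toy data and the `lrs` list are illustrative only.

```python
import torch
import torch.nn as nn

model = nn.Linear(1, 1)
criterion = nn.MSELoss()                                  # loss function
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)   # update rule
# Halve the learning rate every 2 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.5)

x = torch.tensor([[1.0], [2.0]])
y = torch.tensor([[2.0], [4.0]])

lrs = []
for epoch in range(4):
    optimizer.zero_grad()            # clear gradients from the previous step
    loss = criterion(model(x), y)
    loss.backward()                  # compute gradients
    optimizer.step()                 # update the parameters
    scheduler.step()                 # update the learning rate
    lrs.append(optimizer.param_groups[0]["lr"])

print(lrs)
```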

#### Local and global minima
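
As a rough illustration, plain gradient descent on a non-convex function can settle in different minima depending on where it starts. The function `f`, the helper `descend`, the learning rate, and the starting points below are all assumptions chosen for the sketch.

```python
import torch

def f(x):
    # Non-convex: one minimum near x = -1.30 (global) and one near x = 1.13 (local).
    return x**4 - 3 * x**2 + x

def descend(start, lr=0.01, steps=500):
    x = torch.tensor(start, requires_grad=True)
    for _ in range(steps):
        loss = f(x)
        loss.backward()
        with torch.no_grad():
            x -= lr * x.grad  # plain gradient descent step
        x.grad.zero_()
    return x.item()

x_left = descend(-2.0)   # ends in the global minimum
x_right = descend(2.0)   # gets stuck in the local minimum

print(x_left, f(x_left))
print(x_right, f(x_right))
```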

### Train

In [None]:
for epoch in range(1000):
    pred_y = model(x_train)
    loss = criterion(pred_y, y_train) # compute the loss
    optimizer.zero_grad() # reset gradients so they do not accumulate across iterations
    loss.backward()
    optimizer.step()
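
The loop above depends on `x_train`, `y_train`, `criterion`, and `optimizer` being defined elsewhere. A self-contained sketch, assuming synthetic data from y = 2x + 1 with `MSELoss` and `SGD`; none of these choices come from the notes.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic regression data (assumption): y = 2x + 1
x_train = torch.linspace(0, 1, 20).unsqueeze(1)
y_train = 2 * x_train + 1

model = nn.Linear(in_features=1, out_features=1, bias=True)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

first_loss = None
for epoch in range(1000):
    pred_y = model(x_train)
    loss = criterion(pred_y, y_train)  # compute the loss
    if first_loss is None:
        first_loss = loss.item()
    optimizer.zero_grad()              # clear old gradients
    loss.backward()                    # backpropagate
    optimizer.step()                   # update weights

final_loss = loss.item()
print(first_loss, final_loss)
print(model.weight.item(), model.bias.item())
```

After training, the learned weight and bias should sit close to 2 and 1.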

In [None]:
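import torch.nn.functional as F

model.eval() # switch layers such as dropout and batch norm to evaluation mode
pred_y = [] # collected per-batch predictions
valid_loss = 0.0
with torch.no_grad(): # no gradient tracking during evaluation
    for x, y in valid_loader:
        pred = model(x)
        loss = F.cross_entropy(pred, y)
        valid_loss += float(loss)
        pred_y += [pred]

valid_loss = valid_loss / len(valid_loader) # mean loss per batch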