<a href="https://colab.research.google.com/github/youse0ng/pytorch_practice/blob/main/02_pytorch_classification_video.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 02. Neural Network Classification with PyTorch

Classification is the problem of predicting whether something is one thing or another (there can be more than two options).

## 1. Make classification data and get it ready

In [None]:
import sklearn

In [None]:
from sklearn.datasets import make_circles

# Make 1000 samples
n_samples=1000

# Create circles (returns NumPy arrays)
X,y= make_circles(n_samples,
                  noise=0.03,
                  random_state=42)

In [None]:
X[0],y[0]

In [None]:
print(f"First 5 samples of X:\n {X[:5]}")
print(f"First 5 samples of y:\n {y[:5]}")

In [None]:
# Make dataframe of circle data

import pandas as pd

circles=pd.DataFrame({"X1":X[:,0],
                      "X2":X[:,1],
                      "label":y})

circles.head(10)

In [None]:
circles.label.value_counts()
# class 1: 500 samples
# class 0: 500 samples

In [None]:
# Visualize,visualize,visualize
import matplotlib.pyplot as plt
plt.scatter(x=X[:,0],
            y=X[:,1],
            c=y, # color
            cmap=plt.cm.RdYlBu # colormap
)


**Note:** The data we're working with is often referred to as a toy dataset, a dataset that is small enough to experiment with but still sizeable enough to practice the fundamentals.

### 1.1 Check input and output shapes

In [None]:
X.shape,y.shape

In [None]:
# View the first example of features and labels

X_sample=X[0]
y_sample=y[0]

print(f"Values for one sample of X: {X_sample} and the same for y: {y_sample}")
print(f"Shapes for one sample of X: {X_sample.shape} and the same for y: {y_sample.shape}")

# y_sample is a scalar with no dimensions, so its shape prints as ()

### 1.2 Turn data into tensors and create train and test splits

In [None]:
import torch
torch.__version__

In [None]:
type(X), X.dtype
# NumPy's default dtype is float64
# PyTorch's default dtype is float32
# X is a NumPy array

In [None]:
# Turn data into tensors
X=torch.from_numpy(X).type(torch.float) # convert to PyTorch's default dtype (float32)
y=torch.from_numpy(y).type(torch.float) # convert to PyTorch's default dtype (float32)

X[:5],y[:5]

In [None]:
type(X),X.dtype,y.dtype

In [None]:
# Split data into training and test sets
from sklearn.model_selection import train_test_split

X_train,X_test,y_train,y_test=train_test_split(X,
                                               y,
                                               test_size=0.2, # 0.2=20% of data will be test & 80% data will be train
                                               random_state=42 # plays a similar role to torch.manual_seed(42)
)

In [None]:
len(X_train),len(X_test), len(y_train),len(y_test)

# 800 training samples,
# 200 test samples


In [None]:
X_train[0],y_train[0]

In [None]:
n_samples # 1000

1. Inspect the data -> check it with indexing

2. Put the data in a table = `DataFrame`

https://kongdols-room.tistory.com/107

The simplest ways to look at data in pandas:
* `head()`, `head(n=5)`: prints the first n rows of the DataFrame

* `tail()`, `tail(n=5)`: prints the last n rows of the DataFrame

Both are useful for quickly checking what kind of data an object holds.

If you still can't tell what the data looks like, visualize!

3. `import matplotlib.pyplot as plt`

In `scatter()`, `cmap` is the parameter that selects the colormap. A colormap maps the value of each data point to a color.

Colormaps are mainly used for continuous data: when the data ranges from low to high values, the colormap provides a color gradient over that range.

`scatter(c=y)` passes the `y` values to the `c` (color) parameter, so `y` determines the color of each data point.

In general, `c` accepts a single color, a sequence of colors, or a sequence of numbers that is mapped to colors through `cmap`. With `c=y`, `y` is an array of numbers, one per data point.

Here each `y` value is looked up in the colormap: with `RdYlBu`, label 0 is drawn in red and label 1 in blue.

4. Split the data

Use `from sklearn.model_selection import train_test_split`
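The notes on `c` and `cmap` above can be made concrete with a small sketch. It assumes matplotlib's default behaviour of rescaling the numeric values in `c` to the range 0 to 1 before the colormap lookup; the `labels` array is a made-up stand-in for `y`.

```python
import matplotlib.pyplot as plt
import numpy as np

labels = np.array([0, 1, 1, 0])  # hypothetical stand-in for y

# scatter() first rescales the numbers in `c` to [0, 1]
# (by default using the min and max of the data) ...
normalized = (labels - labels.min()) / (labels.max() - labels.min())

# ... and then looks each value up in the colormap, which returns RGBA values.
colors = plt.cm.RdYlBu(normalized)

for label, rgba in zip(labels, colors):
    print(label, np.round(rgba, 2))
```

Label 0 lands on the red end of `RdYlBu` and label 1 on the blue end, which is why the two circles show up as red and blue dots.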

## 2. Building a model

Let's build a model to classify our blue and red dots.

To do so, we want to:
1. Set up device agnostic code so our code will run on an accelerator (GPU) if there is one
2. Construct a model (by subclassing `nn.Module`)
3. Define a loss function and optimizer
4. Create a training loop and test loop

In [None]:
# Import pytorch and nn

import torch
from torch import nn

# Make device agnostic code
device= "cuda" if torch.cuda.is_available() else "cpu"
device

In [None]:
X_train

Now we've set up device agnostic code, let's create a model that:

1. Subclasses `nn.Module` (almost all models in PyTorch subclass `nn.Module`)
2. Creates 2 `nn.Linear()` layers that are capable of handling the shapes of our data
3. Defines a `forward()` method that outlines the forward pass (or forward computation) of our model
4. Gets instantiated and sent to the target `device`
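To see what "capable of handling the shapes of our data" means, here is a minimal sketch that follows a batch through two linear layers. The batch `x` is random, made-up data; only the shapes matter.

```python
import torch
from torch import nn

torch.manual_seed(42)

layer_1 = nn.Linear(in_features=2, out_features=5)  # 2 input features -> 5 hidden units
layer_2 = nn.Linear(in_features=5, out_features=1)  # 5 hidden units -> 1 output logit

x = torch.rand(8, 2)    # hypothetical batch: 8 samples, 2 features each
hidden = layer_1(x)     # shape (8, 5)
out = layer_2(hidden)   # shape (8, 1)
print(x.shape, hidden.shape, out.shape)

# A layer whose in_features does not match the incoming data fails at call time.
bad_layer = nn.Linear(in_features=3, out_features=1)
try:
    bad_layer(hidden)
except RuntimeError as err:
    print("shape mismatch:", type(err).__name__)
```

The rule is that each layer's `in_features` has to equal the size of the last dimension of whatever is fed into it.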

In [None]:
X_train.shape

In [None]:
# 1. Construct a model that subclasses nn.Module

class CirclemodelV0(nn.Module):
  def __init__(self):
    super().__init__()
    # 2. Create 2 nn.Linear layers capable of handling the shapes of our data
    # Tip: in Colab, Shift+Cmd+Space shows a layer's docstring
    self.layer_1=nn.Linear(in_features=2,out_features=5) # in_features=2 because each sample has 2 features (see X_train.shape)
    self.layer_2=nn.Linear(in_features=5,out_features=1) # output layer; in_features must equal the previous layer's out_features (5)
    # If in_features does not match the previous layer's out_features, you get a shape mismatch error.
    # Rule of thumb: multiples of 8 tend to be efficient on the hardware.
    # Rule of thumb: more hidden features give the model more chances to learn patterns in the data.
    # Here the model goes from 2 numbers per sample (X1, X2) to 5, so it has 5 numbers to learn patterns from.

  # 3. Define a forward() method that outlines the forward pass
  def forward(self, x):
    return self.layer_2(self.layer_1(x)) # x -> layer_1 ->layer_2 -> return output

# 4. Instantiate an instance of our model class and send it to the target device
model_0=CirclemodelV0().to(device)
model_0

In [None]:
# Define the same model using nn.Sequential
class CirclemodelV1_sequential(nn.Module):
  def __init__(self):
    super().__init__()
    self.two_linear_layers=nn.Sequential(
        nn.Linear(in_features=2,out_features=5),
        nn.Linear(in_features=5,out_features=1)
    )
  def forward(self, x):
    return self.two_linear_layers(x)

model_sequential=CirclemodelV1_sequential()

model_sequential.state_dict()

In [None]:
next(model_0.parameters()).device # check which device the model's parameters are on
# next() is Python's built-in function that returns the next item from an iterator.
# An iterator is an object that hands back its values one at a time;
# iter() creates one from any iterable object.

#my_list = [1, 2, 3, 4, 5]
#my_iterator = iter(my_list)
#print(next(my_iterator))  # 1
#print(next(my_iterator))  # 2
#print(next(my_iterator))  # 3


tensorflow playground
----
https://playground.tensorflow.org/#activation=tanh&batchSize=10&dataset=circle&regDataset=reg-plane&learningRate=0.03&regularizationRate=0&noise=0&networkShape=4,2&seed=0.35820&showTestData=false&discretize=false&percTrainData=50&x=true&y=true&xTimesY=false&xSquared=false&ySquared=false&cosX=false&sinX=false&cosY=false&sinY=false&collectStats=false&problem=classification&initZero=false&hideText=false

In [None]:
# Let's replicate the model above using nn.Sequential()

model_0=nn.Sequential(
    nn.Linear(in_features=2,out_features=5),
    nn.Linear(in_features=5,out_features=1)
).to(device)

# Unlike the class above, where forward() spells out the forward computation,
# nn.Sequential assumes by default that data passes through the layers in order.
model_0

In [None]:
# Make predictions

with torch.inference_mode():
  untrained_preds=model_0(X_test.to(device))
print(f"Length of predictions: {len(untrained_preds)}, shape: {untrained_preds.shape}")
print(f"Length of test samples: {len(X_test)}, Shape: {X_test.shape}")
print(f"\nFirst 10 predictions:\n{torch.round(untrained_preds[:10])}")
print(f"\nFirst 10 labels:\n{y_test[:10]}")

### 2.1 Setup loss function and optimizer

Which loss function or optimizer should you use?

Again... this is problem specific.

For example, for regression you might want MAE or MSE (mean absolute error or mean squared error).

For classification you might want binary cross entropy or categorical cross entropy (cross entropy).

As a reminder, the loss function measures how *wrong* your model predictions are.

And for optimizers, two of the most common and useful are SGD and Adam; however, PyTorch has many built-in options.

* For some common choices of loss functions and optimizers - https://www.learnpytorch.io/02_pytorch_classification/#21-setup-loss-function-and-optimizer

* For the loss function we're going to use `torch.nn.BCEWithLogitsLoss()`

* For different optimizers see `torch.optim`

https://www.learnpytorch.io/02_pytorch_classification/
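As a quick check of how `nn.BCEWithLogitsLoss()` relates to `nn.BCELoss()`, the sketch below compares the two on a few made-up logits and labels (these values are illustrative, not from the circles dataset).

```python
import torch
from torch import nn

logits = torch.tensor([-2.0, 0.5, 3.0])   # hypothetical raw model outputs
targets = torch.tensor([0.0, 1.0, 1.0])   # matching binary labels

# One call: sigmoid is applied internally
loss_with_logits = nn.BCEWithLogitsLoss()(logits, targets)

# Two steps: apply sigmoid ourselves, then BCELoss
loss_two_steps = nn.BCELoss()(torch.sigmoid(logits), targets)

print(loss_with_logits.item(), loss_two_steps.item())
print(torch.allclose(loss_with_logits, loss_two_steps, atol=1e-6))
```

Both give the same value here; the combined version is generally preferred because it is more numerically stable for large positive or negative logits.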

In [None]:
# Setup the loss function
# loss_fn=nn.BCELoss() # BCELoss requires inputs that have already gone through the sigmoid activation
# Conceptually, BCEWithLogitsLoss() behaves like nn.Sigmoid() followed by nn.BCELoss(),
# but it combines both steps in a single call, which is more numerically stable.
loss_fn =nn.BCEWithLogitsLoss() # BCEWithLogitsLoss = sigmoid activation function built-in
optimizer=torch.optim.SGD(params=model_0.parameters(),
                         lr=0.1)


In [None]:
model_0.state_dict()

In [None]:
# Calculate accuracy - out of 100 examples, what percentage does our model get right?

def accuracy_fn(y_true,y_pred):
  correct=torch.eq(y_true,y_pred).sum().item()
  acc=(correct/len(y_pred))*100
  return acc

## 3. Train model

To train our model, we're going to need to build a training loop with the following steps:

1. Forward pass
2. Calculate the loss
3. Optimizer zero grad
4. Loss backward
5. Optimizer step (gradient descent)
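The five steps above can be sketched as a single training step. The tiny model and the `X_demo`/`y_demo` tensors are made up for illustration; the full loop for the circles data comes later in the notebook.

```python
import torch
from torch import nn

torch.manual_seed(42)

model = nn.Linear(in_features=2, out_features=1)
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(params=model.parameters(), lr=0.1)

X_demo = torch.rand(16, 2)               # hypothetical features
y_demo = (X_demo[:, 0] > 0.5).float()    # hypothetical binary labels

weights_before = model.weight.detach().clone()

model.train()
y_logits = model(X_demo).squeeze()   # 1. forward pass
loss = loss_fn(y_logits, y_demo)     # 2. calculate the loss
optimizer.zero_grad()                # 3. clear gradients left over from the previous step
loss.backward()                      # 4. backpropagation: compute gradients
optimizer.step()                     # 5. gradient descent: update the parameters

# The weights have moved after one step
print(torch.equal(weights_before, model.weight.detach()))
```

A real training loop simply repeats these five lines for a number of epochs.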

### 3.1 Going from raw logits -> prediction probabilities -> prediction labels

Our model outputs are going to be raw **logits**. Logits are the raw outputs of the model, produced by the forward pass through its layers.

We can convert these **logits** into **prediction probabilities** by passing them to
some kind of **activation function** (e.g. sigmoid for binary classification and softmax for multi-class classification).

Then we can convert our model's prediction probabilities to **prediction labels**
by either rounding them (`torch.round()`, binary case) or taking the `argmax()` (multi-class case), which picks the class with the highest probability.

In [None]:
# View the first 5 outputs of the forward pass on the test data
model_0.eval()
with torch.inference_mode():
  y_logits=model_0(X_test.to(device))
y_logits[:5]

In [None]:
y_test
# y_logits needs to be converted to the same format as y_test


In [None]:
# Use the sigmoid activation function on our model logits to turn them into prediction probabilities
y_pred_probs=torch.sigmoid(y_logits)

For our prediction probability values, we need to perform a range-style rounding on them:

* `y_pred_probs` >= 0.5, `y=1` (class 1)
* `y_pred_probs` < 0.5, `y=0` (class 0)
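Here is the logits -> probabilities -> labels conversion on a few made-up logits (the numbers are illustrative, not model outputs from this notebook).

```python
import torch

logits = torch.tensor([-2.0, -0.1, 0.3, 4.0])  # hypothetical raw outputs

pred_probs = torch.sigmoid(logits)      # values strictly between 0 and 1
pred_labels = torch.round(pred_probs)   # 0. or 1.

print(pred_probs)
print(pred_labels)

# sigmoid(0) == 0.5, so thresholding the probabilities at 0.5 gives the
# same labels as checking the sign of the logits.
print(torch.equal(pred_labels, (logits > 0).float()))
```

One edge case worth knowing: `torch.round()` rounds halves to the nearest even number, so a probability of exactly 0.5 becomes 0 rather than 1. In practice this almost never matters.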

In [None]:
# Find the predicted labels
y_preds=torch.round(y_pred_probs)

# In full: raw logits (from the forward pass) -> pred probs (sigmoid, for binary classification) -> pred labels (torch.round)
y_pred_labels=torch.round(torch.sigmoid(model_0(X_test.to(device))))

# Check for equality
print(torch.eq(y_preds.squeeze(),y_pred_labels.squeeze()))

# Get rid of extra dimension
y_preds.squeeze()

### 3.2 Building a training and testing loop

In [None]:
loss_fn =nn.BCEWithLogitsLoss()
optimizer=torch.optim.SGD(params=model_0.parameters(),
                            lr=0.01)

In [None]:
torch.manual_seed(42)
torch.cuda.manual_seed(42)

# Set the number of epochs
epochs=1000
Epoch=[]
Loss_values=[]
Test_Loss_values=[]
# Put data to target device
X_train,y_train=X_train.to(device),y_train.to(device)
X_test,y_test=X_test.to(device),y_test.to(device)

# Build training and evaluation loop
for epoch in range(epochs):
  ### Training
  model_0.train()

  # 1. Forward pass
  y_logits=model_0(X_train).squeeze()
  y_pred=torch.round(torch.sigmoid(y_logits)) # turn logits -> pred probs -> pred labels

  # 2. Calculate loss/accuracy
  loss=loss_fn(y_logits,
               y_train) # BCEWithLogitsLoss expects raw logits; it is used for binary classification
  acc=accuracy_fn(y_true=y_train,
                  y_pred=y_pred)

  # 3. Optimizer zero grad
  optimizer.zero_grad()

  # 4. Loss backward (backpropagation): compute the gradients of the loss with respect to every parameter
  loss.backward()

  # 5. Optimizer step (gradient descent): update the parameters to reduce the loss
  optimizer.step()

  ### Testing
  model_0.eval()
  with torch.inference_mode():
    # 1. Forward pass
    test_logits=model_0(X_test).squeeze()
    test_pred=torch.round(torch.sigmoid(test_logits))
    # 2. Calculate loss / accuracy
    test_loss=loss_fn(test_logits,y_test)
    test_acc=accuracy_fn(y_true=y_test,
                         y_pred=test_pred)


  # Print out what's happening
  if epoch % 10==0:
    Loss_values.append(loss.detach()) # detach so the stored value does not keep the autograd graph alive
    Epoch.append(epoch)
    Test_Loss_values.append(test_loss)
    print(f"Epoch:{epoch} | Loss: {loss:.5f}, ACC:{acc:.2f}% | Test loss: {test_loss:.5f}, Test acc:{test_acc:.2f}%")

The loss barely decreases and the accuracy even drops.
It is almost as if the model has not learned anything. How can we fix this?


## 4. Make predictions and evaluate the model.

From the metrics it looks like our model isn't learning anything...

So to inspect it, let's make some predictions and make them visual!

In other words: visualize, visualize, visualize.

To do so, we're going to import a function called `plot_decision_boundary()` -
https://github.com/mrdbourke/pytorch-deep-learning/blob/main/helper_functions.py

In [None]:
import requests
from pathlib import Path

# Download helper functions from the Learn PyTorch repo (if not already downloaded)
if Path("helper_functions.py").is_file():
  print("helper_functions.py already exists, skipping download")
else:
  print("Downloading helper_functions.py")
  # requests.get() needs the URL of the *raw* file on GitHub (this one is a Python script)
  request=requests.get("https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/helper_functions.py")
  with open("helper_functions.py","wb") as f: # create helper_functions.py and open it in binary write mode
    f.write(request.content) # write the downloaded script's bytes into the file
from helper_functions import plot_predictions,plot_decision_boundary

In [None]:
# Plot decision boundary of the model

plt.figure(figsize=(12,6))
plt.subplot(1,2,1)
plt.title("train")
plot_decision_boundary(model_0,X_train,y_train)

plt.subplot(1,2,2)
plt.title("test")
plot_decision_boundary(model_0,X_test,y_test)

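The decision boundary above is a straight line. Our data is circular, so can a straight line separate the red dots from the blue dots? It is worth considering whether the poor metrics come from building the model out of linear layers only.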


In [None]:
type(Test_Loss_values[0]),type(Test_Loss_values)

# The outer container is a list, but its elements are torch tensors
Test_Loss_values=torch.stack(Test_Loss_values)
Test_Loss_values.shape

# If you see a "can't convert ... to numpy" error, check both the container type and the type of its elements

In [None]:
import matplotlib.pyplot as plt
import numpy as np
plt.plot(Epoch,np.array(torch.tensor(Loss_values).numpy()),label="train loss")
plt.plot(Epoch,Test_Loss_values.cpu(), label="test loss")
plt.title("Training and test loss curves")
plt.ylabel("Loss")
plt.xlabel("Epochs")
plt.legend()

# Overfitting

In [None]:
loss_fn_without_logits=nn.BCELoss()
loss_fn_without_logits

## 5. Improving a model (from a model perspective)

* Add more layers - give the model more chances to learn about patterns in the data
* Add more hidden units - go from 5 hidden units to 10 hidden units, which gives the model more parameters to represent the data
* Fit for longer
* Change the activation functions - activation functions can be placed inside the model
* Change the learning rate
* Change the loss function

These options are all from a model's perspective because they deal directly with the model, rather than the data.

There are also ways to improve results from a data perspective, but here we focus on improving the model.

**Because these options are all values we (as machine learning engineers and data scientists) can change, they are referred to as hyperparameters.**

Adding layers, adding hidden units, training for more epochs, adding activation functions, adjusting the learning rate and changing the loss function are all examples.

In [None]:
model_0.state_dict()

# About 20 parameters (21 in total): the weights and biases of layer 0 and layer 1
# More layers would add more weights and biases for capturing patterns in the data.


Let's try and improve our model by:
* Adding more hidden units: 5 -> 10
* Increase the number of layers: 2 -> 3
* Increase the number of epochs: 100 -> 1000

Here we change all three at once, but in practice experimenters usually change one variable at a time so they can tell which change made the difference.
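To get a feel for how much these changes grow the model, the sketch below counts trainable parameters for the original architecture (2 -> 5 -> 1) and the larger one (2 -> 10 -> 10 -> 1). `count_parameters` is a hypothetical helper written for this example, not part of PyTorch.

```python
from torch import nn

def count_parameters(model: nn.Module) -> int:
    """Total number of elements across all parameter tensors (hypothetical helper)."""
    return sum(p.numel() for p in model.parameters())

small = nn.Sequential(nn.Linear(2, 5), nn.Linear(5, 1))
bigger = nn.Sequential(nn.Linear(2, 10), nn.Linear(10, 10), nn.Linear(10, 1))

print(count_parameters(small))   # (2*5 + 5) + (5*1 + 1) = 21
print(count_parameters(bigger))  # (2*10 + 10) + (10*10 + 10) + (10*1 + 1) = 151
```

More parameters give the model more capacity, but as the results below show, capacity alone does not help if every layer is linear.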


In [None]:
class CircleModelV1(nn.Module):
  def __init__(self):
    super().__init__()
    self.layer_1=nn.Linear(in_features=2,out_features=10)
    self.layer_2=nn.Linear(in_features=10,out_features=10)
    self.layer_3=nn.Linear(in_features=10,out_features=1)

  def forward(self, x):
    # Forward option 1: step by step
    # z=self.layer_1(x)
    # z=self.layer_2(z)
    # z=self.layer_3(z)
    # return z

    # Forward option 2: a single nested call
    # This avoids reassigning intermediate variables and may be slightly faster
    return self.layer_3(self.layer_2(self.layer_1(x)))
model_1=CircleModelV1()
model_1

In [None]:
model_1.state_dict()

In [None]:
# Create a loss function
loss_fn=nn.BCEWithLogitsLoss()
# Create an optimizer
optimizer=torch.optim.SGD(params=model_1.parameters(),
                          lr=0.1)

Reminder: `accuracy_fn(y_true,y_pred)` from section 2.1 is reused here. It computes `correct=torch.eq(y_true,y_pred).sum().item()` and returns `acc=(correct/len(y_pred))*100`.

In [None]:
# Training and evaluation loop for model_1

torch.manual_seed(42)
torch.cuda.manual_seed(42)

# Train for longer
epochs=1000
model_1.to(device)
# Put data on the target device
X_train,y_train=X_train.to(device),y_train.to(device)
X_test,y_test=X_test.to(device),y_test.to(device)
for epoch in range(epochs):
  ### Training
  model_1.train()
  # 1. forward pass
  y_logits=model_1(X_train).squeeze()
  y_pred=torch.round(torch.sigmoid(y_logits)) # logits-> prediction probabilities -> labels

  # 2. Calculate the loss
  loss=loss_fn(y_logits,y_train) # pass y_logits rather than y_pred because loss_fn is BCEWithLogitsLoss()
  acc=accuracy_fn(y_true=y_train,y_pred=y_pred)

  # 3. Optimizer zero grad
  optimizer.zero_grad()

  # 4. Loss backward (backpropagation): compute the gradients of the loss with respect to every parameter
  loss.backward()

  # 5. Optimizer step (gradient descent): update the parameters to reduce the loss
  optimizer.step()

  ### Testing
  model_1.eval()
  with torch.inference_mode():
    # 1. forward pass
    test_logits=model_1(X_test).squeeze()
    test_pred=torch.round(torch.sigmoid(test_logits))
    # 2. calculate loss
    test_loss=loss_fn(test_logits, y_test)

    test_acc=accuracy_fn(y_true=y_test,y_pred=test_pred)

  # Print out what's happening
  if epoch % 100==0:
    print(f"Epoch:{epoch} | Loss: {loss:.5f}, ACC:{acc:.2f}% | Test loss: {test_loss:.5f}, Test acc:{test_acc:.2f}%")

In [None]:
# Plot the decision boundary
plt.figure(figsize=(12,6))
plt.subplot(1,2,1)
plt.title("train")
plot_decision_boundary(model_1,X_train,y_train)

plt.subplot(1,2,2)
plt.title("test")
plot_decision_boundary(model_1,X_test,y_test)

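Even with more linear layers, more hidden units and more epochs, performance has not improved.
The model still splits the data in half with a straight line, which is about as good as a coin toss.
We need non-linearity.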
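One way to see why extra linear layers do not help: stacking linear layers without an activation in between is mathematically the same as a single linear layer. The sketch below checks this numerically on random, made-up inputs.

```python
import torch
from torch import nn

torch.manual_seed(42)

layer_1 = nn.Linear(2, 5)
layer_2 = nn.Linear(5, 1)
x = torch.rand(4, 2)  # hypothetical batch of 4 samples

stacked = layer_2(layer_1(x))

# Fold both layers into one affine map: W = W2 @ W1, b = W2 @ b1 + b2
W = layer_2.weight @ layer_1.weight                 # shape (1, 2)
b = layer_2.weight @ layer_1.bias + layer_2.bias    # shape (1,)
collapsed = x @ W.T + b

print(torch.allclose(stacked, collapsed, atol=1e-6))
```

Since the whole stack collapses to `x @ W.T + b`, its decision boundary can only ever be a straight line, no matter how many linear layers or hidden units are added.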


### 5.1 Preparing data to see if our model can fit a straight line

One way to troubleshoot a larger problem is to test out a smaller problem.

Let's briefly build a linear dataset and test the linear model (v1) on it.

In [None]:
# Create some data(same as notebook 01)

weight=0.7
bias=0.3
start=0
end=1
step=0.01

# Create data
X_regression=torch.arange(start,end,step).unsqueeze(dim=1)
y_regression=weight * X_regression + bias # Linear regression formula(without epsilon)

# Check the data
print(len(X_regression))
X_regression[:5],y_regression[:5]

In [None]:
# Create train and test split

train_split =int(0.8 * len(X_regression))
X_train_regression,y_train_regression= X_regression[:train_split], y_regression[:train_split]
X_test_regression,y_test_regression=X_regression[train_split:],y_regression[train_split:]

# Check length of data
len(X_train_regression),len(y_train_regression),len(X_test_regression),len(y_test_regression)

In [None]:
type(X_train_regression[0]),type(X_train_regression),type(y_train_regression),type(y_train_regression[0]),type(X_test_regression),type(X_test_regression[0]),type(y_test_regression),type(y_test_regression[0])

Signature of the helper from `helper_functions.py`: `plot_predictions(train_data, train_labels, test_data, test_labels, predictions=None)`

In [None]:
plot_predictions(train_data=X_train_regression,
                 train_labels=y_train_regression,
                 test_data=X_test_regression,
                 test_labels=y_test_regression)

In [None]:
# Is model_1 a good fit for the linear data above?
# The first layer's in_features would have to change to 1,
# because each sample in X_train_regression has a single input value.
X_train_regression[0],model_1

### 5.2 Adjust `model_1` to fit a straight line

In [None]:
# Same architecture as model_1 (but using nn.Sequential())

model_2=nn.Sequential(
    nn.Linear(in_features=1,out_features=10),
    nn.Linear(in_features=10,out_features=10),
    nn.Linear(in_features=10,out_features=1)
).to(device)
model_2

In [None]:
# Loss and optimizer
loss_fn=nn.L1Loss()
optimizer=torch.optim.SGD(params=model_2.parameters(),lr=0.001)

In [None]:
device

In [None]:
#  Train the model
torch.manual_seed(42)
torch.cuda.manual_seed(42)

# Set the number of epochs
epochs=1000

# Put the data on the target device
X_train_regression,y_train_regression=X_train_regression.to(device),y_train_regression.to(device)
X_test_regression,y_test_regression=X_test_regression.to(device),y_test_regression.to(device)

# Training
for epoch in range(epochs):
  model_2.train() # switch back to training mode (eval mode is set during testing below)
  y_pred=model_2(X_train_regression)
  loss=loss_fn(y_pred,y_train_regression)
  optimizer.zero_grad()
  loss.backward()
  optimizer.step()

  # Testing
  model_2.eval()
  with torch.inference_mode():
    test_pred=model_2(X_test_regression)
    test_loss=loss_fn(test_pred,y_test_regression)

  # Print out what's happening
  if epoch % 100==0:
    print(f"Epoch {epoch} | Loss: {loss:.5f} | Test loss: {test_loss:.5f}")

In [None]:
type(X_train_regression),

In [None]:
#Turn on evaluation mode
model_2.eval()

with torch.inference_mode():
  y_preds=model_2(X_test_regression) # already a tensor on the target device

# plot data and predictions
# plot_predictions needs its inputs on the CPU because matplotlib works with NumPy arrays, which live in CPU memory
plot_predictions(train_data=X_train_regression.cpu(),
                 train_labels=y_train_regression.cpu(),
                 test_data=X_test_regression.cpu(),
                 test_labels=y_test_regression.cpu(),
                 predictions=y_preds.cpu())

## 6. The missing piece: Non-linearity

"What patterns could you draw if you were given an infinite amount of straight and non-straight lines?"

Or in machine learning terms, an infinite (but really it is finite) number of linear and non-linear functions?

### 6.1 Recreating non-linear data (red and blue circles)

In [None]:
# make and plot data

import matplotlib.pyplot as plt
from sklearn.datasets import make_circles

n_samples=1000
X,y=make_circles(n_samples,
                 noise=0.03,
                 random_state=42)

plt.scatter(X[:,0],X[:,1],c=y,cmap=plt.cm.RdYlBu)

In [None]:
# Device-agnostic code
import torch
from torch import nn
device="cuda" if torch.cuda.is_available() else "cpu"

In [None]:
def accuracy_fn(y_true,y_pred):
  correct=torch.eq(y_true,y_pred).sum().item()
  acc=(correct/len(y_pred))*100
  return acc

In [None]:
# Convert data to tensors and then to train and test split
import torch
from sklearn.model_selection import train_test_split

# Turn data into tensors
X=torch.from_numpy(X).type(torch.float)
y=torch.from_numpy(y).type(torch.float)

# Split into train and test sets
X_train,X_test,y_train,y_test=train_test_split(X,
                                               y,
                                               test_size=0.2,
                                               random_state=42)

X_train[:5],y_train[:5]

### 6.2 Building a model with non-linearity

* Linear=straight line
* Non linear= non-straight line

Artificial neural networks are a large combination of linear (straight) and non-linear (non-straight) functions which are potentially able to find patterns in data.


In [None]:
# Build a model with non-linear activation functions
from torch import nn
class CircleModelV2(nn.Module):
  def __init__(self):
    super().__init__()
    self.layer_1=nn.Linear(in_features=2,out_features=10)
    self.layer_2=nn.Linear(in_features=10,out_features=10)
    self.layer_3=nn.Linear(in_features=10,out_features=1)
    self.relu=nn.ReLU() # ReLU is a non-linear activation function


  def forward(self,x):
    # Where should we put our Non - linear activation functions?
    return self.layer_3(self.relu(self.layer_2(self.relu(self.layer_1(x)))))

model_3=CircleModelV2().to(device)
model_3

In [None]:
#Loss and Optimizer

loss_fn=nn.BCEWithLogitsLoss()
optimizer=torch.optim.SGD(params=model_3.parameters(),
                          lr=0.1)

In [None]:
type(X_train),type(X_test),type(y_train),type(y_test)

X_train=torch.as_tensor(X_train,dtype=torch.float).to(device)
X_test=torch.as_tensor(X_test,dtype=torch.float).to(device)
y_train=torch.as_tensor(y_train,dtype=torch.float).to(device)
y_test=torch.as_tensor(y_test,dtype=torch.float).to(device)
type(X_train),type(X_test),type(y_train),type(y_test)

### 6.3 Training a model with non-linearity

In [None]:
# Random seeds
torch.manual_seed(42)
torch.cuda.manual_seed(42)

# set the number of epoch
epochs=2000

# Put all data on target device
X_train,y_train=X_train.to(device),y_train.to(device)
X_test,y_test=(X_test).to(device),y_test.to(device)
Epoch=[]
Loss_values=[]
Test_Loss_values=[]
print(X_train)
# Training

for epoch in range(epochs):
  ### Training
  model_3.train()
  # 1. Forward pass
  y_logits=model_3(X_train).squeeze()
  y_pred=torch.round(torch.sigmoid(y_logits))

  # 2. Calculate the loss / acc
  loss=loss_fn(y_logits,y_train) # BCEWithLogitsLoss takes raw logits as input
  acc= accuracy_fn(y_true=y_train,
                y_pred=y_pred)

  # 3. Optimizer zero grad
  optimizer.zero_grad()

  # 4. Loss backward
  loss.backward()

  # 5. Optimizer step()
  optimizer.step()

  ### Testing
  model_3.eval()
  with torch.inference_mode():
    test_logits=model_3(X_test).squeeze()
    test_pred=torch.round(torch.sigmoid(test_logits))

    test_loss=loss_fn(test_logits,y_test)
    test_acc=accuracy_fn(y_true=y_test,
                         y_pred=test_pred)

  # Print out what's happening
  if epoch % 100==0:
    Loss_values.append(loss.detach()) # detach so the stored value does not keep the autograd graph alive
    Epoch.append(epoch)
    Test_Loss_values.append(test_loss)
    print(f"Epoch {epoch} | Loss: {loss:.5f}, Acc: {acc:.2f}% | Test loss: {test_loss:.5f}, Test acc: {test_acc:.2f}%")

In [None]:
model_3.state_dict()
# Activation functions have no parameters, so they do not appear in the state_dict

In [None]:
print(type(X_train))

In [None]:
y_logits.shape,y_train.shape

### 6.4 Evaluating a model trained with non-linear activation functions

In [None]:
# Make predictions
model_3.eval()
with torch.inference_mode():
  y_preds=torch.round(torch.sigmoid(model_3(X_test))).squeeze()
y_preds[:10],y_test[:10]

In [None]:
import matplotlib.pyplot as plt
import numpy as np
plt.plot(Epoch,torch.tensor(Loss_values).numpy(),label="train loss")
plt.plot(Epoch,torch.tensor(Test_Loss_values).numpy(), label="test loss") # convert the list of tensors to a CPU NumPy array before plotting
plt.title("Training and test loss curves")
plt.ylabel("Loss")
plt.xlabel("Epochs")
plt.legend()

In [None]:
# Plot decision boundaries
plt.figure(figsize=(12,6))
plt.subplot(1,3,1)
plt.title("Train")
plot_decision_boundary(model_3,X_train,y_train)
plt.subplot(1,3,2)
plt.title("Test")
plot_decision_boundary(model_3,X_test,y_test) # model_3 is non-linear
plt.subplot(1,3,3)
plt.title("Model_1_train")
plot_decision_boundary(model_1,X_train,y_train) # model_1 is linear

## 7. Replicating non-linear activation functions

With neural networks, rather than us telling the model what to learn, we give it the tools to discover patterns in data and it tries to figure out the patterns on its own.

And these tools are linear & non-linear functions.


In [None]:
# Create a tensor

A=torch.arange(-10,10,1,dtype=torch.float32)

In [None]:
# Visualize the tensor
plt.plot(A)

In [None]:
plt.plot(torch.relu(A))

In [None]:
def relu(x: torch.Tensor) -> torch.Tensor:
  return torch.maximum(torch.tensor(0),x) # both arguments must be tensors

relu(A)

In [None]:
# Plot ReLU activation function
plt.plot(relu(A));

In [None]:
# Now let's do the same for sigmoid
def sigmoid(x):
  return 1/(1+torch.exp(-x))

In [None]:
plt.plot(torch.sigmoid(A))

In [None]:
plt.plot(torch.sigmoid(torch.relu(A)))

In [None]:
plt.plot(torch.relu(torch.sigmoid(A)))

So far we have covered binary classification. Next up is multi-class classification.


## 8. Putting it all together with a multi-class classification problem

* Binary classification = one thing or another(cat vs dog, spam vs not spam, fraud or not fraud)

* Multi-class classification = more than one thing or another (cat vs dog vs chicken)

We'll use the softmax activation function for multi-class classification, and `nn.CrossEntropyLoss` rather than `nn.BCELoss`.
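A short sketch of how softmax and cross entropy fit together: `nn.CrossEntropyLoss` takes raw logits plus integer class labels, and its value matches the average negative log of the softmax probability assigned to the true class. The logits and labels here are made up for illustration.

```python
import torch
from torch import nn

logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])   # hypothetical outputs: 2 samples, 3 classes
labels = torch.tensor([0, 2])               # integer class indices (dtype torch.long)

loss = nn.CrossEntropyLoss()(logits, labels)

# Manual version: negative log-softmax probability of the true class, averaged
log_probs = torch.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(len(labels)), labels].mean()

print(loss.item(), manual.item())
```

Because the softmax is applied inside the loss, the model should output raw logits during training; softmax is only needed when you want to read the outputs as probabilities.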

### 8.1 Creating a toy multi-class dataset


In [None]:
# Import dependencies
import torch
from torch import nn
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs  # https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_blobs.html
from sklearn.model_selection import train_test_split

# centers=10 would create 10 classes
# For binary classification, use centers=2

# Set the hyperparameters for data creation
NUM_CLASSES=4
NUM_FEATURES=2
RANDOM_SEED=42

# 1. Create multi-class data

X_blob,y_blob=make_blobs(n_samples=1000,
                         n_features=NUM_FEATURES,
                         centers=NUM_CLASSES,
                         cluster_std=2, # add some spread to the clusters to make the task a bit harder for the model
                         random_state=RANDOM_SEED
                         )

# 2. Turn data into Tensors
type(X_blob),type(y_blob) # check the type of the generated data

X_blob=torch.from_numpy(X_blob).type(torch.float) # features as float32, PyTorch's default dtype
y_blob=torch.from_numpy(y_blob).type(torch.LongTensor) # labels as int64 class indices, as nn.CrossEntropyLoss expects
type(X_blob),X_blob.dtype

# 3. Split into train and test
X_blob_train,X_blob_test,y_blob_train,y_blob_test=train_test_split(X_blob,
                                                                   y_blob,
                                                                   test_size=0.2,
                                                                   random_state=RANDOM_SEED)
# 4. Plot data (visualize,visualize,visualize)
plt.figure(figsize=(10,7))
plt.scatter(X_blob[:,0],X_blob[:,1],c=y_blob,cmap=plt.cm.RdYlBu)

### 8.2 Building a multi-class classification model in PyTorch

In [None]:
# Create device agnostic code
device= "cuda" if torch.cuda.is_available() else "cpu"
device

In [None]:
# Build a multi-class classification model
class BlobModel(nn.Module):
  def __init__(self,input_features,output_features,hidden_units=8):
    """Initializes a multi-class classification model.

    Args:
      input_features (int): Number of input features to the model
      output_features (int): Number of output features (number of output classes)
      hidden_units (int): Number of hidden units between layers, default 8
    """
    super().__init__()
    self.linear_layer_stack=nn.Sequential(
        nn.Linear(in_features=input_features,out_features=hidden_units),
        #nn.ReLU(),
        nn.Linear(in_features=hidden_units,out_features=hidden_units),
        #nn.ReLU(),
        nn.Linear(in_features=hidden_units,out_features=output_features)
    )
  def forward(self,x):
    return self.linear_layer_stack(x)

# Create an instance of BlobModel and send it to the target device
model_4 = BlobModel(input_features=2,
                    output_features=4,
                    hidden_units=8).to(device)
model_4

In [None]:
X_blob_train.shape,y_blob_train.shape
len(y_blob_train),torch.unique(y_blob_train) # torch.unique returns the distinct values, sorted in ascending order

### 8.3 Create a Loss function and an Optimizer for multi-class classification model

In [None]:
# Create a loss function for multi-class classification - loss function measures how wrong our model's predictions are
loss_fn=nn.CrossEntropyLoss()

# Create an optimizer for multi-class classification - Optimizer updates our model parameters to try and reduce the loss
optimizer=torch.optim.SGD(params=model_4.parameters(),
                          lr=0.1) # learning rate is a hyperparameter you can change
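
To see what "how wrong" means here, the sketch below computes cross-entropy for a single sample by hand in plain Python: the loss is the negative log of the softmax probability assigned to the true class. The logits are made-up values, and `cross_entropy` is a helper written for this illustration, not a PyTorch function. `nn.CrossEntropyLoss` applies the same formula to raw logits and, by default, averages over the batch.

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy for one sample: -log(softmax(logits)[target])."""
    # log-sum-exp with the max subtracted for numerical stability
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

# Hypothetical raw outputs for one sample and 4 classes
logits = [2.0, 0.5, -1.0, 0.1]

loss_correct = cross_entropy(logits, target=0)  # true class has the largest logit
loss_wrong = cross_entropy(logits, target=2)    # true class has the smallest logit

print(f"{loss_correct:.4f} {loss_wrong:.4f}")
```

The loss is small when the largest logit belongs to the true class and grows as the true class's logit falls behind the others.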

### 8.4 Getting prediction probabilities for a multi-class PyTorch model

To train, evaluate and test our model, we need to convert its outputs (logits) to prediction probabilities and then to prediction labels.

Logits (raw output of the model, `model_4(X_blob_train)`) -> pred_probs (use `torch.softmax(y_logits,dim=1)`) -> pred_labels (use `torch.argmax(y_pred_probs,dim=1)`)
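
The same pipeline can be traced by hand on a tiny batch. This is a plain-Python sketch with made-up logits for two samples and four classes; `softmax_row` and `argmax` are helpers written for this illustration that mimic what `torch.softmax(..., dim=1)` and `torch.argmax(..., dim=1)` do to each row.

```python
import math

def softmax_row(row):
    """Turn one row of logits into probabilities that sum to 1."""
    exps = [math.exp(z) for z in row]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(row):
    """Index of the largest value in the row."""
    return max(range(len(row)), key=lambda i: row[i])

# Hypothetical logits: 2 samples, 4 classes
logits = [[1.0, 3.0, 0.5, -2.0],
          [0.2, -1.0, 4.0, 0.0]]

pred_probs = [softmax_row(row) for row in logits]   # logits -> probabilities
pred_labels = [argmax(row) for row in pred_probs]   # probabilities -> labels

print(pred_labels)
```

Softmax keeps the ordering of the values in each row, so taking the argmax of the logits directly gives the same labels; the probabilities are mainly useful when you want to know how confident the model is.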

**Note:** To check which device your model and your data are currently on, look at the `.device` of one of the model's parameters and of the tensor:


In [None]:
next(model_4.parameters()).device, X_blob_test.device

In [None]:
# Let's get some raw outputs of our model called logits
model_4.eval()
with torch.inference_mode():
  y_logits=model_4(X_blob_test.to(device))

y_logits[:10]

In [None]:
y_blob_test[:10]

In [None]:
# Convert our model's logit outputs to prediction probabilities
y_pred_probs=torch.softmax(y_logits,dim=1) # dim=1 normalizes across the class dimension, so each row (one sample's 4 class scores) sums to 1
print(y_logits[:5])
print(y_pred_probs[:5])

In [None]:
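def softmax(x):
  # Softmax turns each sample's logits into prediction probabilities that sum to 1
  # Sum over the last (class) dimension so every sample is normalized separately
  return torch.exp(x)/torch.sum(torch.exp(x),dim=-1,keepdim=True)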
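
One practical detail of a hand-written softmax: exponentiating large logits overflows. A common fix is to subtract the row's maximum before calling `exp`, which leaves the result unchanged because softmax only depends on differences between logits. The plain-Python sketch below shows this; `stable_softmax` is a helper name chosen for this illustration.

```python
import math

def stable_softmax(row):
    """Softmax with the row maximum subtracted before exponentiating."""
    m = max(row)
    exps = [math.exp(z - m) for z in row]  # largest exponent is exp(0) = 1
    total = sum(exps)
    return [e / total for e in exps]

# math.exp(1000.0) would raise OverflowError, but the shifted version is fine
big = [1000.0, 1001.0, 1002.0]
small = [0.0, 1.0, 2.0]

probs_big = stable_softmax(big)
probs_small = stable_softmax(small)

print(probs_big == probs_small)
```

Both inputs give the same probabilities because they differ only by a constant shift.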

In [None]:
torch.argmax(y_pred_probs[0])
# Returns the index of the class with the highest probability (e.g. index 2)

In [None]:
# Convert our model's prediction probabilities to prediction labels
y_preds=torch.argmax(y_pred_probs,dim=1)
y_preds

In [None]:
y_preds.shape,y_blob_test.shape

In [None]:
X_blob_train,y_blob_train=X_blob_train.to(device),y_blob_train.to(device)
X_blob_test,y_blob_test=X_blob_test.to(device),y_blob_test.to(device)

In [None]:
y_logits=model_4(X_blob_train)

In [None]:
y_logits[0]

### 8.5 Creating a training loop and testing loop for a multi-class PyTorch model

In [None]:
def accuracy_fn(y_true,y_pred):
  correct=torch.eq(y_true,y_pred).sum().item()
  acc=(correct/len(y_pred))*100
  return acc

In [None]:
y_blob_train[:10],y_logits

In [None]:
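# nn.CrossEntropyLoss expects integer (long) class indices as targets; .long() keeps the tensor on its current device
y_blob_train=y_blob_train.long()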
y_logits.dtype,y_blob_train.dtype

In [None]:
# Fit the multi-class model to the data
torch.manual_seed(42)
torch.cuda.manual_seed(42)

# Set the number of epochs
epochs=100

# Put data to the target device
X_blob_train,y_blob_train=X_blob_train.to(device),y_blob_train.to(device)
X_blob_test,y_blob_test=X_blob_test.to(device),y_blob_test.to(device)

# Loop through data
for epoch in range(epochs):
  ## Training
  model_4.train()
  y_logits=model_4(X_blob_train)
  y_pred=torch.softmax(y_logits,dim=1).argmax(dim=1)

  # Targets must be integer (long) class indices. .long() keeps them on the same device
  # as the logits, whereas .type(torch.LongTensor) would move them to the CPU.
  loss=loss_fn(y_logits,y_blob_train.long())
  acc=accuracy_fn(y_true=y_blob_train,
                  y_pred=y_pred)
  optimizer.zero_grad()
  loss.backward()
  optimizer.step()

  ## Testing
  model_4.eval()
  with torch.inference_mode():
    test_logits=model_4(X_blob_test)
    test_preds=torch.softmax(test_logits,dim=1).argmax(dim=1)

    test_loss=loss_fn(test_logits,y_blob_test.long())  # cross-entropy expects integer (long) class labels
    test_acc=accuracy_fn(y_true=y_blob_test,
                         y_pred=test_preds)


  # Print out what's happening
  if epoch % 10 ==0:
    print(f"Epoch: {epoch} | Loss: {loss:.4f}, Acc: {acc:.2f}% | Test loss: {test_loss:.4f}, Test acc: {test_acc:.2f}%")
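
Each pass through the training part of the loop does the same thing: compute the loss, get its gradients, and move the parameters a small step (`lr`) against them. The plain-Python sketch below shows that idea in a deliberately simplified setting: it updates the logits of one made-up sample directly instead of model weights, using the fact that the gradient of cross-entropy with respect to the logits is `softmax(logits) - one_hot(target)`. All names and values are hypothetical.

```python
import math

def softmax(row):
    m = max(row)
    exps = [math.exp(z - m) for z in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    return -math.log(softmax(logits)[target])

lr = 0.1                          # same learning rate as the optimizer above
target = 2                        # hypothetical true class
logits = [0.5, 1.5, 0.0, -0.5]    # hypothetical starting logits (class 1 is winning)

losses = []
for step in range(5):
    losses.append(cross_entropy(logits, target))
    probs = softmax(logits)
    # Gradient of the loss w.r.t. each logit: probability minus the one-hot target
    grad = [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]
    # Gradient descent step: move against the gradient
    logits = [z - lr * g for z, g in zip(logits, grad)]

print([round(l, 4) for l in losses])
```

The recorded losses shrink from step to step, which is the same trend the printed training loss should show when training goes well.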

In [None]:
test_logits.device,y_blob_test.device

### 8.6 Making and evaluating predictions with a PyTorch multi-class model

In [None]:
# Make predictions
model_4.eval()
with torch.inference_mode():
  y_logits=model_4(X_blob_test)

# View the first 10 predictions
y_logits[:10]

In [None]:
# Go from logits -> prediction probabilities
y_pred_probs=torch.softmax(y_logits,dim=1)
y_pred_probs[:10]

In [None]:
# Go from pred probs to pred labels
y_preds=torch.argmax(y_pred_probs,dim=1)
y_preds[:10]

In [None]:
plt.figure(figsize=(12,6))
plt.subplot(1,2,1)
plt.title("train")
plot_decision_boundary(model_4,X_blob_train,y_blob_train)
plt.subplot(1,2,2)
plt.title("test")
plot_decision_boundary(model_4,X_blob_test,y_blob_test)

## 9. A few more classification metrics (to evaluate our classification model)

* Accuracy - out of 100 samples, how many does our model get right?
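* Precision - of the samples predicted as a class, what fraction actually belong to it?
* Recall - of the samples that actually belong to a class, what fraction does our model find?
* F1-score - the harmonic mean of precision and recall
* Confusion matrix - a table of true labels against predicted labels
* Classification report - a per-class summary of precision, recall and F1-score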

https://torchmetrics.readthedocs.io/en/latest/pages/quickstart.html
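
To make the metrics above concrete, here is a plain-Python sketch that computes them from made-up labels for one "positive" class. `binary_metrics` is a helper written for this illustration; for a multi-class problem like the blobs, the same counts can be taken per class by treating that class as positive and all others as negative.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    tn = sum(t != positive and p != positive for t, p in pairs)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"confusion": [[tn, fp], [fn, tp]],  # rows: true 0/1, cols: predicted 0/1
            "accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical labels: 4 positives and 6 negatives
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Accuracy alone can look good on imbalanced data, which is why precision and recall are usually reported alongside it.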

In [None]:
!pip install torchmetrics

In [None]:
from torchmetrics import Accuracy

# Setup metric (the metric also needs device-agnostic code: it must be on the same device as the data)
torchmetric_accuracy=Accuracy(task="multiclass",num_classes=4).to(device)

# Calculate accuracy