# CH8. Performance Optimization

- Practice task: Sentence Positive/Negative Classification using CNN (classify each sentence as positive or negative with a CNN)
- Practice steps
    - [Reference](https://github.com/graykode/nlp-tutorial/blob/master/2-1.TextCNN/TextCNN.py)

## 0. Required Libraries & Modules

In [1]:
import numpy as np
import torch

from models import text_cnn
from train import train_model
from test import test_model
from train import train_model_with_early_stop

## 1. Data Preprocessing Based on CH9

- Load the data (a small corpus)

In [2]:
corpus = [
    "i love you",
    "he loves me",
    "she likes baseball",
    "i hate you",
    "sorry for that",
    "this is awful",
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = positive, 0 = negative

- Tokenization
    - Split the sentences into words

In [3]:
words = " ".join(corpus).split()
words = list(set(words))
words

['hate',
 'he',
 'baseball',
 'love',
 'me',
 'you',
 'loves',
 'for',
 'this',
 'is',
 'awful',
 'i',
 'likes',
 'that',
 'sorry',
 'she']
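
The order of `list(set(words))` is not guaranteed to be the same between interpreter sessions, so the integer ids assigned below can differ from run to run. A minimal sketch of a reproducible alternative, assuming alphabetical order is acceptable (`sorted_words` and `sorted_word_dict` are hypothetical names, not used elsewhere in this notebook):

```python
corpus = [
    "i love you",
    "he loves me",
    "she likes baseball",
    "i hate you",
    "sorry for that",
    "this is awful",
]

# Sorting the unique tokens yields the same vocabulary order on every run.
sorted_words = sorted(set(" ".join(corpus).split()))
sorted_word_dict = {w: i for i, w in enumerate(sorted_words)}

print(sorted_words[:4])       # ['awful', 'baseball', 'for', 'hate']
print(len(sorted_word_dict))  # 16
```

The outputs shown in the rest of this notebook still come from the unsorted `words` list, so their ids will not match this sketch.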

- Build a dictionary that maps each token to an integer
    - This converts the word data into vectors of integers that the model can process

In [5]:
word_dict = {w: i for i, w in enumerate(words)}
word_dict

{'hate': 0,
 'he': 1,
 'baseball': 2,
 'love': 3,
 'me': 4,
 'you': 5,
 'loves': 6,
 'for': 7,
 'this': 8,
 'is': 9,
 'awful': 10,
 'i': 11,
 'likes': 12,
 'that': 13,
 'sorry': 14,
 'she': 15}

- Convert the input data to tensors
    - sentence -> vector

In [8]:
# Map every word to its integer id. All sentences here have exactly 3 words,
# so the nested list is rectangular and converts directly to a tensor.
sentence_ids = [[word_dict[n] for n in sentence.split()] for sentence in corpus]
inputs = torch.tensor(sentence_ids, dtype=torch.long)
targets = torch.tensor(labels, dtype=torch.long)
print(f"input data shape: {inputs.size()}")
print(f"target data shape: {targets.size()}")

input data shape: torch.Size([6, 3])
target data shape: torch.Size([6])


In [10]:
print(f"Before Mapping: {corpus[0]}")
print(f"After Mapping: {inputs[0]}")

Before Mapping: i love you
After Mapping: tensor([11,  3,  5])


## 2. Training with Performance Optimization Techniques
- Batch Normalization
- Drop Out
- Early Stopping

- train & test vanilla model

In [11]:
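num_filters = 3           # output channels per convolution
filter_sizes = [2, 2, 2]  # kernel height (in tokens) of each convolution
vocab_size = len(word_dict)
embedding_size = 2        # dimension of each word embedding
sequence_length = 3       # tokens per sentence
num_classes = 2           # negative (0) / positive (1)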

model = text_cnn.TextCNN(
    num_filters, filter_sizes, vocab_size,
    embedding_size, sequence_length, num_classes
)
model 

TextCNN(
  (W): Embedding(16, 2)
  (Weight): Linear(in_features=9, out_features=2, bias=False)
  (filter_list): ModuleList(
    (0-2): 3 x Conv2d(1, 3, kernel_size=(2, 2), stride=(1, 1))
  )
)
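
The printed structure shows `in_features=9` for the final linear layer. A minimal sketch of where that number may come from, assuming the model follows the referenced TextCNN design (the real `models/text_cnn.py` is not shown, so `TextCNNSketch` is a hypothetical reconstruction): each of the 3 convolutions produces 3 filter outputs, max-pooling reduces each to one value, and concatenation gives 3 x 3 = 9 features.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TextCNNSketch(nn.Module):
    """Hypothetical reconstruction; the real models/text_cnn.py is not shown."""

    def __init__(self, num_filters, filter_sizes, vocab_size,
                 embedding_size, sequence_length, num_classes):
        super().__init__()
        self.sequence_length = sequence_length
        self.filter_sizes = filter_sizes
        self.W = nn.Embedding(vocab_size, embedding_size)
        # One Conv2d per filter size; each kernel spans the full embedding width.
        self.filter_list = nn.ModuleList(
            [nn.Conv2d(1, num_filters, (size, embedding_size)) for size in filter_sizes]
        )
        # Each branch contributes num_filters features after max-pooling.
        self.Weight = nn.Linear(num_filters * len(filter_sizes), num_classes, bias=False)

    def forward(self, x):
        embedded = self.W(x).unsqueeze(1)  # [batch, 1, seq_len, emb]
        pooled = []
        for size, conv in zip(self.filter_sizes, self.filter_list):
            h = F.relu(conv(embedded))  # [batch, filters, seq_len - size + 1, 1]
            # Max over all remaining positions leaves one value per filter.
            h = F.max_pool2d(h, (self.sequence_length - size + 1, 1))
            pooled.append(h.flatten(1))  # [batch, filters]
        features = torch.cat(pooled, dim=1)  # [batch, filters * len(filter_sizes)]
        return self.Weight(features)


sketch = TextCNNSketch(3, [2, 2, 2], 16, 2, 3, 2)
logits = sketch(torch.randint(0, 16, (6, 3)))
print(logits.shape)  # torch.Size([6, 2])
```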

In [12]:

train_model(model, inputs, targets, 100)


Epoch: 0001 cost = 0.826689
Epoch: 0002 cost = 0.822922
Epoch: 0003 cost = 0.819225
Epoch: 0004 cost = 0.815597
Epoch: 0005 cost = 0.812040
Epoch: 0006 cost = 0.808555
Epoch: 0007 cost = 0.805143
Epoch: 0008 cost = 0.801804
Epoch: 0009 cost = 0.798730
Epoch: 0010 cost = 0.795727
Epoch: 0011 cost = 0.792770
Epoch: 0012 cost = 0.789866
Epoch: 0013 cost = 0.787021
Epoch: 0014 cost = 0.784236
Epoch: 0015 cost = 0.781511
Epoch: 0016 cost = 0.778848
Epoch: 0017 cost = 0.776245
Epoch: 0018 cost = 0.773702
Epoch: 0019 cost = 0.771216
Epoch: 0020 cost = 0.768786
Epoch: 0021 cost = 0.766410
Epoch: 0022 cost = 0.764085
Epoch: 0023 cost = 0.761810
Epoch: 0024 cost = 0.759582
Epoch: 0025 cost = 0.757399
Epoch: 0026 cost = 0.755259
Epoch: 0027 cost = 0.753160
Epoch: 0028 cost = 0.751117
Epoch: 0029 cost = 0.749130
Epoch: 0030 cost = 0.747160
Epoch: 0031 cost = 0.745208
Epoch: 0032 cost = 0.743294
Epoch: 0033 cost = 0.741432
Epoch: 0034 cost = 0.739599
Epoch: 0035 cost = 0.737793
Epoch: 0036 cost = 0

In [17]:
test_text = 'he loves you'
tests = [np.asarray([word_dict[n] for n in test_text.split()])]
test_input = torch.LongTensor(tests)
prediction = test_model(model,test_input)

if prediction == 0:
    print(test_text,"is Bad Mean...")
else:
    print(test_text,"is Good Mean!!")

he loves you is Good Mean!!


### 2.1 What is Batch Normalization?
- `normalization`
    - Normalization: restricting the range of the data to a range chosen by the user
        - Also called feature scaling
    - How
        - [nn.BatchNorm2d](https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html)
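
`nn.BatchNorm2d` normalizes each channel with the mean and variance of the current mini-batch and then applies a learnable scale and shift. A minimal sketch on made-up feature maps (the tensor shape and values are assumptions for illustration, not taken from the model above):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Fake feature maps: [batch, channels, height, width], deliberately off-center.
x = torch.randn(6, 3, 2, 1) * 5.0 + 10.0

bn = nn.BatchNorm2d(3)  # one (mean, variance) pair per channel
bn.train()              # training mode: normalize with the current batch statistics
y = bn(x)

channel_mean = y.mean(dim=(0, 2, 3))
channel_var = y.var(dim=(0, 2, 3), unbiased=False)
print(channel_mean)  # close to 0 for every channel
print(channel_var)   # close to 1 for every channel
```

In evaluation mode (`bn.eval()`), the layer uses the running statistics collected during training instead of the statistics of the current batch.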

- train vanilla model + batch normalization

In [18]:
model_batchnormalized = text_cnn.TextCNN(
    num_filters, filter_sizes, vocab_size,
    embedding_size, sequence_length, num_classes, is_batch_normalize=True
)
model_batchnormalized

TextCNN(
  (W): Embedding(16, 2)
  (Weight): Linear(in_features=9, out_features=2, bias=False)
  (filter_list): ModuleList(
    (0-2): 3 x Conv2d(1, 3, kernel_size=(2, 2), stride=(1, 1))
  )
  (batch_norm_list): ModuleList(
    (0-2): 3 x BatchNorm2d(3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
)

In [19]:
train_model(model_batchnormalized, inputs, targets, 100)

Epoch: 0001 cost = 0.620625
Epoch: 0002 cost = 0.613888
Epoch: 0003 cost = 0.607239
Epoch: 0004 cost = 0.600677
Epoch: 0005 cost = 0.594204
Epoch: 0006 cost = 0.587821
Epoch: 0007 cost = 0.581639
Epoch: 0008 cost = 0.575679
Epoch: 0009 cost = 0.569792
Epoch: 0010 cost = 0.563975
Epoch: 0011 cost = 0.558229
Epoch: 0012 cost = 0.552551
Epoch: 0013 cost = 0.546994
Epoch: 0014 cost = 0.541419
Epoch: 0015 cost = 0.535926
Epoch: 0016 cost = 0.530508
Epoch: 0017 cost = 0.525145
Epoch: 0018 cost = 0.519833
Epoch: 0019 cost = 0.514572
Epoch: 0020 cost = 0.509361
Epoch: 0021 cost = 0.504197
Epoch: 0022 cost = 0.499081
Epoch: 0023 cost = 0.494011
Epoch: 0024 cost = 0.488986
Epoch: 0025 cost = 0.484007
Epoch: 0026 cost = 0.479072
Epoch: 0027 cost = 0.474181
Epoch: 0028 cost = 0.469333
Epoch: 0029 cost = 0.464530
Epoch: 0030 cost = 0.459768
Epoch: 0031 cost = 0.455047
Epoch: 0032 cost = 0.450363
Epoch: 0033 cost = 0.445718
Epoch: 0034 cost = 0.441112
Epoch: 0035 cost = 0.436544
Epoch: 0036 cost = 0

### 2.2 What is Drop Out?
- drop out
    - Dropout: during training, only a fraction of the neurons is used, and the weights attached to the remaining neurons are not updated
        - The set of unused neurons changes at every step.
    - How
        - [nn.Dropout](https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html)
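
A minimal sketch of how `nn.Dropout` behaves in training and evaluation mode, using an all-ones tensor chosen only for illustration. Because a different random subset is zeroed on every forward pass, the training cost of a dropout model can fluctuate from epoch to epoch.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)
x = torch.ones(1, 1000)

drop.train()  # training mode: randomly zero elements
y_train = drop(x)
# Surviving elements are scaled by 1 / (1 - p) = 2 so the expected sum is unchanged.
print(y_train.unique())               # values are 0 and 2
print((y_train == 0).float().mean())  # roughly 0.5

drop.eval()  # evaluation mode: dropout is a no-op
y_eval = drop(x)
print(torch.equal(y_eval, x))         # True
```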

- train & test vanilla model + dropout

In [20]:
model_dropout = text_cnn.TextCNN(
    num_filters, filter_sizes, vocab_size,
    embedding_size, sequence_length, num_classes, dropout_prob=0.5
)
model_dropout

TextCNN(
  (W): Embedding(16, 2)
  (Weight): Linear(in_features=9, out_features=2, bias=False)
  (filter_list): ModuleList(
    (0-2): 3 x Conv2d(1, 3, kernel_size=(2, 2), stride=(1, 1))
  )
  (dropout): Dropout(p=0.5, inplace=False)
)

In [21]:
train_model(model_dropout, inputs, targets, 100)

Epoch: 0001 cost = 0.717489
Epoch: 0002 cost = 0.658339
Epoch: 0003 cost = 0.755907
Epoch: 0004 cost = 0.526005
Epoch: 0005 cost = 0.750599
Epoch: 0006 cost = 0.693036
Epoch: 0007 cost = 0.832940
Epoch: 0008 cost = 0.757068
Epoch: 0009 cost = 0.563177
Epoch: 0010 cost = 0.616058
Epoch: 0011 cost = 0.610943
Epoch: 0012 cost = 0.726284
Epoch: 0013 cost = 0.643613
Epoch: 0014 cost = 0.847523
Epoch: 0015 cost = 0.652903
Epoch: 0016 cost = 0.851932
Epoch: 0017 cost = 0.639031
Epoch: 0018 cost = 0.809204
Epoch: 0019 cost = 0.722253
Epoch: 0020 cost = 0.753233
Epoch: 0021 cost = 0.613244
Epoch: 0022 cost = 0.679789
Epoch: 0023 cost = 0.674083
Epoch: 0024 cost = 0.766926
Epoch: 0025 cost = 0.780195
Epoch: 0026 cost = 0.633055
Epoch: 0027 cost = 0.728218
Epoch: 0028 cost = 0.564681
Epoch: 0029 cost = 0.590229
Epoch: 0030 cost = 0.659879
Epoch: 0031 cost = 0.463122
Epoch: 0032 cost = 0.698427
Epoch: 0033 cost = 0.853713
Epoch: 0034 cost = 0.746249
Epoch: 0035 cost = 0.725091
Epoch: 0036 cost = 0

### 2.3 What is Early Stopping?
- early stopping
    - Early stopping: stop training at the point where the error on the validation dataset starts to increase
    - How
        - [Reference code](https://teddylee777.github.io/pytorch/early-stopping/)

- train the dropout model (`model_dropout`) with early stop

In [22]:
train_model_with_early_stop(model_dropout, inputs, targets, 100)

Epoch: 0001 cost = 0.535123
Epoch: 0002 cost = 0.627853
Epoch: 0003 cost = 0.601587
Epoch: 0004 cost = 0.461544
Epoch: 0005 cost = 0.572803
Epoch: 0006 cost = 0.635137
Epoch: 0007 cost = 0.571512
Epoch: 0008 cost = 0.715966
Epoch: 0009 cost = 0.598509
Early stopping at epoch 9 due to lack of improvement.
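
The implementation of `train_model_with_early_stop` is not shown here. A minimal sketch of a common patience-based rule, with the hypothetical helper `early_stop_epoch` and an assumed `patience=5`: training stops once the monitored cost has not improved for `patience` consecutive epochs. Applied to the costs logged above, this rule stops at epoch 9, which is consistent with the output, but the actual criterion may differ.

```python
def early_stop_epoch(costs, patience):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, cost in enumerate(costs, start=1):
        if cost < best:
            # New best cost: remember it and reset the counter.
            best = cost
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Costs copied from the log above.
logged_costs = [0.535123, 0.627853, 0.601587, 0.461544, 0.572803,
                0.635137, 0.571512, 0.715966, 0.598509]
print(early_stop_epoch(logged_costs, patience=5))  # 9
```

In practice the monitored value is usually the loss on a held-out validation set, as the definition in this section states, rather than the training cost.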
