#Naver Movie Sentiment Analysis

Reference : https://github.com/bentrevett/pytorch-sentiment-analysis

이번 장에서는 파이토치를 이용해 감성을 감지하는 머신러닝 모델을 만들것이다. (다시말해 문장이 긍정적인지 부정적인지) 네이버 영화 감성 코퍼스를 사용해서 한국 영화 리뷰를 분석할 것이다.

이 노트북에서 먼저 일반적인 컨셉 이해를 위해 좋은 결과에 신경쓰지 않고 아주 간단한 모델로 시작할것이다. 

##Introduction

시퀀스 분석에 주로 사용되는 **recurrent neural network**(RNN)을 사용할것이다. RNN은 단어의 순서를 따른다. 한번에 하나씩 히든 스테이트 $h$ 를 만든다. RNN을 반복적으로 사용할것이다. 



##Preparing Data

In [1]:
!pip3 install konlpy

Collecting konlpy
[?25l  Downloading https://files.pythonhosted.org/packages/85/0e/f385566fec837c0b83f216b2da65db9997b35dd675e107752005b7d392b1/konlpy-0.5.2-py2.py3-none-any.whl (19.4MB)
[K     |████████████████████████████████| 19.4MB 37.6MB/s 
[?25hCollecting beautifulsoup4==4.6.0
[?25l  Downloading https://files.pythonhosted.org/packages/9e/d4/10f46e5cfac773e22707237bfcd51bbffeaf0a576b0a847ec7ab15bd7ace/beautifulsoup4-4.6.0-py3-none-any.whl (86kB)
[K     |████████████████████████████████| 92kB 7.4MB/s 
[?25hCollecting tweepy>=3.7.0
  Downloading https://files.pythonhosted.org/packages/36/1b/2bd38043d22ade352fc3d3902cf30ce0e2f4bf285be3b304a2782a767aec/tweepy-3.8.0-py2.py3-none-any.whl
Collecting colorama
  Downloading https://files.pythonhosted.org/packages/c9/dc/45cdef1b4d119eb96316b3117e6d5708a08029992b2fee2c143c7a0a5cc5/colorama-0.4.3-py2.py3-none-any.whl
Collecting JPype1>=0.7.0
[?25l  Downloading https://files.pythonhosted.org/packages/4e/c9/cde4aae2f4ae7da6b46258d7233511

In [0]:
import pandas as pd
import os
import numpy as np
import torch
import torch.utils.data
import konlpy
from konlpy.tag import Kkma

In [3]:
os.getcwd()

'/content'

In [0]:
os.chdir('./drive/My Drive/News')

In [0]:
df = pd.read_csv('news_2020-04-13~19.csv')

In [0]:
df_day = df.set_index(df['0'])

In [0]:
del df_day['Unnamed: 0']
del df_day['0']

In [0]:
df_day.columns = ['content']

In [9]:
df_day

Unnamed: 0_level_0,content
0,Unnamed: 1_level_1
2020-04-13,[머니투데이 한지연 기자] 13일 아시아 증시가 하락세를 보였다.이날 일본 닛케이2...
2020-04-13,원·달러 1217.90원…전일比 9.10원 상승13일 오후 서울 중구 을지로 하나은...
2020-04-13,13일 코스피지수는 사흘 만에 하락했다. 활짝 웃은 것은 코오롱(002020)계열사...
2020-04-13,[서울=뉴시스] 류병화 기자 = 코스피가 기관과 외국인의 매도세에 1.8% 하락했다...
2020-04-13,"코스피 1.88% 내린 1825.76, 코스닥 2.38% 내린 596.71달러/원 ..."
...,...
2020-04-19,[머니투데이 반준환 기자] [편집자주] [종목대해부]매일같이 수조원의 자금이 오가는...
2020-04-19,반도체 유틸리티 통신 음식료 등 '실적 안전지대' 업종에 초점[아이뉴스24 한수연 ...
2020-04-19,"- “1Q 실적, 위협 수준 아니면 영향 제한적”- 추세적 성장 위해선 턴어라운드 ..."
2020-04-19,'밀크' 마일리지 통합…암호화 토큰으로 현금화 가능'크로스' 해외 송금…시간 단축·...


##Build train and valid dataset

In [10]:
data_path = './ratings_train_10000.txt'

dataframe = pd.read_table(data_path, sep='\t')
print('[Info] Get {} data from {}'.format(len(dataframe), data_path))
dataframe = dataframe.dropna(how='any')
print('[Info] Drop null data, now the length of this data is {}'.format(len(dataframe)))

dataframe = pd.DataFrame(np.random.permutation(dataframe), columns=['id', 'document', 'label']) 

[Info] Get 9999 data from ./ratings_train_10000.txt
[Info] Drop null data, now the length of this data is 9999


In [11]:
def tokenizer(string):
    string.replace("[^ㄱ-ㅎㅏ-ㅣ가-힣 ]","")
    kkma = Kkma()

    return kkma.morphs(string)

def apply_tokenizer(dataframe):
    print('[Info] Tokenize...')
    return dataframe.apply(tokenizer).tolist()

sentence, label = apply_tokenizer(dataframe['document']), dataframe['label'].tolist()

[Info] Tokenize...


In [12]:
def build_vocab(sentence):
    print('[Info] Build vocabulary')
    vocab_set = set(token for sent in sentence for token in sent)

    vocab = { "<pad>": 0, "<unk>": 1}
    start_index = len(vocab)
    for i, token in enumerate(vocab_set):
        vocab[token] = start_index + i    
    return vocab

def vectorize(vocab, sentences, one_sentence=False):
    UNK = vocab.get('<unk>')
    if one_sentence:
        return [vocab.get(token, UNK) for token in sentences]
    return [[vocab.get(token, UNK) for token in sent] for sent in sentences]

vocab = build_vocab(sentence)
print('[Info] Vocabulary size=', len(vocab))
vec_sentence = vectorize(vocab, sentence)

[Info] Build vocabulary
[Info] Vocabulary size= 12203


In [13]:
split_ratio = 0.7

n_split = int(len(sentence) * split_ratio)
trn_sentence, trn_label = vec_sentence[:n_split], label[:n_split]
val_sentence, val_label = vec_sentence[n_split:], label[n_split:]
print('[Info] Split {} data to {} for train data,  {} for valid data.'.format(len(vec_sentence), len(trn_sentence), len(val_sentence)))

[Info] Split 9999 data to 6999 for train data,  3000 for valid data.


In [14]:
class MovieDataset(torch.utils.data.Dataset):
    def __init__(self, vocab, sentence, label):
        self.vocab = vocab
        self.sentence = sentence
        self.label = label

    def __len__(self):
        return len(self.sentence)
    
    def __getitem__(self, idx):
        sentence = torch.LongTensor(self.sentence[idx])
        label = torch.LongTensor([self.label[idx]])
        return sentence, label

train_dataset, valid_dataset = MovieDataset(vocab, trn_sentence, trn_label), MovieDataset(vocab, val_sentence, val_label)
print('[Info] Build train and valid dataset')

[Info] Build train and valid dataset


In [15]:
print(f'Number of training examples: {len(train_dataset)}')
print(f'Number of valid examples: {len(valid_dataset)}')

Number of training examples: 6999
Number of valid examples: 3000


In [16]:
print([word for word, idx in vocab.items() if idx in train_dataset[0][0]])

['에', 'ㄴ', '위하', '는', '당시', '폴', '의', '적', '하', '더', '다', '신선', '그', '이', '!', '고', '컬트', '이나', '만큼', '한편', '또한', '어른', '동화', '었', '을']


##Build test dataset

In [25]:
data_path = './ratings_test_3000.txt'

dataframe = pd.read_table(data_path, sep='\t')
print('[Info] Get {} data from {}'.format(len(dataframe), data_path))
dataframe = dataframe.dropna(how='any')
print('[Info] Drop null data, now the length of this data is {}'.format(len(dataframe)))

[Info] Get 2999 data from ./ratings_test_3000.txt
[Info] Drop null data, now the length of this data is 2999


In [26]:
sentence, label = apply_tokenizer(dataframe['document']), dataframe['label'].tolist()

[Info] Tokenize...


In [0]:
vec_sentence = vectorize(vocab, sentence)

In [28]:
test_dataset = MovieDataset(vocab, vec_sentence, label)
print('[Info] Build test dataset')

[Info] Build test dataset


In [29]:
print(f'Number of test examples: {len(test_dataset)}')

Number of test examples: 2999


## Build data iterator

In [0]:
def padding(inputs): 
    sentence, label = list(zip(*inputs))
    sentence = torch.nn.utils.rnn.pad_sequence(sentence, batch_first=True, padding_value=0)
    batch =  [ sentence, torch.cat(label, dim=0) ]
    return batch

In [0]:
BATCH_SIZE = 32

train_iterator = torch.utils.data.DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True, collate_fn=padding)
valid_iterator = torch.utils.data.DataLoader(valid_dataset, batch_size=BATCH_SIZE, shuffle=True, collate_fn=padding)
test_iterator = torch.utils.data.DataLoader(test_dataset, batch_size=BATCH_SIZE, shuffle=True, collate_fn=padding)

# 1 - Simple Sentiment Analysis

In [0]:
import torch.nn as nn

class RNN(nn.Module):
    def __init__(self, input_dim, embedding_dim, hidden_dim, output_dim): 
        
        super().__init__()
        
        self.embedding = nn.Embedding(input_dim, embedding_dim)
        
        self.rnn = nn.RNN(embedding_dim, hidden_dim, batch_first = True)
        
        self.fc = nn.Linear(hidden_dim, output_dim)
        
    def forward(self, text):

        #text = [batch size, sent len]
        
        embedded = self.embedding(text)
        
        #embedded = [batch size, sent len, emb dim]
        
        output, hidden = self.rnn(embedded)
        
        #output = [batch size, sent len, hid dim]
        #hidden = [batch size, 1, hid dim]
        
        return self.fc(hidden.squeeze(0))

In [0]:
INPUT_DIM = len(vocab)
EMBEDDING_DIM = 100
HIDDEN_DIM = 256
OUTPUT_DIM = 1

model = RNN(INPUT_DIM, EMBEDDING_DIM, HIDDEN_DIM, OUTPUT_DIM)

In [34]:
def count_parameters(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

print(f'The model has {count_parameters(model):,} trainable parameters')

The model has 1,312,205 trainable parameters


## Train the Model

In [0]:
import torch.optim as optim

optimizer = optim.Adam(model.parameters(), lr=1e-3)

In [0]:
criterion = nn.BCEWithLogitsLoss()

In [0]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

In [0]:
model = model.to(device)
criterion = criterion.to(device)

In [0]:
def binary_accuracy(preds, y):
    """
    Returns accuracy per batch, i.e. if you get 8/10 right, this returns 0.8, NOT 8
    """

    #round predictions to the closest integer
    rounded_preds = torch.round(torch.sigmoid(preds))
    correct = (rounded_preds == y).float() #convert into float for division 
    acc = correct.sum() / len(correct)
    return acc

In [0]:
from tqdm import tqdm

def train(model, iterator, optimizer, criterion, device):
    
    epoch_loss = 0
    epoch_acc = 0
    
    model.train()
    
    for batch_sentence, batch_label in tqdm(iterator):
        batch_sentence = batch_sentence.to(device)
        batch_label = batch_label.to(device)
        
        optimizer.zero_grad()
                
        predictions = model(batch_sentence).squeeze(1)
       
        batch_label = batch_label.float()
        loss = criterion(predictions, batch_label)
        
        acc = binary_accuracy(predictions, batch_label)
        
        loss.backward()
        
        optimizer.step()
        
        epoch_loss += loss.item()
        epoch_acc += acc.item()
        
    return epoch_loss / len(iterator), epoch_acc / len(iterator)

In [0]:
def evaluate(model, iterator, criterion, device):
    
    epoch_loss = 0
    epoch_acc = 0
    
    model.eval()
    
    with torch.no_grad():
    
        for batch_sentence, batch_label in tqdm(iterator):
            batch_sentence = batch_sentence.to(device)
            batch_label = batch_label.to(device)
            
            predictions = model(batch_sentence).squeeze(1)
            
            batch_label = batch_label.float()
            loss = criterion(predictions, batch_label)
            
            acc = binary_accuracy(predictions, batch_label)

            epoch_loss += loss.item()
            epoch_acc += acc.item()
        
    return epoch_loss / len(iterator), epoch_acc / len(iterator)

In [0]:
import time

def epoch_time(start_time, end_time):
    elapsed_time = end_time - start_time
    elapsed_mins = int(elapsed_time / 60)
    elapsed_secs = int(elapsed_time - (elapsed_mins * 60))
    return elapsed_mins, elapsed_secs

In [43]:
N_EPOCHS = 5

best_valid_loss = float('inf')

for epoch in range(N_EPOCHS):

    start_time = time.time()
    
    train_loss, train_acc = train(model, train_iterator, optimizer, criterion, device)
    valid_loss, valid_acc = evaluate(model, valid_iterator, criterion, device)
    
    end_time = time.time()

    epoch_mins, epoch_secs = epoch_time(start_time, end_time)
    
    if valid_loss < best_valid_loss:
        best_valid_loss = valid_loss
    
    print(f'Epoch: {epoch+1:02} | Epoch Time: {epoch_mins}m {epoch_secs}s')
    print(f'\tTrain Loss: {train_loss:.3f} | Train Acc: {train_acc*100:.2f}%')
    print(f'\t Val. Loss: {valid_loss:.3f} |  Val. Acc: {valid_acc*100:.2f}%')

100%|██████████| 219/219 [00:04<00:00, 48.22it/s]
100%|██████████| 94/94 [00:00<00:00, 146.87it/s]
  3%|▎         | 6/219 [00:00<00:04, 51.95it/s]

Epoch: 01 | Epoch Time: 0m 5s
	Train Loss: 0.700 | Train Acc: 49.38%
	 Val. Loss: 0.696 |  Val. Acc: 49.15%


100%|██████████| 219/219 [00:04<00:00, 50.40it/s]
100%|██████████| 94/94 [00:00<00:00, 149.00it/s]
  3%|▎         | 6/219 [00:00<00:04, 51.38it/s]

Epoch: 02 | Epoch Time: 0m 4s
	Train Loss: 0.706 | Train Acc: 51.00%
	 Val. Loss: 0.698 |  Val. Acc: 48.94%


100%|██████████| 219/219 [00:04<00:00, 50.67it/s]
100%|██████████| 94/94 [00:00<00:00, 147.07it/s]
  3%|▎         | 6/219 [00:00<00:04, 52.17it/s]

Epoch: 03 | Epoch Time: 0m 4s
	Train Loss: 0.698 | Train Acc: 49.34%
	 Val. Loss: 0.693 |  Val. Acc: 50.94%


100%|██████████| 219/219 [00:04<00:00, 51.07it/s]
100%|██████████| 94/94 [00:00<00:00, 147.43it/s]
  2%|▏         | 5/219 [00:00<00:04, 47.43it/s]

Epoch: 04 | Epoch Time: 0m 4s
	Train Loss: 0.696 | Train Acc: 50.39%
	 Val. Loss: 0.702 |  Val. Acc: 50.78%


100%|██████████| 219/219 [00:04<00:00, 50.83it/s]
100%|██████████| 94/94 [00:00<00:00, 145.00it/s]

Epoch: 05 | Epoch Time: 0m 4s
	Train Loss: 0.698 | Train Acc: 50.68%
	 Val. Loss: 0.693 |  Val. Acc: 49.39%





In [44]:
test_loss, test_acc = evaluate(model, test_iterator, criterion, device)

print(f'Test Loss: {test_loss:.3f} | Test Acc: {test_acc*100:.2f}%')

100%|██████████| 94/94 [00:00<00:00, 154.23it/s]

Test Loss: 0.693 | Test Acc: 48.96%





#2-Updated Sentiment Analysis

##Build the Model

In [0]:
import torch.nn as nn

class RNN(nn.Module):
    def __init__(self, vocab_size, embedding_dim, hidden_dim, output_dim, n_layers, 
                 bidirectional, dropout, pad_idx):
        
        super().__init__()
        
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        
        self.rnn = nn.LSTM(embedding_dim, 
                           hidden_dim, 
                           num_layers=n_layers, 
                           bidirectional=bidirectional, 
                           dropout=dropout, batch_first=True)
        
        self.fc = nn.Linear(hidden_dim * 2, output_dim)
        
        self.dropout = nn.Dropout(dropout)
        
    def forward(self, text):
        
        #text = [batch size, sent len]
        
        embedded = self.dropout(self.embedding(text))
        
        #embedded = [batch size, sent len, emb dim]
        
        output, (hidden, cell) = self.rnn(embedded)
        
        #output = [batch size, sent len, hid dim * num directions]
        #output over padding tokens are zero tensors
        
        #hidden = [batch size, num layers * num directions, hid dim]
        #cell = [batch size, num layers * num directions, hid dim]
        
        #concat the final forward (hidden[-2,:,:]) and backward (hidden[-1,:,:]) hidden layers
        #and apply dropout
        
        hidden = self.dropout(torch.cat((hidden[-2,:,:], hidden[-1,:,:]), dim = 1))
                
        #hidden = [batch size, hid dim * num directions]
            
        return self.fc(hidden)

In [0]:
INPUT_DIM = len(vocab)
EMBEDDING_DIM = 100
HIDDEN_DIM = 256
OUTPUT_DIM = 1
N_LAYERS = 2
BIDIRECTIONAL = True
DROPOUT = 0.5
PAD_IDX = vocab["<pad>"]

In [0]:
model = RNN(INPUT_DIM, 
            EMBEDDING_DIM, 
            HIDDEN_DIM, 
            OUTPUT_DIM, 
            N_LAYERS, 
            BIDIRECTIONAL, 
            DROPOUT, 
            PAD_IDX)

In [20]:
def count_parameters(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

print(f'The model has {count_parameters(model):,} trainable parameters')

The model has 3,530,957 trainable parameters


In [0]:
UNK_IDX = vocab["<unk>"]

model.embedding.weight.data[UNK_IDX] = torch.zeros(EMBEDDING_DIM)
model.embedding.weight.data[PAD_IDX] = torch.zeros(EMBEDDING_DIM)

##Train the Model

In [0]:
import torch.optim as optim

optimizer = optim.Adam(model.parameters())

In [0]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
criterion = nn.BCEWithLogitsLoss()

model = model.to(device)

In [50]:
N_EPOCHS = 100

best_valid_loss = float('inf')

for epoch in range(N_EPOCHS):

    start_time = time.time()
    
    train_loss, train_acc = train(model, train_iterator, optimizer, criterion, device)
    valid_loss, valid_acc = evaluate(model, valid_iterator, criterion, device)
    
    end_time = time.time()

    epoch_mins, epoch_secs = epoch_time(start_time, end_time)
    
    if valid_loss < best_valid_loss:
        best_valid_loss = valid_loss
    
    print(f'Epoch: {epoch+1:02} | Epoch Time: {epoch_mins}m {epoch_secs}s')
    print(f'\tTrain Loss: {train_loss:.3f} | Train Acc: {train_acc*100:.2f}%')
    print(f'\t Val. Loss: {valid_loss:.3f} |  Val. Acc: {valid_acc*100:.2f}%')

100%|██████████| 219/219 [00:04<00:00, 50.77it/s]
100%|██████████| 94/94 [00:00<00:00, 146.57it/s]
  2%|▏         | 5/219 [00:00<00:04, 48.34it/s]

Epoch: 01 | Epoch Time: 0m 4s
	Train Loss: 0.695 | Train Acc: 51.17%
	 Val. Loss: 0.698 |  Val. Acc: 48.63%


100%|██████████| 219/219 [00:04<00:00, 49.19it/s]
100%|██████████| 94/94 [00:00<00:00, 137.18it/s]
  2%|▏         | 5/219 [00:00<00:04, 49.58it/s]

Epoch: 02 | Epoch Time: 0m 5s
	Train Loss: 0.697 | Train Acc: 50.87%
	 Val. Loss: 0.702 |  Val. Acc: 49.07%


100%|██████████| 219/219 [00:04<00:00, 49.65it/s]
100%|██████████| 94/94 [00:00<00:00, 135.16it/s]
  3%|▎         | 6/219 [00:00<00:04, 51.49it/s]

Epoch: 03 | Epoch Time: 0m 5s
	Train Loss: 0.696 | Train Acc: 50.74%
	 Val. Loss: 0.693 |  Val. Acc: 48.80%


100%|██████████| 219/219 [00:04<00:00, 49.50it/s]
100%|██████████| 94/94 [00:00<00:00, 136.13it/s]
  2%|▏         | 5/219 [00:00<00:04, 49.37it/s]

Epoch: 04 | Epoch Time: 0m 5s
	Train Loss: 0.697 | Train Acc: 49.82%
	 Val. Loss: 0.693 |  Val. Acc: 49.25%


100%|██████████| 219/219 [00:04<00:00, 49.39it/s]
100%|██████████| 94/94 [00:00<00:00, 146.41it/s]
  2%|▏         | 5/219 [00:00<00:04, 49.10it/s]

Epoch: 05 | Epoch Time: 0m 5s
	Train Loss: 0.695 | Train Acc: 49.77%
	 Val. Loss: 0.693 |  Val. Acc: 48.78%


100%|██████████| 219/219 [00:04<00:00, 50.93it/s]
100%|██████████| 94/94 [00:00<00:00, 144.85it/s]
  2%|▏         | 5/219 [00:00<00:04, 49.52it/s]

Epoch: 06 | Epoch Time: 0m 4s
	Train Loss: 0.695 | Train Acc: 50.98%
	 Val. Loss: 0.699 |  Val. Acc: 51.74%


100%|██████████| 219/219 [00:04<00:00, 51.01it/s]
100%|██████████| 94/94 [00:00<00:00, 151.23it/s]
  3%|▎         | 6/219 [00:00<00:04, 49.53it/s]

Epoch: 07 | Epoch Time: 0m 4s
	Train Loss: 0.694 | Train Acc: 50.29%
	 Val. Loss: 0.704 |  Val. Acc: 51.43%


100%|██████████| 219/219 [00:04<00:00, 51.27it/s]
100%|██████████| 94/94 [00:00<00:00, 147.46it/s]
  3%|▎         | 6/219 [00:00<00:04, 51.25it/s]

Epoch: 08 | Epoch Time: 0m 4s
	Train Loss: 0.694 | Train Acc: 50.86%
	 Val. Loss: 0.695 |  Val. Acc: 51.25%


100%|██████████| 219/219 [00:04<00:00, 51.25it/s]
100%|██████████| 94/94 [00:00<00:00, 147.30it/s]
  3%|▎         | 7/219 [00:00<00:03, 60.43it/s]

Epoch: 09 | Epoch Time: 0m 4s
	Train Loss: 0.695 | Train Acc: 50.72%
	 Val. Loss: 0.699 |  Val. Acc: 49.17%


100%|██████████| 219/219 [00:04<00:00, 51.65it/s]
100%|██████████| 94/94 [00:00<00:00, 145.76it/s]
  2%|▏         | 5/219 [00:00<00:04, 49.42it/s]

Epoch: 10 | Epoch Time: 0m 4s
	Train Loss: 0.693 | Train Acc: 51.19%
	 Val. Loss: 0.705 |  Val. Acc: 49.31%


100%|██████████| 219/219 [00:04<00:00, 51.49it/s]
100%|██████████| 94/94 [00:00<00:00, 151.15it/s]
  3%|▎         | 6/219 [00:00<00:04, 51.13it/s]

Epoch: 11 | Epoch Time: 0m 4s
	Train Loss: 0.695 | Train Acc: 51.09%
	 Val. Loss: 0.701 |  Val. Acc: 51.13%


100%|██████████| 219/219 [00:04<00:00, 51.33it/s]
100%|██████████| 94/94 [00:00<00:00, 151.51it/s]
  2%|▏         | 5/219 [00:00<00:04, 49.87it/s]

Epoch: 12 | Epoch Time: 0m 4s
	Train Loss: 0.693 | Train Acc: 50.41%
	 Val. Loss: 0.692 |  Val. Acc: 50.01%


100%|██████████| 219/219 [00:04<00:00, 51.12it/s]
100%|██████████| 94/94 [00:00<00:00, 152.65it/s]
  2%|▏         | 5/219 [00:00<00:04, 48.80it/s]

Epoch: 13 | Epoch Time: 0m 4s
	Train Loss: 0.689 | Train Acc: 49.67%
	 Val. Loss: 0.703 |  Val. Acc: 49.38%


100%|██████████| 219/219 [00:04<00:00, 51.32it/s]
100%|██████████| 94/94 [00:00<00:00, 149.91it/s]
  3%|▎         | 6/219 [00:00<00:04, 51.27it/s]

Epoch: 14 | Epoch Time: 0m 4s
	Train Loss: 0.692 | Train Acc: 51.09%
	 Val. Loss: 0.704 |  Val. Acc: 48.96%


100%|██████████| 219/219 [00:04<00:00, 50.80it/s]
100%|██████████| 94/94 [00:00<00:00, 145.88it/s]
  3%|▎         | 6/219 [00:00<00:03, 54.91it/s]

Epoch: 15 | Epoch Time: 0m 4s
	Train Loss: 0.693 | Train Acc: 51.66%
	 Val. Loss: 0.724 |  Val. Acc: 49.37%


100%|██████████| 219/219 [00:04<00:00, 51.07it/s]
100%|██████████| 94/94 [00:00<00:00, 148.37it/s]
  3%|▎         | 6/219 [00:00<00:04, 53.19it/s]

Epoch: 16 | Epoch Time: 0m 4s
	Train Loss: 0.692 | Train Acc: 50.73%
	 Val. Loss: 0.695 |  Val. Acc: 49.20%


100%|██████████| 219/219 [00:04<00:00, 50.82it/s]
100%|██████████| 94/94 [00:00<00:00, 147.04it/s]
  2%|▏         | 5/219 [00:00<00:04, 47.78it/s]

Epoch: 17 | Epoch Time: 0m 4s
	Train Loss: 0.691 | Train Acc: 50.41%
	 Val. Loss: 0.697 |  Val. Acc: 48.98%


100%|██████████| 219/219 [00:04<00:00, 51.96it/s]
100%|██████████| 94/94 [00:00<00:00, 149.47it/s]
  2%|▏         | 5/219 [00:00<00:04, 49.71it/s]

Epoch: 18 | Epoch Time: 0m 4s
	Train Loss: 0.691 | Train Acc: 50.64%
	 Val. Loss: 0.695 |  Val. Acc: 49.40%


100%|██████████| 219/219 [00:04<00:00, 51.24it/s]
100%|██████████| 94/94 [00:00<00:00, 150.53it/s]
  3%|▎         | 6/219 [00:00<00:03, 55.62it/s]

Epoch: 19 | Epoch Time: 0m 4s
	Train Loss: 0.690 | Train Acc: 51.07%
	 Val. Loss: 0.694 |  Val. Acc: 51.05%


100%|██████████| 219/219 [00:04<00:00, 50.95it/s]
100%|██████████| 94/94 [00:00<00:00, 148.35it/s]
  3%|▎         | 6/219 [00:00<00:04, 51.21it/s]

Epoch: 20 | Epoch Time: 0m 4s
	Train Loss: 0.690 | Train Acc: 50.96%
	 Val. Loss: 0.698 |  Val. Acc: 51.02%


100%|██████████| 219/219 [00:04<00:00, 51.59it/s]
100%|██████████| 94/94 [00:00<00:00, 150.40it/s]
  3%|▎         | 6/219 [00:00<00:04, 53.25it/s]

Epoch: 21 | Epoch Time: 0m 4s
	Train Loss: 0.689 | Train Acc: 51.30%
	 Val. Loss: 0.703 |  Val. Acc: 50.73%


100%|██████████| 219/219 [00:04<00:00, 51.05it/s]
100%|██████████| 94/94 [00:00<00:00, 146.41it/s]
  3%|▎         | 6/219 [00:00<00:04, 52.99it/s]

Epoch: 22 | Epoch Time: 0m 4s
	Train Loss: 0.688 | Train Acc: 50.81%
	 Val. Loss: 0.701 |  Val. Acc: 49.42%


100%|██████████| 219/219 [00:04<00:00, 51.06it/s]
100%|██████████| 94/94 [00:00<00:00, 151.63it/s]
  3%|▎         | 6/219 [00:00<00:04, 49.75it/s]

Epoch: 23 | Epoch Time: 0m 4s
	Train Loss: 0.687 | Train Acc: 50.62%
	 Val. Loss: 0.703 |  Val. Acc: 48.95%


100%|██████████| 219/219 [00:04<00:00, 51.79it/s]
100%|██████████| 94/94 [00:00<00:00, 148.76it/s]
  3%|▎         | 6/219 [00:00<00:03, 54.86it/s]

Epoch: 24 | Epoch Time: 0m 4s
	Train Loss: 0.688 | Train Acc: 52.04%
	 Val. Loss: 0.700 |  Val. Acc: 50.91%


100%|██████████| 219/219 [00:04<00:00, 51.17it/s]
100%|██████████| 94/94 [00:00<00:00, 149.75it/s]
  2%|▏         | 5/219 [00:00<00:04, 49.06it/s]

Epoch: 25 | Epoch Time: 0m 4s
	Train Loss: 0.687 | Train Acc: 51.23%
	 Val. Loss: 0.702 |  Val. Acc: 49.14%


100%|██████████| 219/219 [00:04<00:00, 51.26it/s]
100%|██████████| 94/94 [00:00<00:00, 149.18it/s]
  3%|▎         | 6/219 [00:00<00:03, 56.44it/s]

Epoch: 26 | Epoch Time: 0m 4s
	Train Loss: 0.686 | Train Acc: 50.67%
	 Val. Loss: 0.703 |  Val. Acc: 48.29%


100%|██████████| 219/219 [00:04<00:00, 51.48it/s]
100%|██████████| 94/94 [00:00<00:00, 149.91it/s]
  3%|▎         | 6/219 [00:00<00:04, 52.40it/s]

Epoch: 27 | Epoch Time: 0m 4s
	Train Loss: 0.688 | Train Acc: 50.88%
	 Val. Loss: 0.704 |  Val. Acc: 50.94%


100%|██████████| 219/219 [00:04<00:00, 51.15it/s]
100%|██████████| 94/94 [00:00<00:00, 148.64it/s]
  2%|▏         | 5/219 [00:00<00:04, 48.66it/s]

Epoch: 28 | Epoch Time: 0m 4s
	Train Loss: 0.686 | Train Acc: 52.83%
	 Val. Loss: 0.706 |  Val. Acc: 49.51%


100%|██████████| 219/219 [00:04<00:00, 50.59it/s]
100%|██████████| 94/94 [00:00<00:00, 146.16it/s]
  2%|▏         | 5/219 [00:00<00:04, 47.84it/s]

Epoch: 29 | Epoch Time: 0m 4s
	Train Loss: 0.685 | Train Acc: 50.61%
	 Val. Loss: 0.707 |  Val. Acc: 49.08%


100%|██████████| 219/219 [00:04<00:00, 50.23it/s]
100%|██████████| 94/94 [00:00<00:00, 142.16it/s]
  3%|▎         | 6/219 [00:00<00:03, 53.47it/s]

Epoch: 30 | Epoch Time: 0m 5s
	Train Loss: 0.689 | Train Acc: 49.81%
	 Val. Loss: 0.729 |  Val. Acc: 48.91%


100%|██████████| 219/219 [00:04<00:00, 49.92it/s]
100%|██████████| 94/94 [00:00<00:00, 142.38it/s]
  3%|▎         | 6/219 [00:00<00:04, 51.66it/s]

Epoch: 31 | Epoch Time: 0m 5s
	Train Loss: 0.685 | Train Acc: 51.31%
	 Val. Loss: 0.700 |  Val. Acc: 50.86%


100%|██████████| 219/219 [00:04<00:00, 50.16it/s]
100%|██████████| 94/94 [00:00<00:00, 141.90it/s]
  2%|▏         | 5/219 [00:00<00:04, 47.71it/s]

Epoch: 32 | Epoch Time: 0m 5s
	Train Loss: 0.684 | Train Acc: 51.40%
	 Val. Loss: 0.700 |  Val. Acc: 51.42%


100%|██████████| 219/219 [00:04<00:00, 50.01it/s]
100%|██████████| 94/94 [00:00<00:00, 143.36it/s]
  3%|▎         | 6/219 [00:00<00:03, 53.68it/s]

Epoch: 33 | Epoch Time: 0m 5s
	Train Loss: 0.686 | Train Acc: 50.78%
	 Val. Loss: 0.700 |  Val. Acc: 51.20%


100%|██████████| 219/219 [00:04<00:00, 50.01it/s]
100%|██████████| 94/94 [00:00<00:00, 145.08it/s]
  3%|▎         | 6/219 [00:00<00:04, 50.51it/s]

Epoch: 34 | Epoch Time: 0m 5s
	Train Loss: 0.684 | Train Acc: 51.79%
	 Val. Loss: 0.714 |  Val. Acc: 49.12%


100%|██████████| 219/219 [00:04<00:00, 50.31it/s]
100%|██████████| 94/94 [00:00<00:00, 145.91it/s]
  2%|▏         | 5/219 [00:00<00:04, 49.93it/s]

Epoch: 35 | Epoch Time: 0m 5s
	Train Loss: 0.686 | Train Acc: 51.86%
	 Val. Loss: 0.706 |  Val. Acc: 50.91%


100%|██████████| 219/219 [00:04<00:00, 50.33it/s]
100%|██████████| 94/94 [00:00<00:00, 147.30it/s]
  3%|▎         | 6/219 [00:00<00:03, 53.28it/s]

Epoch: 36 | Epoch Time: 0m 4s
	Train Loss: 0.684 | Train Acc: 51.35%
	 Val. Loss: 0.705 |  Val. Acc: 49.12%


100%|██████████| 219/219 [00:04<00:00, 49.79it/s]
100%|██████████| 94/94 [00:00<00:00, 143.64it/s]
  3%|▎         | 6/219 [00:00<00:03, 55.49it/s]

Epoch: 37 | Epoch Time: 0m 5s
	Train Loss: 0.682 | Train Acc: 51.39%
	 Val. Loss: 0.702 |  Val. Acc: 51.26%


100%|██████████| 219/219 [00:04<00:00, 50.25it/s]
100%|██████████| 94/94 [00:00<00:00, 145.25it/s]
  2%|▏         | 5/219 [00:00<00:04, 47.77it/s]

Epoch: 38 | Epoch Time: 0m 5s
	Train Loss: 0.684 | Train Acc: 51.79%
	 Val. Loss: 0.712 |  Val. Acc: 49.08%


100%|██████████| 219/219 [00:04<00:00, 50.53it/s]
100%|██████████| 94/94 [00:00<00:00, 149.47it/s]
  3%|▎         | 6/219 [00:00<00:03, 55.12it/s]

Epoch: 39 | Epoch Time: 0m 4s
	Train Loss: 0.689 | Train Acc: 50.60%
	 Val. Loss: 0.705 |  Val. Acc: 51.11%


100%|██████████| 219/219 [00:04<00:00, 49.92it/s]
100%|██████████| 94/94 [00:00<00:00, 142.86it/s]
  2%|▏         | 5/219 [00:00<00:04, 46.24it/s]

Epoch: 40 | Epoch Time: 0m 5s
	Train Loss: 0.684 | Train Acc: 50.75%
	 Val. Loss: 0.707 |  Val. Acc: 49.26%


100%|██████████| 219/219 [00:04<00:00, 50.91it/s]
100%|██████████| 94/94 [00:00<00:00, 143.64it/s]
  3%|▎         | 6/219 [00:00<00:03, 53.65it/s]

Epoch: 41 | Epoch Time: 0m 4s
	Train Loss: 0.687 | Train Acc: 51.08%
	 Val. Loss: 0.705 |  Val. Acc: 51.15%


100%|██████████| 219/219 [00:04<00:00, 50.21it/s]
100%|██████████| 94/94 [00:00<00:00, 150.78it/s]
  3%|▎         | 6/219 [00:00<00:04, 49.67it/s]

Epoch: 42 | Epoch Time: 0m 4s
	Train Loss: 0.682 | Train Acc: 50.62%
	 Val. Loss: 0.707 |  Val. Acc: 49.04%


100%|██████████| 219/219 [00:04<00:00, 49.63it/s]
100%|██████████| 94/94 [00:00<00:00, 147.68it/s]
  3%|▎         | 6/219 [00:00<00:04, 50.65it/s]

Epoch: 43 | Epoch Time: 0m 5s
	Train Loss: 0.683 | Train Acc: 51.72%
	 Val. Loss: 0.703 |  Val. Acc: 51.11%


100%|██████████| 219/219 [00:04<00:00, 49.67it/s]
100%|██████████| 94/94 [00:00<00:00, 146.63it/s]
  3%|▎         | 6/219 [00:00<00:03, 53.88it/s]

Epoch: 44 | Epoch Time: 0m 5s
	Train Loss: 0.682 | Train Acc: 51.30%
	 Val. Loss: 0.711 |  Val. Acc: 49.20%


100%|██████████| 219/219 [00:04<00:00, 49.69it/s]
100%|██████████| 94/94 [00:00<00:00, 150.60it/s]
  3%|▎         | 6/219 [00:00<00:04, 50.40it/s]

Epoch: 45 | Epoch Time: 0m 5s
	Train Loss: 0.685 | Train Acc: 51.58%
	 Val. Loss: 0.698 |  Val. Acc: 51.11%


100%|██████████| 219/219 [00:04<00:00, 50.72it/s]
100%|██████████| 94/94 [00:00<00:00, 144.11it/s]
  3%|▎         | 6/219 [00:00<00:04, 51.91it/s]

Epoch: 46 | Epoch Time: 0m 4s
	Train Loss: 0.683 | Train Acc: 51.27%
	 Val. Loss: 0.712 |  Val. Acc: 51.02%


100%|██████████| 219/219 [00:04<00:00, 49.87it/s]
100%|██████████| 94/94 [00:00<00:00, 144.70it/s]
  2%|▏         | 5/219 [00:00<00:04, 49.70it/s]

Epoch: 47 | Epoch Time: 0m 5s
	Train Loss: 0.682 | Train Acc: 50.99%
	 Val. Loss: 0.703 |  Val. Acc: 50.99%


100%|██████████| 219/219 [00:04<00:00, 50.17it/s]
100%|██████████| 94/94 [00:00<00:00, 148.05it/s]
  2%|▏         | 5/219 [00:00<00:04, 49.78it/s]

Epoch: 48 | Epoch Time: 0m 5s
	Train Loss: 0.682 | Train Acc: 50.02%
	 Val. Loss: 0.707 |  Val. Acc: 50.89%


100%|██████████| 219/219 [00:04<00:00, 50.10it/s]
100%|██████████| 94/94 [00:00<00:00, 144.66it/s]
  3%|▎         | 6/219 [00:00<00:03, 54.99it/s]

Epoch: 49 | Epoch Time: 0m 5s
	Train Loss: 0.681 | Train Acc: 51.26%
	 Val. Loss: 0.718 |  Val. Acc: 49.08%


100%|██████████| 219/219 [00:04<00:00, 49.96it/s]
100%|██████████| 94/94 [00:00<00:00, 144.85it/s]
  3%|▎         | 6/219 [00:00<00:03, 54.32it/s]

Epoch: 50 | Epoch Time: 0m 5s
	Train Loss: 0.684 | Train Acc: 49.94%
	 Val. Loss: 0.704 |  Val. Acc: 49.16%


100%|██████████| 219/219 [00:04<00:00, 50.01it/s]
100%|██████████| 94/94 [00:00<00:00, 147.25it/s]
  2%|▏         | 5/219 [00:00<00:04, 48.44it/s]

Epoch: 51 | Epoch Time: 0m 5s
	Train Loss: 0.683 | Train Acc: 51.52%
	 Val. Loss: 0.709 |  Val. Acc: 50.81%


100%|██████████| 219/219 [00:04<00:00, 49.99it/s]
100%|██████████| 94/94 [00:00<00:00, 138.71it/s]
  3%|▎         | 6/219 [00:00<00:03, 53.27it/s]

Epoch: 52 | Epoch Time: 0m 5s
	Train Loss: 0.682 | Train Acc: 51.65%
	 Val. Loss: 0.719 |  Val. Acc: 48.69%


100%|██████████| 219/219 [00:04<00:00, 50.48it/s]
100%|██████████| 94/94 [00:00<00:00, 146.53it/s]
  2%|▏         | 5/219 [00:00<00:04, 44.89it/s]

Epoch: 53 | Epoch Time: 0m 4s
	Train Loss: 0.679 | Train Acc: 51.88%
	 Val. Loss: 0.714 |  Val. Acc: 51.23%


100%|██████████| 219/219 [00:04<00:00, 49.82it/s]
100%|██████████| 94/94 [00:00<00:00, 148.42it/s]
  3%|▎         | 6/219 [00:00<00:04, 51.88it/s]

Epoch: 54 | Epoch Time: 0m 5s
	Train Loss: 0.684 | Train Acc: 50.80%
	 Val. Loss: 0.708 |  Val. Acc: 50.89%


100%|██████████| 219/219 [00:04<00:00, 50.26it/s]
100%|██████████| 94/94 [00:00<00:00, 139.65it/s]
  2%|▏         | 5/219 [00:00<00:04, 49.12it/s]

Epoch: 55 | Epoch Time: 0m 5s
	Train Loss: 0.684 | Train Acc: 51.79%
	 Val. Loss: 0.708 |  Val. Acc: 48.85%


100%|██████████| 219/219 [00:04<00:00, 50.74it/s]
100%|██████████| 94/94 [00:00<00:00, 145.43it/s]
  3%|▎         | 6/219 [00:00<00:03, 54.60it/s]

Epoch: 56 | Epoch Time: 0m 4s
	Train Loss: 0.680 | Train Acc: 51.66%
	 Val. Loss: 0.721 |  Val. Acc: 48.98%


100%|██████████| 219/219 [00:04<00:00, 50.17it/s]
100%|██████████| 94/94 [00:00<00:00, 148.42it/s]
  3%|▎         | 6/219 [00:00<00:04, 52.98it/s]

Epoch: 57 | Epoch Time: 0m 5s
	Train Loss: 0.684 | Train Acc: 51.19%
	 Val. Loss: 0.701 |  Val. Acc: 49.27%


100%|██████████| 219/219 [00:04<00:00, 50.23it/s]
100%|██████████| 94/94 [00:00<00:00, 140.34it/s]
  3%|▎         | 6/219 [00:00<00:04, 50.40it/s]

Epoch: 58 | Epoch Time: 0m 5s
	Train Loss: 0.682 | Train Acc: 51.22%
	 Val. Loss: 0.721 |  Val. Acc: 49.05%


100%|██████████| 219/219 [00:04<00:00, 50.60it/s]
100%|██████████| 94/94 [00:00<00:00, 145.04it/s]
  3%|▎         | 6/219 [00:00<00:04, 52.75it/s]

Epoch: 59 | Epoch Time: 0m 4s
	Train Loss: 0.684 | Train Acc: 51.40%
	 Val. Loss: 0.713 |  Val. Acc: 51.42%


100%|██████████| 219/219 [00:04<00:00, 50.59it/s]
100%|██████████| 94/94 [00:00<00:00, 147.39it/s]
  3%|▎         | 6/219 [00:00<00:03, 57.18it/s]

Epoch: 60 | Epoch Time: 0m 4s
	Train Loss: 0.682 | Train Acc: 50.78%
	 Val. Loss: 0.713 |  Val. Acc: 49.06%


100%|██████████| 219/219 [00:04<00:00, 49.96it/s]
100%|██████████| 94/94 [00:00<00:00, 138.89it/s]
  2%|▏         | 5/219 [00:00<00:04, 46.62it/s]

Epoch: 61 | Epoch Time: 0m 5s
	Train Loss: 0.685 | Train Acc: 50.75%
	 Val. Loss: 0.708 |  Val. Acc: 49.17%


100%|██████████| 219/219 [00:04<00:00, 50.29it/s]
100%|██████████| 94/94 [00:00<00:00, 139.92it/s]
  2%|▏         | 5/219 [00:00<00:04, 49.20it/s]

Epoch: 62 | Epoch Time: 0m 5s
	Train Loss: 0.691 | Train Acc: 50.67%
	 Val. Loss: 0.703 |  Val. Acc: 49.27%


100%|██████████| 219/219 [00:04<00:00, 49.83it/s]
100%|██████████| 94/94 [00:00<00:00, 148.34it/s]
  3%|▎         | 6/219 [00:00<00:04, 52.18it/s]

Epoch: 63 | Epoch Time: 0m 5s
	Train Loss: 0.684 | Train Acc: 51.49%
	 Val. Loss: 0.710 |  Val. Acc: 50.71%


100%|██████████| 219/219 [00:04<00:00, 49.77it/s]
100%|██████████| 94/94 [00:00<00:00, 141.47it/s]
  3%|▎         | 6/219 [00:00<00:03, 55.78it/s]

Epoch: 64 | Epoch Time: 0m 5s
	Train Loss: 0.683 | Train Acc: 50.69%
	 Val. Loss: 0.695 |  Val. Acc: 51.22%


100%|██████████| 219/219 [00:04<00:00, 50.50it/s]
100%|██████████| 94/94 [00:00<00:00, 147.56it/s]
  3%|▎         | 6/219 [00:00<00:03, 53.38it/s]

Epoch: 65 | Epoch Time: 0m 4s
	Train Loss: 0.686 | Train Acc: 51.76%
	 Val. Loss: 0.705 |  Val. Acc: 48.79%


100%|██████████| 219/219 [00:04<00:00, 50.15it/s]
100%|██████████| 94/94 [00:00<00:00, 148.77it/s]
  3%|▎         | 6/219 [00:00<00:03, 55.89it/s]

Epoch: 66 | Epoch Time: 0m 5s
	Train Loss: 0.683 | Train Acc: 51.85%
	 Val. Loss: 0.709 |  Val. Acc: 50.99%


100%|██████████| 219/219 [00:04<00:00, 49.90it/s]
100%|██████████| 94/94 [00:00<00:00, 141.63it/s]
  2%|▏         | 5/219 [00:00<00:04, 49.69it/s]

Epoch: 67 | Epoch Time: 0m 5s
	Train Loss: 0.680 | Train Acc: 51.63%
	 Val. Loss: 0.715 |  Val. Acc: 51.25%


100%|██████████| 219/219 [00:04<00:00, 50.37it/s]
100%|██████████| 94/94 [00:00<00:00, 144.80it/s]
  2%|▏         | 5/219 [00:00<00:04, 48.34it/s]

Epoch: 68 | Epoch Time: 0m 5s
	Train Loss: 0.683 | Train Acc: 51.13%
	 Val. Loss: 0.704 |  Val. Acc: 50.65%


100%|██████████| 219/219 [00:04<00:00, 50.60it/s]
100%|██████████| 94/94 [00:00<00:00, 151.96it/s]
  3%|▎         | 6/219 [00:00<00:03, 55.27it/s]

Epoch: 69 | Epoch Time: 0m 4s
	Train Loss: 0.683 | Train Acc: 51.15%
	 Val. Loss: 0.704 |  Val. Acc: 51.15%


100%|██████████| 219/219 [00:04<00:00, 50.98it/s]
100%|██████████| 94/94 [00:00<00:00, 145.07it/s]
  2%|▏         | 5/219 [00:00<00:04, 46.28it/s]

Epoch: 70 | Epoch Time: 0m 4s
	Train Loss: 0.678 | Train Acc: 50.61%
	 Val. Loss: 0.700 |  Val. Acc: 49.25%


100%|██████████| 219/219 [00:04<00:00, 51.02it/s]
100%|██████████| 94/94 [00:00<00:00, 143.62it/s]
  3%|▎         | 6/219 [00:00<00:03, 53.45it/s]

Epoch: 71 | Epoch Time: 0m 4s
	Train Loss: 0.681 | Train Acc: 52.52%
	 Val. Loss: 0.708 |  Val. Acc: 49.25%


100%|██████████| 219/219 [00:04<00:00, 50.93it/s]
100%|██████████| 94/94 [00:00<00:00, 147.74it/s]
  3%|▎         | 6/219 [00:00<00:04, 52.37it/s]

Epoch: 72 | Epoch Time: 0m 4s
	Train Loss: 0.680 | Train Acc: 50.84%
	 Val. Loss: 0.706 |  Val. Acc: 49.34%


100%|██████████| 219/219 [00:04<00:00, 50.36it/s]
100%|██████████| 94/94 [00:00<00:00, 147.32it/s]
  2%|▏         | 5/219 [00:00<00:04, 47.54it/s]

Epoch: 73 | Epoch Time: 0m 4s
	Train Loss: 0.679 | Train Acc: 51.75%
	 Val. Loss: 0.705 |  Val. Acc: 49.19%


100%|██████████| 219/219 [00:04<00:00, 50.96it/s]
100%|██████████| 94/94 [00:00<00:00, 142.85it/s]
  3%|▎         | 6/219 [00:00<00:04, 51.96it/s]

Epoch: 74 | Epoch Time: 0m 4s
	Train Loss: 0.683 | Train Acc: 51.72%
	 Val. Loss: 0.705 |  Val. Acc: 51.30%


100%|██████████| 219/219 [00:04<00:00, 50.38it/s]
100%|██████████| 94/94 [00:00<00:00, 149.24it/s]
  3%|▎         | 6/219 [00:00<00:03, 53.73it/s]

Epoch: 75 | Epoch Time: 0m 4s
	Train Loss: 0.680 | Train Acc: 51.07%
	 Val. Loss: 0.709 |  Val. Acc: 49.15%


100%|██████████| 219/219 [00:04<00:00, 50.76it/s]
100%|██████████| 94/94 [00:00<00:00, 151.16it/s]
  2%|▏         | 5/219 [00:00<00:04, 46.47it/s]

Epoch: 76 | Epoch Time: 0m 4s
	Train Loss: 0.678 | Train Acc: 52.90%
	 Val. Loss: 0.718 |  Val. Acc: 49.20%


100%|██████████| 219/219 [00:04<00:00, 49.99it/s]
100%|██████████| 94/94 [00:00<00:00, 142.38it/s]
  2%|▏         | 5/219 [00:00<00:04, 48.00it/s]

Epoch: 77 | Epoch Time: 0m 5s
	Train Loss: 0.680 | Train Acc: 52.46%
	 Val. Loss: 0.706 |  Val. Acc: 49.38%


100%|██████████| 219/219 [00:04<00:00, 50.34it/s]
100%|██████████| 94/94 [00:00<00:00, 146.39it/s]
  3%|▎         | 6/219 [00:00<00:04, 50.89it/s]

Epoch: 78 | Epoch Time: 0m 5s
	Train Loss: 0.684 | Train Acc: 50.77%
	 Val. Loss: 0.702 |  Val. Acc: 51.29%


100%|██████████| 219/219 [00:04<00:00, 50.34it/s]
100%|██████████| 94/94 [00:00<00:00, 150.68it/s]
  3%|▎         | 6/219 [00:00<00:03, 54.35it/s]

Epoch: 79 | Epoch Time: 0m 4s
	Train Loss: 0.682 | Train Acc: 51.94%
	 Val. Loss: 0.715 |  Val. Acc: 49.43%


100%|██████████| 219/219 [00:04<00:00, 51.51it/s]
100%|██████████| 94/94 [00:00<00:00, 144.39it/s]
  2%|▏         | 5/219 [00:00<00:04, 48.44it/s]

Epoch: 80 | Epoch Time: 0m 4s
	Train Loss: 0.685 | Train Acc: 50.40%
	 Val. Loss: 0.704 |  Val. Acc: 51.00%


100%|██████████| 219/219 [00:04<00:00, 50.78it/s]
100%|██████████| 94/94 [00:00<00:00, 143.16it/s]
  2%|▏         | 5/219 [00:00<00:04, 49.73it/s]

Epoch: 81 | Epoch Time: 0m 4s
	Train Loss: 0.679 | Train Acc: 50.95%
	 Val. Loss: 0.709 |  Val. Acc: 49.34%


100%|██████████| 219/219 [00:04<00:00, 49.71it/s]
100%|██████████| 94/94 [00:00<00:00, 151.15it/s]
  2%|▏         | 5/219 [00:00<00:04, 49.61it/s]

Epoch: 82 | Epoch Time: 0m 5s
	Train Loss: 0.682 | Train Acc: 52.31%
	 Val. Loss: 0.701 |  Val. Acc: 49.67%


100%|██████████| 219/219 [00:04<00:00, 49.77it/s]
100%|██████████| 94/94 [00:00<00:00, 147.68it/s]
  2%|▏         | 5/219 [00:00<00:04, 43.38it/s]

Epoch: 83 | Epoch Time: 0m 5s
	Train Loss: 0.681 | Train Acc: 51.28%
	 Val. Loss: 0.716 |  Val. Acc: 49.27%


100%|██████████| 219/219 [00:04<00:00, 50.24it/s]
100%|██████████| 94/94 [00:00<00:00, 141.16it/s]
  2%|▏         | 5/219 [00:00<00:04, 47.43it/s]

Epoch: 84 | Epoch Time: 0m 5s
	Train Loss: 0.679 | Train Acc: 51.43%
	 Val. Loss: 0.710 |  Val. Acc: 49.16%


100%|██████████| 219/219 [00:04<00:00, 49.80it/s]
100%|██████████| 94/94 [00:00<00:00, 145.65it/s]
  3%|▎         | 6/219 [00:00<00:04, 52.82it/s]

Epoch: 85 | Epoch Time: 0m 5s
	Train Loss: 0.678 | Train Acc: 52.18%
	 Val. Loss: 0.719 |  Val. Acc: 50.99%


100%|██████████| 219/219 [00:04<00:00, 50.25it/s]
100%|██████████| 94/94 [00:00<00:00, 141.75it/s]
  2%|▏         | 5/219 [00:00<00:04, 46.95it/s]

Epoch: 86 | Epoch Time: 0m 5s
	Train Loss: 0.680 | Train Acc: 51.76%
	 Val. Loss: 0.713 |  Val. Acc: 50.89%


100%|██████████| 219/219 [00:04<00:00, 49.92it/s]
100%|██████████| 94/94 [00:00<00:00, 144.15it/s]
  2%|▏         | 5/219 [00:00<00:04, 47.40it/s]

Epoch: 87 | Epoch Time: 0m 5s
	Train Loss: 0.679 | Train Acc: 49.82%
	 Val. Loss: 0.715 |  Val. Acc: 49.26%


100%|██████████| 219/219 [00:04<00:00, 50.64it/s]
100%|██████████| 94/94 [00:00<00:00, 148.99it/s]
  3%|▎         | 6/219 [00:00<00:04, 51.46it/s]

Epoch: 88 | Epoch Time: 0m 4s
	Train Loss: 0.678 | Train Acc: 52.80%
	 Val. Loss: 0.735 |  Val. Acc: 49.53%


100%|██████████| 219/219 [00:04<00:00, 50.70it/s]
100%|██████████| 94/94 [00:00<00:00, 150.79it/s]
  2%|▏         | 5/219 [00:00<00:04, 47.34it/s]

Epoch: 89 | Epoch Time: 0m 4s
	Train Loss: 0.678 | Train Acc: 51.25%
	 Val. Loss: 0.713 |  Val. Acc: 49.10%


100%|██████████| 219/219 [00:04<00:00, 50.34it/s]
100%|██████████| 94/94 [00:00<00:00, 141.41it/s]
  3%|▎         | 6/219 [00:00<00:04, 51.13it/s]

Epoch: 90 | Epoch Time: 0m 5s
	Train Loss: 0.680 | Train Acc: 50.77%
	 Val. Loss: 0.711 |  Val. Acc: 49.45%


100%|██████████| 219/219 [00:04<00:00, 50.04it/s]
100%|██████████| 94/94 [00:00<00:00, 147.27it/s]
  3%|▎         | 6/219 [00:00<00:03, 55.42it/s]

Epoch: 91 | Epoch Time: 0m 5s
	Train Loss: 0.680 | Train Acc: 52.50%
	 Val. Loss: 0.721 |  Val. Acc: 49.28%


100%|██████████| 219/219 [00:04<00:00, 50.34it/s]
100%|██████████| 94/94 [00:00<00:00, 148.90it/s]
  2%|▏         | 5/219 [00:00<00:04, 44.97it/s]

Epoch: 92 | Epoch Time: 0m 4s
	Train Loss: 0.676 | Train Acc: 52.10%
	 Val. Loss: 0.710 |  Val. Acc: 51.17%


100%|██████████| 219/219 [00:04<00:00, 50.52it/s]
100%|██████████| 94/94 [00:00<00:00, 144.06it/s]
  2%|▏         | 5/219 [00:00<00:04, 47.35it/s]

Epoch: 93 | Epoch Time: 0m 4s
	Train Loss: 0.678 | Train Acc: 51.55%
	 Val. Loss: 0.714 |  Val. Acc: 49.02%


100%|██████████| 219/219 [00:04<00:00, 50.54it/s]
100%|██████████| 94/94 [00:00<00:00, 148.44it/s]
  3%|▎         | 6/219 [00:00<00:03, 53.71it/s]

Epoch: 94 | Epoch Time: 0m 4s
	Train Loss: 0.679 | Train Acc: 51.09%
	 Val. Loss: 0.724 |  Val. Acc: 50.86%


100%|██████████| 219/219 [00:04<00:00, 50.57it/s]
100%|██████████| 94/94 [00:00<00:00, 149.76it/s]
  3%|▎         | 6/219 [00:00<00:04, 48.96it/s]

Epoch: 95 | Epoch Time: 0m 4s
	Train Loss: 0.680 | Train Acc: 51.45%
	 Val. Loss: 0.711 |  Val. Acc: 49.06%


100%|██████████| 219/219 [00:04<00:00, 49.66it/s]
100%|██████████| 94/94 [00:00<00:00, 141.97it/s]
  2%|▏         | 5/219 [00:00<00:04, 49.86it/s]

Epoch: 96 | Epoch Time: 0m 5s
	Train Loss: 0.679 | Train Acc: 51.62%
	 Val. Loss: 0.715 |  Val. Acc: 49.10%


100%|██████████| 219/219 [00:04<00:00, 49.87it/s]
100%|██████████| 94/94 [00:00<00:00, 150.05it/s]
  2%|▏         | 5/219 [00:00<00:04, 49.29it/s]

Epoch: 97 | Epoch Time: 0m 5s
	Train Loss: 0.677 | Train Acc: 50.90%
	 Val. Loss: 0.715 |  Val. Acc: 49.20%


100%|██████████| 219/219 [00:04<00:00, 50.32it/s]
100%|██████████| 94/94 [00:00<00:00, 148.97it/s]
  2%|▏         | 5/219 [00:00<00:04, 46.56it/s]

Epoch: 98 | Epoch Time: 0m 4s
	Train Loss: 0.678 | Train Acc: 51.88%
	 Val. Loss: 0.712 |  Val. Acc: 51.00%


100%|██████████| 219/219 [00:04<00:00, 50.23it/s]
100%|██████████| 94/94 [00:00<00:00, 145.22it/s]
  3%|▎         | 6/219 [00:00<00:04, 50.21it/s]

Epoch: 99 | Epoch Time: 0m 5s
	Train Loss: 0.679 | Train Acc: 51.33%
	 Val. Loss: 0.708 |  Val. Acc: 49.07%


100%|██████████| 219/219 [00:04<00:00, 50.89it/s]
100%|██████████| 94/94 [00:00<00:00, 146.72it/s]

Epoch: 100 | Epoch Time: 0m 4s
	Train Loss: 0.675 | Train Acc: 52.37%
	 Val. Loss: 0.721 |  Val. Acc: 49.02%





In [51]:
test_loss, test_acc = evaluate(model, test_iterator, criterion, device)

print(f'Test Loss: {test_loss:.3f} | Test Acc: {test_acc*100:.2f}%')

100%|██████████| 94/94 [00:00<00:00, 145.77it/s]

Test Loss: 0.729 | Test Acc: 48.82%





In [0]:
def predict_sentiment(model, sentence):
    model.eval()
    tokenized = [tok for tok in tokenizer(sentence)]
    indexed = vectorize(vocab, tokenized, one_sentence=True)
    tensor = torch.LongTensor(indexed).unsqueeze(0).to(device)
    prediction = torch.sigmoid(model(tensor))
    return prediction.item()

In [53]:
predict_sentiment(model, "이건 별로야")

0.03074686974287033

In [54]:
predict_sentiment(model, "이 영화는 너무 감동적이었어")

0.6928694248199463

# 3 - Convolutional Sentiment Analysis

In [0]:
import torch.nn.functional as F

class CNN(nn.Module):
    def __init__(self, vocab_size, embedding_dim, n_filters, filter_sizes, output_dim, 
                 dropout, pad_idx):
        
        super().__init__()
                
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx = pad_idx)
        
        self.convs = nn.ModuleList([
                                    nn.Conv2d(in_channels = 1, 
                                              out_channels = n_filters, 
                                              kernel_size = (fs, embedding_dim)) 
                                    for fs in filter_sizes
                                    ])
        
        self.fc = nn.Linear(len(filter_sizes) * n_filters, output_dim)
        
        self.dropout = nn.Dropout(dropout)
        
    def forward(self, text):
        
        #text = [batch size, sent len]
        
        embedded = self.embedding(text)
                
        #embedded = [batch size, sent len, emb dim]
        
        embedded = embedded.unsqueeze(1)
        
        #embedded = [batch size, 1, sent len, emb dim]
        
        conved = [F.relu(conv(embedded)).squeeze(3) for conv in self.convs]
            
        #conved_n = [batch size, n_filters, sent len - filter_sizes[n] + 1]
                
        pooled = [F.max_pool1d(conv, conv.shape[2]).squeeze(2) for conv in conved]
        
        #pooled_n = [batch size, n_filters]
        
        cat = self.dropout(torch.cat(pooled, dim = 1))

        #cat = [batch size, n_filters * len(filter_sizes)]
            
        return self.fc(cat)

In [0]:
INPUT_DIM = len(vocab)
EMBEDDING_DIM = 100
N_FILTERS = 100
FILTER_SIZES = [3,4,5]
OUTPUT_DIM = 1
DROPOUT = 0.5
PAD_IDX = vocab["<pad>"]

model = CNN(INPUT_DIM, EMBEDDING_DIM, N_FILTERS, FILTER_SIZES, OUTPUT_DIM, DROPOUT, PAD_IDX)

In [57]:
def count_parameters(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

print(f'The model has {count_parameters(model):,} trainable parameters')

The model has 1,340,901 trainable parameters


In [0]:
UNK_IDX = vocab["<unk>"]

model.embedding.weight.data[UNK_IDX] = torch.zeros(EMBEDDING_DIM)
model.embedding.weight.data[PAD_IDX] = torch.zeros(EMBEDDING_DIM)

##Train the Model

In [0]:
import torch.optim as optim

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

optimizer = optim.Adam(model.parameters())

criterion = nn.BCEWithLogitsLoss()

model = model.to(device)
criterion = criterion.to(device)

In [68]:
N_EPOCHS = 5

best_valid_loss = float('inf')

for epoch in range(N_EPOCHS):

    start_time = time.time()
    
    train_loss, train_acc = train(model, train_iterator, optimizer, criterion, device)
    valid_loss, valid_acc = evaluate(model, valid_iterator, criterion, device)
    
    end_time = time.time()

    epoch_mins, epoch_secs = epoch_time(start_time, end_time)
    
    if valid_loss < best_valid_loss:
        best_valid_loss = valid_loss
    
    print(f'Epoch: {epoch+1:02} | Epoch Time: {epoch_mins}m {epoch_secs}s')
    print(f'\tTrain Loss: {train_loss:.3f} | Train Acc: {train_acc*100:.2f}%')
    print(f'\t Val. Loss: {valid_loss:.3f} |  Val. Acc: {valid_acc*100:.2f}%')

100%|██████████| 219/219 [00:02<00:00, 98.78it/s] 
100%|██████████| 94/94 [00:00<00:00, 414.99it/s]
  4%|▎         | 8/219 [00:00<00:02, 74.71it/s]

Epoch: 01 | Epoch Time: 0m 2s
	Train Loss: 0.007 | Train Acc: 99.83%
	 Val. Loss: 3.166 |  Val. Acc: 77.71%


100%|██████████| 219/219 [00:02<00:00, 103.44it/s]
100%|██████████| 94/94 [00:00<00:00, 390.21it/s]
  6%|▌         | 13/219 [00:00<00:01, 123.55it/s]

Epoch: 02 | Epoch Time: 0m 2s
	Train Loss: 0.007 | Train Acc: 99.70%
	 Val. Loss: 3.207 |  Val. Acc: 77.89%


100%|██████████| 219/219 [00:02<00:00, 100.18it/s]
100%|██████████| 94/94 [00:00<00:00, 395.69it/s]
  6%|▋         | 14/219 [00:00<00:01, 131.42it/s]

Epoch: 03 | Epoch Time: 0m 2s
	Train Loss: 0.006 | Train Acc: 99.79%
	 Val. Loss: 3.316 |  Val. Acc: 78.26%


100%|██████████| 219/219 [00:02<00:00, 101.93it/s]
100%|██████████| 94/94 [00:00<00:00, 391.29it/s]
  6%|▋         | 14/219 [00:00<00:01, 132.50it/s]

Epoch: 04 | Epoch Time: 0m 2s
	Train Loss: 0.005 | Train Acc: 99.81%
	 Val. Loss: 3.230 |  Val. Acc: 78.04%


100%|██████████| 219/219 [00:02<00:00, 100.79it/s]
100%|██████████| 94/94 [00:00<00:00, 392.95it/s]

Epoch: 05 | Epoch Time: 0m 2s
	Train Loss: 0.005 | Train Acc: 99.86%
	 Val. Loss: 3.217 |  Val. Acc: 77.94%





In [69]:
test_loss, test_acc = evaluate(model, test_iterator, criterion, device)

print(f'Test Loss: {test_loss:.3f} | Test Acc: {test_acc*100:.2f}%')

100%|██████████| 94/94 [00:00<00:00, 355.45it/s]

Test Loss: 2.964 | Test Acc: 79.46%





In [72]:
predict_sentiment(model, df_day['content'][2])

3.219496875317418e-06

In [91]:
predict_sentiment(model, '뉴욕 증시가 폭락중')

0.8541789650917053