# PyTorch many-to-many example by wygo
Uses a basic RNN; to be updated to LSTM and BiLSTM later.
Adding dropout may improve performance.

In [1]:
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable  # no-op wrapper since PyTorch 0.4; kept for the cells below
dtype = torch.FloatTensor

In [49]:
## Generate random data
batch_size = 3
sequence_length = 4
ont_hot_embedding_size = 5
label_size = 4  # start, end, middle, none
n_hidden = 7  # number of hidden units in one cell

X, Y = [], []
for i in range(batch_size):
    # Build one sequence of one-hot vectors
    sequence = []
    # Draw a random token id for every time step
    temp = np.random.choice(ont_hot_embedding_size, sequence_length)
    for ii in range(sequence_length):
        sequence.append(np.eye(ont_hot_embedding_size)[temp[ii]])

    # Add the sequence and its labels to the batch
    X.append(sequence)
    Y.append(np.random.choice(label_size, sequence_length))  # one random label in 0..3 per time step

# X : (batch_size, sequence_length, ont_hot_embedding_size)
# Y : (batch_size, sequence_length)

# Convert to torch tensors
input_batch = torch.tensor(np.array(X), dtype=torch.float32)
target_batch = torch.tensor(np.array(Y), dtype=torch.long)

print('input  shape : ', input_batch.shape)
print('output shape : ', target_batch.shape)

input  shape :  torch.Size([3, 4, 5])
output shape :  torch.Size([3, 4])
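
The nested loops above can also be written without Python-level iteration. The following is a minimal sketch, assuming NumPy's `default_rng` API; the helper name `make_random_batch` and the `seed` parameter are hypothetical and not part of the original notebook.

```python
import numpy as np

def make_random_batch(batch_size, sequence_length, embedding_size, label_size, seed=0):
    """Return random one-hot inputs X and integer labels Y (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # One random token id per (sample, time step)
    token_ids = rng.integers(0, embedding_size, size=(batch_size, sequence_length))
    # Indexing the identity matrix with an id array yields the one-hot rows
    X = np.eye(embedding_size, dtype=np.float32)[token_ids]
    # One random label per time step (many-to-many target)
    Y = rng.integers(0, label_size, size=(batch_size, sequence_length))
    return X, Y

X_demo, Y_demo = make_random_batch(3, 4, 5, 4)
print(X_demo.shape, Y_demo.shape)  # (3, 4, 5) (3, 4)
```

Indexing `np.eye(n)` with an integer array is the same trick the loop uses one element at a time, applied to the whole batch at once.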


In [52]:
class TextRNN(nn.Module):
    def __init__(self):
        super(TextRNN, self).__init__()

        self.rnn = nn.RNN(input_size=ont_hot_embedding_size, hidden_size=n_hidden)
        self.fc = nn.Linear(n_hidden, label_size)
        
    def forward(self, hidden, X):
        # X    : (batch_size, sequence_length, ont_hot_embedding_size)
        # input: (sequence_length, batch_size, ont_hot_embedding_size)
        input = X.transpose(0, 1)
        
        # input  : (sequence_length, batch_size, ont_hot_embedding_size)
        # outputs: (sequence_length, batch_size, n_hidden)
        # hidden : (num_layers(=1) * num_directions(=1), batch_size, n_hidden)
        outputs, hidden = self.rnn(input, hidden)
        
        # many-to-one variant: keep only the last time step
        # outputs = outputs[-1]
        
        # model : [sequence_length, batch_size, label_size]
        model = self.fc(outputs)        
        return model
    

model = TextRNN()
print(model)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

TextRNN(
  (rnn): RNN(5, 7)
  (fc): Linear(in_features=7, out_features=4, bias=True)
)
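
The title notes mention a later update to LSTM/BiLSTM with dropout. One possible shape for that variant is sketched below; it is not the author's implementation. The class name `TextBiLSTM`, the `dropout=0.5` default, and the use of `batch_first=True` (which removes the need for the transpose) are all assumptions made for illustration.

```python
import torch
import torch.nn as nn

class TextBiLSTM(nn.Module):
    """Hypothetical many-to-many tagger: BiLSTM -> dropout -> linear layer."""

    def __init__(self, input_size=5, hidden_size=7, label_size=4, dropout=0.5):
        super().__init__()
        # batch_first=True lets the module take (batch, seq, feature) directly
        self.lstm = nn.LSTM(input_size=input_size, hidden_size=hidden_size,
                            bidirectional=True, batch_first=True)
        self.dropout = nn.Dropout(dropout)
        # Forward and backward hidden states are concatenated, hence 2 * hidden_size
        self.fc = nn.Linear(2 * hidden_size, label_size)

    def forward(self, X):
        # outputs: (batch_size, sequence_length, 2 * hidden_size)
        outputs, _ = self.lstm(X)
        # logits: (batch_size, sequence_length, label_size)
        return self.fc(self.dropout(outputs))

demo_model = TextBiLSTM()
demo_logits = demo_model(torch.zeros(3, 4, 5))
print(demo_logits.shape)  # torch.Size([3, 4, 4])
```

When the initial state is omitted, `nn.LSTM` starts from zeros, so no explicit `hidden` tensor has to be passed in.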


In [53]:
# Training
for epoch in range(10000):
    optimizer.zero_grad()

    # hidden : [num_layers * num_directions, batch_size, n_hidden]
    hidden = torch.zeros(1, batch_size, n_hidden)

    # input_batch : [batch_size, sequence_length, ont_hot_embedding_size]
    output = model(hidden, input_batch)

    # output : [sequence_length, batch_size, label_size]
    #        -> [batch_size, sequence_length, label_size]
    output = output.transpose(0, 1)
    loss = 0
    # Loop over the batch and sum the per-sample losses
    for i in range(0, batch_size):
        # output[i] : [sequence_length, label_size], target_batch[i] : [sequence_length]
        loss += criterion(output[i], target_batch[i])

    if (epoch + 1) % 1000 == 0:
        print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.6f}'.format(loss.item()))

    loss.backward()
    optimizer.step()


Epoch: 1000 cost = 0.514071
Epoch: 2000 cost = 0.383276
Epoch: 3000 cost = 0.361003
Epoch: 4000 cost = 0.353580
Epoch: 5000 cost = 0.350328
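
The per-sample loop in the training cell can be replaced by a single call on flattened tensors. The sketch below uses random stand-in tensors (`logits`, `targets`) rather than the trained model, and checks that the looped sum equals `batch_size` times the flattened mean when every sequence has the same length.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch_size, sequence_length, label_size = 3, 4, 4
# Stand-ins for the transposed model output and the target batch
logits = torch.randn(batch_size, sequence_length, label_size)
targets = torch.randint(0, label_size, (batch_size, sequence_length))

criterion = nn.CrossEntropyLoss()

# Loop form: sum of the per-sample mean losses
loop_loss = sum(criterion(logits[i], targets[i]) for i in range(batch_size))

# Flattened form: mean over all batch_size * sequence_length positions
flat_loss = criterion(logits.reshape(-1, label_size), targets.reshape(-1))

print(torch.allclose(loop_loss, batch_size * flat_loss))  # True
```

The two forms differ only by the constant factor `batch_size`, which rescales the gradients; with Adam this mostly affects the effective learning rate rather than the direction of the update.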


In [119]:
# hidden : [num_layers * num_directions, batch, hidden_size]
hidden = Variable(torch.zeros(1, batch_size, n_hidden))
output = model(hidden, input_batch)
# output : [sequence_length, batch_size, label_size]
print('output  : [sequence_length, batch_size, label_size]')
print('output  :',output.shape)

output  : [sequence_length, batch_size, label_size]
output  : torch.Size([4, 3, 4])


In [115]:
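predict = output.data.max(2, keepdim=True)[1]  # index of the highest-scoring label
predict = predict.transpose(0, 1)  # [batch_size, sequence_length, 1]

print(predict.shape)

for i in range(batch_size):
    # Every position indexes sample i. The recorded output below was produced with
    # target_batch[0][3] in the last slot, so its final "True" value is sample 0's label.
    print('\nTrue : %s  /%s  /%s  /%s' %(target_batch[i][0], target_batch[i][1], target_batch[i][2], target_batch[i][3]))
    print('Pred : %s/%s/%s/%s' %(predict[i][0], predict[i][1], predict[i][2], predict[i][3]))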

torch.Size([3, 4, 1])

True : tensor(0)  /tensor(1)  /tensor(0)  /tensor(1)
Pred : tensor([0])/tensor([1])/tensor([0])/tensor([1])

True : tensor(3)  /tensor(3)  /tensor(2)  /tensor(1)
Pred : tensor([0])/tensor([3])/tensor([2])/tensor([3])

True : tensor(0)  /tensor(0)  /tensor(2)  /tensor(1)
Pred : tensor([0])/tensor([0])/tensor([2])/tensor([0])
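
Comparing printed tensors by eye gets tedious for larger batches, so a per-position accuracy can summarize the same comparison. This is a minimal sketch; the helper name `token_accuracy` is hypothetical, and the demo tensors are hand-made rather than taken from the trained model.

```python
import torch

def token_accuracy(logits, targets):
    """Fraction of positions where the argmax label matches the target.

    logits : (sequence_length, batch_size, label_size)
    targets: (batch_size, sequence_length)
    """
    # argmax over the label axis, then swap to (batch_size, sequence_length)
    predict = logits.argmax(dim=2).transpose(0, 1)
    return (predict == targets).float().mean().item()

# Hand-made example: the model always scores label 2 highest
demo_logits = torch.zeros(4, 3, 4)
demo_logits[:, :, 2] = 1.0
demo_targets = torch.full((3, 4), 2, dtype=torch.long)
demo_targets[0, 0] = 1  # one deliberate mismatch out of 12 positions

print(token_accuracy(demo_logits, demo_targets))
```

With one mismatch out of 12 positions, the result is 11/12.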


In [None]:
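# Expanded version of the three lines above: each byte (256 possible values) becomes
# an 8-element bit vector, because a 256-dimensional one-hot encoding would be too large
# 255 -> 1/16 * [1, 1, 1, 1, 1, 1, 1, 1]
# seq, encode_val, new_data and j are defined outside this cell

# Convert the byte sequence into a sequence of 8-bit vectors
seq_oct = []
for byte in seq:
    # byte -> 8 bits, most significant bit first
    byte2oct = []
    for p in range(7, -1, -1):
        # (1 << p) is 2^p; the test is true when bit p of byte is set
        if byte & (1<<p):
            byte2oct.append(encode_val)
        else:
            byte2oct.append(-encode_val)
    seq_oct.append(byte2oct)
new_data[j, :len(seq), :] = seq_oct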
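
Since the cell above depends on names defined elsewhere, here is a self-contained sketch of the same bit encoding using `np.unpackbits`. The function name `bytes_to_signed_bits` is hypothetical, and the default `encode_val = 1/16` is an assumption taken from the `255 -> 1/16 * [...]` comment.

```python
import numpy as np

def bytes_to_signed_bits(seq, encode_val=1.0 / 16):
    """Map each byte to 8 values: +encode_val for a set bit, -encode_val otherwise."""
    arr = np.asarray(list(seq), dtype=np.uint8)
    # unpackbits is MSB-first by default, matching range(7, -1, -1) in the loop
    bits = np.unpackbits(arr[:, None], axis=1)  # shape (len(seq), 8), values 0 or 1
    return np.where(bits == 1, encode_val, -encode_val).astype(np.float32)

demo = bytes_to_signed_bits(b"\xff\x01")
print(demo.shape)  # (2, 8)
print(demo[1])     # seven negative entries, then one positive entry
```

The result has one row per byte, so it can be assigned to a slice such as `new_data[j, :len(seq), :]` in the same way as `seq_oct`.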