## Text generation with an RNN
* This chapter uses a neural network called seq2seq.
* It can be applied to many applications, such as machine translation and chatbots.

In [1]:
import os

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# Work from the book's repository root so that `common`, `ch06`, and `dataset` can be imported.
os.chdir("./deep-learning-from-scratch-2-master/")

In [2]:
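from common.functions import softmax
from ch06.rnnlm import Rnnlm
from ch06.better_rnnlm import BetterRnnlm


class RnnlmGen(Rnnlm):
    def generate(self, start_id, skip_ids=None, sample_size=100):
        """Sample word IDs one at a time, starting from start_id."""
        word_ids = [start_id]

        x = start_id
        while len(word_ids) < sample_size:
            # predict() takes a 2-D array, so feed the current word as shape (1, 1).
            x = np.array(x).reshape(1, 1)
            score = self.predict(x)
            p = softmax(score.flatten())

            # Draw the next word ID from the predicted distribution.
            # Converting to a plain int avoids calling int() on a 1-element array.
            sampled = int(np.random.choice(len(p), p=p))
            if (skip_ids is None) or (sampled not in skip_ids):
                x = sampled
                word_ids.append(sampled)

        return word_ids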
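
The sampling loop in `generate` can be tried without a trained model. The following is a minimal sketch, assuming a fixed toy probability vector in place of `softmax(self.predict(x))`. The names `sample_ids`, `probs`, and `rng` are hypothetical and not part of the book's code.

```python
import numpy as np


def sample_ids(probs, start_id, skip_ids=None, sample_size=10, rng=None):
    """Draw IDs from a fixed distribution, rejecting any ID in skip_ids.

    Hypothetical stand-in for RnnlmGen.generate: a real language model would
    recompute probs from the previously sampled word at every step.
    """
    rng = np.random.default_rng() if rng is None else rng
    skip = set() if skip_ids is None else set(skip_ids)
    ids = [start_id]
    while len(ids) < sample_size:
        sampled = int(rng.choice(len(probs), p=probs))
        if sampled not in skip:  # rejected draws are simply retried
            ids.append(sampled)
    return ids


probs = np.array([0.1, 0.2, 0.3, 0.4])  # toy distribution over 4 word IDs
ids = sample_ids(probs, start_id=0, skip_ids=[3], sample_size=10,
                 rng=np.random.default_rng(0))
print(ids)
```

Because a rejected draw is retried rather than recorded, the skipped ID never appears in the result, and the output always reaches `sample_size` entries.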

In [3]:
from dataset import ptb

corpus, word_to_id, id_to_word = ptb.load_data("train")
vocab_size = len(word_to_id)
corpus_size = len(corpus)

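# No trained parameters are loaded, so the weights keep their random initial values.
model = RnnlmGen()

start_word = "you"
start_id = word_to_id[start_word]
# Exclude the PTB placeholders "N" (numbers) and "<unk>" (rare words), plus "$".
skip_words = ["N", "<unk>", "$"]
skip_ids = [word_to_id[w] for w in skip_words]

word_ids = model.generate(start_id, skip_ids)
txt = " ".join([id_to_word[i] for i in word_ids])
txt = txt.replace(" <eos>", ".\n")
print(txt)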

you newsletter executing stir enter gerrymandering digital whittle sees prevailed salvador rubble hunt publicized prominent panamanian contracted thrust sydney indicted cabrera contrasts even misses securities turbulence jose provigo bearing impression p&g issuance spin statistical acted spielvogel lying microsoft secretaries thousands notable authority food historically yet flags relieve disabled statute duty legislature statewide wound painfully fat skidded still shared manufacture supposedly lunch seattle mills reuter trend shouting observes surveyed huge island secret bids westridge bush intervene cable-tv capable analyzing prudent cruise concerning seen wonder monthly cap boesel violent memories tendered troop rapidly line-item manhattan binge pressed price jump uncertainty group track
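
The model in this cell is created without calling `load_params`, so its weights are still random initial values and the generated words look like near-random draws from the vocabulary. The sketch below illustrates that effect with NumPy alone. It assumes that an untrained network produces small, uninformative scores; the score vectors, `vocab_size`, and helper names are made up for illustration and are not taken from the model.

```python
import numpy as np


def softmax_1d(x):
    """Numerically stable softmax for a 1-D score vector."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def entropy(p):
    """Shannon entropy in nats; higher means closer to uniform."""
    return float(-np.sum(p * np.log(p + 1e-12)))


rng = np.random.default_rng(0)
vocab_size = 1000  # toy vocabulary size (assumption)

# Assumed stand-in for an untrained model: tiny scores with no preferred word.
untrained_scores = rng.normal(scale=0.01, size=vocab_size)
# Assumed stand-in for a trained model: one word is strongly preferred.
peaked_scores = untrained_scores.copy()
peaked_scores[42] += 10.0

h_untrained = entropy(softmax_1d(untrained_scores))
h_peaked = entropy(softmax_1d(peaked_scores))
print(h_untrained, np.log(vocab_size), h_peaked)
```

The first entropy sits almost exactly at `log(vocab_size)`, the value for a uniform distribution, so sampling from it picks words nearly at random. The peaked distribution has much lower entropy.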


In [4]:
corpus, word_to_id, id_to_word = ptb.load_data("train")
vocab_size = len(word_to_id)
corpus_size = len(corpus)

# Same generator as before, but with saved parameters loaded before sampling.
model = RnnlmGen()
model.load_params("./ch06/Rnnlm.pkl")

start_word = "you"
start_id = word_to_id[start_word]
skip_words = ["N", "<unk>", "$"]
skip_ids = [word_to_id[w] for w in skip_words]

word_ids = model.generate(start_id, skip_ids)
txt = ' '.join([id_to_word[i] for i in word_ids])
txt = txt.replace(" <eos>", ".\n")
print(txt)

you get the idea of that negotiations with the transition.
 the ual board collapsed about five shares at a promotion.
 the army supreme court merchants policy was responsible for several years to shy a shift.
 this man britain the new york complaint revenues will guarantee an additional wage through the positive form of assets but it familiar with how to piece on principal voters and does mr. roman 's mediator back to stay out of our.
 the negative time mr. honecker says.
 that differently is too far enough to join the rubles of the abundant
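
The post-processing used in both cells (joining word IDs into a string, then turning `<eos>` into a sentence break) can be checked on a toy vocabulary. This is a minimal sketch; the `id_to_word` mapping and `word_ids` list here are hypothetical, whereas the notebook obtains `id_to_word` from `ptb.load_data`.

```python
# Toy vocabulary (hypothetical); the notebook gets id_to_word from ptb.load_data.
id_to_word = {0: "you", 1: "say", 2: "goodbye", 3: "<eos>", 4: "i", 5: "hello"}
word_ids = [0, 1, 2, 3, 4, 1, 5, 3]

txt = " ".join(id_to_word[i] for i in word_ids)
# Only the space *before* "<eos>" is consumed; the one after it is kept.
txt = txt.replace(" <eos>", ".\n")
print(repr(txt))
```

This prints `'you say goodbye.\n i say hello.\n'`. Each line after the first begins with a space because the replacement removes only the space before `<eos>`, which matches the leading spaces in the generated text above.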
