
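## 6.5. Concise Implementation of Recurrent Neural Networks
This section uses tf.keras to implement a language model based on a recurrent neural network more concisely. First, we load the Jay Chou album lyrics dataset.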

In [0]:
%matplotlib inline
import math
import tensorflow as tf
import numpy as np
from IPython import display
import matplotlib.pyplot as plt
from tensorflow import keras
from tensorflow.keras import losses
from tensorflow.data import Dataset
import time
import random
import zipfile

In [0]:
tf.enable_eager_execution()

In [0]:
def load_data_jay_lyrics():
  # Mount Google Drive so the zipped lyrics file can be read from Colab.
  from google.colab import drive
  drive.mount('/content/drive')
  with zipfile.ZipFile('/content/drive/My Drive/data/d2l-zh-tensoflow/jaychou_lyrics.txt.zip') as zin:
    with zin.open('jaychou_lyrics.txt') as f:
      corpus_chars=f.read().decode('utf-8')
  # Replace line breaks with spaces and keep only the first 10,000 characters.
  corpus_chars=corpus_chars.replace('\n',' ').replace('\r',' ')
  corpus_chars=corpus_chars[0:10000]
  # Build the character vocabulary and the two lookup tables.
  idx_to_char=list(set(corpus_chars))
  char_to_idx=dict([(char,i) for i,char in enumerate(idx_to_char)])
  vocab_size=len(char_to_idx)
  # Encode the corpus as a list of integer character indices.
  corpus_indices=[char_to_idx[char] for char in corpus_chars]
  return corpus_indices,char_to_idx,idx_to_char,vocab_size

(corpus_indices,char_to_idx,idx_to_char,vocab_size)=load_data_jay_lyrics() 

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [0]:
seq_length=100
# Each example is a chunk of seq_length+1 characters (input plus shifted target).
examples_per_epoch=len(corpus_indices) // (seq_length+1)
char_dataset=Dataset.from_tensor_slices(np.array(corpus_indices))
sequences=char_dataset.batch(seq_length+1,drop_remainder=True)
def split_input_target(chunk):
  # The target is the input shifted one character to the left.
  input_text=chunk[:-1]
  target_text=chunk[1:]
  return input_text,target_text
dataset=sequences.map(split_input_target)
BATCH_SIZE=64
setps_per_epoch=examples_per_epoch // BATCH_SIZE
BUFFER_SIZE=10000
# Shuffle the chunks, then group them into fixed-size batches.
dataset=dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE,drop_remainder=True)
# Inspect one (input, target) batch.
for x in dataset:
  print(x)
  break

(<tf.Tensor: id=998933, shape=(64, 100), dtype=int64, numpy=
array([[ 734,  921,  813, ...,  257,  986,  807],
       [ 734,  595,  418, ...,   32,  874,  807],
       [ 450,  794,   12, ...,  572,  807,  932],
       ...,
       [ 869,   89,  212, ...,  986, 1005,  807],
       [ 203,  652,  841, ...,  872,  807,  654],
       [ 104,  257,  807, ...,  230,   21,  257]])>, <tf.Tensor: id=998934, shape=(64, 100), dtype=int64, numpy=
array([[ 921,  813,  807, ...,  986,  807,  257],
       [ 595,  418,  806, ...,  874,  807,  631],
       [ 794,   12,  526, ...,  807,  932,  558],
       ...,
       [  89,  212,  943, ..., 1005,  807,  388],
       [ 652,  841,  807, ...,  807,  654,  433],
       [ 257,  807,  583, ...,   21,  257,  807]])>)
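
To make the input/target pairing concrete, here is a minimal pure-Python sketch of the same chunk-and-shift idea on a toy sequence. The helper name `make_pairs` and the toy values are illustrative assumptions, not part of the notebook, which does this with `Dataset.batch` and `split_input_target`.

```python
def make_pairs(indices, seq_length):
    """Cut `indices` into chunks of seq_length + 1 and split each chunk
    into (input, target), where target is input shifted left by one."""
    chunk = seq_length + 1
    pairs = []
    # Skip the incomplete chunk at the end, like drop_remainder=True.
    for start in range(0, len(indices) - chunk + 1, chunk):
        piece = indices[start:start + chunk]
        pairs.append((piece[:-1], piece[1:]))
    return pairs

# Toy corpus of ten "character indices" (hypothetical data).
toy_pairs = make_pairs(list(range(10)), seq_length=3)
for inp, tgt in toy_pairs:
    print(inp, "->", tgt)
```

Assuming the corpus really holds 10,000 characters, `seq_length=100` gives 99 such chunks, so with a batch size of 64 and `drop_remainder=True` each pass over the data yields a single batch.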


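### 6.5.1. Defining the Model
The layers module of Keras provides implementations of recurrent neural networks. Below we construct a recurrent neural network layer with a single hidden layer of 256 hidden units, then compile and train the model.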

In [0]:
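num_hiddens=256
embedding_dim=256  # defined here, but the Embedding layer below uses output_dim=vocab_size
net=keras.Sequential()
# Map each character index to a dense vector; the batch shape is fixed because the RNN is stateful.
net.add(keras.layers.Embedding(input_dim=vocab_size,output_dim=vocab_size,batch_input_shape=(BATCH_SIZE,seq_length)))
# Recurrent layer that returns the hidden state at every time step.
net.add(keras.layers.SimpleRNN(num_hiddens,unroll=True,return_sequences=True,stateful=True))
# Per-time-step probability distribution over the vocabulary.
net.add(keras.layers.Dense(vocab_size,activation='softmax'))
net.compile(optimizer=keras.optimizers.Adam(),loss=losses.SparseCategoricalCrossentropy(),metrics=['acc'])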
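
For intuition about what the recurrent layer computes at each time step, the following sketch applies the standard simple-RNN update h_t = tanh(x_t W_xh + h_{t-1} W_hh + b) using plain Python lists. The names `rnn_step`, `rnn_forward`, `W_xh`, `W_hh`, `b` and the toy weights are assumptions chosen for illustration; Keras stores the corresponding parameters as the layer's kernel, recurrent kernel, and bias.

```python
import math

def rnn_step(x, h_prev, W_xh, W_hh, b):
    """One simple-RNN step: h = tanh(x @ W_xh + h_prev @ W_hh + b)."""
    h = []
    for j in range(len(b)):
        total = b[j]
        total += sum(x[i] * W_xh[i][j] for i in range(len(x)))
        total += sum(h_prev[k] * W_hh[k][j] for k in range(len(h_prev)))
        h.append(math.tanh(total))
    return h

def rnn_forward(xs, h0, W_xh, W_hh, b):
    """Run the step over a sequence and collect every hidden state,
    analogous to return_sequences=True."""
    states, h = [], h0
    for x in xs:
        h = rnn_step(x, h, W_xh, W_hh, b)
        states.append(h)
    return states

# Toy sizes: 2 input features, 3 hidden units (hypothetical values).
W_xh = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.2]]
W_hh = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.5]]
b = [0.0, 0.0, 0.0]
states = rnn_forward([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0, 0.0], W_xh, W_hh, b)
print(len(states), len(states[0]))
```

Each hidden state depends on the previous one, which is how the layer carries information from earlier characters forward through the sequence.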

In [0]:
# fit accepts a tf.data Dataset directly; fit_generator is deprecated in later TensorFlow releases.
net.fit(dataset.repeat(),steps_per_epoch=setps_per_epoch,epochs=500)

In [0]:
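# Rebuild the same architecture with batch size 1 and sequence length 1,
# so the model can generate text one character at a time.
new_net=keras.Sequential()
new_net.add(keras.layers.Embedding(input_dim=vocab_size,output_dim=vocab_size,batch_input_shape=(1,1)))
new_net.add(keras.layers.SimpleRNN(num_hiddens,unroll=True,return_sequences=True,stateful=True))
new_net.add(keras.layers.Dense(vocab_size,activation='softmax'))
# Copy the trained weights; the weight shapes do not depend on batch size or sequence length.
new_net.set_weights(net.get_weights())
new_net.compile(optimizer=keras.optimizers.Adam(),loss=losses.SparseCategoricalCrossentropy(),metrics=['acc'])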

### 6.5.2. Text Generation

In [0]:
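text_generated=['分','开']  # prefix characters
for i in range(10):
  # Feed the most recent character; the stateful RNN keeps its hidden state between calls.
  idx=char_to_idx[text_generated[-1]]
  probs=new_net.predict(np.array([[idx]]))  # shape (1, 1, vocab_size)
  # Greedy choice: take the most probable next character.
  next_idx=int(np.argmax(probs[0,-1]))
  text_generated.append(idx_to_char[next_idx])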

In [0]:
idx_to_char[tf.argmax(new_net.predict(tf.constant(value=[1]))[0],axis=-1).numpy()[0]]

'底'

In [0]:
text_generated

['分', '开', '不', '了', '口', ' ', '周', '杰', '伦', ' ', '才', '离']
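
The generation loop in this section is a form of greedy decoding: at each step it picks the most probable next character and feeds that character back in. The sketch below shows the same loop in isolation, with a hypothetical hand-written probability table (`toy_probs`) standing in for `new_net.predict`. Unlike the stateful RNN, this toy table conditions only on the latest character, so it illustrates the decoding loop rather than the model.

```python
def greedy_generate(prefix, next_probs, num_chars):
    """Extend `prefix` by repeatedly appending the most probable next
    character given only the latest character."""
    out = list(prefix)
    for _ in range(num_chars):
        probs = next_probs[out[-1]]            # dict: char -> probability
        out.append(max(probs, key=probs.get))  # argmax over candidates
    return out

# Hypothetical next-character distributions for a three-letter vocabulary.
toy_probs = {
    "a": {"a": 0.1, "b": 0.7, "c": 0.2},
    "b": {"a": 0.2, "b": 0.1, "c": 0.7},
    "c": {"a": 0.6, "b": 0.3, "c": 0.1},
}
generated = greedy_generate(["a"], toy_probs, 4)
print("".join(generated))
```

Greedy decoding is deterministic, so the same prefix always produces the same continuation; sampling from the predicted distribution instead of taking the argmax is a common alternative when more varied text is wanted.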