# Poetry Generation
In this project, I'll present a way of generating Chinese poetry using RNNs.
## Get the Data
The data is alread privoded. This dataset include more than 40000 Chinese poetries without title. Poetries are seperated by '\n'

In [1]:
import helper

data_dir = "./data/five_words_poetries.txt"
text = helper.load_data(data_dir)
text = text.replace("\n", "")

## Explore the Data

In [2]:
import numpy as np

view_poetry_range = (100, 110)

print('Dataset Stats')
print('Roughly the number of unique Chinese Character: {}'.format(len({word: None for word in list(text)})))

poetries = text.split('\n')
print('Number of poetries: {}'.format(len(poetries)))

charactors_count = [len(list(poetry)) for poetry in poetries]
print("Averate number of charactors in each poetry: {}".format(np.average(charactors_count)))
print()
print('The sentences {} to {}:'.format(*view_poetry_range))
print('\n'.join(poetries[view_poetry_range[0]:view_poetry_range[1]]))

Dataset Stats
Roughly the number of unique Chinese Character: 3015
Number of poetries: 1
Averate number of charactors in each poetry: 55224.0

The sentences 100 to 110:



## Preprocessing Functions
- Lookup Table

### Lookup Table
- Dictionary to go from the charactors to an id, we'll call `vocab_to_int`
- Dictionary to go from the id to charactors, we'll call `int_to_vocab`

Return these dictionaries in the following tuple (vocab_to_int, int_to_vocab)

In [3]:
import numpy as np
import problem_unittests as tests

def create_lookup_tables(text):
    vocab = set(text)
    vocab_to_int = {word: i for i, word in enumerate(vocab)}
    int_to_vocab = dict(enumerate(vocab))
    return vocab_to_int, int_to_vocab

tests.test_create_lookup_tables(create_lookup_tables)

Test Passed


## Preprocess all the data and save it

In [4]:
helper.preprocess_and_save_data(data_dir, create_lookup_tables)

## Check Point
The preprocessed data has been saved to disk

In [5]:
import helper
import numpy as np
import problem_unittests as tests

int_text, vocab_to_int, int_to_vocab = helper.load_preprocess()

## Build the Neural Network
Build the network by following functions below:
- get_inputs
- get_init_cell
- get_embed
- build_rnn
- build_nn
- get_batches

### Check the Version of Tensorflow and Access to GPU

In [6]:
from distutils.version import LooseVersion
import warnings
import tensorflow as tf

assert LooseVersion(tf.__version__) >= LooseVersion('1.0'), 'Please use Tensorflow version 1.0 or newer'
print('Tensorflow Version: {}'.format(tf.__version__))

if not tf.test.gpu_device_name():
    warnings.warn('No GPU found. Please use a GPU to train your neural network.')
else:
    print('Default GPU Device: {}'.format(tf.test.gpu_device_name()))

Tensorflow Version: 1.1.0
Default GPU Device: /gpu:0


## Input
Input create the following placeholders:
- Input text placeholder named "input" using the TF Placeholder name paramteter.
- Targets placeholder
- Learning Rate placeholder

return the placeholders in the following tuple (Input, Targets, LearningRate)

In [7]:
def get_inputs():
    input_placeholder = tf.placeholder(tf.int32, [None, None], name='input')
    target_placeholder = tf.placeholder(tf.int32, [None, None], name='target')
    learning_rate_placeholder = tf.placeholder(tf.float32, name='learning_rate')
    return input_placeholder, target_placeholder, learning_rate_placeholder

tests.test_get_inputs(get_inputs)

(?, ?)
Test Passed


## Build RNN Cell and Initialize
Stack one or more BasicLSTMCells in a MultiRNNCell
- The Rnn size shoulde be set using rnn_size
- Initialize Cell State using the MultiRNNCell's zero_state() function
    - Apply the name "initial_state" to the initial state using tf.identity()
   
Return the cell and inital state in the following tuple `(Cell, InitialState)`

In [8]:
def get_init_cell(batch_size, rnn_size):
    num_layers = 2
    def create_cell(rnn_size):
        lstm = tf.contrib.rnn.BasicLSTMCell(rnn_size)
        lstm = tf.contrib.rnn.DropoutWrapper(lstm, output_keep_prob=0.75)
        return lstm
    cell = tf.contrib.rnn.MultiRNNCell([create_cell(rnn_size) for _ in range(num_layers)])
    initial_state = cell.zero_state(batch_size, tf.float32)
    initial_state = tf.identity(initial_state, name="initial_state")
    return cell, initial_state

tests.test_get_init_cell(get_init_cell)

Test Passed


## Word Embedding
Apply embedding to input_data using Tensorflow. Return the embedded sequence.

In [9]:
def get_embed(input_data, vocab_size, embed_dim):
    embed = tf.contrib.layers.embed_sequence(input_data, vocab_size=vocab_size, embed_dim=embed_dim)
    return embed

tests.test_get_embed(get_embed)

Test Passed


## Build RNN
Time to use the cell to create a RNN.
- Build the RNN using the tf.nn.dynamic_rnn()
    - Apply the name "final_state" to the final state using tf.identity()
    
Return the outputs and final_state state in the folowing tuple `(Outpus, FinalState)`

In [10]:
def build_rnn(cell, inputs):
    outputs, state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)
    final_state = tf.identity(state, name="final_state")
    return outputs, final_state

tests.test_build_rnn(build_rnn)

Test Passed


## Build the Neural Network

- Apply embedding to input_data using `get_embed(input_data, vocab_size, embed_dim)` function.
- Build RNN using `cell` and your `build_rnn(cell, inputs)` function
- Apply a fully connected laye3r with a linear activation and `vocab_size` as the number of outputs.

Return the logits and final state in the following tuple (Logits, FinalState)

In [11]:
def build_nn(cell, rnn_size, input_data, vocab_size, embed_dim):
    embed = get_embed(input_data, vocab_size, embed_dim)
    rnn_outputs, final_state = build_rnn(cell, embed)
    logits = tf.contrib.layers.fully_connected(rnn_outputs, vocab_size, activation_fn=None)
    return logits, final_state

tests.test_build_nn(build_nn)

Test Passed


## Batches

In [12]:
def get_batches(int_text, batch_size, seq_length):
    n_batches = int(len(int_text) / (batch_size * seq_length))
    xdata = np.array(int_text[:n_batches * batch_size * seq_length])
    ydata = np.array(int_text[1:n_batches * batch_size * seq_length + 1])
    x_batches = np.split(xdata.reshape(batch_size, -1), n_batches, 1)
    y_batches = np.split(ydata.reshape(batch_size, -1), n_batches, 1)
    
    y_batches[-1][-1][-1] = int_text[0]
    
    return np.array(list(zip(x_batches, y_batches)))

tests.test_get_batches(get_batches)

Test Passed


## Neural Network Training
### Hyperparameters

In [13]:
num_epochs = 256
batch_size = 256
rnn_size = 512
embed_dim = 512
seq_length = 7
learning_rate = 0.005
show_every_n_batches = 50

save_dir = './save'

## Build the Graph
Build the graph using the neural network

In [14]:
from tensorflow.contrib import seq2seq

train_graph = tf.Graph()
with train_graph.as_default():
    vocab_size = len(int_to_vocab)
    input_text, targets, lr = get_inputs()
    input_data_shape = tf.shape(input_text)
    cell, initial_state = get_init_cell(input_data_shape[0], rnn_size)
    logits, final_state = build_nn(cell, rnn_size, input_text, vocab_size, embed_dim)
    
    probs = tf.nn.softmax(logits, name='probs')
    
    cost = seq2seq.sequence_loss(
        logits,
        targets,
        tf.ones([input_data_shape[0], input_data_shape[1]])
    )
    
    optimizer = tf.train.AdamOptimizer(lr)
    
    gradients = optimizer.compute_gradients(cost)
    capped_gradients = [(tf.clip_by_value(grad, -1., 1.), var) for grad, var in gradients if grad is not None]
    train_op = optimizer.apply_gradients(capped_gradients)
    

## Train
Train the neural network on the rpeprocessed data.

In [15]:
batches = get_batches(int_text, batch_size, seq_length)
with tf.Session(graph=train_graph) as sess:
    sess.run(tf.global_variables_initializer())
    
    for epoch_i in range(num_epochs):
        state = sess.run(initial_state, {input_text: batches[0][0]})
        for batch_i, (x, y) in enumerate(batches):
            feed = {
                input_text: x,
                targets: y,
                initial_state: state,
                lr: learning_rate
            }
            train_loss, state, _ = sess.run([cost, final_state, train_op], feed)
            
            if (epoch_i * len(batches) + batch_i) % show_every_n_batches == 0:
                print('Epoch {:>3} Batch {:>4}/{} train_loss = {:.3f}'.format(
                    epoch_i,
                    batch_i,
                    len(batches),
                    train_loss))
    
    saver = tf.train.Saver()
    saver.save(sess, save_dir)
    print('Model Trained and Saved')

Epoch   0 Batch    0/32 train_loss = 8.012
Epoch   1 Batch   18/32 train_loss = 5.930
Epoch   3 Batch    4/32 train_loss = 5.655
Epoch   4 Batch   22/32 train_loss = 5.496
Epoch   6 Batch    8/32 train_loss = 5.291
Epoch   7 Batch   26/32 train_loss = 5.125
Epoch   9 Batch   12/32 train_loss = 4.984
Epoch  10 Batch   30/32 train_loss = 4.737
Epoch  12 Batch   16/32 train_loss = 4.440
Epoch  14 Batch    2/32 train_loss = 4.207
Epoch  15 Batch   20/32 train_loss = 3.921
Epoch  17 Batch    6/32 train_loss = 3.645
Epoch  18 Batch   24/32 train_loss = 3.410
Epoch  20 Batch   10/32 train_loss = 3.158
Epoch  21 Batch   28/32 train_loss = 2.902
Epoch  23 Batch   14/32 train_loss = 2.601
Epoch  25 Batch    0/32 train_loss = 2.494
Epoch  26 Batch   18/32 train_loss = 2.320
Epoch  28 Batch    4/32 train_loss = 2.193
Epoch  29 Batch   22/32 train_loss = 1.953
Epoch  31 Batch    8/32 train_loss = 1.909
Epoch  32 Batch   26/32 train_loss = 1.777
Epoch  34 Batch   12/32 train_loss = 1.689
Epoch  35 B

## Save Parameters
Save seq_length and save_dir for generating a new TV script

In [16]:
helper.save_params((seq_length, save_dir))

## Checkpoint

In [17]:
import tensorflow as tf
import numpy as np
import helper
import problem_unittests as tests

_, vocab_to_int, int_to_vocab = helper.load_preprocess()
seq_length = 25
load_dir = './save'

## Generate Functions
### Get Tensors

Get tensors from loaded_graph using the function get_tensor_by_name(). Get the tensors using the following names:
- "input:0"
- "initial_state:0"
- "final_state:0"
- "probs:0"

Return the tensor in the following tuple `(InputTensor, InitialStateTensor, FinalStateTensor, ProbsTensor)`


In [18]:
def get_tensors(loaded_graph):
    input_tensor = loaded_graph.get_tensor_by_name("input:0")
    initial_state = loaded_graph.get_tensor_by_name("initial_state:0")
    final_state = loaded_graph.get_tensor_by_name("final_state:0")
    probs = loaded_graph.get_tensor_by_name("probs:0")
    return input_tensor, initial_state, final_state, probs

tests.test_get_tensors(get_tensors)

Test Passed


## Choose Word
Select the next word using probabilities.

In [19]:
def pick_word(probabilities, int_to_vocab):
    return int_to_vocab[np.random.choice(len(probabilities), size=1, p=probabilities)[0]]

tests.test_pick_word(pick_word)

Test Passed


In [38]:
def pick_5_words_poetry_word(probabilities, int_to_vocab, gen_sentences_length):
    result = ""
    if gen_sentences_length == 5 or gen_sentences_length == 17:
        result = '，'
    elif gen_sentences_length == 11 or gen_sentences_length == 23:
        result = '。'
    else:
        choose = np.random.choice(len(probabilities), size=10, p=probabilities)
        print(choose)
        for i in range(10):
            word = int_to_vocab[choose[i]]
            if (word is not '，') and (word is not '。'):
                result = word
                break;
          
    return result


In [39]:
gen_length = 24

prim_word = '青'

loaded_graph = tf.Graph()
with tf.Session(graph=loaded_graph) as sess:
    loader = tf.train.import_meta_graph(load_dir + '.meta')
    loader.restore(sess, load_dir)
    
    input_text, initial_state, final_state, probs = get_tensors(loaded_graph)
    
    gen_sentences = [prim_word]
    prev_state = sess.run(initial_state, {input_text: np.array([[1]])})
    
    for n in range(gen_length):
        dyn_input = [[vocab_to_int[word] for word in gen_sentences[-seq_length:]]]
        dyn_seq_length = len(dyn_input[0])
        
        probabilities, prev_state = sess.run([probs, final_state],
                                            {input_text: dyn_input, initial_state: prev_state})
        pred_word = pick_5_words_poetry_word(probabilities[dyn_seq_length-1], int_to_vocab, len(gen_sentences))
        gen_sentences.append(pred_word)
    
    poetry = ''.join(gen_sentences)
    print(poetry)

INFO:tensorflow:Restoring parameters from ./save
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
青山上。
，轩双国孤情。
桥南菰叶，残景奈赊何。



## Generate Chinese Poetry
This will generate chinese poetry. Set get_length to the length of poetry you want generate.

## Generate Five words poetry
This will generate five words poetry. 