# Play generator
    * I am going to create a model that can generate a play using RNNs
    * I will show the model an example of the kind of text we want, so that it can write a version of its own
    * The model will be a character-level predictive model: given a sequence of characters, it predicts the next character.
    * The output of the last prediction can be fed back in as the input to the next prediction.
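
The feedback idea in the last bullet can be sketched in plain Python. This is only an illustration: `NEXT_CHAR`, `predict_next` and `generate` are hypothetical names, and the lookup table stands in for a trained model.

```python
# Hypothetical stand-in for a trained model: a fixed lookup table that
# maps the current character to the "predicted" next character.
NEXT_CHAR = {"a": "b", "b": "c", "c": "a"}

def predict_next(char):
    """Return the next character for a single input character."""
    return NEXT_CHAR[char]

def generate(seed, length):
    """Generate `length` characters by feeding each prediction back in."""
    out = [seed]
    current = seed
    for _ in range(length):
        current = predict_next(current)  # the last output becomes the next input
        out.append(current)
    return "".join(out)

generated = generate("a", 5)
print(generated)  # abcabc
```

A real model returns a score for every character in the vocabulary instead of a single fixed answer, but the loop has the same shape.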

In [1]:
import keras
import tensorflow as tf
import os
import numpy as np

In [2]:
path_to_file = tf.keras.utils.get_file("shakespeare.txt", "https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt")

In [3]:
# If we want, we can load our own data instead. In Google Colab, run the code below to do so.
# from google.colab import files

# path_to_file2 = list(files.upload().keys())[0]

In [4]:
# read the file as raw bytes, then decode the bytes into a Python string
text = open(path_to_file, 'rb').read().decode(encoding='utf-8') # rb = read in binary mode

print(text[:250])

First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you know Caius Marcius is chief enemy to the people.



# Preprocessing 
    * We need to convert characters to integers (characters, because we are creating a character-level predictive model)

In [5]:
vocab = sorted(set(text)) # only the unique characters will be there in the vocabulary

In [6]:
# Create mapping from character to indexes
char2ind = {u:i for i,u in enumerate(vocab)}
print(char2ind)

{'\n': 0, ' ': 1, '!': 2, '$': 3, '&': 4, "'": 5, ',': 6, '-': 7, '.': 8, '3': 9, ':': 10, ';': 11, '?': 12, 'A': 13, 'B': 14, 'C': 15, 'D': 16, 'E': 17, 'F': 18, 'G': 19, 'H': 20, 'I': 21, 'J': 22, 'K': 23, 'L': 24, 'M': 25, 'N': 26, 'O': 27, 'P': 28, 'Q': 29, 'R': 30, 'S': 31, 'T': 32, 'U': 33, 'V': 34, 'W': 35, 'X': 36, 'Y': 37, 'Z': 38, 'a': 39, 'b': 40, 'c': 41, 'd': 42, 'e': 43, 'f': 44, 'g': 45, 'h': 46, 'i': 47, 'j': 48, 'k': 49, 'l': 50, 'm': 51, 'n': 52, 'o': 53, 'p': 54, 'q': 55, 'r': 56, 's': 57, 't': 58, 'u': 59, 'v': 60, 'w': 61, 'x': 62, 'y': 63, 'z': 64}


In [7]:
# Create mapping from indexes to characters
ind2char = np.array(vocab)
print(ind2char)

['\n' ' ' '!' '$' '&' "'" ',' '-' '.' '3' ':' ';' '?' 'A' 'B' 'C' 'D' 'E'
 'F' 'G' 'H' 'I' 'J' 'K' 'L' 'M' 'N' 'O' 'P' 'Q' 'R' 'S' 'T' 'U' 'V' 'W'
 'X' 'Y' 'Z' 'a' 'b' 'c' 'd' 'e' 'f' 'g' 'h' 'i' 'j' 'k' 'l' 'm' 'n' 'o'
 'p' 'q' 'r' 's' 't' 'u' 'v' 'w' 'x' 'y' 'z']


In [8]:
def text_to_int(text):
    return np.array([char2ind[c] for c in text])

In [9]:
print(text_to_int(text[:13]))
print(text[:13])

[18 47 56 57 58  1 15 47 58 47 64 43 52]
First Citizen


In [10]:
def int_to_text(ints):
    try:
        ints = ints.numpy() # convert a tensor to a numpy array
    except AttributeError:
        pass                # ints is already a numpy array (no .numpy() method)
    return ''.join(ind2char[ints])

print(int_to_text(text_to_int(text[:13])))

First Citizen


In [11]:
text_as_int = text_to_int(text)

# Creating training examples
    * We need a sequence length first
    * If the sequence length is 4 and the text is "Hello", the training example should look like this:
        input - Hell
        output - ello
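
A minimal plain-Python sketch of the same idea, assuming the text is cut into non-overlapping chunks of `seq_length + 1` characters and a shorter final chunk is dropped. `make_examples` is a hypothetical helper used only for illustration.

```python
def make_examples(text, seq_length):
    """Cut text into chunks of seq_length + 1 and split each into (input, target)."""
    chunk_size = seq_length + 1
    n_chunks = len(text) // chunk_size  # an incomplete final chunk is dropped
    examples = []
    for i in range(n_chunks):
        chunk = text[i * chunk_size:(i + 1) * chunk_size]
        # The target is the input shifted one character to the left.
        examples.append((chunk[:-1], chunk[1:]))
    return examples

examples = make_examples("Hello world!", 4)
for input_text, target_text in examples:
    print(repr(input_text), "->", repr(target_text))
# 'Hell' -> 'ello'
# ' wor' -> 'worl'
```

The trailing `d!` is too short to form a full chunk, so it does not produce a training example.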

In [12]:
seq_length = 100
examples_per_epoch = len(text) // (seq_length + 1)

char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int) # Stream of individual character ids
sequences = char_dataset.batch(seq_length+1, drop_remainder = True) # Group the characters into chunks of seq_length + 1; dropping the remainder discards a shorter final chunk, so every chunk has exactly the same length
# What the last line does:
    # * Chunks of seq_length + 1 characters are created
    # * Each chunk is a sequence of characters encoded as ints

In [13]:
def split_input_target(chunk): # eg:- hello
    input_text = chunk[:-1] # hell
    output_text = chunk[1:] # ello
    return input_text, output_text

dataset = sequences.map(split_input_target) # apply split_input_target to every chunk to get (input, target) pairs

In [14]:
print(dataset)

<MapDataset element_spec=(TensorSpec(shape=(100,), dtype=tf.int32, name=None), TensorSpec(shape=(100,), dtype=tf.int32, name=None))>


In [15]:
for x,y in dataset.take(1):
    print("\n\nExample\n\n")
    print("INPUT")
    print(int_to_text(x))
    print("OUTPUT")
    print(int_to_text(y))



Example


INPUT
First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You
OUTPUT
irst Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You 


# Create the batches to be trained

In [16]:
batch_size = 64
vocab_size = len(vocab)
embedding_dim = 256
rnn_units = 1024

buffer_size = 10000

data = dataset.shuffle(buffer_size).batch(batch_size, drop_remainder=True)

# Building the model

In [17]:
def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
    model = tf.keras.Sequential([
                tf.keras.layers.Embedding(
                                            input_dim = vocab_size,     # input_dim is the size of the inputting vocabulary
                                            output_dim = embedding_dim, # dimension of the dense embedding
                                            batch_input_shape = [batch_size, None]
                                         ),
                tf.keras.layers.LSTM(
                                        units = rnn_units,       # Dimensionality of the output space
                                        return_sequences = True, # Will return the full output sequence (one output per time step); if False, will return only the last output
                                        stateful = True,         # last state of a batch will be used as the initial state of the following batch
                                        recurrent_initializer = 'glorot_uniform'
                                    ),
                tf.keras.layers.Dense(units = vocab_size)
            ])
    return model

# ---------
model = build_model(vocab_size, embedding_dim, rnn_units, batch_size)
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (64, None, 256)           16640     
                                                                 
 lstm (LSTM)                 (64, None, 1024)          5246976   
                                                                 
 dense (Dense)               (64, None, 65)            66625     
                                                                 
Total params: 5,330,241
Trainable params: 5,330,241
Non-trainable params: 0
_________________________________________________________________
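
The parameter counts in the summary can be reproduced by hand. The sketch below assumes the standard layout of each layer: one embedding vector per character, four LSTM gates that each have input weights, recurrent weights and a bias, and a dense layer with a weight matrix plus a bias.

```python
vocab_size = 65       # number of unique characters in the text
embedding_dim = 256
rnn_units = 1024

# Embedding: one vector of size embedding_dim per vocabulary entry.
embedding_params = vocab_size * embedding_dim

# LSTM: 4 gates, each with input weights, recurrent weights and a bias.
lstm_params = 4 * ((embedding_dim + rnn_units) * rnn_units + rnn_units)

# Dense: weight matrix from rnn_units to vocab_size, plus one bias per output.
dense_params = rnn_units * vocab_size + vocab_size

total_params = embedding_params + lstm_params + dense_params

print(embedding_params)  # 16640
print(lstm_params)       # 5246976
print(dense_params)      # 66625
print(total_params)      # 5330241
```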


# Checking the untrained model's predictions

In [18]:
# input_example_batch = a random batch [1st batch in this case]
for input_example_batch, output_example_batch in data.take(1):
    print(input_example_batch)
    print(int_to_text(input_example_batch[0]))
    example_batch_predictions = model(input_example_batch)
    print(example_batch_predictions.shape, "# (batch_size, sequence_length, vocab_size)")

tf.Tensor(
[[ 0 21  1 ... 57  1 39]
 [ 1 19 50 ... 33 15 17]
 [57 58  1 ... 52  1 58]
 ...
 [ 0 20 39 ... 63  1 57]
 [53 59  1 ...  0 35 43]
 [ 0 20 39 ...  1 41 53]], shape=(64, 100), dtype=int32)

I neither care for the world nor your general: for
such things as you, I can scarce think there's a
(64, 100, 65) # (batch_size, sequence_length, vocab_size)


In [19]:
# print(example_batch_predictions)

In [20]:
pred = example_batch_predictions[0] # first sequence in the batch: one row per position, each holding 65 raw scores (logits, not probabilities) for the next character according to the untrained model
print(pred)

tf.Tensor(
[[-8.6691795e-04  1.5072282e-03  2.7200195e-04 ... -2.8751758e-03
  -1.7817949e-03  4.0948158e-04]
 [-9.4313622e-03  3.0455128e-03  2.4110589e-03 ... -7.8672133e-03
  -3.3789794e-03 -5.5195885e-03]
 [-6.4988388e-03  1.4274057e-03  3.1210750e-04 ... -7.4578300e-03
   2.1406270e-03 -2.8918926e-03]
 ...
 [-1.5778413e-02  8.2441559e-04  7.6857405e-03 ... -4.7804434e-03
  -9.4336374e-03 -1.4078832e-04]
 [-1.1638284e-02 -5.5997982e-04  4.2322185e-03 ... -4.2745527e-03
  -3.4966886e-03  3.4082221e-04]
 [-7.2977180e-03 -4.0121842e-05  6.9800327e-03 ... -3.2673192e-03
  -3.6742794e-04 -3.5499877e-03]], shape=(100, 65), dtype=float32)
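
Since the final Dense layer has no softmax activation, the rows printed above are raw scores (logits). A softmax turns such scores into probabilities. The sketch below uses plain Python and a hypothetical 5-character vocabulary with made-up logits close to zero, similar in scale to the untrained output.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)                              # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for a 5-character vocabulary (illustrative values only).
logits = [-0.0009, 0.0015, 0.0003, -0.0029, 0.0004]
probs = softmax(logits)

print([round(p, 4) for p in probs])
print(sum(probs))
```

Because the logits are nearly equal, every character gets a probability close to 1/5. This is why an untrained model produces what looks like random text.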


In [21]:
# The characters predicted by the untrained model
# 1. Sample one character index per position from the predicted logits
sampled_indices = tf.random.categorical(pred, num_samples=1, dtype='int64')

# 2. Reshape to become 1-dimensional
sampled_indices = np.reshape(sampled_indices, (1, -1))[0]

# 3. Convert to chars
predicted_chars = int_to_text(sampled_indices)
print(predicted_chars)

mhCuD3,wET
cH CYommYUBu!rUkO3I.XIAJLQni?EP;DX& FOxmy:jaT3nz,H!HFrzpWlv;eP?gHAQ,N3aWo$$AlnPZjx;RJ qLP


# Create the loss function

In [22]:
# A loss function measures how far the prediction is from the true value
def loss(labels, logits):
    return tf.keras.losses.sparse_categorical_crossentropy(
        y_true = labels,     # ground truth values
        y_pred = logits,     # predicted values
        from_logits = True)
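
To make the quantity concrete, here is a plain-Python sketch of sparse categorical cross-entropy computed from logits for a single character. `crossentropy_from_logits` is a hypothetical helper written for illustration, not the Keras implementation.

```python
import math

def crossentropy_from_logits(label, logits):
    """Negative log-probability of the true label, computed from raw logits."""
    m = max(logits)
    # log(sum(exp(logits))), computed in a numerically stable way
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]

vocab_size = 65

# A model with no preference at all: every character gets the same score.
uniform_logits = [0.0] * vocab_size
untrained_loss = crossentropy_from_logits(18, uniform_logits)
print(round(untrained_loss, 4))  # 4.1744, which is ln(65)

# A model that is confident and correct: the true label has a much higher score.
confident_logits = [0.0] * vocab_size
confident_logits[18] = 10.0
confident_loss = crossentropy_from_logits(18, confident_logits)
print(confident_loss < untrained_loss)  # True
```

A loss near ln(65) at the start of training therefore means the model is still guessing uniformly over the 65 characters.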

# Compile the model

In [23]:
model.compile(
    optimizer = 'adam',
    loss = loss
)

# Creating check points
    * Like savepoints in SQL procedures
    * We can resume training the model from a checkpoint later

In [24]:
# directory where checkpoints will be saved
checkpoint_dir = 'training_checkpoints'

# checkpoint file name prefix; {epoch} is replaced with the epoch number
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")

checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
                          filepath = checkpoint_prefix, # save one checkpoint per epoch inside checkpoint_dir
                          save_weights_only = True
                      )
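
The `{epoch}` part of `checkpoint_prefix` is a named format placeholder. When a path containing it is passed as `filepath`, Keras fills it in with the epoch number at save time. The sketch below only shows the string formatting involved, assuming two epochs; it does not save anything.

```python
import os

checkpoint_dir = "training_checkpoints"
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")

# Fill in the placeholder the same way for each epoch number.
paths = [checkpoint_prefix.format(epoch=e) for e in (1, 2)]
for p in paths:
    print(p)
```

Each epoch gets its own file name prefix inside `checkpoint_dir`, so later checkpoints do not overwrite earlier ones.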

# Training the model

In [25]:
history = model.fit(
            x = data,
            epochs = 2, # more epochs usually lower the loss and improve the generated text
            callbacks = [checkpoint_callback]
          )

Epoch 1/2
Epoch 2/2


# Loading the model
    * Here we rebuild the model with a different batch size
    * Since we want to generate text from one starting string at a time, let's change the batch size to 1 so that the model accepts a single sequence

In [26]:
model = build_model(vocab_size, embedding_dim, rnn_units, batch_size = 1)

In [29]:
# let's use the latest checkpoint to restore the trained weights
latest_checkpoint = tf.train.latest_checkpoint(checkpoint_dir)
model.load_weights(latest_checkpoint) # load the weights from the latest checkpoint
model.build(tf.TensorShape([1, None])) # batch size 1; None leaves the sequence length unspecified, so the model accepts input of any length



# checkpoint_num = 10
# model.load_weights(tf.train.load_checkpoint('./Users/Pasindu Siriwardana/JupyterNoteBookFiles/Tensorflow/NLP/training_checkpoints/checkpoint' + str(checkpoint_num)))


In [46]:
def generate_text(model, start_string):
    # Evaluation step - generating text using the learned model
    
    num_generate = 800 # number of characters to generate
    
    # converting start_string to ints
    input_eval = text_to_int(start_string)
    input_eval = tf.expand_dims(input_eval, 0) # this is like converting a list like [1,2,3] into [[1,2,3]] - expanding by a 1 dimension
    
    text_generated = [] # to store the generated text
    
    # Note - 
    # Low temperature results in predictable text
    # High temperature results in surprising text
    temperature = 1.0 
    
    model.reset_states() # clear any LSTM state left over from a previous call
    
    for i in range(num_generate):
        predictions = model(input_eval)
        predictions = tf.squeeze(predictions, 0) # remove the extra dimension we added using expand_dims
        predictions = predictions / temperature
        predicted_id = tf.random.categorical(predictions, num_samples = 1)[-1,0].numpy() # sample a character id from the logits of the last time step
        
        # we pass the predicted character as the next input to the model along with the previous hidden state
        input_eval = tf.expand_dims([predicted_id], 0) # a new length 1 axis will be inserted at axis 0
        
        text_generated.append(int_to_text(predicted_id))
    
    return (start_string + ''.join(text_generated))
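
The effect of `temperature` can be seen without a model. The plain-Python sketch below divides some made-up logits by the temperature, applies a softmax, and samples an index from the result. `softmax` and `sample_with_temperature` are hypothetical helpers, and the three logits are illustrative values only.

```python
import math
import random

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_with_temperature(logits, temperature, rng):
    """Scale the logits by 1/temperature, then sample one index."""
    probs = softmax([x / temperature for x in logits])
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for a 3-character vocabulary

# Probability of the highest-scoring character at each temperature.
top_probs = {}
for t in (0.5, 1.0, 2.0):
    probs = softmax([x / t for x in logits])
    top_probs[t] = probs[0]
    print(t, [round(p, 3) for p in probs])

rng = random.Random(0)  # seeded so that repeated runs give the same sample
sample = sample_with_temperature(logits, 1.0, rng)
print(sample)
```

A lower temperature concentrates the probability on the top-scoring character, which gives more predictable text. A higher temperature flattens the distribution, which gives more surprising text.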
    

In [47]:
inp = input("Type a text to be predicted - ")
print(generate_text(model, inp))

Type a text to be predicted -  Picaso


Picasochy, I warr fall be fall:
I hale not beake to bid othand,
But so far his; Hansig all be de groply this tains:
Claut, was ad deed hemby back your gonder?

LEDRY PORLANDET:
May?
The knowns, sir,
Be, my, Nut, agay;
Whomeines, me hath that thou ene sthick?
Thy sear Rays and spoll them.

ToRY ANIZA:
Row teath, is Adbal thy dear-s;
More to shall we althand the deagh,
I well ray my not!

RIUIN MINGRETET:
Hese for injeikns this areiors, liknream!

LLOUCENTER:
The aruers, would are dosimen.

HORTe Mayces,
Wete as is in the sabe.
I God the such spadted, Dowen, it loss! good ging twonks, what a taga knect
Heregoons mespirery? patce mest fies!
Dest thas less un arpeiver?

KING RICHARD III:
Wensee! gown,
I carst now, with where chissone-s mine oul airst upon cenfored I death,
And streake: leppun vill.

