<a href="https://colab.research.google.com/github/Abhieo07/Basics_ml/blob/main/Play_Generator.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# RNN Play Generator

We are going to use an RNN to generate a play. We will simply show the RNN an example of something we want to recreate and it will learn how to write a version of it on its own. We'll do this using a character predictive model that takes a variable-length sequence as input and predicts the next character. We can call the model many times in a row, feeding the output of the last prediction back in as input for the next call, to generate a sequence.
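The feedback loop described above can be sketched without a neural network. In the snippet below, `NEXT_CHAR`, `predict_next`, and `generate` are hypothetical names, and the lookup table is only a toy stand-in for the trained model; the point is the shape of the loop, not the quality of the predictions.

```python
# Toy stand-in for a trained model: maps the last character to a "predicted" next one.
# (Hypothetical lookup table, used only to illustrate the feedback loop.)
NEXT_CHAR = {"h": "e", "e": "l", "l": "o", "o": "h"}

def predict_next(sequence):
    """Return the predicted next character for a variable-length sequence."""
    return NEXT_CHAR[sequence[-1]]

def generate(start, num_chars):
    """Call the predictor repeatedly, appending each output to the input."""
    generated = start
    for _ in range(num_chars):
        generated += predict_next(generated)
    return generated

result = generate("h", 5)
print(result)
```

The real model replaces the lookup table with a learned distribution over characters, but the loop stays the same.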

In [36]:
from tensorflow.keras.preprocessing import sequence
import keras
import tensorflow as tf
import os
import numpy as np

**Dataset**



In [37]:
path_to_file = tf.keras.utils.get_file('shakespeare.txt', 'https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt')

**Loading Your Own Data**

To load your own data, uncomment the cell below and upload a file. Then follow the same steps as above, but with this new file instead.

In [38]:
# from google.colab import files
# path_to_file = list(files.upload().keys())[0]

**Read Contents of File**

Let's look at the contents of the file.

In [39]:
# Read, then decode for py2 compat.
text = open(path_to_file, 'rb').read().decode(encoding='utf-8')
# length of the text is the number of characters in it
print('Length of text: {} characters.'.format(len(text)))

Length of text: 1115394 characters.


In [40]:
# Take a look at first 250 characters in text
print(text[:250])

First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you know Caius Marcius is chief enemy to the people.



**Encoding**

We are going to encode each character as a different integer.

In [41]:
vocab = sorted(set(text))
# Creating a mapping from unique characters to indices
char2idx = {u:i for i,u in enumerate(vocab)}
idx2char = np.array(vocab)

def text_to_int(text):
  return np.array([char2idx[c] for c in text])

text_as_int = text_to_int(text)

In [42]:
# let's look at how part of our text is encoded
print("Text: ",text[:13])
print("Encoded: ", text_to_int(text[:13]))

Text:  First Citizen
Encoded:  [18 47 56 57 58  1 15 47 58 47 64 43 52]


In [43]:
# decode function
def int_to_text(ints):
  # Tensors expose .numpy(); plain NumPy arrays do not, so leave those unchanged
  try:
    ints = ints.numpy()
  except AttributeError:
    pass
  return ''.join(idx2char[ints])

print(int_to_text(text_as_int[:13]))

First Citizen


**Creating Training Examples**

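Each training example pairs an input sequence with a target sequence that is the same text shifted one character to the right. For the text `hello`:

    input: hell | output: ello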

In [44]:
seq_length = 100 # length of sequence for a training example
examples_per_epochs = len(text)//(seq_length + 1)

# Create training examples / targets
char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)

Next, we can use the batch method to turn this stream of characters into sequences of the desired length.

In [45]:
sequences = char_dataset.batch(seq_length+1, drop_remainder=True)
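As a quick sanity check, the number of sequences this produces can be worked out by hand. The sketch below only uses the text length printed earlier and `seq_length`; the names `chunk_size`, `num_sequences`, and `leftover_chars` are introduced here for illustration.

```python
text_length = 1115394      # from the "Length of text" output above
seq_length = 100
chunk_size = seq_length + 1  # each chunk holds the input plus one extra character

# drop_remainder=True discards the final, incomplete chunk
num_sequences, leftover_chars = divmod(text_length, chunk_size)
print(num_sequences, leftover_chars)  # 11043 51
```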

Now we need to split each of these sequences of length 101 into an input and a target.

In [46]:
def split_input_target(chunk): # for the example: hello
  input_text = chunk[:-1] # hell
  target_text = chunk[1:] # ello
  return input_text, target_text # hell, ello

dataset = sequences.map(split_input_target) # we use map to apply the above function to every entry
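The same chunk-then-split idea can be tried on a toy string with plain Python, which makes the one-character shift easy to see. The names `toy_text`, `toy_seq_length`, and `pairs` are illustrative, and slicing a string here stands in for the `tf.data` pipeline.

```python
def split_input_target(chunk):
    """Input is everything but the last character; target is shifted by one."""
    return chunk[:-1], chunk[1:]

toy_text = "hello world"
toy_seq_length = 4
chunk_size = toy_seq_length + 1

# Keep only full chunks, mirroring drop_remainder=True
num_chunks = len(toy_text) // chunk_size
chunks = [toy_text[i * chunk_size:(i + 1) * chunk_size] for i in range(num_chunks)]

pairs = [split_input_target(c) for c in chunks]
for input_text, target_text in pairs:
    print(repr(input_text), "->", repr(target_text))
```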

In [47]:
for x,y in dataset.take(2):
  print('\n\nEXAMPLE\n')
  print("INPUT")
  print(int_to_text(x))
  print("\nOUTPUT")
  print(int_to_text(y))



EXAMPLE

INPUT
First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You

OUTPUT
irst Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You 


EXAMPLE

INPUT
are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you 

OUTPUT
re all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you k


Finally, we need to make training batches.

In [48]:
BATCH_SIZE = 64
VOCAB_SIZE = len(vocab) # vocab is number of unique characters
EMBEDDING_DIM = 256
RNN_UNITS = 1024

# Buffer size to shuffle the dataset
# (TF data is designed to work with possibly infinite sequences,
# so it doesn't attempt to shuffle the entire sequence in memory. Instead,
# it maintains a buffer in which it shuffles elements).
BUFFER_SIZE = 1000

data = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)
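The buffered shuffle described in the comment above can be illustrated in plain Python. This is a simplified sketch of the idea, not TensorFlow's actual implementation; `buffered_shuffle` and its `seed` parameter are hypothetical names introduced here.

```python
import random

def buffered_shuffle(stream, buffer_size, seed=None):
    """Yield items from `stream` in a partially shuffled order.

    Only `buffer_size` items are held in memory at any time.
    """
    rng = random.Random(seed)
    buffer = []
    for item in stream:
        if len(buffer) < buffer_size:
            buffer.append(item)  # still filling the buffer
            continue
        # Buffer is full: emit a random element and put the new item in its slot
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Stream exhausted: drain whatever is left in random order
    rng.shuffle(buffer)
    yield from buffer

shuffled = list(buffered_shuffle(range(10), buffer_size=3, seed=0))
print(shuffled)
```

With a small buffer, an element can only move a limited distance forward from its original position, so a larger buffer gives a more thorough shuffle at the cost of memory.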

**Building Model**

We will use an embedding layer, an LSTM layer, and one dense layer that contains a node for each unique character in our training data. The dense layer outputs one logit per character, which a softmax turns into a probability distribution over the vocabulary.

In [49]:
def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
  model = tf.keras.Sequential([
      tf.keras.layers.Embedding(vocab_size, embedding_dim,
                                batch_input_shape=[batch_size,None]),
      tf.keras.layers.LSTM(rnn_units,
                           return_sequences=True,
                           stateful=True,
                           recurrent_initializer='glorot_uniform'),
      tf.keras.layers.Dense(vocab_size)
  ])
  return model

model = build_model(VOCAB_SIZE,EMBEDDING_DIM,RNN_UNITS,BATCH_SIZE)
model.summary()

Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_2 (Embedding)     (64, None, 256)           16640     
                                                                 
 lstm_2 (LSTM)               (64, None, 1024)          5246976   
                                                                 
 dense_2 (Dense)             (64, None, 65)            66625     
                                                                 
Total params: 5,330,241
Trainable params: 5,330,241
Non-trainable params: 0
_________________________________________________________________
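The parameter counts in the summary can be reproduced by hand, which helps confirm what each layer contains. The sketch below assumes the standard LSTM layout of four gates, each with an input kernel, a recurrent kernel, and a bias; the variable names are illustrative.

```python
vocab_size, embedding_dim, rnn_units = 65, 256, 1024

# Embedding: one vector of length embedding_dim per character
embedding_params = vocab_size * embedding_dim

# LSTM: 4 gates, each with weights over [input, previous hidden state] plus a bias
lstm_params = 4 * ((embedding_dim + rnn_units) * rnn_units + rnn_units)

# Dense: weight matrix from rnn_units to vocab_size plus one bias per output
dense_params = rnn_units * vocab_size + vocab_size

total_params = embedding_params + lstm_params + dense_params
print(embedding_params, lstm_params, dense_params, total_params)
```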


**Creating a Loss Function**

Now we are going to create our own loss function for this problem. This is because our model will output a (64, seq_length, 65) shaped tensor that holds, for every sequence in the batch and every timestep, one logit (unnormalized score) per character in the vocabulary.

However, before we do that, let's have a look at a sample input and output from the untrained model. This is so we can understand what the model is actually giving us.

In [50]:
for input_example_batch, target_example_batch in  data.take(1):
  example_batch_predictions = model(input_example_batch) # ask our model for prediction of 1st batch of training data
  print(example_batch_predictions.shape, "# (batch_size, seq_length, vocab_size)")

(64, 100, 65) # (batch_size, seq_length, vocab_size)


In [51]:
# we can see that the prediction is an array of 64 arrays, one for each entry in the batch
print(len(example_batch_predictions))
print(example_batch_predictions)

64
tf.Tensor(
[[[ 2.83692591e-03  1.77111174e-03 -2.14785058e-03 ...  1.92721223e-03
   -5.26937749e-03 -3.93753313e-03]
  [ 2.14448362e-03 -2.98338011e-03 -5.61148021e-03 ...  5.86950919e-03
   -7.00313970e-03 -4.33111703e-03]
  [ 9.34975920e-04  2.25574896e-03 -6.23774529e-03 ...  5.53878630e-03
   -1.26721114e-02 -5.16613433e-03]
  ...
  [ 2.00874428e-03  5.23186941e-03  8.63040425e-03 ... -4.57914360e-03
    6.57539535e-03 -9.69186798e-03]
  [-1.75818661e-03  6.69746194e-03  1.36090312e-02 ... -3.79288918e-03
    8.36840551e-03 -1.10050775e-02]
  [-7.27020111e-03  8.80870037e-03  1.05984528e-02 ...  2.78808852e-03
    4.60444391e-03 -1.41241280e-02]]

 [[ 9.87269683e-04 -2.11464963e-03 -3.29534290e-03 ... -2.53885752e-04
    7.22376071e-03 -6.17576856e-03]
  [-5.36122266e-03  8.24265997e-04  5.45688672e-04 ...  1.27689005e-03
    4.31040488e-03 -1.19175427e-02]
  [ 3.41332378e-03  9.55476053e-03 -2.24774424e-03 ... -1.24922092e-03
    6.42551435e-03 -9.79508553e-03]
  ...
  [ 1.445

In [52]:
# let's examine one prediction
pred = example_batch_predictions[0]
print(len(pred))
print(pred)
# notice this is a 2D array of length 100, where each interior array is the prediction for the next character at each timestep

100
tf.Tensor(
[[ 0.00283693  0.00177111 -0.00214785 ...  0.00192721 -0.00526938
  -0.00393753]
 [ 0.00214448 -0.00298338 -0.00561148 ...  0.00586951 -0.00700314
  -0.00433112]
 [ 0.00093498  0.00225575 -0.00623775 ...  0.00553879 -0.01267211
  -0.00516613]
 ...
 [ 0.00200874  0.00523187  0.0086304  ... -0.00457914  0.0065754
  -0.00969187]
 [-0.00175819  0.00669746  0.01360903 ... -0.00379289  0.00836841
  -0.01100508]
 [-0.0072702   0.0088087   0.01059845 ...  0.00278809  0.00460444
  -0.01412413]], shape=(100, 65), dtype=float32)


In [53]:
# and finally we'll look at a prediction at the first timestep
time_pred = pred[0]
print(len(time_pred))
print(time_pred)

65
tf.Tensor(
[ 2.8369259e-03  1.7711117e-03 -2.1478506e-03  7.4027380e-04
  2.4220231e-04 -3.6946347e-03 -1.1471356e-03 -1.1030131e-03
 -6.8294653e-03 -2.6284901e-03 -6.0392311e-04 -5.2380105e-03
  3.1472077e-03  3.0842719e-03  7.2007459e-03  3.1593363e-03
 -3.5951447e-03 -2.1588565e-03  2.7275572e-03  2.2456492e-03
 -1.0777775e-03  1.6110386e-03  6.1599916e-04 -1.5882589e-04
  2.3725384e-03 -1.3883115e-03  3.2173158e-03 -9.4944006e-04
  4.1671251e-03 -2.6755799e-03  2.5629469e-03  4.1337742e-05
 -5.7050167e-04  1.3731166e-03 -2.3633330e-03 -2.1147784e-03
  1.8921886e-04  8.9884119e-04 -6.2549737e-04  3.0041668e-03
 -8.3092536e-04 -1.9122225e-03  2.8508413e-03 -2.1663192e-04
 -1.6890892e-03  2.0580641e-03  1.9130351e-03 -4.9918313e-03
 -1.9067647e-03  6.1703911e-03  4.3750275e-03 -3.0608784e-04
 -1.1495260e-03 -3.8898939e-03  3.9929547e-03  5.8299261e-03
  3.2487740e-03 -1.8145121e-04 -3.2204648e-03 -4.0315660e-03
  6.8684639e-03 -1.2727323e-04  1.9272122e-03 -5.2693775e-03
 -3.937533
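The 65 values above are logits rather than probabilities: they can be negative and do not sum to 1. A softmax converts them into a distribution, and `tf.random.categorical` samples from that distribution given the logits directly. The sketch below shows both steps in plain Python on made-up logits; `softmax`, `toy_logits`, and the fixed seed are assumptions for illustration.

```python
import math
import random

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

toy_logits = [2.0, 1.0, 0.1]  # hypothetical scores for a 3-character vocabulary
probs = softmax(toy_logits)

# Sampling picks an index at random, weighted by probability
rng = random.Random(0)
sampled_index = rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Greedy decoding would always pick the most likely index instead
greedy_index = max(range(len(probs)), key=probs.__getitem__)

print(probs, sampled_index, greedy_index)
```

Because the untrained model's logits are all close to zero, the resulting distribution is nearly uniform, which is why its sampled output looks like random characters.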

In [54]:
# If we want to determine the predicted character we need to sample the output distribution (pick a value based on probability)
sampled_indices = tf.random.categorical(pred, num_samples=1)

# now we can reshape that array and convert all the integers to characters to see the actual text
sampled_indices = np.reshape(sampled_indices, (1,-1))[0]
predicted_chars = int_to_text(sampled_indices)

predicted_chars # and this is what the untrained model predicted for the first training sequence

"Veq!tAYxYReVrn:mu3WLVlMpd:mNZqoZVAHldA:'h&w?FA,!\nRj&j&O.Iw'LXFJIo?:gr:;?bgr!BHEhHAAq'-GXZTqrpA&B.ZYi"

So now we need to create a loss function that can compare that output to the expected output and give us some numeric value representing how close the two were.

In [55]:
def loss(labels, logits):
  return tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)
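To make the loss concrete, here is a plain-Python sketch of sparse categorical cross-entropy from logits for a single timestep: the negative log of the softmax probability assigned to the correct character. The function name `crossentropy_from_logits` and the example logits are assumptions for illustration; the Keras function applies the same formula at every timestep of every sequence.

```python
import math

def crossentropy_from_logits(label, logits):
    """Return -log(softmax(logits)[label]) using a stable log-sum-exp."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[label]

vocab_size = 65

# An untrained model with near-equal logits is close to a uniform guess
uniform_loss = crossentropy_from_logits(0, [0.0] * vocab_size)

# A model that puts a much higher score on the correct character (index 3 here)
confident_logits = [0.0] * vocab_size
confident_logits[3] = 10.0
confident_loss = crossentropy_from_logits(3, confident_logits)

print(uniform_loss, confident_loss)
```

A uniform guess over 65 characters gives a loss of ln(65), about 4.17, so a loss near that value at the start of training is expected.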

**Compiling the model**

At this point we can think of our problem as a classification problem where the model predicts the probability of each unique character coming next.

In [56]:
model.compile(optimizer='adam', loss = loss)

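**Creating Checkpoints**

We save a checkpoint of the weights after each epoch. This will allow us to load our model from a checkpoint later and either continue training it or rebuild it with a different batch size.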

In [57]:
# Directory where the checkpoints will be saved
checkpoint_dir = './training_checkpoints'
# Name of the checkpoint files
checkpoint_prefix = os.path.join(checkpoint_dir, 'ckpt_{epoch}')

checkpoint_callback=tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_prefix,
    save_weights_only=True
)
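The `{epoch}` placeholder in the prefix is filled in with the epoch number each time a checkpoint is written. The sketch below shows the resulting path prefixes using ordinary string formatting; it assumes one-based epoch numbering, which is consistent with the `ckpt_10` name used later in this notebook, and `example_paths` is an illustrative name.

```python
import os

checkpoint_dir = './training_checkpoints'
checkpoint_prefix = os.path.join(checkpoint_dir, 'ckpt_{epoch}')

# Path prefixes as they would look for epochs 1, 2, and 10
example_paths = [checkpoint_prefix.format(epoch=e) for e in (1, 2, 10)]
for path in example_paths:
    print(path)
```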

**Training**

In [58]:
history = model.fit(data,epochs=10,callbacks=[checkpoint_callback])

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


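**Loading the Model**

We rebuild the model with a batch size of 1 so that we can feed it a single starting string, then load the trained weights from a checkpoint.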

In [59]:
model = build_model(VOCAB_SIZE, EMBEDDING_DIM, RNN_UNITS, batch_size=1)

In [62]:
model.load_weights(tf.train.latest_checkpoint(checkpoint_dir))
model.build(tf.TensorShape([1,None]))

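In [None]:
# Alternatively, load a specific checkpoint by its epoch number.
# load_weights expects the checkpoint path prefix, not a checkpoint reader.
checkpoint_num = 10
model.load_weights("./training_checkpoints/ckpt_"+str(checkpoint_num))
model.build(tf.TensorShape([1,None]))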

**Generating Text**

In [67]:
def generate_text(model, start_string):
  num_generate = 800 # number of characters to generate

  # Converting our start string to numbers (vectorizing)
  input_eval = [char2idx[s] for s in start_string]
  input_eval = tf.expand_dims(input_eval, 0)

  # Empty string to store our results
  text_generated = []

  temperature = 1.0

  # here batch_size == 1
  model.reset_states()
  for i in range(num_generate):
    predictions = model(input_eval)
    # remove the batch dimension
    predictions = tf.squeeze(predictions, 0)
    predictions = predictions/temperature
    predicted_id = tf.random.categorical(predictions, num_samples=1)[-1,0].numpy()
    input_eval = tf.expand_dims([predicted_id], 0)

    text_generated.append(idx2char[predicted_id])

  return (start_string + ''.join(text_generated))
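The `temperature` value in `generate_text` divides the logits before sampling. A plain-Python sketch on made-up logits shows the effect: values below 1.0 sharpen the distribution (more predictable text), while values above 1.0 flatten it (more surprising text). `softmax_with_temperature`, `toy_logits`, and `top_probs` are names introduced here for illustration.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

toy_logits = [2.0, 1.0, 0.1]  # hypothetical scores for a 3-character vocabulary

top_probs = {}
for temperature in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(toy_logits, temperature)
    top_probs[temperature] = max(probs)
    print(temperature, [round(p, 3) for p in probs])
```

Lowering the temperature raises the probability of the most likely character, so sampling behaves more like always picking the top choice.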

In [69]:
inp = input("Type a starting string: ")
print(generate_text(model, inp))

Type a starting string: hello
hellos:
If namer, not yimphed wetch, be gone.

PROSPERO:
Cas your sheep Henry blemish'd,
As thou three-bear'd tyranny---
To take the blood; is this behavior so much.

STRANUS:
Nurse!

ARIEL:
No dount with heaven above it. Anbertant, I
art thift her, lounds.

ARTONIO:
Take thy love.

Lord:
How now! I am lost, and sure'd; what thou this sadditymen gates,
Which thou hast born'd to figning in his trowness:
A mortal match; I'll cut our world.

ANTONIO:
He is har these fires so groundful kingrance.

DUKE VINCENTIO:
And so helceforth o' the world.

Provost:
Having of us; must.

GRUMIO:
O False, now, sir, his name of Nos mercy?

LUCENTIO:
I pray thee; a saily countenance came;
Mistressed means to lie, alasks, we would cban forth
Sear madam, and put by colours virtuous sound
Is a boar e'er done. What th
