
# Assignment: Text generation with an RNN

This tutorial demonstrates how to generate text using a character-based RNN. We will work with a dataset of Shakespeare's writing from Andrej Karpathy's [The Unreasonable Effectiveness of Recurrent Neural Networks](http://karpathy.github.io/2015/05/21/rnn-effectiveness/). Given a sequence of characters from this data ("Shakespear"), train a model to predict the next character in the sequence ("e"). Longer sequences of text can be generated by calling the model repeatedly.

<pre>
QUEENE:
I had thought thou hadst a Roman; for the oracle,
Thus by All bids the man against the word,
Which are so weak of care, by old care done;
Your children were in your holy love,
And the precipitation through the bleeding throne.

BISHOP OF ELY:
Marry, and will, my lord, to weep in such a one were prettiest;
Yet now I was adopted heir
Of the world's lamentable day,
To watch the next way with his father with his face?

ESCALUS:
The cause why then we are all resolved more sons.

VOLUMNIA:
O, no, no, no, no, no, no, no, no, no, no, no, no, no, no, no, no, no, no, no, no, it is no sin it should be dead,
And love and pale as any will to that word.

QUEEN ELIZABETH:
But how long have I heard the soul for this world,
And show his hands of life be proved to stand.

PETRUCHIO:
I say he look'd on, if I must be content
To stay him from the fatal of our country's bliss.
His lordship pluck'd from this sentence then for prey,
And then let us twain, being the moon,
were she such a case as fills m
</pre>

While some of the sentences are grammatical, most do not make sense. The model has not learned the meaning of words, but consider:

* The model is character-based. When training started, the model did not know how to spell an English word, or that words were even a unit of text.

* The structure of the output resembles a play—blocks of text generally begin with a speaker name, in all capital letters similar to the dataset.

* As you will show, the model is trained on small batches of text (100 characters each), and is still able to generate a longer sequence of text with coherent structure.

### Import Necessary Libraries

In [None]:
import tensorflow as tf

import numpy as np
import os
import time

import warnings
warnings.filterwarnings("ignore")

from tensorflow.keras.losses import sparse_categorical_crossentropy
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.layers import Embedding, GRU, Dense, Dropout

### Download the dataset

Note: In the future, you can use your own dataset by changing the path.

In [None]:
path_to_file = tf.keras.utils.get_file('shakespeare.txt', 'https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt')

### Read the data (5 Marks)


In [None]:
# Read, then decode for Python compatibility.
with open(path_to_file, 'rb') as f:
    text = f.read().decode(encoding='utf-8')

In [None]:
# Take a look at the first 250 characters in text
print(text[:250])

In [None]:
# The unique characters in the file
vocab = sorted(set(text))
length_vocab = len(vocab)
print('Number of unique characters in the file:', length_vocab)
print('The unique characters are:', vocab)

## Process the text (10 Marks)

### Vectorize the text

Before training, we need to map strings to a numerical representation. Create two lookup tables: one mapping characters to numbers, and another for numbers to characters.

In [None]:
# Creating a mapping from unique characters to indices
char2idx = {u:i for i, u in enumerate(vocab)}

In [None]:
# Create the reverse mapping from indices to characters, then display it
idx2char = np.array(vocab)
idx2char

In [None]:
# Show how the first 13 characters from the text are mapped to integers
text_as_int = np.array([char2idx[c] for c in text])
print(text_as_int[:13])
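
As a quick sanity check on the two lookup tables, the sketch below rebuilds them for a tiny made-up string and confirms that encoding followed by decoding returns the original text. The names `sample_text` and the `toy_`-prefixed variables are illustrative only and are not part of the assignment; plain Python lists stand in for the NumPy arrays used in the notebook.

```python
# Toy round trip: characters -> indices -> characters.
sample_text = "hello world"  # hypothetical stand-in for the Shakespeare text

toy_vocab = sorted(set(sample_text))
toy_char2idx = {ch: i for i, ch in enumerate(toy_vocab)}
toy_idx2char = list(toy_vocab)  # position i holds the character with index i

encoded = [toy_char2idx[ch] for ch in sample_text]
decoded = "".join(toy_idx2char[i] for i in encoded)

print(toy_vocab)
print(encoded)
print(decoded)
```

Because the vocabulary is sorted, the same text always produces the same indices, which keeps saved checkpoints consistent with the mapping between runs.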

### Create training examples and targets (10 Marks)

Divide the text into example sequences. Each input sequence contains `seq_len` characters from the text.

For each input sequence, the corresponding target contains the same number of characters, shifted one character to the right.

To do so, break the text into chunks of `seq_len + 1` characters. For example, if `seq_len` is 4 and the text is "Hello", the input sequence is "Hell" and the target sequence is "ello".

First, use the `tf.data.Dataset.from_tensor_slices` function to convert the text vector into a stream of character indices.

In [None]:
#Your Code Here
seq_len = 120
total_num_seq = len(text) // (seq_len+1)

Note:  You may want to use the `batch` method.  It lets you easily convert these individual characters to sequences of the desired size.

In [None]:
#Your Code Here
char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)
sequences = char_dataset.batch(seq_len+1, drop_remainder=True)

For each sequence, duplicate and shift it to form the input and target text:

In [None]:
#Your Code Here
def split_input_target(chunk):
    input_text = chunk[:-1]
    target_text = chunk[1:]
    return input_text, target_text
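
The chunk-and-shift idea can be tried on the "Hello" example in plain Python, without `tf.data`. This is only an illustrative sketch: `chunk_text` and `split_chunk` are hypothetical helper names, and strings stand in for the integer tensors used in the notebook.

```python
def chunk_text(text, seq_length):
    """Split text into non-overlapping chunks of seq_length + 1 characters,
    dropping any incomplete chunk at the end."""
    size = seq_length + 1
    return [text[i:i + size] for i in range(0, len(text) - size + 1, size)]


def split_chunk(chunk):
    """Input is everything but the last character; target is shifted by one."""
    return chunk[:-1], chunk[1:]


pairs = [split_chunk(c) for c in chunk_text("Hello", seq_length=4)]
print(pairs)
```

At every position, the target character is the one that immediately follows the corresponding input character, which is what lets the model learn next-character prediction at each time step.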

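Now print the first example's input and target values:

In [None]:
#Your Code Here
dataset = sequences.map(split_input_target)
# Decode the first (input, target) pair back to text for inspection
for input_example, target_example in dataset.take(1):
    print('Input: ', repr(''.join(idx2char[input_example.numpy()])))
    print('Target:', repr(''.join(idx2char[target_example.numpy()])))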

### Create training batches (5 Marks)

Use the TensorFlow `shuffle` method to shuffle the data and pack it into batches.

In [None]:
# Batch size
BATCH_SIZE = 64

# Buffer size to shuffle the dataset
BUFFER_SIZE = 10000

#Your Code Here
dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)
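
Since both `batch` calls use `drop_remainder=True`, the number of training steps per epoch follows from two integer divisions. The sketch below works through that arithmetic; `batches_per_epoch` is a hypothetical helper, and the character count passed to it is a made-up round number, not the size of the Shakespeare file.

```python
def batches_per_epoch(num_chars, seq_len, batch_size):
    """Full batches per epoch when incomplete chunks and batches are dropped."""
    num_sequences = num_chars // (seq_len + 1)
    return num_sequences // batch_size


# num_chars is an illustrative value; substitute len(text) in the notebook.
print(batches_per_epoch(num_chars=1_000_000, seq_len=120, batch_size=64))
```

A longer `seq_len` or a larger `BATCH_SIZE` therefore means fewer, larger steps per epoch.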

## Build The Model (20 Marks)

Use `tf.keras.Sequential` to define the model, with the following layers:

* `tf.keras.layers.Embedding`: The input layer. A trainable lookup table that maps the index of each character to a vector with `embedding_dim` dimensions.
* `tf.keras.layers.GRU`: A type of RNN with size `units=rnn_units`. (You can also use an LSTM layer here.)
* `tf.keras.layers.Dense`: The output layer, with `vocab_size` outputs.

In [None]:
# Length of the vocabulary in chars
vocab_size = len(vocab)

# The embedding dimension
embedding_dim = 256

# Number of RNN units
rnn_units = 1024

In [None]:
#Your Code Here
def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embedding_dim,batch_input_shape=[batch_size, None]),
        tf.keras.layers.GRU(rnn_units,return_sequences=True,stateful=True,recurrent_initializer='glorot_uniform'),
        tf.keras.layers.Dense(vocab_size)
    ])
    return model

model = build_model(vocab_size = len(vocab),embedding_dim = embedding_dim,rnn_units = rnn_units,batch_size = BATCH_SIZE)

## Try the model (10 Marks)

Now run the model to see that it behaves as expected.

First check the shape of the output:

In [None]:
#Your Code Here
for layer in model.layers:
    print(layer.output_shape)

Show a summary of the model

In [None]:
#Your Code Here
model.summary()
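
The parameter counts reported by the summary can be cross-checked by hand. The sketch below is a rough calculation under two assumptions that should be verified against your own run: a vocabulary of 65 characters (use the value printed for `len(vocab)`), and the GRU bias layout used when `reset_after=True`, which gives each of the three weight blocks two bias vectors. The helper function names are illustrative.

```python
def embedding_params(vocab_size, embedding_dim):
    # One embedding_dim-sized vector per character in the vocabulary.
    return vocab_size * embedding_dim


def gru_params(input_dim, units):
    # Three weight blocks (update gate, reset gate, candidate state), each with
    # an input kernel, a recurrent kernel and two bias vectors
    # (assumes the reset_after=True layout).
    return 3 * (units * (input_dim + units) + 2 * units)


def dense_params(input_dim, output_dim):
    # Weight matrix plus one bias per output.
    return input_dim * output_dim + output_dim


vocab_size_assumed = 65  # assumption: replace with the printed len(vocab)
embedding_dim = 256
rnn_units = 1024

total = (embedding_params(vocab_size_assumed, embedding_dim)
         + gru_params(embedding_dim, rnn_units)
         + dense_params(rnn_units, vocab_size_assumed))
print(total)
```

Nearly all of the parameters sit in the recurrent layer, so `rnn_units` is the hyperparameter with the largest effect on model size.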

## Train the model (20 Marks)

Now we can approach this problem as a standard classification problem. Given the previous RNN state and the input at this time step, predict the class of the next character.

### Attach an optimizer, and a loss function

The standard `tf.keras.losses.sparse_categorical_crossentropy` loss function works in this case because it is applied across the last dimension of the predictions.

Because the model returns logits, set the `from_logits` flag to `True`. For more insight into logits, see https://datascience.stackexchange.com/questions/31041/what-does-logits-in-machine-learning-mean


In [None]:
#Your Code Here
def loss(labels, logits):
  return tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)
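
To make the role of `from_logits=True` concrete, here is a minimal pure-Python sketch of the per-character loss: the negative log of the softmax probability assigned to the true class. It is not the Keras implementation, only an illustration of the same quantity; `sparse_ce_from_logits` is a hypothetical name, and the 65-class example is an arbitrary choice.

```python
import math


def sparse_ce_from_logits(label, logits):
    """Negative log-probability of the true class, computed stably from logits."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]


# An untrained model that scores all classes equally pays log(num_classes).
uniform_loss = sparse_ce_from_logits(0, [0.0] * 65)
# A model that strongly favours the correct class pays almost nothing.
confident_loss = sparse_ce_from_logits(2, [0.0, 0.0, 10.0])
print(uniform_loss, confident_loss)
```

The uniform case gives a useful baseline: a loss near the logarithm of the vocabulary size at the start of training suggests the model is still guessing at random.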

Compile the model with `tf.keras.optimizers.Adam` with default arguments and the loss function.

In [None]:
#Your Code Here
model.compile(optimizer="adam", loss=loss, metrics=['accuracy'])

### Configure checkpoints

Use a `tf.keras.callbacks.ModelCheckpoint` to ensure that checkpoints are saved during training:

In [None]:
#Your Code Here
checkpoint_dir = './training_checkpoints'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")

checkpoint_callback=tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_prefix,
    save_weights_only=True)
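
The `{epoch}` placeholder in the prefix is filled in by Keras each time a checkpoint is written. The sketch below imitates that substitution with `str.format` to show what the resulting path prefixes look like; it does not call Keras, and the epoch numbers are arbitrary examples.

```python
import os

checkpoint_dir = './training_checkpoints'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")

# Keras fills in {epoch} itself when saving; str.format mimics that here.
example_paths = [checkpoint_prefix.format(epoch=e) for e in (1, 2, 10)]
print(example_paths)
```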

### Execute the training (10 Marks)

Please use 10 epochs to train the model.

In [None]:
EPOCHS=10

In [None]:
#Your Code Here
history = model.fit(dataset, epochs=EPOCHS, callbacks=[checkpoint_callback],verbose=1)

## Generate text

### Restore the latest checkpoint

To keep this prediction step simple, use a batch size of 1.

Because of the way the RNN state is passed from timestep to timestep, the model only accepts a fixed batch size once built.

To run the model with a different `batch_size`, we need to rebuild the model and restore the weights from the checkpoint.


In [None]:
#Your Code Here
# Look up the most recent checkpoint once and reuse the path
latest_checkpoint = tf.train.latest_checkpoint(checkpoint_dir)
model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)
model.load_weights(latest_checkpoint)
model.build(tf.TensorShape([1, None]))
model.summary()

In [None]:
# Please use this provided function to generate your text!
def generate_text(model, start_string):
  # Evaluation step (generating text using the learned model)

  # Number of characters to generate
  num_generate = 1000

  # Converting our start string to numbers (vectorizing)
  input_eval = [char2idx[s] for s in start_string]
  input_eval = tf.expand_dims(input_eval, 0)

  # Empty string to store our results
  text_generated = []

  # Low temperatures result in more predictable text.
  # Higher temperatures result in more surprising text.
  # Experiment to find the best setting.
  temperature = 1.0

  # Here batch size == 1
  model.reset_states()
  for i in range(num_generate):
    predictions = model(input_eval)
    # remove the batch dimension
    predictions = tf.squeeze(predictions, 0)

    # using a categorical distribution to predict the character returned by the model
    predictions = predictions / temperature
    predicted_id = tf.random.categorical(predictions, num_samples=1)[-1,0].numpy()

    # We pass the predicted character as the next input to the model
    # along with the previous hidden state
    input_eval = tf.expand_dims([predicted_id], 0)

    text_generated.append(idx2char[predicted_id])

  return (start_string + ''.join(text_generated))
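
The effect of dividing the logits by `temperature` before sampling can be seen on a small example. The sketch below is a pure-Python illustration rather than the `tf.random.categorical` implementation: the helper names and `toy_logits` values are made up, and `random.Random.choices` stands in for the categorical draw.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after scaling them by 1 / temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(logits, temperature, rng):
    """Draw one class index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


toy_logits = [2.0, 1.0, 0.5]  # made-up scores for three characters
cold = softmax_with_temperature(toy_logits, 0.5)  # sharper distribution
hot = softmax_with_temperature(toy_logits, 2.0)   # flatter distribution
print(cold)
print(hot)
print(sample_index(toy_logits, 1.0, random.Random(0)))
```

A temperature below 1 concentrates probability on the highest-scoring character, while a temperature above 1 spreads it out, which is why high settings produce more surprising but less coherent text.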

In [None]:
#Your Code Here
print(generate_text(model, start_string=u"ROMEO: "))


## Bonus (5 Marks)

Experiment with a different start string, add another RNN layer to improve the model's accuracy, and adjust the temperature parameter to generate more or less random predictions.

Your explanation: how did you modify the model or hyperparameters, and what were your results?

In [None]:
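EPOCHS=30
# Note: generate_text defines its own local temperature (1.0), so this
# module-level value has no effect unless the function is changed to use it.
temperature = 2.0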


In [None]:
def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embedding_dim,batch_input_shape=[batch_size, None]),
        tf.keras.layers.GRU(rnn_units,return_sequences=True,stateful=True,recurrent_initializer='glorot_uniform'),
        tf.keras.layers.LSTM(rnn_units,return_sequences=True,stateful=True,recurrent_initializer='glorot_uniform'),
        tf.keras.layers.Dense(vocab_size)
    ])
    return model

model = build_model(vocab_size = len(vocab),embedding_dim = embedding_dim,rnn_units = rnn_units,batch_size = BATCH_SIZE)

model.compile(optimizer="adam", loss=loss, metrics=['accuracy'])

checkpoint_dir = './training_checkpoints'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")

checkpoint_callback=tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_prefix,
    save_weights_only=True)

history = model.fit(dataset, epochs=EPOCHS, callbacks=[checkpoint_callback],verbose=1)

# Look up the most recent checkpoint once and reuse the path
latest_checkpoint = tf.train.latest_checkpoint(checkpoint_dir)
model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)
model.load_weights(latest_checkpoint)
model.build(tf.TensorShape([1, None]))

In [None]:
print(generate_text(model, start_string=u"ARCNIDAMAE: "))