## Text Generation with RNNs

This notebook shows how to generate text with a character-based RNN trained on Shakespeare's writings.

Given a sequence of characters from the data, we train a model to predict the next character in the sequence.


Some of the generated sentences are grammatically correct, but the rest do not make sense.
<li> The model is character-based. When training started, the model did not know how to spell an English word, or that words were even a unit of text.
    


In [1]:
# import tensorflow and the dataset
import tensorflow as tf
from tensorflow.keras.layers.experimental import preprocessing

import numpy as np
import os
import time
path_to_file = tf.keras.utils.get_file('shakespeare.txt', 'https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt')

Downloading data from https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt


#### Reading the Data

In [2]:
text = open(path_to_file,'rb').read().decode(encoding = 'utf-8')

print(f'Length of the text: {len(text)} characters')

Length of the text: 1115394 characters


In [3]:
# take a look at first 1000 letters
print(text[:1000])

First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you know Caius Marcius is chief enemy to the people.

All:
We know't, we know't.

First Citizen:
Let us kill him, and we'll have corn at our own price.
Is't a verdict?

All:
No more talking on't; let it be done: away, away!

Second Citizen:
One word, good citizens.

First Citizen:
We are accounted poor citizens, the patricians good.
What authority surfeits on would relieve us: if they
would yield us but the superfluity, while it were
wholesome, we might guess they relieved us humanely;
but they think we are too dear: the leanness that
afflicts us, the object of our misery, is as an
inventory to particularise their abundance; our
sufferance is a gain to them Let us revenge this with
our pikes, ere we become rakes: for the gods know I
speak this in hunger for bread, not in thirst for revenge.



In [4]:
# collect the unique characters in the text

vocab = sorted(set(text))
#vocab

In [6]:
print(f'{len(vocab)} unique characters')

# the unique characters are mostly uppercase and lowercase letters, plus punctuation and whitespace

65 unique characters


### Vectorize the Text

Before training, we need to convert the strings to a numerical representation.

We'll use the preprocessing.StringLookup layer, which converts each character into a numeric ID.

First, we just need to split the text into tokens.

In [7]:
example_texts = ['Monit', 'Sharma']
chars = tf.strings.unicode_split(example_texts, input_encoding='UTF-8')
chars

<tf.RaggedTensor [[b'M', b'o', b'n', b'i', b't'], [b'S', b'h', b'a', b'r', b'm', b'a']]>

In [8]:
# now the preprocessing one
ids_from_chars = preprocessing.StringLookup(vocabulary= list(vocab), mask_token= None)

ids = ids_from_chars(chars)

ids

<tf.RaggedTensor [[26, 54, 53, 48, 59], [32, 47, 40, 57, 52, 40]]>
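Each ID above is one higher than the character's position in the sorted vocabulary, because StringLookup (with its default single out-of-vocabulary bucket and mask_token=None) reserves index 0 for the "[UNK]" token. The following is a minimal plain-Python sketch of that mapping, not the layer's actual implementation; toy_vocab, encode and decode are hypothetical names used only for illustration.

```python
# Plain-Python stand-in for the two StringLookup layers (illustration only).
toy_vocab = sorted(set("hello world"))   # hypothetical small vocabulary
UNK = "[UNK]"                            # index 0 is reserved for unknown characters

# Forward lookup: character -> ID, shifted by one to leave room for [UNK].
id_from_char = {ch: i + 1 for i, ch in enumerate(toy_vocab)}
# Inverse lookup: ID -> character, with [UNK] at position 0.
char_from_id = [UNK] + toy_vocab

def encode(s):
    """Map each character to its ID; unseen characters map to 0."""
    return [id_from_char.get(ch, 0) for ch in s]

def decode(ids):
    """Map IDs back to characters and join them into a string."""
    return "".join(char_from_id[i] for i in ids)

toy_ids = encode("hello")
print(toy_ids)
print(decode(toy_ids))
print(encode("z"))          # 'z' is not in toy_vocab
print(len(char_from_id))    # vocabulary size including [UNK]
```

The same offset explains why the model in this notebook is built with 66 outputs even though the text has 65 unique characters.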

In [9]:
chars_from_ids = tf.keras.layers.experimental.preprocessing.StringLookup(
vocabulary = ids_from_chars.get_vocabulary(), invert = True, mask_token = None)

In [10]:
chars = chars_from_ids(ids)
chars

<tf.RaggedTensor [[b'M', b'o', b'n', b'i', b't'], [b'S', b'h', b'a', b'r', b'm', b'a']]>

In [11]:
# make it back as a string
tf.strings.reduce_join(chars, axis=-1).numpy()

array([b'Monit', b'Sharma'], dtype=object)

In [12]:
# simply make a function out of it
def text_from_ids(ids):
    return tf.strings.reduce_join(chars_from_ids(ids), axis=-1)

### The Prediction Task
The input to the model is a sequence of characters, and we train the model to predict the output: the following character at each time step.

#### Create Training examples and targets

Divide the text into example sequences. Each input sequence will contain seq_length characters from the text.

For each input sequence, the corresponding target contains the same length of text, shifted one character to the right.

For example, say seq_length is 4 and our text is "Hello". The input sequence would be "Hell", and the target sequence "ello".


We'll use the tf.data.Dataset.from_tensor_slices function to convert the text vector into a stream of character indices.

In [13]:
all_ids = ids_from_chars(tf.strings.unicode_split(text, 'UTF-8'))
all_ids

<tf.Tensor: shape=(1115394,), dtype=int64, numpy=array([19, 48, 57, ..., 46,  9,  1])>

In [14]:
ids_dataset = tf.data.Dataset.from_tensor_slices(all_ids)

In [15]:
for ids in ids_dataset.take(20):
    print(chars_from_ids(ids).numpy().decode('utf-8'))

F
i
r
s
t
 
C
i
t
i
z
e
n
:


B
e
f
o
r


In [16]:
# we'll use sequences of 100 characters

seq_length = 100
examples_per_epoch = len(text)//(seq_length+1)
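As a worked check of the arithmetic, the sketch below plugs in the text length printed earlier (1115394 characters) and the batch size of 64 that this notebook uses when batching. The variable names are illustrative; only the numbers come from the notebook.

```python
text_length = 1115394   # characters in the text, as printed when the file was read
chunk = 100 + 1         # seq_length plus one extra character, so input and target can overlap
batch_size = 64         # batch size used when the dataset is batched

# Number of full 101-character chunks; the trailing partial chunk is dropped.
n_examples = text_length // chunk
leftover_chars = text_length - n_examples * chunk

# With drop_remainder=True, the last partial batch is discarded as well.
batches_per_epoch = n_examples // batch_size

print(n_examples, leftover_chars, batches_per_epoch)   # 11043 51 172
```

So one pass over the data is roughly 172 training steps.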

In [17]:
# the batch method helps us to convert the individual characters to sequences of desired size

sequences = ids_dataset.batch(seq_length+1, drop_remainder= True)

for seq in sequences.take(1):
    print(chars_from_ids(seq))

tf.Tensor(
[b'F' b'i' b'r' b's' b't' b' ' b'C' b'i' b't' b'i' b'z' b'e' b'n' b':'
 b'\n' b'B' b'e' b'f' b'o' b'r' b'e' b' ' b'w' b'e' b' ' b'p' b'r' b'o'
 b'c' b'e' b'e' b'd' b' ' b'a' b'n' b'y' b' ' b'f' b'u' b'r' b't' b'h'
 b'e' b'r' b',' b' ' b'h' b'e' b'a' b'r' b' ' b'm' b'e' b' ' b's' b'p'
 b'e' b'a' b'k' b'.' b'\n' b'\n' b'A' b'l' b'l' b':' b'\n' b'S' b'p' b'e'
 b'a' b'k' b',' b' ' b's' b'p' b'e' b'a' b'k' b'.' b'\n' b'\n' b'F' b'i'
 b'r' b's' b't' b' ' b'C' b'i' b't' b'i' b'z' b'e' b'n' b':' b'\n' b'Y'
 b'o' b'u' b' '], shape=(101,), dtype=string)


In [18]:
# join them back in strings
for seq in sequences.take(5):
    print(text_from_ids(seq).numpy())

b'First Citizen:\nBefore we proceed any further, hear me speak.\n\nAll:\nSpeak, speak.\n\nFirst Citizen:\nYou '
b'are all resolved rather to die than to famish?\n\nAll:\nResolved. resolved.\n\nFirst Citizen:\nFirst, you k'
b"now Caius Marcius is chief enemy to the people.\n\nAll:\nWe know't, we know't.\n\nFirst Citizen:\nLet us ki"
b"ll him, and we'll have corn at our own price.\nIs't a verdict?\n\nAll:\nNo more talking on't; let it be d"
b'one: away, away!\n\nSecond Citizen:\nOne word, good citizens.\n\nFirst Citizen:\nWe are accounted poor citi'


In [19]:
def split_input_target(sequence):
    input_text = sequence[:-1]
    target_text = sequence[1:]
    return input_text, target_text

In [20]:
split_input_target(list("Monit Sharma"))

(['M', 'o', 'n', 'i', 't', ' ', 'S', 'h', 'a', 'r', 'm'],
 ['o', 'n', 'i', 't', ' ', 'S', 'h', 'a', 'r', 'm', 'a'])

In [21]:
dataset = sequences.map(split_input_target)

In [22]:
for input_example, target_example in dataset.take(1):
    print("Input: " ,text_from_ids(input_example).numpy())
    print("Target: ", text_from_ids(target_example).numpy())

Input:  b'First Citizen:\nBefore we proceed any further, hear me speak.\n\nAll:\nSpeak, speak.\n\nFirst Citizen:\nYou'
Target:  b'irst Citizen:\nBefore we proceed any further, hear me speak.\n\nAll:\nSpeak, speak.\n\nFirst Citizen:\nYou '


#### Creating training batches

We need to shuffle the data and pack it into batches.

In [23]:
BATCH_SIZE = 64

# Buffer size to shuffle the dataset
# (TF data is designed to work with possibly infinite sequences,
# so it doesn't attempt to shuffle the entire sequence in memory. Instead,
# it maintains a buffer in which it shuffles elements)


BUFFER_SIZE = 10000

dataset = (
dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder = True)
.prefetch(tf.data.experimental.AUTOTUNE))

dataset

<PrefetchDataset shapes: ((64, 100), (64, 100)), types: (tf.int64, tf.int64)>
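The buffer-based shuffling described in the comments above can be pictured with a small plain-Python generator. This is a simplified model of the idea under the assumption of a fixed-size buffer with uniform random replacement; it is not tf.data's implementation, and buffered_shuffle is a hypothetical helper.

```python
import random

def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in a locally shuffled order using a fixed-size buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Fill the buffer before emitting anything.
            buffer.append(item)
            continue
        # Buffer is full: emit a random element and put the new item in its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: emit whatever is left in random order.
    rng.shuffle(buffer)
    yield from buffer

shuffled = list(buffered_shuffle(range(20), buffer_size=5, seed=0))
print(shuffled)
```

In this sketch an element can only be emitted once it has entered the buffer, so a small buffer gives only local shuffling. Here BUFFER_SIZE = 10000 is close to the number of 101-character sequences in the dataset, so the shuffle is nearly global.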

### Build the Model

This model has three layers:

<li>tf.keras.layers.Embedding: The input layer. A trainable lookup table that maps each character ID to a vector with embedding_dim dimensions.
<li>tf.keras.layers.GRU: A type of RNN with size units=rnn_units. (You can also use an LSTM layer here.)
<li>tf.keras.layers.Dense: The output layer, with vocab_size outputs. It outputs one logit for each character in the vocabulary. These are the unnormalized log-probabilities of each character according to the model.

In [24]:
# length of the vocabulary in chars
vocab_size = len(vocab)

# embedding dimension
embedding_dim = 256

# number of RNN units

rnn_units = 1024

In [25]:
# now the model

class MyModel(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, rnn_units):
        super().__init__()
        self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
        self.gru = tf.keras.layers.GRU(rnn_units, return_sequences=True, return_state= True)
        self.dense = tf.keras.layers.Dense(vocab_size)
        
        
    def call(self, inputs, states= None, return_state= False, training=False):
        x = inputs
        x = self.embedding(x, training= training)
        if states is None:
            states = self.gru.get_initial_state(x)
        x, states = self.gru(x, initial_state=states, training=training)
        x = self.dense(x, training=training)
        
        if return_state:
            return x, states
        else:
            return x

   

In [26]:
model = MyModel(
    # Be sure the vocabulary size matches the `StringLookup` layers.
    vocab_size=len(ids_from_chars.get_vocabulary()),
    embedding_dim=embedding_dim,
    rnn_units=rnn_units)

### Try the Model



In [27]:
# checking the shape of the output
for input_example_batch, target_example_batch in dataset.take(1):
    example_batch_predictions = model(input_example_batch)
    print(example_batch_predictions.shape, "# (batch_size, sequence_length, vocab_size)")

(64, 100, 66) # (batch_size, sequence_length, vocab_size)


In [28]:
model.summary()

Model: "my_model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding (Embedding)        multiple                  16896     
_________________________________________________________________
gru (GRU)                    multiple                  3938304   
_________________________________________________________________
dense (Dense)                multiple                  67650     
Total params: 4,022,850
Trainable params: 4,022,850
Non-trainable params: 0
_________________________________________________________________
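The parameter counts in the summary can be reproduced by hand. The sketch below assumes the Keras GRU defaults in TensorFlow 2 (reset_after=True, which gives each of the three gates two bias vectors) and a vocabulary of 66 tokens (65 characters plus "[UNK]"). The variable names are illustrative.

```python
n_tokens = 66     # 65 unique characters plus the [UNK] token
emb_dim = 256     # embedding_dim
units = 1024      # rnn_units

# Embedding: one emb_dim-sized vector per token.
embedding_params = n_tokens * emb_dim

# GRU: three gates, each with an input kernel, a recurrent kernel,
# and (assuming reset_after=True) two bias vectors.
gru_params = 3 * (emb_dim * units + units * units + 2 * units)

# Dense: a weight matrix from the GRU output to the logits, plus one bias per token.
dense_params = units * n_tokens + n_tokens

total_params = embedding_params + gru_params + dense_params
print(embedding_params, gru_params, dense_params, total_params)
# 16896 3938304 67650 4022850
```

Almost all of the parameters sit in the GRU's recurrent and input kernels.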


In [29]:
sampled_indices = tf.random.categorical(example_batch_predictions[0], num_samples=1)
sampled_indices = tf.squeeze(sampled_indices, axis=-1).numpy()

In [30]:
# This gives us, at each timestep, a prediction of the next character index:
sampled_indices

array([17, 25, 54, 20, 14, 16,  7, 22, 31, 25, 43, 62,  6, 50, 15, 19, 31,
       28,  5, 17, 44, 60,  9, 23, 48,  7, 58, 13,  1, 38, 40, 21, 26, 59,
        9,  1, 34,  9, 11,  5, 42, 56, 56,  4, 43, 14, 63, 43, 53, 12, 39,
       31, 47, 27,  8, 63, 36, 50, 19, 56, 42, 35, 60, 23,  0, 27, 29, 27,
       25, 31,  8, 25, 22, 23, 61, 30, 57,  8, 64, 31, 31, 32, 43, 32, 51,
       46, 62, 28, 64, 10, 45, 26, 36, 58,  7, 23, 42, 18, 49, 23])

In [31]:
# Decode these to see the text predicted by this untrained model:

print("Input:\n", text_from_ids(input_example_batch[0]).numpy())
print()
print("Next Char Predictions:\n", text_from_ids(sampled_indices).numpy())

Input:
 b'ther circumstances\nMade up to the deed, doth push on this proceeding:\nYet, for a greater confirmatio'

Next Char Predictions:
 b"DLoGAC,IRLdw'kBFRO&Deu.Ji,s?\nYaHMt.\nU.:&cqq$dAxdn;ZRhN-xWkFqcVuJ[UNK]NPNLR-LIJvQr-yRRSdSlgwOy3fMWs,JcEjJ"


### Train the Model
At this point the problem can be treated as a standard classification problem. Given the previous RNN state and the input at this time step, predict the class of the next character.

#### Attach an Optimizer and a loss function
The standard tf.keras.losses.sparse_categorical_crossentropy loss function works in this case because it is applied across the last dimension of the predictions.

Because our model returns logits, we need to set the from_logits flag.
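To make the role of from_logits concrete, here is a minimal plain-Python sketch of sparse categorical cross-entropy computed directly from logits for a single time step. It illustrates the formula only and is not the Keras implementation; sparse_crossentropy_from_logits is a hypothetical helper name.

```python
import math

def sparse_crossentropy_from_logits(logits, target_id):
    """Return -log(softmax(logits)[target_id]) using a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target_id]

# Hypothetical logits for a 3-character vocabulary; the true next character has ID 0.
example_logits = [2.0, 1.0, 0.1]
print(sparse_crossentropy_from_logits(example_logits, target_id=0))

# If every character gets the same logit, the loss is log(number of characters).
print(sparse_crossentropy_from_logits([0.0] * 66, target_id=3))
```

With from_logits=True the softmax is applied inside the loss, which is why the Dense layer in the model has no activation.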

In [32]:
loss = tf.losses.SparseCategoricalCrossentropy(from_logits=True)

In [33]:
example_batch_loss = loss(target_example_batch, example_batch_predictions)
mean_loss = example_batch_loss.numpy().mean()
print("Prediction shape: ", example_batch_predictions.shape, " # (batch_size, sequence_length, vocab_size)")
print("Mean loss:        ", mean_loss)

Prediction shape:  (64, 100, 66)  # (batch_size, sequence_length, vocab_size)
Mean loss:         4.1890397


In [34]:
tf.exp(mean_loss).numpy()

65.95942
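The exponential of the mean loss is worth reading as a sanity check: a model that assigns equal probability to all 66 output classes has a loss of ln(66), so its exponentiated loss equals the number of classes. The sketch below compares that baseline with the mean loss printed above; the variable names are illustrative.

```python
import math

n_classes = 66                 # size of the model's output layer
observed_loss = 4.1890397      # mean loss of the untrained model, as printed above

# Loss of a model that predicts every class with probability 1/n_classes.
uniform_loss = -math.log(1.0 / n_classes)

print(uniform_loss)              # close to 4.1897
print(math.exp(observed_loss))   # close to 65.96, roughly n_classes
```

An untrained model landing this close to the uniform baseline suggests it is well initialized and not confidently wrong about any character.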

In [35]:
model.compile(optimizer='adam', loss=loss)

### Configure Checkpoints
Use a tf.keras.callbacks.ModelCheckpoint to ensure that checkpoints are saved during training:

In [36]:
# Directory where the checkpoints will be saved
checkpoint_dir = './training_checkpoints'
# Name of the checkpoint files
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")

checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_prefix,
    save_weights_only=True)

### Execute the Model

In [37]:

EPOCHS = 20

In [38]:

history = model.fit(dataset, epochs=EPOCHS, callbacks=[checkpoint_callback])

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


### Generate Text
The simplest way to generate text with this model is to run it in a loop, and keep track of the model's internal state as you execute it.


Each time we call the model we pass in some text and an internal state. The model returns a prediction for the next character and its new state. Pass the prediction and state back in to continue generating text.

The following makes a single step prediction:

In [39]:
class OneStep(tf.keras.Model):
  def __init__(self, model, chars_from_ids, ids_from_chars, temperature=1.0):
    super().__init__()
    self.temperature = temperature
    self.model = model
    self.chars_from_ids = chars_from_ids
    self.ids_from_chars = ids_from_chars

    # Create a mask to prevent "[UNK]" from being generated.
    skip_ids = self.ids_from_chars(['[UNK]'])[:, None]
    sparse_mask = tf.SparseTensor(
        # Put a -inf at each bad index.
        values=[-float('inf')]*len(skip_ids),
        indices=skip_ids,
        # Match the shape to the vocabulary
        dense_shape=[len(ids_from_chars.get_vocabulary())])
    self.prediction_mask = tf.sparse.to_dense(sparse_mask)

  @tf.function
  def generate_one_step(self, inputs, states=None):
    # Convert strings to token IDs.
    input_chars = tf.strings.unicode_split(inputs, 'UTF-8')
    input_ids = self.ids_from_chars(input_chars).to_tensor()

    # Run the model.
    # predicted_logits.shape is [batch, char, next_char_logits]
    predicted_logits, states = self.model(inputs=input_ids, states=states,
                                          return_state=True)
    # Only use the last prediction.
    predicted_logits = predicted_logits[:, -1, :]
    predicted_logits = predicted_logits/self.temperature
    # Apply the prediction mask: prevent "[UNK]" from being generated.
    predicted_logits = predicted_logits + self.prediction_mask

    # Sample the output logits to generate token IDs.
    predicted_ids = tf.random.categorical(predicted_logits, num_samples=1)
    predicted_ids = tf.squeeze(predicted_ids, axis=-1)

    # Convert from token ids to characters
    predicted_chars = self.chars_from_ids(predicted_ids)

    # Return the characters and model state.
    return predicted_chars, states
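The prediction mask in OneStep relies on a property of softmax sampling: adding negative infinity to a logit drives that class's probability to exactly zero, while the remaining probabilities still sum to one. A minimal plain-Python sketch follows, with a hypothetical four-token vocabulary in which index 0 stands in for "[UNK]".

```python
import math

def softmax(logits):
    """Convert logits to probabilities; -inf logits become probability 0."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

raw_logits = [1.0, 0.5, 2.0, -0.3]          # hypothetical model output
mask = [-math.inf, 0.0, 0.0, 0.0]           # index 0 plays the role of "[UNK]"

masked_logits = [z + m for z, m in zip(raw_logits, mask)]
masked_probs = softmax(masked_logits)

print(softmax(raw_logits))
print(masked_probs)    # the first entry is 0.0, so index 0 can never be sampled
```

Dividing by the temperature before adding the mask, as OneStep does, leaves the masked entry at negative infinity for any positive temperature.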

In [40]:

one_step_model = OneStep(model, chars_from_ids, ids_from_chars)

Run it in a loop to generate some text. Looking at the generated text, you'll see the model knows when to capitalize and when to start new paragraphs, and that it imitates a Shakespeare-like vocabulary. With the small number of training epochs, it has not yet learned to form coherent sentences.

In [41]:
start = time.time()
states = None
next_char = tf.constant(['ROMEO:'])
result = [next_char]

for n in range(1000):
  next_char, states = one_step_model.generate_one_step(next_char, states=states)
  result.append(next_char)

result = tf.strings.join(result)
end = time.time()
print(result[0].numpy().decode('utf-8'), '\n\n' + '_'*80)
print('\nRun time:', end - start)

ROMEO:
O, sir, it is an hunn and odd! These are
past three of our commission. Go not succeed.

ISABELLA:
Ay, but by guess.

WARWICK:
And I could live and mock nor fram, and in this
complaint that makes the mind spoil from hence,
More than you think, that dog into the tortures
Do execute their headst of her becomes.
The mightive hand that is dear partain
As I thou wast boy were four asheem again,
Remember 'twas I war; but being obedience.
And look you, ladies out of noble Bolingbroke
Will all spread up to the world that drops of thee,
Nay, Women on off from him they have, with brood
And many by 'shal, a bait of breath,
When I shall you no great account of his sin.
One Margaret, but we ordend and lust:
Bianco, of as is springs bound to thee and trields,
A child, my reason why they have paid
The stamp'd-nice breaking onacters.

Second Murderer:
O God'd my lord; it is a guest friend Wills,
And set ochem against the prince my state:
A-blow is only in some-planets, that sall you,
And how dos

The easiest thing you can do to improve the results is to train it for longer (try EPOCHS = 30).

You can also experiment with a different start string, try adding another RNN layer to improve the model's accuracy, or adjust the temperature parameter to generate more or less random predictions.
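To see what the temperature parameter does, the sketch below divides a set of hypothetical logits by several temperatures before applying softmax, which mirrors the division performed in generate_one_step. The helper name and the example logits are assumptions for illustration.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Scale logits by 1/temperature, then convert them to probabilities."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

example_logits = [2.0, 1.0, 0.1]   # hypothetical logits for three characters

for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(example_logits, t)
    print(t, [round(p, 3) for p in probs])
```

Temperatures below 1 concentrate probability on the most likely character, giving more predictable text. Temperatures above 1 flatten the distribution and make the samples more random.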

If you want the model to generate text faster, the easiest thing you can do is batch the text generation. In the example below the model generates 5 outputs in about the same time it took to generate 1 above.

In [42]:
start = time.time()
states = None
next_char = tf.constant(['ROMEO:', 'ROMEO:', 'ROMEO:', 'ROMEO:', 'ROMEO:'])
result = [next_char]

for n in range(1000):
  next_char, states = one_step_model.generate_one_step(next_char, states=states)
  result.append(next_char)

result = tf.strings.join(result)
end = time.time()
print(result, '\n\n' + '_'*80)
print('\nRun time:', end - start)

tf.Tensor(
[b"ROMEO:\nFalsely to make the law: there is the man of winds\nFrom the charp of young and by o'er-heart\nAnd pass'd the morning rodumen before his\nsweet bybutish two my heart; and when the heart they saw he married\nWere frozand of a cunning law: may it hath\nbanied, young bridegroom from Parrish centurius.\nTo him the tomb.\n\nANTIGOUNE:\nmark you.\n\nAUTILLIUS:\nI stand forbid\nTo have him out o' the clod; thou'rt in mark'd\nBy so dwelling att, and ne'er the day of beating:\nNow, no more poth, sir. Pame your sword I saw him from her:\nYou take your daughter make a knee as Kender: it\nis the wance of heaven and the blashf\nLers is their parties, and the hearts to enewe\nWhere he appears he in the towns; herein we be spoke,\nWhich contradicts him: in her hope herein with\nWarwick: this is known that Henry's eyes.\nThis sensible damned Juliet, who should keep him where now,\nHath win my all hath been a caseless shun.\nA bact of you, impartience bladement,\nThan dignity in f

### Export the generator
This single-step model can easily be saved and restored, allowing you to use it anywhere a tf.saved_model is accepted.

In [43]:
tf.saved_model.save(one_step_model, 'one_step')
one_step_reloaded = tf.saved_model.load('one_step')






FOR DEVS: If you are overwriting _tracking_metadata in your class, this property has been used to save metadata in the SavedModel. The metadta field will be deprecated soon, so please move the metadata to a different file.



INFO:tensorflow:Assets written to: one_step/assets



In [44]:
states = None
next_char = tf.constant(['ROMEO:'])
result = [next_char]

for n in range(100):
  next_char, states = one_step_reloaded.generate_one_step(next_char, states=states)
  result.append(next_char)

print(tf.strings.join(result)[0].numpy().decode("utf-8"))

ROMEO:
Go loud unto your majesty: thou hast dinner
More than her countenance; and 'twere first
Sent for th


### Advanced: Customize Training
The above training procedure is simple, but does not give you much control. It uses teacher forcing, which prevents bad predictions from being fed back to the model, so the model never learns to recover from mistakes.

Now that you've seen how to run the model manually, next you'll implement the training loop. This gives a starting point if, for example, you want to implement curriculum learning to help stabilize the model's open-loop output.

The most important part of a custom training loop is the train step function.

Use tf.GradientTape to track the gradients. You can learn more about this approach by reading the eager execution guide.

The basic procedure is:

<li>Execute the model and calculate the loss under a tf.GradientTape.
<li>Calculate the updates and apply them to the model using the optimizer.

In [45]:
class CustomTraining(MyModel):
  @tf.function
  def train_step(self, inputs):
    inputs, labels = inputs
    with tf.GradientTape() as tape:
      predictions = self(inputs, training=True)
      loss = self.loss(labels, predictions)
    # Differentiate with respect to this model's own variables,
    # not those of the global `model` object.
    grads = tape.gradient(loss, self.trainable_variables)
    self.optimizer.apply_gradients(zip(grads, self.trainable_variables))

    return {'loss': loss}

The above implementation of the train_step method follows Keras' train_step conventions. This is optional, but it allows you to change the behavior of the train step and still use Keras' Model.compile and Model.fit methods.

In [46]:
model = CustomTraining(
    vocab_size=len(ids_from_chars.get_vocabulary()),
    embedding_dim=embedding_dim,
    rnn_units=rnn_units)

In [47]:
model.compile(optimizer = tf.keras.optimizers.Adam(),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))

In [48]:
model.fit(dataset, epochs=1)



<tensorflow.python.keras.callbacks.History at 0x7fc1d6c48b10>

In [49]:
EPOCHS = 10

mean = tf.metrics.Mean()

for epoch in range(EPOCHS):
    start = time.time()

    mean.reset_states()
    for (batch_n, (inp, target)) in enumerate(dataset):
        logs = model.train_step([inp, target])
        mean.update_state(logs['loss'])

        if batch_n % 50 == 0:
            template = f"Epoch {epoch+1} Batch {batch_n} Loss {logs['loss']:.4f}"
            print(template)

    # saving (checkpoint) the model every 5 epochs
    if (epoch + 1) % 5 == 0:
        model.save_weights(checkpoint_prefix.format(epoch=epoch))

    print()
    print(f'Epoch {epoch+1} Loss: {mean.result().numpy():.4f}')
    print(f'Time taken for 1 epoch {time.time() - start:.2f} sec')
    print("_"*80)

model.save_weights(checkpoint_prefix.format(epoch=epoch))

Epoch 1 Batch 0 Loss 2.1852
Epoch 1 Batch 50 Loss 2.0757
Epoch 1 Batch 100 Loss 1.9578
Epoch 1 Batch 150 Loss 1.8748

Epoch 1 Loss: 1.9947
Time taken for 1 epoch 11.64 sec
________________________________________________________________________________
Epoch 2 Batch 0 Loss 1.8493
Epoch 2 Batch 50 Loss 1.7206
Epoch 2 Batch 100 Loss 1.6861
Epoch 2 Batch 150 Loss 1.6450

Epoch 2 Loss: 1.7139
Time taken for 1 epoch 10.71 sec
________________________________________________________________________________
Epoch 3 Batch 0 Loss 1.5871
Epoch 3 Batch 50 Loss 1.5880
Epoch 3 Batch 100 Loss 1.5619
Epoch 3 Batch 150 Loss 1.5737

Epoch 3 Loss: 1.5509
Time taken for 1 epoch 10.38 sec
________________________________________________________________________________
Epoch 4 Batch 0 Loss 1.4670
Epoch 4 Batch 50 Loss 1.4482
Epoch 4 Batch 100 Loss 1.4608
Epoch 4 Batch 150 Loss 1.4165

Epoch 4 Loss: 1.4517
Time taken for 1 epoch 10.18 sec
_____________________________________________________________________