<a href="https://colab.research.google.com/github/mohamedbhy/RNN_LSTM/blob/master/RNN_LSTM.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Text Generation With an RNN (Recurrent Neural Network) in the Style of Fifty Shades of Grey

![RNN_NETWORK](https://colah.github.io/posts/2015-08-Understanding-LSTMs/img/RNN-shorttermdepdencies.png)

## Install Dependencies

In [0]:
!pip install -q tensorflow-gpu==2.0.0-alpha0
!pip install -q numpy
# json is part of the Python standard library, so it needs no pip install

## Import Dependencies

In [0]:
import tensorflow as tf
import numpy as np
import os
import json

## Data Preprocessing

### Download Data

In [0]:
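file = tf.keras.utils.get_file('fifty_shades_of_grey.txt','https://ia800200.us.archive.org/24/items/FiftyShadesOfGrey_201603/Fifty%20Shades%20of%20Grey_djvu.txt')
# Read the text as UTF-8 (assumed encoding) and close the file afterwards
with open(file,'rt',encoding='utf-8') as f:
  text = f.read()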

*Selecting the first four chapters (the text between "Chapter One" and "Chapter Five")*

In [0]:
text = text.split("Chapter Five")[0].split("Chapter One")[1]

### Vectorizing Text



*   Build the sorted list of unique characters (the vocabulary)
*   Create a dictionary that maps each character to its index
*   Create an array that maps each index back to its character
*   Convert the text to an array of indexes



In [0]:
vocab = sorted(set(text))
char_to_index = {c:i for i,c in enumerate(vocab)}
index_to_char = np.array(vocab)
text_as_int = np.array([char_to_index[c] for c in text])
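
The mapping above can be checked on a small string. The following is a minimal sketch in plain Python; the sample string and the `toy_` names are hypothetical and only illustrate the round trip from characters to indexes and back.

```python
# Toy stand-in for the novel text (hypothetical sample, not from the dataset)
toy_text = "hello world"

# Sorted unique characters form the vocabulary
toy_vocab = sorted(set(toy_text))
# Character -> integer index
toy_char_to_index = {c: i for i, c in enumerate(toy_vocab)}
# Integer index -> character (a plain list stands in for the NumPy array)
toy_index_to_char = list(toy_vocab)

# Encode the text as indexes, then decode it again
toy_encoded = [toy_char_to_index[c] for c in toy_text]
toy_decoded = "".join(toy_index_to_char[i] for i in toy_encoded)

print(toy_vocab)
print(toy_encoded)
print(toy_decoded)
```

Decoding the encoded list gives back the original string, which is the property the generation step relies on later.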

### Divide Text Into Sequences

In [0]:
sequence_length = 100
char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)
sequences = char_dataset.batch(sequence_length+1,drop_remainder=True)
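
Each chunk holds `sequence_length+1` characters so that it can later be split into an input and a target of length `sequence_length`. As a rough illustration of what `batch(..., drop_remainder=True)` does here, the sketch below chunks a short list in plain Python; the helper `chunk_ids` and the toy values are assumptions made for the example.

```python
def chunk_ids(ids, size):
    """Split ids into consecutive chunks of exactly `size` items, dropping any remainder."""
    n_full = len(ids) // size
    return [ids[i * size:(i + 1) * size] for i in range(n_full)]

# Ten toy indexes and a toy sequence length of 3, so each chunk has 4 items
toy_ids = list(range(10))
toy_sequence_length = 3
toy_chunks = chunk_ids(toy_ids, toy_sequence_length + 1)

print(toy_chunks)
```

The last two indexes do not fill a whole chunk, so they are dropped, just as the tail of the text is dropped in the cell above.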

### Duplicate and Shift Sequences (Creating Input and Target Data)

In [0]:
def split_input(chunk):
  input_text = chunk[:-1]
  target_text = chunk[1:]
  return input_text,target_text
dataset = sequences.map(split_input)

In [48]:
for input_example, target_example in dataset.take(1):
  print('Input data: ', repr(''.join(index_to_char[input_example.numpy()])))
  print('Target data:', repr(''.join(index_to_char[target_example.numpy()])))

Input data:  ' \n\n\nI scowl with frustration at myself in the mirror. Damn my hair - it just won’t behave, \nand damn'
Target data: '\n\n\nI scowl with frustration at myself in the mirror. Damn my hair - it just won’t behave, \nand damn '
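
The target is the input shifted one character to the left, so at every time step the model is asked to predict the next character. A minimal sketch of that alignment, using a hypothetical five-character chunk and a helper named `split_input_sketch` that mirrors `split_input` on a plain string:

```python
def split_input_sketch(chunk):
    """Return (input, target): the chunk without its last item and without its first item."""
    return chunk[:-1], chunk[1:]

toy_chunk = "Hello"
toy_input, toy_target = split_input_sketch(toy_chunk)

# Print which target character belongs to which input character
for step, (inp_char, tgt_char) in enumerate(zip(toy_input, toy_target)):
    print(step, repr(inp_char), "->", repr(tgt_char))
```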


### Create Training Batches

In [0]:
BATCH_SIZE = 64
BUFFER_SIZE = 10000
dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE,drop_remainder=True)
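
Because both the chunking and the batching drop their remainders, the number of training steps per epoch follows from two integer divisions. The sketch below works this out in plain Python; the corpus size of 200,000 characters is a made-up figure, not the real length of the selected chapters.

```python
def batches_per_epoch(num_chars, sequence_length, batch_size):
    """Number of full batches when remainders are dropped at both stages."""
    # Each sequence consumes sequence_length + 1 characters
    num_sequences = num_chars // (sequence_length + 1)
    # Only full batches are kept
    return num_sequences // batch_size

# Hypothetical corpus size, with the notebook's sequence length and batch size
toy_steps = batches_per_epoch(200_000, 100, 64)
print(toy_steps)
```

When the number of sequences is smaller than `BUFFER_SIZE`, the shuffle buffer holds the whole dataset and the sequences are shuffled uniformly.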

## Define Model

### Define Model Architecture

In [0]:
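def model(vocab_size,embedding_dim,rnn_units,batch_size):
  model = tf.keras.models.Sequential()
  # Map each character index to a dense vector; stateful layers need a fixed batch size
  model.add(tf.keras.layers.Embedding(vocab_size,embedding_dim,batch_input_shape=[batch_size,None]))
  # stateful=True carries this layer's state from one batch (or call) to the next
  model.add(tf.keras.layers.LSTM(rnn_units,return_sequences=True,stateful=True,recurrent_initializer='glorot_uniform'))
  model.add(tf.keras.layers.Dropout(0.5))
  # Not stateful: this layer starts from a zero state on every call
  model.add(tf.keras.layers.LSTM(rnn_units,return_sequences=True))
  model.add(tf.keras.layers.Dropout(0.25))
  # One logit per vocabulary character at every time step
  model.add(tf.keras.layers.Dense(vocab_size))
  return model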

### Define Loss Function

**Categorical Cross Entropy:**![Categorical_cross_entropy](https://cwiki.apache.org/confluence/download/thumbnails/95651724/ce_loss.png?version=1&modificationDate=1539795211000&api=v2)

In [0]:
def loss(labels,logits):
  # Labels are integer character indexes; the model outputs raw logits (no softmax)
  return tf.keras.losses.sparse_categorical_crossentropy(labels,logits,from_logits=True)
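
For a single time step, sparse categorical cross entropy with `from_logits=True` is the negative log of the softmax probability assigned to the true character. The following plain-Python sketch shows that calculation for one prediction; the function name and the toy logits are assumptions made for illustration, and it is not the TensorFlow implementation.

```python
import math

def sparse_cross_entropy_from_logits(label, logits):
    """Negative log-probability of the true class under softmax(logits)."""
    # Subtract the maximum before exponentiating for numerical stability
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # -log(softmax(logits)[label]) = logsumexp(logits) - logits[label]
    return log_sum_exp - logits[label]

toy_logits = [2.0, 1.0, 0.1]
print(sparse_cross_entropy_from_logits(0, toy_logits))  # true class has the largest logit
print(sparse_cross_entropy_from_logits(2, toy_logits))  # true class has the smallest logit
```

The loss is small when the true character has the largest logit and grows as its logit falls behind the others.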

### Define Checkpoint

In [0]:
checkpoint_dir = './training_checkpoints'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")
checkpoint_callback=tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_prefix,save_weights_only=True)
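
The `{epoch}` placeholder in the prefix is filled in when a checkpoint is written, so each epoch gets its own file prefix. The sketch below reproduces that substitution with ordinary string formatting; it assumes the callback numbers epochs from 1, and the `toy_` names are hypothetical.

```python
import os

toy_checkpoint_dir = "./training_checkpoints"
toy_checkpoint_prefix = os.path.join(toy_checkpoint_dir, "ckpt_{epoch}")

# Assumption: the placeholder receives the 1-based epoch number
toy_names = [toy_checkpoint_prefix.format(epoch=e) for e in (1, 2, 30)]
print(toy_names)
```

With `save_weights_only=True` only the weights are stored under these prefixes, which is why the model has to be rebuilt in code before the weights are loaded for generation.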

### Define HyperParameters

In [0]:
vocab_size = len(vocab)
embedding_dim = 256
rnn_units = 1024
EPOCHS = 30

### Create Model Instance

In [36]:
modelInstance = model(vocab_size,embedding_dim,rnn_units,BATCH_SIZE)

W0603 14:40:25.957062 139892791498624 tf_logging.py:161] <tensorflow.python.keras.layers.recurrent.UnifiedLSTM object at 0x7f39e97a35f8>: Note that this layer is not optimized for performance. Please use tf.keras.layers.CuDNNLSTM for better performance on GPU.
W0603 14:40:26.087980 139892791498624 tf_logging.py:161] <tensorflow.python.keras.layers.recurrent.UnifiedLSTM object at 0x7f39e979d390>: Note that this layer is not optimized for performance. Please use tf.keras.layers.CuDNNLSTM for better performance on GPU.


*Model Summary*

In [14]:
modelInstance.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding (Embedding)        (64, None, 256)           18432     
_________________________________________________________________
unified_lstm (UnifiedLSTM)   (64, None, 1024)          5246976   
_________________________________________________________________
dropout (Dropout)            (64, None, 1024)          0         
_________________________________________________________________
unified_lstm_1 (UnifiedLSTM) (64, None, 1024)          8392704   
_________________________________________________________________
dropout_1 (Dropout)          (64, None, 1024)          0         
_________________________________________________________________
dense (Dense)                (64, None, 72)            73800     
Total params: 13,731,912
Trainable params: 13,731,912
Non-trainable params: 0
_________________________________________________________________
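
The parameter counts in the summary can be reproduced by hand. The sketch below uses the standard formulas for embedding, LSTM, and dense layers with the sizes reported above (72 characters, embedding size 256, 1024 units); the helper names are made up for this example.

```python
def embedding_params(vocab_size, embedding_dim):
    # One embedding vector per vocabulary entry
    return vocab_size * embedding_dim

def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights, and a bias
    return 4 * (input_dim + units + 1) * units

def dense_params(input_dim, units):
    # Weight matrix plus one bias per output unit
    return (input_dim + 1) * units

toy_counts = [
    embedding_params(72, 256),
    lstm_params(256, 1024),
    lstm_params(1024, 1024),
    dense_params(1024, 72),
]
print(toy_counts, sum(toy_counts))
```

The second LSTM is larger than the first only because its input is the 1024-dimensional output of the first LSTM rather than the 256-dimensional embedding.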

### Compile Model

In [0]:
modelInstance.compile(loss=loss,optimizer='adam')

## Train Model

In [38]:
modelInstance.fit(dataset,epochs=EPOCHS,callbacks=[checkpoint_callback])

Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


<tensorflow.python.keras.callbacks.History at 0x7f39e7b84a20>

## Generate Text

### Restore the Latest Checkpoint

In [50]:
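# Rebuild the network with batch_size=1 so that a single prompt can be fed at a time
modelInstance = model(vocab_size, embedding_dim, rnn_units, batch_size=1)
modelInstance.load_weights(tf.train.latest_checkpoint(checkpoint_dir))
# Input shape is (batch, time): one sequence of any length
modelInstance.build(tf.TensorShape([1, None]))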

W0603 15:18:19.862729 139892791498624 tf_logging.py:161] <tensorflow.python.keras.layers.recurrent.UnifiedLSTM object at 0x7f39dd373d30>: Note that this layer is not optimized for performance. Please use tf.keras.layers.CuDNNLSTM for better performance on GPU.
W0603 15:18:19.993071 139892791498624 tf_logging.py:161] <tensorflow.python.keras.layers.recurrent.UnifiedLSTM object at 0x7f39dc547cc0>: Note that this layer is not optimized for performance. Please use tf.keras.layers.CuDNNLSTM for better performance on GPU.


*Generate Text Function*

In [0]:
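def generate_text(model,start_str,num_to_generate):
  # Vectorize the prompt and add a batch dimension: shape (1, len(start_str))
  input_tensor = [char_to_index[c] for c in start_str]
  input_tensor = tf.expand_dims(input_tensor,0)
  generated_text = []
  # Clear any state left over from a previous call
  model.reset_states()
  for _ in range(num_to_generate):
    predictions = model(input_tensor)
    # Drop the batch dimension: shape (time, vocab_size)
    predictions = tf.squeeze(predictions,0)
    # Sample from the logits and keep the draw for the last time step
    predicted_id = tf.random.categorical(predictions,num_samples=1)[-1,0].numpy()
    # Feed only the sampled character back in as the next input
    input_tensor = tf.expand_dims([predicted_id],0)
    generated_text.append(index_to_char[predicted_id])
  return (start_str + ''.join(generated_text))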
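
The function samples the next character from the predicted distribution instead of always taking the most likely one, which keeps the output from collapsing into a loop. A minimal plain-Python sketch of sampling from logits, assuming toy logits over three characters and a hypothetical helper `sample_from_logits` (this is not how `tf.random.categorical` is implemented, only the same idea):

```python
import math
import random

def sample_from_logits(logits, rng):
    """Draw one index with probability proportional to exp(logit)."""
    # Subtract the maximum before exponentiating for numerical stability
    m = max(logits)
    weights = [math.exp(z - m) for z in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

# Seeded generator so the run is repeatable
toy_rng = random.Random(0)
toy_logits = [2.0, 1.0, 0.1]
toy_draws = [sample_from_logits(toy_logits, toy_rng) for _ in range(1000)]
toy_draw_counts = [toy_draws.count(i) for i in range(len(toy_logits))]
print(toy_draw_counts)
```

Indexes with larger logits are drawn more often, but the less likely ones still appear, which is the source of both the variety and the misspellings in the generated text.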

### Result

In [51]:
print(generate_text(modelInstance,"shape",1000))

shaper, crefuly the fovome, and’t g. 

“mangh. The heav his Euld Code of dGle h. I h s pouCher mirthy in ar. I” arros the 


be sPleat it’s the wkep, at corraph. 

Crre” I sOhed dop bat.” 

Gre; of dwughes at my coffucte loothek 
hith isn attectif 
Grezy as a ty my dSmyely for a musy-: fl. Grey stechinghip, his rounlyd r hip u3 sthirga I thi- I d9aghed bacou? 
I d8er. I h t. Kire thre 

t’s d I sted s umpol 
this ir: 

Histife it h shinal, wwith 

I coffand we 
is koked ff axam’t ire athange: thiry tree gJ I’nt tre h’s t. 
by rengr.” Greathan 
qut if. 
spit he hinds I” eloring “me ge!” me are if 
stilled his linger as a I’t 

te” Thintre Gr4 StSaga2 I” 

I pEnterozefrinces Grey athe 


“We poulPate do a 
my ad, muridooted Cherescon- 

WaticoulHike-fore: is te me a Cw. 

p, I thirse br, drow mainthatF the fuzer h h, think, my ske dithth L‘sce dingre Ki

L. He adut’s I fof anod I is 
aly fo a the” ate for ble? Squ5” I fe my withet Gy han 
ses de 
frrom Om, and 
o. Suithe umunglat 
Khe uW