In [1]:
# checking GPU connection
# you can skip this cell
import tensorflow as tf
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
  raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))
tf.device(device_name)

Found GPU at: /device:GPU:0


<tensorflow.python.eager.context._EagerDeviceContext at 0x7fdfcf86f908>

In [2]:
# connecting my drive
# you can skip this cell
from google.colab import drive
drive.mount('/content/drive/')

Drive already mounted at /content/drive/; to attempt to forcibly remount, call drive.mount("/content/drive/", force_remount=True).


## Importing Libraries

In [3]:
import numpy as np
from tensorflow.keras.layers import Dense, Activation
from tensorflow.keras.layers import SimpleRNN, LSTM, GRU
from tensorflow.keras.models import Sequential

## Loading Dataset

Explanation of the code below:
* 'lines' is a list that will contain each non-empty line
* the text file is opened in binary mode and given the variable name '_in'
* 'line' iterates through each line in the text document
* strip() removes any whitespace before and after the line
* lower() converts each character to lowercase
* decode() takes two parameters
    * 'ascii' represents the desired encoding
    * 'ignore' tells it to skip any character that cannot be decoded and continue with the next one
* if the length of a line is zero, it is not appended to the list 'lines'
* the loop continues until the end of the document
* 'text' is a single string that contains the whole content of the document
* 'text' is the concatenation of all the strings in the list 'lines', separated by single spaces
* 'chars' holds all the unique characters in the document
* 'no_chars' is the number of unique characters in the document
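
To make the cleaning steps concrete, here is a minimal sketch that applies the same strip/lower/decode chain to a single byte string. The sample line is made up for illustration and is not taken from the dataset.

```python
# Hypothetical sample line, standing in for one line read in binary mode.
# It contains a UTF-8 right single quote (3 bytes) that is not valid ASCII.
raw_line = b"  Alice\xe2\x80\x99s Adventures  \n"

# strip() drops surrounding whitespace, lower() lowercases ASCII letters,
# and decode("ascii", "ignore") silently drops any non-ASCII bytes.
clean_line = raw_line.strip().lower().decode("ascii", "ignore")

print(repr(clean_line))
```

Note that the non-ASCII quote is dropped rather than replaced, so words on either side of it are joined together.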

In [4]:
# Read lines from an example source file.
# enter the path to your text file below if you are running a copy of this notebook
with open("/content/drive/MyDrive/Study/DL/NLP/Text Generation/alice_in_wonderland.txt", 'rb') as _in:
    lines = []
    for line in _in:
        line = line.strip().lower().decode("ascii", "ignore")
        if len(line) == 0:
            continue
        lines.append(line)
text = " ".join(lines)
chars = sorted(set(text))  # sorted so the character IDs are reproducible across runs
no_chars = len(chars)

## Data Preprocessing

Create a character index and reverse mapping to go between a numerical ID and a specific character. 

The numerical ID will correspond to a column number when using a one-hot encoded representation of character inputs.

In [5]:
char2index = {c: i for i, c in enumerate(chars)}
index2char = {i: c for i, c in enumerate(chars)}
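
As a quick sanity check, the two mappings should be inverses of each other. The sketch below shows a round trip on a toy vocabulary; the `toy_` names are hypothetical and are not part of the notebook.

```python
# Toy vocabulary; sorted() is used here only to make the IDs reproducible.
toy_chars = sorted(set("hello"))
toy_char2index = {c: i for i, c in enumerate(toy_chars)}
toy_index2char = {i: c for i, c in enumerate(toy_chars)}

# Encode a string as a list of IDs, then decode it back.
encoded = [toy_char2index[c] for c in "hello"]
decoded = "".join(toy_index2char[i] for i in encoded)

print(toy_chars, encoded, decoded)
```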

Now we will create two lists:
* A list of input sequences, 'input_chars'
* A list of the corresponding next character for each sequence, 'label_chars'

For convenience, I chose a fixed sequence length of 10 characters; you can choose a different value.

In [6]:
SEQLEN, STEP = 10, 1
input_chars, label_chars = [], []

In [7]:
# Convert the data into a series of different SEQLEN-length subsequences.
for i in range(0, len(text) - SEQLEN, STEP):
    input_chars.append(text[i: i + SEQLEN])
    label_chars.append(text[i + SEQLEN])
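
To see what the sliding window produces, here is the same loop on a short toy string with a window of 4 characters. The `toy_` names and the sample text are assumptions for illustration only.

```python
toy_text = "hello world"
TOY_SEQLEN, TOY_STEP = 4, 1
toy_inputs, toy_labels = [], []

# Each window of TOY_SEQLEN characters is paired with the character after it.
for i in range(0, len(toy_text) - TOY_SEQLEN, TOY_STEP):
    toy_inputs.append(toy_text[i: i + TOY_SEQLEN])
    toy_labels.append(toy_text[i + TOY_SEQLEN])

for seq, label in zip(toy_inputs, toy_labels):
    print(repr(seq), "->", repr(label))
```

With STEP set to 1, consecutive windows overlap in all but one character, so the number of training examples is roughly the length of the text.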

Compute the one-hot encoding of the input sequences (X) and of the next character, i.e. the label (y).

In [8]:
# np.bool was removed in NumPy 1.24; the built-in bool is the equivalent dtype.
X = np.zeros((len(input_chars), SEQLEN, no_chars), dtype=bool)
y = np.zeros((len(input_chars), no_chars), dtype=bool)
for i, input_char in enumerate(input_chars):
    for j, ch in enumerate(input_char):
        X[i, j, char2index[ch]] = 1
    y[i, char2index[label_chars[i]]] = 1
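
The resulting array shapes are easier to see on a tiny example. The sketch below repeats the encoding for a hypothetical three-character vocabulary and two sequences of length 2.

```python
import numpy as np

# Hypothetical toy data, not taken from the notebook.
toy_vocab = ["a", "b", "c"]
toy_index = {c: i for i, c in enumerate(toy_vocab)}
toy_seqs, toy_next = ["ab", "bc"], ["c", "a"]

# X_toy: (number of sequences, sequence length, vocabulary size)
# y_toy: (number of sequences, vocabulary size)
X_toy = np.zeros((len(toy_seqs), 2, len(toy_vocab)), dtype=bool)
y_toy = np.zeros((len(toy_seqs), len(toy_vocab)), dtype=bool)
for i, seq in enumerate(toy_seqs):
    for j, ch in enumerate(seq):
        X_toy[i, j, toy_index[ch]] = 1
    y_toy[i, toy_index[toy_next[i]]] = 1

print(X_toy.shape, y_toy.shape)
print(X_toy[0].astype(int))
```

Each row of `X_toy[i]` contains exactly one `True`, in the column given by that character's ID.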

## Model Building

Now we will build a model with two GRU layers.

We will train the model for a few epochs and then look at the text it generates.
We then repeat this train-and-generate cycle a few times.

In [9]:
# Set up a bunch of hyperparameters for the network and training.
BATCH_SIZE, HIDDEN_SIZE = 128, 128
NUM_ITERATIONS = 5
NUM_EPOCHS_PER_ITERATION = 5

# Number of characters to predict in a single iteration.
NUM_PREDS_PER_ITERATION = 100

Create a simple recurrent neural network.

There are two recurrent layers, each producing a hidden state of size HIDDEN_SIZE. The first layer takes the one-hot encoded input, and its output sequence is the input to the second layer. This is followed by a fully connected Dense layer over the set of possible next characters, whose output is converted to probabilities by a standard softmax activation. A multi-class cross-entropy loss compares the prediction with the one-hot encoded character label.
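
For intuition about the last two steps, here is a minimal NumPy sketch of softmax followed by cross-entropy on made-up scores. It illustrates the math only and is not how Keras implements these operations internally; the function and variable names are hypothetical.

```python
import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability; the result is unchanged.
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    return exps / exps.sum()

def categorical_crossentropy(one_hot_label, probs):
    # With a one-hot label this reduces to -log(probability of the true class).
    return -np.sum(one_hot_label * np.log(probs))

# Made-up Dense-layer outputs for a three-character vocabulary.
toy_logits = np.array([2.0, 1.0, 0.1])
toy_label = np.array([1.0, 0.0, 0.0])

toy_probs = softmax(toy_logits)
toy_loss = categorical_crossentropy(toy_label, toy_probs)
print(toy_probs, toy_loss)
```

The loss shrinks toward zero as the probability assigned to the correct character approaches 1.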

In [10]:
model = Sequential()
model.add(
    GRU(  # You can vary this with LSTM or SimpleRNN to try alternatives.
        HIDDEN_SIZE,
        return_sequences=True,
        input_shape=(SEQLEN, no_chars),
        unroll=True
    )
)
model.add(
    GRU(
        HIDDEN_SIZE,
        return_sequences=False,
        unroll=True
    )
)
model.add(Dense(no_chars))
model.add(Activation("softmax"))
model.compile(loss="categorical_crossentropy", optimizer="adam")



In [11]:
# Execute a series of training and demonstration iterations.
for iteration in range(NUM_ITERATIONS):

    # For each iteration, run the model fitting procedure for a number of epochs.
    print("_" * 150)
    print("Iteration #: %d" % (iteration))
    model.fit(X, y, batch_size=BATCH_SIZE, epochs=NUM_EPOCHS_PER_ITERATION)

    # Select a random example input sequence.
    test_idx = np.random.randint(len(input_chars))
    test_chars = input_chars[test_idx]

    # For a number of prediction steps using the current version of the trained model
    # construct a one-hot encoding of the test input and append a prediction.
    print("Generating from seed: %s" % (test_chars))
    print(test_chars, end="")
    for i in range(NUM_PREDS_PER_ITERATION):

        # Here is the one-hot encoding.
        X_test = np.zeros((1, SEQLEN, no_chars))
        for j, ch in enumerate(test_chars):
            X_test[0, j, char2index[ch]] = 1

        # Make a prediction with the current model.
        pred = model.predict(X_test, verbose=0)[0]
        y_pred = index2char[np.argmax(pred)]

        # Print the prediction appended to the test example.
        print(y_pred, end="")

        # Increment the test example to contain the prediction as if it
        # were the correct next letter.
        test_chars = test_chars[1:] + y_pred
    print()

______________________________________________________________________________________________________________________________________________________
Iteration #: 0
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Generating from seed: may charge
may charge the footman in the first the footman in the first the footman in the first the footman in the first
______________________________________________________________________________________________________________________________________________________
Iteration #: 1
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Generating from seed: : she foun
: she found the whole party at the trials were all the trials were all the trials were all the trials were all
______________________________________________________________________________________________________________________________________________________
Iteration #: 2
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Generating from seed: ame solemn
ame solemnly, and the mouse onl

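We can see that some sensible words are being generated, although the text starts to repeat itself after a while. The repetition is expected: each prediction is a deterministic function (np.argmax) of the last 10 characters, so once a window recurs the output cycles. Considering the small number of epochs and the shallow network, this is pretty impressive.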
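
One common way to reduce this kind of looping is to sample the next character from the predicted distribution instead of always taking the most probable one. The sketch below is an alternative to the greedy `np.argmax` step used in this notebook, not part of it; `sample_with_temperature` and its `temperature` parameter are hypothetical names introduced here.

```python
import numpy as np

def sample_with_temperature(probs, temperature=1.0, rng=None):
    """Draw an index from probs after rescaling by a temperature (assumed > 0)."""
    rng = np.random.default_rng() if rng is None else rng
    probs = np.asarray(probs, dtype=np.float64)
    # Rescale the log-probabilities; the small epsilon avoids log(0).
    logits = np.log(probs + 1e-12) / temperature
    logits -= logits.max()
    scaled = np.exp(logits)
    scaled /= scaled.sum()
    return int(rng.choice(len(scaled), p=scaled))

# Made-up prediction over a three-character vocabulary.
toy_pred = np.array([0.6, 0.3, 0.1])
rng = np.random.default_rng(0)
samples = [sample_with_temperature(toy_pred, 1.0, rng) for _ in range(1000)]
counts = np.bincount(samples, minlength=3)
print(counts)
```

In the generation loop, this would mean replacing `np.argmax(pred)` with something like `sample_with_temperature(pred, 0.5)`. Lower temperatures behave more like argmax, while higher ones give more varied but noisier text.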