<a href="https://colab.research.google.com/github/ibribr/DT8807/blob/master/simpleRNN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

A simple example of language modeling, where we want to predict the next token in a sequence given the previous ones (seq to one). Here the tokens are single characters.

In this example, we first define a vocabulary of three characters ('a', 'b', and 'c') and encode the training data as sequences of one-hot encoded characters, so 'a' is encoded as [1 0 0], 'b' as [0 1 0], and 'c' as [0 0 1].
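As a minimal sketch of the encoding described above, the following uses plain NumPy instead of `tensorflow.one_hot`. The helper name `encode` is hypothetical and not part of the notebook.

```python
import numpy as np

# Vocabulary from the text: each character maps to an integer index.
vocab = {'a': 0, 'b': 1, 'c': 2}

def encode(text):
    """Return an array of shape (len(text), len(vocab)) with a single 1 per row."""
    indices = [vocab[ch] for ch in text]
    # Row i of the identity matrix is the one-hot vector for index i.
    return np.eye(len(vocab), dtype=np.float32)[indices]

print(encode('abc'))
```

Indexing an identity matrix is a common shortcut for one-hot encoding; it yields the same vectors as listed in the text.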

We then define an RNN model with a SimpleRNN layer with 10 units and a Dense layer with a softmax activation function to output the predicted probabilities for the next character.
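To make the two layers concrete, here is a rough NumPy sketch of the forward pass they compute, assuming the Keras defaults for `SimpleRNN` (tanh activation, zero initial state). The weights are random placeholders rather than trained values, and the function name `forward` is made up for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
units, vocab_size = 10, 3

# Randomly initialised placeholder weights (a trained model would learn these).
W_x = rng.normal(scale=0.1, size=(vocab_size, units))  # input -> hidden
W_h = rng.normal(scale=0.1, size=(units, units))       # hidden -> hidden
b_h = np.zeros(units)
W_o = rng.normal(scale=0.1, size=(units, vocab_size))  # hidden -> output
b_o = np.zeros(vocab_size)

def forward(sequence):
    """Run one sequence of one-hot rows through the RNN and the softmax layer."""
    h = np.zeros(units)                      # initial hidden state
    for x_t in sequence:                     # one step per character
        h = np.tanh(x_t @ W_x + h @ W_h + b_h)
    logits = h @ W_o + b_o                   # only the last state is used (seq to one)
    exp = np.exp(logits - logits.max())      # numerically stable softmax
    return exp / exp.sum()

probs = forward(np.eye(vocab_size))          # the sequence 'a', 'b', 'c'
print(probs, probs.sum())
```

Because the weights are untrained, the printed probabilities are close to uniform; the point is only the shape of the computation.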

We compile the model with categorical cross-entropy loss and the Adam optimizer and train it on the training data for 1000 epochs. Finally, we test the model on two sequences and print the predicted probabilities for the next character.
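For a single example with a one-hot target, categorical cross-entropy reduces to the negative log of the probability assigned to the correct class. A small sketch in plain Python, with a hypothetical helper name and made-up probabilities chosen only for illustration:

```python
import math

def categorical_crossentropy(y_true, y_pred):
    """Cross-entropy between a one-hot target and a predicted distribution."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)

# Target 'b' = [0, 1, 0]. A confident correct prediction gives a small loss,
# a confident wrong prediction gives a large one.
good = categorical_crossentropy([0, 1, 0], [0.05, 0.90, 0.05])
bad = categorical_crossentropy([0, 1, 0], [0.90, 0.05, 0.05])
print(good, bad)
```

The loss reported during training is this quantity averaged over the examples in the batch.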

In the next example (seq to seq), given a sentence 'abc', we need to predict the next sentence, which is 'bca', and so on.
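In this toy task the seq-to-seq target is simply the input rotated left by one character. A short sketch of that mapping (the function name `next_sentence` is hypothetical):

```python
def next_sentence(sentence):
    """Rotate the string left by one character, e.g. 'abc' -> 'bca'."""
    return sentence[1:] + sentence[:1]

for s in ['abc', 'bca', 'cab']:
    print(s, '->', next_sentence(s))
```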

https://amitness.com/2020/04/recurrent-layers-keras/ 


In [13]:
import tensorflow as tf
import numpy as np
from tensorflow import one_hot
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, SimpleRNN
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.utils import plot_model

In [20]:
# seq to one
# Define the vocabulary
vocab = {'a': 0, 'b': 1, 'c': 2}

# Convert the vocabulary into one-hot vectors
a = one_hot(0,3)
b = one_hot(1,3)
c = one_hot(2,3)

# Define the training data
# inputs (sequence)
x_train = np.array([[a,b,c],[b,c,a],[c,a,b]])
# targets (one)
y_train = np.array([b,c,a])

# Define the RNN model
model = Sequential()
model.add(SimpleRNN(units=10, input_shape = (3, 3)))
model.add(Dense(units=3, activation='softmax'))

# Compile the model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# plot the model
model.summary()
plot_model(model, to_file = 'rnn.jpg', show_shapes = True, show_layer_names = True)

# Train the model
model.fit(x_train, y_train, epochs=1000, verbose=False)

# Evaluate the model (both values are scaled by 100 below; only the accuracy is a true percentage)
score = model.evaluate(x_train, y_train)
print('Train loss (%): ', score[0]*100)
print('Train acc (%): ', score[1]*100)

# Test the model
x_test = np.array([[a,b,c],[b,c,a]])
# 'a b c'     --> the next character is 'b' (index 1),
# 'b c a'     --> the next character is 'c' (index 2),

y_pred = model.predict(x_test)
print(y_pred)

output_letter = (np.argmax(y_pred, axis=-1))
print(output_letter)

Model: "sequential_13"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 simple_rnn_13 (SimpleRNN)   (None, 10)                140       
                                                                 
 dense_13 (Dense)            (None, 3)                 33        
                                                                 
Total params: 173
Trainable params: 173
Non-trainable params: 0
_________________________________________________________________
Train loss (%):  0.5262067541480064
Train acc (%):  100.0
[[0.00280921 0.99555814 0.00163256]
 [0.00250984 0.00192721 0.9955629 ]]
[1 2]
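The parameter counts in the summary above can be checked by hand. A `SimpleRNN` layer has an input kernel, a recurrent kernel, and a bias; the `Dense` layer has a kernel and a bias. The helper names below are hypothetical.

```python
def simple_rnn_params(input_dim, units):
    # input kernel + recurrent kernel + bias
    return input_dim * units + units * units + units

def dense_params(input_dim, units):
    # kernel + bias
    return input_dim * units + units

rnn = simple_rnn_params(input_dim=3, units=10)
dense = dense_params(input_dim=10, units=3)
print(rnn, dense, rnn + dense)  # 140 33 173
```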


In [22]:
# Seq to Seq
# Define the vocabulary
vocab = {'a': 0, 'b': 1, 'c': 2}
vocab_verse = {value: key for key, value in vocab.items()}
print(vocab_verse)

# Convert the vocabulary into one-hot vectors
a = one_hot(0,3)
b = one_hot(1,3)
c = one_hot(2,3)

# Define the training data
# inputs (seq)
X_train = np.array([[a,b,c],[b,c,a],[c,a,b]])
# target (seq)
y_train = np.array([[b,c,a],[c,a,b],[a,b,c]])

# Define the RNN model
model = Sequential()
model.add(SimpleRNN(units=10, input_shape = (3, 3), return_sequences=True))
model.add(Dense(units=3, activation='softmax'))

# Compile the model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# plot the model
model.summary()
plot_model(model, to_file = 'rnn.jpg', show_shapes = True, show_layer_names = True)

# Train the model
model.fit(X_train, y_train, epochs=1000, verbose=False)

# Evaluate the model
score = model.evaluate(X_train, y_train)
print('Training loss and acc.: ', score)

# Test the model
X_test = np.array([[a,b,c],[b,c,a]])
# 'a b c' [[0], [1], [2]]  --> the next sentence is 'b c a' [[1], [2], [0]],
# 'b c a' [[1], [2], [0]]  --> the next sentence is 'c a b' [[2], [0], [1]],

y_pred = model.predict(X_test)
print(y_pred)

output_letter = (np.argmax(y_pred, axis=-1))
print(output_letter)

{0: 'a', 1: 'b', 2: 'c'}
Model: "sequential_15"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 simple_rnn_15 (SimpleRNN)   (None, 3, 10)             140       
                                                                 
 dense_15 (Dense)            (None, 3, 3)              33        
                                                                 
Total params: 173
Trainable params: 173
Non-trainable params: 0
_________________________________________________________________
Training loss and acc.:  [0.023111017420887947, 1.0]
[[[2.7689941e-02 9.3317854e-01 3.9131500e-02]
  [1.4385805e-03 1.2828900e-03 9.9727851e-01]
  [9.9588341e-01 3.0942813e-03 1.0222401e-03]]

 [[3.1862341e-02 2.7933210e-02 9.4020450e-01]
  [9.9687743e-01 1.9705945e-03 1.1520423e-03]
  [1.4082309e-03 9.9806064e-01 5.3111213e-04]]]
[[1 2 0]
 [2 0 1]]
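The index arrays printed above can be mapped back to characters with the inverse vocabulary. A minimal sketch, using the predicted indices copied from the output above; the helper `decode` is hypothetical.

```python
import numpy as np

vocab = {'a': 0, 'b': 1, 'c': 2}
vocab_verse = {value: key for key, value in vocab.items()}

def decode(indices):
    """Turn a 2-D array of class indices into a list of strings."""
    return [''.join(vocab_verse[int(i)] for i in row) for row in indices]

output_letter = np.array([[1, 2, 0], [2, 0, 1]])  # copied from the printed result
print(decode(output_letter))  # ['bca', 'cab']
```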
