#Sequencing Success: A Hands-On Workshop in Deep Learning for Sequence-to-Sequence Models
Moratuwa Engineering Research Conference 2023 (MERCon 2023) is the 9th international conference organized by the Engineering Research Unit at the University of Moratuwa. As part of MERCon 2023, we are hosting a Hands-On Workshop on Deep Learning for Sequence-to-Sequence Models. This workshop spans four hours and is divided into four one-hour sessions, covering the following topics:
- Introduction to Sequence-to-Sequence Learning
- Sequence-to-Sequence Learning with Recurrent Neural Networks (RNNs)
- Sequence-to-Sequence Learning with Encoder-Decoder Models
- Sequence-to-Sequence Learning with Encoder-Decoder Models and Attention Mechanisms

This notebook is prepared for the session **Sequence-to-Sequence Learning with Recurrent Neural Networks**.

All rights reserved.

Authors:
1.   Dr.T.Uthayasanker ([rtuthaya.lk](https://rtuthaya.lk))
2.   Mr.S.Braveenan ([Braveenan Sritharan](https://www.linkedin.com/in/braveenan-sritharan/))

[For more information - MERCon 2023](https://mercon.uom.lk)

#A Simple Seq2Seq Problem 2: The reverse sentence problem
In this simple Seq2Seq problem, we are given a **parallel dataset** of sentence pairs: X (input) and y (output). Each output sentence, y[i], is constructed by reversing the order of the words in the input sentence, X[i]. To illustrate, consider an example where the **input sentence X[i]** has a length of 3, such as:

X[i] = **he ate apple**

The corresponding **output sentence, y[i]**, would be:

y[i] = **apple ate he**

This problem serves as a foundational example of a Sequence-to-Sequence (Seq2Seq) task, where the objective is to learn to reverse sentences effectively.
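
As a quick illustration of the mapping the models are expected to learn, the reversal itself can be written in one line of plain Python. This is only a sketch of the target function; the helper name `reverse_sentence` is made up for this example and is not part of the notebook.

```python
# Toy illustration of the task: reverse the word order of a sentence.
# reverse_sentence is a hypothetical helper used only in this example.
def reverse_sentence(sentence):
    # split on whitespace, reverse the list of words, and join them back
    return ' '.join(sentence.split()[::-1])

X_i = "he ate apple"
y_i = reverse_sentence(X_i)
print(f"X[i] = {X_i}")
print(f"y[i] = {y_i}")
```

A neural network has no access to this rule; it has to infer it from example pairs alone.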

In [None]:
#@title Import Libraries
import random
import numpy as np
import matplotlib.pyplot as plt

from keras import Input
from keras.layers import RepeatVector, Dense, SimpleRNN, GRU, LSTM, TimeDistributed
from keras.callbacks import EarlyStopping
from keras.utils import plot_model
from keras.models import Sequential, Model

#Auxiliary functions

The key functions defined in the following cells are:
1. **generate_text_sequence(length, word_array)**: Generates a random text sequence of a given length using words from an array.
2. **one_hot_encode_text(text_sequence, word_array)**: Converts a text sequence into one-hot encoded vectors using a word array.
3. **one_hot_decode_text(encoded_seq, word_array)**: Decodes a one-hot encoded sequence back into its original text form.
4. **get_reversed_pairs(time_steps, word_array, verbose=False)**: Generates one random sequence and its reversal, one-hot encodes both, and returns them as 3D arrays of shape (1, time_steps, vocabulary size).
5. **create_dataset(train_size, test_size, time_steps, word_array, verbose=False)**: Creates training and testing datasets by generating reversed pairs.
6. **train_test(model, X_train, y_train, X_test, y_test, epochs=100, verbose=0)**: Trains a neural network model, evaluates it, and returns the model and training history.
7. **visualize_history(history)**: Visualizes the training history, showing accuracy and loss over epochs.
8. **check_samples(model, X_test, y_test, word_array, num_samples=10)**: Checks the model's performance on a set of sample sequences from the testing data.

These functions collectively support the process of training and evaluating a neural network for a sentence reversal task.
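
To make the one-hot encoding step concrete, here is a minimal, self-contained sketch of an encode/decode round trip. It assumes a made-up three-word vocabulary (`toy_words`) and uses `np.eye` instead of the notebook's own helper functions, so it only illustrates the idea.

```python
import numpy as np

# Hypothetical three-word vocabulary, used only for this illustration.
toy_words = ["he", "ate", "apple"]
sentence = "he ate apple"

# Encode: map each word to its vocabulary index, then select the
# matching rows of the identity matrix (one row per word).
indices = [toy_words.index(word) for word in sentence.split()]
encoded = np.eye(len(toy_words), dtype=int)[indices]
print(encoded.shape)  # (time_steps, vocabulary_size)

# Decode: the position of the 1 in each row is the word index.
decoded = ' '.join(toy_words[i] for i in encoded.argmax(axis=1))

# Reversing the rows of the encoded input yields the encoded target.
target = encoded[::-1]
decoded_target = ' '.join(toy_words[i] for i in target.argmax(axis=1))
print(decoded, '->', decoded_target)
```

Because decoding relies on `argmax`, the same decoding step also works on a model's softmax output, where each row holds probabilities rather than a single 1.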

In [None]:
#@title Function to generate a text sequence
# generate sequence
def generate_text_sequence(length, word_array):
    word_sequence = [random.choice(word_array) for _ in range(length)]
    text_sequence = ' '.join(word_sequence)
    return text_sequence

In [None]:
#@title Function to encode and decode text sequence
# one hot encode sequence
def one_hot_encode_text(text_sequence, word_array):
    encoding = []
    for word in text_sequence.split():
        vector = [0] * len(word_array)
        if word in word_array:
            vector[word_array.index(word)] = 1
        encoding.append(vector)
    return np.array(encoding)

# decode a one hot encoded string
def one_hot_decode_text(encoded_seq, word_array):
    decoded_sequence = [word_array[np.argmax(vector)] for vector in encoded_seq]
    return ' '.join(decoded_sequence)

In [None]:
#@title Function to generate reverse pair dataset
# create one reverse pair
def get_reversed_pairs(time_steps, word_array, verbose=False):
    # generate random sequence
    sequence_in = generate_text_sequence(time_steps, word_array)
    sequence_out = ' '.join(sequence_in.split()[::-1])

    # one hot encode
    X = one_hot_encode_text(sequence_in, word_array)
    y = one_hot_encode_text(sequence_out, word_array)
    # reshape as 3D
    X = X.reshape((1, X.shape[0], X.shape[1]))
    y = y.reshape((1, y.shape[0], y.shape[1]))

    if verbose:
        print('\nSample X and y')
        print('\nIn raw format:')
        print('X[0]=%s, y[0]=%s' % (one_hot_decode_text(X[0], word_array), one_hot_decode_text(y[0], word_array)))
        print('\nIn one_hot_encoded format:')
        print('X[0]=%s' % (X[0]))
        print('y[0]=%s' % (y[0]))
    return X, y

# create final dataset
def create_dataset(train_size, test_size, time_steps, word_array, verbose=False):
    # each pair has shape (1, time_steps, n_features); stacking gives
    # (size, 2, 1, time_steps, n_features), so drop only the singleton axis 2
    # (a bare squeeze() would also drop the sample axis when size == 1)
    pairs = [get_reversed_pairs(time_steps, word_array) for _ in range(train_size)]
    pairs = np.array(pairs).squeeze(axis=2)
    X_train = pairs[:, 0]
    y_train = pairs[:, 1]
    pairs = [get_reversed_pairs(time_steps, word_array) for _ in range(test_size)]
    pairs = np.array(pairs).squeeze(axis=2)
    X_test = pairs[:, 0]
    y_test = pairs[:, 1]

    if verbose:
        print('\nGenerated sequence datasets as follows')
        print('X_train.shape: ', X_train.shape, 'y_train.shape: ', y_train.shape)
        print('X_test.shape: ', X_test.shape, 'y_test.shape: ', y_test.shape)

    return X_train, y_train, X_test, y_test

In [None]:
#@title Function to train and evaluate model
def train_test(model, X_train, y_train , X_test, y_test, epochs=100, verbose=0):
    # patient early stopping
    es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=20)

    # train model (early stopping may end training before `epochs` is reached)
    print('training for up to', epochs, 'epochs begins with validation_split= 0.1 & EarlyStopping(monitor= val_loss, patience=20)....')
    history = model.fit(X_train, y_train, validation_split=0.1, epochs=epochs, verbose=verbose, callbacks=[es])
    print('training finished after', len(history.history['loss']), 'epochs...')

    # evaluate the model (assumes it was compiled with exactly one metric, accuracy)
    _, train_acc = model.evaluate(X_train, y_train, verbose=0)
    _, test_acc = model.evaluate(X_test, y_test, verbose=0)

    print('\nPREDICTION ACCURACY (%):')
    print('Train: %.3f, Test: %.3f' % (train_acc*100, test_acc*100))

    return model, history.history

In [None]:
#@title Function to visualize loss and accuracy
def visualize_history(history):
    # summarize history for accuracy
    plt.plot(history['accuracy'])
    plt.plot(history['val_accuracy'])
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['train', 'val'], loc='upper left')
    plt.show()
    # summarize history for loss
    plt.plot(history['loss'])
    plt.plot(history['val_loss'])
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.legend(['train', 'val'], loc='upper left')
    plt.show()

In [None]:
#@title Function to check some examples
def check_samples(model, X_test, y_test, word_array, num_samples=10):
    # pick distinct random examples from the test set
    sample_indices = random.sample(range(len(X_test)), num_samples)

    for idx in sample_indices:
        # add a batch dimension so a single example can be passed to predict
        X = np.expand_dims(X_test[idx], axis=0)
        yhat = model.predict(X, verbose=0)
        source = one_hot_decode_text(X[0], word_array)
        expected = one_hot_decode_text(y_test[idx], word_array)
        predicted = one_hot_decode_text(yhat[0], word_array)
        # the last printed value is True only if every word matches
        print(f"Input: {source} \nExpected: {expected} \nPredicted: {predicted} \n{expected == predicted}\n")
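
Note that check_samples compares whole sentences, while an accuracy metric reported during training is typically computed per time step (assuming the model is compiled with a standard per-token accuracy metric, which this notebook does not show). The following sketch contrasts the two measures with plain NumPy; the function name `token_and_sequence_accuracy` and the sample arrays are made up for this illustration.

```python
import numpy as np

# Hypothetical helper: compare per-word accuracy with whole-sentence accuracy.
def token_and_sequence_accuracy(y_true, y_pred):
    # both inputs have shape (samples, time_steps, n_features)
    true_ids = y_true.argmax(axis=-1)
    pred_ids = y_pred.argmax(axis=-1)
    matches = true_ids == pred_ids
    token_acc = matches.mean()                 # fraction of correct words
    sequence_acc = matches.all(axis=1).mean()  # fraction of fully correct sentences
    return token_acc, sequence_acc

# Made-up data: 2 sentences, 3 time steps, vocabulary of 4 words.
y_true = np.eye(4)[np.array([[0, 1, 2], [3, 2, 1]])]
y_pred = np.eye(4)[np.array([[0, 1, 2], [3, 2, 0]])]  # last word of sentence 2 is wrong

token_acc, sequence_acc = token_and_sequence_accuracy(y_true, y_pred)
print(f"token accuracy: {token_acc:.3f}, sequence accuracy: {sequence_acc:.3f}")
```

A single wrong word lowers the token accuracy only slightly but makes the whole sentence count as incorrect, so the two numbers can differ noticeably.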

#Create reverse sentence dataset


This code snippet is designed to generate a dataset for the **Reverse Sentence Task**. In this task, each data point consists of a sentence and its reversed counterpart. The essential parameters include the input sequence length (n_timesteps_in), the number of unique words (n_features), the size of the training dataset (train_size), and the size of the testing dataset (test_size).

The code accomplishes the following:
1. It generates a random sentence and its reversed version, one-hot encodes them, and optionally displays sample pairs of sentences to illustrate the dataset structure.
2. The code then creates training and testing datasets by generating pairs of random sentences and their reversals. The dataset sizes are determined by the parameters train_size and test_size.

This dataset is used to train and evaluate the models in this notebook on the **Reverse Sentence Task**, a simple toy problem for studying sequence-to-sequence learning.
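
The shape of the resulting arrays can be sketched with NumPy alone. The example below is an illustration under assumed, much smaller sizes (`n_samples`, `time_steps`, and `vocab_size` are made-up values), and it draws random word indices directly instead of calling the notebook's helper functions.

```python
import numpy as np

# Hypothetical small sizes, chosen only to keep the example readable.
n_samples, time_steps, vocab_size = 4, 6, 5

rng = np.random.default_rng(0)
# one random word index per time step of each sentence
word_ids = rng.integers(0, vocab_size, size=(n_samples, time_steps))

# one-hot encode: result has shape (n_samples, time_steps, vocab_size)
X_small = np.eye(vocab_size, dtype=int)[word_ids]
# the target is the input with the time axis reversed
y_small = X_small[:, ::-1, :]

print('X_small.shape:', X_small.shape, 'y_small.shape:', y_small.shape)
```

With the notebook's own parameters, the same layout would give arrays of shape (train_size, n_timesteps_in, n_features) for training and (test_size, n_timesteps_in, n_features) for testing.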

In [None]:
word_array = ["apple", "banana", "cherry", "orange", "strawberry",
             "carrot", "broccoli", "potato", "tomato", "cucumber",
             "rose", "tulip", "daisy", "lily", "sunflower",
             "red", "blue", "green", "yellow", "purple",
             "Colombo", "London", "Paris", "Tokyo", "Sydney",
             "car", "bus", "bicycle", "train", "motorcycle",
             "guitar", "piano", "violin", "trumpet", "flute",
             "beach", "mountain", "park", "desert", "island",
             "book", "computer", "chair", "table", "lamp",
             "dog", "cat", "bird", "elephant", "lion"]

In [None]:
#@title Generating dataset
# Default configuration parameters
n_timesteps_in = 6
n_features = len(word_array)
train_size = 20000
test_size = 200

# Generate random sequence using specified parameters
X, y = get_reversed_pairs(n_timesteps_in, word_array, verbose=True)

# Generate datasets using specified parameters
X_train, y_train, X_test, y_test = create_dataset(train_size, test_size, n_timesteps_in, word_array, verbose=True)

#1. Multi-Layer Perceptron network model

In [None]:
#@title Create Multi-Layer Perceptron network model

In [None]:
#@title Train and Evaluate Multi-Layer Perceptron network model

In [None]:
#@title Visualize training and validation Multi-Layer Perceptron network model

In [None]:
#@title Check random samples Multi-Layer Perceptron network model

#2. Recurrent Neural Networks

#2.1. Simple Recurrent Neural Network model (RNN)

In [None]:
#@title Create simple RNN model

In [None]:
#@title Train and Evaluate simple RNN model

In [None]:
#@title Visualize training and validation simple RNN model

In [None]:
#@title Check random samples simple RNN model

#2.2. Gated Recurrent Units model (GRU)

In [None]:
#@title Create GRU model

In [None]:
#@title Train and Evaluate GRU model

In [None]:
#@title Visualize training and validation GRU model

In [None]:
#@title Check random samples GRU model

#2.3. Long Short-Term Memory model (LSTM)



In [None]:
#@title Create LSTM model

In [None]:
#@title Train and Evaluate LSTM model

In [None]:
#@title Visualize training and validation LSTM model

In [None]:
#@title Check random samples LSTM model

#3. Information Sharing between RNN Layers

#3.1. LSTM model - Only last hidden state



In [None]:
#@title Create LSTM model with only last hidden state

In [None]:
#@title Train and Evaluate LSTM model with only last hidden state

In [None]:
#@title Visualize training and validation LSTM model with only last hidden state

In [None]:
#@title Check random samples LSTM model with only last hidden state

#3.2. LSTM model - Last hidden state and last cell state

In [None]:
#@title Create LSTM model with last hidden state and last cell state

In [None]:
#@title Train and Evaluate LSTM model with last hidden state and last cell state

In [None]:
#@title Visualize training and validation LSTM model with last hidden state and last cell state

In [None]:
#@title Check random samples LSTM model with last hidden state and last cell state

#3.3. LSTM model - All hidden states and last cell state

In [None]:
#@title Create LSTM model with all hidden states and last cell state

In [None]:
#@title Train and Evaluate LSTM model with all hidden states and last cell state

In [None]:
#@title Visualize training and validation LSTM model with all hidden states and last cell state

In [None]:
#@title Check random samples LSTM model with all hidden states and last cell state

#Reference
1. https://www.muratkarakaya.net/2022/11/seq2seq-learning-tutorial-series.html
2. https://deeplearningmath.org/sequence-models.html
