
In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
from sklearn.model_selection import train_test_split
import time

In [None]:
text = """Next character prediction is a fundamental task in the field of natural language processing (NLP) that involves predicting the next character in a sequence of text based on the characters that precede it. This task is essential for various applications, including text auto-completion, spell checking, and even in the development of sophisticated AI models capable of generating human-like text.

At its core, next character prediction relies on statistical models or deep learning algorithms to analyze a given sequence of text and predict which character is most likely to follow. These predictions are based on patterns and relationships learned from large datasets of text during the training phase of the model.

One of the most popular approaches to next character prediction involves the use of Recurrent Neural Networks (RNNs), and more specifically, a variant called Long Short-Term Memory (LSTM) networks. RNNs are particularly well-suited for sequential data like text, as they can maintain information in 'memory' about previous characters to inform the prediction of the next character. LSTM networks enhance this capability by being able to remember long-term dependencies, making them even more effective for next character prediction tasks.

Training a model for next character prediction involves feeding it large amounts of text data, allowing it to learn the probability of each character's appearance following a sequence of characters. During this training process, the model adjusts its parameters to minimize the difference between its predictions and the actual outcomes, thus improving its predictive accuracy over time.

Once trained, the model can be used to predict the next character in a given piece of text by considering the sequence of characters that precede it. This can enhance user experience in text editing software, improve efficiency in coding environments with auto-completion features, and enable more natural interactions with AI-based chatbots and virtual assistants.

In summary, next character prediction plays a crucial role in enhancing the capabilities of various NLP applications, making text-based interactions more efficient, accurate, and human-like. Through the use of advanced machine learning models like RNNs and LSTMs, next character prediction continues to evolve, opening new possibilities for the future of text-based technology."""


In [None]:
unique_chars = sorted(set(text))
char_to_ix = {ch: i for i, ch in enumerate(unique_chars)}
ix_to_char = {i: ch for i, ch in enumerate(unique_chars)}
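
As a quick sanity check on the two lookup tables above, the sketch below encodes a short string to indices and decodes it back. The sample string and the `demo_`-prefixed names are illustrative assumptions, not part of the original notebook; they are kept separate so the real `char_to_ix` and `ix_to_char` are not overwritten.

```python
# Hypothetical sample text, used only for illustration.
demo_text = "hello world"

# Build the same kind of lookup tables as the cell above.
demo_chars = sorted(set(demo_text))
demo_char_to_ix = {ch: i for i, ch in enumerate(demo_chars)}
demo_ix_to_char = {i: ch for i, ch in enumerate(demo_chars)}

# Encode a string into indices, then decode the indices back into a string.
encoded = [demo_char_to_ix[ch] for ch in "hello"]
decoded = "".join(demo_ix_to_char[i] for i in encoded)

print(demo_chars)
print(encoded)
print(decoded)
```

Because the characters are sorted before indexing, the mapping is deterministic for a given text, which is why rebuilding it inside `prepare_dataset` gives the same indices as the global tables.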


In [None]:
def prepare_dataset(text, max_length):
    # Initialize empty lists for sequences (X) and labels (y)
    X = []
    y = []

    # Create a character-to-index mapping for the unique characters in the text
    char_to_ix = {ch: i for i, ch in enumerate(sorted(set(text)))}

    # Iterate through the text to create sequences and corresponding labels
    for i in range(len(text) - max_length):
        # Extract a sequence of length 'max_length' and its label
        sequence = text[i:i + max_length]
        label = text[i + max_length]

        # Convert characters to indices using the char_to_ix mapping
        X.append([char_to_ix[char] for char in sequence])
        y.append(char_to_ix[label])

    # Convert lists to NumPy arrays
    X = np.array(X)
    y = np.array(y)

    return X, y
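
To make the sliding-window construction concrete, here is a small sketch on a toy string. `demo_windows` is a hypothetical helper written for this illustration; it mirrors the loop in `prepare_dataset` but keeps the characters instead of converting them to indices.

```python
def demo_windows(text, max_length):
    """Return (sequence, label) pairs using the same windowing as prepare_dataset."""
    pairs = []
    for i in range(len(text) - max_length):
        # The window is max_length characters; the label is the character right after it.
        pairs.append((text[i:i + max_length], text[i + max_length]))
    return pairs


# Toy input, chosen only for illustration.
pairs = demo_windows("abcdef", max_length=3)
for sequence, label in pairs:
    print(f"{sequence!r} -> {label!r}")
```

A text of length N yields N - max_length samples, so longer windows produce slightly fewer training examples from the same text.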

In [None]:


class RNNModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(RNNModel, self).__init__()

        # Set hyperparameters
        self.hidden_size = hidden_size

        # Define layers
        self.embedding = nn.Embedding(input_size, hidden_size)
        self.rnn = nn.RNN(hidden_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # Embed input sequence
        embedded = self.embedding(x)

        # Apply RNN layer
        output, _ = self.rnn(embedded)

        # Extract the last time step and pass through the linear layer
        output = self.fc(output[:, -1, :])

        return output


In [None]:
class LSTMModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(LSTMModel, self).__init__()

        # Set hyperparameters
        self.hidden_size = hidden_size

        # Define layers
        self.embedding = nn.Embedding(input_size, hidden_size)
        self.lstm = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # Embed input sequence
        embedded = self.embedding(x)

        # Apply LSTM layer
        output, _ = self.lstm(embedded)

        # Extract the last time step and pass through the linear layer
        output = self.fc(output[:, -1, :])

        return output

In [None]:
class GRUModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(GRUModel, self).__init__()

        # Set hyperparameters
        self.hidden_size = hidden_size

        # Define layers
        self.embedding = nn.Embedding(input_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # Pass the input sequence through the embedding layer
        embedded = self.embedding(x)

        # Apply GRU layer to the embedded sequence
        output, hidden_state = self.gru(embedded)

        # Extract the last time step's output and pass it through the linear layer
        output = self.fc(output[:, -1, :])

        return output

In [None]:
def count_parameters(model):
    # Count the number of trainable parameters in the model
    num_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return num_params
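
The counts reported by `count_parameters` can be cross-checked by hand. The sketch below assumes the layer layout used in the three model classes above: an embedding, one single-layer recurrent block with PyTorch's default pair of bias vectors, and a linear output layer. `expected_param_count` and the vocabulary size of 50 are hypothetical; the real vocabulary size is `len(unique_chars)`.

```python
def expected_param_count(vocab_size, hidden_size, gates):
    """Parameter count for embedding -> single recurrent layer -> linear.

    gates is 1 for nn.RNN, 3 for nn.GRU, and 4 for nn.LSTM.
    The embedding dimension equals hidden_size, as in the models above.
    """
    embedding = vocab_size * hidden_size
    # Per gate: input-to-hidden weights, hidden-to-hidden weights, two bias vectors.
    per_gate = hidden_size * hidden_size + hidden_size * hidden_size + 2 * hidden_size
    recurrent = gates * per_gate
    # Linear layer: weight matrix plus bias.
    linear = hidden_size * vocab_size + vocab_size
    return embedding + recurrent + linear


# Hypothetical vocabulary size, for illustration only.
demo_vocab_size = 50
for name, gates in [("RNN", 1), ("GRU", 3), ("LSTM", 4)]:
    print(name, expected_param_count(demo_vocab_size, 256, gates))
```

Under these assumptions the recurrent block dominates the total, so the GRU and LSTM models have roughly three and four times the recurrent parameters of the plain RNN.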


In [None]:
hidden_size = 256
learning_rate = 0.002
epochs = 130

In [None]:


def train_and_validate(model, X_train, y_train, X_val, y_val, criterion, optimizer, epochs, device):
    """
    Train and validate a PyTorch model.

    Args:
        model (nn.Module): The PyTorch model.
        X_train (torch.Tensor): Training input data.
        y_train (torch.Tensor): Training target data.
        X_val (torch.Tensor): Validation input data.
        y_val (torch.Tensor): Validation target data.
        criterion: Loss function.
        optimizer: Optimizer for training.
        epochs (int): Number of training epochs.
        device (torch.device): Device to which the model and data are moved.

    Returns:
        None
    """

    model = model.to(device)
    X_train, y_train = X_train.to(device), y_train.to(device)
    X_val, y_val = X_val.to(device), y_val.to(device)

    start_time = time.time()

    for epoch in range(epochs):
        model.train()
        optimizer.zero_grad()
        output = model(X_train)
        loss = criterion(output, y_train)
        loss.backward()
        optimizer.step()

        if (epoch + 1) % 10 == 0:
            model.eval()
            with torch.no_grad():
                val_output = model(X_val)
                val_loss = criterion(val_output, y_val)
                _, predicted = torch.max(val_output, 1)
                val_accuracy = (predicted == y_val).float().mean()

            # Print training and validation results
            print(f'Epoch {epoch + 1}, Loss: {loss.item()}, Validation Loss: {val_loss.item()}, Validation Accuracy: {val_accuracy.item()}')

    end_time = time.time()
    execution_time = end_time - start_time
    print(f"Total Execution Time: {execution_time} seconds")


In [None]:
# Prepare dataset for sequences of length 10
X_10, y_10 = prepare_dataset(text, max_length=10)

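# Split the data into training and validation sets using train_test_split.
# Note: consecutive windows overlap heavily, so a random split places
# near-duplicate sequences in both sets and validation scores may be optimistic.
X_train_10, X_val_10, y_train_10, y_val_10 = train_test_split(X_10, y_10, test_size=0.2, random_state=42)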

# Convert NumPy arrays to PyTorch tensors
X_train_10 = torch.tensor(X_train_10, dtype=torch.long)
y_train_10 = torch.tensor(y_train_10, dtype=torch.long)
X_val_10 = torch.tensor(X_val_10, dtype=torch.long)
y_val_10 = torch.tensor(y_val_10, dtype=torch.long)



In [None]:


# Prepare dataset for sequences of length 20
X_20, y_20 = prepare_dataset(text, max_length=20)

# Split the data into training and validation sets
X_train_20, X_val_20, y_train_20, y_val_20 = train_test_split(X_20, y_20, test_size=0.2, random_state=42)

# Convert NumPy arrays to PyTorch tensors
X_train_20 = torch.tensor(X_train_20, dtype=torch.long)
y_train_20 = torch.tensor(y_train_20, dtype=torch.long)
X_val_20 = torch.tensor(X_val_20, dtype=torch.long)
y_val_20 = torch.tensor(y_val_20, dtype=torch.long)


In [None]:
# Prepare dataset for sequences of length 30
X_30, y_30 = prepare_dataset(text, max_length=30)

# Split the data into training and validation sets
X_train_30, X_val_30, y_train_30, y_val_30 = train_test_split(X_30, y_30, test_size=0.2, random_state=42)

# Convert NumPy arrays to PyTorch tensors
X_train_30 = torch.tensor(X_train_30, dtype=torch.long)
y_train_30 = torch.tensor(y_train_30, dtype=torch.long)
X_val_30 = torch.tensor(X_val_30, dtype=torch.long)
y_val_30 = torch.tensor(y_val_30, dtype=torch.long)

In [None]:
rnn_model = RNNModel(len(unique_chars), hidden_size, len(unique_chars))
lstm_model = LSTMModel(len(unique_chars), hidden_size, len(unique_chars))
gru_model = GRUModel(len(unique_chars), hidden_size, len(unique_chars))


In [None]:
# Define CrossEntropyLoss criterion
criterion = nn.CrossEntropyLoss()

# Define Adam optimizers for RNN, LSTM, and GRU models
rnn_optimizer = optim.Adam(params=rnn_model.parameters(), lr=learning_rate)
lstm_optimizer = optim.Adam(params=lstm_model.parameters(), lr=learning_rate)
gru_optimizer = optim.Adam(params=gru_model.parameters(), lr=learning_rate)


In [None]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

In [None]:
print("RNN Model:")
train_and_validate(rnn_model, X_train_10, y_train_10, X_val_10, y_val_10, criterion, rnn_optimizer,epochs,device)
print(count_parameters(rnn_model))

RNN Model:
Epoch 10, Loss: 2.204080581665039, Validation Loss: 2.3146212100982666, Validation Accuracy: 0.3836477994918823
Epoch 20, Loss: 1.750116229057312, Validation Loss: 2.124817132949829, Validation Accuracy: 0.4109014570713043
Epoch 30, Loss: 1.40797758102417, Validation Loss: 1.9943206310272217, Validation Accuracy: 0.446540892124176
Epoch 40, Loss: 1.1076242923736572, Validation Loss: 1.9384828805923462, Validation Accuracy: 0.48427674174308777
Epoch 50, Loss: 0.8298348188400269, Validation Loss: 1.9624595642089844, Validation Accuracy: 0.5199161171913147
Epoch 60, Loss: 0.5753982663154602, Validation Loss: 2.032067060470581, Validation Accuracy: 0.5031446814537048
Epoch 70, Loss: 0.3654784858226776, Validation Loss: 2.16438889503479, Validation Accuracy: 0.5178197026252747
Epoch 80, Loss: 0.2133205384016037, Validation Loss: 2.3261218070983887, Validation Accuracy: 0.5031446814537048
Epoch 90, Loss: 0.12231635302305222, Validation Loss: 2.464176654815674, Validation Accuracy:
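
In the log above, training loss keeps falling while validation loss bottoms out and then rises, which suggests overfitting. A minimal sketch of picking the logged epoch with the lowest validation loss follows. The values are copied (rounded to four decimals) from the complete lines of the output above, and `best_epoch` is a hypothetical helper, not part of the notebook.

```python
# Validation losses for the RNN at sequence length 10, rounded from the log above.
val_losses = {
    10: 2.3146,
    20: 2.1248,
    30: 1.9943,
    40: 1.9385,
    50: 1.9625,
    60: 2.0321,
    70: 2.1644,
    80: 2.3261,
}


def best_epoch(losses):
    """Return the logged epoch whose validation loss is lowest."""
    return min(losses, key=losses.get)


print(best_epoch(val_losses))
```

Since validation is only run every 10 epochs, this identifies the best logged checkpoint rather than the true minimum.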

In [None]:
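# Note: rnn_model and rnn_optimizer are reused from the length-10 run, so this
# continues training from those weights rather than starting from scratch.
train_and_validate(rnn_model, X_train_20, y_train_20, X_val_20, y_val_20, criterion, rnn_optimizer, epochs, device)
print(count_parameters(rnn_model))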

Epoch 10, Loss: 0.30162516236305237, Validation Loss: 0.9746853113174438, Validation Accuracy: 0.7810526490211487
Epoch 20, Loss: 0.14231836795806885, Validation Loss: 1.0017677545547485, Validation Accuracy: 0.7473683953285217
Epoch 30, Loss: 0.08050323277711868, Validation Loss: 1.0949468612670898, Validation Accuracy: 0.7347368597984314
Epoch 40, Loss: 0.04798579588532448, Validation Loss: 1.1397864818572998, Validation Accuracy: 0.7410526275634766
Epoch 50, Loss: 0.0322086475789547, Validation Loss: 1.1717268228530884, Validation Accuracy: 0.7389473915100098
Epoch 60, Loss: 0.02426544949412346, Validation Loss: 1.2148566246032715, Validation Accuracy: 0.730526328086853
Epoch 70, Loss: 0.02012496255338192, Validation Loss: 1.2314369678497314, Validation Accuracy: 0.7263157963752747
Epoch 80, Loss: 0.017672916874289513, Validation Loss: 1.2462728023529053, Validation Accuracy: 0.7242105007171631
Epoch 90, Loss: 0.016013430431485176, Validation Loss: 1.261729121208191, Validation Accu

In [None]:
print("RNN Model:")
train_and_validate(rnn_model, X_train_30, y_train_30, X_val_30, y_val_30, criterion, rnn_optimizer, epochs,device)
print(count_parameters(rnn_model))


RNN Model:
Epoch 10, Loss: 0.019167689606547356, Validation Loss: 0.7448652982711792, Validation Accuracy: 0.8287526369094849
Epoch 20, Loss: 0.01282532513141632, Validation Loss: 0.7504331469535828, Validation Accuracy: 0.8287526369094849
Epoch 30, Loss: 0.009540220722556114, Validation Loss: 0.7375034689903259, Validation Accuracy: 0.8350951671600342
Epoch 40, Loss: 0.007875164039433002, Validation Loss: 0.7486464977264404, Validation Accuracy: 0.8329809904098511
Epoch 50, Loss: 0.006886853836476803, Validation Loss: 0.7524468302726746, Validation Accuracy: 0.8287526369094849
Epoch 60, Loss: 0.006215290632098913, Validation Loss: 0.7601218223571777, Validation Accuracy: 0.8245242834091187
Epoch 70, Loss: 0.005718742497265339, Validation Loss: 0.7656318545341492, Validation Accuracy: 0.8266384601593018
Epoch 80, Loss: 0.005327148362994194, Validation Loss: 0.7727333903312683, Validation Accuracy: 0.8245242834091187
Epoch 90, Loss: 0.005004767794162035, Validation Loss: 0.7797280550003

In [None]:
print("LSTM Model:")
train_and_validate(lstm_model, X_train_10, y_train_10, X_val_10, y_val_10, criterion, lstm_optimizer,epochs, device)
print(count_parameters(lstm_model))

LSTM Model:
Epoch 10, Loss: 2.5516715049743652, Validation Loss: 2.485450029373169, Validation Accuracy: 0.3249475955963135
Epoch 20, Loss: 2.0436315536499023, Validation Loss: 2.176022529602051, Validation Accuracy: 0.42138364911079407
Epoch 30, Loss: 1.6361463069915771, Validation Loss: 2.0043509006500244, Validation Accuracy: 0.44863730669021606
Epoch 40, Loss: 1.2730463743209839, Validation Loss: 1.8949774503707886, Validation Accuracy: 0.48218029737472534
Epoch 50, Loss: 0.9295815229415894, Validation Loss: 1.864094614982605, Validation Accuracy: 0.5052410960197449
Epoch 60, Loss: 0.6324527263641357, Validation Loss: 1.8942550420761108, Validation Accuracy: 0.5073375105857849
Epoch 70, Loss: 0.3773435354232788, Validation Loss: 1.9715511798858643, Validation Accuracy: 0.4947589039802551
Epoch 80, Loss: 0.20819148421287537, Validation Loss: 2.089708089828491, Validation Accuracy: 0.46960169076919556
Epoch 90, Loss: 0.11929092556238174, Validation Loss: 2.2224555015563965, Validatio

In [None]:

train_and_validate(lstm_model, X_train_20, y_train_20, X_val_20, y_val_20, criterion, lstm_optimizer, epochs, device)
print(count_parameters(lstm_model))

Epoch 10, Loss: 0.5648934841156006, Validation Loss: 0.9973219633102417, Validation Accuracy: 0.7410526275634766
Epoch 20, Loss: 0.32111856341362, Validation Loss: 1.062440276145935, Validation Accuracy: 0.7136842012405396
Epoch 30, Loss: 0.1807396113872528, Validation Loss: 1.1228240728378296, Validation Accuracy: 0.6926316022872925
Epoch 40, Loss: 0.10763271898031235, Validation Loss: 1.161333680152893, Validation Accuracy: 0.6863157749176025
Epoch 50, Loss: 0.06855873763561249, Validation Loss: 1.1874879598617554, Validation Accuracy: 0.6800000071525574
Epoch 60, Loss: 0.04864540323615074, Validation Loss: 1.2171175479888916, Validation Accuracy: 0.6694737076759338
Epoch 70, Loss: 0.037469975650310516, Validation Loss: 1.2383980751037598, Validation Accuracy: 0.6673684120178223
Epoch 80, Loss: 0.030485359951853752, Validation Loss: 1.2610251903533936, Validation Accuracy: 0.6652631759643555
Epoch 90, Loss: 0.02603078819811344, Validation Loss: 1.2806742191314697, Validation Accuracy

In [None]:

train_and_validate(lstm_model, X_train_30, y_train_30, X_val_30, y_val_30, criterion, lstm_optimizer, epochs, device)
print(count_parameters(lstm_model))

Epoch 10, Loss: 0.08281832188367844, Validation Loss: 0.4681369662284851, Validation Accuracy: 0.8816067576408386
Epoch 20, Loss: 0.03869743272662163, Validation Loss: 0.561776340007782, Validation Accuracy: 0.8562368154525757
Epoch 30, Loss: 0.023055076599121094, Validation Loss: 0.5998199582099915, Validation Accuracy: 0.8372092843055725
Epoch 40, Loss: 0.016188932582736015, Validation Loss: 0.6104481220245361, Validation Accuracy: 0.8393234610557556
Epoch 50, Loss: 0.0127890445291996, Validation Loss: 0.6243615746498108, Validation Accuracy: 0.8372092843055725
Epoch 60, Loss: 0.01078642439097166, Validation Loss: 0.6334517598152161, Validation Accuracy: 0.8350951671600342
Epoch 70, Loss: 0.009466378018260002, Validation Loss: 0.6414098739624023, Validation Accuracy: 0.8372092843055725
Epoch 80, Loss: 0.008511037565767765, Validation Loss: 0.6481615900993347, Validation Accuracy: 0.8435518145561218
Epoch 90, Loss: 0.0077705979347229, Validation Loss: 0.6557490825653076, Validation Ac

In [88]:
print("GRU Model:")
train_and_validate(gru_model, X_train_10, y_train_10, X_val_10, y_val_10, criterion, gru_optimizer,epochs, device)
print(count_parameters(gru_model))

GRU Model:
Epoch 10, Loss: 2.344938039779663, Validation Loss: 2.34242844581604, Validation Accuracy: 0.3563941419124603
Epoch 20, Loss: 1.8512067794799805, Validation Loss: 2.111920118331909, Validation Accuracy: 0.4297693967819214
Epoch 30, Loss: 1.454903244972229, Validation Loss: 1.9604456424713135, Validation Accuracy: 0.4716981053352356
Epoch 40, Loss: 1.1046106815338135, Validation Loss: 1.8878982067108154, Validation Accuracy: 0.5031446814537048
Epoch 50, Loss: 0.7899007797241211, Validation Loss: 1.8893554210662842, Validation Accuracy: 0.5220125913619995
Epoch 60, Loss: 0.5263925194740295, Validation Loss: 1.9535163640975952, Validation Accuracy: 0.5387840867042542
Epoch 70, Loss: 0.32988712191581726, Validation Loss: 2.059784412384033, Validation Accuracy: 0.5324947834014893
Epoch 80, Loss: 0.19832926988601685, Validation Loss: 2.2002086639404297, Validation Accuracy: 0.5262054800987244
Epoch 90, Loss: 0.11936786025762558, Validation Loss: 2.341063976287842, Validation Accur

In [89]:
print("GRU Model:")
train_and_validate(gru_model, X_train_20, y_train_20, X_val_20, y_val_20, criterion, gru_optimizer,epochs, device)
print(count_parameters(gru_model))

GRU Model:
Epoch 10, Loss: 0.18730255961418152, Validation Loss: 0.888883113861084, Validation Accuracy: 0.8021052479743958
Epoch 20, Loss: 0.08144550770521164, Validation Loss: 0.9288865923881531, Validation Accuracy: 0.8021052479743958
Epoch 30, Loss: 0.04682812839746475, Validation Loss: 1.0188289880752563, Validation Accuracy: 0.7810526490211487
Epoch 40, Loss: 0.03153342753648758, Validation Loss: 1.071380853652954, Validation Accuracy: 0.7810526490211487
Epoch 50, Loss: 0.022716475650668144, Validation Loss: 1.097452163696289, Validation Accuracy: 0.7768421173095703
Epoch 60, Loss: 0.01781545951962471, Validation Loss: 1.1149767637252808, Validation Accuracy: 0.7768421173095703
Epoch 70, Loss: 0.015446878038346767, Validation Loss: 1.1298617124557495, Validation Accuracy: 0.7768421173095703
Epoch 80, Loss: 0.014033342711627483, Validation Loss: 1.1394327878952026, Validation Accuracy: 0.7789473533630371
Epoch 90, Loss: 0.013084528967738152, Validation Loss: 1.1501317024230957, Va

In [90]:
print("GRU Model:")
train_and_validate(gru_model, X_train_30, y_train_30, X_val_30, y_val_30, criterion, gru_optimizer,epochs, device)
print(count_parameters(gru_model))

GRU Model:
Epoch 10, Loss: 0.061818927526474, Validation Loss: 0.4742857813835144, Validation Accuracy: 0.8816067576408386
Epoch 20, Loss: 0.02443995326757431, Validation Loss: 0.5293704867362976, Validation Accuracy: 0.8710359334945679
Epoch 30, Loss: 0.014393078163266182, Validation Loss: 0.5423020124435425, Validation Accuracy: 0.873150110244751
Epoch 40, Loss: 0.009872260503470898, Validation Loss: 0.5413827300071716, Validation Accuracy: 0.8773784637451172
Epoch 50, Loss: 0.007356345187872648, Validation Loss: 0.5511928796768188, Validation Accuracy: 0.8773784637451172
Epoch 60, Loss: 0.006115723866969347, Validation Loss: 0.5615842342376709, Validation Accuracy: 0.8752642869949341
Epoch 70, Loss: 0.005405019968748093, Validation Loss: 0.5673173069953918, Validation Accuracy: 0.8773784637451172
Epoch 80, Loss: 0.0049299742095172405, Validation Loss: 0.5732458829879761, Validation Accuracy: 0.8752642869949341
Epoch 90, Loss: 0.004583163186907768, Validation Loss: 0.5770503878593445