<a href="https://colab.research.google.com/github/shuvad23/Deep-learning-with-PyTorch/blob/main/Question_Answering_System_using_PyTorch(RNN)_.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

1️⃣ What is an RNN?

- A Recurrent Neural Network (RNN) is a type of artificial neural network designed to process sequential data by maintaining a memory (hidden state) of previous inputs.

- Unlike feedforward neural networks, RNNs have feedback (recurrent) connections: the hidden state computed at time t-1 is fed back into the network as an extra input at time t.

📌 Key idea:

- RNNs remember past information to influence current predictions.


2️⃣ Why Do We Need RNNs?

- Feedforward neural networks treat each input independently of the others, but many real-world problems involve sequences in which order and context matter.

| Problem Type            | Why RNN                            |
| ----------------------- | ---------------------------------- |
| Speech recognition      | Meaning depends on previous sounds |
| Language translation    | Word order matters                 |
| Stock price prediction  | Past prices affect future          |
| Sentiment analysis      | Context from earlier words         |
| Time-series forecasting | Sequential dependency              |


4️⃣ How RNN Works (Step-by-Step)

- Input enters at time t

- Previous hidden state is combined with current input

- Activation function updates hidden state

- Hidden state produces output

- Hidden state is passed to the next time step

➡ This loop continues until the sequence ends.
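
The steps above can be sketched in a few lines of plain Python. This is only a minimal illustration, assuming a tanh activation and tiny hand-picked weights; the names `rnn_step`, `run_rnn`, `W_xh`, `W_hh` and `b_h` are hypothetical and are not taken from the model trained later in this notebook, where `nn.RNN` runs an equivalent loop internally.

```python
import math

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One time step: combine the current input with the previous hidden state."""
    h_t = []
    for j in range(len(b_h)):
        total = b_h[j]
        total += sum(W_xh[j][i] * x_t[i] for i in range(len(x_t)))        # input contribution
        total += sum(W_hh[j][k] * h_prev[k] for k in range(len(h_prev)))  # memory contribution
        h_t.append(math.tanh(total))                                      # activation updates the state
    return h_t

def run_rnn(sequence, W_xh, W_hh, b_h):
    """Run the recurrence over a whole sequence and return every hidden state."""
    h = [0.0] * len(b_h)          # initial hidden state: all zeros
    states = []
    for x_t in sequence:          # the loop continues until the sequence ends
        h = rnn_step(x_t, h, W_xh, W_hh, b_h)
        states.append(h)
    return states

# Hypothetical toy weights: 2-dimensional input, 2-dimensional hidden state.
W_xh = [[0.5, -0.3], [0.1, 0.8]]
W_hh = [[0.2, 0.0], [0.0, 0.2]]
b_h = [0.0, 0.0]
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

states = run_rnn(sequence, W_xh, W_hh, b_h)
for t, h in enumerate(states):
    print(t, [round(v, 4) for v in h])
```

Feeding the same three inputs in reverse order ends in a different final hidden state, which is the sense in which the network "remembers" what came earlier.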


5️⃣ Types of RNN Architectures
🔹 Based on Input–Output Mapping

| Type                | Example              |
| ------------------- | -------------------- |
| One-to-One          | Image classification |
| One-to-Many         | Image captioning     |
| Many-to-One         | Sentiment analysis   |
| Many-to-Many        | Machine translation  |
| Many-to-Many (sync) | POS tagging          |




In [3]:
import pandas as pd

df = pd.read_csv('/content/100_Unique_QA_Dataset.csv')
df.head()

Unnamed: 0,question,answer
0,What is the capital of France?,Paris
1,What is the capital of Germany?,Berlin
2,Who wrote 'To Kill a Mockingbird'?,Harper-Lee
3,What is the largest planet in our solar system?,Jupiter
4,What is the boiling point of water in Celsius?,100


In [5]:
# tokenize
import re
def tokenize_the_text(text):
    # lowercase so that "France" and "france" share one token
    text = text.lower()
    # drop punctuation: anything that is not a word character or whitespace
    text = re.sub(r'[^\w\s]', '', text)
    # split() with no arguments splits on any run of whitespace
    # and ignores leading/trailing whitespace
    tokens = text.split()
    return tokens

In [7]:
tokenize_the_text(df['question'][0])

['what', 'is', 'the', 'capital', 'of', 'france']
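
One consequence of these cleaning rules is worth seeing directly: because punctuation is deleted rather than replaced by a space, a hyphenated answer such as `Harper-Lee` collapses into a single token. The sketch below restates the tokenizer in condensed form (same rules, fewer lines) and compares a few inputs; the space-separated "Harper Lee" is a made-up contrast case, not a row from the dataset.

```python
import re

def tokenize_the_text(text):
    # Same cleaning rules as the notebook's tokenizer: lowercase,
    # drop everything that is not a word character or whitespace, then split.
    text = text.lower()
    text = re.sub(r'[^\w\s]', '', text)
    return text.split()

quoted = tokenize_the_text("Who wrote 'To Kill a Mockingbird'?")
hyphenated = tokenize_the_text("Harper-Lee")
spaced = tokenize_the_text("Harper Lee")

print(quoted)
print(hyphenated)
print(spaced)
```

The hyphenated form yields one token and the spaced form yields two, so writing multi-word answers with hyphens keeps each answer a single vocabulary entry.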

In [12]:
#vocab
vocab = {'<UNK>': 0}
def build_vocab(row):
    tokenized_question = tokenize_the_text(row['question'])
    tokenized_answer = tokenize_the_text(row['answer'])
    merged_tokens = tokenized_question + tokenized_answer

    for token in merged_tokens:
        if token not in vocab:
            vocab[token] = len(vocab)


In [16]:
df.apply(build_vocab, axis=1)
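
The cell above fills the global `vocab` as a side effect of `df.apply`. As an alternative sketch, the same first-come-first-numbered scheme can be written as a function that returns the mapping. The names `tokenize` and `build_vocab_from_pairs` are hypothetical, and the two question/answer pairs are copied from the `df.head()` output rather than loaded from the CSV.

```python
import re

def tokenize(text):
    # condensed version of the notebook's cleaning rules
    return re.sub(r'[^\w\s]', '', text.lower()).split()

def build_vocab_from_pairs(pairs):
    """Return a token -> index mapping; index 0 is reserved for unknown words."""
    vocab = {'<UNK>': 0}
    for question, answer in pairs:
        for token in tokenize(question) + tokenize(answer):
            if token not in vocab:
                vocab[token] = len(vocab)   # next free index
    return vocab

pairs = [
    ("What is the capital of France?", "Paris"),
    ("What is the capital of Germany?", "Berlin"),
]
vocab = build_vocab_from_pairs(pairs)
print(vocab)
```

Tokens shared between rows ("what", "is", "the", ...) are numbered only once, so the second row adds just two new entries.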

In [21]:
# convert words to numerical indices
def text_to_indices(text,vocab):
    tokenized_text = tokenize_the_text(text)
    indices = [vocab.get(token, vocab['<UNK>']) for token in tokenized_text]
    return indices
print(text_to_indices(df['question'][0],vocab))

[1, 2, 3, 4, 5, 6]
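
The `vocab.get(token, vocab['<UNK>'])` lookup is what keeps the pipeline from failing on words it has never seen. A small sketch, assuming a hypothetical miniature vocabulary that mirrors the first dataset row ("Atlantis" is a made-up out-of-vocabulary word):

```python
import re

def text_to_indices(text, vocab):
    # condensed tokenizer plus the same fallback lookup as the notebook
    tokens = re.sub(r'[^\w\s]', '', text.lower()).split()
    return [vocab.get(token, vocab['<UNK>']) for token in tokens]

# Hypothetical miniature vocabulary, mirroring the first row of the dataset.
vocab = {'<UNK>': 0, 'what': 1, 'is': 2, 'the': 3, 'capital': 4, 'of': 5, 'france': 6}

known = text_to_indices("What is the capital of France?", vocab)
unseen = text_to_indices("What is the capital of Atlantis?", vocab)
print(known)
print(unseen)
```

Every unseen word maps to the same index 0, so the model cannot tell two different unknown words apart.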


In [22]:
# create class dataset
import torch
from torch.utils.data import Dataset, DataLoader

class QADataset(Dataset):
    def __init__(self, df, vocab):
        self.df = df
        self.vocab = vocab

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        row = self.df.iloc[idx]
        question = row['question']
        answer = row['answer']
        question_indices = text_to_indices(question, self.vocab)
        answer_indices = text_to_indices(answer, self.vocab)
        return torch.tensor(question_indices), torch.tensor(answer_indices)


In [23]:
dataset = QADataset(df, vocab)

In [24]:
dataloader = DataLoader(dataset, batch_size=1, shuffle=True)

In [26]:
for question_indices, answer_indices in dataloader:
    print(question_indices,answer_indices)

tensor([[ 42,  86,  87, 241, 242,  19,  39, 243]]) tensor([[244]])
tensor([[ 10,  29, 130, 131]]) tensor([[132]])
tensor([[ 1,  2,  3, 33, 34,  5, 35]]) tensor([[36]])
tensor([[ 78,  79, 195,  81,  19,   3, 196, 197, 198]]) tensor([[199]])
tensor([[  1,   2,   3, 221,   5, 222, 223, 224]]) tensor([[225]])
tensor([[  1,   2,   3,   4,   5, 109]]) tensor([[317]])
tensor([[  1,   2,   3,  92, 137,  19,   3,  45]]) tensor([[185]])
tensor([[  1,   2,   3,   4,   5, 135]]) tensor([[136]])
tensor([[ 10,  11, 189, 158, 190]]) tensor([[191]])
tensor([[ 10,  75,   3, 296,  19, 297]]) tensor([[298]])
tensor([[  1,   2,   3, 122, 123,  19,   3,  45]]) tensor([[124]])
tensor([[ 42, 137,   2, 138,  39, 175, 269]]) tensor([[99]])
tensor([[ 42, 137, 118,   3, 247,   5, 248]]) tensor([[249]])
tensor([[  1,   2,   3, 163, 164, 165,  83,  84]]) tensor([[166]])
tensor([[ 42,   2,   3, 274, 211, 275]]) tensor([[276]])
tensor([[ 10, 140,   3, 141, 171,   5,   3,  70, 172]]) tensor([[173]])
tensor([[ 42, 200
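
The printed questions have different lengths, which is why the loader uses `batch_size=1`: the default collation stacks tensors and needs them to share a shape. One common workaround, not used in this notebook, is to right-pad each batch to its longest sequence. The sketch below assumes a hypothetical `pad_batch` helper and a hypothetical `PAD_IDX`; the notebook's vocabulary defines no padding token, and index 0 is already taken by `<UNK>`.

```python
PAD_IDX = 999  # assumption: index of a '<PAD>' entry that the notebook's vocab does not define

def pad_batch(sequences, pad_idx=PAD_IDX):
    """Right-pad every index list to the length of the longest one."""
    max_len = max(len(seq) for seq in sequences)
    padded = [seq + [pad_idx] * (max_len - len(seq)) for seq in sequences]
    lengths = [len(seq) for seq in sequences]   # true lengths, useful for ignoring the padding later
    return padded, lengths

# Three question index lists copied from the printed output above.
batch = [
    [10, 29, 130, 131],
    [1, 2, 3, 33, 34, 5, 35],
    [1, 2, 3, 4, 5, 109],
]
padded, lengths = pad_batch(batch)
for row in padded:
    print(row)
print(lengths)
```

In a PyTorch pipeline this logic would typically be passed to `DataLoader` through its `collate_fn` argument, and `torch.nn.utils.rnn.pad_sequence` provides a tensor-based equivalent.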

In [33]:
import torch.nn as nn
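class SimpleRNN(nn.Module):
    def __init__(self, vocab_size):
        super().__init__()
        # token index -> 50-dimensional embedding vector
        self.embedding = nn.Embedding(vocab_size, embedding_dim=50)
        self.rnn = nn.RNN(input_size=50, hidden_size=128, batch_first=True)
        # final hidden state -> one logit per vocabulary word
        self.fc = nn.Linear(128, vocab_size)

    def forward(self, question):
        embedded = self.embedding(question)            # (batch, seq_len, 50)
        # nn.RNN returns the hidden state at every step and the final hidden state
        all_states, final_state = self.rnn(embedded)   # (batch, seq_len, 128), (1, batch, 128)
        output = self.fc(final_state)                  # (1, batch, vocab_size)
        return output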
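
To make the tensor shapes in the forward pass concrete, here is a small standalone sketch. It uses the same layer sizes as the model above but a made-up vocabulary of 20 words and untrained weights, so only the shapes are meaningful, not the values.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Same layer sizes as the notebook's model, with a toy vocabulary of 20 words (assumption).
embedding = nn.Embedding(num_embeddings=20, embedding_dim=50)
rnn = nn.RNN(input_size=50, hidden_size=128, batch_first=True)
fc = nn.Linear(128, 20)

question = torch.tensor([[1, 2, 3, 4, 5, 6]])   # one question of six tokens
embedded = embedding(question)

# nn.RNN returns two tensors: the hidden state at every time step,
# and the hidden state after the last time step.
all_states, final_state = rnn(embedded)
logits = fc(final_state)

print(embedded.shape)
print(all_states.shape)
print(final_state.shape)
print(logits.shape)
```

For a single-layer, unidirectional RNN the last row of the per-step states is the final state, so the model is a many-to-one architecture: it reads the whole question and scores the vocabulary once.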

In [30]:
lr = 0.001
epochs = 20

In [34]:
model = SimpleRNN(len(vocab))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr= lr)
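
`nn.CrossEntropyLoss` takes raw logits and, for each example, returns the negative log of the softmax probability assigned to the correct class. The plain-Python sketch below reproduces that arithmetic for a single example, assuming a made-up 4-word vocabulary and hypothetical logits; `cross_entropy` here is an illustrative helper, not the PyTorch implementation.

```python
import math

def cross_entropy(logits, target_idx):
    """Negative log of the softmax probability assigned to the target class."""
    max_logit = max(logits)   # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[target_idx]

logits = [2.0, 0.5, -1.0, 0.1]                       # hypothetical scores for 4 words
loss_if_right = cross_entropy(logits, target_idx=0)  # target is the highest-scoring word
loss_if_wrong = cross_entropy(logits, target_idx=2)  # target is the lowest-scoring word
loss_uniform = cross_entropy([0.0, 0.0, 0.0, 0.0], target_idx=1)

print(loss_if_right, loss_if_wrong, loss_uniform)
```

The loss is small when the target already has the largest logit and grows as the target's logit falls behind the others. With all logits equal it comes out to the log of the vocabulary size.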

In [37]:
# training loop
for epoch in range(epochs):

    model.train()
    total_loss = 0
    correct = 0
    for question_indices, answer_indices in dataloader:
        optimizer.zero_grad()
        # forward pass: output has shape (1, batch, vocab_size)
        output = model(question_indices)
        # loss: flatten logits to (batch, vocab_size) and targets to (batch,)
        loss = criterion(output.view(-1, len(vocab)), answer_indices.view(-1))
        # gradients
        loss.backward()
        # update
        optimizer.step()

        total_loss += loss.item()
        # count predictions that match the target answer token
        correct += (output.argmax(dim=2) == answer_indices).sum().item()

    # note: both metrics are measured on the training data itself
    print(f'Epoch: {epoch+1}/{epochs} Loss: {total_loss/len(dataloader)}, Accuracy: {correct/len(dataloader.dataset)}')

Epoch: 1/20 Loss: 0.004901245412313276, Accuracy: 1.0
Epoch: 2/20 Loss: 0.004618310910235676, Accuracy: 1.0
Epoch: 3/20 Loss: 0.004336724810612699, Accuracy: 1.0
Epoch: 4/20 Loss: 0.004066997797538837, Accuracy: 1.0
Epoch: 5/20 Loss: 0.0038321863911632034, Accuracy: 1.0
Epoch: 6/20 Loss: 0.003610570160930769, Accuracy: 1.0
Epoch: 7/20 Loss: 0.003397528427497794, Accuracy: 1.0
Epoch: 8/20 Loss: 0.0032115936343972053, Accuracy: 1.0
Epoch: 9/20 Loss: 0.0030303542230588694, Accuracy: 1.0
Epoch: 10/20 Loss: 0.0028620210696115264, Accuracy: 1.0
Epoch: 11/20 Loss: 0.0026920376785306466, Accuracy: 1.0
Epoch: 12/20 Loss: 0.0025465024285949768, Accuracy: 1.0
Epoch: 13/20 Loss: 0.0024107081181783644, Accuracy: 1.0
Epoch: 14/20 Loss: 0.002280667315547665, Accuracy: 1.0
Epoch: 15/20 Loss: 0.0021596377780143585, Accuracy: 1.0
Epoch: 16/20 Loss: 0.002046862945684956, Accuracy: 1.0
Epoch: 17/20 Loss: 0.0019359969747407982, Accuracy: 1.0
Epoch: 18/20 Loss: 0.0018355056391252825, Accuracy: 1.0
Epoch: 19

In [58]:
def predict(model, question, vocab, threshold=0.5):
    # convert the question to vocabulary indices
    numerical_question = text_to_indices(question, vocab)
    if not numerical_question:
        # nothing left after tokenization, so there is nothing to feed the RNN
        return "I Don't Know"
    # add a batch dimension: (seq_len,) -> (1, seq_len)
    question_tensor = torch.tensor(numerical_question).unsqueeze(0)
    # predict without tracking gradients
    model.eval()
    with torch.no_grad():
        output = model(question_tensor)
    # convert logits to probabilities over the vocabulary
    probabilities = torch.nn.functional.softmax(output, dim=2)
    # most likely vocabulary index and its probability
    confidence, predicted_word_idx = torch.max(probabilities[0, 0], dim=0)
    if confidence < threshold:
        return "I Don't Know"
    # reverse lookup: vocabulary index -> word
    predicted_word = [k for k, v in vocab.items() if v == predicted_word_idx.item()][0]
    return predicted_word, confidence

In [59]:
question = input("Enter Your Question?: ")
predict(model, question, vocab)

Enter Your Question?: Who painted the Mona Lisa?


('leonardodavinci', tensor(0.9987))
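
The confidence threshold in `predict` can be illustrated without a trained model. The sketch below applies softmax to two hypothetical logit vectors over a made-up 4-word vocabulary; `softmax`, `answer_or_abstain` and `index_to_word` are illustrative names, and the reverse dictionary is one possible alternative to scanning `vocab.items()` on every call.

```python
import math

def softmax(logits):
    m = max(logits)                            # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def answer_or_abstain(logits, index_to_word, threshold=0.5):
    """Return the most probable word, or abstain if the model is not confident enough."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)   # index of the largest probability
    if probs[best] < threshold:
        return "I Don't Know"
    return index_to_word[best]

# Hypothetical reverse vocabulary: index -> word.
index_to_word = {0: '<UNK>', 1: 'paris', 2: 'berlin', 3: 'jupiter'}

peaked = answer_or_abstain([0.1, 6.0, 0.2, 0.3], index_to_word)   # one logit dominates
flat = answer_or_abstain([0.1, 0.4, 0.2, 0.3], index_to_word)     # logits are nearly equal

print(peaked)
print(flat)
```

When one logit dominates, its probability clears the threshold and the word is returned. When the logits are nearly equal, no word reaches 0.5 and the function abstains.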