
# **NOTE:**

I used **all-MiniLM-L6-v2**, an open-source sentence-embedding model from Hugging Face.

# **Reason for not using ChatGPT: a billing issue**

The OpenAI API rejected my requests with the following error:

**RateLimitError: Error code: 429 -**
{'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

# **Library Installation**

1. transformers
Purpose: Provides pre-trained models and tools for natural language processing (NLP) tasks.
Example Models: BERT, GPT-2, T5.

2. sentence-transformers
Purpose: Builds and applies sentence embeddings (numerical representations of text), following the Sentence-BERT approach.

3. faiss-cpu
Purpose: A library for efficient similarity search and clustering of dense vectors.

4. pandas
Purpose: A library for data analysis and manipulation.

5. streamlit
Purpose: A library for building simple web apps in Python. It is installed below but not used in the cells of this notebook.

6. pyngrok
Purpose: Provides a Python interface for ngrok, a tool that exposes local servers to the internet.
Usage: Helps share local projects (e.g., Streamlit apps) through secure public URLs.

In [None]:
!pip install transformers sentence-transformers faiss-cpu pandas
!pip install streamlit
!pip install pyngrok
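
A quick way to confirm that the install cell worked is to check that each package's import module can be found. The following is a minimal sketch, not part of the original notebook; `PIP_TO_MODULE` and `missing_packages` are hypothetical names introduced here. Note that the pip name and the import name differ for some packages.

```python
from importlib.util import find_spec

# Maps each pip package name to the module name used in `import` statements.
# The two differ for some packages (faiss-cpu is imported as faiss).
PIP_TO_MODULE = {
    "transformers": "transformers",
    "sentence-transformers": "sentence_transformers",
    "faiss-cpu": "faiss",
    "pandas": "pandas",
    "streamlit": "streamlit",
    "pyngrok": "pyngrok",
}


def missing_packages(pip_to_module):
    """Return the pip names whose import module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in pip_to_module.items()
        if find_spec(module_name) is None
    ]


print("Missing packages:", missing_packages(PIP_TO_MODULE) or "none")
```

If the list is not empty, re-running the install cell (or restarting the runtime) is usually the first thing to try.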

# **Importing Modules and Libraries**

In [None]:
from transformers import pipeline, AutoTokenizer, AutoModelForSeq2SeqLM
from sentence_transformers import SentenceTransformer
import faiss
import pandas as pd

# **Fetching Model**

**model_name:**

Specifies the pre-trained sentence-transformer model, all-MiniLM-L6-v2, a small and fast model that maps each text to a 384-dimensional embedding suited to semantic search.

**SentenceTransformer:**

Loads the model used to create sentence embeddings, that is, to convert text into numerical vectors.

**embedder.encode:**

Encodes the list of answers (data["answer"]) into dense embeddings.
convert_to_tensor=True: Returns the embeddings as a single PyTorch tensor instead of a NumPy array.
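
The notebook reads `knowledge_base.csv` and uses two of its columns, `question` and `answer`. The file itself is not shown, so the sketch below only illustrates that two-column layout. The rows are made-up placeholders and `write_knowledge_base` is a hypothetical helper, not part of the original notebook.

```python
import csv
import io

# Hypothetical rows for illustration only; replace them with real content.
ROWS = [
    {"question": "What are your opening hours?",
     "answer": "We are open 9am to 5pm, Monday to Friday."},
    {"question": "How do I reset my password?",
     "answer": "Use the 'Forgot password' link on the login page."},
]


def write_knowledge_base(rows, handle):
    """Write rows to an open text handle using the two columns the notebook reads."""
    writer = csv.DictWriter(handle, fieldnames=["question", "answer"])
    writer.writeheader()
    writer.writerows(rows)


# Write to an in-memory buffer here; pass a real file handle to create the CSV.
buffer = io.StringIO()
write_knowledge_base(ROWS, buffer)
print(buffer.getvalue())
```

To create an actual file, open it with `open("knowledge_base.csv", "w", newline="")` and pass the handle to `write_knowledge_base`.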

In [None]:
data = pd.read_csv('knowledge_base.csv')

model_name = "all-MiniLM-L6-v2"  # Lightweight sentence-transformer model
embedder = SentenceTransformer(model_name)
embeddings = embedder.encode(data["answer"].tolist(), convert_to_tensor=True)


In [None]:
import numpy as np

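# FAISS works on float32 NumPy arrays, so move the tensor to the CPU and convert it
embedding_matrix = embeddings.cpu().numpy().astype("float32")

# Build a flat (exact, brute-force) index that ranks vectors by L2 distance
faiss_index = faiss.IndexFlatL2(embedding_matrix.shape[1])
faiss_index.add(embedding_matrix)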

# Save questions for reference
questions = data["question"].tolist()


# **Function Creation**

**retrieve_answer(query, k=1):**

**Purpose:** Retrieves the k most relevant knowledge-base entries for a given query, ranked by embedding similarity.

**Parameters:**
query: The input string for which an answer is being sought.
k: The number of top results to retrieve (default: 1).

**Steps:**

1. embedder.encode: Converts the input query into an embedding.
2. faiss_index.search: Finds the k stored embeddings closest to the query using L2 distance and returns their distances and indices. Because the index was built from the answer texts, the query is compared against the answers, not the questions.
3. Result construction: Maps the indices to their corresponding questions and answers from the dataset.

**Return Value:**
A list of (question, answer) tuples, closest match first.
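
To make the search step concrete, here is a minimal pure-Python sketch of the exhaustive comparison that a flat L2 index performs. It is an illustration under stated assumptions, not FAISS itself: `l2_search` is a hypothetical helper, and the 2-D vectors and `q0`/`a0` labels are toy values standing in for real embeddings and knowledge-base rows.

```python
def l2_search(index_vectors, query, k=1):
    """Return (distances, indices) of the k stored vectors closest to query.

    Distances are squared L2 distances, which is also what FAISS's
    IndexFlatL2 reports.
    """
    scored = []
    for position, vector in enumerate(index_vectors):
        distance = sum((a - b) ** 2 for a, b in zip(vector, query))
        scored.append((distance, position))
    scored.sort()  # smallest distance first
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]


# Toy 2-D vectors standing in for real embeddings (illustrative values only).
toy_index = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
toy_questions = ["q0", "q1", "q2"]
toy_answers = ["a0", "a1", "a2"]

distances, indices = l2_search(toy_index, [0.9, 1.2], k=2)
results = [(toy_questions[i], toy_answers[i]) for i in indices]
print(results)
```

One difference from this sketch: FAISS accepts a batch of queries and returns one row of results per query, which is why the notebook's function reads `indices[0]`.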

In [None]:
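def retrieve_answer(query, k=1):
    # Encode the query as a NumPy array of shape (1, dim), the layout FAISS expects
    query_embedding = embedder.encode([query], convert_to_tensor=False)
    # search returns one row of distances and indices per query; only the indices are used
    _, indices = faiss_index.search(np.array(query_embedding), k)
    # Pair each hit with its stored question and answer (positional lookup)
    return [(questions[i], data["answer"].iloc[i]) for i in indices[0]]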


In [None]:
# Load the sequence-to-sequence model used for generation (this reassigns model_name)
model_name = "t5-small"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)


In [None]:
def generate_response(query):
    # Retrieve the most relevant answer for the given query
    retrieved_answer = retrieve_answer(query)[0][1]

    # Prepare the input text by combining the query and the retrieved answer
    input_text = f"question: {query} answer: {retrieved_answer}"

    # Tokenize the input text for the model
    # - return_tensors="pt": Converts the output into PyTorch tensors
    # - max_length=512: Ensures the input does not exceed the model's maximum length
    # - truncation=True: Truncates input text that exceeds the maximum length
    inputs = tokenizer(input_text, return_tensors="pt", max_length=512, truncation=True)

    # Generate a response from the model
    # - max_length=150: Limits the length of the generated response
    outputs = model.generate(**inputs, max_length=150)

    # Decode the model's output tokens into a human-readable string
    # - skip_special_tokens=True: Removes special tokens such as <pad> and </s>
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


In [None]:
while True:
    # Prompt the user to input a query
    user_query = input("Ask a question: ")

    # Check if the user wants to exit the loop
    if user_query.lower() == "exit":
        break  # Exit the loop if the user types "exit"

    # Generate and print the response for the user's query
    print("Answer:", generate_response(user_query))
