<a href="https://colab.research.google.com/github/PSivaMallikarjun/simple-web-based-RAG-chatbot/blob/main/RAG_chatbot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>



# A simple web-based RAG chatbot using Gradio for user interaction



New Features:
* Gradio UI: a chatbot interface where users ask questions.
* Dynamic Document Uploading: users can supply their own files (the cells below read `sample.txt`; replace it with your own text file).
* Enhanced Retrieval: uses FAISS for efficient similarity search over paragraph embeddings.
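
The cells in this notebook read a fixed `sample.txt`. As a rough sketch of how a user-supplied file could be turned into the same kind of paragraph list, the helper below reads any UTF-8 text file and splits it on blank lines. The name `load_paragraphs` is hypothetical (it is not part of the notebook), and wiring it to a Gradio file input is left out here.

```python
from pathlib import Path


def load_paragraphs(path):
    """Read a UTF-8 text file and return its non-empty paragraphs."""
    text = Path(path).read_text(encoding="utf-8")
    # Paragraphs are separated by blank lines, as in the notebook's sample file
    return [p.strip() for p in text.split("\n\n") if p.strip()]
```

The returned list could stand in for `data` before the embeddings are computed. Stripping and filtering also drops the empty chunks that a plain `split("\n\n")` leaves behind when a file starts with a newline or contains several blank lines in a row.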

This RAG chatbot is built with Gradio and a few standalone libraries instead of LangChain: Sentence Transformers for embeddings, FAISS for retrieval, and Hugging Face Transformers for text generation.

In [2]:
!pip install gradio faiss-cpu sentence-transformers transformers


Collecting faiss-cpu
  Downloading faiss_cpu-1.10.0-cp311-cp311-manylinux_2_28_x86_64.whl.metadata (4.4 kB)
Downloading faiss_cpu-1.10.0-cp311-cp311-manylinux_2_28_x86_64.whl (30.7 MB)
Installing collected packages: faiss-cpu
Successfully installed faiss-cpu-1.10.0


In [10]:
import gradio as gr
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer
from transformers import pipeline

# Load sentence transformer model for embedding generation
embed_model = SentenceTransformer("all-MiniLM-L6-v2")

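# Load documents (replace 'sample.txt' with your own file).
# The file must exist first; the sample-file cell in this notebook creates it.
with open("sample.txt", "r", encoding="utf-8") as f:
    # Split on blank lines, dropping empty or whitespace-only paragraphs
    data = [p.strip() for p in f.read().split("\n\n") if p.strip()]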

# Generate embeddings
doc_embeddings = embed_model.encode(data, convert_to_numpy=True)

# Create FAISS index
dimension = doc_embeddings.shape[1]
faiss_index = faiss.IndexFlatL2(dimension)
faiss_index.add(doc_embeddings)

# Load text-generation model (Hugging Face Transformers)
generator = pipeline("text-generation", model="gpt2", max_new_tokens=50)


Device set to use cpu
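
To make the retrieval step more concrete, here is a minimal NumPy sketch of what an exhaustive ("flat") L2 search computes: the squared Euclidean distance from the query to every stored vector, followed by a sort. `l2_search` is a hypothetical helper written for illustration, not a FAISS function, and the 3-dimensional vectors are made-up toy data.

```python
import numpy as np


def l2_search(doc_vectors, query_vector, k):
    """Return (squared distances, indices) of the k nearest rows by L2 distance."""
    diffs = doc_vectors - query_vector          # broadcast the query over every row
    sq_dists = np.sum(diffs * diffs, axis=1)    # squared Euclidean distance per document
    order = np.argsort(sq_dists)[:k]            # smallest distances first
    return sq_dists[order], order


# Toy 3-dimensional "embeddings"; real all-MiniLM-L6-v2 vectors have 384 dimensions
docs = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]], dtype=np.float32)
query = np.array([1.0, 0.0, 0.0], dtype=np.float32)
dists, idx = l2_search(docs, query, k=2)
```

Here the query matches document 0 exactly and document 2 is the next closest. Like this sketch, `faiss.IndexFlatL2` reports squared distances, so smaller values mean closer matches.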


In [11]:
# Create a sample knowledge base file
sample_content = """
RAG (Retrieval-Augmented Generation) is a technique that enhances AI models by retrieving relevant documents from a knowledge base before generating responses.

How RAG Works:
1. User asks a question.
2. The system retrieves relevant documents using a vector store.
3. The retrieved content is used to generate a response.

FAISS (Facebook AI Similarity Search) is an efficient similarity search library used for fast document retrieval.

Gradio is an easy-to-use UI library for AI models, allowing real-time user interaction with machine learning models.

Streamlit is another UI framework used to build interactive AI applications with minimal coding.

GPT-2 is a text-generation model by OpenAI, which can generate responses based on retrieved information.

To implement a simple RAG chatbot:
- Use FAISS for fast retrieval.
- Use Sentence Transformers for embedding generation.
- Use GPT-2 for response generation.
- Use Gradio or Streamlit for the user interface.

This chatbot does not require an API key and runs locally in Google Colab.
"""

# Save the content to a text file
with open("sample.txt", "w", encoding="utf-8") as f:
    f.write(sample_content)

print("sample.txt file created successfully!")


sample.txt file created successfully!


In [12]:
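def chatbot(input_text):
    # Guard against empty queries
    if not input_text or not input_text.strip():
        return "Please enter a question."

    # Convert input query to embedding
    input_embedding = embed_model.encode([input_text], convert_to_numpy=True)

    # Retrieve the top 2 matching documents (fewer if the knowledge base is smaller)
    k = min(2, len(data))
    distances, indices = faiss_index.search(input_embedding, k)
    retrieved_docs = " ".join(data[i] for i in indices[0] if i >= 0)

    # Generate response using GPT-2; the pipeline returns the prompt plus the
    # continuation, so keep only the text after the final "Answer:" marker
    prompt = f"Based on the following information, answer the query:\n{retrieved_docs}\nQuery: {input_text}\nAnswer:"
    response = generator(prompt)[0]["generated_text"].split("Answer:")[-1].strip()

    return response if response else "I couldn't find an answer."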
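
The prompt handling in the function above can be tried without loading any model. The sketch below separates it into two small helpers, `build_prompt` and `extract_answer` (hypothetical names, not part of the notebook), and feeds them a hand-written stand-in for the model output.

```python
def build_prompt(retrieved_docs, query):
    """Assemble the prompt in the same layout the chatbot function uses."""
    return (
        "Based on the following information, answer the query:\n"
        f"{retrieved_docs}\nQuery: {query}\nAnswer:"
    )


def extract_answer(generated_text):
    """Keep only the text after the last 'Answer:' marker."""
    return generated_text.split("Answer:")[-1].strip()


prompt = build_prompt("FAISS is a similarity search library.", "What is FAISS?")
# Stand-in for the model output: a text-generation pipeline returns the prompt
# followed by the continuation by default. The continuation here is invented.
fake_output = prompt + " A library for similarity search."
answer = extract_answer(fake_output)
```

One caveat of splitting on the marker: if the continuation itself contains "Answer:" again, only the text after the last occurrence is kept.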


In [13]:
gr.Interface(fn=chatbot, inputs="text", outputs="text", title="RAG Chatbot").launch(share=True)

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://05d520c385766a6300.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)


