# 📚 LLM-powered RAG chatbot using LangChain, Ollama, and ChromaDB for local document querying.

## Project Overview

This project implements a **Retrieval-Augmented Generation (RAG) chatbot** that answers questions about a custom text file (e.g., `model.txt`). It combines a locally served large language model (LLM) with retrieval over the document, so answers are grounded in the file's content. The pipeline splits the input document into chunks, indexes the chunks as vector embeddings, and answers queries through a Gradio web interface.

## Key Features

- **Upload or Preload Document**: Loads a `.txt` document for querying; the notebook below preloads a static file, `model.txt`.
- **Embeddings and Retrieval**: Converts the text into vector embeddings with an Ollama-served embedding model (`nomic-embed-text`) and runs a semantic similarity search to fetch the relevant chunks.
- **LLM-Powered Answers**: Uses a language model to generate answers from the retrieved chunks, so responses draw on the document rather than only on the model's training data.
- **Clean & Generic Output**: Strips system-level and internal model artifacts such as `<think>` and `<|im_start|>` so that answers stay readable.
- **Minimalist Gradio UI**: A simple interface where users type a question and receive an answer drawn from the document.

## Technologies Used

- `LangChain`: For chaining vector database retrieval with LLM responses.
- `ChromaDB`: For local vector storage and similarity search over the document.
- `Ollama`: Serves the embedding model (`nomic-embed-text`) and the LLM (`deepseek-r1:1.5b-qwen-distill-fp16`) locally.
- `Gradio`: For building the web interface in a few lines of Python.
- `Python`: Core language for scripting, processing, and logic handling.

## Use Cases

- Summarizing or querying academic research papers.
- Understanding product manuals or documentation.
- Interactive Q&A over any domain-specific content.
- Lightweight alternative to full-fledged AI search engines with controllable context.

## How It Works

1. **Preprocess Document**: The input text file is split into overlapping chunks for better retrieval granularity.
2. **Embed and Store**: Each chunk is converted to vector form and stored in a Chroma vector database.
3. **User Query**: The user asks a question via the UI.
4. **Retrieve + Answer**: The most relevant text chunks are retrieved and passed to an LLM, which generates a clean response.
5. **Return Clean Result**: All internal tokens and artifacts are stripped before displaying to the user.

---

This project shows how to combine conventional document processing with an LLM-based natural language interface to answer questions grounded in a specific document.


**Install Required Libraries (If Needed)**

In [None]:
!pip install langchain langchain-community chromadb gradio


**Load and Split the Document**

Load a .txt file and break it into manageable chunks for embedding and retrieval.

In [None]:
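# Recent LangChain releases ship loaders in langchain-community and
# splitters in langchain-text-splitters
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import CharacterTextSplitter

# Load the source document from disk
loader = TextLoader("model.txt")
documents = loader.load()

# Split on blank lines, then merge the pieces into chunks of about
# 500 characters with a 50-character overlap
text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
docs = text_splitter.split_documents(documents)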
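
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a minimal pure-Python sketch of a fixed-size sliding window. It is a simplification and not LangChain's implementation: `CharacterTextSplitter` first splits on a separator (a blank line by default) and then merges the pieces, so its chunk boundaries will differ. The function name `sliding_chunks` and the dummy text are made up for illustration.

```python
def sliding_chunks(text, chunk_size=500, chunk_overlap=50):
    """Cut text into fixed-size windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # each window starts this far after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


sample = "".join(str(i % 10) for i in range(1200))  # 1200 characters of dummy text
chunks = sliding_chunks(sample)
print([len(c) for c in chunks])  # [500, 500, 300]
```

With a step of 450 characters, the last 50 characters of each chunk reappear at the start of the next one, so a sentence that straddles a boundary is still seen whole in at least one chunk.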



# **Create Embeddings and Vectorstore**

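Convert document chunks into vector embeddings and store them in a Chroma vector database. This step assumes an Ollama server is running locally and that the `nomic-embed-text` model has already been pulled.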

In [None]:
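# Recent LangChain releases ship these integrations in langchain-community
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

# Embed each chunk with a local Ollama model and index the vectors in Chroma
embeddings = OllamaEmbeddings(model="nomic-embed-text")
vectorstore = Chroma.from_documents(docs, embeddings)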
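
Conceptually, retrieval compares the query's embedding with every stored chunk embedding and keeps the closest matches. The sketch below illustrates this with cosine similarity over hand-written 3-dimensional vectors. Real embeddings have far more dimensions, and Chroma uses its own index and distance settings rather than this brute-force loop. All names, texts, and vectors here are invented for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the k (text, score) pairs most similar to query_vec."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in store]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy "vector store": (chunk text, made-up embedding)
store = [
    ("The model uses 12 transformer layers.", [0.9, 0.1, 0.0]),
    ("Training ran for three epochs.", [0.1, 0.9, 0.1]),
    ("The cafeteria opens at 8 am.", [0.0, 0.1, 0.9]),
]
query_vec = [0.8, 0.2, 0.0]  # pretend embedding of a question about model layers

results = top_k(query_vec, store)
for text, score in results:
    print(f"{score:.3f}  {text}")
```

The chunk about transformer layers ranks first because its vector points in nearly the same direction as the query vector.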


# **Set Up Retriever and LLM**

Configure the retriever to search relevant document chunks and use an LLM to generate answers.



In [None]:
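# Recent LangChain releases ship the Ollama wrapper in langchain-community
from langchain_community.llms import Ollama
from langchain.chains import RetrievalQA

# The default retriever returns the four chunks most similar to the query
retriever = vectorstore.as_retriever()
llm = Ollama(model="deepseek-r1:1.5b-qwen-distill-fp16")

# Combine retrieval and generation; only the answer text is returned
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    return_source_documents=False
)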
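
By default, `RetrievalQA.from_chain_type` uses the "stuff" strategy: the retrieved chunks are concatenated into a single prompt together with the question. The sketch below mimics that step in plain Python. The template wording, the sample chunks, and the function name `build_prompt` are illustrative assumptions, not LangChain's actual prompt.

```python
def build_prompt(question, chunks):
    """Concatenate retrieved chunks and the question into one prompt string."""
    context = "\n\n".join(chunk.strip() for chunk in chunks)
    return (
        "Use the following context to answer the question. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Made-up chunks standing in for what the retriever would return
retrieved = [
    "The model uses 12 transformer layers.",
    "Training ran for three epochs.  ",
]
prompt = build_prompt("How many layers does the model have?", retrieved)
print(prompt)
```

Because every retrieved chunk is placed in the same prompt, the chunk size and the number of chunks retrieved together determine how much of the model's context window the document occupies.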


**Define Chatbot Logic**

Create a function to handle user input and return context-based responses.

In [None]:
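import re

def chatbot(query):
    # RetrievalQA takes the question under "query" and returns the answer under "result"
    response = qa_chain.invoke({"query": query})["result"]

    # Clean up LLM artifacts and generic tags
    if isinstance(response, str):
        # Drop whole <think>...</think> reasoning blocks, then any stray special tokens
        cleaned = re.sub(r"<think>.*?</think>", "", response, flags=re.DOTALL)
        cleaned = re.sub(r"<\|im_start\|>|<\|im_end\|>|</?think>", "", cleaned).strip()
        return cleaned if cleaned else "Sorry, I couldn’t find an answer to that."
    return "Sorry, I couldn’t process your question."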
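
Reasoning models such as DeepSeek-R1 typically emit their chain of thought between `<think>` and `</think>`, so cleanup has to remove the enclosed text as well as the tags. The following standalone sketch shows one way to do that with regular expressions. The helper name `strip_artifacts` and the sample reply are hypothetical.

```python
import re

# A complete <think>...</think> block, including the reasoning text inside it
THINK_BLOCK = re.compile(r"<think>.*?</think>", flags=re.DOTALL)
# Chat-template markers and any unpaired think tags left behind
SPECIAL_TOKEN = re.compile(r"<\|im_start\|>|<\|im_end\|>|</?think>")


def strip_artifacts(text):
    """Remove reasoning blocks and leftover special tokens from a model reply."""
    text = THINK_BLOCK.sub("", text)    # drop complete reasoning blocks first
    text = SPECIAL_TOKEN.sub("", text)  # then remove stray or unpaired markers
    return text.strip()


raw = "<think>\nThe context mentions 12 layers.\n</think>\n\nThe model has 12 layers.<|im_end|>"
print(strip_artifacts(raw))  # The model has 12 layers.
```

Removing the paired block before the individual tokens matters: if the tags were deleted first, the reasoning text between them would remain in the answer.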


# **Build and Launch Gradio Interface**

Create a user-friendly web interface for chatting with your local document-powered assistant.

In [None]:
import gradio as gr

iface = gr.Interface(
    fn=chatbot,
    inputs=gr.Textbox(lines=2, placeholder="Ask a question about model.txt..."),
    outputs=gr.Textbox(label="Response"),
    title="📚 Local RAG Chatbot",
    description="Interact with a Retrieval-Augmented Generation (RAG) chatbot that answers based on the loaded document (model.txt)."
)

# share=True also creates a temporary public URL; call iface.launch() to keep the app on localhost
iface.launch(share=True)
