# Installing Required Packages

- `langchain`: The core library for creating LLM-based applications. It offers tools for document loading, splitting, embedding, vector storage, memory, and more.
- `langchain-community`: Contains community-maintained integrations for document loaders, vector stores, and models.
- `langchain-mistralai`: Provides access to **Mistral LLMs** via LangChain, enabling us to use Mistral as our language model backend.
- `chromadb`: A lightweight and fast vector database used to store and retrieve document embeddings efficiently. It's a key component of our retrieval system.
- `gradio`: A UI library that allows us to build an interactive web interface where users can type questions and receive AI-generated answers from our fairy tale chatbot.

By installing these packages, we’re setting up the software environment to support every core function of the RAG architecture: **data ingestion, preprocessing, embedding, retrieval, generation, and user interaction**.
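
After the install command below has run, it can be useful to confirm that every package is actually present before moving on. The following is a minimal sketch using only the standard library; the helper name `installed_versions` and the `REQUIRED` list are illustrative and not part of the original notebook.

```python
from importlib import metadata

# Distribution names as passed to pip in this notebook.
REQUIRED = ["langchain", "langchain-community", "langchain-mistralai", "chromadb", "gradio"]

def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

for name, version in installed_versions(REQUIRED).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as `NOT INSTALLED` points to a failed or skipped install step.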


In [None]:
!pip install -q langchain langchain-community langchain-mistralai chromadb gradio

## Importing Required Modules

This cell imports all the core libraries and components needed to build the Retrieval-Augmented Generation (RAG) chatbot pipeline.


- `PyPDFLoader`:  
  Loads and parses a single PDF file. The fairy tale documents are ingested later with the related `PyPDFDirectoryLoader`, which loads every PDF in a folder.

- `RecursiveCharacterTextSplitter`:  
  Splits long text documents into smaller, overlapping chunks. This improves retrieval quality and helps ensure that each chunk is within the token limit for embedding.

- `Chroma`:  
  A vector database used to store and retrieve text embeddings. It performs similarity searches when a user asks a question.

- `HuggingFaceEmbeddings`:  
  Transforms text into vector embeddings using a transformer-based model from Hugging Face.

- `ConversationBufferMemory`:  
  Maintains the conversation history, allowing the chatbot to respond with context-aware answers over multiple turns.

- `ConversationalRetrievalChain`:  
  A LangChain component that combines document retrieval and LLM-based answer generation in a single chain.

- `ChatMistralAI`:  
  Provides an interface to the Mistral large language model via API, which is used to generate answers to user queries.

- `gradio`:  
  A Python library for building interactive web UIs. This will be used to create the chatbot interface.

- `os`:  
  Standard Python module used here to handle environment variables, such as setting the API key securely.

These components form the foundational building blocks of the RAG system: loading, splitting, embedding, storing, retrieving, and responding to user queries.


In [None]:
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain
from langchain_mistralai.chat_models import ChatMistralAI
import gradio as gr
import os

## Setting the Mistral API Key and Initializing the Language Model

This cell configures access to the Mistral language model by setting the required API key and initializing the model.


- `os.environ["MISTRAL_API_KEY"] = "key"`  
  This line sets the `MISTRAL_API_KEY` environment variable, which is used to authenticate requests made to the Mistral API.  
  In production or shared environments, the actual key should be stored securely and not hardcoded.

- `ChatMistralAI(model="mistral-small", temperature=0)`  
  Initializes the Mistral language model with the specified parameters:
  - `model="mistral-small"` refers to the specific model variant being used.
  - `temperature=0` sets the randomness of the output to zero, making the model’s responses more deterministic and consistent.

This model will be used later in the pipeline to generate natural language responses based on the user’s query and the retrieved document content.
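
One way to avoid hardcoding the key is to read it from the environment and prompt for it only when it is missing. This is a sketch under assumptions: the helper name `load_api_key` is hypothetical, and it relies only on the `MISTRAL_API_KEY` environment variable described above.

```python
import os
from getpass import getpass

def load_api_key(env_var="MISTRAL_API_KEY", prompt=getpass):
    """Return the key from the environment, prompting only if it is not set."""
    key = os.environ.get(env_var)
    if not key:
        # getpass hides the typed value, so the key does not appear in the notebook output.
        key = prompt(f"Enter {env_var}: ")
        os.environ[env_var] = key  # make the key visible to later cells
    return key

# Usage (not run here because it may prompt for input):
# load_api_key()
```

Calling `load_api_key()` before creating `ChatMistralAI` keeps the secret out of the saved notebook.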


In [None]:
os.environ["MISTRAL_API_KEY"] = "key"  # placeholder: replace with your own key and never commit a real one
mistral_llm = ChatMistralAI(model="mistral-small", temperature=0)


## Installing PDF Parser and Loading Documents

This cell performs two key actions: installing a PDF parsing library and loading the fairy tale PDF documents from a directory.



- `!pip install pypdf`  
  Installs the `pypdf` library, which is a dependency for parsing and reading PDF files. This is required for LangChain’s PDF loaders to function correctly.

- `from langchain_community.document_loaders import PyPDFDirectoryLoader`  
  Imports the `PyPDFDirectoryLoader` class, which allows batch loading of all PDF files from a specified folder.

- `loader = PyPDFDirectoryLoader("/content")`  
  Creates a document loader instance targeting the `/content` directory (default working directory in Google Colab). All PDF files placed in this folder will be read.

- `documents = loader.load()`  
  Loads and parses the PDF files into a list of LangChain `Document` objects, one per PDF page. Each document contains:
  - The textual content extracted from that page (`page_content`)
  - Metadata such as the source file path and page number (`metadata`)

These documents will later be split into chunks, embedded into vectors, and stored in a vector database for retrieval during chatbot interactions.
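
A quick sanity check after loading is to count how many pages came from each file. The sketch below assumes each document exposes a `metadata` dictionary with a `"source"` entry; the helper `pages_per_source` and the sample data are illustrative stand-ins, not real loader output.

```python
from collections import Counter
from types import SimpleNamespace

def pages_per_source(documents):
    """Count how many page-level documents came from each source file."""
    return Counter(doc.metadata.get("source", "unknown") for doc in documents)

# Stand-in objects with the same two attributes as a LangChain Document (made-up data).
sample_docs = [
    SimpleNamespace(page_content="Once upon a time...", metadata={"source": "/content/a.pdf", "page": 0}),
    SimpleNamespace(page_content="...happily ever after.", metadata={"source": "/content/a.pdf", "page": 1}),
    SimpleNamespace(page_content="Far out in the ocean...", metadata={"source": "/content/b.pdf", "page": 0}),
]

for source, count in pages_per_source(sample_docs).items():
    print(f"{source}: {count} page(s)")
```

In the notebook, the same helper could be called as `pages_per_source(documents)` to spot files that failed to parse or produced no pages.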


In [None]:
!pip install pypdf



In [None]:
from langchain_community.document_loaders import PyPDFDirectoryLoader
loader = PyPDFDirectoryLoader("/content")
documents = loader.load()

## Splitting Documents into Chunks

This cell splits the loaded documents into smaller text chunks using LangChain’s text splitter utility. Splitting is essential for efficient embedding and retrieval in a RAG pipeline.

- `RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)`  
  Initializes a text splitter that breaks large documents into smaller segments of up to 500 characters each.  
  The `chunk_overlap=50` means that up to 50 characters from the end of one chunk are repeated at the start of the next chunk.  
  This overlapping technique helps preserve context and avoids cutting off important information at chunk boundaries.

- `docs = splitter.split_documents(documents)`  
  Applies the text splitter to the previously loaded `documents`.  
  The result is a list of smaller, manageable text chunks stored in `docs`. Each chunk retains the original document’s metadata.

Splitting documents is a critical preprocessing step in RAG systems. It ensures that:
- The input size fits within token limits of embedding models and LLMs
- Retrieval is more fine-grained and contextually relevant
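
To see how `chunk_size` and `chunk_overlap` interact, here is a simplified fixed-window sketch. It is not the algorithm used by `RecursiveCharacterTextSplitter`, which prefers to break on separators such as paragraphs and sentences, so real chunks are usually shorter and the overlap is approximate. The function name `sliding_chunks` is hypothetical.

```python
def sliding_chunks(text, chunk_size=500, chunk_overlap=50):
    """Cut text into fixed-size windows that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # each window starts this far after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

sample = "".join(str(i % 10) for i in range(1200))  # 1200 characters of dummy text
chunks = sliding_chunks(sample)
print([len(c) for c in chunks])            # [500, 500, 300]
print(chunks[0][-50:] == chunks[1][:50])   # True: the last 50 characters are repeated
```

With a window of 500 and an overlap of 50, each new chunk starts 450 characters after the previous one.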


In [None]:
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
docs = splitter.split_documents(documents)

## Creating Embeddings and Initializing Vector Store

This cell sets up the document embedding model and initializes the vector database for efficient semantic search.


We use `HuggingFaceEmbeddings` with the pre-trained model `"all-MiniLM-L6-v2"` to convert each chunk of text into a dense numerical vector. This model is widely used due to its excellent trade-off between speed and semantic performance. It captures sentence-level meaning and is lightweight enough for real-time inference on consumer-grade hardware. Hugging Face models also run locally and are open-source, making them cost-effective and flexible for academic and prototype projects.

The vector store is implemented using `Chroma`, which supports both in-memory and persistent storage of vector data. `Chroma.from_documents()` takes the list of preprocessed document chunks and the embedding model, computes an embedding for each chunk, and stores the results for later retrieval. Finally, we obtain a `retriever` object from the vector store using `.as_retriever()`, which enables semantic similarity search based on user queries; `search_kwargs={"k": 5}` asks for the five closest chunks per query.

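This approach ensures that when a user asks a question, the system can find and retrieve the most relevant text chunks from the embedded knowledge base by comparing the query's embedding with the stored chunk embeddings. Note that Chroma ranks by squared L2 distance by default rather than cosine similarity; for unit-length embeddings the two give the same ranking.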
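
The idea behind top-k retrieval can be sketched in a few lines of plain Python. The three-dimensional vectors below are made up for illustration (real embeddings from `all-MiniLM-L6-v2` have 384 dimensions), cosine similarity is used as the example metric, and the helper names are hypothetical. A vector database adds indexing on top of this so that it does not have to score every chunk.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, indexed_chunks, k=5):
    """Return the k (score, text) pairs most similar to query_vec, best first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed_chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]

# Toy index: (chunk text, made-up embedding).
toy_index = [
    ("The mermaid traded her voice for legs.", [0.9, 0.1, 0.0]),
    ("Rapunzel let down her long hair.", [0.1, 0.9, 0.0]),
    ("The sea witch lived among the polyps.", [0.8, 0.2, 0.1]),
]
query = [1.0, 0.0, 0.0]  # pretend embedding of a question about the mermaid

for score, text in top_k(query, toy_index, k=2):
    print(f"{score:.3f}  {text}")
```

The two mermaid-related chunks rank first because their vectors point in nearly the same direction as the query vector.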


In [None]:
embedding = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
vectordb = Chroma.from_documents(docs, embedding=embedding)
retriever = vectordb.as_retriever(search_kwargs={"k": 5})



## Initializing Conversation Memory

This cell sets up memory management for the chatbot using LangChain's `ConversationBufferMemory`. Memory is essential for maintaining the context of multi-turn conversations.



- `ConversationBufferMemory(...)`  
  Creates a memory object that stores the full conversation history in memory, enabling the chatbot to respond with awareness of prior exchanges. It helps produce more coherent and context-aware answers.

#### Parameters:
- `memory_key="chat_history"`  
  Defines the key used to store and retrieve past dialogue messages from memory.

- `return_messages=True`  
  Ensures that past messages are returned in their original message format (rather than raw text).

- `output_key="answer"`  
  Tells the memory which chain output to save as the AI's reply. This is needed because the chain returns more than one output (the answer and the source documents), and only `"answer"` should be written to the history.

Using memory in RAG applications enhances the chatbot’s ability to hold meaningful conversations over multiple turns rather than responding in isolation.
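
The roles of `memory_key` and `output_key` can be illustrated with a toy buffer. This is a simplified stand-in written for this explanation, not the LangChain class: the name `ToyBufferMemory` is hypothetical, and messages are stored as plain tuples rather than message objects.

```python
class ToyBufferMemory:
    """Simplified stand-in for a conversation buffer (not the LangChain class)."""

    def __init__(self, memory_key="chat_history", output_key="answer"):
        self.memory_key = memory_key
        self.output_key = output_key
        self.messages = []

    def save_context(self, inputs, outputs):
        # Store the user's question and only the output named by output_key.
        self.messages.append(("human", inputs["question"]))
        self.messages.append(("ai", outputs[self.output_key]))

    def load_memory_variables(self):
        # The full history is exposed under memory_key.
        return {self.memory_key: list(self.messages)}

memory_demo = ToyBufferMemory()
memory_demo.save_context(
    {"question": "Does the little mermaid sing?"},
    {"answer": "Yes, until she gives up her voice.", "source_documents": ["chunk-1"]},
)
print(memory_demo.load_memory_variables())
```

Note that the `source_documents` entry is ignored when saving, which is the job `output_key` performs when a chain returns several outputs.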


In [None]:
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True,
    output_key="answer"
)




## Creating the Conversational Retrieval Chain

This cell sets up the core of the Retrieval-Augmented Generation (RAG) system using LangChain’s `ConversationalRetrievalChain`. This chain combines document retrieval with language model generation in a single, seamless interface.

- `ConversationalRetrievalChain.from_llm(...)`  
  This method initializes a conversation-aware QA system that, on each turn:
  1. Rewrites a follow-up question into a standalone question using the conversation history from memory (on the first turn, the question is used as-is)
  2. Retrieves relevant document chunks from the vector store for that standalone question
  3. Passes the question and the retrieved chunks to the language model to generate a grounded response

#### Parameters:

- `llm=mistral_llm`  
  Specifies the language model to use for generating answers. In this case, it's the previously initialized Mistral model (`mistral-small`).

- `retriever=retriever`  
  Connects the retriever (here, the Chroma-based similarity search created earlier) that fetches relevant document chunks based on the user’s query.

- `memory=memory`  
  Injects the conversation memory created earlier. This allows the chain to handle multi-turn conversations and maintain context from earlier interactions.

- `return_source_documents=True`  
  Ensures that the original document chunks used for generating the answer are returned along with the answer itself. This is useful for transparency or debugging.

- `output_key="answer"`  
  Defines the key under which the generated response will be stored and accessed.

This chain becomes the brain of the chatbot, managing context, performing retrieval, and generating responses.
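
The control flow of one conversational RAG turn can be sketched with plain functions. Everything here is a stand-in chosen for illustration: `answer_with_rag`, `condense`, `retrieve`, `generate`, and `CORPUS` are hypothetical names, keyword matching replaces vector search, and string formatting replaces the LLM calls.

```python
CORPUS = [
    "The sea witch takes the mermaid's voice.",
    "Rapunzel lived in a tower.",
]

def condense(question, chat_history):
    # A real chain asks the LLM to rewrite the follow-up; here we prepend the last user turn.
    last_user_message = chat_history[-1][0]
    return f"{last_user_message} {question}"

def retrieve(query):
    # Keyword match as a stand-in for vector similarity search.
    keywords = [w.strip("?.,!").lower() for w in query.split()]
    keywords = [w for w in keywords if len(w) > 4]
    return [chunk for chunk in CORPUS if any(w in chunk.lower() for w in keywords)]

def generate(question, chunks):
    # A real chain sends the question and the chunks to the LLM.
    return f"Answer to '{question}' using {len(chunks)} chunk(s)."

def answer_with_rag(question, chat_history):
    """Run one turn: condense the question, retrieve chunks, generate an answer."""
    standalone = condense(question, chat_history) if chat_history else question
    chunks = retrieve(standalone)
    return {"question": question, "answer": generate(standalone, chunks), "source_documents": chunks}

history = [("Does the little mermaid sing?", "Yes, until she gives up her voice.")]
print(answer_with_rag("Tell me more about the witch", history))
```

The returned dictionary mirrors the shape used in this notebook: the original question, the generated `answer`, and the `source_documents` that supported it.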


In [None]:
qa_chain = ConversationalRetrievalChain.from_llm(
    llm=mistral_llm,
    retriever=retriever,
    memory=memory,
    return_source_documents=True,
    output_key="answer"
)

## Building the Gradio Chatbot Interface

This cell defines the backend logic for the chatbot’s interactive user interface using Gradio. It handles user messages, determines appropriate responses, and invokes the RAG model to generate answers.


- `respond_to_user(message, history)`  
  This is the main callback function that gets triggered whenever a user sends a message through the Gradio chat interface. It processes the message, interacts with the QA pipeline, and returns the response.

### Key Functional Steps:

1. **Greeting and Exit Handling:**
   - The input message is converted to lowercase and stripped of extra spaces.
   - If the message is a greeting (like “hi”, “hello”, etc.), a friendly welcome message is returned.
   - If the user wants to exit (e.g., “bye”, “goodbye”), a farewell message is shown.
   - These are handled before invoking the QA chain.

2. **Answer Retrieval:**
   - The `qa_chain.invoke({"question": message})` call sends the user's message to the Conversational Retrieval Chain.
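   - The chain returns a dictionary. Because the chain was built with `output_key="answer"`, the generated text is stored under `"answer"`; the lookup of `"result"` is only a defensive fallback for chains that use that key instead.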
   - The code safely retrieves the response using `.get()` to avoid key errors.

3. **Fallback Handling:**
   - If the answer indicates uncertainty (e.g., contains phrases like “don’t know” or “not sure”), a soft, encouraging fallback message is added.
   - This keeps the user experience positive, even if the chatbot cannot find a good answer.

4. **Exception Handling:**
   - If an error occurs during execution, the traceback is printed for debugging, and a formatted error message is returned to the UI.

This function enables dynamic interaction between users and the RAG model, supporting real-time Q&A with conversational memory and fallback safety.
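
The greeting and exit checks can be isolated and tested without calling the language model. The sketch below uses the same phrase lists as the cell that follows; the helper name `route_message` is hypothetical and not part of the notebook.

```python
GREETINGS = {"hi", "hello", "hey", "good morning", "good evening", "what's up", "how are you"}
EXIT_PHRASES = {"bye", "goodbye", "see you later", "exit"}

def route_message(message):
    """Classify a message as 'greeting', 'exit', or 'question' by exact match."""
    normalized = message.lower().strip()
    if normalized in GREETINGS:
        return "greeting"
    if normalized in EXIT_PHRASES:
        return "exit"
    return "question"

for text in ["  Hello ", "BYE", "Hello there", "Who helped Rapunzel escape?"]:
    print(repr(text), "->", route_message(text))
```

Because the comparison is an exact match on the whole message, "Hello there" is treated as a question and forwarded to the QA chain; only a bare greeting or exit phrase triggers the canned replies.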


In [None]:
def respond_to_user(message, history):
    try:
        message_lower = message.lower().strip()
        greetings = ["hi", "hello", "hey", "good morning", "good evening", "what's up", "how are you"]
        exit_phrases = ["bye", "goodbye", "see you later", "exit"]

        if message_lower in greetings:
            return "🧚‍♀️ Hello! Ask me anything about fairy tales and I’ll do my best to help!"
        if message_lower in exit_phrases:
            return "👋 Bye! Have a magical day! 🌟"

        response = qa_chain.invoke({"question": message})
        answer = response.get("answer", "") or response.get("result", "")

        if "don't know" in answer.lower() or "not sure" in answer.lower():
            answer += " 😊 I'm sorry, I don't know the answer to this question. But I'm always learning!"

        return answer + " \n🧙‍♀️Thanks for asking!\n Do you want to ask anything else?"

    except Exception as e:
        import traceback
        print(traceback.format_exc())
        return f"Error:\n{str(e)}"


## Creating the Gradio UI for the Fairy Tale Chatbot

This cell builds a user-friendly web interface using `gradio.Blocks` for interacting with the fairy tale RAG chatbot. It includes a visual header, an optional image, and a conversational chat interface that calls the backend function `respond_to_user()`.


- `with gr.Blocks(theme=gr.themes.Soft()) as demo:`  
  Initializes a Gradio Blocks interface with a clean and modern visual theme. The `Soft` theme provides a light, user-friendly design.

- `gr.Markdown(...)`  
  Adds custom Markdown text to the interface:
  - The first line serves as a title.
  - The second line introduces the chatbot and what users can expect from it.

- `gr.Image("/content/bg.gif", height=278, width=500)`  
  Displays a background or thematic image (e.g., a fairy-tale themed GIF). This makes the UI visually appealing and engaging for users.

- `gr.ChatInterface(...)`  
  Builds the core chat module using Gradio’s high-level chat wrapper. Key attributes include:
  - `fn=respond_to_user`: Connects the user’s message input to the chatbot response function.
  - `title`: Sets the chat window’s title.
  - `description`: Provides instructions or context for the user.
  - `examples`: Offers predefined example questions users can click on to get started.
  - `type="messages"`: Uses Gradio's message format, in which the chat history is a list of dictionaries with `role` and `content` keys.

- `demo.launch(share=True, inline=False)`  
  Launches the Gradio app.
  - `share=True` generates a public link so the chatbot can be accessed and tested outside of the notebook.
  - `inline=False` keeps the interface from being embedded in the notebook output; open the printed URL in a browser tab to use it.

This interface makes it easy for users to have natural conversations with the fairy tale chatbot without needing to interact with raw code or command-line prompts.

In [None]:
with gr.Blocks(theme=gr.themes.Soft()) as demo:
    gr.Markdown("## 🧚‍♀️ Welcome to Your Magical Fairy Tale Chatbot!")
    gr.Markdown("Talk to classic fairy tales like never before ✨ Ask about plots, characters, morals, and more.")
    gr.Image("/content/bg.gif", height=278, width=500)
    gr.ChatInterface(
        fn=respond_to_user,
        title="🧚 Fairy Tale RAG Chatbot",
        description="Ask anything about your favourite fairy tales!",
        examples=["Does the little mermaid sing?", "Who helped Rapunzel escape?"],
        type="messages"
    )
demo.launch(share=True, inline=False)


Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://f79d989e09cc498a5a.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)




In [None]:
response = qa_chain.invoke({"question": 'Tell me about the little mermaid'})

In [None]:
response

{'question': 'Tell me about the little mermaid',
 'chat_history': [HumanMessage(content='Does the little mermaid sing?', additional_kwargs={}, response_metadata={}),
  AIMessage(content='Yes, the little mermaid is described as having a lovely and sweet voice. However, she gives up her voice to the sea witch in order to obtain legs and be with the human prince she loves.', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='tell me more about the witch', additional_kwargs={}, response_metadata={}),
  AIMessage(content='In "The Little Mermaid" story by Hans Christian Andersen, the sea witch is a significant character who plays a crucial role in the mermaid\'s journey. She is an outcast from the rest of the sea kingdom, living in a dark and gloomy region filled with polyps. The sea witch is known for her power and knowledge of magic, which she uses to help the mermaid in exchange for something valuable.\n\nThe mermaid visits the sea witch to request legs so she can be wit