# 🤖**RAG Chatbot**

In [None]:
# ========================================
# Imports: Core LangChain and Dependencies
# ========================================

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_community.document_loaders import UnstructuredHTMLLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_groq import ChatGroq
from langchain_community.vectorstores import Chroma  # community package; the langchain.vectorstores path is deprecated
from langchain.prompts import PromptTemplate

import os
from dotenv import load_dotenv

# ========================
# Environment Preparation
# ========================

load_dotenv()  # Load environment variables from a .env file

# =============================
# Step 1: Load HTML Document(s)
# =============================

loader = UnstructuredHTMLLoader(file_path="data/mg-zs-warning-messages.html")
car_docs = loader.load()

# Optional: Preview first loaded document
# print(car_docs[0])

# =============================
# Step 2: Split HTML into Chunks
# =============================

chunk_size = 300
chunk_overlap = 100

splitter = RecursiveCharacterTextSplitter(
    chunk_size=chunk_size,
    chunk_overlap=chunk_overlap,
    separators=["\n\n", "\n", " ", ""]
)

docs = splitter.split_documents(car_docs)

# Optional: Preview first chunk
print(docs[0])

# =============================
# Step 3: Generate Embeddings
# =============================

embedding_function = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

# Store embeddings in a persistent Chroma vector store
vectorstore = Chroma.from_documents(
    docs,
    embedding=embedding_function,
    persist_directory=os.getcwd()
)

# ===============================
# Step 4: Configure the Retriever
# ===============================

retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 3}
)

# ==========================
# Step 5: Define the Prompt
# ==========================

prompt_template = PromptTemplate(
    input_variables=["question", "context"],
    template="""
    You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question.
    If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.

    Question: {question}
    Context: {context}
    Answer:"""
)

# ==========================
# Step 6: Initialize the LLM
# ==========================

llm = ChatGroq(
    model="llama-3.1-8b-instant",
    temperature=0.7,
    max_tokens=100
)

# =====================================
# Step 7: Create the Retrieval QA Chain
# =====================================

rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt_template
    | llm
)

# ======================================
# Step 8: Query the System with a Prompt
# ======================================

question = "The Gasoline Particular Filter Full warning has appeared. What does this mean and what should I do about it?"
response = rag_chain.invoke(question)

# Print the final answer
print(response.content)



# ⭕**Components Used in the Code**

## 🔁 `RunnablePassthrough` (from `langchain_core.runnables`)

**Purpose:**
This is a special component in LangChain that simply **forwards its input as-is** to the next step in the chain. It acts like a placeholder that does nothing except pass the value along.

**Syntax:**

```python
from langchain_core.runnables import RunnablePassthrough

runnable = RunnablePassthrough()
output = runnable.invoke("input text")  # returns "input text"
```

**Why it's used:**
In the RAG pipeline, the user's question must reach the prompt template unchanged, alongside the retrieved context. When a dictionary such as `{"context": ..., "question": ...}` starts a chain, LangChain runs every value on the same input and collects the results under the matching keys. The retriever turns the question into relevant chunks, while `RunnablePassthrough()` returns the question itself, so both `{context}` and `{question}` can be filled in downstream.
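
The fan-out behavior described above can be imitated in plain Python. This is only a minimal sketch of the idea, not LangChain's implementation: `passthrough`, `fake_retriever`, and `fan_out` are hypothetical stand-ins invented for illustration.

```python
from typing import Any, Callable, Dict, List


def passthrough(value: Any) -> Any:
    """Stand-in for RunnablePassthrough: return the input unchanged."""
    return value


def fake_retriever(query: str) -> List[str]:
    """Hypothetical retriever stub; a real one would query the vector store."""
    return [f"(chunk matching: {query})"]


def fan_out(branches: Dict[str, Callable[[Any], Any]], value: Any) -> Dict[str, Any]:
    """Call every branch with the same input and collect the results by key."""
    return {key: branch(value) for key, branch in branches.items()}


inputs = fan_out(
    {"context": fake_retriever, "question": passthrough},
    "What does this warning mean?",
)
print(inputs)
```

The resulting dictionary has the same shape the prompt template expects: one `context` entry produced from the question and one `question` entry carrying the question itself.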

---

## 📞 `.invoke()` Method

**Purpose:**
The `invoke()` method is how you **execute a LangChain chain** with a given input and get the final result. It’s part of the `Runnable` interface in LangChain.

**Syntax:**

```python
response = chain.invoke(input_data)
```

**Example:**

```python
response = rag_chain.invoke("What is the capital of France?")
print(response.content)
```

**Why it's used:**
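`.invoke()` runs the entire RAG pipeline (retrieval → prompt construction → LLM response) and returns the result. Because the last step is a chat model, that result is a message object whose text lives in `.content`, which is why the examples print `response.content`.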

---

## 🧩 `PromptTemplate` (from `langchain.prompts`)

**Purpose:**
`PromptTemplate` helps you **define and format a prompt** dynamically by inserting variables (like user input or context) into a fixed template.

**Syntax:**

```python
from langchain.prompts import PromptTemplate

prompt = PromptTemplate(
    input_variables=["question", "context"],
    template="""
    Use the context below to answer the question:
    Context: {context}
    Question: {question}
    Answer:"""
)
formatted_prompt = prompt.format(question="What is AI?", context="AI stands for Artificial Intelligence...")
```

**Why it's used:**
It provides clean separation between your prompt logic and runtime inputs. This allows dynamic insertion of question and retrieved context.

---

## 🔍 `as_retriever()` (from `Chroma` Vector Store)

**Purpose:**
The `as_retriever()` method transforms a vector store (like Chroma) into a **retriever object**, which can fetch similar documents given a user query.

**Syntax:**

```python
retriever = vectorstore.as_retriever(
    search_type="similarity",       # or 'mmr'
    search_kwargs={"k": 3}          # fetch top 3 matches
)
```

**Why it's used:**
The RAG system needs a way to fetch relevant context based on user queries. This method provides that interface.

---

## 🧠 `ChatGroq` (LLM Wrapper)

**Purpose:**
`ChatGroq` is an LLM wrapper that lets you interface with Groq-hosted large language models (e.g., LLaMA 3, Mixtral) through LangChain.

**Syntax:**

```python
from langchain_groq import ChatGroq

llm = ChatGroq(
    model="llama-3.1-8b-instant",
    temperature=0.7,
    max_tokens=100
)
```

**Why it's used:**
This is the final step in the pipeline: once the prompt is filled in, the LLM generates the answer. `temperature` controls how random the output is, and `max_tokens` caps the length of the response.
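
Both the LLM and the embedding model need API keys, which the notebook loads with `load_dotenv()`. The sketch below checks for them before any model is constructed. It assumes the default variable names `GROQ_API_KEY` and `GOOGLE_API_KEY`; adjust `REQUIRED_KEYS` if your setup uses different names. `missing_keys` is a hypothetical helper, not part of any library.

```python
import os
from typing import List, Mapping, Sequence

# Assumed default environment variable names for the two providers.
REQUIRED_KEYS = ("GROQ_API_KEY", "GOOGLE_API_KEY")


def missing_keys(env: Mapping[str, str], required: Sequence[str] = REQUIRED_KEYS) -> List[str]:
    """Return the names of required variables that are absent or empty."""
    return [name for name in required if not env.get(name)]


missing = missing_keys(os.environ)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required API keys are set.")
```

Running a check like this right after `load_dotenv()` tends to give a clearer message than an authentication error raised midway through the pipeline.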

---

## 🧱 `RecursiveCharacterTextSplitter`

**Purpose:**
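Splits long documents into smaller chunks by trying the given separators in order (paragraph breaks, then line breaks, then spaces, then individual characters) until each piece fits within `chunk_size`.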

**Syntax:**

```python
splitter = RecursiveCharacterTextSplitter(
    chunk_size=300,
    chunk_overlap=100,
    separators=["\n\n", "\n", " ", ""]
)
chunks = splitter.split_documents(docs)
```

**Why it's used:**
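LLMs have context length limits, and retrieval works best on small, focused passages. Splitting keeps each chunk within `chunk_size` characters, while `chunk_overlap` repeats some text between neighboring chunks so that content near a boundary is less likely to lose its surrounding context.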
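
To see what `chunk_size` and `chunk_overlap` mean in isolation, here is a deliberately simplified fixed-width sliding window. It is not the recursive, separator-aware algorithm the splitter uses; `sliding_window_chunks` is a hypothetical helper that only illustrates how overlap makes neighboring chunks share text.

```python
from typing import List


def sliding_window_chunks(text: str, chunk_size: int = 300, chunk_overlap: int = 100) -> List[str]:
    """Cut text into fixed-width windows that overlap by chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


sample = "".join(str(i % 10) for i in range(700))  # 700 characters of dummy text
chunks = sliding_window_chunks(sample)
print([len(chunk) for chunk in chunks])
print(chunks[0][-100:] == chunks[1][:100])  # tail of one chunk equals head of the next
```

With the notebook's settings, each window advances by 200 characters, so every chunk repeats the last 100 characters of the one before it.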

---

## 📚 `GoogleGenerativeAIEmbeddings`

**Purpose:**
This creates numerical representations (vectors) of text using Google’s `embedding-001` model. These vectors can be stored in a vector database for similarity search.

**Syntax:**

```python
from langchain_google_genai import GoogleGenerativeAIEmbeddings

embedding_function = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
```

**Why it's used:**
Vector embeddings allow semantic search: you can find similar content even if the wording is different.
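
The idea of "similar vectors mean similar text" can be sketched with toy numbers. The example below ranks three made-up three-dimensional vectors by cosine similarity, which is one common choice; the metric the vector store actually uses depends on its configuration. The vectors, labels, and the `top_k` helper are all invented for illustration.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: Sequence[float], index: Dict[str, Sequence[float]], k: int = 3) -> List[str]:
    """Return the k labels whose vectors are most similar to the query vector."""
    ranked = sorted(index, key=lambda name: cosine_similarity(query_vec, index[name]), reverse=True)
    return ranked[:k]


# Toy "embeddings"; real ones have hundreds of dimensions.
toy_index = {
    "filter warning": [0.9, 0.1, 0.0],
    "tyre pressure": [0.1, 0.9, 0.0],
    "engine coolant": [0.0, 0.2, 0.9],
}
print(top_k([0.8, 0.2, 0.1], toy_index, k=2))
```

A retriever with `k=3` does conceptually the same thing: embed the query, score it against the stored vectors, and return the best-scoring chunks.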

---

## 🧠 `Chroma.from_documents`

**Purpose:**
Indexes your documents by converting them into vectors and storing them persistently for fast retrieval.

**Syntax:**

```python
vectorstore = Chroma.from_documents(
    docs,  # List of Document objects
    embedding=embedding_function,
    persist_directory=os.getcwd()  # optional: store in current directory
)
```

**Why it's used:**
It builds the backend database that powers the semantic retrieval step.

---

## 🔗 RAG Chain Composition

**This part:**

```python
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt_template
    | llm
)
```

**Explanation:**
This composes a chain of operations in LangChain using the pipe (`|`) operator:

1. `{"context": retriever, "question": RunnablePassthrough()}`:
   Combines two inputs: retrieved context + user question

2. `| prompt_template`:
   Fills the prompt template with the context and question

3. `| llm`:
   Sends the final prompt to the LLM and gets the response
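
One detail worth noting about step 1: the retriever returns a list of document objects, so the `{context}` slot receives that list's string form, metadata included. A small formatting helper can join just the text instead. The sketch below is an assumption about how one might do this; `Doc` is a hypothetical stand-in for LangChain's document class (only `page_content` is used), and `format_docs` is not part of the original notebook.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document (only page_content is used)."""
    page_content: str


def format_docs(docs: Iterable[Doc]) -> str:
    """Join the text of each retrieved chunk, separated by blank lines."""
    return "\n\n".join(doc.page_content.strip() for doc in docs)


retrieved = [Doc("Warning A: stop the car. "), Doc("Warning B: check oil level.")]
print(format_docs(retrieved))
```

In the chain, a helper like this could be piped after the retriever, for example `{"context": retriever | format_docs, "question": RunnablePassthrough()}`, relying on LangChain's coercion of plain functions into runnables.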
