<a href="https://colab.research.google.com/github/RAHUL2002-k/kerala-ayurveda-chatbot/blob/main/kerala_ayurveda_chatbot_assignment.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Kerala Ayurveda – Internal Q&A + Agentic Article Workflow (RAG using LangChain + ChatGroq)

This Colab notebook implements the assignment for an internal Kerala Ayurveda assistant using only the provided content pack.

**Part A – Q&A RAG System**
- Loads internal Ayurveda content (foundations, dosha guide, FAQs, products, treatments).
- Chunks documents (with special handling for FAQs) and stores them in a FAISS vector store using HuggingFace embeddings.
- Uses a similarity-based retriever to fetch the top-k chunks.
- Calls a ChatGroq LLM to answer questions **only from the retrieved context**.
- Returns both:
  - A natural-language answer.
  - Structured citations `{ doc_id, section_id }` for each retrieved chunk.

**Part B – Agentic Workflow with LangGraph**
- Sketches a small agentic pipeline (the agent bodies are placeholders in this notebook):
  1. Outline Agent – creates a sectioned article outline from a brief.
  2. Writer Agent – generates a draft article using RAG and adds citation markers.
  3. Fact-Checker Agent – checks grounding against the corpus and computes a grounding score.
  4. Tone Editor Agent – applies style/tone rules based on the internal style guide.
- Uses LangGraph to wire these agents into a simple step graph.

**Tech Stack**
- Python, Google Colab
- `langchain`, `langchain-community`, `langchain-text-splitters`
- `langchain-groq` for ChatGroq LLM calls
- `langchain-huggingface` + `faiss-cpu` for vector embeddings and retrieval
- `langgraph` for the agentic workflow (Part B)

**How to run**
1. Run the dependency installation cell.
2. Set your `GROQ_API_KEY` in Colab secrets (`google.colab.userdata`).
3. Run the data loading and chunking cells.
4. Run the FAISS / retriever setup cell.
5. Run the Q&A helper (`answer_user_query`) and the interactive loop cell to chat.
6. Run the LangGraph cells to build and execute the agentic article workflow.


## 1. Install dependencies

Install all required libraries for this notebook:

- `langchain`, `langchain-community` and `langchain-text-splitters` for RAG building blocks
- `langchain-groq` to call the ChatGroq LLM
- `langchain-huggingface` and `faiss-cpu` for embeddings + vector store (FAISS)
- `langgraph` for the agentic workflow (Part B)

> Run this cell once at the start of the Colab session.


In [11]:
!pip install -qU \
  langchain \
  langchain-community \
  langchain-text-splitters \
  langchain-groq \
  langchain-huggingface \
  faiss-cpu \
  langgraph

## LLM & Core Library Initialization

In this section, we:
- Import all core LangChain, vector database, and embedding components
- Configure the Large Language Model (LLM) using **ChatGroq**
- Prepare the environment for both:
  - Part A: RAG-based internal Q&A
  - Part B: Agentic workflows using LangGraph

The Groq API key is securely loaded from Colab secrets using `userdata`
to avoid hardcoding sensitive credentials.

The chosen LLM configuration prioritizes:
- **Low temperature** for factual, grounded responses
- **Moderate max tokens** to keep answers concise and cost-efficient
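
The key lookup can be made to work outside Colab as well. The following is a minimal sketch, assuming a fallback to an ordinary environment variable is acceptable; the helper name `load_groq_key` is hypothetical and not part of the notebook.

```python
import os


def load_groq_key(secret_name: str = "GROQ_API_KEY") -> str:
    """Return the API key from Colab secrets, else from the environment."""
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import userdata

        key = userdata.get(secret_name)
    except Exception:
        # Not running in Colab, or the secret is missing / access not granted.
        key = None

    # Fall back to a plain environment variable (assumption of this sketch).
    key = key or os.environ.get(secret_name)
    if not key:
        raise RuntimeError(
            f"{secret_name} is not set in Colab secrets or the environment"
        )
    return key
```

With this helper, the setup line would become `os.environ["GROQ_API_KEY"] = load_groq_key()`, and a missing key fails early with a readable error instead of at the first LLM call.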


In [4]:
import os
from langchain_community.document_loaders import TextLoader, CSVLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_groq import ChatGroq
from langchain_core.documents import Document
from google.colab import userdata
os.environ["GROQ_API_KEY"] = userdata.get("GROQ_API_KEY")  # pass the secret's name, not the key itself
llm = ChatGroq(
    model="llama-3.3-70b-versatile",  # or another Groq-supported model you prefer
    temperature=0.1,
    max_tokens=512,
)

## Loading the Internal Ayurveda Corpus

This section loads the complete internal Ayurveda knowledge corpus that the
system is allowed to use.

The corpus includes:
- Foundational Ayurveda concepts
- Dosha-specific guidance (Vata, Pitta, Kapha)
- Patient-facing FAQs
- Internal product notes (Ashwagandha, Brahmi Tailam, Triphala)
- Internal treatment documentation (Stress Support Program)
- A structured product catalog (CSV)

Each document is loaded with explicit metadata such as:
- **doc_id**: a stable identifier for citation and traceability
- **doc_type**: a coarse classification taken from the `doc_id` prefix (for example `faq`, `product`, `treatment`)

This metadata is later used for:
- Accurate citations in RAG outputs
- Filtering or prioritizing results in retrieval
- Debugging and evaluation during development

The system strictly relies on this corpus and does not use any external
Ayurveda knowledge.
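
The loading cell derives `doc_type` from the prefix of `doc_id`, which yields values such as `foundations` or `dosha` for some files. If a fixed set of types is preferred, an explicit lookup table is one option. This is a sketch only: `DOC_TYPE_BY_ID` and `coarse_doc_type` are hypothetical names, and grouping the foundations, style and dosha documents under `guide` is an assumption.

```python
# Hypothetical mapping from doc_id to a fixed, coarse doc_type.
DOC_TYPE_BY_ID = {
    "foundations": "guide",
    "style_guide": "guide",
    "dosha_guide": "guide",
    "faq": "faq",
    "product_ashwagandha": "product",
    "product_brahmi_tailam": "product",
    "product_triphala": "product",
    "treatment_stress": "treatment",
    "products_catalog": "product_row",
}


def coarse_doc_type(doc_id: str) -> str:
    """Look up the doc_type for a doc_id, failing loudly on unknown ids."""
    try:
        return DOC_TYPE_BY_ID[doc_id]
    except KeyError:
        raise ValueError(f"No doc_type registered for doc_id={doc_id!r}") from None


print(coarse_doc_type("dosha_guide"))
print(coarse_doc_type("product_triphala"))
```

An explicit table makes metadata filters predictable, at the cost of one extra entry whenever a file is added to the corpus.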


In [5]:
base_path = "/content/"

file_specs = [
    ("ayurveda_foundations.md", "foundations"),
    ("content_style_and_tone_guide.md", "style_guide"),
    ("dosha_guide_vata_pitta_kapha.md", "dosha_guide"),
    ("faq_general_ayurveda_patients.md", "faq"),
    ("product_ashwagandha_tablets_internal.md", "product_ashwagandha"),
    ("product_brahmi_tailam_internal.md", "product_brahmi_tailam"),
    ("product_triphala_capsules_internal.md", "product_triphala"),
    ("treatment_stress_support_program.md", "treatment_stress"),
]

docs: list[Document] = []

# Loading text files
for filename, doc_id in file_specs:
    loader = TextLoader(os.path.join(base_path, filename), encoding="utf-8")
    loaded = loader.load()
    # attaching metadata
    for d in loaded:
        d.metadata["doc_id"] = doc_id
        d.metadata["doc_type"] = doc_id.split("_", 1)[0] # rough type
    docs.extend(loaded)

# Load the products CSV (CSVLoader yields one Document per row)
csv_path = os.path.join(base_path, "products_catalog.csv")
if os.path.exists(csv_path):
    csv_loader = CSVLoader(
        file_path=csv_path,
        encoding="utf-8",
        csv_args={"delimiter": ","},
    )
    csv_docs = csv_loader.load()
    for d in csv_docs:
        d.metadata["doc_id"] = "products_catalog"
        d.metadata["doc_type"] = "product_row"
    docs.extend(csv_docs)



## Document Chunking Strategy

In this step, the loaded corpus is split into smaller, semantically meaningful
chunks suitable for retrieval.

### Chunking approach

- A **RecursiveCharacterTextSplitter** is used for long-form documents
  such as guides, treatments, and product notes.
- The splitter prioritizes:
  - Section boundaries (`##`)
  - Paragraph breaks
  - Line breaks
  - Whitespace (as a last fallback)

**Configuration:**
- Chunk size: ~800 characters  
- Chunk overlap: ~120 characters  

This balances:
- Preserving enough context for accurate answers
- Avoiding overly large chunks that reduce retrieval precision
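
To make the size/overlap trade-off concrete, here is a simplified sliding-window chunker. It is not what `RecursiveCharacterTextSplitter` does (that splitter also respects the separator hierarchy); `sliding_chunks` is a hypothetical helper that only illustrates how an 800-character window with a 120-character overlap walks through a text.

```python
def sliding_chunks(text: str, chunk_size: int = 800, overlap: int = 120) -> list[str]:
    """Split text into fixed-size windows that share `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


# Toy input: 2000 characters cycling through the alphabet.
sample = "".join(chr(97 + i % 26) for i in range(2000))
parts = sliding_chunks(sample)
print(len(parts), [len(p) for p in parts])
```

Each window starts 680 characters after the previous one, so the last 120 characters of a chunk reappear at the start of the next. That repetition is what lets a sentence cut at a boundary still be retrieved with some of its context.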


In [6]:
splitter_long = RecursiveCharacterTextSplitter(
    chunk_size=800,
    chunk_overlap=120,
    separators=["\n##", "\n\n", "\n", " "],
)

split_docs: list[Document] = []

for d in docs:
    # special handling for FAQ: ensure one Q&A per chunk if possible
    if d.metadata.get("doc_id") == "faq":
        # naive split by "Q" markers; adjust to actual format if needed
        chunks = d.page_content.split("\n\nQ")
        for idx, chunk in enumerate(chunks):
            text = ("Q" + chunk) if idx > 0 else chunk
            split_docs.append(
                Document(
                    page_content=text.strip(),
                    metadata={
                        **d.metadata,
                        "section_id": f"qa_{idx}",
                    },
                )
            )
    else:
        sub_docs = splitter_long.split_documents([d])
        for idx, sd in enumerate(sub_docs):
            # keeping original metadata and adding section_id
            sd.metadata["section_id"] = sd.metadata.get("section_id", f"sec_{idx}")
            split_docs.append(sd)

print(f"Total chunks: {len(split_docs)}")

Total chunks: 35


## Embeddings & Vector Store Construction

This section converts the processed document chunks into vector embeddings
and stores them in a vector database for semantic retrieval.

### Embeddings

- We use **HuggingFace sentence embeddings**
  (`sentence-transformers/all-MiniLM-L6-v2`)
- This model provides:
  - Strong performance on short-to-medium length text
  - Good semantic similarity for question–answer tasks
  - Low latency and reasonable memory usage for Colab

These properties make it well-suited for an internal RAG system where:
- The corpus size is small (35 chunks in this run)
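
Semantic retrieval amounts to ranking chunk vectors by their similarity to the query vector and keeping the top `k`. The sketch below shows the idea with cosine similarity on toy 2-D vectors. The vectors and chunk ids are made up, `top_k` is a hypothetical helper, and the metric used by the FAISS index in this notebook may differ (for example L2 distance).

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec: list[float], chunk_vecs: dict[str, list[float]], k: int = 8) -> list[str]:
    """Return the ids of the k chunks most similar to the query."""
    scored = [(cosine(query_vec, vec), chunk_id) for chunk_id, vec in chunk_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk_id for _, chunk_id in scored[:k]]


# Toy embeddings (made up for illustration only).
toy_chunks = {
    "a": [1.0, 0.0],   # same direction as the query
    "b": [1.0, 1.0],   # 45 degrees away
    "c": [0.0, 1.0],   # orthogonal
    "d": [-1.0, 0.0],  # opposite direction
}
print(top_k([1.0, 0.0], toy_chunks, k=2))
```

A larger `k` gives the LLM more context to draw on but also more loosely related chunks, which is relevant when every retrieved chunk is reported as a citation.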


In [7]:
# Embeddings and storing in faiss
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)

vector_store = FAISS.from_documents(split_docs, embeddings)

retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 8},
)



## 6. Define system prompt and `answer_user_query` function

This cell defines the core Q&A behavior:

1. **SYSTEM_PROMPT**
   - Instructs the LLM to:
     - Act as an internal Kerala Ayurveda assistant.
     - Use **only** the provided context.
     - Say “I don't know” when the answer is not in the corpus.
   - Enforces safety and tone:
     - Clear, warm, reassuring.
     - No absolute guarantees.
     - Emphasize consulting qualified Ayurveda physicians when needed.
   - Asks the model to reference context blocks with `[1]`, `[2]`, etc.

2. **`build_context_and_citation_map(docs)`**
   - Takes retrieved chunks and:
     - Builds a numbered context string like:
       - `[1] doc_id=product_ashwagandha, section_id=sec_0`
     - Keeps a mapping of:
       - `index` → `doc_id`, `section_id`
   - This is used both in the prompt and in the final citations.

3. **`answer_user_query(query: str)`**
   - Full RAG pipeline:
     - Uses `retriever.invoke(query)` to fetch relevant chunks.
     - Builds context + citation map.
     - Formats the system prompt with `context` and `question`.
     - Calls the Groq LLM (`llm.invoke(prompt)`).
   - Returns:
     - `"answer"`: model’s natural language answer.
     - `"citations"`: list of `{doc_id, section_id}` for transparency and debugging.

This is the function you would expose to the internal Q&A front-end or API.
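
Because the citation list covers every retrieved chunk, it can include chunks the answer never references. One possible refinement is to keep only the entries whose `[n]` marker appears in the answer text. The following is a sketch under that assumption; `cited_indices` and `filter_citations` are hypothetical helpers, not functions defined in this notebook.

```python
import re


def cited_indices(answer: str) -> list[int]:
    """Return the distinct [n] markers found in the answer, in order of first use."""
    seen: list[int] = []
    for match in re.findall(r"\[(\d+)\]", answer):
        n = int(match)
        if n not in seen:
            seen.append(n)
    return seen


def filter_citations(answer: str, citation_map: list[dict]) -> list[dict]:
    """Keep only citations whose index is actually referenced in the answer."""
    by_index = {c["index"]: c for c in citation_map}
    return [
        {"doc_id": by_index[i]["doc_id"], "section_id": by_index[i]["section_id"]}
        for i in cited_indices(answer)
        if i in by_index  # ignore markers the model invented
    ]


demo_map = [
    {"index": 1, "doc_id": "product_ashwagandha", "section_id": "sec_0"},
    {"index": 2, "doc_id": "product_ashwagandha", "section_id": "sec_2"},
    {"index": 3, "doc_id": "products_catalog", "section_id": "sec_0"},
]
demo_answer = "Supports calm [1]. Consult a physician first [2]. It may aid sleep [1]."
print(filter_citations(demo_answer, demo_map))
```

This trusts the model to place markers correctly, so it narrows the list but does not verify that the cited chunk supports the sentence.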


In [8]:
from textwrap import dedent

SYSTEM_PROMPT = dedent("""
You are an internal Q&A assistant for Kerala Ayurveda.
Use ONLY the provided context to answer questions.
If the answer is not in the context, say you don't know and suggest consulting a qualified Ayurveda physician.

Tone guidelines (from internal style guide):
- Be clear, warm, and reassuring.
- Avoid absolute claims or guarantees.
- Emphasize safety and personalized consultation when appropriate.

When you use a fact from a specific context block, reference it with [1], [2], etc.,
where the number refers to the block index.

Context:
{context}

Question: {question}

Answer (with inline citation markers like [1], [2] where relevant):
""").strip()


def build_context_and_citation_map(docs: list[Document]):
    """Build context string with numbered blocks and return citation metadata."""
    blocks = []
    citations = []
    for i, d in enumerate(docs, start=1):
        doc_id = d.metadata.get("doc_id", "unknown_doc")
        section_id = d.metadata.get("section_id", str(i))
        header = f"[{i}] doc_id={doc_id}, section_id={section_id}"
        blocks.append(f"{header}\n{d.page_content}\n")
        citations.append(
            {
                "index": i,
                "doc_id": doc_id,
                "section_id": section_id,
            }
        )
    context = "\n\n".join(blocks)
    return context, citations


def answer_user_query(query: str) -> dict:
    """
    Main function expected by the assignment.
    Returns:
    {
        "answer": str,
        "citations": [ { "doc_id": str, "section_id": str } ]
    }
    """
    # 1. Retrieving relevant chunks
    retrieved_docs = retriever.invoke(query)

    # 2. Building context + citation map
    context, citation_map = build_context_and_citation_map(retrieved_docs)

    # 3. Build the prompt
    prompt = SYSTEM_PROMPT.format(context=context, question=query)

    # 4. Calling LLM (ChatGroq)
    llm_response = llm.invoke(prompt)
    if hasattr(llm_response, "content"):
        answer_text = llm_response.content
    else:
        answer_text = str(llm_response)

    # 5. Return every retrieved chunk as a citation, even if the answer
    #    does not reference its [n] marker.

    citations = [
        {"doc_id": c["doc_id"], "section_id": c["section_id"]}
        for c in citation_map
    ]

    return {
        "answer": answer_text,
        "citations": citations,
    }


## Interactive Q&A Loop (Manual Testing)

This cell provides a simple interactive interface for manually testing
the RAG-based internal Q&A system.

### What this does

- Prompts the user to enter natural-language questions.
- For each question:
  1. Calls `answer_user_query(query)`.
  2. Prints the generated answer.
  3. Prints the associated citations:
     - `doc_id` and `section_id` for each retrieved context chunk.
- Continues running until the user types **`exit`**.

### Why this is useful

- Allows quick, iterative testing of retrieval quality and answer grounding.
- Makes it easy to inspect:
  - Whether answers are actually backed by internal documents.
  - Which parts of the corpus the model relied on.
- Serves as a lightweight stand-in for an internal UI or API.

In a production setting, this loop could be replaced by:
- A simple web UI (e.g., Gradio or Streamlit), or
- An internal API endpoint exposed to other teams.
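
For repeatable checks, the same call can be driven from a fixed list of questions instead of `input()`. The sketch below assumes only that the answer function returns a dict with `"answer"` and `"citations"` keys. `run_batch` and `fake_answer_fn` are hypothetical; the stub stands in for `answer_user_query` so the example runs without the retriever or the LLM.

```python
from typing import Callable


def run_batch(questions: list[str], answer_fn: Callable[[str], dict]) -> list[dict]:
    """Call answer_fn on each question and collect the answer and citation count."""
    rows = []
    for question in questions:
        result = answer_fn(question)
        rows.append(
            {
                "question": question,
                "answer": result["answer"],
                "n_citations": len(result["citations"]),
            }
        )
    return rows


def fake_answer_fn(query: str) -> dict:
    """Stand-in for answer_user_query (stub data, not real model output)."""
    return {
        "answer": f"(stub) {query}",
        "citations": [{"doc_id": "faq", "section_id": "qa_1"}],
    }


rows = run_batch(["What is Vata?", "Is Triphala safe daily?"], fake_answer_fn)
for row in rows:
    print(row["question"], "->", row["n_citations"], "citation(s)")
```

In the notebook, passing `answer_user_query` as `answer_fn` would exercise the real pipeline on the same question list after each change to chunking or retrieval settings.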


In [16]:
while True:
    q = input("Ask a question about Kerala Ayurveda (or 'exit'): ")
    if q.lower() == "exit":
        break
    res = answer_user_query(q)
    print("\nANSWER:\n", res["answer"])
    print("\nCITATIONS:", res["citations"])
    print("\n" + "="*60 + "\n")


Ask a question about Kerala Ayurveda (or 'exit'): What are the key benefits of Ashwagandha tablet

ANSWER:
 The key benefits of Ashwagandha tablets, as traditionally used in Ayurveda, include supporting the body's ability to adapt to stress, promoting calmness and emotional balance, supporting strength and stamina, and helping maintain restful sleep [1]. Our Ashwagandha Stress Balance Tablets are designed to provide daily support for stress resilience and restful sleep, inspired by the classical adaptogenic herb Ashwagandha [1]. It's essential to note that individual results may vary, and if you have any underlying medical conditions or concerns, it's best to consult with a qualified healthcare provider before use [2].

CITATIONS: [{'doc_id': 'product_ashwagandha', 'section_id': 'sec_0'}, {'doc_id': 'product_ashwagandha', 'section_id': 'sec_2'}, {'doc_id': 'products_catalog', 'section_id': 'sec_0'}, {'doc_id': 'product_triphala', 'section_id': 'sec_0'}, {'doc_id': 'foundations', 'secti

KeyboardInterrupt: Interrupted by user

## 8. LangGraph agentic workflow for Ayurveda article generation (Part B)

This cell sketches the **agentic pipeline** described in the assignment using LangGraph:

### Shared state (`ArticleState`)
A typed dictionary shared across agents, containing:
- `brief`: the input article brief (topic, angle, target reader).
- `outline`: structured outline generated by the Outline Agent.
- `draft`: full article draft generated by the Writer Agent.
- `citations`: list of `{ doc_id, section_id }` used in the draft.
- `fact_check_report`: grounding / fact-check result.
- `tone_issues`: any style/tone problems detected.
- `final_draft`: post-edited draft ready for human review.
- `status`: simple status flag for debugging.

### Agents (nodes)
- **`outline_agent`**
  - Input: `brief`.
  - Output: `outline` with sections + descriptions, sets `status = "drafting"`.
- **`writer_agent`**
  - For each outline section:
    - (Conceptually) calls the RAG Q&A tools to fetch context.
    - Uses LLM to generate section content with citation markers.
  - Combines sections into `draft` and collects `citations`.
  - Sets `status = "fact_checking"`.
- **`fact_checker_agent`**
  - (Conceptually) re-runs RAG to validate claims in the draft.
  - Fills `fact_check_report`, including an `overall_grounding_score`.
  - Sets `status = "editing"`.
- **`tone_editor_agent`**
  - Applies style/tone adjustments based on the internal style guide.
  - Writes `final_draft` and `tone_issues`, sets `status = "done"`.
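
The notebook does not specify how `overall_grounding_score` is computed. As one crude possibility, the score could be the share of sentences that carry at least one valid citation marker. This measures citation coverage only, not whether the cited chunk supports the claim. `citation_coverage` is a hypothetical helper and the sentence splitting is deliberately naive.

```python
import re


def citation_coverage(draft: str, n_citations: int) -> dict:
    """Score a draft by the fraction of sentences with a valid [n] marker."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", draft) if s.strip()]
    missing = []
    for sentence in sentences:
        markers = [int(m) for m in re.findall(r"\[(\d+)\]", sentence)]
        # A marker is valid only if it points at an existing citation index.
        if not any(1 <= m <= n_citations for m in markers):
            missing.append(sentence)
    total = len(sentences)
    score = (total - len(missing)) / total if total else 0.0
    return {"missing_citations": missing, "overall_grounding_score": score}


demo_draft = (
    "Vata governs movement [1]. It is always cured quickly. "
    "Warm meals may help [2]."
)
report = citation_coverage(demo_draft, n_citations=2)
print(report)
```

A stricter checker would re-run retrieval for each sentence and compare the claim with the retrieved text, which is what the `fact_checker_agent` description points toward.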

### Graph wiring
- Entry node: `outline_agent`.
- Linear path: `outline_agent → writer_agent → fact_checker_agent`.
- Conditional edge after fact checking:
  - If `overall_grounding_score < 0.75` → loop back to `writer_agent` for revision.
  - Else → proceed to `tone_editor_agent`.
- `tone_editor_agent → END`.

A `MemorySaver` checkpointer is used so the workflow state can be inspected across steps.
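
A loop back to `writer_agent` can repeat indefinitely if the grounding score never reaches the threshold. One way to bound it is to track a revision counter in the state. The sketch below is plain Python and does not import LangGraph. The `revision_count` field, `MAX_REVISIONS` and `route_with_cap` are hypothetical, and `writer_agent` would need to increment the counter on each pass.

```python
from typing import TypedDict


class DemoState(TypedDict, total=False):
    """Minimal stand-in for ArticleState with an extra revision counter."""
    fact_check_report: dict
    revision_count: int  # hypothetical field, incremented by the writer


GROUNDING_THRESHOLD = 0.75  # same threshold as described above
MAX_REVISIONS = 2           # assumed cap for this sketch


def route_with_cap(state: DemoState) -> str:
    """Pick the next node name, giving up on revisions after MAX_REVISIONS."""
    score = state.get("fact_check_report", {}).get("overall_grounding_score", 0.0)
    if score >= GROUNDING_THRESHOLD:
        return "tone_editor_agent"
    if state.get("revision_count", 0) >= MAX_REVISIONS:
        # Stop looping; the low score stays in the report for human review.
        return "tone_editor_agent"
    return "writer_agent"


print(route_with_cap({"fact_check_report": {"overall_grounding_score": 0.5}, "revision_count": 0}))
print(route_with_cap({"fact_check_report": {"overall_grounding_score": 0.5}, "revision_count": 2}))
```

Routing a capped draft onward rather than ending the run is a design choice; sending it to a dedicated review step would be an equally reasonable alternative.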


In [9]:
from typing import TypedDict, List, Dict
from langgraph.graph import StateGraph, END
from langgraph.checkpoint.memory import MemorySaver

# Shared state type

class ArticleState(TypedDict, total=False):
    brief: Dict
    outline: Dict
    draft: str
    citations: List[Dict[str, str]]
    fact_check_report: Dict
    tone_issues: List[str]
    final_draft: str
    status: str


# Node functions (agents)

def outline_agent(state: ArticleState) -> ArticleState:
    brief = state["brief"]
    # Call LLM (ChatGroq) with simple prompt to create outline (OPTIONALLY call RAG)
    # outline = ...
    outline = {
        "sections": [
            {"heading": "Introduction to Vata Dosha", "description": "Plain explanation..."},
            {"heading": "Signs of Vata Imbalance", "description": "Common patterns..."},
            {"heading": "Lifestyle and Dietary Support", "description": "High-level tips..."},
            {"heading": "When to Consult a Physician", "description": "Safety guidance..."},
        ]
    }
    state["outline"] = outline
    state["status"] = "drafting"
    return state


def writer_agent(state: ArticleState) -> ArticleState:
    # For each section in outline:
    #   - call a RAG tool to get chunks
    #   - call LLM to write that section with citation markers
    # combine sections into `draft`
    # collect citations from RAG calls
    draft = "# " + state["brief"]["title"] + "\n\n..."  # generated text
    citations = [
        # placeholder citation; doc_id follows the ids assigned when loading the corpus
        {"doc_id": "dosha_guide", "section_id": "vata_overview_0"}
    ]
    state["draft"] = draft
    state["citations"] = citations
    state["status"] = "fact_checking"
    return state


def fact_checker_agent(state: ArticleState) -> ArticleState:
    # Use RAG again to check each claim in state["draft"]
    fact_report = {
        "unsupported_claims": [],
        "missing_citations": [],
        "overall_grounding_score": 0.9,
    }
    state["fact_check_report"] = fact_report
    state["status"] = "editing"
    return state


def tone_editor_agent(state: ArticleState) -> ArticleState:
    # Load style guide from corpus or env variable; call LLM to rewrite
    final_draft = state["draft"]  # in reality, call LLM with style prompt
    tone_issues = []
    state["final_draft"] = final_draft
    state["tone_issues"] = tone_issues
    state["status"] = "done"
    return state


# Graph wiring

builder = StateGraph(ArticleState)

builder.add_node("outline_agent", outline_agent)
builder.add_node("writer_agent", writer_agent)
builder.add_node("fact_checker_agent", fact_checker_agent)
builder.add_node("tone_editor_agent", tone_editor_agent)

builder.set_entry_point("outline_agent")

builder.add_edge("outline_agent", "writer_agent")
builder.add_edge("writer_agent", "fact_checker_agent")

# If grounding too low, we can loop back to writer_agent
def route_after_fact_check(state: ArticleState) -> str:
    report = state.get("fact_check_report", {})
    score = report.get("overall_grounding_score", 0.0)
    if score < 0.75:
        return "writer_agent"
    return "tone_editor_agent"

builder.add_conditional_edges("fact_checker_agent", route_after_fact_check)

builder.add_edge("tone_editor_agent", END)

memory = MemorySaver()

app = builder.compile(checkpointer=memory)


## Step 2 – Short Reflection

- **Time spent:**  
  I spent approximately **6–8 hours** in total on understanding the content pack, designing the RAG pipeline, implementing the FAISS-based retriever, and sketching the agentic workflow using LangGraph.

- **Most interesting part:**  
  The most interesting aspect was designing the **fact-checking step** in the agentic workflow and thinking about how to systematically enforce grounding using the same RAG corpus instead of relying on model trust alone.

- **Most unclear or challenging aspect:**  
  One challenge was balancing **chunk size and retrieval depth** to ensure sufficient context without overwhelming the LLM or introducing noisy evidence, especially for multi-part or safety-related questions.

- **Use of AI tools:**  
  I used AI tools (including ChatGPT) to:
  - Brainstorm RAG and agentic workflow designs.
  - Draft and refine code scaffolding for LangChain and LangGraph.
  - Debug Colab-specific dependency and import issues.
  - Rephrase explanations and comments for clarity and conciseness.

- **Final judgment:**  
  The final solution prioritizes **clarity, grounding, and pragmatic design** rather than over-engineering, with clear extension points for future improvements such as stricter claim validation or UI integration.
