# 5.1 — Gradio RAG App (interactive UI)

This notebook resolves its data paths through `src.paths`, retrieves documents with FAISS, and calls OpenAI with the retrieved context.

Goal: expose the RAG pipeline behind a simple web UI. The notebook loads the corpus and
artifacts, defines a `rag_respond()` function that retrieves and generates, and
launches a Gradio app for interactive testing.

In [1]:
%pip install -q python-dotenv openai sentence-transformers faiss-cpu gradio
from dotenv import load_dotenv, find_dotenv
load_dotenv(find_dotenv())  # reads .env from repo root

import sys
from pathlib import Path
sys.path.append(str(Path.cwd().parent))
from src.paths import RAW, DOCS_LIST, EMBED, INDEX

from openai import OpenAI
client = OpenAI()

Note: you may need to restart the kernel to use updated packages.


### Step 2 — Load filenames and texts from the corpus

Resets `docs_list.txt` to the known filename and prints the list and the contents
of `RAW` as a sanity check. Cell 4 then reads the filenames back, verifies each file
exists under `RAW`, loads the texts into memory, and prints a short preview.

In [2]:
# Reset the list to the one real filename
Path(DOCS_LIST).write_text("clinical_demo.md\n", encoding="utf-8")

# sanity check
print("Using list:", Path(DOCS_LIST).resolve())
print(Path(DOCS_LIST).read_text())
print("RAW has:", [p.name for p in Path(RAW).glob("*")])

Using list: /Users/priya/VSCode/Python/Personal Projects/RAG Multimodal Chatbot/Complete RAG folder/data/raw/docs_list.txt
clinical_demo.md

RAW has: ['docs_list.txt', 'clinical_demo.md']


### Step 3 — Prepare encoder, embeddings, and FAISS index

Initializes `all-MiniLM-L6-v2`, loads `EMBED` (NumPy array), and opens `INDEX`
(FAISS). Prints shapes/`ntotal` to confirm artifact consistency.

In [3]:
%pip install -q python-dotenv openai sentence-transformers faiss-cpu gradio numpy

from dotenv import load_dotenv, find_dotenv
load_dotenv(find_dotenv())  # reads .env from repo root

# Make ../src importable when running from Complete RAG folder/notebooks/
import sys
from pathlib import Path
parent = Path.cwd().parent
if str(parent) not in sys.path:
    sys.path.append(str(parent))

from openai import OpenAI
client = OpenAI()  # reads OPENAI_API_KEY from environment

from src.paths import RAW, DOCS_LIST, EMBED, INDEX
print("Using paths:")
print(" RAW:", RAW)
print(" DOCS_LIST:", DOCS_LIST)
print(" EMBED:", EMBED)
print(" INDEX:", INDEX)


Note: you may need to restart the kernel to use updated packages.
Using paths:
 RAW: /Users/priya/VSCode/Python/Personal Projects/RAG Multimodal Chatbot/Complete RAG folder/data/raw
 DOCS_LIST: /Users/priya/VSCode/Python/Personal Projects/RAG Multimodal Chatbot/Complete RAG folder/data/raw/docs_list.txt
 EMBED: /Users/priya/VSCode/Python/Personal Projects/RAG Multimodal Chatbot/Complete RAG folder/data/processed/embed.npy
 INDEX: /Users/priya/VSCode/Python/Personal Projects/RAG Multimodal Chatbot/Complete RAG folder/data/processed/faiss_doc_index.index


In [4]:
from pathlib import Path
from src.paths import RAW, DOCS_LIST

# read raw lines
raw_lines = [ln.strip() for ln in Path(DOCS_LIST).read_text(encoding="utf-8").splitlines() if ln.strip()]

# keep only real filenames (exist in RAW or have a known extension)
def looks_like_file(name: str) -> bool:
    return name.endswith((".md", ".txt", ".pdf")) or (RAW / name).exists()

names = [ln for ln in raw_lines if looks_like_file(ln)]
if not names:
    # fallback to your known file
    names = ["clinical_demo.md"]

# validate existence
doc_paths = [RAW / n for n in names]
for p in doc_paths:
    assert p.exists(), f"Missing source file: {p}"

# auto-fix docs_list.txt to the cleaned names
Path(DOCS_LIST).write_text("\n".join(names) + "\n", encoding="utf-8")

# load texts
docs_text = [p.read_text(encoding="utf-8") for p in doc_paths]
print("Loaded docs:", names)
print("First doc preview:\n", docs_text[0][:300])


Loaded docs: ['clinical_demo.md']
First doc preview:
 # Clinical Knowledge Pack (Demo)
_This is an educational demo for a local RAG prototype. Not medical advice._

## FAQ
- **What is this?** Retrieval notes the bot uses to answer questions with citations.
- **Scope:** Common chronic conditions + labs and a tiny antibiotic cheat sheet.
- **Update caden


In [5]:
from sentence_transformers import SentenceTransformer
import numpy as np, faiss

encoder = SentenceTransformer("all-MiniLM-L6-v2")
embeds = np.load(EMBED).astype("float32")
index  = faiss.read_index(str(INDEX))
print("Embeddings shape:", embeds.shape, "| Index ntotal:", index.ntotal)


Embeddings shape: (1, 384) | Index ntotal: 1
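
Printing the shape and `ntotal` confirms consistency only by eye. Below is a minimal sketch of the same check as code. It assumes one embedding row per loaded document and an index built from exactly those rows; the notebook does not show how the artifacts were built, so treat both as assumptions. `check_artifacts` is a hypothetical helper, not part of `src`.

```python
from typing import List, Tuple


def check_artifacts(n_docs: int, embed_shape: Tuple[int, ...],
                    index_ntotal: int, index_dim: int) -> List[str]:
    """Return human-readable mismatches (empty list when consistent).

    Hypothetical helper: assumes one embedding row per document and an
    index built from exactly those rows.
    """
    if len(embed_shape) != 2:
        return [f"embeddings should be 2-D, got shape {embed_shape}"]
    rows, dim = embed_shape
    problems = []
    if rows != n_docs:
        problems.append(f"{rows} embedding rows vs {n_docs} documents")
    if index_ntotal != rows:
        problems.append(f"index holds {index_ntotal} vectors vs {rows} embedding rows")
    if index_dim != dim:
        problems.append(f"index dimension {index_dim} vs embedding dimension {dim}")
    return problems


# Shape and ntotal from the output above; the index dimension (384) is assumed
print(check_artifacts(1, (1, 384), 1, 384))
# A stale index built from three documents would be reported
print(check_artifacts(1, (1, 384), 3, 384))
```

In this notebook the call would look like `check_artifacts(len(names), embeds.shape, index.ntotal, index.d)`, where `index.d` is the FAISS attribute holding the vector dimension.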


### Step 4 — Retrieve, ground the answer in context, and cite sources

Encodes the query with normalized embeddings, searches FAISS (cap `k` at
`index.ntotal`), builds the context from retrieved texts, calls the LLM, and
returns the answer with a `Sources:` line showing the contributing files.
Includes a guard to handle empty results or artifact mismatches gracefully.

In [6]:
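def rag_respond(q: str) -> str:
    try:
        # 1) Retrieve top-k, capping k at the number of indexed vectors
        k = min(3, index.ntotal)
        if k == 0:
            return "No results found in the knowledge base."
        qv = encoder.encode([q], normalize_embeddings=True).astype("float32")
        D, I = index.search(qv, k)
        # Drop FAISS padding (-1) and ids with no matching loaded document
        hits = [int(i) for i in I[0] if 0 <= i < len(docs_text)]
        if not hits:
            return "No results found in the knowledge base."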

        # 2) Build context strictly from retrieved text
        context = "\n\n".join(docs_text[i] for i in hits)

        # 3) Ask OpenAI with the retrieved context
        prompt = f"""Use ONLY the context to answer the question.
Be concise and cite specifics.

Context:
{context}

Question: {q}
Short answer:"""

        resp = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": "You are a careful assistant. Do not invent facts outside the provided context."},
                {"role": "user", "content": prompt},
            ],
            temperature=0.2,
        )
        answer = resp.choices[0].message.content

        sources = ", ".join(dict.fromkeys(names[i] for i in hits))
        return f"{answer}\n\n---\nSources: {sources}"
    except Exception as e:
        return f"Error: {e}"
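
A brute-force version of the search shows why normalized embeddings and a capped `k` matter. The sketch below is an illustrative stand-in for `index.search`, not what FAISS does internally. It assumes the index scores by inner product, which the notebook does not confirm. `top_k_cosine` is a hypothetical name.

```python
import numpy as np


def top_k_cosine(query: np.ndarray, matrix: np.ndarray, k: int = 3):
    """Brute-force top-k by cosine similarity (illustrative only)."""
    # L2-normalize so that the inner product equals cosine similarity
    q = query / np.linalg.norm(query)
    m = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    scores = m @ q
    # Never ask for more neighbours than there are stored vectors
    k = min(k, len(m))
    order = np.argsort(-scores)[:k]
    return order.tolist(), scores[order].tolist()


# Three toy 2-D "document" vectors and a query close to the first one
docs = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
ids, sims = top_k_cosine(np.array([2.0, 0.1]), docs, k=5)
print(ids)   # → [0, 2, 1]
```

Requesting `k=5` against three vectors returns three ids here because of the cap. FAISS instead pads the missing slots with `-1`, which is why `rag_respond` filters the returned ids before indexing into `docs_text`.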


### Step 5 — Verify inputs and artifacts before launching

Prints loaded filenames, confirms files exist on disk, and checks that both
`EMBED` and `INDEX` are present. Useful for catching path issues before the app starts.

In [7]:
# Quick checks: show the loaded filename(s) and confirm the artifacts exist.
from pathlib import Path
print("Names:", names)
print("Files exist:", [p.exists() for p in doc_paths])
print("Embeddings exist:", Path(EMBED).exists(), "Index exists:", Path(INDEX).exists())


Names: ['clinical_demo.md']
Files exist: [True]
Embeddings exist: True Index exists: True


### Step 6 — Launch the interactive app and test prompts

Starts the Gradio interface. Uses `prevent_thread_lock=True` so later cells (e.g.,
debug tools) can still run in the same kernel.

In [8]:
import gradio as gr

demo = gr.Interface(
    fn=rag_respond,
    inputs=gr.Textbox(label="Ask a Question"),
    outputs=gr.Textbox(label="Answer"),
    title="RAG Chatbot",
    description="Ask a question based on the embedded document knowledge base."
)

# Non-blocking so later cells can still run
demo.launch(quiet=True, prevent_thread_lock=True)




### Try these test prompts
- *What A1c level diagnoses type 2 diabetes?* → expects ≥ 6.5%
- *Typical A1c target and monitoring cadence?* → expects ~7%; every 3 months until at target, then every 6 months
- *What are first-line meds for uncomplicated hypertension?* → expects thiazide, ACEi/ARB, or CCB
- *Normal ALT and AST ranges?* → expects ALT 7–56 U/L; AST 10–40 U/L
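
The prompts above can also be run as a quick scripted check. Below is a minimal sketch, assuming a loose substring match is good enough as a smoke test; the expected substrings are shortened from the list above and the model may phrase a correct answer differently. `run_smoke_tests`, `CASES`, and `fake_respond` are hypothetical names. The stub responder only keeps the example runnable offline.

```python
from typing import Callable, List, Tuple

# (prompt, substring expected in the answer), taken from the list above
CASES = [
    ("What A1c level diagnoses type 2 diabetes?", "6.5"),
    ("Typical A1c target and monitoring cadence?", "7%"),
    ("What are first-line meds for uncomplicated hypertension?", "thiazide"),
    ("Normal ALT and AST ranges?", "56"),
]


def run_smoke_tests(respond: Callable[[str], str],
                    cases: List[Tuple[str, str]]) -> List[Tuple[str, bool]]:
    """Call `respond` on each prompt and record whether the expected text appears."""
    results = []
    for prompt, expected in cases:
        answer = respond(prompt)
        # Case-insensitive substring match: a loose check, not a grader
        results.append((prompt, expected.lower() in answer.lower()))
    return results


# Stub so the sketch runs offline; pass rag_respond instead inside the notebook
def fake_respond(q: str) -> str:
    return "Diagnosis at A1c ≥ 6.5%." if "diagnoses" in q else "I don't know."


for prompt, ok in run_smoke_tests(fake_respond, CASES):
    print("PASS" if ok else "FAIL", "-", prompt)
```

With the stub, only the first prompt passes. In the notebook, `run_smoke_tests(rag_respond, CASES)` exercises the real pipeline and makes one API call per prompt.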
