<a href="https://colab.research.google.com/github/Balavignesh-25/Agentic-AI-System/blob/main/Agentic_Medical_RAG_System_for_Hypertension_Explanation_(No_NLTK_API_Keys).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:

!pip install wikipedia

Collecting wikipedia
  Downloading wikipedia-1.4.0.tar.gz (27 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: wikipedia
  Building wheel for wikipedia (setup.py) ... done
  Created wheel for wikipedia: filename=wikipedia-1.4.0-py3-none-any.whl size=11678 sha256=9b0166af8d80fcde097ce49f6dbb53627d350b1f587737f57aee02c687c3c1d4
  Stored in directory: /root/.cache/pip/wheels/63/47/7c/a9688349aa74d228ce0a9023229c6c0ac52ca2a40fe87679b8
Successfully built wikipedia
Installing collected packages: wikipedia
Successfully installed wikipedia-1.4.0
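
The next cell passes the whole question to `wikipedia.search` and summarizes the first title returned. The first hit is not guaranteed to be on topic: the run recorded below ends up summarizing Dementia. One possible guard, sketched here under assumptions, is to score each candidate title by word overlap with the query and skip retrieval when nothing overlaps. The names `pick_title` and `STOP_WORDS`, and both candidate lists, are hypothetical and not part of the notebook.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"the", "and", "of", "in", "a", "an", "to", "explain", "simple", "terms"}

def tokenize(text):
    """Lowercase the text and return its alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def pick_title(query, titles):
    """Return the title sharing the most content words with the query,
    or None when no title shares any."""
    query_words = set(tokenize(query)) - STOP_WORDS
    best_title, best_score = None, 0
    for title in titles:
        score = len(query_words & set(tokenize(title)))
        if score > best_score:  # strict comparison keeps the earliest title on ties
            best_title, best_score = title, score
    return best_title

query = "Explain the causes and risk factors of hypertension in simple terms."

# Made-up candidate lists standing in for a wikipedia.search() result.
print(pick_title(query, ["Dementia", "Angina", "Hypertension"]))  # Hypertension
print(pick_title(query, ["Dementia", "Angina", "Myopia"]))        # None
```

Returning `None` lets the caller fall back to the internal knowledge base instead of summarizing an unrelated page.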


In [2]:
# ==============================
# FULL AGENTIC MEDICAL RAG SYSTEM
# NO NLTK | NO API KEYS | FREE
# ==============================

# !pip install wikipedia

import wikipedia
import re
from collections import Counter

# ------------------------------
# INTERNAL MEDICAL KNOWLEDGE BASE
# This section defines a simple, static knowledge base for medical information.
# ------------------------------
INTERNAL_MEDICAL_KB = [
    "Hypertension means long-term high blood pressure in the arteries.",
    "It usually develops slowly and often has no early symptoms.",
    "Eating too much salt can raise blood pressure.",
    "Being overweight makes the heart work harder.",
    "Smoking damages blood vessels and raises blood pressure.",
    "Lack of exercise weakens the heart and blood vessels.",
    "Drinking too much alcohol increases blood pressure.",
    "Family history increases the risk of hypertension.",
    "Kidney disease and hormone problems can cause secondary hypertension.",
    "High blood pressure increases the risk of heart attack and stroke."
]

# ------------------------------
# BASIC NLP UTILITIES (NO NLTK)
# These functions provide basic Natural Language Processing capabilities
# without relying on external NLP libraries like NLTK.
# ------------------------------
def split_sentences(text):
    # Splits a given text into individual sentences.
    return re.split(r'(?<=[.!?])\s+', text.strip())

def tokenize(text):
    # Converts text to lowercase and extracts individual words (tokens).
    return re.findall(r'\b[a-zA-Z]+\b', text.lower())

# ------------------------------
# INTERNAL KB RETRIEVER
# This function searches the internal knowledge base for relevant information.
# ------------------------------
def retrieve_internal_kb(query, top_k=3):
    # Tokenizes the query to find matching words in the internal documents.
    query_words = set(tokenize(query))
    scored = []

    # Scores each document based on the number of shared words with the query.
    for doc in INTERNAL_MEDICAL_KB:
        score = len(query_words & set(tokenize(doc)))
        scored.append((score, doc))

    # Sorts documents by score and returns the top_k most relevant ones.
    scored.sort(reverse=True)
    return [doc for score, doc in scored[:top_k]]

# ------------------------------
# EXTERNAL KNOWLEDGE TOOL (WIKI)
# This function retrieves information from Wikipedia for external context.
# ------------------------------
def retrieve_wikipedia(query, sentences=4):
    try:
        # Searches Wikipedia for the query and retrieves a summary.
        title = wikipedia.search(query)[0]
        return wikipedia.summary(title, sentences=sentences)
    except Exception:
        # Returns a fallback message if Wikipedia retrieval fails.
        return "External medical knowledge could not be retrieved."

# ------------------------------
# ANSWER GENERATOR
# This function combines retrieved information to form a coherent answer.
# ------------------------------
def generate_answer(query, internal_docs, wiki_text):
    answer = "Here is a simple explanation:\n\n"

    # Appends information from the internal knowledge base.
    for doc in internal_docs:
        answer += "- " + doc + "\n"

    # Appends additional information from Wikipedia.
    answer += "\nAdditional information:\n"
    answer += wiki_text
    return answer

# ------------------------------
# PURE NLP SUMMARIZER (NO LLM)
# This function summarizes text using basic NLP techniques without an LLM.
# ------------------------------
def nlp_summarize(text, num_sentences=3):
    # Splits text into sentences and tokenizes words.
    sentences = split_sentences(text)
    words = tokenize(text)
    # Calculates word frequencies.
    freq = Counter(words)

    sentence_scores = {}
    # Scores each sentence based on the frequency of its words.
    for s in sentences:
        sentence_scores[s] = sum(freq[w] for w in tokenize(s))

    # Ranks sentences by score and returns the top num_sentences, joined in score order (not document order).
    ranked = sorted(sentence_scores, key=sentence_scores.get, reverse=True)
    return " ".join(ranked[:num_sentences])

# ------------------------------
# CRITIC AGENT
# This agent evaluates the generated answer for quality and completeness.
# ------------------------------
def evaluate_answer(answer):
    # Checks if the answer is too short or misses key concepts.
    if len(answer.split()) < 60:
        return False, "Answer too shallow"
    if "blood pressure" not in answer.lower():
        return False, "Missing core concept"
    # Returns true if the answer is considered sufficient.
    return True, "Answer is sufficient"

# ------------------------------
# PLANNING AGENT
# This agent defines the sequence of steps for the RAG process.
# ------------------------------
def planner(goal):
    # Returns a predefined plan for processing the medical query.
    return [
        "retrieve_internal",
        "retrieve_external",
        "generate",
        "evaluate",
        "summarize"
    ]

# ------------------------------
# AGENTIC EXECUTION LOOP
# This is the main orchestrator that executes the plan using different agents.
# ------------------------------
def agentic_medical_rag(query):
    print("\n🤖 AGENT ACTIVATED")
    print("🎯 GOAL:", query)

    # Gets the plan from the planning agent.
    plan = planner(query)
    print("🧠 PLAN:", plan)

    memory = {} # Stores intermediate results from each step.

    for step in plan:
        print(f"\n➡ EXECUTING: {step}")

        if step == "retrieve_internal":
            # Retrieves relevant documents from the internal knowledge base.
            memory["internal"] = retrieve_internal_kb(query)
            print("📚 Internal KB used")

        elif step == "retrieve_external":
            # Retrieves information from Wikipedia.
            memory["wiki"] = retrieve_wikipedia(query)
            print("🌍 Wikipedia consulted")

        elif step == "generate":
            # Generates a draft answer using internal and external information.
            memory["answer"] = generate_answer(
                query,
                memory["internal"],
                memory["wiki"]
            )
            print("✍ Draft answer generated")

        elif step == "evaluate":
            # Evaluates the draft answer using the critic agent.
            ok, reason = evaluate_answer(memory["answer"])
            print("🔍 Critic:", reason)

            if not ok:
                # If the answer is insufficient, refines it by adding more info.
                print("🔁 Refining answer...")
                memory["internal"].append(
                    "Hypertension damages blood vessels over time if untreated."
                )
                memory["answer"] = generate_answer(
                    query,
                    memory["internal"],
                    memory["wiki"]
                )

        elif step == "summarize":
            # Summarizes the final answer using NLP summarizer.
            memory["final"] = nlp_summarize(memory["answer"])
            print("📝 Summary created")

    # Returns the final summarized answer.
    return memory["final"]

# ------------------------------
# RUN DEMO
# This section demonstrates how to use the agentic medical RAG system.
# ------------------------------
query = "Explain the causes and risk factors of hypertension in simple terms."
final_answer = agentic_medical_rag(query)

print("\n✅ FINAL ANSWER:\n")
print(final_answer)


🤖 AGENT ACTIVATED
🎯 GOAL: Explain the causes and risk factors of hypertension in simple terms.
🧠 PLAN: ['retrieve_internal', 'retrieve_external', 'generate', 'evaluate', 'summarize']

➡ EXECUTING: retrieve_internal
📚 Internal KB used

➡ EXECUTING: retrieve_external
🌍 Wikipedia consulted

➡ EXECUTING: generate
✍ Draft answer generated

➡ EXECUTING: evaluate
🔍 Critic: Answer is sufficient

➡ EXECUTING: summarize
📝 Summary created

✅ FINAL ANSWER:

Additional information:
Dementia is a syndrome, often associated with neurodegenerative diseases such as Alzheimer's, and characterized by a general decline in cognitive abilities that affects a person's ability to perform everyday activities. Aside from memory impairment and a disruption in thought patterns, the most common symptoms of dementia include emotional problems, difficulties with language, and decreased motivation. Here is a simple explanation:

- High blood pressure increases the risk of heart attack and stroke.
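
The final answer above reads out of order because `nlp_summarize` joins the selected sentences by descending score rather than by their position in the text. A minimal sketch of an order-preserving variant follows. The function name `summarize_in_order` and the sample text are assumptions made for illustration. The scoring is the same word-frequency sum used in the notebook.

```python
import re
from collections import Counter

def split_sentences(text):
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def tokenize(text):
    """Lowercase the text and return its alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def summarize_in_order(text, num_sentences=3):
    """Pick the highest-scoring sentences, then emit them in document order."""
    sentences = split_sentences(text)
    freq = Counter(tokenize(text))
    # Scores are stored by position, so repeated sentences do not overwrite each other.
    scores = [sum(freq[w] for w in tokenize(s)) for s in sentences]
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    keep = sorted(top[:num_sentences])  # restore original order
    return " ".join(sentences[i] for i in keep)

sample = (
    "Salt raises blood pressure. "
    "Cats sleep a lot. "
    "High blood pressure strains the heart and the kidneys."
)
print(summarize_in_order(sample, num_sentences=2))
# Salt raises blood pressure. High blood pressure strains the heart and the kidneys.
```

The third sentence scores highest, yet it still appears after the first one because the selected indices are re-sorted before joining. Like the notebook's version, this scoring favors long sentences.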


In [3]:
!pip install -q sentence-transformers faiss-cpu transformers wikipedia beautifulsoup4 requests accelerate



In [4]:
# =========================
# IMPORTS
# =========================
import wikipedia
import requests
from bs4 import BeautifulSoup
from sentence_transformers import SentenceTransformer
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import numpy as np
import faiss
import torch
import re
from datetime import datetime

# =========================
# EXECUTION TRACE TOOL
# =========================
TRACE_LOG = []

def trace(step, message):
    timestamp = datetime.now().strftime("%H:%M:%S")
    entry = f"[{timestamp}] {step}: {message}"
    TRACE_LOG.append(entry)
    print(entry)

# =========================
# INTERNAL KNOWLEDGE SCRAPER
# =========================
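def scrape_text(url):
    # Fetches a page and returns the text of its <p> elements; returns "" if the request fails.
    try:
        r = requests.get(url, timeout=10)
        r.raise_for_status()  # Treat HTTP error statuses as failures instead of parsing an error page.
        soup = BeautifulSoup(r.text, "html.parser")
        text = " ".join(p.get_text() for p in soup.find_all("p"))
        return text.strip()
    except requests.RequestException:
        return ""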

urls = [
    "https://en.wikipedia.org/wiki/Hypertension",
    "https://en.wikipedia.org/wiki/Blood_pressure",
    "https://en.wikipedia.org/wiki/Cardiovascular_disease"
]

trace("SCRAPER", "Collecting internal knowledge")
documents = []

for u in urls:
    text = scrape_text(u)
    trace("SCRAPER_RETURN", f"Scraped {len(text)} characters from {u}")
    for i in range(0, len(text), 800):
        chunk = text[i:i+800]
        if len(chunk.strip()) > 200:
            documents.append(chunk)

# =========================
# FALLBACK KNOWLEDGE
# =========================
if not documents:
    trace("VALIDATOR", "No documents scraped — activating fallback knowledge")
    documents = [
        "Hypertension is a condition where blood pressure stays high for a long time.",
        "Common causes include unhealthy diet, too much salt, stress, obesity, and lack of exercise.",
        "Risk factors include age, family history, smoking, alcohol use, and chronic stress.",
        "Untreated high blood pressure can lead to heart disease, stroke, and kidney damage."
    ]
    trace("VALIDATOR_RETURN", f"Fallback documents count: {len(documents)}")

# =========================
# EMBEDDINGS + VECTOR STORE
# =========================
trace("EMBEDDER", "Generating embeddings")
embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vectors = embedder.encode(documents)

if len(doc_vectors.shape) == 1:
    doc_vectors = doc_vectors.reshape(1, -1)

doc_vectors = doc_vectors.astype("float32")
trace("EMBEDDER_RETURN", f"Embedding shape: {doc_vectors.shape}")

trace("VECTOR_DB", f"Initializing FAISS with {len(documents)} documents")
index = faiss.IndexFlatL2(doc_vectors.shape[1])
index.add(doc_vectors)

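def retrieve_internal(query, k=3):
    # Never request more neighbours than the index holds; FAISS pads missing results with -1.
    k = min(k, len(documents))
    qv = embedder.encode([query]).astype("float32")
    _, ids = index.search(qv, k)
    results = [documents[i] for i in ids[0] if i >= 0]
    trace("RETRIEVER_RETURN", f"Internal docs retrieved: {len(results)}")
    return results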

# =========================
# EXTERNAL KNOWLEDGE AGENT
# =========================
def retrieve_external_wiki(query):
    try:
        titles = wikipedia.search(query)
        trace("RETRIEVER_RETURN", f"Wikipedia titles found: {titles[:5]}")
        for t in titles:
            if "hypertension" in t.lower():
                summary = wikipedia.summary(t, sentences=4)
                trace("RETRIEVER_RETURN", f"External summary length: {len(summary)}")
                return summary
        return ""
    except Exception:
        return ""

# =========================
# RELEVANCE VERIFIER
# =========================
def verify_relevance(text):
    keywords = ["blood", "pressure", "hypertension"]
    result = all(k in text.lower() for k in keywords)
    trace("VERIFIER_RETURN", f"External relevance check: {result}")
    return result

# =========================
# LLM
# =========================
trace("LLM", "Loading BLOOM-560M")
model_name = "bigscience/bloom-560m"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

llm = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=250,
    do_sample=True,  # temperature only takes effect when sampling is enabled
    temperature=0.6
)

# =========================
# RETRIEVER AGENT
# =========================
def retriever_agent(query):
    trace("RETRIEVER", "Fetching internal knowledge")
    internal = retrieve_internal(query)

    trace("RETRIEVER", "Fetching external knowledge")
    external = retrieve_external_wiki(query)

    if not verify_relevance(external):
        trace("RETRIEVER", "External source rejected (low relevance)")
        external = ""

    trace("RETRIEVER_RETURN", f"Internal chunks: {len(internal)}, External chars: {len(external)}")
    return internal, external

# =========================
# GENERATOR AGENT
# =========================
def generator_agent(query, internal, external):
    trace("GENERATOR", "Generating draft answer")

    context = "\n".join(internal) + "\n" + external

    prompt = f"""
You are a medical expert.
Explain in very simple terms.

Context:
{context}

Question:
{query}

Answer:
"""
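    # By default the pipeline returns the prompt followed by the continuation;
    # return_full_text=False keeps only the newly generated text.
    output = llm(prompt, return_full_text=False)[0]["generated_text"]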
    trace("GENERATOR_RETURN", f"Draft length: {len(output)} characters")
    return output

# =========================
# CRITIC AGENT
# =========================
def critic_agent(answer):
    trace("CRITIC", "Reviewing answer for safety and relevance")

    forbidden = ["atrial fibrillation", "arrhythmia"]
    for f in forbidden:
        if f in answer.lower():
            trace("CRITIC_RETURN", f"Rejected due to keyword: {f}")
            return False, "Unrelated medical condition detected"

    trace("CRITIC_RETURN", "Approved")
    return True, "Answer approved"

# =========================
# NLP SUMMARIZER
# =========================
def nlp_summarize(text, max_sentences=3):
    trace("SUMMARIZER", "Condensing answer using NLP")
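    # Splits on any whitespace (including newlines) after sentence-ending punctuation,
    # then keeps the first max_sentences sentences.
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())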
    summary = " ".join(sentences[:max_sentences])
    trace("SUMMARIZER_RETURN", f"Summary length: {len(summary)} characters")
    return summary

# =========================
# AGENTIC ORCHESTRATOR
# =========================
def agentic_medical_rag(query):
    trace("AGENT", "Agent activated")
    trace("PLAN", "retrieve → verify → generate → critique → summarize")

    internal, external = retriever_agent(query)
    draft = generator_agent(query, internal, external)

    ok, verdict = critic_agent(draft)
    trace("CRITIC", verdict)

    if not ok:
        return "Answer rejected by critic agent."

    final = nlp_summarize(draft)
    trace("AGENT_RETURN", f"Final answer length: {len(final)} characters")
    return final

# =========================
# RUN DEMO
# =========================
query = "Explain the causes and risk factors of hypertension in simple terms."
final_answer = agentic_medical_rag(query)

print("\n✅ FINAL AGENTIC ANSWER:\n")
print(final_answer)

print("\n📊 EXECUTION TRACE:\n")
for t in TRACE_LOG:
    print(t)

[12:10:35] SCRAPER: Collecting internal knowledge
[12:10:35] SCRAPER_RETURN: Scraped 0 characters from https://en.wikipedia.org/wiki/Hypertension
[12:10:35] SCRAPER_RETURN: Scraped 0 characters from https://en.wikipedia.org/wiki/Blood_pressure
[12:10:35] SCRAPER_RETURN: Scraped 0 characters from https://en.wikipedia.org/wiki/Cardiovascular_disease
[12:10:35] VALIDATOR: No documents scraped — activating fallback knowledge
[12:10:35] VALIDATOR_RETURN: Fallback documents count: 4
[12:10:35] EMBEDDER: Generating embeddings


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



BertModel LOAD REPORT from: sentence-transformers/all-MiniLM-L6-v2
Key                     | Status     |  | 
------------------------+------------+--+-
embeddings.position_ids | UNEXPECTED |  | 

Notes:
- UNEXPECTED	:can be ignored when loading from different task/architecture; not ok if you expect identical arch.



[12:10:42] EMBEDDER_RETURN: Embedding shape: (4, 384)
[12:10:42] VECTOR_DB: Initializing FAISS with 4 documents
[12:10:42] LLM: Loading BLOOM-560M



Passing `generation_config` together with generation-related arguments=({'max_new_tokens', 'temperature'}) is deprecated and will be removed in future versions. Please pass either a `generation_config` object OR all generation parameters explicitly, but not both.
Both `max_new_tokens` (=250) and `max_length`(=20) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


[12:10:56] AGENT: Agent activated
[12:10:56] PLAN: retrieve → verify → generate → critique → summarize
[12:10:56] RETRIEVER: Fetching internal knowledge
[12:10:56] RETRIEVER_RETURN: Internal docs retrieved: 3
[12:10:56] RETRIEVER: Fetching external knowledge
[12:10:56] RETRIEVER_RETURN: Wikipedia titles found: ['Dementia', 'Angina', 'Atrial fibrillation', 'Myopia', 'Metabolic dysfunction–associated steatotic liver disease']
[12:10:56] VERIFIER_RETURN: External relevance check: False
[12:10:56] RETRIEVER: External source rejected (low relevance)
[12:10:56] RETRIEVER_RETURN: Internal chunks: 3, External chars: 0
[12:10:56] GENERATOR: Generating draft answer
[12:12:06] GENERATOR_RETURN: Draft length: 1591 characters
[12:12:06] CRITIC: Reviewing answer for safety and relevance
[12:12:06] CRITIC_RETURN: Approved
[12:12:06] CRITIC: Answer approved
[12:12:06] SUMMARIZER: Condensing answer using NLP
[12:12:06] SUMMARIZER_RETURN: Summary length: 1155 characters
[12:12:06] AGENT_RETURN: Final an