# LLM Agents in Healthcare Administration

*Author: Timo Lüders*

This notebook teaches how to build simple LLM-based agents for healthcare administration workflows (e.g., appointment handling, billing questions) **without using external API keys**.

You will:
- Understand the difference between plain LLMs and agents.
- Build a simple appointment assistant that can call Python "tools".
- Add safety guardrails to keep the agent in an administrative scope.
- Use BioBERT embeddings to route queries between multiple specialized agents.
- (Optional, advanced) Add a small retrieval component for admin FAQs.

> **Disclaimer:** This notebook is for educational purposes only. It is **not** a medical device and must not be used for real patient care or decision making.

## 0. How to Use This Notebook

- Run cells from top to bottom.
- Read the explanations in Markdown cells.
- Complete the **exercises** marked with ✅.
- Many code cells contain `# TODO` comments where you should add or modify code.

At the end, you should have a small multi-agent system that can:
- Answer appointment questions.
- Answer simple billing questions (on synthetic data).
- Route queries to the right agent.
- Refuse to answer clinical questions.

A separate solutions notebook can provide example answers and full reference code.

## 1. Setup and Environment

In this section we:
- Install and import required Python packages.
- Detect whether a GPU is available.
- Briefly discuss the models used in this notebook.

We will rely on **open-source models** only:
- **BioGPT** as a biomedical generative language model.
- **BioBERT** as a biomedical encoder for embeddings and routing.

✅ **Exercise 1 (reflection)**: After running the setup cells, write a short note (2–3 sentences) describing which models you loaded and whether you are using CPU or GPU.

In [4]:
# 1.1 Install required packages (run once per session)

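# accelerate is required by transformers when device_map="auto" is used (GPU case in 1.3)
!pip install -q transformers torch psutil accelerate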

# Optional: for similarity / math utilities
!pip install -q numpy

!pip install -q sacremoses protobuf


print("Packages installed (or already present).")


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.2[0m[39;49m -> [0m[32;49m25.3[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m

[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.2[0m[39;49m -> [0m[32;49m25.3[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m

[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.2[0m[39;49m -> [0m[32;49m25.3[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m
Packages installed (or already present).


In [5]:
# 1.2 Basic environment check

import sys
import psutil
import torch

IN_COLAB = 'google.colab' in sys.modules
print(f"Running in Google Colab: {IN_COLAB}")

has_gpu = torch.cuda.is_available()
print(f"GPU available: {has_gpu}")

mem_gb = psutil.virtual_memory().available / 1e9
print(f"Approx. available system memory: {mem_gb:.2f} GB")

Running in Google Colab: False
GPU available: False
Approx. available system memory: 3.21 GB


In [6]:
# 1.3 Load open-source medical models: BioGPT (generator) and BioBERT (encoder)

from transformers import BioGptTokenizer, BioGptForCausalLM
from transformers import AutoTokenizer, AutoModel

# Load BioGPT for generation
biogpt_tokenizer = BioGptTokenizer.from_pretrained("microsoft/biogpt")

biogpt_model = BioGptForCausalLM.from_pretrained(
    "microsoft/biogpt",
    device_map="auto" if has_gpu else None,
    torch_dtype=torch.float16 if has_gpu else torch.float32,
)

print("Loaded BioGPT model for generation.")

# Load BioBERT for embeddings / routing
biobert_tokenizer = AutoTokenizer.from_pretrained("dmis-lab/biobert-base-cased-v1.2")
biobert_model = AutoModel.from_pretrained("dmis-lab/biobert-base-cased-v1.2")

if has_gpu:
    biobert_model = biobert_model.to("cuda")

print("Loaded BioBERT model for embeddings.")

`torch_dtype` is deprecated! Use `dtype` instead!


OSError: Can't load the model for 'microsoft/biogpt'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'microsoft/biogpt' is the correct path to a directory containing a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack.

In [None]:
# 1.4 Helper functions for generation and embeddings

import torch
import torch.nn.functional as F


def generate_with_biogpt(prompt: str, max_new_tokens: int = 150) -> str:
    """Generate a response from BioGPT given a text prompt."""
    formatted_prompt = f"Question: {prompt}\n\nAnswer:"
    inputs = biogpt_tokenizer(formatted_prompt, return_tensors="pt")

    if has_gpu:
        inputs = {k: v.to(biogpt_model.device) for k, v in inputs.items()}

    input_ids = inputs["input_ids"]

    with torch.no_grad():
        outputs = biogpt_model.generate(
            input_ids,
            max_new_tokens=max_new_tokens,
            do_sample=True,  # without this, temperature/top_p/top_k are ignored
            temperature=0.8,
            top_p=0.92,
            top_k=50,
            repetition_penalty=1.2,
            pad_token_id=biogpt_tokenizer.eos_token_id,
            attention_mask=inputs.get("attention_mask"),
        )

    full_text = biogpt_tokenizer.decode(outputs[0], skip_special_tokens=True)

    if "Answer:" in full_text:
        return full_text.split("Answer:", 1)[1].strip()
    return full_text.strip()


def get_biobert_embedding(text: str) -> torch.Tensor:
    """Return a single sentence embedding for the given text using BioBERT.

    We use the [CLS] token embedding as a simple sentence representation.
    """
    inputs = biobert_tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=512)

    if has_gpu:
        inputs = {k: v.to(biobert_model.device) for k, v in inputs.items()}

    with torch.no_grad():
        outputs = biobert_model(**inputs)

    # CLS token embedding
    emb = outputs.last_hidden_state[:, 0, :]  # shape: (1, hidden_dim)
    return emb.squeeze(0).cpu()


def cosine_similarity(a: torch.Tensor, b: torch.Tensor) -> float:
    """Compute cosine similarity between two 1D tensors."""
    a = a.view(-1)
    b = b.view(-1)
    return F.cosine_similarity(a.unsqueeze(0), b.unsqueeze(0)).item()

✅ **Exercise 1**  
Run the setup cells above and then, in a new Markdown cell, answer:
- Which models did you load (names)?
- Are you running on CPU or GPU?
- How might this affect speed and what you can realistically run?

## 2. From Plain LLM to Agent Concept

In this section we:
- Try a plain interaction with BioGPT (no tools).
- Introduce the idea of an **agent**.
- Start thinking about roles for different assistants.

An **LLM agent** is more than just text-in/text-out: it can use tools (Python functions, APIs), follow goals, and be constrained to a specific role (e.g. clinic receptionist).
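To make the distinction concrete, here is a minimal sketch of the "tools plus restricted role" idea with no language model involved. The tool names, the `choose_tool` rules, and the contact string are hypothetical and chosen only for illustration; the later sections of this notebook use their own tools and prompts.

```python
from typing import Callable, Dict, Optional


# Hypothetical tools: plain Python functions the agent is allowed to call.
def get_opening_hours() -> str:
    return "Open 8:00 to 17:00 on weekdays."


def get_contact_info() -> str:
    return "Phone: 000-0000 (synthetic)."


# The registry is the agent's complete, explicit list of capabilities.
TOOLS: Dict[str, Callable[[], str]] = {
    "opening_hours": get_opening_hours,
    "contact": get_contact_info,
}


def choose_tool(user_message: str) -> Optional[str]:
    """Pick a tool name with simple keyword rules, or None if out of scope."""
    text = user_message.lower()
    if "open" in text or "hours" in text:
        return "opening_hours"
    if "phone" in text or "contact" in text:
        return "contact"
    return None


def toy_agent(user_message: str) -> str:
    """Role-restricted agent: call a registered tool or refuse."""
    tool_name = choose_tool(user_message)
    if tool_name is None:
        return "Sorry, I can only help with opening hours and contact details."
    tool_result = TOOLS[tool_name]()
    return f"[{tool_name}] {tool_result}"


print(toy_agent("When are you open?"))
print(toy_agent("What should I take for a cold?"))
```

In a full agent, the tool result would be passed to the language model to phrase the final answer; here it is returned directly so that the control flow stays visible.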

In [None]:
# 2.1 Plain BioGPT interaction

example_prompt = "What are common administrative tasks in a general medical practice?"
response = generate_with_biogpt(example_prompt, max_new_tokens=120)
print("Prompt:\n", example_prompt)
print("\nBioGPT response:\n", response)

✅ **Exercise 2**  
In a new Markdown cell, define **two roles** for LLMs in a clinic:

1. `ClinicReceptionBot` – handles only administrative questions (appointments, opening hours, contact info).
2. `FinancialAssistantBot` – handles only billing and invoice questions.

For each role, describe in 3–4 bullet points:
- What the agent *can* do.
- What it must **not** do (e.g. no clinical advice).

## 3. Synthetic Clinic Data and Python Tools

We now create a tiny, fully synthetic dataset to simulate a clinic:
- A set of patients (with fake IDs and names).
- Doctors and their specialties.
- Appointments with dates and times.

On top of this, we implement small Python functions that our future agents can call as **tools**.

In [None]:
# 3.1 Define synthetic clinic data

from datetime import datetime, timedelta

patients = [
    {"patient_id": "P001", "name": "Alice Example"},
    {"patient_id": "P002", "name": "Bob Sample"},
    {"patient_id": "P003", "name": "Charlie Demo"},
]

doctors = [
    {"doctor_id": "D001", "name": "Dr. Smith", "specialty": "General Practice"},
    {"doctor_id": "D002", "name": "Dr. Brown", "specialty": "Internal Medicine"},
]

# appointments: list of dicts with simple fields
appointments = [
    {
        "appointment_id": "A001",
        "patient_id": "P001",
        "doctor_id": "D001",
        "datetime": datetime.now() + timedelta(days=1, hours=9),
        "type": "check-up",
    },
    {
        "appointment_id": "A002",
        "patient_id": "P002",
        "doctor_id": "D002",
        "datetime": datetime.now() + timedelta(days=2, hours=11),
        "type": "follow-up",
    },
]

print(f"Patients: {len(patients)}, Doctors: {len(doctors)}, Appointments: {len(appointments)}")

In [None]:
# 3.2 Helper functions to work with the clinic data

from typing import List, Dict, Optional


def list_available_slots(days_ahead: int = 7, doctor_id: Optional[str] = None) -> List[Dict]:
    """Return a list of available (fake) time slots in the next 'days_ahead' days.

    This is a very simplified function: in a real system, you would consider
    working hours, existing bookings, holidays, etc.
    """
    now = datetime.now()
    slots = []
    for d in range(1, days_ahead + 1):
        for hour in [9, 11, 14, 16]:  # hour offsets added to the current time, not wall-clock hours
            slot_time = now + timedelta(days=d, hours=hour)
            # skip if already booked for this doctor at this time
            for appt in appointments:
                if appt["datetime"].date() == slot_time.date() and appt["datetime"].hour == slot_time.hour:
                    if doctor_id is None or appt["doctor_id"] == doctor_id:
                        break
            else:
                slots.append({"datetime": slot_time, "doctor_id": doctor_id or "ANY"})
    return slots


def book_appointment(patient_id: str, doctor_id: str, slot_datetime: datetime, appt_type: str = "check-up") -> Dict:
    """Book a new appointment if the slot is free.

    Returns the created appointment dict.
    """
    # NOTE: In Exercise 3B you will add a proper double-booking check here.
    new_id = f"A{len(appointments) + 1:03d}"
    appt = {
        "appointment_id": new_id,
        "patient_id": patient_id,
        "doctor_id": doctor_id,
        "datetime": slot_datetime,
        "type": appt_type,
    }
    appointments.append(appt)
    return appt


def cancel_appointment(appointment_id: str) -> bool:
    """Cancel an appointment by ID. Returns True if found and removed."""
    for i, appt in enumerate(appointments):
        if appt["appointment_id"] == appointment_id:
            del appointments[i]
            return True
    return False

✅ **Exercise 3A**  
Implement a function `get_patient_appointments(patient_id: str)` that returns all appointments for that patient.

- Write the function in the next code cell.
- Test it with patient IDs like `"P001"` and `"P003"`.

✅ **Exercise 3B**  
Extend `book_appointment` so that it **refuses** to book an appointment if another appointment already exists for the same doctor at the same date and time.

- If a conflict is detected, print a short message and return `None`.
- Otherwise, book as normal.

In [None]:
# TODO: Exercise 3A – implement get_patient_appointments

def get_patient_appointments(patient_id: str) -> List[Dict]:
    """Return a list of appointments for the given patient.

    TODO: implement the filtering logic.
    """
    # Hint: iterate over 'appointments' and select those with matching patient_id.
    raise NotImplementedError("Implement get_patient_appointments")


# TODO: Exercise 3B – extend book_appointment to prevent double-booking
# You can modify the existing book_appointment function above, or
# write a new version here and replace the old one.

print("Exercise 3A/3B: Implement and test your functions above.")

## 4. A Single Tool-Using Appointment Agent

We now build our first **agent**:
- It has a clear role: appointment assistant for a GP practice.
- It can call our Python tools (e.g. list slots, book appointments).
- It uses BioGPT to generate user-facing answers.

The orchestration ("when to call which tool") is implemented in Python to keep things transparent and safe.

In [None]:
# 4.1 System prompt for the appointment agent

APPOINTMENT_AGENT_SYSTEM_PROMPT = """
You are an appointment scheduling assistant for a general medical practice.

Your responsibilities:
- Help patients find and book suitable appointment times.
- Help patients view and cancel existing synthetic appointments.
- Ask follow-up questions if important information is missing (e.g. preferred date or doctor).

Your limitations:
- You MUST NOT provide medical diagnoses or treatment recommendations.
- You MUST NOT interpret symptoms or lab values.
- If the user asks for medical advice, you must say you cannot answer that and suggest contacting a clinician.

Always be concise and polite.
""".strip()

print("Appointment agent system prompt defined.")

In [None]:
# 4.2 Simple rule-based action selection (no ML yet)

from enum import Enum


class AppointmentAction(str, Enum):
    NEW = "new_appointment"
    RESCHEDULE = "reschedule"
    CANCEL = "cancel"
    VIEW = "view"
    OTHER = "other"


def decide_appointment_action(user_message: str) -> AppointmentAction:
    """Very simple rule-based classifier for appointment-related intents.

    TODO (Exercise 4B): Refine these rules or add new ones based on examples.
    """
    text = user_message.lower()
    # Check the more specific intents first: "reschedule" contains "schedule",
    # so testing NEW first would misclassify rescheduling requests.
    if any(word in text for word in ["reschedule", "move", "change time"]):
        return AppointmentAction.RESCHEDULE
    if any(word in text for word in ["cancel", "delete appointment"]):
        return AppointmentAction.CANCEL
    if any(word in text for word in ["see my appointments", "view appointments", "what appointments", "upcoming appointments"]):
        return AppointmentAction.VIEW
    if any(word in text for word in ["book", "schedule", "make an appointment", "new appointment"]):
        return AppointmentAction.NEW
    return AppointmentAction.OTHER


print("Rule-based decide_appointment_action defined.")

In [None]:
# 4.3 A simple appointment agent loop using tools + BioGPT

def run_appointment_agent(user_message: str) -> str:
    """Handle a user message with the appointment agent.

    - Decide which action to take (NEW / RESCHEDULE / CANCEL / VIEW / OTHER).
    - Optionally call Python tools on the synthetic data.
    - Use BioGPT to generate the final answer.
    """
    action = decide_appointment_action(user_message)

    # For simplicity we do not implement full multi-turn slot-filling here.
    # We will call very simple tools and pass their results to the LLM.

    tool_context = ""

    if action == AppointmentAction.VIEW:
        # For demo purposes, show all appointments
        lines = []
        for appt in appointments:
            lines.append(
                f"Appointment {appt['appointment_id']}: patient {appt['patient_id']} with {appt['doctor_id']} at {appt['datetime']} ({appt['type']})"
            )
        tool_context = "\n".join(lines) or "No appointments found."

    # NOTE: Booking/rescheduling/cancel flows can be expanded as an exercise.

    prompt = f"""{APPOINTMENT_AGENT_SYSTEM_PROMPT}

User message: "{user_message}"

Relevant appointment system information:
{tool_context or 'No additional information fetched.'}

Write a concise answer to the user, following your responsibilities and limitations.
"""

    return generate_with_biogpt(prompt, max_new_tokens=200)


# Quick demo
example_user_message = "Can you show me my upcoming appointments?"
print(run_appointment_agent(example_user_message))

✅ **Exercise 4A**  
Modify `APPOINTMENT_AGENT_SYSTEM_PROMPT` so that the agent **always** confirms the appointment details (date, time, doctor) before booking a new appointment.

✅ **Exercise 4B**  
Refine `decide_appointment_action` by:
- Adding more keywords.
- Testing it on at least 5 example sentences and checking if the chosen action makes sense.

You can keep your tests in a separate code cell (e.g. a small list of messages you iterate over).

## 5. Safety and Scope Limits

In healthcare, we must be very clear about what the agent is **allowed** to do.

In this notebook, our agents are limited to **administrative** tasks:
- Appointments (booking, viewing, cancelling).
- Simple billing/invoice questions (later).

They must **not**:
- Diagnose conditions.
- Recommend treatments or medication dosages.
- Interpret symptoms or lab results.

We implement a simple text-based safety filter to detect likely clinical questions and block them.

In [None]:
# 5.1 Simple safety filter for clinical questions

CLINICAL_KEYWORDS = [
    "diagnose",
    "diagnosis",
    "should I take",
    "medication",
    "dose",
    "dosage",
    "side effect",
    "is this cancer",
    "treatment",
    "symptom",
]


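def is_clinical_question(text: str) -> bool:
    """Return True if the text contains any clinical keyword (case-insensitive)."""
    lower = text.lower()
    # Lowercase the keywords too; otherwise "should I take" can never match.
    return any(keyword.lower() in lower for keyword in CLINICAL_KEYWORDS)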


def handle_with_safety_filter(user_message: str) -> str:
    """Apply safety filter, then call the appointment agent if safe."""
    if is_clinical_question(user_message):
        return (
            "I am only allowed to help with administrative questions (appointments, billing, etc.). "
            "For medical questions about symptoms, diagnoses, or treatments, please contact a healthcare professional."
        )
    return run_appointment_agent(user_message)


# Quick test
for msg in [
    "Can you book an appointment for me tomorrow?",
    "What medication should I take for high blood pressure?",
]:
    print("\nUser:", msg)
    print("Assistant:", handle_with_safety_filter(msg))

✅ **Exercise 5**  
Extend the safety filter:
- Add at least 3 more keywords/phrases to `CLINICAL_KEYWORDS`.
- Create a list of 6–8 test messages (some clinical, some administrative).
- For each message, print whether `is_clinical_question` returns `True` and whether you agree.

Reflect briefly in Markdown on any **false positives** or **false negatives** you observe.
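One likely source of false positives is plain substring matching, which also fires on words that merely contain a keyword. The sketch below compares substring matching with whole-word matching using the standard `re` module. The keyword list, the function names, and the name "Dr. Dosek" are illustrative assumptions, not part of the notebook's filter.

```python
import re
from typing import List

# Illustrative keyword list (an assumption; not the notebook's CLINICAL_KEYWORDS).
EXAMPLE_KEYWORDS: List[str] = ["dose", "symptom", "should i take"]


def substring_match(text: str, keywords: List[str]) -> bool:
    """Substring matching, as in a simple keyword filter."""
    lower = text.lower()
    return any(keyword in lower for keyword in keywords)


def build_pattern(keywords: List[str]) -> re.Pattern:
    """Compile one regex that matches any keyword as a whole word or phrase."""
    alternatives = "|".join(re.escape(keyword) for keyword in keywords)
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


PATTERN = build_pattern(EXAMPLE_KEYWORDS)


def whole_word_match(text: str) -> bool:
    return PATTERN.search(text) is not None


for message in [
    "Is Dr. Dosek available on Monday?",  # a name that contains "dose"
    "What dose should I take?",
    "I have symptoms since yesterday.",   # plural form of a keyword
]:
    print(message, "| substring:", substring_match(message, EXAMPLE_KEYWORDS),
          "| whole word:", whole_word_match(message))
```

Whole-word matching removes the false positive on the name, but it introduces a false negative for the plural "symptoms", so inflected forms would have to be listed explicitly. Neither variant is a reliable safety mechanism on its own.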

## 6. Using BioBERT for Intent and Routing

So far, our appointment agent used **hand-written rules** (`decide_appointment_action`).

Now we use **BioBERT embeddings** to classify messages into broader intent categories:
- `admin_appointments`
- `billing`
- `intake` (non-diagnostic symptom collection)

This will later allow us to route messages to different specialized agents.
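The routing idea is nearest-label classification: embed the message, compare it with one embedding per label using cosine similarity, cos(a, b) = (a · b) / (‖a‖ ‖b‖), and pick the highest score. The sketch below shows this with hand-made 3-dimensional toy vectors in place of BioBERT embeddings. The vectors, the `min_score` threshold, and the `"unsure"` fallback label are assumptions added for illustration.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # define similarity with a zero vector as 0
    return dot / (norm_a * norm_b)


# Toy stand-ins for label embeddings (real ones would come from an encoder).
TOY_LABEL_VECTORS: Dict[str, List[float]] = {
    "admin_appointments": [1.0, 0.0, 0.0],
    "billing": [0.0, 1.0, 0.0],
    "intake": [0.0, 0.0, 1.0],
}


def route(vec: List[float], min_score: float = 0.5) -> Tuple[str, float]:
    """Return the most similar label, or "unsure" if no label is close enough."""
    scored = [(label, cosine(vec, label_vec))
              for label, label_vec in TOY_LABEL_VECTORS.items()]
    best_label, best_score = max(scored, key=lambda pair: pair[1])
    if best_score < min_score:
        return "unsure", best_score
    return best_label, best_score


print(route([0.9, 0.1, 0.0]))                 # clearly closest to one label
print(route([1.0, 1.0, 1.0], min_score=0.6))  # equally close to all labels
```

A threshold like `min_score` would need calibration on real embeddings, since similarity scores from a general-purpose encoder may sit in a narrow range for all labels.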

In [None]:
# 6.1 Define intent labels and pre-compute embeddings

INTENT_LABELS = {
    "admin_appointments": "Questions about appointments, scheduling, rescheduling, or cancelling visits.",
    "billing": "Questions about invoices, payments, insurance coverage, and costs.",
    "intake": "Questions where patients describe symptoms or reasons for visit, but the agent should only collect information, not diagnose.",
}

intent_label_embeddings = {
    label: get_biobert_embedding(description)
    for label, description in INTENT_LABELS.items()
}

print("Computed BioBERT embeddings for intent labels:")
for label in intent_label_embeddings:
    print("-", label)

In [None]:
# 6.2 Routing function using cosine similarity

from typing import Tuple


def route_intent(user_message: str) -> Tuple[str, float]:
    """Return (best_label, similarity_score) for the user_message."""
    emb = get_biobert_embedding(user_message)
    best_label = None
    best_score = -1.0
    for label, label_emb in intent_label_embeddings.items():
        score = cosine_similarity(emb, label_emb)
        if score > best_score:
            best_score = score
            best_label = label
    return best_label, best_score


# Quick demo
examples = [
    "I need to change my appointment from Monday to Wednesday.",
    "Why did I get two invoices for the same visit?",
    "I have chest pain and shortness of breath, what should I do?",
]

for text in examples:
    label, score = route_intent(text)
    print(f"\nText: {text}\nRouted to: {label} (similarity={score:.3f})")

✅ **Exercise 6**  
- Add at least 3 more example texts to the list in the routing demo.
- For each, check if the chosen label makes sense.
- (Advanced) Add a fourth label, e.g. `"general_info"`, with a suitable description, and update `INTENT_LABELS` and the demo.

Briefly reflect in Markdown on one case where the routing was not ideal and how you might improve it.

## 7. Multi-Agent Orchestration

We now create multiple specialized agents:

- `AdminAgent` – handles appointment-related questions using our tools.
- `BillingAgent` – answers questions about invoices and costs on a small synthetic dataset.
- `IntakeAgent` – collects structured symptom information (without diagnosis).

A **coordinator** will:
1. Apply the safety filter.
2. Use BioBERT-based routing (`route_intent`).
3. Call the appropriate agent.

All agents still use BioGPT for final answer generation.

In [None]:
# 7.1 Synthetic billing data

billing_records = [
    {"invoice_id": "INV001", "patient_id": "P001", "amount": 80.0, "description": "GP consultation", "status": "paid"},
    {"invoice_id": "INV002", "patient_id": "P001", "amount": 40.0, "description": "Lab tests", "status": "open"},
    {"invoice_id": "INV003", "patient_id": "P002", "amount": 120.0, "description": "Specialist visit", "status": "open"},
]


def get_patient_billing_summary(patient_id: str) -> str:
    """Return a human-readable summary of billing records for a patient."""
    records = [r for r in billing_records if r["patient_id"] == patient_id]
    if not records:
        return f"No billing records found for patient {patient_id}."
    lines = []
    for r in records:
        lines.append(
            f"Invoice {r['invoice_id']}: {r['description']} — {r['amount']} EUR (status: {r['status']})"
        )
    return "\n".join(lines)

In [None]:
# 7.2 Specialized agent runners

BILLING_AGENT_SYSTEM_PROMPT = """
You are a billing assistant for a small medical clinic.

- You can see synthetic billing information such as invoice IDs, amounts, and payment status.
- You explain invoices in simple language.
- You cannot change or create real invoices.
- You do not give any clinical or treatment advice.
""".strip()


INTAKE_AGENT_SYSTEM_PROMPT = """
You are an intake assistant.

Your job is to collect structured information about why a patient is seeking care.

- Ask clear follow-up questions about symptoms (onset, duration, severity, triggers).
- Summarize the information in a structured way.
- Do NOT provide diagnoses or treatment recommendations.
- Encourage the patient to discuss their symptoms with a clinician.
""".strip()


def run_admin_agent(user_message: str) -> str:
    """Currently this just calls our appointment agent with safety filter."""
    return handle_with_safety_filter(user_message)


def run_billing_agent(user_message: str, patient_id: str = "P001") -> str:
    """Use synthetic billing data and BioGPT to answer billing questions.

    TODO (Exercise 7A): make this function smarter (e.g., parse invoice IDs from the message).
    """
    billing_info = get_patient_billing_summary(patient_id)

    prompt = f"""{BILLING_AGENT_SYSTEM_PROMPT}

User message: "{user_message}"

Synthetic billing information for patient {patient_id}:
{billing_info}

Explain the situation to the user in simple terms.
"""
    return generate_with_biogpt(prompt, max_new_tokens=200)


def run_intake_agent(user_message: str) -> str:
    """Use BioGPT to collect and summarize symptom information.

    For simplicity, we use single-turn messages here.
    """
    prompt = f"""{INTAKE_AGENT_SYSTEM_PROMPT}

Patient message: "{user_message}"

Respond with a short clarification or summary that could be handed to a clinician.
Remember: do not diagnose or recommend treatments.
"""
    return generate_with_biogpt(prompt, max_new_tokens=250)

In [None]:
# 7.3 Coordinator that routes between agents

def coordinator(user_message: str, patient_id: str = "P001") -> str:
    """Top-level function:
    - Apply safety filter.
    - Route via BioBERT to admin / billing / intake.
    - Call the corresponding agent.
    """
    if is_clinical_question(user_message):
        # Clinical questions are blocked at the top level
        return (
            "I can only help with administrative questions and general intake information. "
            "For medical concerns, please contact a healthcare professional or emergency services if needed."
        )

    label, score = route_intent(user_message)
    debug_info = f"[DEBUG] Routed to: {label} (similarity={score:.3f})\n"

    if label == "admin_appointments":
        answer = run_admin_agent(user_message)
    elif label == "billing":
        answer = run_billing_agent(user_message, patient_id=patient_id)
    elif label == "intake":
        answer = run_intake_agent(user_message)
    else:
        # Fallback to admin agent
        answer = run_admin_agent(user_message)

    return debug_info + answer


# Quick multi-agent demo
messages = [
    "I need to reschedule my check-up with the doctor.",
    "Why do I have two open invoices?",
    "I have had a strong headache and blurred vision for 3 days.",
]

for msg in messages:
    print("\nUser:", msg)
    print("Assistant:\n", coordinator(msg))

✅ **Exercise 7A**  
Improve `run_billing_agent`:
- Try to detect specific invoice IDs mentioned in the user message (e.g. `"INV002"`).
- If an ID is mentioned, focus the explanation on that invoice.
- Otherwise, provide a general overview as in the current version.

✅ **Exercise 7B**  
Modify `coordinator` so that the debug information (routed label and similarity) is only shown when a variable `SHOW_DEBUG = True`.

Test your improved coordinator with at least 5 different user messages.

## 8. Optional: Mini RAG for Admin FAQs (Advanced)

If you have time and interest, you can extend your system with a tiny retrieval component:
- Create a small list of FAQ entries about clinic policies (cancellation, billing, opening hours).
- Embed them with BioBERT.
- For each user question, retrieve the most similar FAQ entries and include them in the prompt to BioGPT.

This is a simple form of **Retrieval-Augmented Generation (RAG)** for administrative knowledge.
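The retrieve-then-prompt pattern can be tried without loading any model. The sketch below swaps the BioBERT embeddings for simple bag-of-words count vectors (a deliberate simplification, named plainly here), ranks a few FAQ sentences by cosine similarity, and builds a prompt from the top hit. The function names and the prompt layout are illustrative assumptions.

```python
import math
import re
from collections import Counter
from typing import List, Tuple

DOCS: List[str] = [
    "You can cancel or reschedule an appointment up to 24 hours before without any fee.",
    "Invoices can be paid by bank transfer or credit card within 30 days.",
    "Our clinic is open from 8:00 to 17:00 on weekdays.",
]


def bow(text: str) -> Counter:
    """Bag-of-words vector: lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_bow(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, top_k: int = 1) -> List[Tuple[str, float]]:
    """Return the top_k documents with their similarity scores, best first."""
    query_vec = bow(query)
    scored = [(doc, cosine_bow(query_vec, bow(doc))) for doc in DOCS]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


def build_prompt(query: str, top_k: int = 1) -> str:
    """Place the retrieved text in front of the question."""
    context = "\n".join(f"- {doc}" for doc, _ in retrieve(query, top_k))
    return f"Policy information:\n{context}\n\nQuestion: {query}"


print(build_prompt("How can I pay my invoices?"))
```

Word-overlap retrieval only works when the question shares vocabulary with the FAQ entry, which is the main reason to use learned embeddings instead.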

In [None]:
# 8.1 (Optional) Tiny FAQ corpus and retrieval

FAQ_ENTRIES = [
    "You can cancel or reschedule an appointment up to 24 hours before without any fee.",
    "If you miss an appointment without notice, a small no-show fee may be charged.",
    "Invoices can be paid by bank transfer or credit card within 30 days.",
    "Our clinic is open from 8:00 to 17:00 on weekdays.",
]

faq_embeddings = [get_biobert_embedding(text) for text in FAQ_ENTRIES]


def retrieve_faqs(query: str, top_k: int = 2) -> list:
    q_emb = get_biobert_embedding(query)
    scores = [cosine_similarity(q_emb, e) for e in faq_embeddings]
    ranked = sorted(zip(FAQ_ENTRIES, scores), key=lambda x: x[1], reverse=True)
    return ranked[:top_k]


def answer_with_faq_rag(user_message: str) -> str:
    retrieved = retrieve_faqs(user_message, top_k=2)
    context_lines = [f"- {text} (score={score:.3f})" for text, score in retrieved]
    context = "\n".join(context_lines)

    prompt = f"""You are an administrative assistant for a clinic.

User question: "{user_message}"

Relevant policy information:
{context}

Use only the information above to answer the user's question. If something is unclear, say so.
"""
    return generate_with_biogpt(prompt, max_new_tokens=200)


# Quick demo
print(answer_with_faq_rag("What happens if I miss my appointment?"))

✅ **Exercise 8 (optional)**  
- Add at least 2 new FAQ entries.
- Try several user questions and look at which FAQs were retrieved.
- Adjust the number of FAQs or examine the similarity scores to improve relevance.

Briefly document one example where RAG clearly improved the answer compared to not using FAQs.

## 9. Evaluation and Reflection

In this final section, you should:
- Test your system with a small set of diverse messages.
- Check that routing and safety behave as expected.
- Reflect on limitations and potential risks.
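Checking behavior by hand gets tedious quickly, so a small harness that compares expected and actual routing decisions can help. The sketch below uses a hypothetical keyword-based `toy_router` as a stand-in for the real system; the `"blocked"` label, the test cases, and the `evaluate` helper are assumptions for illustration only.

```python
from typing import Callable, List, Tuple


def toy_router(message: str) -> str:
    """Hypothetical stand-in for the routing decision of a real coordinator."""
    text = message.lower()
    if "medication" in text or "diagnos" in text:
        return "blocked"
    if "invoice" in text:
        return "billing"
    if "appointment" in text:
        return "admin_appointments"
    return "intake"


# Each case pairs a message with the label we expect.
TEST_CASES: List[Tuple[str, str]] = [
    ("I need to move my appointment to Friday.", "admin_appointments"),
    ("Why did I get two invoices for the same visit?", "billing"),
    ("What medication should I take for high blood pressure?", "blocked"),
    ("I have had a headache for three days.", "intake"),
    ("Can I pay my bill by credit card?", "billing"),
]


def evaluate(
    router: Callable[[str], str], cases: List[Tuple[str, str]]
) -> Tuple[float, List[Tuple[str, str, str]]]:
    """Return (accuracy, failures); each failure is (message, expected, actual)."""
    failures = []
    for message, expected in cases:
        actual = router(message)
        if actual != expected:
            failures.append((message, expected, actual))
    accuracy = (len(cases) - len(failures)) / len(cases)
    return accuracy, failures


accuracy, failures = evaluate(toy_router, TEST_CASES)
print(f"Accuracy: {accuracy:.2f}")
for message, expected, actual in failures:
    print(f"MISROUTED: {message!r} expected={expected} actual={actual}")
```

The deliberately included "bill" message is misrouted by the toy router, which shows how a fixed test list surfaces gaps that a few ad hoc tries might miss.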

✅ **Exercise 9**  
1. Create a list of at least 6 test messages that cover:
   - Admin appointments
   - Billing
   - Intake-like symptom descriptions
   - Clearly clinical questions that should be blocked
2. Run them through `coordinator` and record the outputs.
3. In Markdown, list:
   - 3 potential risks of using such agents in real clinics.
   - 2 ideas to make them safer (technical or organizational).

## 10. Summary and Next Steps

In this notebook you:
- Loaded open-source biomedical models (BioGPT, BioBERT) without any external API keys.
- Built Python "tools" on top of a small synthetic clinic dataset.
- Created an appointment agent that can use tools and respects a clear scope.
- Implemented a simple safety filter to block clinical questions.
- Used BioBERT embeddings to route between multiple specialized agents.
- Optionally, experimented with a tiny RAG setup for administrative FAQs.

**Next steps:**
- Explore the separate notebook on building a small Streamlit UI for your agents.
- Think about how you would log agent decisions and enable human oversight.
- Consider how real-world constraints (regulation, data protection, governance) would further shape such systems.