# Simple Clinical RAG System: A Practical Introduction for Healthcare Professionals

## Introduction

This notebook introduces **Retrieval-Augmented Generation (RAG)**, a technique that helps large language models (LLMs) answer questions more accurately by retrieving relevant passages from specific information sources and supplying them to the model alongside the question.

**Why is this important for healthcare?**
- Grounds responses in the documents you supply, such as verified medical guidelines
- Gives you control over which information sources are used
- Makes it possible to show references/citations for answers
- Reduces, but does not eliminate, hallucinations (made-up information)

We'll build a simple but practical RAG system that can answer questions about clinical guidelines, research papers, or other healthcare documents.


## Setup and Installation

First, let's install the necessary packages:

In [9]:
# Install required packages
!pip install -q langchain langchain-openai langchain-text-splitters langchain-community langgraph faiss-cpu

## 1. Setting up your API keys

To use LLMs for our system, we need to set up API access. Here we'll use OpenAI's models, but you could replace this with any other provider.

In [3]:
import os
import getpass

# Prompt for the API key only if it is not already set in your environment
if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")

In [4]:
# To verify your API key is set properly
if "OPENAI_API_KEY" in os.environ:
    print("API key is set ✓")
else:
    print("API key is not set. Please run the cell above.")

API key is set ✓


## 2. Selecting Our Components

A RAG system has three essential components:
1. A chat model (for generating answers)
2. An embedding model (for converting text to numerical representations)
3. A vector store (for searching and retrieving relevant information)
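
To make components 2 and 3 more concrete, here is a minimal, dependency-free sketch of similarity search. The three-dimensional vectors and the `toy_index` entries are made up for illustration; real embedding models return vectors with hundreds or thousands of dimensions, and a vector store such as FAISS uses optimized index structures rather than this brute-force loop.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy "embeddings" (invented numbers, not real model output)
toy_index = {
    "ACE inhibitors or ARBs preferred in diabetes": [0.9, 0.1, 0.0],
    "Sodium restriction (<2,300 mg/day)": [0.1, 0.9, 0.1],
    "Evaluate monthly until BP is at target": [0.0, 0.2, 0.9],
}

def top_k(query_vector, index, k=2):
    """Return the k texts whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [text for _, text in scored[:k]]

query = [0.8, 0.2, 0.1]  # pretend embedding of a question about medication
results = top_k(query, toy_index, k=2)
print(results)
```

The query vector points in nearly the same direction as the first entry, so that text is ranked first. The `k` argument plays the same role as the `k` passed to the vector store's similarity search later in this notebook.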

In [6]:
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

# Initialize our components
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
embeddings = OpenAIEmbeddings()

## 3. Loading Medical Documents

For this example, let's use a sample clinical guideline. In real applications, you might load actual guidelines, research papers, or hospital protocols.


In [7]:
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.documents import Document

# Sample clinical guideline text (normally you'd load this from a file)
sample_guideline = """
# Hypertension Management Guidelines 2024

## Definition and Classification
Hypertension is defined as systolic blood pressure (SBP) ≥ 130 mm Hg or diastolic blood pressure (DBP) ≥ 80 mm Hg.

Classification:
- Normal: SBP < 120 mm Hg and DBP < 80 mm Hg
- Elevated: SBP 120-129 mm Hg and DBP < 80 mm Hg
- Stage 1: SBP 130-139 mm Hg or DBP 80-89 mm Hg
- Stage 2: SBP ≥ 140 mm Hg or DBP ≥ 90 mm Hg
- Hypertensive Crisis: SBP > 180 mm Hg and/or DBP > 120 mm Hg

## Initial Assessment
- Comprehensive history and physical examination
- Laboratory testing: fasting blood glucose, complete blood count, lipid profile, basic metabolic panel, urinalysis, ECG
- Assess for target organ damage and cardiovascular risk factors
- Screen for secondary causes of hypertension in patients with suggestive symptoms or resistant hypertension

## Treatment Recommendations

### Lifestyle Modifications (for all patients)
- Dietary Approaches to Stop Hypertension (DASH) eating plan
- Sodium restriction (<2,300 mg/day)
- Physical activity (at least 150 minutes of moderate-intensity aerobic activity per week)
- Weight reduction for overweight or obese patients
- Limit alcohol consumption (≤2 drinks/day for men, ≤1 drink/day for women)
- Smoking cessation

### Pharmacological Therapy
- First-line agents include thiazide diuretics, ACE inhibitors, ARBs, and calcium channel blockers
- Initial therapy with two first-line agents from different classes for stage 2 hypertension
- Consider patient's comorbidities when selecting agents:
  - Diabetes mellitus: ACE inhibitors or ARBs preferred
  - Chronic kidney disease: ACE inhibitors or ARBs preferred
  - Heart failure with reduced ejection fraction: ACE inhibitors, ARBs, beta-blockers, mineralocorticoid receptor antagonists, and diuretics
  - Coronary artery disease: Beta-blockers, ACE inhibitors, or ARBs
  - Stroke history: Thiazide diuretics and ACE inhibitors

### Special Populations
- Older adults (≥65 years): Start with lower doses, monitor for orthostatic hypotension
- Pregnancy: Methyldopa, nifedipine, and labetalol are preferred; ACE inhibitors and ARBs are contraindicated
- Children and adolescents: Treatment thresholds and targets depend on age, height, and sex

## Blood Pressure Targets
- General population: <130/80 mm Hg
- Older adults (≥65 years): <130/80 mm Hg if tolerated
- Patients with diabetes or chronic kidney disease: <130/80 mm Hg
- Heart failure with reduced ejection fraction: <130/80 mm Hg

## Follow-up and Monitoring
- Evaluate monthly until BP is at target, then every 3-6 months
- Home BP monitoring is recommended for all patients
- Assess medication adherence at each visit
- Monitor for adverse effects of medications
- Reassess cardiovascular risk regularly

## Treatment-Resistant Hypertension
- Defined as BP above target despite concurrent use of 3 antihypertensive agents of different classes, including a diuretic
- Ensure proper BP measurement technique and medication adherence
- Consider secondary causes
- Consider referral to a hypertension specialist
"""

# Create a Document object
doc = Document(page_content=sample_guideline)

# Split the document into chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,  # Maximum characters per chunk
    chunk_overlap=50,  # Maximum characters shared between neighboring chunks
    add_start_index=True  # Record each chunk's start position in its metadata
)
splits = text_splitter.split_documents([doc])

print(f"Split document into {len(splits)} chunks")

Split document into 9 chunks
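
To see what `chunk_size` and `chunk_overlap` mean, the following simplified sketch splits a string into fixed-size windows that share a few characters. The helper `chunk_with_overlap` is hypothetical and is not how `RecursiveCharacterTextSplitter` works internally: the real splitter tries to break at paragraph and line boundaries first, so its chunks will differ from this fixed-window output.

```python
def chunk_with_overlap(text, chunk_size, chunk_overlap):
    """Split text into fixed-size windows that share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks

demo_text = "Stage 1: SBP 130-139 mm Hg or DBP 80-89 mm Hg"
demo_chunks = chunk_with_overlap(demo_text, chunk_size=20, chunk_overlap=5)
for chunk in demo_chunks:
    print(repr(chunk))
```

The last five characters of each chunk reappear at the start of the next one. This overlap lowers the chance that a sentence cut at a chunk boundary loses its meaning in both halves.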


## 4. Creating Our Vector Database

Now let's index our document chunks so we can retrieve them later:


In [10]:
# Create a vector store with our document chunks
vector_store = FAISS.from_documents(documents=splits, embedding=embeddings)

print("Vector database created successfully!")

Vector database created successfully!


## 5. Building Our RAG Application

Let's create a simple application that:
1. Takes a clinical question
2. Retrieves relevant information from our guidelines
3. Generates an evidence-based answer with citations

In [11]:
from langgraph.graph import START, StateGraph
from typing_extensions import TypedDict, List

# Define our clinical RAG prompt
clinical_rag_prompt = """You are an AI clinical assistant helping healthcare professionals by providing evidence-based information.
Use ONLY the following pieces of retrieved context to answer the question. 
If you don't know the answer based on the provided context, just say that you don't know.
Always cite your sources by mentioning which section of the guidelines you're referencing.
Use clear, clinically appropriate language.

Context:
{context}

Question: {question}

Clinical Response:"""

# Define the state for our application
class State(TypedDict):
    question: str
    context: List[Document]
    answer: str

# Define our application steps
def retrieve(state: State):
    retrieved_docs = vector_store.similarity_search(state["question"], k=3)
    return {"context": retrieved_docs}

def generate(state: State):
    # Combine the text from all retrieved documents
    docs_content = "\n\n".join(doc.page_content for doc in state["context"])
    
    # Create messages from our prompt
    from langchain_core.prompts import PromptTemplate
    prompt = PromptTemplate.from_template(clinical_rag_prompt)
    messages = prompt.invoke({"question": state["question"], "context": docs_content}).to_messages()
    
    # Generate response
    response = llm.invoke(messages)
    return {"answer": response.content}

# Build our RAG application
graph_builder = StateGraph(State)
graph_builder.add_node("retrieve", retrieve)
graph_builder.add_node("generate", generate)
graph_builder.add_edge(START, "retrieve")
graph_builder.add_edge("retrieve", "generate")
graph = graph_builder.compile()
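
It can help to see what the model actually receives in the generate step. The sketch below reproduces the prompt assembly with plain string formatting and no API calls. The shortened `demo_template`, the `build_prompt` helper, and the `retrieved_texts` list are stand-ins invented for illustration; in the notebook, the chunk texts come from the vector store and the full clinical prompt is used.

```python
# Shortened stand-in for the clinical prompt (same two placeholders)
demo_template = """Use ONLY the following context to answer the question.

Context:
{context}

Question: {question}

Clinical Response:"""

def build_prompt(question, chunks):
    """Join retrieved chunk texts and substitute them into the template."""
    context = "\n\n".join(chunks)  # blank line between chunks
    return demo_template.format(context=context, question=question)

# Stand-ins for the text of retrieved chunks
retrieved_texts = [
    "- Diabetes mellitus: ACE inhibitors or ARBs preferred",
    "- General population: <130/80 mm Hg",
]
prompt_text = build_prompt("Which agents are preferred in diabetes?", retrieved_texts)
print(prompt_text)
```

Printing the assembled prompt like this is a quick way to debug a poor answer: if the relevant guideline text is missing from the context, the problem lies in retrieval rather than generation.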

## 6. Testing Our System

Now let's test our clinical RAG system with some healthcare questions:

In [12]:
def ask_medical_question(question):
    response = graph.invoke({"question": question})
    print(f"Question: {question}\n")
    print(f"Answer: {response['answer']}\n")
    print("Sources referenced:")
    for i, doc in enumerate(response["context"], 1):
        print(f"  Source {i}: {doc.page_content[:150]}...\n")

# Test with clinical questions
questions = [
    "What medication is recommended for hypertensive patients with diabetes?",
    "What is the blood pressure target for elderly patients?",
    "When should I refer a patient to a hypertension specialist?"
]

for question in questions:
    ask_medical_question(question)
    print("-" * 80)

Question: What medication is recommended for hypertensive patients with diabetes?

Answer: For hypertensive patients with diabetes, ACE inhibitors or ARBs are preferred as the first-line agents. This recommendation is based on the need to consider the patient's comorbidities when selecting antihypertensive therapy (Pharmacological Therapy section).

Sources referenced:
  Source 1: ### Pharmacological Therapy
- First-line agents include thiazide diuretics, ACE inhibitors, ARBs, and calcium channel blockers
- Initial therapy with ...

  Source 2: ## Initial Assessment
- Comprehensive history and physical examination
- Laboratory testing: fasting blood glucose, complete blood count, lipid profil...

  Source 3: ## Treatment Recommendations

### Lifestyle Modifications (for all patients)
- Dietary Approaches to Stop Hypertension (DASH) eating plan
- Sodium res...

--------------------------------------------------------------------------------
Question: What is the blood pressure target for elderly patients?

[remaining output truncated]

## 7. Extending Your Knowledge Base

In a real-world setting, you might want to load multiple clinical guidelines or research papers. Here's how you could do that:


In [None]:
# This is just example code - don't run this cell unless you have actual files
"""
from langchain_community.document_loaders import PyPDFLoader, TextLoader

# Load clinical guidelines from PDF files
documents = []

# Load a PDF guideline
pdf_loader = PyPDFLoader("diabetes_guidelines_2024.pdf")
documents.extend(pdf_loader.load())

# Load a text guideline
text_loader = TextLoader("heart_failure_protocol.txt")
documents.extend(text_loader.load())

# Split all documents
all_splits = text_splitter.split_documents(documents)
print(f"Loaded and split {len(all_splits)} document chunks")

# Index in vector database
vector_store = FAISS.from_documents(documents=all_splits, embedding=embeddings)
"""

## 8. Conclusion and Next Steps

Congratulations! You've built a basic RAG system for clinical decision support that:
- Takes medical questions in natural language
- Retrieves relevant information from clinical guidelines
- Generates evidence-based answers with citations

**Potential improvements:**
- Add more clinical guidelines and research papers
- Implement advanced retrieval techniques like hybrid search
- Enable conversation history for follow-up questions
- Add the ability to upload documents through a user interface
- Implement fact-checking or validation of the generated answers
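
As one small illustration of the last item, the sketch below flags numbers that appear in a generated answer but nowhere in the retrieved context. The helper `ungrounded_numbers` and the example strings are hypothetical, and this is a crude heuristic: it only compares numeric tokens, so it cannot catch a wrong drug name or a correct number attached to the wrong population.

```python
import re

def ungrounded_numbers(answer, context):
    """Return numbers that appear in the answer but nowhere in the context."""
    number_pattern = r"\d+(?:[.,]\d+)*"  # matches 130, 2,300, 7.5
    context_numbers = set(re.findall(number_pattern, context))
    answer_numbers = re.findall(number_pattern, answer)
    return sorted({n for n in answer_numbers if n not in context_numbers})

demo_context = "General population: <130/80 mm Hg. Sodium restriction (<2,300 mg/day)."
grounded = "The target for the general population is below 130/80 mm Hg."
suspicious = "The target for the general population is below 140/90 mm Hg."

print(ungrounded_numbers(grounded, demo_context))    # []
print(ungrounded_numbers(suspicious, demo_context))  # ['140', '90']
```

A non-empty result does not prove the answer is wrong, but it is a cheap signal that a response deserves human review before anyone relies on it.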


## 9. Ethical Considerations in Healthcare AI

When deploying AI systems in healthcare, consider:
- Patient privacy and data security
- Transparency about AI use with patients and colleagues
- Clinical validation of system outputs
- Clear human oversight and responsibility
- Regular auditing and monitoring of system performance
- Local regulatory compliance

Remember: AI tools should augment, not replace, clinical judgment.