<a href="https://colab.research.google.com/github/IyadSultan/AI_pediatric_oncology/blob/main/05_Healthcare_Triage_Bot_with_LangChain.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Building a Healthcare Triage Bot with LangChain and LangGraph

This tutorial focuses on using LangChain and LangGraph for healthcare applications. We’ll build a mini smart triage chatbot that uses OpenAI’s API (e.g., GPT-4) to assess patient symptoms and guide care. By the end, you’ll know how to set up LangChain, perform retrieval-augmented generation (RAG) over clinical notes, use simple agents with tools, and create branching logic with LangGraph.

## Why LangChain & LangGraph for Healthcare AI?

In healthcare, AI assistants can improve triage accuracy and efficiency, aiding clinicians and patients alike. Some studies report that large language models like GPT-4 can reach diagnostic and triage performance comparable to physicians. However, raw LLMs can hallucinate or give inconsistent answers, which is risky in medicine. LangChain helps by providing tools and abstractions to integrate LLMs with knowledge bases and reasoning steps, making chatbots and retrieval systems easier to build. LangGraph, on the other hand, is a framework for controlling and orchestrating agent workflows. It lets you build complex decision trees (like triage flows) and add reliability features such as human-in-the-loop review and moderation.

Use cases: Imagine a chatbot that can answer patient FAQs by pulling info from medical documents (RAG), or a symptom checker that asks questions and decides “ER vs home care.” These require the LLM to use external knowledge and follow a decision logic:
- Clinical Triage: Identify emergency symptoms vs. routine issues.
- Symptom Checking: Ask follow-up questions and narrow down possibilities.
- Patient FAQ: Retrieve answers from hospital policy documents or medical literature.

LangChain + LangGraph give us the tools to build these safely and effectively by grounding outputs in data and steering the conversation flow.

*Exercise*: Think of a scenario where an AI assistant could help in healthcare (e.g., medication information, scheduling). How might an LLM plus external data improve the outcome? Jot down one idea before moving on.

# Setup and Hello World (Colab Friendly)
We’ll use Google Colab for this workshop so everyone can code along easily. Before coding, ensure you have an OpenAI API key ready (from your OpenAI account).



## Step 1:
Pip installs. In a Colab cell, install the required libraries:

In [None]:
!pip install langchain langgraph openai chromadb sentence-transformers
!pip install langchain_openai langchain_community

**This installs:**

- LangChain (LLM framework)
- LangGraph (agent orchestration)
- OpenAI (to call GPT models)
- ChromaDB and sentence-transformers (for RAG vector database & embeddings)


## Step 2
Setup API Key. Set your OpenAI API key so LangChain can use it. In Colab, you can use getpass to input securely:

In [None]:
import os
import getpass
os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API Key:")

Alternatively, you could directly assign os.environ["OPENAI_API_KEY"] = "sk-..." but avoid hardcoding keys in shared notebooks.
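
If you re-run the notebook often, it can be convenient to prompt only when the key is not already set. The helper below is a minimal sketch of that idea; `ensure_api_key` and its `ask` parameter are hypothetical names introduced here for illustration, not part of LangChain or the OpenAI SDK.

```python
import os
import getpass

def ensure_api_key(name="OPENAI_API_KEY", ask=getpass.getpass):
    """Return the key from the environment, prompting only when it is missing."""
    key = os.environ.get(name, "").strip()
    if not key:
        # `ask` defaults to getpass so the key is not echoed in the notebook
        key = ask(f"Enter your {name}: ").strip()
        if not key:
            raise ValueError(f"{name} was left empty")
        os.environ[name] = key
    return key
```

Calling `ensure_api_key()` once at the top of the notebook would then replace the direct `getpass` assignment.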

## Step 3
Quick “Hello World”. Let’s verify everything works by making a simple completion call using LangChain’s OpenAI wrapper:

In [None]:
from langchain.chat_models import init_chat_model

llm = init_chat_model("gpt-4o-mini", model_provider="openai", temperature=0.7)
response = llm.invoke("Hello, how can AI assist in healthcare?")
# invoke() returns an AIMessage; .content holds the reply text
print(response.content)

Run the cell. You should see a polite response from the model about AI in healthcare. Congratulations! You’ve made your first LangChain LLM call.

*Exercise*: Try changing the prompt to ask a medical question (e.g., “What are symptoms of flu?”) and observe the answer. Also, experiment with the temperature value (e.g., 0 versus 1.0) to see how the reply varies (higher temperature yields more varied, creative outputs).

# 1. LangChain Basics: Prompt Templates & Simple LLM Calls
Now that we have a basic LLM call working, let’s explore LangChain’s features for structuring prompts and managing outputs.

Prompt Templates: In LangChain, you can create templates with placeholders. This is useful in healthcare, for example to enforce a format or inject context.

In [None]:
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

template = "You are a helpful medical assistant. Answer briefly:\nQuestion: {question}\nAnswer:"
prompt = PromptTemplate(input_variables=["question"], template=template)

chain = LLMChain(llm=llm, prompt=prompt)
query = "What is the normal range for adult body temperature?"
# invoke() returns a dict; the generated answer is stored under "text"
print(chain.invoke({"question": query})["text"])


Here we defined a prompt that always prefixes the role (“helpful medical assistant”) and instructs the model to answer briefly. The {question} placeholder gets filled with our query, and the LLMChain sends the formatted prompt to the LLM. LangChain handles the placeholder substitution, so you can focus on content. You could easily swap in another question or change the template to adjust style (e.g., more formal, or with additional context such as patient age).

Controlling Outputs: The temperature parameter controls randomness (we set it to 0.7 above). You can also set max_tokens to limit response length and top_p to control diversity. In critical apps like triage, deterministic and concise outputs are often preferable (e.g., temperature near 0 for consistency).

Also consider tokens and cost: each token (word piece) costs money and takes up space in the context window. Monitoring token usage keeps the bot within the model’s context limit. Chat model responses in LangChain typically carry token usage information (for example in the message’s usage_metadata or response_metadata), or you could attach a callback to log usage. We won’t deep-dive here, but keep an eye on these fields or use LangChain’s debugging tools later.

*Exercise*: Modify the prompt template to include a patient’s name (e.g., “Patient: John Doe. Symptom: headache...”). Run the chain again. See how the answer changes when more context is given. This practice of prompt engineering helps tailor outputs.
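
To see exactly what text reaches the model, it can help to reproduce the template fill in plain Python. This is only a stand-in built on `str.format`, not LangChain’s implementation, and `render_prompt` is a hypothetical helper name.

```python
# Plain-Python stand-in for a prompt template: fill a named placeholder
# in a fixed string. Only str.format is used here.
TEMPLATE = (
    "You are a helpful medical assistant. Answer briefly:\n"
    "Question: {question}\n"
    "Answer:"
)

def render_prompt(question: str) -> str:
    """Return the exact text that would be sent to the model."""
    return TEMPLATE.format(question=question)

print(render_prompt("What is the normal range for adult body temperature?"))
```

Printing the rendered prompt like this is a cheap way to debug a template before spending tokens on a real call.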

# 2. Retrieval-Augmented Generation (RAG) with Clinical Notes
One powerful feature for healthcare apps is Retrieval-Augmented Generation (RAG). This means the LLM can pull in information from a knowledge base (e.g., clinical guidelines, patient records) before answering. RAG helps keep answers factual and up to date, and reduces hallucination.

Scenario: Let’s say we have a short clinical note or a medical guideline that the bot should reference. We’ll simulate a small knowledge base and ask questions.

## Step 1

Prepare Documents. For simplicity, we’ll create a small text about triage guidelines:

In [None]:
from langchain_core.documents import Document

text = """Chest Pain Triage Guidelines:
- If chest pain is severe, sudden, or accompanied by shortness of breath or dizziness, consider it an emergency and seek immediate care.
- If chest pain is mild and brief, and especially if related to breathing or cough, it might be musculoskeletal. Advise routine check-up.
"""
docs = [Document(page_content=text)]


In a real case, you might load PDFs or multiple files using DirectoryLoader or TextLoader. Here we directly create one Document.

## Step 2
Create Vector Store (Embedding + Index). We will embed the document text into a vector space so we can retrieve it by similarity to a query. We use a lightweight embedding model and ChromaDB as vector store:

In [None]:
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

# Use a small, fast embedding model
embedding = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectorstore = Chroma.from_documents(docs, embedding)


This turns our docs into a searchable vector index. In practice, you could use OpenAI embeddings or a larger model for better accuracy, but this is fine for demonstration.
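
To build intuition for “retrieve by similarity”, here is a toy version in pure Python. It assumes a bag-of-words count vector in place of a neural embedding and a linear scan in place of a vector database, so it illustrates the idea only; `embed`, `cosine`, and `retrieve` are hypothetical helper names.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (not a neural model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

chunks = [
    "If chest pain is severe, sudden, or accompanied by shortness of breath or dizziness, seek immediate care.",
    "If chest pain is mild and brief, and especially if related to breathing or cough, advise a routine check-up.",
]
print(retrieve("mild chest pain when I cough", chunks))
```

A real embedding model also matches paraphrases that share no words with the query, which this word-count version cannot do.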

## Step 3
RAG Chain (RetrievalQA). LangChain provides high-level chains to do retrieval + LLM Q&A. We’ll use RetrievalQA to connect our vector store with an LLM:

In [None]:
from langchain.chains import RetrievalQA
from langchain.chat_models import init_chat_model

# Use a slightly higher temperature for variety in explanation
qa_llm = init_chat_model("gpt-4o-mini", model_provider="openai", temperature=0.7)
qa_chain = RetrievalQA.from_chain_type(llm=qa_llm, chain_type="stuff", retriever=vectorstore.as_retriever())

question = "I have mild chest pain when I cough. Should I go to the ER?"
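# invoke() takes the question under "query" and returns the answer under "result"
result = qa_chain.invoke({"query": question})
print(result["result"])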


Here’s what happens under the hood:

- vectorstore.as_retriever() takes our Chroma DB and allows the chain to query it.
- The question is used to fetch relevant doc chunks (our guidelines).
- The LLM sees the retrieved text + question, then generates an answer grounded in the text.

If all went well, the answer should mention that mild pain with cough is likely not an emergency and that a routine check might suffice, in line with the guideline. Notice we used chain_type="stuff", meaning the chain simply “stuffs” all the retrieved text into the prompt. For larger docs, LangChain offers more sophisticated combine strategies (map-reduce, refine), but that’s beyond our scope today.

Why RAG? In healthcare especially, grounding answers in a trusted text is crucial. RAG makes the LLM’s output easier to trace back to a source and less likely to be made up, though it does not guarantee correctness. It’s great for patient FAQs (e.g., “What’s the hospital’s visiting policy?”) where answers must quote official info.
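
The “stuff” strategy mentioned above can be sketched in a few lines. The prompt wording below is an assumption made for illustration and is not the prompt that RetrievalQA actually uses; `build_stuff_prompt` is a hypothetical helper.

```python
def build_stuff_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Concatenate every retrieved chunk into one context block, then append the question."""
    context = "\n\n".join(chunk.strip() for chunk in retrieved_chunks)
    return (
        "Use only the context below to answer. "
        "If the context does not cover the question, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

print(build_stuff_prompt(
    "I have mild chest pain when I cough. Should I go to the ER?",
    ["If chest pain is mild and brief, and especially if related to breathing or cough, "
     "it might be musculoskeletal. Advise routine check-up."],
))
```

Because every chunk goes into a single prompt, this approach only works while the retrieved text fits in the model’s context window.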

*Exercise*: Try adding another bullet in the text (e.g., advice about chest pain that worsens with exercise). Then ask a question related to that. See if the chain retrieves the new info. This shows how updating the knowledge base can instantly change the bot’s answers – a big win for maintainability.

# 3. Simple Agents and Tools (e.g., Medical Calculator)

LangChain Agents allow an LLM to decide if and when to use a tool to help answer a query. Tools can be things like search engines, calculators, or custom functions. For a healthcare bot, tools might include:
- A calculator for medical scores (BMI, medication dose, etc.).
- A database lookup for clinic hours or doctor information.
- Calling an API (e.g., for drug interaction info).

In our workshop, we’ll implement a small custom tool and use an agent to invoke it. Example tool: BMI calculator. If a user asks “What is my BMI if I weigh 70kg and am 1.75m tall?”, the bot could calculate that rather than guess.
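
As a quick sanity check for that example, and assuming the standard definition of BMI as weight in kilograms divided by the square of height in meters, the arithmetic works out as follows:

$$
\mathrm{BMI} = \frac{\text{weight (kg)}}{\text{height (m)}^2} = \frac{70}{1.75^2} = \frac{70}{3.0625} \approx 22.9
$$

So a correct tool call for this question should report a BMI of about 22.9.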



## Step 1

Define the tool function. A tool is essentially a Python function we expose to the agent:

In [None]:
def calc_bmi(query: str) -> str:
    """Calculate BMI given input "weight,height" in kg and meters."""
    try:
        weight, height = query.split(',')
        bmi = float(weight) / (float(height) ** 2)
        return f"Your BMI is {bmi:.1f}"
    except (ValueError, ZeroDivisionError):
        # Wrong number of fields, non-numeric text, or a height of 0
        return "Please provide input as 'weight,height'."

# This function expects a string like "70,1.75" and returns the BMI.

## Step 2
Create the Tool object and initialize agent. LangChain has a Tool class to wrap the function with a name and description:

In [None]:
from langchain.agents import Tool, initialize_agent, AgentType

bmi_tool = Tool(
    name="BMI Calculator",
    func=calc_bmi,
    description="Calculates BMI from input 'weight,height'. Example input: '70,1.75'."
)
tools = [bmi_tool]

agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)


We set AgentType.ZERO_SHOT_REACT_DESCRIPTION, which means the agent will use the ReAct framework (Reason + Act) with the tool’s description to decide when to use it. The verbose=True will show the agent’s thought process (which is super insightful during development!).

## Step 3

Query the agent. Now ask a question that requires the tool:

In [None]:
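# invoke() takes the question under "input" and returns the answer under "output"
response = agent.invoke({"input": "If I weigh 70 kg and my height is 1.75 m, what is my BMI?"})
print(response["output"])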


Watch the logs (thanks to verbose=True). You’ll see something like the agent thinking “The user is asking for BMI, I have a tool for that. Let me parse inputs and call the tool…”. It should then output the numeric BMI.

*How it works*: The LLM, guided by LangChain’s agent prompt, will decide to use BMI Calculator tool by outputting an “action” (which LangChain then executes by calling calc_bmi), then resume answering with the tool result. This showcases an AI augmented with a capability – vital for healthcare where calculations and database lookups are common.
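
The action loop described above can be imitated without any API call. The sketch below is a simplified illustration, not LangChain’s agent implementation: `scripted_llm` is a hypothetical stand-in that always requests the tool first, and the `Action:` / `Action Input:` text format is assumed for the example.

```python
import re

def calc_bmi(query: str) -> str:
    weight, height = query.split(",")
    return f"Your BMI is {float(weight) / float(height) ** 2:.1f}"

TOOLS = {"BMI Calculator": calc_bmi}

def scripted_llm(transcript: str) -> str:
    """Hypothetical stand-in for the model: request the tool once, then answer."""
    if "Observation:" not in transcript:
        return "Thought: I should compute this.\nAction: BMI Calculator\nAction Input: 70,1.75"
    observation = transcript.rsplit("Observation:", 1)[1].strip()
    return f"Final Answer: {observation}"

def run_agent(question: str, llm=scripted_llm, max_steps: int = 3) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        reply = llm(transcript)
        if reply.startswith("Final Answer:"):
            return reply[len("Final Answer:"):].strip()
        # Parse the requested action and its input out of the model's text
        match = re.search(r"Action: (.+)\nAction Input: (.+)", reply)
        if match is None or match.group(1).strip() not in TOOLS:
            return "Could not parse a valid tool call from the model's reply."
        observation = TOOLS[match.group(1).strip()](match.group(2).strip())
        # Feed the tool result back so the next model call can use it
        transcript += f"\n{reply}\nObservation: {observation}"
    return "Stopped: step limit reached."

print(run_agent("If I weigh 70 kg and my height is 1.75 m, what is my BMI?"))
```

The step limit matters in practice: without it, a model that never produces a final answer would loop forever.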

*Exercise*: Try tricking the agent: ask it a medical question that doesn’t need the BMI tool (like “What’s the normal heart rate for adults?”). It should answer from its own knowledge without calling the tool. Then ask for BMI as above, and it should use the tool. This flexibility is what makes agents powerful.

# 4. LangGraph Basics: Creating Branching Logic

We’ve seen how **LangChain** handles single-turn or sequential tool use, but what if we want a structured multi-step conversation with branching paths?  
This is where **LangGraph** comes in.

LangGraph allows you to define a graph of nodes (each node can be an LLM call or function) and edges (transitions) explicitly, giving you fine control over dialogues and workflows.

[Read more at langchain-ai.github.io](https://langchain-ai.github.io)

## Key Concepts in LangGraph:

- **State**: Carries the conversation or data. Often a list of messages or a custom object.
- **Node**: A step in the workflow (could prompt the LLM, call a tool, etc.).
- **Edge**: A transition from one node to the next. Edges can be conditional (like `if severity == "ER"` go to ER node).
- **Graph**: The overall flowchart connecting nodes.

LangChain integrates well with LangGraph – you can still use LangChain LLMs or tools inside LangGraph nodes.
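
Before turning to the real library, the four concepts can be mimicked with plain dictionaries and functions. This is a conceptual sketch only and does not use the LangGraph API; the keyword rule in `assess` is a hypothetical placeholder for an LLM call.

```python
# State is a plain dict; each node returns the updates to merge into it.
def assess(state):
    urgent = any(word in state["symptoms"].lower() for word in ("chest pain", "breath"))
    return {"severity": "ER" if urgent else "ROUTINE"}

def advise_er(state):
    return {"advice": "Please seek emergency care."}

def advise_routine(state):
    return {"advice": "Schedule a routine appointment."}

NODES = {"assess": assess, "advise_er": advise_er, "advise_routine": advise_routine}

# Edges: a function is a conditional edge that inspects the state and
# returns the next node's name; None ends the run.
EDGES = {
    "assess": lambda s: "advise_er" if s["severity"] == "ER" else "advise_routine",
    "advise_er": None,
    "advise_routine": None,
}

def run(start, state):
    """Walk the graph from `start`, merging each node's updates into the state."""
    current = start
    while current is not None:
        state = {**state, **NODES[current](state)}
        edge = EDGES[current]
        current = edge(state) if callable(edge) else edge
    return state

print(run("assess", {"symptoms": "Sudden chest pain"}))
```

LangGraph adds what this toy lacks, such as typed state, message merging, and hooks for pausing on human review.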

---

## Example: Simple Graph for Triage

1. **Ask Symptoms Node**:  
   Bot asks, “What symptoms are you experiencing?”

2. **Assess Severity Node (LLM)**:  
   Takes the user’s symptom description and uses GPT-4 to classify severity (e.g., “ER” vs “ROUTINE”).

3. **Branch**:  
   - If "ER", go to **ER Advice Node**.
   - If "ROUTINE", go to **Routine Advice Node**.

4. **ER Advice Node**:  
   Responds with:  
   _"Your symptoms may be serious. Please visit the ER..."_

5. **Routine Advice Node**:  
   Responds with:  
   _"It looks like this can be managed with a regular doctor visit..."_

---

The cell below sketches this graph with **LangGraph’s Python API**. It is deliberately minimal (canned advice messages and a single classification call), so treat it as a starting point rather than a finished bot.


In [None]:
from langgraph.graph import StateGraph, START, END
from langchain_core.messages import HumanMessage, AIMessage, SystemMessage
from langchain_openai import ChatOpenAI
from typing import List, Union, Annotated
from typing_extensions import TypedDict
from langgraph.graph.message import add_messages

# Define our state structure
class State(TypedDict):
    messages: Annotated[List[Union[HumanMessage, AIMessage, SystemMessage]], add_messages]

# 1. Initialize graph
graph_builder = StateGraph(State)

# 2. Define nodes:
def ask_symptoms(state):
    return {"messages": [AIMessage(content="What symptoms are you experiencing?")]}

llm = ChatOpenAI(model="gpt-4", temperature=0)

def assess_severity(state):
    # Get the last user message
    user_msg = next((msg.content for msg in reversed(state["messages"])
                     if isinstance(msg, HumanMessage)), "")

    prompt = (
        "Classify the severity of the patient's symptoms. "
        "Respond with exactly one word: ER or ROUTINE.\n"
        f"Symptoms: {user_msg}"
    )
    response = llm.invoke([HumanMessage(content=prompt)])
    return {"messages": [response]}

def advise_er(state):
    return {"messages": [AIMessage(content="Your symptoms might be serious. Please seek emergency care.")]}

def advise_routine(state):
    return {"messages": [AIMessage(content="Your symptoms seem manageable. Schedule a routine appointment.")]}

# 3. Add nodes to the graph
graph_builder.add_node("ask_symptoms", ask_symptoms)
graph_builder.add_node("assess_severity", assess_severity)
graph_builder.add_node("advise_er", advise_er)
graph_builder.add_node("advise_routine", advise_routine)

# 4. Define edges (the flow)
graph_builder.add_edge(START, "ask_symptoms")
graph_builder.add_edge("ask_symptoms", "assess_severity")

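# Conditional branch based on LLM output in state
import re

def route_severity(state):
    # Get the last assistant message (the classification from assess_severity)
    ai_message = next((msg.content for msg in reversed(state["messages"])
                      if isinstance(msg, AIMessage)), "")

    # Compare whole words: a substring test for "ER" would also match
    # words such as "SEVERITY" or "OTHER"
    words = re.findall(r"[A-Z]+", ai_message.upper())
    if "ER" in words:
        return "advise_er"
    else:
        return "advise_routine"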

graph_builder.add_conditional_edges(
    "assess_severity",
    route_severity,
    {
        "advise_er": "advise_er",
        "advise_routine": "advise_routine"
    }
)

graph_builder.add_edge("advise_er", END)
graph_builder.add_edge("advise_routine", END)

# 5. Compile the graph
graph = graph_builder.compile()

# Create initial state with user's message
initial_state = {"messages": [HumanMessage(content="I have sudden chest pain and trouble breathing.")]}

# Run the graph
final_state = graph.invoke(initial_state)

# Print all messages in the conversation
for msg in final_state["messages"]:
    if isinstance(msg, HumanMessage):
        print(f"User: {msg.content}")
    elif isinstance(msg, AIMessage):
        print(f"Assistant: {msg.content}")


# Conversation Structure and LangGraph Overview

Don’t worry if not every line is clear – the idea is to see how we lay out the conversation structure explicitly.

In step 2 of the code (“Define nodes”), we created an LLM node that receives the user’s symptoms and produces a severity classification. We then branch to different advice nodes depending on that result.

LangGraph’s power is in this orchestration:  
It ensures the flow is followed reliably (and can even enforce human review at certain nodes if needed).

In fact, LangGraph is designed for **controllability** and **extensibility** in complex agent systems ([langchain-ai.github.io](https://langchain-ai.github.io)), which is perfect for scenarios like healthcare where you need predictable behavior and sometimes oversight.

---

### Example

If we ran the above graph with the input symptoms, **GPT-4** might classify it as “ER” (since chest pain + shortness of breath is serious) and the user would get the ER advice.

For a milder symptom input, it might go to **routine advice**.

---

### Visualization

LangGraph can also visualize graphs (with **Mermaid charts**) – handy for understanding or debugging complex flows.

In a Jupyter environment, one could do:

```python
print(graph.get_graph().draw_mermaid())
```

to print the Mermaid source of the flowchart, which can be pasted into any Mermaid renderer.

Rendering it as an image inside Colab may need extra steps, so we skip it here.

---

### Exercise

Consider extending this flow:
- What if the bot should ask one more question for moderate cases?
- How would you insert another node and adjust conditions?

Sketch a quick extension on paper or pseudocode.  
This practice helps solidify how branching logic can be built.

---


# 5. Mini-Project: Smart Healthcare Triage Bot (LangGraph + LangChain)


Now for the fun part – bringing it all together.  
In this mini-project, we’ll outline how to build a **"smart healthcare triage bot"** using everything we’ve learned:

- **LangChain** for LLM and tools.
- **LangGraph** for conversation flow.
- **OpenAI’s GPT-4** as our medical brain (you can use **gpt-3.5-turbo** if GPT-4 access is limited, though results might be less accurate).

---

## Workflow Recap

The bot will:

1. **Ask** the patient to describe their symptoms.
2. **Use the LLM** to evaluate severity (maybe even list possible conditions, but primarily decide urgency).
3. **Branch into one of two paths**:
    - **Emergency Path**: If severe, advise ER or urgent care.
    - **Routine Care Path**: If not severe, give home care tips or suggest a normal doctor visit.
4. **(Optionally)** Follow up or ask if they need anything else.

---

## Good Practices to Implement

- **Deterministic Output for Decision**:  
  Prompt GPT-4 to respond with a **structured output** (like exactly the word `ER` or `ROUTINE`) to reliably trigger the correct branch.  
  > This is a form of **prompt-based guardrail**.

- **Temperature Control**:  
  Keep **temperature low** for the decision node to avoid randomness.  
  For explanatory nodes, you could allow a bit more creativity — but still stay cautious.

- **Token Management**:  
  Our prompts are short and we have few turns, so we’re safe.  
  In longer chats, use **LangChain’s memory** or **truncation** to stay within limits.

- **Basic Guardrails**:  
  Instruct the LLM clearly:
  - Only make the **ER vs ROUTINE** call.
  - If unsure or unsafe content is detected, **default to advising an actual doctor** (simple safety net).
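
The deterministic-output and guardrail practices above can be combined in a small parser that sits between the LLM and the branch. This is a minimal sketch: `parse_severity` is a hypothetical helper, and sending anything unclear to the ER path is an assumption chosen as the safer default.

```python
def parse_severity(raw_output: str) -> str:
    """Map raw model text to 'ER' or 'ROUTINE', defaulting to 'ER' when unclear."""
    # Tolerate surrounding whitespace, quotes, and a trailing period
    label = raw_output.strip().strip("\"'.").upper()
    # Only an exact ROUTINE answer takes the routine branch
    return "ROUTINE" if label == "ROUTINE" else "ER"

for sample in ["ROUTINE", ' "routine". ', "ER", "I think this is moderate", ""]:
    print(repr(sample), "->", parse_severity(sample))
```

With this in place, a chatty or malformed model reply can never silently fall through to the routine path.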

---

## Coding the Bot

We essentially covered this in **Section 4**.  
In practice, you would refine the prompt inside the `assess_severity` node to ensure clarity.

For example:

```plaintext
If chest pain, difficulty breathing, serious injury, or other major symptoms are described, classify as "ER".
Otherwise, classify as "ROUTINE".
If unsure, classify as "ER".
Respond ONLY with "ER" or "ROUTINE" — no extra text.
```

---



## Next Step After Classification

After getting `"ER"` or `"ROUTINE"`, the next node might actually be **another LLM call** to explain the advice **more verbosely** to the user.

For example:
- The `advise_er` node could itself call the LLM with a prompt like:

```plaintext
Explain to the patient kindly that their symptoms need emergency care and what they should do now.
```

This way:
- **GPT-4** can provide a **detailed** and **kind** response.
- And because we only enter this node after a known "ER" classification, it stays **on track**.

---

## Running the Bot

In a real app:
- You would **run the compiled graph in a loop** interacting with a **user interface** (or chat interface).

For this workshop:
- We **simulate** by providing the user’s input directly.

You can try different symptom inputs to see how it routes:

### Test 1

```python
symptoms = "I have a mild headache and a runny nose."
```
- Likely **ROUTINE** path with gentle home care advice.

### Test 2

```python
symptoms = "Sudden numbness on one side of face and confusion."
```
- Likely **ER** path because those sound like **stroke symptoms**.
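
The two test inputs above can be turned into a small regression check that runs without API calls. In this sketch, `keyword_classifier` is a hypothetical offline stand-in for the LLM classifier, and its red-flag list is illustrative, not clinical guidance.

```python
TEST_CASES = [
    ("I have a mild headache and a runny nose.", "ROUTINE"),
    ("Sudden numbness on one side of face and confusion.", "ER"),
]

def keyword_classifier(symptoms: str) -> str:
    """Hypothetical offline stand-in for the LLM classifier."""
    red_flags = ("numbness", "confusion", "chest pain", "trouble breathing")
    return "ER" if any(flag in symptoms.lower() for flag in red_flags) else "ROUTINE"

def check_routing(classify, cases=TEST_CASES):
    """Return (text, expected, actual) for every case the classifier gets wrong."""
    return [(text, expected, classify(text))
            for text, expected in cases
            if classify(text) != expected]

# An empty list means every case was routed as expected
print(check_routing(keyword_classifier))
```

To test the real bot, you could pass a function that calls your LLM classifier in place of `keyword_classifier` and extend `TEST_CASES` as you find new edge cases.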

---

## Exercise

Discuss with a partner or reflect:

- What other **branches** could be useful in a triage bot?
- Maybe a **third branch** for:
  - **Advice to call a nurse line**, or
  - **Integrating a FAQ retrieval** if the user asks an unrelated question.

**Consider:**  
How might you integrate a **retrieval tool** into the graph?  
*(Hint: LangChain’s vector search as a node.)*

---



# 6. Best Practices & Next Steps

Before we conclude, let’s summarize some best practices when building LLM-powered healthcare apps:


- **Prompt Engineering for Clarity**:  
  Be explicit in prompts when you need structured outputs or certain behaviors.  
  Our severity prompt is an example of guiding the model firmly.

- **Keep Temperature Low for Critical Decisions**:  
  A **temperature of 0** (or very low) is ideal when the bot is making a **classification** or any **important determination** — it makes the output **more predictable**.  
  Save creativity for less critical parts (even then, healthcare usually needs a **factual tone**).

- **Monitor Tokens and Responses**:  
  Always track:
  - Length of user input
  - Length of LLM output  

  In **LangChain**, you might:
  - Use `ConversationTokenBufferMemory`, which takes a `max_token_limit`
  - Manually trim old messages

  Also, leverage OpenAI’s token usage info to log and monitor conversation costs.

- **Add Guardrails**:  
  We added simple guardrails (like **default to ER if uncertain**).  
  In production, add more:
  - Use OpenAI’s **content moderation API** or simple keyword checks.
  - Possibly insert **human review** for edge cases.  
  > *LangGraph is built with human handoff in mind, allowing a human node or approval step* ([source](https://langchain-ai.github.io)).

- **Continuous Evaluation**:  
  Test your bot with **many scenarios**, across patient demographics, to check for **fairness**.  
  > *Example: A UCLA study reportedly found no triage bias by race in GPT-4, but we must stay vigilant.*

- **Iterate and Update**:  
  **Medical guidelines** and **LLMs** both evolve.  
  Using **RAG**, updating info is easy: just update your documents.  
  If needed, retrain or prompt-tune the model.
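
The token-monitoring practice in the list above can be sketched as manual trimming of old messages. The 4-characters-per-token estimate is a rough assumption for English text, not an exact tokenizer, and both function names are hypothetical.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic (assumption): about 4 characters per token for English text."""
    return max(1, len(text) // 4)

def trim_history(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the most recent messages whose estimated total fits the budget."""
    kept, used = [], 0
    # Walk backwards so the newest messages are kept first
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    return list(reversed(kept))

history = ["a" * 40, "b" * 40, "c" * 40]  # about 10 estimated tokens each
print(len(trim_history(history, max_tokens=20)))
```

For exact counts you would swap `estimate_tokens` for a real tokenizer, while the trimming loop stays the same.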

---

# Exercise

As a closing thought experiment:

- **Design a small improvement** to the bot.
- Example:  
  - If **ER**, after giving advice, the bot asks:  
    *“Do you want me to call 911 for you?”*
  - How might you implement this?
    - Add another node
    - Use a **Twilio API** tool call

> This is just brainstorming how you can **extend the graph** further — possibilities are endless!

---

# Conclusion

In this hands-on session, we:

- Motivated why **frameworks** like **LangChain** and **LangGraph** are valuable for healthcare AI (combining LLM smarts with tools and control).
- Set up a **Colab environment** for rapid development.
- Explored **LangChain basics** with prompt templates and LLM calls.
- Implemented **Retrieval-Augmented Generation (RAG)** to ground answers in clinical documents.
- Used **LangChain Agents** to give our bot **tool-using abilities** (e.g., BMI calculator).
- Learned the fundamentals of **LangGraph** to create **branching conversational flows**.
- Built a **mini triage bot** that routes patients toward the right level of care, showing how **GPT-4 + careful design** can support triage.

---

By pacing these steps over an hour with interactive coding and exercises, you should now have a **solid foundation** to start building your own **healthcare AI applications**.

**The combo of LangChain + LangGraph** gives you both:

- **High-level convenience**
- **Low-level control**

A **powerful mix** for sensitive domains like healthcare.

---

## Next Steps

- Continue experimenting!
- Ideas:
  - Hook up a **larger medical knowledge base**.
  - Integrate **speech-to-text** for **voice-based triage**.
  - Deploy your bot on a **simple web app**.

The skills you learned — **prompt design**, **RAG**, **agents**, **graphs** — will apply to **many projects beyond healthcare** too.

---

# Thank You! 🎉

Feel free to share your mini-projects or ask questions.  
Together, let’s build **AI that truly helps people**.
