<hr style="border:30px solid Firebrick "> </hr>
<hr style="border:2px solid Firebrick "> </hr>

# Agentic Workflow Automation for Northwestern Memorial Hospital
**Author:** Atef Bader, PhD

**Last Edit:** 12/17/2024



## Goals

- Automate call/inquiry processing using LangGraph/LangChain with OpenAI
- Use OpenAI to route and answer users' questions, with each Northwestern Memorial Hospital department represented by its own agent

<hr style="border:2px solid Firebrick "> </hr>


<img src="attachment:6925f10a-1fae-4385-a348-d427e8a93cf0.png" align="center" width="500"/>


<hr style="border:5px solid orange "> </hr>


In [None]:
'''%%capture --no-stderr
%pip install uv
%uv pip install chromadb==0.4.22
%uv pip install tiktoken==0.9.0
%uv pip install langchain==0.3.20
%uv pip install langchain-community==0.3.10
%uv pip install langchain-openai==0.3.1
%uv pip install langchainhub
%uv pip install langchain-text-splitters==0.3.6
%uv pip install langgraph==0.3.1
%uv pip install openai==1.65.3
%uv pip install PyMuPDF==1.25.3
%uv pip install pypdf==5.3.1
%uv pip install pillow==11.1.0
%uv pip install beautifulsoup4==4.13.3
%uv pip install  mermaid_cli
%uv pip install grandalf'''
    
  

In [None]:
from IPython.display import Image, display

from typing import Optional, List, Dict, Any, Annotated, Tuple, Literal
from typing_extensions import TypedDict
import operator

from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import MessagesState
from langgraph.graph import StateGraph, START, END

from langgraph.prebuilt import tools_condition, ToolNode
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage, BaseMessage
from langchain_openai import ChatOpenAI

from langgraph.graph.message import add_messages

### NEW
import os
from pathlib import Path
from dotenv import load_dotenv, find_dotenv
import csv
import re
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
import json
from langchain_core.documents import Document

load_dotenv(find_dotenv())

LANGCHAIN_API_KEY = os.getenv("LANGCHAIN_API_KEY")
LANGCHAIN_PROJECT = os.getenv("LANGCHAIN_PROJECT")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
LANGCHAIN_TRACING_V2 = os.getenv("LANGCHAIN_TRACING_V2") == "true"

INPUT_DIR = Path.cwd() / "Input"
OUTPUT_DIR = Path.cwd() / "Output"

print("LANGCHAIN_PROJECT:", LANGCHAIN_PROJECT)
print("LANGCHAIN_TRACING_V2:", LANGCHAIN_TRACING_V2)

# Declare state dictionary structure

In [None]:
# Requirement 1: Define the structure of agent state for the LangGraph
class InquiryState(TypedDict):
    messages: Annotated[List[BaseMessage], add_messages]
    inquiry: str
    next_node: str
    response: Optional[str]

    # routing (NEW)
    intent: Optional[str]

    # retrieval payload (NEW)
    retrieved: Optional[List[Dict[str, Any]]]  # JSON-serializable so easy to log/debug. [{id, text, score, meta}, ...]
    retrieval_confidence: Optional[float]


def _latest_user_inquiry(state: InquiryState) -> str:
    msgs = state.get("messages") or []
    for msg in reversed(msgs):
        if isinstance(msg, HumanMessage):
            if isinstance(msg.content, str):
                return msg.content
            if isinstance(msg.content, list):
                text_parts = [part.get("text", "") for part in msg.content if isinstance(part, dict) and part.get("type") == "text"]
                return " ".join(text_parts).strip()
    return state.get("inquiry", "")

# Creating or loading knowledge base stores
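
The loader in the next cell reads one JSON object per line, with `id`, `question`, `answer`, and an optional `tags` field. A minimal sketch of that file format follows. Only the field names come from the loader; the sample row borrows the stress-test question from the inline cardiology knowledge base, and the ID `cardio-001`, the tag values, and the temporary file path are illustrative assumptions.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical sample rows; only the field names match what the loader expects.
rows = [
    {
        "id": "cardio-001",
        "question": "How should I prepare for a stress test?",
        "answer": "Wear comfortable clothes and walking shoes. Avoid caffeine and heavy meals before the test.",
        "tags": ["stress-test", "preparation"],
    },
]

# Write one JSON object per line (JSONL) to a temporary location.
kb_path = Path(tempfile.mkdtemp()) / "sample_kb.jsonl"
kb_path.write_text("\n".join(json.dumps(r) for r in rows) + "\n", encoding="utf-8")

# Read it back the way the loader does: skip blank lines, parse each remaining line.
parsed = [
    json.loads(line)
    for line in kb_path.read_text(encoding="utf-8").splitlines()
    if line.strip()
]

# The loader combines question and answer into a single text block per document.
doc_text = f"Q: {parsed[0]['question']}\nA: {parsed[0]['answer']}"
print(doc_text)
```

Keeping the question inside the embedded text means a user query phrased like the FAQ question can match it directly.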

In [None]:
def load_kb_jsonl(path: str, agent_name: str) -> list[Document]:
    docs: list[Document] = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        row = json.loads(line)
        doc_text = f"Q: {row['question']}\nA: {row['answer']}"
        docs.append(
            Document(
                page_content=doc_text,
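                metadata={
                    "id": row["id"],
                    "agent": agent_name,
                    # Chroma (as pinned above) only accepts scalar metadata values, so store tags as one string.
                    "tags": ", ".join(map(str, row.get("tags", []))),
                    "question": row["question"],
                },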
            )
        )
    return docs

# Creates the collection and persists it to disk (persist_directory); later runs can just load it.
# If the knowledge base content changes, delete the ./chroma_kb folder and rebuild to avoid keeping stale embeddings.

def build_or_load_chroma_collection(
    collection_name: str,
    persist_directory: str,
    documents: list[Document] | None = None,
):
    embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

    store = Chroma(
        collection_name=collection_name,
        persist_directory=persist_directory,
        embedding_function=embeddings,
    )

    # If documents were provided, add them. Passing stable IDs makes re-running this cell
    # update existing entries instead of inserting duplicates.
    # To drop entries that no longer exist in the source file, delete the persist directory and rebuild.
    if documents:
        store.add_documents(documents, ids=[str(d.metadata["id"]) for d in documents])

    return store

In [None]:
CARDIOLOGY_KB_PATH = "./kb/cardiology_kb.jsonl"
CHROMA_DIR = "./chroma_kb"

cardio_docs = load_kb_jsonl(CARDIOLOGY_KB_PATH, agent_name="Cardiology")
cardiology_store = build_or_load_chroma_collection(
    collection_name="kb_cardiology",
    persist_directory=CHROMA_DIR,
    documents=cardio_docs,
)

In [None]:

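# Load-only path: reopen the persisted collection without adding documents again.
# Use this cell instead of the previous one once ./chroma_kb has been built.
embeddings_fn = OpenAIEmbeddings(model="text-embedding-3-small")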

cardiology_store = Chroma(
    collection_name="kb_cardiology",
    persist_directory="./chroma_kb",
    embedding_function=embeddings_fn,
)



In [None]:
# Keyword overlap scorer

def _tokenize(s: str) -> set[str]:
    s = s.lower()
    s = re.sub(r"[^a-z0-9\s]", " ", s)
    return set(t for t in s.split() if len(t) > 2)

def keyword_overlap_score(query: str, text: str) -> float:
    q = _tokenize(query)
    if not q:
        return 0.0
    d = _tokenize(text)
    return len(q & d) / len(q)

### Hybrid retrieval function
The function:
1. Fetches the top `k_vec` candidates from Chroma.
2. Rescores each candidate with a weighted mix of:
   - a vector rank-based score (simple, and stable across distance metrics)
   - a keyword overlap score
3. Returns the top `k_final` candidates plus a confidence value.
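
To make the rescoring concrete, here is a small worked sketch that needs no vector store. It assumes the candidates are already in vector-search order (best first) and blends a rank score with keyword overlap using the same weighting idea. The names `tokenize`, `keyword_overlap`, and `blended_scores`, and the three sample texts, are illustrative only.

```python
import re

def tokenize(s: str) -> set[str]:
    # Lowercase, replace punctuation with spaces, keep tokens longer than two characters.
    s = re.sub(r"[^a-z0-9\s]", " ", s.lower())
    return {t for t in s.split() if len(t) > 2}

def keyword_overlap(query: str, text: str) -> float:
    # Fraction of query tokens that also appear in the text.
    q = tokenize(query)
    return len(q & tokenize(text)) / len(q) if q else 0.0

def blended_scores(query: str, ranked_texts: list[str], alpha: float = 0.75) -> list[float]:
    # ranked_texts is assumed to be in vector-search order, best first.
    denom = max(1, len(ranked_texts) - 1)
    return [
        alpha * (1.0 - rank / denom) + (1 - alpha) * keyword_overlap(query, text)
        for rank, text in enumerate(ranked_texts)
    ]

query = "prepare for stress test"
ranked = [
    "Q: Do you offer heart screenings?",           # rank 0, no shared keywords
    "Q: How should I prepare for a stress test?",  # rank 1, every query keyword present
    "Q: What tests are done during a check-up?",   # rank 2, "tests" does not match "test"
]
scores = blended_scores(query, ranked)
print([round(s, 3) for s in scores])
```

With only three candidates the rank step is large (1.0, 0.5, 0.0), so a full keyword match at rank 1 still scores below rank 0. With more candidates the rank steps shrink and keyword overlap has more room to reorder results.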

In [None]:
def hybrid_retrieve(
    query: str,
    store: Chroma,
    k_vec: int = 8,
    k_final: int = 3,
    alpha: float = 0.75,   # weight on vector ranking
) -> Tuple[List[Dict[str, Any]], float]:
    """
    Returns: (retrieved_items, confidence)
    retrieved_items: [{id, text, score, meta}, ...]
    confidence: float in [0, 1] (rough heuristic)
    """
    docs = store.similarity_search(query, k=k_vec)

    # Convert to serializable items
    items = []
    for i, d in enumerate(docs):
        text = d.page_content
        meta = d.metadata or {}
        items.append({
            "id": meta.get("id", f"doc_{i}"),
            "text": text,
            "meta": meta,
            "vec_rank": i,  # 0 best
        })

    if not items:
        return [], 0.0

    # Vector rank score: best doc ~1.0, worst ~0.0
    denom = max(1, (len(items) - 1))
    for it in items:
        vec_score = 1.0 - (it["vec_rank"] / denom)
        kw_score = keyword_overlap_score(query, it["text"])
        it["score"] = alpha * vec_score + (1 - alpha) * kw_score

    items.sort(key=lambda x: x["score"], reverse=True)
    top = items[:k_final]

    # Confidence heuristic: best score + gap to 2nd
    best = top[0]["score"]
    second = top[1]["score"] if len(top) > 1 else 0.0
    confidence = max(0.0, min(1.0, best * 0.85 + (best - second) * 0.15))

    return top, confidence

### Cardiology retrieval + answer + guardrail
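
A rough check of the gate arithmetic, assuming the defaults used in this notebook (`alpha=0.75` in `hybrid_retrieve` and a threshold of 0.62 in the next cell). The top vector hit always receives a rank score of 1.0, so the best blended score is at least `alpha`. The sketch below computes the resulting worst-case confidence. The helper name `confidence` is introduced here only for illustration.

```python
def confidence(best: float, second: float) -> float:
    # Same heuristic as hybrid_retrieve: mostly the best score,
    # plus a small bonus for the gap to the runner-up, clamped to [0, 1].
    return max(0.0, min(1.0, best * 0.85 + (best - second) * 0.15))

ALPHA = 0.75      # default vector weight in hybrid_retrieve
THRESHOLD = 0.62  # value of CARDIOLOGY_CONF_THRESHOLD

# Worst case for a non-empty result: the rank-0 document has zero keyword
# overlap (score == ALPHA) and the runner-up ties with it (gap == 0).
worst_case = confidence(best=ALPHA, second=ALPHA)
print(round(worst_case, 4), worst_case >= THRESHOLD)
```

If the worst case already clears the threshold, the fallback branch is reachable only when nothing is retrieved, which is worth keeping in mind when tuning either value.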

In [None]:
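# Tune later. Note: with the default alpha=0.75 the rank-0 document always scores at least 0.75,
# so confidence never drops below 0.85 * 0.75 = 0.6375 and this threshold passes for any non-empty result.
CARDIOLOGY_CONF_THRESHOLD = 0.62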

# retrieval
def retrieve_cardiology(state: InquiryState) -> InquiryState:
    inquiry = _latest_user_inquiry(state)
    retrieved, conf = hybrid_retrieve(inquiry, cardiology_store, k_vec=8, k_final=3)

    print("Retrieved docs:", len(retrieved))
    if retrieved:
        print("Top doc id:", retrieved[0]["id"])
        print("Top doc preview:", retrieved[0]["text"][:120])

    return {
        **state,
        "inquiry": inquiry,
        "retrieved": retrieved,
        "retrieval_confidence": conf,
    }

# conditional edge guardrail
def cardiology_gate(state: InquiryState) -> Literal["Cardiology_Answer", "Cardiology_Fallback"]:
    conf = state.get("retrieval_confidence") or 0.0
    if conf >= CARDIOLOGY_CONF_THRESHOLD and state.get("retrieved"):
        return "Cardiology_Answer"
    return "Cardiology_Fallback"

# answer generation
def answer_cardiology(state: InquiryState) -> InquiryState:
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

    retrieved = state.get("retrieved") or []
    # Separate snippets with a blank line so the model can tell them apart.
    context = "\n\n".join(
        [f"[{r['id']}]\n{r['text']}" for r in retrieved]
    )

    system = SystemMessage(content=('''
        You are the Cardiology assistant for a hospital call center.
        Answer the user's inquiry using ONLY the provided context snippets.
        If the answer is not explicitly supported by the context, say you don't have enough information and ask 1 clarifying question.
        When you use information from a snippet, cite it like [snippet_id].'''
    ))

    human = HumanMessage(content=[{
        "type": "text",
        "text": f"User inquiry: {state['inquiry']}\n\nContext:\n{context}",
    }])

    resp = llm.invoke([system, human]).content.strip()
    final_response = "Cardiology:: " + resp

    return {
        **state,
        "response": final_response,
        "next_node": END,
        "messages": [AIMessage(content=final_response)],
    }

# fallback answer generation
def cardiology_fallback(state: InquiryState) -> InquiryState:
    # Keep this deterministic and safe
    final_response = (
        "Cardiology:: I'm not finding a confident match in our cardiology FAQ for that question. "
        "Can you share a bit more detail? Are you asking about (1) scheduling/appointments, "
        "(2) a specific test or procedure, or (3) symptoms and when to seek urgent care?"
    )
    return {
        **state,
        "response": final_response,
        "next_node": END,
        "messages": [AIMessage(content=final_response)],
    }

In [None]:
'''import os, getpass

def _set_env(var: str):
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")

_set_env("OPENAI_API_KEY")'''


In [None]:
def operator_router(state):

    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
    inquiry = _latest_user_inquiry(state)

    query = f"""Classify the user's intent based on the following input: '{inquiry}'.
            List of possible intent values: Greeting, GeneralInquiry, ER, Radiology, PrimaryCare, Cardiology, Pediatrics, BillingInsurance
            Return only the identified intent value, with no extra text or characters"""

    human_message = HumanMessage(
        content=[
            {"type": "text", "text": query},
        ],
    )

    system_message = SystemMessage(content="You are a helpful assistant tasked with classifying the intent of user's inquiry")

    response = llm.invoke([system_message] + [human_message])
    intent = response.content.strip()

    intent_lower = intent.lower()

    if "greeting" in intent_lower:
        greeting = "Hello there, this is Northwestern Memorial Hospital. How can I assist you today?"
        return {
            **state,
            "inquiry": inquiry,
            "intent": "Greeting",
            "next_node": END,
            "response": greeting,
            "messages": [AIMessage(content=greeting)],
        }
    if "generalinquiry" in intent_lower:
        general_response = "For general information about nearby parking, hotels, and restaurants, please visit https://www.nm.org/ and navigate to the Patients & Visitors link."
        return {
            **state,
            "inquiry": inquiry,
            "intent": "GeneralInquiry",
            "next_node": END,
            "response": general_response,
            "messages": [AIMessage(content=general_response)],
        }

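    # Otherwise route to a department node. The conditional edge map only knows
    # these node names, so normalize the model output and handle anything unexpected.
    departments = {
        "er": "ER",
        "radiology": "Radiology",
        "primarycare": "PrimaryCare",
        "cardiology": "Cardiology",
        "pediatrics": "Pediatrics",
        "billinginsurance": "BillingInsurance",
    }
    department = departments.get(re.sub(r"[^a-z]", "", intent_lower))
    if department is None:
        unknown_response = "I'm sorry, I couldn't tell which department can help with that. Could you rephrase your question?"
        return {
            **state,
            "inquiry": inquiry,
            "intent": intent,
            "next_node": END,
            "response": unknown_response,
            "messages": [AIMessage(content=unknown_response)],
        }

    return {
        **state,
        "inquiry": inquiry,
        "intent": department,
        "next_node": department,  # e.g., "Cardiology"
        "response": None,
    }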

In [None]:
def er_agent(state):

    knowledge_base = """

    "inquiry": "Should I go to the ER or urgent care?",
    "response": "Go to the ER for chest pain, stroke symptoms, severe injuries, heavy bleeding, or difficulty breathing. Urgent care is appropriate for minor injuries or mild illnesses.",

    "inquiry": "What should I bring to the ER?",
    "response": "Bring a photo ID, insurance card, list of medications, allergies, and any relevant medical history if available.",

    "inquiry": "How long is the wait time?",
    "response": "Patients are treated based on medical urgency. Critical cases are seen first, so wait times vary.",

    "inquiry": "Will I be admitted to the hospital?",
    "response": "Admission depends on your diagnosis and condition. The ER physician will determine if inpatient care is required.",

    "inquiry": "Can someone stay with me in the ER?",
    "response": "Visitor policies depend on hospital guidelines and patient condition. Check with staff upon arrival."

    """

    print("ER KNOWLEDGE-BASE IS EMPTY")
    final_response = "ER: YOU NEED TO ADD-YOUR-KNOWLEDGE-BASE"
    return {"inquiry": _latest_user_inquiry(state), "next_node": END, "response": final_response, "messages": [AIMessage(content=final_response)]}

In [None]:
def radiology_agent(state):

    radiology_knowledge_base = """

    "inquiry": "How do I prepare for a CT scan?",
    "response": "Follow any fasting instructions provided. Inform staff about allergies, especially to contrast dye, and disclose pregnancy if applicable.",

    "inquiry": "Is radiation from X-rays safe?",
    "response": "X-rays use low levels of radiation and are generally safe. Technicians take precautions to minimize exposure.",

    "inquiry": "Do I need contrast for my MRI?",
    "response": "Some MRIs require contrast to improve image clarity. Your provider will determine if it is necessary.",

    "inquiry": "How long does imaging take?",
    "response": "Most X-rays take 10-15 minutes, while CT or MRI scans may take 30-60 minutes depending on the study.",

    "inquiry": "How will I receive my results?",
    "response": "Results are reviewed by a radiologist and sent to your ordering provider, who will discuss findings with you."

    """

    print("Radiology KNOWLEDGE-BASE IS EMPTY")
    final_response = "Radiology: YOU NEED TO ADD-YOUR-KNOWLEDGE-BASE"
    return {"inquiry": _latest_user_inquiry(state), "next_node": END, "response": final_response, "messages": [AIMessage(content=final_response)]}
    

In [None]:
def primary_care_agent(state):

    knowledge_base = """

    "inquiry": "How often should I schedule a physical exam?",
    "response": "Adults should have a routine physical annually or as recommended based on age and health conditions.",

    "inquiry": "Can I get lab work done during my visit?",
    "response": "Yes, many routine labs can be performed in-office or ordered through an affiliated laboratory.",

    "inquiry": "Do you provide vaccinations?",
    "response": "Yes, we offer routine adult immunizations including flu, COVID-19, tetanus, and other recommended vaccines.",

    "inquiry": "How do I request a specialist referral?",
    "response": "Discuss your symptoms with your Primary Care provider, who can evaluate and issue a referral if needed.",

    "inquiry": "Can I discuss multiple concerns in one appointment?",
    "response": "Yes, but complex issues may require additional appointments to ensure adequate time for evaluation."

    """

    print("Primary Care KNOWLEDGE-BASE IS EMPTY")
    final_response = "Primary Care: YOU NEED TO ADD-YOUR-KNOWLEDGE-BASE"
    return {"inquiry": _latest_user_inquiry(state), "next_node": END, "response": final_response, "messages": [AIMessage(content=final_response)]}
    

In [None]:
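# Earlier single-node version with an inline knowledge base. The graph below registers the
# Cardiology_Retrieve / Cardiology_Answer / Cardiology_Fallback nodes instead of this function.
def cardiology_agent(state):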

    knowledge_base = """
        "inquiry": "Do you have any available appointments with a cardiologist next week?",
         "response": "Appointment availability varies. New patients typically need a referral. Provide the exact date you are looking for so we can check for availability",

        "inquiry": "What tests are done during a heart check-up?",
         "response": "Standard tests include EKG, blood pressure, cholesterol screening, and physical exam. Additional tests ordered as needed.",

        "inquiry": "How should I prepare for a stress test?",
         "response": "Wear comfortable clothes and walking shoes. Avoid caffeine and heavy meals before the test. Bring a list of medications.",

        "inquiry": "What do you recommend to watch for to see if I have signs of heart problems?",
         "response": "Watch for chest pain, shortness of breath, irregular heartbeat, fatigue, and swelling in legs. Go to ER for severe symptoms.",

        "inquiry": "Do you offer heart screenings?",
         "response": "Yes, we provide preventive screenings including calcium scoring, cholesterol tests, and blood pressure monitoring.",

        """

    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

    inquiry = _latest_user_inquiry(state)
    query = f"""Provide an answer for the following user inquiry: '{inquiry}' using the knowledge_base"""

    human_message = HumanMessage(
        content=[
            {"type": "text", "text": query},
        ],
    )

    system_message = SystemMessage(content=f"You are a helpful assistant tasked with answering user's inquiry based on the answers you have in this knowledge_base only: {knowledge_base}")

    response = llm.invoke([system_message] + [human_message])
    formatted_response = "Cardiology:: " + response.content.strip()


    return {"inquiry": inquiry, "next_node": END, "response": formatted_response, "messages": [AIMessage(content=formatted_response)]}


In [None]:
def pediatrics_agent(state):

    pediatrics_knowledge_base = """

    "inquiry": "Is my child's cough or cold serious?",
    "response": "Most colds are viral and resolve within 7-10 days. Seek care if there is high fever, breathing difficulty, or worsening symptoms.",

    "inquiry": "What vaccines does my child need?",
    "response": "Vaccinations follow CDC-recommended schedules based on age. We can review your child's immunization record during the visit.",

    "inquiry": "What is the correct medication dose for my child?",
    "response": "Medication dosing depends on weight and age. Always follow provider instructions and avoid adult medications unless directed.",

    "inquiry": "Are developmental milestones on track?",
    "response": "We assess growth and developmental milestones at well-child visits and address any concerns early.",

    "inquiry": "When should I take my child to the ER?",
    "response": "Go to the ER for difficulty breathing, seizures, severe dehydration, uncontrolled fever in infants, or serious injury."

    """

    print("Pediatrics KNOWLEDGE-BASE IS EMPTY")
    final_response = "Pediatrics: YOU NEED TO ADD-YOUR-KNOWLEDGE-BASE."
    return {"inquiry": _latest_user_inquiry(state), "next_node": END, "response": final_response, "messages": [AIMessage(content=final_response)]}


In [None]:
def billing_agent(state):

    knowledge_base = """

    "inquiry": "Is my visit covered by insurance?",
    "response": "Coverage depends on your specific plan. Contact your insurer or our billing department to verify benefits.",

    "inquiry": "What is a deductible and copay?",
    "response": "A copay is a fixed fee paid at the time of service. A deductible is the amount you pay before insurance begins covering costs.",

    "inquiry": "Why did I receive multiple bills?",
    "response": "You may receive separate bills for facility fees, provider services, or laboratory tests.",

    "inquiry": "How do I update my insurance information?",
    "response": "Provide updated insurance details through the patient portal or contact our billing office directly.",

    "inquiry": "What payment options are available?",
    "response": "We accept credit cards, checks, online payments, and offer payment plans for qualifying balances."

    """

    print("BillingInsurance KNOWLEDGE-BASE IS EMPTY")
    final_response = "BillingInsurance: YOU NEED TO ADD-YOUR-KNOWLEDGE-BASE"
    return {"inquiry": _latest_user_inquiry(state), "next_node": END, "response": final_response, "messages": [AIMessage(content=final_response)]}
    

In [None]:
builder = StateGraph(InquiryState)

builder.add_node("Operator", operator_router)
builder.add_node("ER", er_agent)
builder.add_node("Radiology", radiology_agent)
builder.add_node("PrimaryCare", primary_care_agent)


# Cardiology subgraph nodes
builder.add_node("Cardiology_Retrieve", retrieve_cardiology)
builder.add_node("Cardiology_Answer", answer_cardiology)
builder.add_node("Cardiology_Fallback", cardiology_fallback)


builder.add_node("Pediatrics", pediatrics_agent)
builder.add_node("BillingInsurance", billing_agent)

builder.set_entry_point("Operator")

builder.add_conditional_edges(
    "Operator",
    lambda x: x["next_node"],
    {
        "ER": "ER",
        "PrimaryCare": "PrimaryCare",
        "Pediatrics": "Pediatrics",
        "Radiology": "Radiology",
        "Cardiology": "Cardiology_Retrieve", # route Cardiology intent to the retrieve node
        "BillingInsurance": "BillingInsurance",
        END: END
    }
)

# Cardiology retrieve -> (answer or fallback)
builder.add_conditional_edges(
    "Cardiology_Retrieve",
    cardiology_gate,
    {
        "Cardiology_Answer": "Cardiology_Answer",
        "Cardiology_Fallback": "Cardiology_Fallback",
    }
)

for node in ["ER", "Radiology", "PrimaryCare", "Cardiology_Answer", "Cardiology_Fallback", "Pediatrics", "BillingInsurance"]:
    builder.add_edge(node, END)


checkpointer = MemorySaver()
graph = builder.compile(checkpointer=checkpointer)

In [None]:
display(Image(graph.get_graph().draw_mermaid_png()))


In [None]:
# Sample inquiries
# My child has a fever
# I need help with my medical bill
# Can I visit my friend in the ER?
# Do I need to fast for my scan?
# I want to schedule my cardiology appointment
# I want to see my doctor for my annual exam

thread_id = input("Thread ID (use same ID to continue a conversation): ").strip() or "default-thread"
config = {"configurable": {"thread_id": thread_id}}

while True:
    user_input = input("User: ")
    if user_input.lower() in {"q", "quit"}:
        print("Goodbye!")
        break

    result = graph.invoke({"messages": [HumanMessage(content=user_input)]}, config=config)

    assistant_message = next(
        (m for m in reversed(result.get("messages", [])) if isinstance(m, AIMessage)),
        None,
    )
    response = assistant_message.content if assistant_message else result.get("response", "No Response Returned")
    print(f"\n\nResponse:\n\n{response}\n\n")

<br><br><br>

<hr style="border:30px solid coral "> </hr>
<hr style="border:2px solid coral "> </hr>


# Requirements Specification:

<hr style="border:2px solid coral "> </hr>


### Implementation Requirements:

Provide runs that will demonstrate a fully functional application for every case listed below:
1. The knowledge base for every agent
    - Knowledge Base can be generated by any GenAI model (ChatGPT, Gemini, Claude, etc.)
    - Knowledge Base can be stored in any data structure, file, or vector database
2. Multi-turn conversation with every agent (for example, a person calls the Cardiology Department to ask about the cause of their pain, then decides to schedule an appointment with a cardiologist)
3. Transactions like booking an appointment or making a payment can be stored in any data structure (DataFrame, Array, List, Dictionary, ...), or file (CSV, JSON, Plaintext)
4. Your Agents must be able to answer EVERY question/inquiry listed below:
    - **ER (Emergency Room)**
        - When should I visit the ER instead of urgent care?
        - How long will I wait to be seen in the ER?
    - **Radiology**
        - How should I prepare for my MRI or CT scan?
        - When and how will I receive my imaging results?
    - **Primary Care**
        - How do I schedule or cancel an appointment?
        - Can I get a same-day visit for urgent issues?
    - **Cardiology**
        - What are common signs that I need to see a cardiologist?
        - What should I expect during a heart stress test?
    - **Pediatrics**
        - What vaccines does my child need at each age?
        - What should I do if my child develops a high fever?
    - **Billing & Insurance**
        - What insurance plans do you accept?
        - How can I view, understand, or pay my bill?
5. My name is Ashley Smith and I want to know the amount I owe you so I can pay it now using my CC.
6. My name is Johnatan Walter, I have an appointment with my doctor scheduled for Tuesday next week at 1:00pm and I want to change it to Thursday morning next week, what time slots are available on Thursday?
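
Requirement 3 allows transactions to live in any simple data structure or file. One possible sketch uses a CSV file and the standard `csv` module. The column layout, the file name, and the helper names `record_transaction` and `load_transactions` are hypothetical, and the two sample rows are placeholder data used only to exercise the code.

```python
import csv
import tempfile
from pathlib import Path

FIELDS = ["patient", "kind", "details"]  # hypothetical column layout

def record_transaction(path: Path, patient: str, kind: str, details: str) -> None:
    """Append one transaction row, writing the header first if the file is new."""
    is_new = not path.exists()
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({"patient": patient, "kind": kind, "details": details})

def load_transactions(path: Path) -> list[dict[str, str]]:
    """Read all stored rows back as dictionaries (empty list if no file yet)."""
    if not path.exists():
        return []
    with path.open(newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))

# Placeholder rows written to a temporary file so the sketch leaves no artifacts behind.
tx_path = Path(tempfile.mkdtemp()) / "transactions.csv"
record_transaction(tx_path, "Ashley Smith", "payment", "placeholder amount, paid by credit card")
record_transaction(tx_path, "Johnatan Walter", "reschedule", "moved to Thursday morning, slot to be confirmed")

rows = load_transactions(tx_path)
print(len(rows), rows[0]["patient"])
```

An agent node could call a helper like `record_transaction` after confirming details with the user, and `load_transactions` when it needs to look up an existing appointment or balance.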
