# Medical Research Agent

## Overview

The Medical Research Agent is an AI-powered system that processes complex medical text and returns clear, structured insights.

It can help with understanding:

- **Medical conditions**: diseases, disorders, and related issues
- **Medicines**: drugs, treatments, and their uses
- **Symptoms**: what they may indicate and when to be cautious
- **Treatments**: available options and important considerations

## How It Works

1. **Text understanding**: The agent simplifies complex medical text into clear summaries.
2. **Information extraction**: It identifies symptoms, causes, and treatment details in the input.
3. **Web retrieval (RAG)**: The system retrieves medical information through Tavily web search and Wikipedia.
4. **Structured report generation**: All information is combined into a clean, easy-to-read medical report.

## Medical Disclaimer

This tool is for educational and informational purposes only.
It should not be used as a substitute for professional medical advice, diagnosis, or treatment.
Always consult a qualified healthcare provider for medical concerns.

## Installation

In this step, we will install all the required packages:

In [2]:
%%capture --no-stderr
%pip install --quiet -U langgraph langchain_openai langchain_community langchain_core tavily-python langchain-tavily wikipedia
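
After the install cell finishes, a quick check can confirm that each package is importable before moving on. The sketch below is not part of the original notebook: the helper `find_missing` is hypothetical, and the import names in `REQUIRED_MODULES` are assumptions about how the installed packages are imported (for example, `tavily-python` as `tavily`).

```python
from importlib.util import find_spec

# Import names assumed for the packages installed above.
REQUIRED_MODULES = [
    "langgraph",
    "langchain_openai",
    "langchain_community",
    "langchain_core",
    "tavily",
    "langchain_tavily",
    "wikipedia",
]

def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if find_spec(name) is None]

missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

If anything is reported as missing, re-running the install cell and restarting the kernel is usually the first thing to try.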

## Setting up the API Keys

This project needs two API keys:

1. **OpenAI API Key**: available from https://platform.openai.com/api-keys
2. **Tavily API Key**: available from https://tavily.com

In [3]:
import os
import getpass

def _set_env(var: str):
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")

_set_env("OPENAI_API_KEY")
_set_env("TAVILY_API_KEY")

OPENAI_API_KEY: ··········
TAVILY_API_KEY: ··········
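
Because `_set_env` only prompts when a variable is missing or empty, keys that are already exported in the environment are picked up without any prompt. The sketch below illustrates this with a made-up variable name, `DEMO_API_KEY`, which is not used anywhere else in the notebook.

```python
import os
import getpass

def _set_env(var: str):
    # Prompt only when the variable is not already set (same logic as above).
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")

# DEMO_API_KEY is a made-up variable name used only for this illustration.
os.environ["DEMO_API_KEY"] = "already-configured"
_set_env("DEMO_API_KEY")  # returns without prompting
print(os.environ["DEMO_API_KEY"])
```

This is also why re-running the key cell in the same session does not ask for the keys again.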


## Importing Dependencies


In [4]:
from typing import List, Annotated
from typing_extensions import TypedDict
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import MemorySaver
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage, get_buffer_string
from langchain_tavily import TavilySearch
from langchain_community.document_loaders import WikipediaLoader
from IPython.display import display, HTML, Image
import operator

# Initialize LLM
llm = ChatOpenAI(model="gpt-4o", temperature=0)

# Initialize Tavily Search
tavily_search = TavilySearch(max_results=3)

print("All dependencies are loaded successfully!")

All dependencies are loaded successfully!


## Define Medical Analyst Models

Define the Pydantic models and graph state that describe the specialized medical analysts:

In [5]:
class MedicalAnalyst(BaseModel):
    """Medical specialist analyst"""
    affiliation: str = Field(description="Medical affiliation or specialty")
    name: str = Field(description="Name of the medical analyst")
    role: str = Field(description="Medical role or specialty area")
    description: str = Field(description="Focus area, concerns, and medical expertise")

    @property
    def persona(self) -> str:
        return f"Name: {self.name}\nRole: {self.role}\nAffiliation: {self.affiliation}\nDescription: {self.description}\n"

class MedicalPerspectives(BaseModel):
    analysts: List[MedicalAnalyst] = Field(
        description="List of medical analysts with their specialties"
    )

class GenerateAnalystsState(TypedDict):
    topic: str  # Medical topic or condition
    max_analysts: int  # Number of analysts
    human_analyst_feedback: str  # Human feedback
    analysts: List[MedicalAnalyst]  # Generated analysts

print("Medical analyst models defined!")

Medical analyst models defined!
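
To see what the `persona` property produces, here is a small dependency-free sketch. `AnalystSketch` is a hypothetical dataclass stand-in for `MedicalAnalyst` (so the example runs without Pydantic), and every field value is invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class AnalystSketch:
    """Dependency-free stand-in for MedicalAnalyst (hypothetical, illustration only)."""
    affiliation: str
    name: str
    role: str
    description: str

    @property
    def persona(self) -> str:
        # Same template as MedicalAnalyst.persona above.
        return (
            f"Name: {self.name}\nRole: {self.role}\n"
            f"Affiliation: {self.affiliation}\nDescription: {self.description}\n"
        )

# All field values are invented for the example.
example = AnalystSketch(
    affiliation="Example University Hospital",
    name="Dr. Example",
    role="Pharmacologist",
    description="Focuses on medications, dosing, and drug interactions.",
)
print(example.persona)
```

The resulting four-line string is what later gets substituted into the interview prompt as the analyst's goals.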


## Create Medical Analysts

Generating specialized medical analysts for different aspects of the condition:

In [6]:
analyst_instructions = """You are tasked with creating a set of medical specialist personas to research a health topic.

1. Review the medical topic: {topic}

2. Consider any feedback: {human_analyst_feedback}

3. Determine the most important medical perspectives (symptoms, treatments, prevention, prognosis, causes, etc.)

4. Create {max_analysts} medical specialists, each focusing on a different aspect.

Example specialists:
- Symptomatologist (focuses on symptoms and diagnosis)
- Treatment Specialist (focuses on treatment options)
- Prevention Expert (focuses on prevention and risk factors)
- Pharmacologist (focuses on medications)
"""

def create_medical_analysts(state: GenerateAnalystsState):
    """Create medical analyst personas"""
    topic = state['topic']
    max_analysts = state['max_analysts']
    human_analyst_feedback = state.get('human_analyst_feedback', '')

    structured_llm = llm.with_structured_output(MedicalPerspectives)
    system_message = analyst_instructions.format(
        topic=topic,
        human_analyst_feedback=human_analyst_feedback,
        max_analysts=max_analysts
    )

    analysts = structured_llm.invoke([
        SystemMessage(content=system_message),
        HumanMessage(content="Generate the medical specialist analysts.")
    ])

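    # Clear the feedback once it has been used; otherwise should_continue
    # would route back to this node on every pass and never reach END.
    return {"analysts": analysts.analysts, "human_analyst_feedback": ""}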

def should_continue(state: GenerateAnalystsState):
    """Check if we should continue or end"""
    if state.get('human_analyst_feedback', None):
        return "create_analysts"
    return END

# Build analyst generation graph
builder = StateGraph(GenerateAnalystsState)
builder.add_node("create_analysts", create_medical_analysts)
builder.add_edge(START, "create_analysts")
builder.add_conditional_edges("create_analysts", should_continue, ["create_analysts", END])

memory = MemorySaver()
analyst_graph = builder.compile(checkpointer=memory)

print("Medical analyst generation system is ready!")

Medical analyst generation system is ready!
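
The routing rule in `should_continue` can be exercised on plain dictionaries, without calling the LLM. In this sketch, `END` is a local stand-in string assumed to match LangGraph's end marker, and the topic and feedback values are invented.

```python
# Stand-in for langgraph.graph.END so the sketch runs without LangGraph installed.
END = "__end__"

def should_continue(state: dict):
    """Same routing rule as above: loop back while feedback is present."""
    if state.get("human_analyst_feedback", None):
        return "create_analysts"
    return END

no_feedback = {"topic": "Type 2 diabetes", "max_analysts": 3}
with_feedback = {"topic": "Type 2 diabetes", "human_analyst_feedback": "Add a nutrition specialist"}
empty_feedback = {"topic": "Type 2 diabetes", "human_analyst_feedback": ""}

print(should_continue(no_feedback))     # __end__
print(should_continue(with_feedback))   # create_analysts
print(should_continue(empty_feedback))  # __end__
```

The graph therefore ends as soon as the feedback field is empty or missing, and routes back to `create_analysts` only while it holds a non-empty string.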


## Medical Interview System

Define the state for the interview system, in which each analyst collects information from a medical expert:

In [7]:
from langgraph.graph import MessagesState

class InterviewState(MessagesState):
    max_num_turns: int  # Number of interview turns
    context: Annotated[list, operator.add]  # Web search results
    analyst: MedicalAnalyst  # The analyst
    interview: str  # Interview transcript
    sections: list  # Final sections

class SearchQuery(BaseModel):
    search_query: str = Field(description="Medical search query for web research")

print("Interview state models are defined!")

Interview state models are defined!
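
The `Annotated[list, operator.add]` annotation on `context` tells LangGraph to treat `operator.add` as a reducer: each node's update is appended to the existing list instead of replacing it. The sketch below shows that merge using only the standard library; `ContextSketch` and the placeholder strings are hypothetical.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints

class ContextSketch(TypedDict):
    # Same annotation as InterviewState.context above.
    context: Annotated[list, operator.add]

# Pull the reducer function back out of the annotation metadata.
hints = get_type_hints(ContextSketch, include_extras=True)
reducer = hints["context"].__metadata__[0]

# Two nodes each return a one-item list; the values are placeholders.
web_update = ['<Document href="https://example.org"/>...']
wiki_update = ['<Document source="Wikipedia"/>...']

merged = reducer(web_update, wiki_update)
print(len(merged))  # 2
```

This is why each search function returns its formatted text wrapped in a list: results from the web and Wikipedia searches can then accumulate in the same `context` field.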


## Generation of Questions

Each analyst generates relevant questions to ask the medical expert:

In [8]:
question_instructions = """You are a medical analyst interviewing an expert about a health topic.

Your goal is to gather specific, evidence-based medical insights.

1. Focus on: {goals}

2. Ask specific questions about:
   - Clinical evidence and research
   - Treatment efficacy and safety
   - Patient outcomes
   - Current medical guidelines

3. Avoid vague questions - be specific and clinical

Begin by introducing yourself, then ask your medical question.

When satisfied, end with: "Thank you so much for your help!"
"""

def generate_question(state: InterviewState):
    """Generate analyst question"""
    analyst = state["analyst"]
    messages = state["messages"]

    system_message = question_instructions.format(goals=analyst.persona)
    question = llm.invoke([SystemMessage(content=system_message)] + messages)

    return {"messages": [question]}

print("The Question generation process is ready!")

The Question generation process is ready!
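
The sketch below shows how the `{goals}` placeholder is filled with a persona string, and how the fixed sign-off phrase could serve as a marker that the interview is over. The template is abbreviated, the persona text is invented, and `interview_finished` is a hypothetical helper: this section does not show how the notebook itself detects the end of an interview.

```python
# Abbreviated copy of the prompt above; only the placeholder matters here.
question_template = """You are a medical analyst interviewing an expert about a health topic.

1. Focus on: {goals}

When satisfied, end with: "Thank you so much for your help!"
"""

# Invented persona text in the format produced by MedicalAnalyst.persona.
persona = (
    "Name: Dr. Example\nRole: Prevention Expert\n"
    "Affiliation: Example Clinic\nDescription: Focuses on risk factors.\n"
)

system_prompt = question_template.format(goals=persona)

SIGN_OFF = "Thank you so much for your help!"

def interview_finished(last_message: str) -> bool:
    """Hypothetical helper: True once the analyst has used the sign-off phrase."""
    return SIGN_OFF in last_message

print(interview_finished("Could you cite the current guidelines?"))            # False
print(interview_finished("That covers it. Thank you so much for your help!"))  # True
```

A fixed phrase is easy to match exactly, which is one reason to ask the model for a specific closing sentence rather than a free-form goodbye.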


## Web Search Functions

Next, search the web and Wikipedia for medical information:

In [10]:

search_instructions = SystemMessage(content="""Given a medical conversation, generate a search query to find evidence-based medical information.

Focus on:
- Peer-reviewed medical sources
- Clinical guidelines
- Medical research
- Reputable health organizations

Create a precise medical search query based on the conversation.""")

def search_web(state: InterviewState):
    """Search web for medical information"""
    structured_llm = llm.with_structured_output(SearchQuery)
    search_query = structured_llm.invoke([search_instructions] + state['messages'])

    # Perform web search
    try:
        search_results = tavily_search.invoke(search_query.search_query)

        # Handle different response formats
        if isinstance(search_results, list):
            search_docs = search_results
        elif isinstance(search_results, dict):
            search_docs = search_results.get("results", [])
        else:
            search_docs = []

        # Format results
        if search_docs:
            formatted_search_docs = "\n\n---\n\n".join([
                f'<Document href="{doc.get("url", "N/A")}"/>\n{doc.get("content", doc.get("snippet", ""))}\n</Document>'
                for doc in search_docs
                if isinstance(doc, dict)
            ])
        else:
            formatted_search_docs = "No search results found."

    except Exception as e:
        print(f"Search error: {e}")
        formatted_search_docs = f"Search error occurred: {str(e)}"

    return {"context": [formatted_search_docs]}

def search_wikipedia(state: InterviewState):
    """Search Wikipedia for medical information"""
    structured_llm = llm.with_structured_output(SearchQuery)
    search_query = structured_llm.invoke([search_instructions] + state['messages'])

    try:
        # Search Wikipedia
        search_docs = WikipediaLoader(query=search_query.search_query, load_max_docs=2).load()

        # Format results
        if search_docs:
            formatted_search_docs = "\n\n---\n\n".join([
                f'<Document source="{doc.metadata.get("source", "Wikipedia")}" page="{doc.metadata.get("page", "")}"/>\n{doc.page_content}\n</Document>'
                for doc in search_docs
            ])
        else:
            formatted_search_docs = "No Wikipedia results found."

    except Exception as e:
        print(f"Wikipedia search error: {e}")
        formatted_search_docs = f"Wikipedia search error: {str(e)}"

    return {"context": [formatted_search_docs]}

print("Search functions loaded!")
print("Rebuild the interview graph now: run the 'Build Interview Graph' cell again.")

Search functions loaded!
Rebuild the interview graph now: run the 'Build Interview Graph' cell again.
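
The response handling inside `search_web` can be tried offline by pulling it into a standalone function and feeding it mock data. `format_search_results` is a hypothetical helper that mirrors the logic above, and the mock responses and URLs are invented. One small difference is an assumption of this sketch: it also returns the "no results" message when the response contains no dictionary entries at all.

```python
def format_search_results(search_results) -> str:
    """Normalize a search response and render it as <Document> blocks."""
    # Accept either a bare list of hits or a dict with a "results" key.
    if isinstance(search_results, list):
        search_docs = search_results
    elif isinstance(search_results, dict):
        search_docs = search_results.get("results", [])
    else:
        search_docs = []

    blocks = [
        f'<Document href="{doc.get("url", "N/A")}"/>\n'
        f'{doc.get("content", doc.get("snippet", ""))}\n</Document>'
        for doc in search_docs
        if isinstance(doc, dict)
    ]
    return "\n\n---\n\n".join(blocks) if blocks else "No search results found."

# Invented mock responses covering both shapes handled above.
mock_dict = {"results": [{"url": "https://example.org/a", "content": "Placeholder text A."}]}
mock_list = [{"url": "https://example.org/b", "snippet": "Placeholder text B."}]

print(format_search_results(mock_dict))
print(format_search_results(mock_list))
print(format_search_results(None))
```

Testing the formatting this way avoids spending API calls while checking how results will appear in the `context` passed to the model.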
