<a href="https://colab.research.google.com/github/mrburke00/llm_sandbox/blob/main/langchain_report.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [15]:
%%capture --no-stderr
%pip install --quiet -U langgraph langchain_community langchain_core tavily-python langchain_nvidia_ai_endpoints

In [16]:
import getpass
import os

if not os.environ.get("NVIDIA_API_KEY", "").startswith("nvapi-"):
    nvapi_key = getpass.getpass("Enter your NVIDIA API key: ")
    assert nvapi_key.startswith("nvapi-"), f"{nvapi_key[:5]}... is not a valid key"
    os.environ["NVIDIA_API_KEY"] = nvapi_key

Enter your NVIDIA API key: ··········


In [17]:
import os, getpass

def _set_env(var: str):
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")

_set_env("LANGCHAIN_API_KEY")
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = "report-mAIstro"

LANGCHAIN_API_KEY: ··········


In [18]:
_set_env("TAVILY_API_KEY")
from tavily import TavilyClient, AsyncTavilyClient
tavily_client = TavilyClient()
tavily_async_client = AsyncTavilyClient()

TAVILY_API_KEY: ··········


In [19]:
## Core LC Chat Interface
from langchain_nvidia_ai_endpoints import ChatNVIDIA

llm = ChatNVIDIA(model="meta/llama-3.3-70b-instruct", temperature=0)
result = llm.invoke("Write a ballad about LangChain.")
print(result.content)

(Verse 1)
In realms of code, where innovators roam
A new dawn broke, with LangChain's noble tone
A framework born, to harness language's might
To build and create, through day and endless night

(Chorus)
Oh LangChain, oh LangChain, a beacon in the land
Guiding developers, hand in hand
Through the vast expanse, of AI's uncharted sea
LangChain shines bright, a star for you and me

(Verse 2)
With LLMs as its heart, and a will to explore
LangChain ventured forth, to unlock the secrets in store
It wove a tapestry, of functions and of art
A symphony of code, that touched the programmer's heart

(Chorus)
Oh LangChain, oh LangChain, a beacon in the land
Guiding developers, hand in hand
Through the vast expanse, of AI's uncharted sea
LangChain shines bright, a star for you and me

(Bridge)
From embeddings to indices, it paved the way
For applications grand, in a new and brighter day
It spoke of possibilities, of a future yet unknown
Where humans and machines, in harmony are sown

(Verse 3)
Thro

In [34]:
import asyncio
from langsmith import traceable
from pydantic import BaseModel, Field

class Section(BaseModel):
    name: str = Field(
        description="Name for this section of the report.",
    )
    description: str = Field(
        description="Brief overview of the main topics and concepts to be covered in this section.",
    )
    research: bool = Field(
        description="Whether to perform web research for this section of the report."
    )
    content: str = Field(
        description="The content of the section."
    )

def deduplicate_and_format_sources(search_response, max_tokens_per_source, include_raw_content=True):
    """
    Takes either a single search response or list of responses from Tavily API and formats them.
    Limits the raw_content to approximately max_tokens_per_source.
    include_raw_content specifies whether to include the raw_content from Tavily in the formatted string.

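    Args:
        search_response: Either:
            - A dict with a 'results' key containing a list of search results
            - A list of dicts, each containing search results
        max_tokens_per_source (int): Approximate token budget for each source's raw_content
        include_raw_content (bool): Whether to append the (truncated) raw_content to each source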

    Returns:
        str: Formatted string with deduplicated sources
    """
    # Convert input to list of results
    if isinstance(search_response, dict):
        sources_list = search_response['results']
    elif isinstance(search_response, list):
        sources_list = []
        for response in search_response:
            if isinstance(response, dict) and 'results' in response:
                sources_list.extend(response['results'])
            else:
                sources_list.extend(response)
    else:
        raise ValueError("Input must be either a dict with 'results' or a list of search results")

    # Deduplicate by URL
    unique_sources = {}
    for source in sources_list:
        if source['url'] not in unique_sources:
            unique_sources[source['url']] = source

    # Format output
    formatted_text = "Sources:\n\n"
    for source in unique_sources.values():
        formatted_text += f"Source {source['title']}:\n===\n"
        formatted_text += f"URL: {source['url']}\n===\n"
        formatted_text += f"Most relevant content from source: {source['content']}\n===\n"
        if include_raw_content:
            # Using rough estimate of 4 characters per token
            char_limit = max_tokens_per_source * 4
            # Handle None raw_content
            raw_content = source.get('raw_content', '')
            if raw_content is None:
                raw_content = ''
                print(f"Warning: No raw_content found for source {source['url']}")
            if len(raw_content) > char_limit:
                raw_content = raw_content[:char_limit] + "... [truncated]"
            formatted_text += f"Full source content limited to {max_tokens_per_source} tokens: {raw_content}\n\n"

    return formatted_text.strip()

def format_sections(sections: list[Section]) -> str:
    """ Format a list of sections into a string """
    formatted_str = ""
    for idx, section in enumerate(sections, 1):
        formatted_str += f"""
{'='*60}
Section {idx}: {section.name}
{'='*60}
Description:
{section.description}
Requires Research:
{section.research}

Content:
{section.content if section.content else '[Not yet written]'}

"""
    return formatted_str

@traceable
def tavily_search(query):
    """ Search the web using the Tavily API.

    Args:
        query (str): The search query to execute

    Returns:
        dict: Tavily search response containing:
            - results (list): List of search result dictionaries, each containing:
                - title (str): Title of the search result
                - url (str): URL of the search result
                - content (str): Snippet/summary of the content
                - raw_content (str): Full content of the page if available"""

    return tavily_client.search(query,
                         max_results=5,
                         include_raw_content=True)

@traceable
async def tavily_search_async(search_queries, tavily_topic, tavily_days):
    """
    Performs concurrent web searches using the Tavily API.

    Args:
        search_queries (List[SearchQuery]): List of search queries to process
        tavily_topic (str): Type of search to perform ('news' or 'general')
        tavily_days (int): Number of days to look back for news articles (only used when tavily_topic='news')

    Returns:
        List[dict]: List of search results from Tavily API, one per query

    Note:
        For news searches, each result will include articles from the last `tavily_days` days.
        For general searches, the time range is unrestricted.
    """

    search_tasks = []
    for query in search_queries:
        if tavily_topic == "news":
            search_tasks.append(
                tavily_async_client.search(
                    query,
                    max_results=5,
                    include_raw_content=True,
                    topic="news",
                    days=tavily_days
                )
            )
        else:
            search_tasks.append(
                tavily_async_client.search(
                    query,
                    max_results=5,
                    include_raw_content=True,
                    topic="general"
                )
            )

    # Execute all searches concurrently
    search_docs = await asyncio.gather(*search_tasks)

    return search_docs
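
The two helpers above combine naturally: queries are issued concurrently, and overlapping hits are then collapsed by URL. The following is a minimal, stdlib-only sketch of that pattern. `fake_search`, `search_all`, `dedupe_and_truncate`, and the canned results are hypothetical stand-ins written for illustration, so the code runs without an API key; it does not reproduce real Tavily responses.

```python
import asyncio

# Hypothetical canned responses standing in for search results (not real data).
FAKE_INDEX = {
    "cpu architecture": [
        {"url": "https://example.com/a", "title": "A", "raw_content": "x" * 50},
        {"url": "https://example.com/b", "title": "B", "raw_content": None},
    ],
    "gpu architecture": [
        {"url": "https://example.com/b", "title": "B", "raw_content": None},
        {"url": "https://example.com/c", "title": "C", "raw_content": "y" * 10},
    ],
}

async def fake_search(query: str) -> dict:
    """Pretend to call a search API; returns a dict with a 'results' list."""
    await asyncio.sleep(0.01)  # simulate network latency
    return {"results": FAKE_INDEX[query]}

async def search_all(queries: list[str]) -> list[dict]:
    # gather() runs the coroutines concurrently and returns results in input order
    return await asyncio.gather(*(fake_search(q) for q in queries))

def dedupe_and_truncate(responses: list[dict], max_tokens: int) -> list[dict]:
    """Keep the first hit per URL and cap raw_content at ~4 characters per token."""
    char_limit = max_tokens * 4
    unique: dict[str, dict] = {}
    for response in responses:
        for source in response["results"]:
            if source["url"] in unique:
                continue  # already seen this URL in an earlier response
            raw = source.get("raw_content") or ""  # treat None as empty
            if len(raw) > char_limit:
                raw = raw[:char_limit] + "... [truncated]"
            unique[source["url"]] = {**source, "raw_content": raw}
    return list(unique.values())

responses = asyncio.run(search_all(["cpu architecture", "gpu architecture"]))
sources = dedupe_and_truncate(responses, max_tokens=5)
print([s["url"] for s in sources])
```

Inside a notebook, where an event loop is already running, `asyncio.run(...)` would need to be replaced with `await search_all(...)`.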

In [35]:
from typing_extensions import TypedDict
from typing import Annotated, List, Optional, Literal

class Sections(BaseModel):
    sections: List[Section] = Field(
        description="Sections of the report.",
    )
class SearchQuery(BaseModel):
    search_query: str = Field(
        None, description="Query for web search."
    )
class Queries(BaseModel):
    queries: List[SearchQuery] = Field(
        description="List of search queries.",
    )

In [36]:
import operator

class ReportState(TypedDict):
    topic: str # Report topic
    tavily_topic: Literal["general", "news"] # Tavily search topic
    tavily_days: Optional[int] # Only applicable for news topic
    report_structure: str # Report structure
    number_of_queries: int # Number of web search queries to perform per section
    sections: list[Section] # List of report sections
    completed_sections: Annotated[list, operator.add] # Send() API key
    report_sections_from_research: str # String of any completed sections from research to write final sections
    final_report: str # Final report
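
The `Annotated[list, operator.add]` annotation on `completed_sections` registers `operator.add` as the reducer for that key, so each node's update is concatenated onto the existing list instead of replacing it. That is what lets several parallel section writers report into the same key. A rough, stdlib-only sketch of the merge semantics follows; `DemoState` and `apply_update` are hypothetical names written for illustration and are not part of LangGraph.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints

class DemoState(TypedDict):
    topic: str                                         # no reducer: last write wins
    completed_sections: Annotated[list, operator.add]  # reducer: concatenate

def apply_update(state: dict, update: dict) -> dict:
    """Hypothetical helper: merge `update` into `state`, honoring reducers
    declared via Annotated metadata (an approximation of the idea only)."""
    hints = get_type_hints(DemoState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        metadata = getattr(hints[key], "__metadata__", ())
        if metadata and key in merged:
            merged[key] = metadata[0](merged[key], value)  # e.g. old + new
        else:
            merged[key] = value
    return merged

state = {"topic": "CPU vs GPU", "completed_sections": []}
# Two branches each return a one-element list, as write_section does
state = apply_update(state, {"completed_sections": ["CPU section"]})
state = apply_update(state, {"completed_sections": ["GPU section"]})
print(state["completed_sections"])
```

Returning a one-element list from each writer, rather than a bare section, is what makes concatenation work here.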

In [37]:
from langchain_core.messages import HumanMessage, SystemMessage

report_planner_query_writer_instructions="""You are an expert technical writer, helping to plan a report on gene target mutations with an emphasis on clinical, biological, and business perspectives.

The report will be focused on the following topic:
{topic}

The report structure will follow these guidelines:
{report_organization}

Your goal is to generate {number_of_queries} search queries that will help gather comprehensive information for planning the report sections.

The query should:
1. Be directly related to gene target mutations and their clinical, biological, and business implications.
2. Address different facets such as clinical outcomes, molecular biology, and market trends.
3. Help satisfy the requirements specified in the report organization.
4. Be specific enough to locate high-quality, authoritative sources (e.g., peer-reviewed articles, official clinical trial data, market research reports) while covering the breadth needed for the report structure."""


report_planner_instructions="""You are an expert technical writer, helping to plan a report on gene target mutations analyzed from clinical, biological, and business perspectives.

Your goal is to generate the outline of the sections of the report.

The overall topic of the report is:
{topic}

The report should follow this organization:
{report_organization}

Use the following web search results as context when planning the sections:
{context}

You should reflect on this information to plan the sections of the report. Consider that the final report will include:
- An Introduction that sets the context and significance of gene target mutations.
- Main Body Sections for each mutation, with dedicated subsections for Clinical Landscape, Biological Landscape, and Business Landscape.
- A Conclusion that summarizes insights and provides a comparative analysis across these dimensions.

Each section of the outline should have the following fields:
- Name: A concise title for the section.
- Description: A brief overview of the main topics and concepts to be covered.
- Research: Indicate whether this section requires web research (note that the introduction and conclusion generally do not, as they synthesize information from the main sections).
- Content: Leave blank for now.

Ensure your outline clearly differentiates sections that require further research from those that synthesize gathered data."""


async def generate_report_plan(state: ReportState):

    # Inputs
    topic = state["topic"]
    report_structure = state["report_structure"]
    number_of_queries = state["number_of_queries"]
    tavily_topic = state["tavily_topic"]
    tavily_days = state.get("tavily_days", None)

    # Convert JSON object to string if necessary
    if isinstance(report_structure, dict):
        report_structure = str(report_structure)

    # Generate search query
    structured_llm = llm.with_structured_output(Queries)

    # Format system instructions
    system_instructions_query = report_planner_query_writer_instructions.format(topic=topic, report_organization=report_structure, number_of_queries=number_of_queries)

    # Generate queries
    results = structured_llm.invoke([SystemMessage(content=system_instructions_query)]+[HumanMessage(content="Generate search queries that will help with planning the sections of the report.")])

    # Web search
    query_list = [query.search_query for query in results.queries]
    search_docs = await tavily_search_async(query_list, tavily_topic, tavily_days)

    # Deduplicate and format sources
    source_str = deduplicate_and_format_sources(search_docs, max_tokens_per_source=1000, include_raw_content=True)

    # Format system instructions
    system_instructions_sections = report_planner_instructions.format(topic=topic, report_organization=report_structure, context=source_str)

    # Generate sections
    structured_llm = llm.with_structured_output(Sections)
    report_sections = structured_llm.invoke([SystemMessage(content=system_instructions_sections)]+[HumanMessage(content="Generate the sections of the report. Your response must include a 'sections' field containing a list of sections. Each section must have: name, description, research, and content fields.")])

    return {"sections": report_sections.sections}

In [1]:
# Structure
report_structure = """This report type focuses on comparative analysis.

The report structure should include:
1. Introduction (no research needed)
   - Brief overview of the topic area
   - Context for the comparison

2. Main Body Sections:
   - One dedicated section for EACH offering being compared in the user-provided list
   - Each section should examine:
     - Core Features (bulleted list)
     - Architecture & Implementation (2-3 sentences)
     - One example use case (2-3 sentences)

3. No Main Body Sections other than the ones dedicated to each offering in the user-provided list

4. Conclusion with Comparison Table (no research needed)
   - Structured comparison table that:
     * Compares all offerings from the user-provided list across key dimensions
     * Highlights relative strengths and weaknesses
   - Final recommendations"""

In [39]:
# Topic
report_topic = "Give an overview of capabilities and specific use case examples for these processing units: CPU, GPU."

In [40]:
# Tavily search parameters
tavily_topic = "general"
tavily_days = None # Only applicable for news topic

# Generate report plan
sections = await generate_report_plan({"topic": report_topic, "report_structure": report_structure, "number_of_queries": 2, "tavily_topic": tavily_topic, "tavily_days": tavily_days})

# Print sections
for section in sections['sections']:
    print(f"{'='*50}")
    print(f"Name: {section.name}")
    print(f"Description: {section.description}")
    print(f"Research: {section.research}")

Name: Introduction
Description: Provide a brief overview of the topic area, focusing on the importance of understanding CPU and GPU processing units in various applications.
Research: False
Name: CPU
Description: Examine the core features, architecture, and implementation of CPU processing units, along with a specific use case example.
Research: True
Name: GPU
Description: Examine the core features, architecture, and implementation of GPU processing units, along with a specific use case example.
Research: True
Name: Conclusion with Comparison Table
Description: Present a structured comparison table highlighting the relative strengths and weaknesses of CPU and GPU processing units, and provide final recommendations based on the analysis.
Research: False


In [41]:
class SectionState(TypedDict):
    tavily_topic: Literal["general", "news"] # Tavily search topic
    tavily_days: Optional[int] # Only applicable for news topic
    number_of_queries: int # Number of web search queries to perform per section
    section: Section # Report section
    search_queries: list[SearchQuery] # List of search queries
    source_str: str # String of formatted source content from web search
    report_sections_from_research: str # String of any completed sections from research to write final sections
    completed_sections: list[Section] # Final key we duplicate in outer state for Send() API

class SectionOutputState(TypedDict):
    completed_sections: list[Section] # Final key we duplicate in outer state for Send() API
from IPython.display import Image, display
from langgraph.graph import START, END, StateGraph

query_writer_instructions="""Your goal is to generate targeted web search queries that will gather comprehensive information for writing a technical report section on gene target mutations from clinical, biological, and business perspectives.

Topic for this section:
{section_topic}

When generating {number_of_queries} search queries, ensure they:
1. Cover different aspects of the topic (e.g., clinical features, biological mechanisms, market trends, and commercial opportunities)
2. Include specific technical and domain-specific terms related to gene mutations (e.g., biomarkers, clinical trials, molecular pathways)
3. Target recent information by including year markers where relevant (e.g., "2024")
4. Look for comparisons or differentiators from similar gene targets or mutations
5. Search for both authoritative scientific literature and practical market analyses

Your queries should be:
- Specific enough to avoid generic results
- Detailed enough to capture in-depth information on clinical, biological, and business aspects
- Diverse enough to cover all facets of the report section
- Focused on authoritative sources (peer-reviewed studies, technical blogs, market research reports)"""


# Section writer instructions
section_writer_instructions = """You are an expert technical writer crafting one section of a technical report on gene target mutations with a focus on clinical, biological, and business landscapes.

Topic for this section:
{section_topic}

Guidelines for writing:

1. Technical Accuracy:
- Include specific data points, metrics, or benchmarks (e.g., clinical trial phases, molecular markers, market growth rates)
- Reference concrete research findings or studies
- Cite official studies or documentation where applicable
- Use technical terminology precisely

2. Length and Style:
- Strict 300-400 word limit
- No marketing language
- Maintain a technical focus
- Write in simple, clear language
- Start with your most important insight in **bold**
- Use short paragraphs (5-6 sentences max)

3. Structure:
- Use ## for the section title (Markdown format)
- Only use ONE structural element IF it helps clarify your point:
  * Either a focused table comparing 2-3 key clinical, biological, or business items (using Markdown table syntax)
  * Or a short list (3-5 items) using proper Markdown list syntax:
    - Use `*` or `-` for unordered lists
    - Use `1.` for ordered lists
    - Ensure proper indentation and spacing
- End with ### Sources that references the below source material formatted as:
  * List each source with title and URL
  * Format: `- Title : URL`

4. Writing Approach:
- Include at least one specific example or case study highlighting a gene target mutation's clinical impact, biological insight, or market performance
- Use concrete details over general statements
- Make every word count
- Do not include any preamble prior to creating the section content
- Focus on your single most important point

5. Use this source material to help write the section:
{context}

6. Quality Checks:
- 300-400 words (excluding title and sources)
- Careful use of only ONE structural element (table or list) and only if it helps clarify your point
- One specific example / case study
- Starts with a bold insight
- No preamble prior to creating the section content
- Sources cited at the end"""

def generate_queries(state: SectionState):
    """ Generate search queries for a section """

    # Get state
    number_of_queries = state["number_of_queries"]
    section = state["section"]

    # Generate queries
    structured_llm = llm.with_structured_output(Queries)

    # Format system instructions
    system_instructions = query_writer_instructions.format(section_topic=section.description, number_of_queries=number_of_queries)

    # Generate queries
    queries = structured_llm.invoke([SystemMessage(content=system_instructions)]+[HumanMessage(content="Generate search queries on the provided topic.")])

    return {"search_queries": queries.queries}

async def search_web(state: SectionState):
    """ Search the web for each query, then return a formatted string of deduplicated sources."""

    # Get state
    search_queries = state["search_queries"]
    tavily_topic = state["tavily_topic"]
    tavily_days = state.get("tavily_days", None)

    # Web search
    query_list = [query.search_query for query in search_queries]
    search_docs = await tavily_search_async(query_list, tavily_topic, tavily_days)

    # Deduplicate and format sources
    source_str = deduplicate_and_format_sources(search_docs, max_tokens_per_source=5000, include_raw_content=True)

    return {"source_str": source_str}

def write_section(state: SectionState):
    """ Write a section of the report """

    # Get state
    section = state["section"]
    source_str = state["source_str"]

    # Format system instructions
    system_instructions = section_writer_instructions.format(section_title=section.name, section_topic=section.description, context=source_str)

    # Generate section
    section_content = llm.invoke([SystemMessage(content=system_instructions)]+[HumanMessage(content="Generate a report section based on the provided sources.")])

    # Write content to the section object
    section.content = section_content.content

    # Write the updated section to completed sections
    return {"completed_sections": [section]}

# Add nodes and edges
section_builder = StateGraph(SectionState, output=SectionOutputState)
section_builder.add_node("generate_queries", generate_queries)
section_builder.add_node("search_web", search_web)
section_builder.add_node("write_section", write_section)

section_builder.add_edge(START, "generate_queries")
section_builder.add_edge("generate_queries", "search_web")
section_builder.add_edge("search_web", "write_section")
section_builder.add_edge("write_section", END)

# Compile
section_builder_graph = section_builder.compile()

# View
#display(Image(section_builder_graph.get_graph(xray=1).draw_mermaid_png()))

In [42]:
# Test with one section
sections = sections['sections']
test_section = sections[1]
print(f"{'='*50}")
print(f"Name: {test_section.name}")
print(f"Description: {test_section.description}")
print(f"Research: {test_section.research}")

# Run
report_section = await section_builder_graph.ainvoke({"section": test_section, "number_of_queries": 2, "tavily_topic": tavily_topic, "tavily_days": tavily_days})

from IPython.display import Markdown
section = report_section['completed_sections'][0]
Markdown(section.content)
class ReportStateOutput(TypedDict):
    final_report: str # Final report

Name: CPU
Description: Examine the core features, architecture, and implementation of CPU processing units, along with a specific use case example.
Research: True


In [47]:
from langgraph.constants import Send

final_section_writer_instructions="""You are an expert technical writer crafting a section that synthesizes information from the rest of the report on a gene target mutation from clinical, biological, and business perspectives.

Section to write:
{section_topic}

Available report content:
{context}

1. Section-Specific Approach:

For Introduction:
- Use # for the report title (Markdown format)
- 100-200 word limit
- Write in simple and clear language
- Focus on the core motivation for the report in 1-2 paragraphs
- Introduce the significance of gene target mutations and their impact across clinical, biological, and business domains
- Include NO structural elements (no lists or tables)
- No sources section needed

For Main Body Sections:
- For the given gene target mutation, create a dedicated subsection.
- Each mutation’s section must include three clearly labeled parts:
  a. **Clinical Landscape:**
     - Start with a bulleted list of key clinical features (e.g., research findings, treatment implications, clinical trial outcomes)
     - Follow with a 4-6 sentence narrative summarizing the clinical impact.
  b. **Biological Landscape:**
     - Begin with a bulleted list of core biological characteristics (e.g., genetic pathways, molecular interactions)
     - Follow with a 4-6 sentence narrative outlining the biological mechanisms.
  c. **Business Landscape:**
     - Provide a bulleted list of market trends and commercial opportunities (e.g., investment highlights, market potential, industry challenges)
     - Follow with a 4-6 sentence narrative discussing economic and strategic implications.
- Ensure clear separation between each part, and avoid repetitive content across different sections.

For Conclusion/Summary:
- Use ## for the section title (Markdown format)
- 100-150 word limit
- For this comparative report on gene target mutations:
    * Must include a focused comparison table using Markdown table syntax that compares the mutations across clinical, biological, and business dimensions.
    * The table should distill insights from the report, highlighting relative strengths, challenges, and opportunities for each mutation.
    * Keep table entries clear and concise.
- End with specific next steps or implications for further research, development, or business strategy.
- No sources section needed

2. Writing Approach:
- Use concrete details over general statements.
- Make every word count.
- Focus on your single most important point.

3. Quality Checks:
- For the introduction: 100-200 word limit, # for the report title, no structural elements, no sources section.
- For main body sections: Ensure each mutation’s content is divided into the three required parts with clear labels and concise narratives.
- For the conclusion: 100-150 word limit, ## for the section title, only ONE structural element at most (the comparison table), no sources section.
- Markdown format.
- Do not include word count or any preamble in your response."""



def initiate_section_writing(state: ReportState):
    """ This is the "map" step when we kick off web research for some sections of the report """

    # Kick off section writing in parallel via Send() API for any sections that require research
    return [
        Send("build_section_with_web_research", {"section": s,
                                                 "number_of_queries": state["number_of_queries"],
                                                 "tavily_topic": state["tavily_topic"],
                                                 "tavily_days": state.get("tavily_days", None)})
        for s in state["sections"]
        if s.research
    ]

def write_final_sections(state: SectionState):
    """ Write final sections of the report, which do not require web search and use the completed sections as context """

    # Get state
    section = state["section"]
    completed_report_sections = state["report_sections_from_research"]

    # Format system instructions
    system_instructions = final_section_writer_instructions.format(section_title=section.name, section_topic=section.description, context=completed_report_sections)

    # Generate section
    section_content = llm.invoke([SystemMessage(content=system_instructions)]+[HumanMessage(content="Generate a report section based on the provided sources.")])

    # Write content to section
    section.content = section_content.content

    # Write the updated section to completed sections
    return {"completed_sections": [section]}

def gather_completed_sections(state: ReportState):
    """ Gather completed sections from research """

    # List of completed sections
    completed_sections = state["completed_sections"]

    # Format completed section to str to use as context for final sections
    completed_report_sections = format_sections(completed_sections)

    return {"report_sections_from_research": completed_report_sections}

def initiate_final_section_writing(state: ReportState):
    """ This is the "map" step when we kick off writing of the final sections, which do not require research, using the Send API """

    # Kick off section writing in parallel via Send() API for any sections that do not require research
    return [
        Send("write_final_sections", {"section": s, "report_sections_from_research": state["report_sections_from_research"]})
        for s in state["sections"]
        if not s.research
    ]

def compile_final_report(state: ReportState):
    """ Compile the final report """

    # Get sections
    sections = state["sections"]
    completed_sections = {s.name: s.content for s in state["completed_sections"]}

    # Update sections with completed content while maintaining original order
    for section in sections:
        section.content = completed_sections[section.name]

    # Compile final report
    all_sections = "\n\n".join([s.content for s in sections])

    return {"final_report": all_sections}

# Add nodes and edges
builder = StateGraph(ReportState, output=ReportStateOutput)
builder.add_node("generate_report_plan", generate_report_plan)
builder.add_node("build_section_with_web_research", section_builder.compile())
builder.add_node("gather_completed_sections", gather_completed_sections)
builder.add_node("write_final_sections", write_final_sections)
builder.add_node("compile_final_report", compile_final_report)
builder.add_edge(START, "generate_report_plan")
builder.add_conditional_edges("generate_report_plan", initiate_section_writing, ["build_section_with_web_research"])
builder.add_edge("build_section_with_web_research", "gather_completed_sections")
builder.add_conditional_edges("gather_completed_sections", initiate_final_section_writing, ["write_final_sections"])
builder.add_edge("write_final_sections", "compile_final_report")
builder.add_edge("compile_final_report", END)

graph = builder.compile()
#display(Image(graph.get_graph(xray=1).draw_mermaid_png()))
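
The compiled graph follows a map-reduce shape: plan the sections, fan out the researched sections in parallel, gather their text as context, fan out the remaining sections, then reassemble everything in the planned order. The sketch below mimics that control flow with plain `asyncio`. It does not use LangGraph, and `DemoSection`, `write_researched`, `write_synthesized`, and `run_report` are hypothetical stubs standing in for the LLM-backed nodes.

```python
import asyncio
from dataclasses import dataclass

@dataclass
class DemoSection:
    name: str
    research: bool
    content: str = ""

async def write_researched(section: DemoSection) -> DemoSection:
    """Stub for the research sub-graph (queries -> search -> write)."""
    await asyncio.sleep(0.01)
    section.content = f"[researched] {section.name}"
    return section

async def write_synthesized(section: DemoSection, context: str) -> DemoSection:
    """Stub for write_final_sections: sees the researched text as context."""
    await asyncio.sleep(0.01)
    n = len(context.splitlines())
    section.content = f"[synthesized from {n} sections] {section.name}"
    return section

async def run_report(plan: list[DemoSection]) -> str:
    # Map step 1: sections flagged for research run concurrently
    researched = await asyncio.gather(
        *(write_researched(s) for s in plan if s.research)
    )
    # Reduce: completed research becomes the context string
    context = "\n".join(s.content for s in researched)
    # Map step 2: remaining sections are written from that context
    synthesized = await asyncio.gather(
        *(write_synthesized(s, context) for s in plan if not s.research)
    )
    # Compile: look up content by name, preserving the planned order
    done = {s.name: s.content for s in [*researched, *synthesized]}
    return "\n\n".join(done[s.name] for s in plan)

plan = [
    DemoSection("Introduction", research=False),
    DemoSection("CPU", research=True),
    DemoSection("GPU", research=True),
    DemoSection("Conclusion", research=False),
]
final_report = asyncio.run(run_report(plan))
print(final_report)
```

Looking content up by section name is what keeps the final order tied to the plan rather than to whichever branch finished first. In a notebook, `asyncio.run(...)` would be replaced by `await run_report(plan)`.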

In [1]:
# Structure
report_structure = """This report provides a comprehensive landscape analysis of a given gene target mutation, emphasizing clinical, biological, and business perspectives.

The report structure should include:

1. Introduction
   - Brief overview of a gene target mutation
   - Context for its significance across clinical, biological, and business domains

2. Main Body Sections:
   - One dedicated section for a gene target mutation
   - Each section should include:
     a. Clinical Landscape:
        - Bulleted list of key clinical features and insights (e.g., current research findings, treatment implications, clinical trials)
        - A narrative summary in 4-6 sentences discussing the clinical relevance and potential impact on patient care
     b. Biological Landscape:
        - Bulleted list of core biological characteristics (e.g., genetic pathways, mechanisms, molecular interactions)
        - A narrative summary in 4-6 sentences outlining the biological functions and underlying mechanisms
     c. Business Landscape:
        - Bulleted list of market trends and commercial opportunities (e.g., investment highlights, market potential, industry challenges)
        - A narrative summary in 4-6 sentences addressing the economic and strategic implications

3. No Main Body Sections other than the ones dedicated to a gene target mutation

4. Conclusion with:
   - Highlights across clinical, biological, and business dimensions for the gene mutation
   - Highlights relative strengths, challenges, and opportunities for the mutation
"""


# Tavily search parameters
tavily_topic = "general"
tavily_days = None # Only applicable for news topic

report_topic = "Give a landscape review of the FLT3 mutation as a target for AML. Emphasize the clinical, biological, and business perspectives."

In [49]:
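from IPython.display import Markdown, display

report = await graph.ainvoke({"topic": report_topic,
                              "report_structure": report_structure,
                              "number_of_queries": 2,
                              "tavily_topic": tavily_topic,
                              "tavily_days": tavily_days})

# Render the final report as Markdown. A bare Markdown(...) expression only
# renders when it is the last line of a cell, so call display() explicitly.
display(Markdown(report['final_report']))

# Show the final report as a plain string
report['final_report']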
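
Since each run calls the LLM and the search API again, it can be handy to keep the generated report on disk. The following is a minimal sketch, assuming the result is a dict with a `final_report` string as in the cell above. `save_report` is a hypothetical helper, and the example dict is a stand-in for the real `graph.ainvoke` result.

```python
import tempfile
from pathlib import Path
from typing import Any, Mapping


def save_report(report: Mapping[str, Any], path: Path) -> Path:
    """Hypothetical helper: write report['final_report'] to a Markdown file."""
    text = report["final_report"]
    if not isinstance(text, str):
        raise TypeError("report['final_report'] must be a string")
    path.write_text(text, encoding="utf-8")
    return path


# Stand-in for the dict returned by graph.ainvoke(...)
example_report = {"final_report": "# FLT3 Mutation Overview\nExample body.\n"}

# Write into a temporary directory so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    saved_path = save_report(example_report, Path(tmp_dir) / "final_report.md")
    saved_text = saved_path.read_text(encoding="utf-8")

print(saved_text)
```

In the notebook itself, `save_report(report, Path("final_report.md"))` would write next to the notebook instead of a temporary directory.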





# FLT3 Mutation Overview
The FLT3 mutation is a significant factor in acute myeloid leukemia (AML), with approximately 30% of AML patients having this mutation. This mutation affects the FLT3 gene, which provides instructions for making a protein called FMS-like tyrosine kinase 3, involved in the normal development of stem cells and blood cells. Mutations in the FLT3 gene can lead to the production of a defective protein that disrupts normal cell development and leads to cancer.

## FLT3 Mutation Significance
The FLT3 mutation has significant implications across clinical, biological, and business domains.
* **Clinical Landscape:**
  * Key clinical features: poor prognosis, higher risk of relapse, shorter overall survival
  * The mutation is associated with a poor prognosis, with a higher risk of relapse and shorter overall survival. Research has shown that FLT3 mutations can be targeted with specific inhibitors, such as midostaurin and gilteritinib, which have improved outcomes in patients with FLT3-mutated AML. For example, the COMMODORE trial, a phase 3 study, evaluated gilteritinib versus salvage chemotherapy in patients with relapsed or refractory FLT3-mutated AML, showing that gilteritinib significantly improved overall survival and event-free survival.
* **Biological Landscape:**
  * Core biological characteristics: genetic pathways, molecular interactions, constitutive activation of the FLT3 receptor
  * The FLT3 gene encodes a receptor tyrosine kinase that plays a crucial role in hematopoietic cell development and proliferation. Mutations in the FLT3 gene, particularly internal tandem duplications (ITDs) and point mutations in the tyrosine kinase domain (TKD), lead to constitutive activation of the FLT3 receptor, resulting in uncontrolled cell growth and survival.
* **Business Landscape:**
  * Market trends and commercial opportunities: increasing incidence of AML, introduction of novel products, rising research and development investments
  * The global FLT3 inhibitors market is expected to grow significantly, driven by the increasing incidence of AML and the introduction of novel products. The market is projected to reach $2,061.3 million by 2032, with a CAGR of 14.88% during the forecast period. Companies such as Astellas Pharma Inc. are developing targeted therapies, including Xospata (Gilteritinib), a second-generation Type 1 tyrosine kinase inhibitor, which has shown inhibitory activity against FLT3 mutation and is used to treat adults with acute myeloid leukemia having FMS-like tyrosine kinase 3 mutation.

## FLT3 Mutation
**The FLT3 mutation is a significant factor in acute myeloid leukemia (AML), with approximately 30% of AML patients having this mutation**. The FLT3 gene provides instructions for making a protein called FMS-like tyrosine kinase 3, which is involved in the normal development of stem cells and blood cells. Mutations in the FLT3 gene can lead to the production of a defective protein that disrupts normal cell development and leads to cancer.

The COMMODORE trial, a phase 3 study, evaluated gilteritinib versus salvage chemotherapy in patients with relapsed or refractory FLT3-mutated AML. The results showed that gilteritinib significantly improved overall survival (OS) and event-free survival (EFS) compared to salvage chemotherapy. The median OS was 9.6 months with gilteritinib versus 5.0 months with salvage chemotherapy.

| Treatment | Median OS | Median EFS |
| --- | --- | --- |
| Gilteritinib | 9.6 months | 2.8 months |
| Salvage Chemotherapy | 5.0 months | 0.6 months |

The global FLT3 inhibitors market is expected to grow significantly, driven by the increasing incidence of AML and the introduction of novel products. The market is projected to reach $2,061.3 million by 2032, with a CAGR of 14.88% during the forecast period.

### Sources
- Phase 3 study of gilteritinib versus salvage chemotherapy in predominantly Asian patients with relapsed/refractory FLT3-mutated acute myeloid leukemia: https://www.nature.com/articles/s41375-024-02382-9
- FLT3 Inhibitors Market - A Global and Country Analysis: Focus on Commercialized Therapy, Potential Pipeline Product, and Region - Analysis and Forecast, 2022-2032: https://www.researchandmarkets.com/reports/5646700/flt3-inhibitors-market-a-global-and-country
- Global FLT3 Inhibitors Market Size, Top Share, Report to 2032: https://straitsresearch.com/report/flt3-inhibitors-market
- Global FLT3 Inhibitors Market Analysis Report 2022: A $2+ Billion Market by 2032: https://finance.yahoo.com/news/global-flt3-inhibitors-market-analysis-110300282.html
- FLT3 Inhibitors Market - Industry Analysis, Forecast & Trends | BIS Research: https://bisresearch.com/industry-report/flt3-inhibitors-market.html

## FLT3 Mutation in Acute Myeloid Leukemia
**The FLT3 mutation is a significant prognostic factor in acute myeloid leukemia (AML), with approximately 30% of patients harboring this mutation**. The mutation is associated with a poor prognosis, with a higher risk of relapse and shorter overall survival. Research has shown that FLT3 mutations can be targeted with specific inhibitors, such as midostaurin and gilteritinib, which have improved outcomes in patients with FLT3-mutated AML.

A specific example of the clinical impact of FLT3 mutations is the QuANTUM-First study, which evaluated the use of quizartinib in combination with intensive chemotherapy in patients with newly diagnosed FLT3-ITD-mutated AML. The study found that the addition of quizartinib improved overall survival, with a median OS of 31.9 months compared to 15.1 months with chemotherapy alone.

Current treatment approaches for FLT3-mutated AML include the use of FLT3 inhibitors in combination with intensive chemotherapy or hypomethylating agents. Ongoing clinical trials are evaluating the efficacy of these approaches, including the use of quizartinib and gilteritinib in combination with azacitidine or venetoclax.

### Sources
- FLT3 Mutations in Acute Myeloid Leukemia: Unraveling the Molecular ...: https://pmc.ncbi.nlm.nih.gov/articles/PMC10590537/
- Molecular and Clinical Features of - American Society of Hematology: https://ashpublications.org/blood/article/140/Supplement+1/6244/489271/Molecular-and-Clinical-Features-of-FLT3
- A Comprehensive Analysis of FLT3 Mutation Profiles and Clinical ...: https://ashpublications.org/blood/article/144/Supplement+1/1488/532081/A-Comprehensive-Analysis-of-FLT3-Mutation-Profiles
- FLT3 mutations in acute myeloid leukemia: Therapeutic paradigm beyond ...: https://pmc.ncbi.nlm.nih.gov/articles/PMC7004512/
- The Shifting Prognosis of - American Society of Hematology: https://ashpublications.org/blood/article/142/Supplement+1/958/499409/The-Shifting-Prognosis-of-FLT3-Mutations-in-Acute
- AML with FLT3 mutation: Symptoms, treatment, and outlook: https://www.medicalnewstoday.com/articles/acute-myeloid-leukemia-with-flt3-mutation
- FLT3-targeted treatment for acute myeloid leukemia - PubMed: https://pubmed.ncbi.nlm.nih.gov/35532877/
- Treatment of older adults with FLT3 -mutated AML: Emerging ... - Nature: https://www.nature.com/articles/s41408-023-00911-w
- FLT3 mutated acute myeloid leukemia: 2021 treatment algorithm: https://www.nature.com/articles/s41408-021-00495-3

## Genetic Pathways and Mechanisms Underlying FLT3 Mutation
**The FLT3 mutation is a key driver of acute myeloid leukemia (AML) development and progression, with approximately 30% of newly diagnosed AML patients harboring this mutation**. The FLT3 gene encodes a receptor tyrosine kinase that plays a crucial role in hematopoietic cell development and proliferation. Mutations in the FLT3 gene, particularly internal tandem duplications (ITDs) and point mutations in the tyrosine kinase domain (TKD), lead to constitutive activation of the FLT3 receptor, resulting in uncontrolled cell growth and survival. For example, a study of 730 patients with AML found that 127 had FLT3-ITD mutations, which were associated with a higher risk of relapse and inferior overall survival. Understanding the genetic pathways and mechanisms underlying FLT3 mutation is essential for the development of effective therapeutic strategies.

### Sources
- Targeting FLT3 Mutation in Acute Myeloid Leukemia: Current Strategies and Future Directions : https://pmc.ncbi.nlm.nih.gov/articles/PMC10136888/
- Targeting FLT3 mutations in AML: review of current knowledge and evidence : https://www.nature.com/articles/s41375-018-0357-9
- FLT3 Mutations in Acute Myeloid Leukemia: Key Concepts and Emerging Controversies : https://pmc.ncbi.nlm.nih.gov/articles/PMC7787101/
- The importance of FLT3 mutational analysis in acute myeloid leukemia : https://pubmed.ncbi.nlm.nih.gov/29164965/
- FLT3 mutational analysis in acute myeloid leukemia: Advantages and pitfalls with different approaches : https://pubmed.ncbi.nlm.nih.gov/35086749/
- FLT3 mutations in acute myeloid leukemia: Therapeutic paradigm beyond inhibitor development : https://pmc.ncbi.nlm.nih.gov/articles/PMC7004512/

## Market Trends and Opportunities in FLT3 Mutation
**The global FLT3 inhibitors market is expected to grow at a CAGR of 10.87% from 2024 to 2032, driven by increasing incidence of acute myeloid leukemia and introduction of novel products**. The market size was valued at USD 487.53 million in 2023 and is projected to reach USD 1,234.03 million by 2032. The Asia-Pacific region is expected to be a lucrative market for FLT3 inhibitors, with countries such as China and Japan dedicated to researching emerging targeted therapies for various FLT3 mutated cancers.

A specific example of a company operating in this market is Astellas Pharma Inc., which developed Xospata (Gilteritinib), a second-generation Type 1 tyrosine kinase inhibitor. The drug has shown inhibitory activity against FLT3 mutation and is used to treat adults with acute myeloid leukemia having FMS-like tyrosine kinase 3 mutation.

The market is driven by factors such as increasing incidence of acute myeloid leukemia, introduction of novel products, and rising research and development investments. However, the market also faces challenges such as disease relapse in FLT3 mutated AML and high treatment costs.

### Sources
- Global FLT3 Inhibitors Market Size, Top Share, Report to 2032: https://straitsresearch.com/report/flt3-inhibitors-market
- FLT3 Inhibitors Market by Type, Product, Application - Global Forecast 2025-2030: https://www.giiresearch.com/report/ires1606888-flt3-inhibitors-market-by-type-flt3-itd-internal.html
- Medical Flt3 Inhibitor Market Overall Study Report 2024-2032: https://www.openpr.com/news/3663193/medical-flt3-inhibitor-market-overall-study-report-2024-2032
- FLT3 Inhibitors Market - Industry Analysis, Forecast & Trends | BIS Research: https://bisresearch.com/industry-report/flt3-inhibitors-market.html
- FLT3 Inhibitors Market And Pipeline Insights 2023: https://www.precisionbusinessinsights.com/market-reports/flt3-inhibitors-market
- FLT3 Mutation: Detailed Discussion on Prognosis and Pathways: https://biologyinsights.com/flt3-mutation-detailed-discussion-on-prognosis-and-pathways/
- The Shifting Prognosis of FLT3 Mutations in Acute Myeloid Leukemia in the Era of Targeted Therapy: https://ashpublications.org/blood/article/142/Supplement+1/958/499409/The-Shifting-Prognosis-of-FLT3-Mutations-in-Acute
- Optimising Therapy in FLT3-Mutated Acute Myeloid Leukaemia: https://ashpublications.org/blood/article/144/Supplement+1/4191.1/528981/Optimising-Therapy-in-FLT3-Mutated-Acute-Myeloid
- FLT3 Mutations in Acute Myeloid Leukemia: Key Concepts and Emerging Controversies: https://pmc.ncbi.nlm.nih.gov/articles/PMC7787101/
- FLT3 is associated with dendritic cell infiltration, tertiary lymphoid structure construction, and predict response to checkpoint inhibitors immunotherapy in solid cancers: https://www.nature.com/articles/s41598-025-86185-7

## Summarize Insights, Highlight Relative Strengths, Challenges, and Opportunities for the FLT3 Mutation
The FLT3 mutation is a significant factor in acute myeloid leukemia (AML), with approximately 30% of AML patients having this mutation. Across clinical, biological, and business dimensions, the FLT3 mutation presents various strengths, challenges, and opportunities.

### Clinical Landscape
* Key clinical features:
  + Approximately 30% of AML patients have the FLT3 mutation
  + Associated with a poor prognosis and higher risk of relapse
  + Targeted therapies, such as midostaurin and gilteritinib, have improved outcomes
* The FLT3 mutation is a significant prognostic factor in AML, with research showing that targeted therapies can improve overall survival and event-free survival. For example, the COMMODORE trial demonstrated that gilteritinib significantly improved overall survival and event-free survival compared to salvage chemotherapy.

### Biological Landscape
* Core biological characteristics:
  + The FLT3 gene encodes a receptor tyrosine kinase involved in hematopoietic cell development and proliferation
  + Mutations in the FLT3 gene lead to constitutive activation of the FLT3 receptor, resulting in uncontrolled cell growth and survival
  + Understanding the genetic pathways and mechanisms underlying FLT3 mutation is essential for developing effective therapeutic strategies
* The FLT3 mutation is a key driver of AML development and progression, with mutations in the FLT3 gene leading to constitutive activation of the FLT3 receptor. Research has shown that understanding the genetic pathways and mechanisms underlying FLT3 mutation is crucial for developing effective therapeutic strategies.

### Business Landscape
* Market trends and commercial opportunities:
  + The global FLT3 inhibitors market is expected to grow significantly, driven by the increasing incidence of AML and the introduction of novel products
  + The market is projected to reach $2,061.3 million by 2032, with a CAGR of 14.88% during the forecast period
  + Companies, such as Astellas Pharma Inc., are developing targeted therapies, such as gilteritinib, to treat FLT3-mutated AML
* The FLT3 inhibitors market presents significant commercial opportunities, driven by the increasing incidence of AML and the introduction of novel products. Companies are investing in research and development to develop effective targeted therapies, such as gilteritinib, to treat FLT3-mutated AML.

## Conclusion
The FLT3 mutation presents various strengths, challenges, and opportunities across clinical, biological, and business dimensions. The following table compares the FLT3 mutation across these dimensions:

| Dimension | Strengths | Challenges | Opportunities |
| --- | --- | --- | --- |
| Clinical | Targeted therapies improve outcomes | Poor prognosis and higher risk of relapse | Development of effective therapeutic strategies |
| Biological | Understanding genetic pathways and mechanisms | Constitutive activation of FLT3 receptor | Development of targeted therapies |
| Business | Growing market and commercial opportunities | High treatment costs and disease relapse | Investment in research and development |

Further research and development are necessary to fully understand the FLT3 mutation and develop effective therapeutic strategies to improve patient outcomes.
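
The generated report quotes two different market projections (a CAGR of 14.88% in one section, 10.87% in another), so it may be worth sanity-checking the figures that can be checked. The "Market Trends and Opportunities" section gives a start value (USD 487.53 million in 2023) and an end value (USD 1,234.03 million in 2032), which is enough to recompute the implied growth rate. This is a minimal sketch, assuming nine annual compounding periods between 2023 and 2032. `implied_cagr` is a hypothetical helper name.

```python
def implied_cagr(start_value: float, end_value: float, periods: int) -> float:
    """Compound annual growth rate implied by a start value, end value, and period count."""
    if start_value <= 0 or periods <= 0:
        raise ValueError("start_value and periods must be positive")
    return (end_value / start_value) ** (1 / periods) - 1


start_musd = 487.53     # 2023 market size quoted in the report (USD million)
end_musd = 1234.03      # 2032 projection quoted in the report (USD million)
periods = 2032 - 2023   # assumption: nine annual compounding periods

cagr = implied_cagr(start_musd, end_musd, periods)
print(f"Implied CAGR: {cagr:.2%}")
```

Under that assumption the result is close to the 10.87% quoted in the same section, so those three numbers are at least consistent with each other. The 14.88% figure comes from a different source with a different projection and cannot be checked this way, because the report gives no start value for it.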
