<a href="https://colab.research.google.com/github/shahriar-faghani/ASNR_ASFNR_AI_Workshop_2025/blob/main/Multi_Agent_Large_Language_Models_in_Neuroradiology.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Multi-Agent Large Language Models in Neuroradiology
## ASNR-ASFNR AI Workshop 2025

Authors:
- **Shahriar Faghani**, MD
- **Pouria Rouzrokh**, MD, MPH, MHPE
- **Mana Moassefi**, MD

Welcome to this workshop! Here, we will learn together how multi-agent LLM-based systems can enhance radiology workflows.

## Part 1. Setup and Installation
This first section handles the necessary setup for our Colab environment. We'll update system packages, install required Python libraries, import them, and configure access to Google Drive and API keys.

### System Updates & Dependencies

First, let's update the system package list and install ffmpeg. This is a command-line tool required by the openai-whisper library for processing audio files.

In [1]:
# @title
# Update package list and install ffmpeg
!apt-get update -y && apt-get install -y ffmpeg


### 1.1 Install Python Packages

Next, we install all the Python libraries needed for this project using pip. We use the --quiet flag to reduce installation noise.

- smolagents[transformers]: The core framework for building multi-agent systems, including support for transformer-based models.
- gradio: To create the interactive web UI for transcription and analysis.
- openai-whisper: For accurate speech-to-text transcription.
- python-dotenv: To load API keys securely from a .env file.
- groq: The client library to interact with the fast Groq LLM API (used for entity extraction).
- openai: The OpenAI client library, available for additional LLM interactions or helper functions.
- requests, beautifulsoup4: For fetching and parsing web content (specifically Radiopaedia).
- sentence-transformers: To generate text embeddings for our local PDF search (RAG).
- faiss-cpu: A library for efficient similarity search, used to index the PDF embeddings.
- PyPDF2: To extract text content from PDF files.
- huggingface_hub: For logging into Hugging Face and using models/tools from the Hub.
- torch: The underlying deep learning framework needed by Whisper and SentenceTransformers.
- duckduckgo-search: The search backend for the DuckDuckGo web search tool used later in the workshop.

In [2]:
# @title
# Install required Python packages
!pip install --quiet \
    smolagents[transformers] \
    gradio \
    openai-whisper \
    python-dotenv \
    groq \
    openai \
    requests \
    beautifulsoup4 \
    sentence-transformers \
    faiss-cpu \
    PyPDF2 \
    huggingface_hub \
    torch \
    duckduckgo-search
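
Several of these packages are imported under a different name than the one used with pip (for example, openai-whisper is imported as whisper). As a minimal sketch, the check below reports which packages cannot be found after installation. The mapping is taken from this notebook's import cell; the helper name `missing_packages` is hypothetical and only for illustration.

```python
import importlib.util

# pip name -> import name, as used in this notebook's import cell
PIP_TO_IMPORT = {
    "openai-whisper": "whisper",
    "python-dotenv": "dotenv",
    "beautifulsoup4": "bs4",
    "faiss-cpu": "faiss",
    "sentence-transformers": "sentence_transformers",
    "PyPDF2": "PyPDF2",
}

def missing_packages(pip_to_import: dict[str, str]) -> list[str]:
    """Return the pip names whose import module cannot be located."""
    return [
        pip_name
        for pip_name, module_name in pip_to_import.items()
        if importlib.util.find_spec(module_name) is None
    ]

missing = missing_packages(PIP_TO_IMPORT)
print("Missing packages:", missing or "none")
```

If the list is not empty, re-running the install cell (without --quiet) usually shows which installation step failed.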


### 1.2 Import Libraries

Now we import all the necessary modules and classes into our Python environment.

In [3]:
# @title
import os
import re
import unicodedata
import time
import json # For handling structured data between agents
from pathlib import Path
import shutil # For cleaning /tmp if needed

# Web and API interaction
import requests
from bs4 import BeautifulSoup
from groq import Groq
from huggingface_hub import login, HfFolder

# Deep Learning and NLP
import torch
import whisper
from sentence_transformers import SentenceTransformer
import faiss
import numpy as np

# PDF Processing
from PyPDF2 import PdfReader

# UI and Environment
import getpass
import gradio as gr
from google.colab import drive, files # For Drive mounting
from dotenv import load_dotenv

# SmolAgents core components
from smolagents import (
    Tool, # Base class for tools (alternative definition method)
    tool, # Decorator for function-based tools (preferred)
    CodeAgent,
    InferenceClientModel, # Connects to HF Inference API and partners
)

### 1.3 Mount Google Drive and Environment Setup
To access API keys securely and load reference PDF documents, we need to connect this Colab notebook to your Google Drive.

**Action Required:**

1. Run the next cell. It will prompt you to authorize access to your Google Drive. Follow the link, sign in, copy the authorization code, and paste it back into the Colab input field.
2. In your Google Drive's root folder (My Drive), create a new folder named RadiologyMultiAgentColab.
3. Inside RadiologyMultiAgentColab, create a file named api.env.
4. Edit the api.env file and add your API keys like this (replace placeholders with your actual keys):
```python
HF_TOKEN=hf_YourHuggingFaceTokenHere
GROQ_API_KEY=gsk_YourGroqApiKeyHere
```
5. Inside RadiologyMultiAgentColab, create another folder named SharedRadiologyPDFs.
6. Upload your internal reference PDF documents into the SharedRadiologyPDFs folder (For testing, you might place a sample PDF like Neuroradiology_Core_Reference.pdf there).

The following code will attempt to mount your drive and load the api.env file.

In [None]:
# Remove sample_data folder
!rm -rf /content/sample_data

# Mount Google Drive
MOUNT_GOOGLE_DRIVE = True #@param {type:"boolean"}
USE_GETPASS = True #@param {type:"boolean"}

if MOUNT_GOOGLE_DRIVE:
  try:
      drive.mount('/content/drive')
      print("Google Drive mounted successfully.")

      # Define base path for project files in Drive
      DRIVE_PROJECT_PATH = Path('/content/drive/MyDrive/RadiologyMultiAgentColab')

      # Ensure the project directory exists
      #DRIVE_PROJECT_PATH.mkdir(parents=True, exist_ok=True)
      Path('/content/drive/MyDrive/RadiologyMultiAgentColab/SharedRadiologyPDFs').mkdir(parents=True, exist_ok=True)
      print(f"Project path set to: {DRIVE_PROJECT_PATH}")

  except Exception as e:
    print(f"Error mounting Google Drive: {e}")
    MOUNT_GOOGLE_DRIVE = False

if not MOUNT_GOOGLE_DRIVE:
    print("Not running in Google Colab environment. Google Drive not mounted.")
    print("Please ensure your api.env file and PDF references are accessible locally.")
    DRIVE_PROJECT_PATH = Path('.')

# --- Load Environment Variables ---
env_path = DRIVE_PROJECT_PATH / 'api.env'
if env_path.exists():
    load_dotenv(dotenv_path=env_path)
    print(f".env file found and loaded from the path: {env_path}.")
else:
    print(f"Warning: .env file not found at {env_path}. Falling back to getpass.")
    if USE_GETPASS:
      try:
        # Prompt for and save Hugging Face API token
        hf_token = getpass.getpass("Enter your Hugging Face API token: ")
        os.environ["HF_TOKEN"] = hf_token
        print("Hugging Face API token saved.")
        # Prompt for and save Groq API key
        groq_key = getpass.getpass("Enter your Groq API key: ")
        os.environ["GROQ_API_KEY"] = groq_key
        print("Groq API key saved.")
      except Exception as e:
          print(f"Error reading API keys via getpass: {e}")
    else:
      print("Please make sure the 'api.env' file is present and accessible.")
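
To make the api.env format concrete, here is a simplified, standard-library-only sketch of reading KEY=VALUE lines and checking that both keys are present. It is an illustration only: python-dotenv's `load_dotenv` handles more cases (such as quoting rules and exports), and the helper `read_env_file` and the placeholder values are hypothetical.

```python
import tempfile
from pathlib import Path

def read_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values

# Demonstration with a throwaway file and placeholder (non-secret) values
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "api.env"
    demo_path.write_text(
        "# demo file\nHF_TOKEN=hf_placeholder\nGROQ_API_KEY=gsk_placeholder\n",
        encoding="utf-8",
    )
    demo_values = read_env_file(demo_path)

required = ["HF_TOKEN", "GROQ_API_KEY"]
missing_keys = [name for name in required if not demo_values.get(name)]
print("Keys found:", sorted(demo_values), "| missing:", missing_keys)
```

Note that the file has no spaces around the equals sign and one key per line, matching the example in the instructions above.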

### 1.4 API Key Verification and Login
Let's verify that the API keys were loaded correctly and log in to the Hugging Face Hub. We also initialize the Groq client.

In [7]:
# @title
# --- API Key Verification and Login ---
HF_TOKEN = os.getenv("HF_TOKEN")
GROQ_API_KEY = os.getenv("GROQ_API_KEY")
groq_client = None # Initialize client variable

if not HF_TOKEN:
    print("⚠️ Hugging Face Token (HF_TOKEN) not found. Analysis requiring Hugging Face models will fail.")
    print("   Please add it to your api.env file or enter it when prompted.")
else:
    try:
        login(token=HF_TOKEN, add_to_git_credential=False) # Avoid git credential helper issues in Colab
        print("✅ Successfully logged into Hugging Face Hub.")
    except Exception as e:
        print(f"❌ Error logging into Hugging Face Hub: {e}")
        print("   Please ensure your HF_TOKEN is valid.")

if not GROQ_API_KEY:
    print("⚠️ Groq API Key (GROQ_API_KEY) not found. Entity extraction will fail.")
    print("   Please add it to your api.env file or enter it when prompted.")
else:
    try:
        groq_client = Groq(api_key=GROQ_API_KEY)
        # Test Groq connection by listing models
        models = groq_client.models.list()
        print(f"✅ Successfully connected to Groq API. Available models (first few): {[m.id for m in models.data[:3]]}...")
    except Exception as e:
        print(f"❌ Error connecting to Groq API: {e}")
        print("   Please ensure your GROQ_API_KEY is valid.")
        groq_client = None # Ensure client is None if connection failed

Note: Environment variable`HF_TOKEN` is set and is the current active token independently from the token you've just configured.


✅ Successfully logged into Hugging Face Hub.
✅ Successfully connected to Groq API. Available models (first few): ['allam-2-7b', 'meta-llama/llama-4-scout-17b-16e-instruct', 'compound-beta']...


### 1.5 PDF Reference Path Setup

We define the path to the folder containing your local PDF references and check if it exists and contains PDF files.

In [9]:
# @title
# --- PDF Reference Path ---
PDF_FOLDER_PATH = DRIVE_PROJECT_PATH / 'SharedRadiologyPDFs'

if PDF_FOLDER_PATH.exists() and PDF_FOLDER_PATH.is_dir():
    print(f"✅ PDF reference folder found at: {PDF_FOLDER_PATH}")
    pdf_files = list(PDF_FOLDER_PATH.glob("*.pdf"))
    if pdf_files:
        print(f"   Found {len(pdf_files)} PDF file(s):")
        for pdf_file in pdf_files[:5]: # Print first 5
             print(f"   - {pdf_file.name}")
        if len(pdf_files) > 5:
            print("   ...")
    else:
        print(f"⚠️ Warning: The folder {PDF_FOLDER_PATH} exists, but no PDF files were found inside.")
else:
    print(f"⚠️ Warning: PDF reference folder not found or is not a directory at {PDF_FOLDER_PATH}")
    print("   Please ensure you created the 'SharedRadiologyPDFs' folder inside 'RadiologyMultiAgentColab' and uploaded PDFs.")
    pdf_files = []

✅ PDF reference folder found at: /content/drive/MyDrive/RadiologyMultiAgentColab/SharedRadiologyPDFs
   Found 3 PDF file(s):
   - AD-M2400 ARIA in the ED infographic.pdf
   - jksr-86-17-s001.pdf
   - ARIA_differential_diagnosis.pdf


### 1.6 Check GPU Availability
Machine learning tasks like transcription and embedding generation run much faster on a GPU. Let's check if one is available in this Colab session. (Go to Runtime -> Change runtime type -> Hardware accelerator -> GPU if needed).

In [10]:
# @title
# Check for GPU
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"🤖 Using device: {device.upper()}")
if device == "cpu":
    print("   Note: Operations like transcription and embedding will be slower on CPU.")

🤖 Using device: CUDA


### 1.7 Clean Temporary Files

Sometimes, Colab's temporary directory (/tmp) can fill up. Uncomment and run the following cell if you encounter disk space errors later.

In [11]:
# @title
# --- Optional: Clean Temporary Files ---

# tmp_path = Path('/tmp')
# if tmp_path.exists():
#     print("Cleaning /tmp/ directory...")
#     cleaned_count = 0
#     error_count = 0
#     for item in tmp_path.iterdir():
#         try:
#             if item.is_file() or item.is_symlink():
#                 item.unlink()
#                 cleaned_count += 1
#             elif item.is_dir():
#                 shutil.rmtree(item)
#                 cleaned_count += 1
#         except Exception as e:
#             # print(f"Could not remove {item}: {e}") # Can be noisy
#             error_count += 1
#     print(f"/tmp/ directory cleaned. Removed {cleaned_count} items. Encountered {error_count} errors.")

## Part 2. Transcription Interface with Whisper + Gradio
In this section, we'll set up the speech-to-text transcription functionality. We'll load the OpenAI Whisper model and create a simple Gradio interface that allows users to:

- See a sample brain MRI image for context.
- Record a short radiology report using their microphone.
- Get the transcribed text output.

This initial interface will help us test the transcription part before integrating it into the full multi-agent workflow.

### Load Whisper Model
We'll load a pre-trained Whisper model, a widely used speech-to-text model. Whisper comes in various sizes (e.g., tiny, base, small, medium, large). Larger models are generally more accurate but slower and require more resources, while smaller models can run on consumer devices (e.g., MacBooks and even some smartphones).

For this educational notebook, base.en (English-only base model) is a good balance. We ensure the model is loaded only once to save resources.

In [12]:
# @title
# Load Whisper Model
# Ensure model is loaded only once using a global-like check

try:
  whisper_model
except NameError:
  whisper_model = None

# You can choose different model sizes: "tiny.en", "base.en", "small.en", "medium.en"
# ".en" models are English-only and typically faster/smaller than multilingual ones.
whisper_model_size = "base.en"

if 'whisper_model' not in globals() or whisper_model is None:
    try:
        print("Loading Whisper model (this may take a moment)...")
        whisper_model = whisper.load_model(whisper_model_size, device=device)
        print(f"✅ Whisper model '{whisper_model_size}' loaded successfully on {device.upper()}.")
    except Exception as e:
        print(f"❌ Error loading Whisper model: {e}")
        print("   Transcription functionality will not work. Please check your setup and GPU (if selected).")
        whisper_model = None
else:
    print(f"✅ Whisper model ('{whisper_model_size}') already loaded.")

Loading Whisper model (this may take a moment)...


100%|███████████████████████████████████████| 139M/139M [00:02<00:00, 52.2MiB/s]


✅ Whisper model 'base.en' loaded successfully on CUDA.
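
The load-once idea can also be expressed with a cached loader function. The sketch below is an alternative pattern, not the approach used in the cell above: it uses `functools.lru_cache`, and `_expensive_load` is a hypothetical stand-in for `whisper.load_model` so that the example runs without downloading a model.

```python
from functools import lru_cache

load_calls = []  # records each real load, for demonstration only

def _expensive_load(model_size: str) -> dict:
    """Stand-in for whisper.load_model; returns a dummy object."""
    load_calls.append(model_size)
    return {"size": model_size}

@lru_cache(maxsize=1)
def get_model(model_size: str) -> dict:
    """Load the model on the first call and reuse it on later calls."""
    return _expensive_load(model_size)

first = get_model("base.en")
second = get_model("base.en")  # served from the cache, no second load
print(first is second, len(load_calls))
```

With `maxsize=1`, asking for a different size evicts the cached model, which keeps only one model in memory at a time.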


### 2.1 Define Transcription Function
This Python function takes the filepath of an audio recording (as provided by Gradio) and uses the loaded Whisper model to convert the speech into text. It includes basic error handling.


In [14]:
# @title
def transcribe_audio(audio_filepath: str | None) -> str:
    """
    Transcribes the given audio file using the pre-loaded Whisper model.

    Args:
        audio_filepath: Path to the audio file, or None if no file is provided.

    Returns:
        The transcribed text, or an error/informational message.
    """
    if whisper_model is None:
        return "Error: Whisper model is not loaded. Cannot transcribe."
    if not audio_filepath: # Gradio might pass None if no audio is recorded
        return "No audio recorded. Please use the microphone to record your report."

    print(f"Transcribing audio file: {audio_filepath}...")
    try:
        # The 'language' parameter can be useful for multilingual models.
        # For ".en" models, it's less critical but doesn't hurt.
        result = whisper_model.transcribe(audio_filepath, language='en', fp16=torch.cuda.is_available())
        transcription = result['text']
        print("Transcription complete.")
        return transcription.strip() if transcription else "Transcription result was empty."
    except Exception as e:
        print(f"❌ Error during transcription: {e}")
        return f"Error during transcription: {e}"

### 2.2 Simple Gradio Interface for Transcription
Now, we'll create a basic Gradio interface. It will display:
- A title and brief instructions.
- A sample brain MRI image to provide context for the dictation.
- An audio input component that allows recording from the microphone.
- A button to trigger the transcription.
- A textbox to display the transcription result.

This interface will be launched for testing. Later, we'll build a more advanced, integrated GUI for our multi-agent workflow!

In [None]:
# @title
# Simple Gradio Interface for Transcription

# --- Brain MRI Image ---
# You can replace this URL with a direct link to any publicly accessible image,
# or upload an image to your Colab session and use its local path.
# For demonstration, we use an image hosted on radiologyassistant.nl.
brain_image_url = 'https://radiologyassistant.nl/assets/brain-ischemia-vascular-territories/a509797855a416_PCA-infarct2.jpg'
brain_image_caption = "Sample Brain CT (Axial non-contrast)"

print("Setting up basic Gradio transcription interface...")

with gr.Blocks(css="footer {display: none !important}") as basic_transcription_interface:
    gr.Markdown("# Step 1: Transcribe Your Radiology Report")
    gr.Markdown(
        "Focus on the image below (or imagine a case). "
        "Record a short dictation (3-4 sentences) using your microphone, "
        "then click 'Transcribe Audio' to see the text."
    )

    with gr.Row():
        gr.Image(value=brain_image_url, label=brain_image_caption, height=350, width=350, elem_id="sample-mri-image")
        with gr.Column(scale=2): # Give more space to controls
            audio_input = gr.Audio(
                sources=["microphone"],
                type="filepath", # Whisper needs a filepath
                label="Record Your Report Dictation Here:"
            )
            transcribe_button = gr.Button("🎤 Transcribe Audio", variant="primary")

    transcription_output = gr.Textbox(
        label="📝 Transcription Result:",
        lines=5,
        placeholder="Your transcribed report will appear here..."
    )

    # Connect button click to the transcription function
    transcribe_button.click(
        fn=transcribe_audio,
        inputs=audio_input,
        outputs=transcription_output
    )

# --- Launch the basic interface for testing ---
# Note: Comment out or remove the launch call once we build the final advanced GUI,
# as launching multiple Gradio apps in one notebook can sometimes be problematic.
# For now, it's useful for isolated testing of transcription.

# With debug=True the cell keeps running until it is interrupted, so the notes are printed first.
print("Launching basic transcription interface. You can test recording and transcription now.")
print("After testing, you might want to interrupt/stop this cell before proceeding to avoid issues with the final GUI.")
basic_transcription_interface.launch(share=True, debug=True)

## Part 3. Tool Definitions
In the SmolAgents framework, "tools" are specialized functions that our AI agents can utilize to perform specific actions or interact with external data sources. The Large Language Model (LLM) that powers each agent intelligently decides which tool to use based on its current task and the descriptive information provided for each available tool.

We will use four main tools for our radiology workflow: three that we define ourselves and one provided by SmolAgents.

1. **Radiopaedia Content Extraction Tool** (*radiopaedia_content_extraction_tool*): This tool is designed to fetch the main textual content from a specific Radiopaedia.org article, given its direct URL. The URL itself will be identified by the RadiopaediaExpertAgent using a general web search.

2. **Internal PDF RAG Query Tool** (*query_internal_references_tool*): This tool enables searching our curated library of local PDF documents (e.g., textbooks, guidelines). When given a query, typically an imaging finding, it retrieves the most relevant text chunks from these PDFs. Its output is a structured JSON string, providing not just the text but also metadata like the source PDF and a unique global_chunk_index for each chunk. This index is vital for contextual expansion.

3. **Retrieve Neighboring Chunks Tool** (*retrieve_neighboring_chunks_tool*): This tool works in conjunction with the RAG query tool. Given the global_chunk_index of a chunk retrieved by the RAG tool, it fetches a specified number of text chunks immediately preceding and succeeding it from the same source PDF document. This allows an agent to gain broader context around an initially identified relevant piece of information.

4. **Web Search Tool** (*DuckDuckGoSearchTool*): A general-purpose web search tool (using DuckDuckGo) provided by SmolAgents, which our RadiopaediaExpertAgent will use to find relevant article URLs.

We'll use the @tool decorator from SmolAgents, which is a convenient way to wrap Python functions and make them usable by agents. It automatically infers the tool's name, inputs, and output types from the function signature and docstring.
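
To illustrate the idea of inferring a tool description from a function, the sketch below builds a small specification from a function's signature and docstring using only the standard library. It is not SmolAgents' actual implementation; `describe_function` and `count_words` are hypothetical names used for illustration.

```python
import inspect

def describe_function(func) -> dict:
    """Build a simple tool description from a function's signature and docstring."""
    signature = inspect.signature(func)
    return {
        "name": func.__name__,
        # First paragraph of the docstring serves as the description
        "description": (inspect.getdoc(func) or "").split("\n\n")[0],
        "inputs": {
            name: getattr(param.annotation, "__name__", str(param.annotation))
            for name, param in signature.parameters.items()
        },
        "output_type": getattr(
            signature.return_annotation, "__name__", str(signature.return_annotation)
        ),
    }

def count_words(text: str) -> int:
    """Counts whitespace-separated words in a piece of text.

    Args:
        text: The text to analyze.
    """
    return len(text.split())

spec = describe_function(count_words)
print(spec)
```

This is why the tools below carry type hints and detailed docstrings: the description is what the agent's LLM reads when deciding which tool to call.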

### 3.1 Radiopaedia Content Extraction Tool

This tool is straightforward to define because it needs no helper functions.

In [16]:
# @title
# In Section 3

@tool
def radiopaedia_content_extraction_tool(page_url: str) -> str:
    """
    Fetches a specific Radiopaedia.org article page given its full URL and extracts
    the main textual content. It aims to return a clean version of the article body.
    The calling agent is responsible for summarizing this extracted content.

    Args:
        page_url (str): The full URL of the Radiopaedia article to process
                        (e.g., "https://radiopaedia.org/articles/stroke?lang=us").

    Returns:
        str: A string containing the extracted and cleaned textual content of the article.
             The content might be truncated by this tool if exceedingly long to ensure manageability.
             If an error occurs (e.g., URL not found, content not extractable),
             an error message string is returned (e.g., "Error: Failed to fetch...").
    """
    print(f"Radiopaedia Content Extractor: Attempting to fetch URL: {page_url}")
    # Validate that the URL is for a Radiopaedia article
    if not page_url or not (page_url.startswith("http://radiopaedia.org/articles/") or page_url.startswith("https://radiopaedia.org/articles/")):
        return "Error: Invalid or non-Radiopaedia article URL provided. URL must start with 'http(s)://radiopaedia.org/articles/'."

    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
        "Accept-Language": "en-US,en;q=0.5", # Prefer English content
    }

    try:
        resp = requests.get(page_url, headers=headers, timeout=25) # Timeout for the request
        resp.raise_for_status() # Raise an exception for HTTP errors (4xx or 5xx)
    except requests.exceptions.Timeout:
        print(f"   ❌ Timeout error fetching Radiopaedia URL: {page_url}")
        return f"Error: Timeout occurred while attempting to fetch the Radiopaedia page: {page_url}"
    except requests.exceptions.HTTPError as e:
        print(f"   ❌ HTTP error {e.response.status_code} fetching Radiopaedia URL {page_url}: {e}")
        return f"Error: An HTTP error ({e.response.status_code}) occurred while fetching the Radiopaedia page: {page_url}."
    except requests.exceptions.RequestException as e: # Catch other network-related errors
        print(f"   ❌ Network request error fetching Radiopaedia URL {page_url}: {e}")
        return f"Error: A network connection error occurred while attempting to fetch the Radiopaedia page: {page_url}."

    try:
        soup = BeautifulSoup(resp.text, "html.parser")
        # Try to find the main article body using common class names on Radiopaedia
        body = soup.find("div", class_="article-body")
        if not body: # Fallback to another common class name
            body = soup.find("div", class_="body user-generated-content")

        if not body:
            print(f"   ❌ Could not find the main article content section at URL: {page_url}. The page structure may have changed or the URL might not be a standard article page.")
            return f"Error: Could not find the main article content section at URL: {page_url}. The page structure might differ from expected."

        # List of CSS selectors for elements to remove to clean up the content
        unwanted_selectors = [
            "div.ad-banner-mobile", "div.ad-container", "aside.article-aside", "div.rb-quick-links",
            "div.questions.expandable", "div.references.expandable", "div.reference_lists",
            "div.incoming-links.expandable", "div.article-quiz-callout", "div#article-images-carousel",
            "div.article-related-articles-callout", "div.article-social-sharing",
            "div.article-licence", "div.article-segmentation-map", "div.article-metadata-row",
            "figure", "figcaption", # Remove image figures and captions
            "a.image-thumbnail", "script", "style", "nav", "footer", "header", # Remove non-content elements
            "div.article-doi", "div.article-updated-at", "div.article-contributors", "div.article-case-links-title",
            ".article-images-header", ".image-object-counter", ".article-jump-to-top", ".article-actions"
        ]
        for selector in unwanted_selectors:
            for unwanted_tag in body.select(selector):
                unwanted_tag.decompose() # Remove the tag and all its children

        # Extract text primarily from paragraphs and heading elements for better semantic structure
        text_elements = body.find_all(['p', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'li'])
        content_parts = [elem.get_text(separator=" ", strip=True) for elem in text_elements]

        processed_text_parts = []
        for part in content_parts:
            if part: # Ensure part is not empty after stripping
                # Add more spacing for headings or longer paragraphs for readability
                if part.startswith(tuple(f'<h{i}>' for i in range(1,7))) or len(part) > 100 : # crude check for heading or paragraph
                     processed_text_parts.append(part + "\n\n")
                else: # for list items or shorter text, less spacing
                     processed_text_parts.append(part + "\n")

        text = "".join(processed_text_parts)
        # Consolidate multiple blank lines resulting from joins or original HTML structure
        text = re.sub(r'\n\s*\n+', '\n\n', text).strip()

        if not text:
            print(f"   ❌ Extracted text was empty for URL {page_url} after cleaning and processing.")
            return f"Error: Extracted text was empty for URL {page_url} after cleaning. The article might be primarily composed of images or have an unrecognized format."

        # Cap the length of the raw text returned to the agent.
        # The agent will then summarize this. This cap prevents overwhelming the agent.
        max_raw_text_length = 150000 # Characters (allows for a fairly substantial article)
        full_text_content = text[:max_raw_text_length]
        if len(text) > max_raw_text_length:
            full_text_content += "\n\n... [CONTENT TRUNCATED BY TOOL DUE TO EXCEEDING MAX LENGTH]"
            print(f"   ⚠️ Content from {page_url} was truncated by radiopaedia_content_extraction_tool due to its length ({len(text)} chars).")

        print(f"   ✅ Successfully extracted content from {page_url}. Final length for agent: {len(full_text_content)} chars.")
        return full_text_content

    except Exception as e:
        print(f"   ❌ Unexpected error occurred while parsing content from Radiopaedia URL {page_url}: {str(e)}")
        import traceback
        traceback.print_exc() # Log the full traceback for debugging
        return f"Error: An unexpected error occurred while parsing content from Radiopaedia URL {page_url}: {str(e)}"

In [17]:
# @title
# --- Test: radiopaedia_content_extraction_tool ---

print("--- Testing Radiopaedia Content Extraction Tool ---")
# Ensure the tool function exists (defined in previous cells)
if 'radiopaedia_content_extraction_tool' in locals():
    test_url = "https://radiopaedia.org/articles/stroke?lang=us"
    print(f"Attempting to extract content from URL: {test_url}")

    # Call the tool directly
    extraction_result = radiopaedia_content_extraction_tool(page_url=test_url)

    if isinstance(extraction_result, str):
        if extraction_result.startswith("Error:"):
            print(f"\n❌ Test Result (Error): {extraction_result}")
        else:
            print(f"\n✅ Test Result (Success!): Extracted Content (first 600 chars):\n")
            print(extraction_result[:600])
            if len(extraction_result) > 600:
                print("\n...")
            if "[CONTENT TRUNCATED BY TOOL DUE TO EXCEEDING MAX LENGTH]" in extraction_result:
                 print("\n(Note: Full content was truncated by the tool)")
    else:
        print(f"\n❌ Test Error: Unexpected return type from tool: {type(extraction_result)}")
else:
    print("❌ Cannot run test: `radiopaedia_content_extraction_tool` function not defined.")
print("-" * 50)

# --- Optional: Test with a non-article URL (should fail gracefully) ---
print("\n--- Testing Radiopaedia Tool with Invalid URL ---")
if 'radiopaedia_content_extraction_tool' in locals():
    invalid_url = "https://radiopaedia.org/cases" # Not an article URL
    print(f"Attempting to extract content from invalid URL: {invalid_url}")
    error_result = radiopaedia_content_extraction_tool(page_url=invalid_url)
    if isinstance(error_result, str) and error_result.startswith("Error:"):
        print(f"✅ Test Result (Expected Error): {error_result}")
    else:
        print(f"❌ Test Result (Unexpected): Tool did not return expected error for invalid URL. Output: {error_result}")
else:
    print("❌ Cannot run test: `radiopaedia_content_extraction_tool` function not defined.")
print("-" * 50)

--- Testing Radiopaedia Content Extraction Tool ---
Attempting to extract content from URL: https://radiopaedia.org/articles/stroke?lang=us
Radiopaedia Content Extractor: Attempting to fetch URL: https://radiopaedia.org/articles/stroke?lang=us
   ✅ Successfully extracted content from https://radiopaedia.org/articles/stroke?lang=us. Final length for agent: 763 chars.

✅ Test Result (Success!): Extracted Content (first 600 chars):

A stroke is a clinical diagnosis that refers to a sudden onset focal neurological deficit of presumed vascular origin.

Stroke is generally divided into two broad categories 1,2 :
ischemic stroke (87%)
hemorrhagic stroke (13%)
Terminology
The term "stroke" is ambiguous and care must be taken to ensure that precise terminology is used. This is particularly the case for "hemorrhagic stroke" which although is often used synonymously with intracerebral hemorrhage , has a broader definition to many authors and organizations to also include subarachnoid hemorrhage 1

### 3.2 Local PDF RAG (Retrieval-Augmented Generation) Tool

This tool will allow our agent to search for information within the PDF documents you've stored in the SharedRadiologyPDFs folder on Google Drive. It involves several sub-steps:
1. Load and Chunk PDFs: Read text from all PDFs and split it into smaller, manageable chunks.
2. Embed Chunks: Convert each text chunk into a numerical vector (embedding) using a sentence transformer model.
3. Build FAISS Index: Create a FAISS index from these embeddings for fast similarity searching.
4. Query Function: The actual tool function will take a user's query, embed it, and use the FAISS index to find the most relevant text chunks from the PDFs.

We'll define helper functions for loading/chunking and index building, then the SmolAgent tool.

#### Helper function: PDF Loading and Chunking
This function, `load_and_chunk_pdfs`, is responsible for processing your local PDF library:

1. It locates all PDF files within the SharedRadiologyPDFs folder in your Google Drive.
2. For each PDF, it extracts the raw text content.
3. This text is then divided into smaller, overlapping "chunks." Chunking is important because LLMs have context limits, and processing smaller segments improves the relevance of search results.
4. Crucially, for each chunk, detailed metadata is stored:
  - source_pdf_name: The filename of the PDF it came from.
  - chunk_index_in_doc: Its sequential position within that specific PDF.
  - global_chunk_index: A unique index for the chunk across all processed PDFs. This global index is vital for the retrieve_neighboring_chunks_tool to function correctly.

The output of this function (a list of all text chunks and a parallel list of their metadata) forms the foundation of our RAG system's knowledge base.
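
To make the overlap idea concrete, here is a minimal sketch of a sliding-window chunker. It is a simplified illustration, not the notebook's `load_and_chunk_pdfs`: the function name `sliding_window_chunks` and the toy text are assumptions made for this example, and it leaves out PDF reading, the global chunk index, and the handling of a very short final chunk.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[dict]:
    """Split text into overlapping character windows, recording each chunk's position."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    stride = chunk_size - chunk_overlap  # how far the window advances each step
    chunks = []
    start = 0
    while start < len(text):
        chunks.append({
            "chunk_index_in_doc": len(chunks),
            "start": start,
            "text": text[start:start + chunk_size],
        })
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += stride
    return chunks


demo_text = "abcdefghijklmnopqrstuvwxyz"  # 26 characters of toy input
demo_chunks = sliding_window_chunks(demo_text, chunk_size=10, chunk_overlap=3)
for chunk in demo_chunks:
    print(chunk["chunk_index_in_doc"], chunk["start"], chunk["text"])
```

Each window starts `chunk_size - chunk_overlap` characters after the previous one, so the last three characters of one chunk reappear as the first three of the next. That shared text is what keeps a sentence that straddles a boundary retrievable from at least one chunk.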

In [18]:
# @title
def load_and_chunk_pdfs(
    pdf_folder_path: Path,
    chunk_size: int = 800,  # Target character length for each chunk
    chunk_overlap: int = 150 # Number of characters to overlap between chunks
) -> tuple[list[str], list[dict]]:
    """
    Loads all PDF files from a specified folder, extracts their text content,
    and splits this text into manageable, overlapping chunks.
    Each chunk is stored along with detailed metadata for later retrieval and context expansion.

    Args:
        pdf_folder_path (Path): The path to the folder containing PDF files.
        chunk_size (int): The target size for each text chunk in characters.
        chunk_overlap (int): The number of characters to overlap between consecutive chunks
                             to ensure context isn't lost at chunk boundaries.

    Returns:
        tuple[list[str], list[dict]]: A tuple containing:
            - A list of all text chunks (strings) extracted from all PDFs.
            - A parallel list of metadata dictionaries, one for each chunk, containing
              'source_pdf_name', 'chunk_index_in_doc', and 'global_chunk_index'.
    """
    all_chunks_text: list[str] = []
    all_chunks_metadata: list[dict] = []
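    current_global_chunk_index: int = 0 # Initialize global counter

    # Guard against parameters that would stop the chunking window from advancing
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        print(f"⚠️ Invalid chunking parameters (chunk_size={chunk_size}, chunk_overlap={chunk_overlap}). chunk_overlap must be non-negative and smaller than chunk_size.")
        return [], []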

    if not pdf_folder_path.exists() or not pdf_folder_path.is_dir():
        print(f"⚠️ PDF folder not found or is not a directory: {pdf_folder_path}. Cannot load PDFs.")
        return [], []

    pdf_files_in_folder = list(pdf_folder_path.glob("*.pdf"))
    if not pdf_files_in_folder:
        print(f"⚠️ No PDF files (*.pdf) found in the specified folder: {pdf_folder_path}.")
        return [], []

    print(f"Found {len(pdf_files_in_folder)} PDF file(s) in {pdf_folder_path}. Starting processing...")

    for pdf_path in pdf_files_in_folder:
        print(f"  Processing PDF: {pdf_path.name}...")
        try:
            reader = PdfReader(pdf_path)
            full_pdf_text = ""
            for i, page in enumerate(reader.pages):
                page_text = page.extract_text()
                if page_text:
                    full_pdf_text += page_text.strip() + "\n" # Add a newline separator between pages
                # else:
                #     print(f"    - Note: Page {i+1} in {pdf_path.name} had no extractable text.")

            if not full_pdf_text.strip():
                print(f"    - Warning: No text could be extracted from {pdf_path.name} overall.")
                continue

            # Basic text cleaning: collapse every run of 2+ whitespace characters
            # (spaces, tabs, newlines) into a single space. Isolated single newlines are kept.
            cleaned_pdf_text = re.sub(r'\s\s+', ' ', full_pdf_text).strip()

            # Chunking logic for the current PDF document
            start_char_index = 0
            chunk_index_within_this_doc = 0 # Reset for each new document

            num_chunks_from_this_doc = 0
            while start_char_index < len(cleaned_pdf_text):
                end_char_index = min(start_char_index + chunk_size, len(cleaned_pdf_text))
                chunk_text_content = cleaned_pdf_text[start_char_index:end_char_index]

                all_chunks_text.append(chunk_text_content) # Add to global list of texts
                all_chunks_metadata.append({
                    'source_pdf_name': pdf_path.name,
                    'chunk_index_in_doc': chunk_index_within_this_doc,
                    'global_chunk_index': current_global_chunk_index,
                    # 'text_preview': chunk_text_content[:60].replace('\n', ' ') + "..." # Useful for debugging
                })

                chunk_index_within_this_doc += 1
                current_global_chunk_index += 1 # Increment global index for every chunk
                num_chunks_from_this_doc +=1

                # Determine start of next chunk considering overlap
                next_start = start_char_index + chunk_size - chunk_overlap
                # If the remaining text is very small, just break to avoid a tiny last chunk
                if len(cleaned_pdf_text) - next_start < (chunk_size * 0.20) and next_start < len(cleaned_pdf_text) : # If less than 20% of chunk size remains
                     if start_char_index + chunk_size < len(cleaned_pdf_text): # if current chunk wasn't the last possible full chunk
                          # Force the last chunk to grab all remaining text
                          chunk_text_content_final = cleaned_pdf_text[start_char_index:] # Grab from current start to end
                          all_chunks_text[-1] = chunk_text_content_final # Overwrite last appended chunk
                          # Update metadata for this now potentially larger last chunk if needed, or just accept previous metadata
                          print(f"    - Adjusted last chunk of {pdf_path.name} to include all remaining text.")
                     break # Exit loop after processing the (potentially adjusted) last chunk
                start_char_index = next_start


            print(f"    - Extracted {num_chunks_from_this_doc} chunks from {pdf_path.name}.")

        except Exception as e:
            print(f"    - ❌ Error processing PDF file {pdf_path.name}: {e}")
            import traceback
            traceback.print_exc() # Log full error for diagnosis

    print(f"✅ PDF processing complete. Total {len(all_chunks_text)} chunks created from all documents.")
    if len(all_chunks_text) != len(all_chunks_metadata):
        # This should ideally not happen with the current logic
        print(f"    🚨 CRITICAL INTERNAL WARNING: Mismatch in chunk text ({len(all_chunks_text)}) and metadata ({len(all_chunks_metadata)}) counts! This will cause issues.")
    return all_chunks_text, all_chunks_metadata

#### Helper Function: Embedding and FAISS Index Building
This function takes the text chunks, generates embeddings using a SentenceTransformer model, and builds a FAISS index for efficient searching. The resulting embedder, index, chunks, and metadata are stored in global variables for the tool to use.

In [19]:
# @title
# Global variables for the RAG system components
# These will be initialized by build_pdf_rag_system()
pdf_rag_embedder: SentenceTransformer | None = None
pdf_rag_index: faiss.Index | None = None
pdf_rag_chunks: list[str] = []
pdf_rag_metadata: list[dict] = []

def build_pdf_rag_system(
    chunks: list[str],
    metadata: list[dict],
    embedding_model_name: str = "sentence-transformers/all-MiniLM-L6-v2" # A good default
):
    """
    Builds the FAISS index and initializes the embedder for the PDF RAG system.

    Args:
        chunks (list[str]): The list of text chunks from PDFs.
        metadata (list[dict]): The list of metadata corresponding to the chunks.
        embedding_model_name (str): Name of the SentenceTransformer model to use.
    """
    global pdf_rag_embedder, pdf_rag_index, pdf_rag_chunks, pdf_rag_metadata

    if not chunks:
        print("⚠️ No chunks provided to build the PDF RAG system. It will not be functional.")
        return

    pdf_rag_chunks = chunks
    pdf_rag_metadata = metadata

    try:
        print(f"Initializing PDF RAG embedder: {embedding_model_name}...")
        pdf_rag_embedder = SentenceTransformer(embedding_model_name, device=device)
        print("✅ Embedder initialized.")

        print("Generating embeddings for PDF chunks (this may take a while for many chunks)...")
        # Encode in batches if memory is a concern, though for moderate sizes direct encoding is fine
        embeddings = pdf_rag_embedder.encode(
            pdf_rag_chunks,
            show_progress_bar=True,
            convert_to_numpy=True, # FAISS expects numpy arrays
            batch_size=32 # Adjust batch size based on available VRAM/RAM
        )

        if embeddings is None or embeddings.shape[0] == 0:
            print("❌ Error: Failed to generate embeddings for PDF chunks.")
            pdf_rag_index = None # Ensure index is None if embedding fails
            return

        print(f"✅ Embeddings generated. Shape: {embeddings.shape}")

        dimension = embeddings.shape[1]
        # Using IndexFlatL2 for simplicity; other FAISS indices exist for larger datasets
        pdf_rag_index = faiss.IndexFlatL2(dimension)
        # FAISS expects float32 for IndexFlatL2
        pdf_rag_index.add(embeddings.astype(np.float32))

        print(f"✅ FAISS index built successfully with {pdf_rag_index.ntotal} vectors.")

    except Exception as e:
        print(f"❌ Error building PDF RAG system: {e}")
        import traceback
        traceback.print_exc()
        pdf_rag_embedder = None
        pdf_rag_index = None
        pdf_rag_chunks = [] # Clear chunks if system build fails
        pdf_rag_metadata = []

#### Building the RAG Search Index
Once we have the text chunks and their metadata, we need to make them searchable. This involves:

1.  **Embedding Generation**: Using a `SentenceTransformer` model (e.g., `all-MiniLM-L6-v2`), each text chunk is converted into a dense vector embedding. These embeddings capture the semantic meaning of the text.

2.  **FAISS Index Creation**: These embeddings are then loaded into a FAISS (Facebook AI Similarity Search) index. FAISS allows for very fast and efficient searching of the most similar embeddings (and thus, text chunks) to a given query embedding.

The `build_pdf_rag_system` function handles these steps. The resulting `pdf_rag_embedder` and `pdf_rag_index`, along with the `pdf_rag_chunks` and `pdf_rag_metadata`, are stored as global variables for use by our RAG tools. This indexing process is typically done once when the notebook starts or when the PDF library changes.
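
As a rough illustration of what a flat L2 index computes, the sketch below performs the same kind of exhaustive nearest-neighbor search in plain Python. It is a stand-in for intuition only, not FAISS itself: the helper names `squared_l2` and `flat_l2_search` and the two-dimensional toy vectors are assumptions made for this example (the real embeddings here have 384 dimensions).

```python
def squared_l2(a: list[float], b: list[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def flat_l2_search(stored: list[list[float]], query: list[float], k: int):
    """Compare the query against every stored vector and return the k closest.

    Returns (distances, indices), both ordered from closest to farthest.
    """
    scored = sorted((squared_l2(vec, query), idx) for idx, vec in enumerate(stored))
    top = scored[:k]
    return [dist for dist, _ in top], [idx for _, idx in top]


# Toy "embeddings"; the position in the list plays the role of global_chunk_index.
toy_vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
distances, indices = flat_l2_search(toy_vectors, [0.9, 0.1], k=2)
print(indices, distances)
```

Because the returned positions are simply the row numbers of the stored vectors, the order in which chunks are added to the index has to match the order of the chunk and metadata lists. Note also that a smaller distance means a closer match.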

In [20]:
# @title
# Global variables for the RAG system components
pdf_rag_embedder: SentenceTransformer | None = None
pdf_rag_index: faiss.Index | None = None
pdf_rag_chunks: list[str] = []       # Will store text of all chunks
pdf_rag_metadata: list[dict] = []    # Will store metadata for each chunk

def build_pdf_rag_system(
    chunks_text_list: list[str],
    chunks_metadata_list: list[dict],
    embedding_model_name: str = "sentence-transformers/all-MiniLM-L6-v2"
):
    """
    Builds the FAISS index from text chunks and initializes the sentence embedder.
    Stores the embedder, index, chunks, and metadata in global variables.
    """
    global pdf_rag_embedder, pdf_rag_index, pdf_rag_chunks, pdf_rag_metadata

    if not chunks_text_list or not chunks_metadata_list or len(chunks_text_list) != len(chunks_metadata_list):
        print("⚠️ No chunks provided or mismatch between chunks and metadata. PDF RAG system will not be functional.")
        pdf_rag_chunks, pdf_rag_metadata = [], [] # Clear any partial data
        return

    # Store the provided chunks and metadata globally
    pdf_rag_chunks = chunks_text_list
    pdf_rag_metadata = chunks_metadata_list
    print(f"Stored {len(pdf_rag_chunks)} chunks and metadata globally for RAG system.")

    try:
        print(f"Initializing PDF RAG sentence embedder: {embedding_model_name}...")
        pdf_rag_embedder = SentenceTransformer(embedding_model_name, device=device)
        print("✅ Embedder initialized.")

        print("Generating embeddings for all PDF chunks (this may take a while for large libraries)...")
        embeddings = pdf_rag_embedder.encode(
            pdf_rag_chunks,
            show_progress_bar=True,
            convert_to_numpy=True,
            batch_size=32 # Adjust based on available VRAM/RAM if performance issues arise
        )

        if embeddings is None or embeddings.shape[0] == 0:
            print("❌ Error: Failed to generate embeddings for PDF chunks. RAG system will be impaired.")
            pdf_rag_index = None
            return

        print(f"✅ Embeddings generated successfully. Shape: {embeddings.shape}")

        dimension = embeddings.shape[1]
        # Using IndexFlatL2 for simplicity; other FAISS indices offer different trade-offs.
        pdf_rag_index = faiss.IndexFlatL2(dimension)
        # FAISS expects float32 for IndexFlatL2. Ensure embeddings are in this format.
        pdf_rag_index.add(embeddings.astype(np.float32))

        print(f"✅ FAISS index built successfully with {pdf_rag_index.ntotal} vectors.")
        print("   PDF RAG System is ready for querying.")

    except Exception as e:
        print(f"❌ Error occurred while building the PDF RAG system: {e}")
        import traceback
        traceback.print_exc()
        # Reset global RAG components on failure
        pdf_rag_embedder = None
        pdf_rag_index = None
        pdf_rag_chunks = []
        pdf_rag_metadata = []
        print("   PDF RAG system components have been reset due to build failure.")

#### Initialize RAG System (Call the Loaders and Builders)

This cell executes the functions defined above to load your PDFs, chunk the text, generate embeddings, and build the searchable FAISS index. This process runs once and makes the RAG system ready for queries. Ensure your `SharedRadiologyPDFs` folder (defined by `PDF_FOLDER_PATH` in Section 1) contains your reference documents.


In [21]:
# @title
print("\n--- Initializing Local PDF RAG System ---")
# Check if already initialized to avoid re-processing if cell is run multiple times,
# though re-running might be desired if PDFs change.
# For simplicity, we'll allow re-initialization here.

# Clear previous RAG data if any, to ensure a fresh build
pdf_rag_embedder, pdf_rag_index, pdf_rag_chunks, pdf_rag_metadata = None, None, [], []

temp_loaded_chunks, temp_loaded_metadata = load_and_chunk_pdfs(PDF_FOLDER_PATH)
if temp_loaded_chunks and temp_loaded_metadata: # Ensure both were loaded
    build_pdf_rag_system(temp_loaded_chunks, temp_loaded_metadata)
else:
    print("⚠️ PDF RAG System could not be built: No PDF chunks or metadata were loaded. Check PDF folder and loading function.")


--- Initializing Local PDF RAG System ---
Found 3 PDF file(s) in /content/drive/MyDrive/RadiologyMultiAgentColab/SharedRadiologyPDFs. Starting processing...
  Processing PDF: AD-M2400 ARIA in the ED infographic.pdf...
    - Extracted 11 chunks from AD-M2400 ARIA in the ED infographic.pdf.
  Processing PDF: jksr-86-17-s001.pdf...
    - Extracted 64 chunks from jksr-86-17-s001.pdf.
  Processing PDF: ARIA_differential_diagnosis.pdf...
    - Extracted 10 chunks from ARIA_differential_diagnosis.pdf.
✅ PDF processing complete. Total 85 chunks created from all documents.
Stored 85 chunks and metadata globally for RAG system.
Initializing PDF RAG sentence embedder: sentence-transformers/all-MiniLM-L6-v2...


✅ Embedder initialized.
Generating embeddings for all PDF chunks (this may take a while for large libraries)...


✅ Embeddings generated successfully. Shape: (85, 384)
✅ FAISS index built successfully with 85 vectors.
   PDF RAG System is ready for querying.


#### Tool definition

Having defined the helper functions, built the RAG search index, and initialized the RAG system, we can now define our local PDF RAG tool.

This tool is the agent's interface to search the indexed PDF content:
1. It takes a query (e.g., a specific imaging finding) and an optional top_k (number of results to return).
2. The query is converted into an embedding using the same sentence transformer model.
3. This query embedding is used to search the FAISS index for the top_k most semantically similar text chunks from the PDFs.
4. Crucially, the tool returns its findings as a JSON string. This structured output contains a "results" list of chunk objects. Each object provides:
  - text: The actual content of the retrieved chunk.
  - source_pdf_name: The name of the PDF file this chunk originated from.
  - global_chunk_index: The unique global index of this chunk across all documents. This is essential for the retrieve_neighboring_chunks_tool to locate and provide further context.
  - chunk_index_in_doc: The index of the chunk within its original PDF.
  - relevance_score (optional): The L2 distance between the chunk embedding and the query embedding, as returned by the FAISS `IndexFlatL2` search. Lower values mean the chunk is closer to the query.
  
This JSON output allows the InternalReferenceExpertAgent (which is a CodeAgent) to easily parse these results and use the global_chunk_index for subsequent context expansion.
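
The sketch below shows one way a consumer of this JSON might pick the chunk to expand next. The payload is a hand-written, hypothetical example that only mimics the shape described above (file names, indices, and scores are invented), and `pick_best_chunk_index` is an illustrative helper, not part of the notebook. Since the tool searches an `IndexFlatL2` index, the sketch treats `relevance_score` as a distance and prefers the smallest value.

```python
import json
from typing import Optional

# Hypothetical tool output; every value here is invented for illustration.
example_tool_output = json.dumps({
    "query": "example finding",
    "results": [
        {"text": "chunk A", "source_pdf_name": "doc_a.pdf",
         "global_chunk_index": 12, "chunk_index_in_doc": 1, "relevance_score": 0.91},
        {"text": "chunk B", "source_pdf_name": "doc_b.pdf",
         "global_chunk_index": 40, "chunk_index_in_doc": 5, "relevance_score": 0.47},
    ],
})


def pick_best_chunk_index(tool_output: str) -> Optional[int]:
    """Return the global_chunk_index of the closest result, or None on error/empty output."""
    payload = json.loads(tool_output)
    if "error" in payload:
        return None
    results = payload.get("results", [])
    if not results:
        return None
    scored = [r for r in results if r.get("relevance_score") is not None]
    # Scores are distances, so the smallest one marks the closest chunk.
    best = min(scored, key=lambda r: r["relevance_score"]) if scored else results[0]
    return best["global_chunk_index"]


print(pick_best_chunk_index(example_tool_output))
```

Checking for the "error" key first mirrors the tool's contract: a failed or empty search still returns valid JSON, so the caller never has to handle a parsing failure for those cases.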


In [22]:
# @title
@tool
def query_internal_references_tool(query: str, top_k: int = 3) -> str:
    """
    Searches the indexed internal PDF reference documents for information
    relevant to the given query (e.g., an imaging finding).
    Uses semantic search over text chunks extracted from these PDFs.

    Args:
        query (str): The question or imaging finding to search for in the internal references.
        top_k (int): The maximum number of relevant text chunks to retrieve.
                     Defaults to 3, with a maximum of 5 allowed by this tool.

    Returns:
        str: A JSON string.
             If successful, this JSON string contains an object with a "results" key.
             The value of "results" is a list of retrieved chunk objects. Each chunk object
             includes 'text', 'source_pdf_name', 'global_chunk_index', 'chunk_index_in_doc',
             and 'relevance_score'.
             If an error occurs or no relevant results are found, the JSON string will contain
             an object with an "error" key and a descriptive message.
    """
    print(f"Local PDF RAG Tool: Received query '{query}', top_k={top_k}")
    global pdf_rag_index, pdf_rag_embedder, pdf_rag_chunks, pdf_rag_metadata # Access global RAG components

    if not all([pdf_rag_index, pdf_rag_embedder, pdf_rag_chunks, pdf_rag_metadata]):
        # Check if any essential component is None or empty
        print("   ❌ Error: Local PDF RAG system is not fully initialized or critical data (index, embedder, chunks, metadata) is missing.")
        return json.dumps({"error": "Local PDF RAG system is not fully initialized. Please check setup and PDF loading."})
    if not query or not query.strip():
        print("   ❌ Error: No query provided for internal reference lookup.")
        return json.dumps({"error": "No query provided for internal reference lookup."})

    try:
        # Sanitize top_k to be within a reasonable range (e.g., 1 to 5)
        k = min(max(1, int(top_k)), 5)

        query_embedding = pdf_rag_embedder.encode([query], convert_to_numpy=True)
        # FAISS index expects float32
        if query_embedding.dtype != np.float32:
            query_embedding = query_embedding.astype(np.float32)

        # Perform the search using the FAISS index
        distances, indices = pdf_rag_index.search(query_embedding, k)

        retrieved_items = []
        # indices[0] contains the list of global_chunk_indexes for the top_k results
        if indices.size > 0 and indices[0,0] != -1: # FAISS returns -1 if fewer than k items found
            for i in range(indices.shape[1]): # Iterate through the found indices
                global_idx = indices[0, i]
                if global_idx == -1: # No more valid items found for this query
                    break

                # Ensure the index is valid for our lists
                if 0 <= global_idx < len(pdf_rag_chunks) and global_idx < len(pdf_rag_metadata):
                    chunk_text_content = pdf_rag_chunks[global_idx]
                    meta_info = pdf_rag_metadata[global_idx]

                    retrieved_items.append({
                        "text": chunk_text_content.strip(),
                        "source_pdf_name": meta_info.get('source_pdf_name', 'Unknown Source'),
                        "global_chunk_index": meta_info.get('global_chunk_index', -1), # Should match global_idx
                        "chunk_index_in_doc": meta_info.get('chunk_index_in_doc', -1),
                        "relevance_score": float(distances[0, i]) if (distances is not None and distances.shape[1] > i) else None # L2 distance from IndexFlatL2 (lower = more similar)
                    })
                else:
                    print(f"   ⚠️ Warning: Invalid global_chunk_index {global_idx} encountered from FAISS search. Total chunks available: {len(pdf_rag_chunks)}. This result will be skipped.")

        if not retrieved_items:
            print(f"   ℹ️ No relevant chunks found for '{query}' in internal references after processing FAISS results.")
            return json.dumps({"error": f"No relevant information found in internal references for the query: '{query}'."})

        print(f"   ✅ Local PDF RAG Tool: Retrieved {len(retrieved_items)} structured items for '{query}'.")
        return json.dumps({
            "query": query, # Echo back the query for context
            "results": retrieved_items,
            "message": f"Successfully retrieved {len(retrieved_items)} relevant chunks from internal references."
        })

    except Exception as e:
        print(f"   ❌ Local PDF RAG Tool: An unexpected error occurred during search for '{query}': {e}")
        import traceback
        traceback.print_exc() # Log full traceback
        return json.dumps({"error": f"An unexpected error occurred while performing search in internal references for query '{query}': {str(e)}"})

In [23]:
# @title
# --- Test: query_internal_references_tool ---

print("--- Testing Internal PDF RAG Query Tool ---")

# Ensure RAG system seems ready and tool exists
rag_system_ready = (
    'query_internal_references_tool' in locals() and
    pdf_rag_index is not None and
    pdf_rag_embedder is not None and
    pdf_rag_chunks and
    pdf_rag_metadata
)

if rag_system_ready:
    test_query = "Occipital hypodensity, likely representing acute stroke." # Choose a term likely in neuro/radio PDFs
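    test_top_k = 5 # Use one variable so the printed value always matches the actual call
    print(f"Querying internal references for: '{test_query}' (top_k={test_top_k})")

    # Call the tool directly
    rag_result_json_str = query_internal_references_tool(query=test_query, top_k=test_top_k)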

    print(f"\nRaw JSON Output from RAG tool:\n{rag_result_json_str}")

    try:
        rag_result = json.loads(rag_result_json_str)
        if "error" in rag_result:
            print(f"\n❌ Test Result (Error reported by tool): {rag_result['error']}")
        elif "results" in rag_result and isinstance(rag_result["results"], list):
            print(f"\n✅ Test Result (Success!): Found {len(rag_result['results'])} chunks.")
            # Store the global index of the first result for the next test
            first_result_global_index_for_next_test = None
            for i, chunk_info in enumerate(rag_result["results"]):
                print(f"\n--- Chunk {i+1} ---")
                print(f"  Source PDF: {chunk_info.get('source_pdf_name', 'N/A')}")
                print(f"  Global Index: {chunk_info.get('global_chunk_index', 'N/A')}")
                print(f"  Index in Doc: {chunk_info.get('chunk_index_in_doc', 'N/A')}")
                print(f"  Relevance Score: {chunk_info.get('relevance_score', 'N/A'):.4f}" if chunk_info.get('relevance_score') is not None else "  Relevance Score: N/A")
                print(f"  Text Preview: {chunk_info.get('text', '')[:150]}...")
                # Save the index of the first valid result
                if i == 0 and chunk_info.get('global_chunk_index', -1) != -1:
                     first_result_global_index_for_next_test = chunk_info.get('global_chunk_index')

            # Check if we stored an index for the neighbor test
            if first_result_global_index_for_next_test is not None:
                print(f"\n(Stored global_chunk_index {first_result_global_index_for_next_test} from the first result for the next test)")
            else:
                print("\n(Could not store a valid global_chunk_index from results for the next test)")

        else:
            print("\n❌ Test Error: RAG tool returned unexpected JSON structure.")
            print(f"   Parsed JSON: {rag_result}")

    except json.JSONDecodeError:
        print("\n❌ Test Error: RAG tool output was not valid JSON.")
    except Exception as e:
        print(f"\n❌ Test Error: An unexpected error occurred processing RAG result: {e}")

else:
    print("❌ Cannot run RAG query test: Tool not defined or RAG system not initialized.")
    print("   Please ensure PDF loading and RAG index building completed successfully in previous cells.")
    first_result_global_index_for_next_test = None # Ensure variable exists but is None

print("-" * 50)

--- Testing Internal PDF RAG Query Tool ---
Querying internal references for: 'Occipital hypodensity, likely representing acute stroke.' (top_k=2)
Local PDF RAG Tool: Received query 'Occipital hypodensity, likely representing acute stroke.', top_k=5
   ✅ Local PDF RAG Tool: Retrieved 5 structured items for 'Occipital hypodensity, likely representing acute stroke.'.

Raw JSON Output from RAG tool:
{"query": "Occipital hypodensity, likely representing acute stroke.", "results": [{"text": "in conjunction with cortical/subcorti -\ncal microhemorrhages, superficial siderosis, or chronic lobar hemorrhage, typical features of CAA. Distinguishing CAA-RI from ARIA-E solely by imaging is difficult, so medication history whether the patient is on anti-amyloid \u03b2 immunotherapy is crucial (23, 35, 38).\nISCHEMIC STROKE\nBecause ARIA often presents with nonspecific neurological symptoms, it may be mistaken for ischemic strokes. Additionally, FLAIR hyperintensity in ARIA-E can resemble acute or s

### 3.3 Retrieve Neighboring Chunks Tool (retrieve_neighboring_chunks_tool)
This tool is specifically designed to support the InternalReferenceExpertAgent (our RAG-focused CodeAgent). After the agent uses query_internal_references_tool to find an initial set of relevant text chunks, it might want to see more context around a particularly promising chunk.

The retrieve_neighboring_chunks_tool takes:

- target_global_chunk_index: The global index of the chunk for which context is desired (this index is provided in the output of query_internal_references_tool).
- num_before: How many chunks immediately preceding the target chunk to retrieve.
- num_after: How many chunks immediately succeeding the target chunk to retrieve.

It's careful to only retrieve neighbors that belong to the same original PDF document as the target chunk, preventing irrelevant context from other documents. The output is a JSON string containing a list of these context chunks (including the target chunk itself), each tagged with its type (e.g., 'target', 'before_1', 'after_1') and its metadata.
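
The same-document rule can be sketched in a few lines. This is a simplified illustration that works on indices only, assuming chunks from one PDF sit next to each other in the global list (as they do when documents are processed one after another). The helper name `neighbors_in_same_doc` and the toy metadata are assumptions for this example.

```python
def neighbors_in_same_doc(metadata: list[dict], target_idx: int,
                          num_before: int = 1, num_after: int = 1) -> list[int]:
    """Return global indices [before..., target, after...] limited to the target's PDF."""
    source = metadata[target_idx]["source_pdf_name"]

    before: list[int] = []
    for offset in range(1, num_before + 1):
        idx = target_idx - offset
        if idx < 0 or metadata[idx]["source_pdf_name"] != source:
            break  # hit the start of the list or a different document
        before.insert(0, idx)  # keep natural reading order

    after: list[int] = []
    for offset in range(1, num_after + 1):
        idx = target_idx + offset
        if idx >= len(metadata) or metadata[idx]["source_pdf_name"] != source:
            break  # hit the end of the list or a different document
        after.append(idx)

    return before + [target_idx] + after


# Toy metadata: two chunks from a.pdf followed by three chunks from b.pdf.
toy_metadata = [
    {"source_pdf_name": "a.pdf"}, {"source_pdf_name": "a.pdf"},
    {"source_pdf_name": "b.pdf"}, {"source_pdf_name": "b.pdf"}, {"source_pdf_name": "b.pdf"},
]
print(neighbors_in_same_doc(toy_metadata, target_idx=2, num_before=2, num_after=2))
```

Here the target is the first chunk of b.pdf, so the two requested "before" chunks are dropped because they belong to a.pdf, while both "after" chunks are returned.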

In [24]:
# @title
@tool
def retrieve_neighboring_chunks_tool(target_global_chunk_index: int, num_before: int = 1, num_after: int = 1) -> str:
    """
    Retrieves text chunks immediately preceding and succeeding a specified target chunk
    from the internal PDF knowledge base. Ensures that all retrieved neighboring chunks
    belong to the SAME SOURCE DOCUMENT as the target chunk. This tool is vital for
    obtaining wider contextual understanding around an initially retrieved piece of information.

    Args:
        target_global_chunk_index (int): The 'global_chunk_index' of the target chunk.
                                         This index is obtained from the output of the
                                         `query_internal_references_tool`.
        num_before (int): The number of chunks to retrieve immediately before the target chunk.
                          Defaults to 1. Allowed range: 0 to 3.
        num_after (int): The number of chunks to retrieve immediately after the target chunk.
                         Defaults to 1. Allowed range: 0 to 3.

    Returns:
        str: A JSON string.
             If successful, the JSON string contains an object with a "results" key. The value
             of "results" is a list of context chunk objects. Each object includes 'type'
             (e.g., 'target', 'before_1', 'after_1'), 'text', 'source_pdf_name',
             'global_chunk_index', and 'chunk_index_in_doc'. The list is ordered:
             [before_n, ..., before_1, target, after_1, ..., after_n].
             If an error occurs (e.g., index out of bounds, target chunk metadata issues),
             the JSON string will contain an object with an "error" key and a descriptive message.
             A "warning" key might be present if only the target chunk is returned despite
             requesting neighbors (e.g., if at document boundaries).
    """
    global pdf_rag_chunks, pdf_rag_metadata # Access globally stored RAG data

    print(f"Neighbor Chunks Tool: Request for target_global_idx={target_global_chunk_index}, num_before={num_before}, num_after={num_after}")

    # Validate RAG system readiness and input parameters
    if not all([pdf_rag_chunks, pdf_rag_metadata]) or len(pdf_rag_chunks) != len(pdf_rag_metadata):
        print("   ❌ Error: PDF RAG system (chunks/metadata) is not properly initialized or has data inconsistency.")
        return json.dumps({"error": "PDF RAG system (chunks/metadata) is not properly initialized or has data inconsistency."})

    if not (isinstance(target_global_chunk_index, int) and 0 <= target_global_chunk_index < len(pdf_rag_chunks)):
        print(f"   ❌ Error: Invalid target_global_chunk_index: {target_global_chunk_index}. Must be an integer within bounds (0-{len(pdf_rag_chunks)-1}).")
        return json.dumps({"error": f"Invalid target_global_chunk_index: {target_global_chunk_index}. Must be an integer within bounds (0-{len(pdf_rag_chunks)-1})."})

    # Sanitize num_before and num_after to be within reasonable limits (e.g., 0-3)
    num_before = min(max(0, int(num_before)), 3)
    num_after = min(max(0, int(num_after)), 3)

    try:
        target_chunk_meta = pdf_rag_metadata[target_global_chunk_index]
        target_chunk_text = pdf_rag_chunks[target_global_chunk_index]
        target_source_pdf = target_chunk_meta.get('source_pdf_name') # Get the source PDF of the target chunk

        if target_source_pdf is None: # Should not happen if metadata is built correctly
            print(f"   ❌ Error: Metadata for target chunk (global_idx {target_global_chunk_index}) is missing 'source_pdf_name'.")
            return json.dumps({"error": f"Metadata for target chunk (global_idx {target_global_chunk_index}) is missing 'source_pdf_name'."})

        # Initialize list to store all context chunks, starting with the target
        context_chunks_list = [{
            "type": "target", # Identifies this as the original target chunk
            "text": target_chunk_text,
            "source_pdf_name": target_source_pdf,
            "global_chunk_index": target_global_chunk_index,
            "chunk_index_in_doc": target_chunk_meta.get('chunk_index_in_doc', -1)
        }]

        # Retrieve preceding chunks from the same document
        for i in range(1, num_before + 1):
            prev_global_idx = target_global_chunk_index - i
            if prev_global_idx >= 0: # Check if index is within global bounds
                prev_meta = pdf_rag_metadata[prev_global_idx]
                # CRITICAL: Ensure the preceding chunk is from the SAME source PDF
                if prev_meta.get('source_pdf_name') == target_source_pdf:
                    context_chunks_list.insert(0, { # Insert at the beginning to maintain natural reading order
                        "type": f"before_{i}",
                        "text": pdf_rag_chunks[prev_global_idx],
                        "source_pdf_name": target_source_pdf,
                        "global_chunk_index": prev_global_idx,
                        "chunk_index_in_doc": prev_meta.get('chunk_index_in_doc', -1)
                    })
                else: # Reached the boundary of a different document
                    print(f"   ℹ️ Boundary: Preceding chunk (global_idx {prev_global_idx}) is from a different document. Stopping 'before' search.")
                    break
            else: # Reached the beginning of the global chunk list
                break

        # Retrieve succeeding chunks from the same document
        for i in range(1, num_after + 1):
            next_global_idx = target_global_chunk_index + i
            if next_global_idx < len(pdf_rag_chunks): # Check if index is within global bounds
                next_meta = pdf_rag_metadata[next_global_idx]
                # CRITICAL: Ensure the succeeding chunk is from the SAME source PDF
                if next_meta.get('source_pdf_name') == target_source_pdf:
                    context_chunks_list.append({
                        "type": f"after_{i}",
                        "text": pdf_rag_chunks[next_global_idx],
                        "source_pdf_name": target_source_pdf,
                        "global_chunk_index": next_global_idx,
                        "chunk_index_in_doc": next_meta.get('chunk_index_in_doc', -1)
                    })
                else: # Reached the boundary of a different document
                    print(f"   ℹ️ Boundary: Succeeding chunk (global_idx {next_global_idx}) is from a different document. Stopping 'after' search.")
                    break
            else: # Reached the end of the global chunk list
                break

        output_payload = {"results": context_chunks_list}
        if len(context_chunks_list) == 1 and (num_before > 0 or num_after > 0):
            # Only target chunk was added, but neighbors were requested
            output_payload["warning"] = ("Only the target chunk itself was retrieved; no valid neighbors found "
                                         "within the same document for the requested range (e.g., target is at document start/end).")

        print(f"   ✅ Neighbor Chunks Tool: Retrieved {len(context_chunks_list)} total context chunks for target_global_idx {target_global_chunk_index}.")
        return json.dumps(output_payload)

    except IndexError: # Should be rare if bounds are checked, but good for safety
        print(f"   ❌ Error: Index out of bounds while accessing RAG chunks/metadata for target_global_idx {target_global_chunk_index}.")
        return json.dumps({"error": f"Index error while accessing chunks/metadata for target_global_idx {target_global_chunk_index}. This might indicate RAG system data inconsistency."})
    except Exception as e:
        print(f"   ❌ Neighbor Chunks Tool: Unexpected error for target_global_idx {target_global_chunk_index}: {e}")
        import traceback
        traceback.print_exc()
        return json.dumps({"error": f"An unexpected error occurred in retrieve_neighboring_chunks_tool: {str(e)}"})
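
Because the tool returns its chunks already in reading order, a caller can join them into a single passage before handing the text to an LLM. The sketch below shows one way to do that. The helper name `stitch_context` and the toy payload are illustrative assumptions and are not part of the workshop code.

```python
import json

def stitch_context(neighbor_json: str) -> str:
    """Join the chunks from a neighbor-tool payload into one passage.

    Returns an empty string if the payload reports an error.
    """
    payload = json.loads(neighbor_json)
    if "error" in payload:
        return ""
    # "results" is ordered: before_n, ..., before_1, target, after_1, ..., after_n
    return "\n".join(chunk["text"] for chunk in payload.get("results", []))

# Toy payload shaped like the tool's documented output (texts are made up)
example_payload = json.dumps({"results": [
    {"type": "before_1", "text": "Chunk A.", "global_chunk_index": 47},
    {"type": "target", "text": "Chunk B.", "global_chunk_index": 48},
    {"type": "after_1", "text": "Chunk C.", "global_chunk_index": 49},
]})
stitched = stitch_context(example_payload)
print(stitched)
```

Returning an empty string on error keeps the caller simple; a stricter variant could raise an exception instead.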

In [25]:
# @title
# --- Test: retrieve_neighboring_chunks_tool ---

print("--- Testing Neighboring Chunks Tool ---")

# Check if tool exists and if we have a valid target index from the previous test
neighbor_tool_ready = (
    'retrieve_neighboring_chunks_tool' in locals() and
    'first_result_global_index_for_next_test' in locals() and
    first_result_global_index_for_next_test is not None and
    isinstance(first_result_global_index_for_next_test, int) and
    pdf_rag_chunks and # Check RAG system still seems loaded
    pdf_rag_metadata
)

if neighbor_tool_ready:
    target_idx = first_result_global_index_for_next_test
    num_b = 1
    num_a = 1
    print(f"Attempting to retrieve neighbors for target global_chunk_index: {target_idx} (before={num_b}, after={num_a})")

    # Call the tool directly
    neighbor_result_json_str = retrieve_neighboring_chunks_tool(
        target_global_chunk_index=target_idx,
        num_before=num_b,
        num_after=num_a
    )

    print(f"\nRaw JSON Output from Neighbor tool:\n{neighbor_result_json_str}")

    try:
        neighbor_result = json.loads(neighbor_result_json_str)
        if "error" in neighbor_result:
            print(f"\n❌ Test Result (Error reported by tool): {neighbor_result['error']}")
        elif "results" in neighbor_result and isinstance(neighbor_result["results"], list):
            print(f"\n✅ Test Result (Success!): Retrieved {len(neighbor_result['results'])} context chunks (target + neighbors).")
            if "warning" in neighbor_result:
                print(f"   Note: Tool issued a warning: {neighbor_result['warning']}")

            for chunk_info in neighbor_result["results"]:
                 print(f"\n--- Context Chunk ---")
                 print(f"  Type: {chunk_info.get('type', 'N/A')}")
                 print(f"  Source PDF: {chunk_info.get('source_pdf_name', 'N/A')}")
                 print(f"  Global Index: {chunk_info.get('global_chunk_index', 'N/A')}")
                 print(f"  Index in Doc: {chunk_info.get('chunk_index_in_doc', 'N/A')}")
                 print(f"  Text Preview: {chunk_info.get('text', '')[:150]}...")
        else:
            print("\n❌ Test Error: Neighbor tool returned unexpected JSON structure.")
            print(f"   Parsed JSON: {neighbor_result}")

    except json.JSONDecodeError:
        print("\n❌ Test Error: Neighbor tool output was not valid JSON.")
    except Exception as e:
        print(f"\n❌ Test Error: An unexpected error occurred processing neighbor result: {e}")

elif 'retrieve_neighboring_chunks_tool' not in locals():
     print("❌ Cannot run neighbor test: `retrieve_neighboring_chunks_tool` function not defined.")
elif 'first_result_global_index_for_next_test' not in locals() or first_result_global_index_for_next_test is None:
     print("❌ Cannot run neighbor test: No valid 'global_chunk_index' obtained from the previous RAG query test.")
     print("   Please ensure the RAG query test ran successfully and found at least one result.")
else:
     print("❌ Cannot run neighbor test: RAG system (chunks/metadata) seems uninitialized.")

print("-" * 50)

--- Testing Neighboring Chunks Tool ---
Attempting to retrieve neighbors for target global_chunk_index: 48 (before=1, after=1)
Neighbor Chunks Tool: Request for target_global_idx=48, num_before=1, num_after=1
   ✅ Neighbor Chunks Tool: Retrieved 3 total context chunks for target_global_idx 48.

Raw JSON Output from Neighbor tool:
{"results": [{"type": "before_1", "text": "s ARIA-E in, whereas ARIA-E in the setting of an -\nti-amyloid \u03b2 immunotherapy is viewed as an iatrogenic CAA-RI or CAA-RI\u2013like syndrome. So -\nlopova et al. (34) reported autopsy findings of an acute arteritis pattern resembling severe CAA-RI in a fatal ARIA case under lecanemab treatment, including widespread inflammation with macrophages and activated microglia, along with arteriol degeneration (25, 34). Unlike drug-induced ARIA, spontaneous CAA-RI is an autoimmune process that responds to immu -\nnosuppressive therapy and corticosteroids (35). On contrast enhanced MRI, involving the me -\nning\nes or bra

### 3.4 Web Search Tool (DuckDuckGoSearchTool)
We also need a general web search capability, primarily so that the RadiopaediaExpertAgent can find the correct Radiopaedia article URL for a diagnosis term. SmolAgents ships a ready-made DuckDuckGo search tool that can be passed to agents as a base tool, so we do not need to define one ourselves. The following cell instantiates it and prints its description:

In [26]:
# @title
# --- Instantiate the pre-built DuckDuckGo web search tool ---

from smolagents import DuckDuckGoSearchTool

web_search_tool = None # Initialize to None
try:
    # Instantiate the pre-built tool
    web_search_tool = DuckDuckGoSearchTool()
    print("✅ WebSearchTool instantiated successfully.")
    # Display its description so the learner sees what the agent knows about it
    print(f"   Tool Description: {web_search_tool.description}")
except Exception as e:
    print(f"❌ Error instantiating WebSearchTool: {e}")
    print("   Web search capabilities will be unavailable if this failed.")

✅ WebSearchTool instantiated successfully.
   Tool Description: Performs a duckduckgo web search based on your query (think a Google search) then returns the top search results.


In [None]:
# @title
# --- Test: DuckDuckGoSearchTool ---

print("--- Testing DuckDuckGo Search Tool ---")

# Check if the tool was instantiated successfully
# (test for the name first so an undefined variable cannot raise NameError)
if 'web_search_tool' in globals() and web_search_tool is not None:
    # Example query: find the Radiopaedia page for stroke
    test_search_query = "stroke site:radiopaedia.org"
    print(f"Performing web search for: '{test_search_query}'")

    try:
        # Call the tool instance directly (this dispatches to its forward method);
        # smolagents tools can often be called like ordinary functions
        search_results_str = web_search_tool(query=test_search_query)

        print("\n✅ Web Search Result (Success!):")
        # The tool typically returns a formatted string of results
        print(search_results_str)
        print("\n(Observe if the top result(s) point to the correct Radiopaedia URL)")

    except Exception as e:
        print(f"\n❌ Test Error: An unexpected error occurred during web search: {e}")
        import traceback
        traceback.print_exc()

else:
    print("❌ Cannot run web search test: `web_search_tool` was not instantiated successfully.")

print("-" * 50)

--- Testing DuckDuckGo Search Tool ---
Performing web search for: 'stroke site:radiopaedia.org'

✅ Web Search Result (Success!):
## Search Results

[Ischemic stroke | Radiology Reference Article - Radiopaedia.org](https://radiopaedia.org/articles/ischemic-stroke-2?lang=us)
Terminology. The term "stroke" is a clinical determination, whereas "infarction" is fundamentally a pathologic term 1.. Bridging these terms, ischemic stroke is the subtype of stroke that requires both a clinical neurologic deficit and evidence of CNS infarction (cell death attributable to ischemia). The evidence of infarction may be based on imaging, pathology, and/or persistent neurologic ...

[Stroke | Radiology Reference Article - Radiopaedia.org](https://radiopaedia.org/articles/stroke)
A stroke is a clinical diagnosis that refers to a sudden onset focal neurological deficit of presumed vascular origin.. Stroke is generally divided into two broad categories 1,2:. ischemic stroke (87%); hemorrhagic stroke (13%); 
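
The search tool returns a Markdown-formatted string, while the content extraction tool needs a URL. A minimal sketch of pulling Radiopaedia article links out of such a string with a regular expression follows. The helper name `radiopaedia_article_urls` is hypothetical, and the sample string is abbreviated from the output above; in the workshop itself the agent's generated code performs this step.

```python
import re

# Matches Markdown links of the form [title](url)
MD_LINK = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")

def radiopaedia_article_urls(search_results: str) -> list[str]:
    """Return unique Radiopaedia article URLs in order of appearance."""
    urls = []
    for _title, url in MD_LINK.findall(search_results):
        if "radiopaedia.org/articles/" in url and url not in urls:
            urls.append(url)
    return urls

sample_results = (
    "## Search Results\n\n"
    "[Ischemic stroke | Radiology Reference Article - Radiopaedia.org]"
    "(https://radiopaedia.org/articles/ischemic-stroke-2?lang=us)\n"
    "(snippet text)\n\n"
    "[Stroke | Radiology Reference Article - Radiopaedia.org]"
    "(https://radiopaedia.org/articles/stroke)\n"
    "(snippet text)\n"
)
article_urls = radiopaedia_article_urls(sample_results)
print(article_urls)
```

Filtering on `radiopaedia.org/articles/` is an assumption about which pages are useful; case or quiz pages would need a different filter.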

## Part 4. Agent Setup Using SmolAgents

With our specialized tools defined, we now construct the AI agents that will use them. This section details the setup of our three-agent system, designed for a sophisticated radiology report analysis workflow:

1. **LLM Configuration**: Setting up the InferenceClientModel instances that power the agents.
2. **Initial Information Extraction**: Defining the extract_impression_and_findings_with_groq function for pre-processing the user's transcription.
3. **Custom Agent Instructions**: Defining the detailed instruction strings that will be appended to each agent's default system prompt to guide their specific roles and workflows.
4. **Expert Agents**:
  - **RadiopaediaExpertAgent** (CodeAgent): Takes a primary diagnosis, uses web_search_tool to find the Radiopaedia URL, uses radiopaedia_content_extraction_tool to get content, and returns a concise summary.
  - **InternalReferenceExpertAgent** (CodeAgent): Takes imaging findings, uses query_internal_references_tool and retrieve_neighboring_chunks_tool to research differential diagnoses (Ddx) within local PDFs, and returns structured JSON results.
5. **Orchestrator Agent** (RadiologyReportAnalyzer - CodeAgent): The central coordinator, directing the expert agents and synthesizing their findings into final differential diagnosis suggestions for the user.

We will initialize each agent and then customize its behavior by appending our specific instructions to its prompt_templates["system_prompt"].
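
The idea of appending instructions to a system prompt can be sketched without any agent library. In the snippet below, `FakeAgent` is a hypothetical stand-in that only mimics the `prompt_templates` dictionary mentioned above; it is not the SmolAgents class, and `append_instructions` is an illustrative helper name.

```python
class FakeAgent:
    """Stand-in object exposing a prompt_templates dict (assumption)."""
    def __init__(self):
        self.prompt_templates = {"system_prompt": "You are a helpful agent."}

def append_instructions(agent, extra: str) -> None:
    """Append role-specific instructions after the default system prompt."""
    agent.prompt_templates["system_prompt"] += "\n\n" + extra.strip()

demo_agent = FakeAgent()
append_instructions(
    demo_agent,
    "--- ADDITIONAL INSTRUCTIONS FOR YOUR ROLE ---\nAlways cite sources.",
)
print(demo_agent.prompt_templates["system_prompt"])
```

Appending rather than replacing keeps the default prompt intact, so the agent retains whatever built-in guidance the library provides while gaining the role-specific rules.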

### 4.1: Define LLM Models for Agents

Each agent requires an LLM "brain". We use InferenceClientModel from SmolAgents to connect to suitable models (e.g., via the Hugging Face Inference API or partner providers). We'll configure models for the orchestrator (needs strong reasoning/coding), the Internal Reference expert (also needs coding ability), and the Radiopaedia expert.


In [None]:
# @title
# In Section 4

# --- Define LLM Models for Agents ---

# Ensure HF_TOKEN is available and login was successful from Section 1
if not HF_TOKEN or not HfFolder.get_token():
     raise ValueError(
         "❌ HF_TOKEN is not set or Hugging Face login failed in Section 1. "
         "Cannot initialize InferenceClientModel. Please review Section 1 setup."
     )

# --- Configuration for LLM Providers ---
# If needed, specify provider details here (e.g., for TogetherAI, FireworksAI)
LLM_PROVIDER_DETAILS = {}
# Example: LLM_PROVIDER_DETAILS = {"provider": "together"}

# --- Model Selection ---
# Orchestrator (CodeAgent): Needs strong reasoning, planning, code generation
# ORCHESTRATOR_MODEL_ID = "meta-llama/Llama-3.1-70B-Instruct"
ORCHESTRATOR_MODEL_ID = "Qwen/Qwen2.5-Coder-32B-Instruct"

# Internal Reference Expert (CodeAgent): Also needs good reasoning/code generation
INTERNAL_EXPERT_CODE_AGENT_MODEL_ID = "Qwen/Qwen2.5-Coder-32B-Instruct" # Or switch to the 70B model below if this one struggles
# INTERNAL_EXPERT_CODE_AGENT_MODEL_ID = "meta-llama/Llama-3.1-70B-Instruct"
# Radiopaedia Expert (CodeAgent): Task is more focused, so a smaller model could also work
RADIOPEDIA_AGENT_MODEL_ID = "Qwen/Qwen2.5-Coder-32B-Instruct"
# RADIOPEDIA_AGENT_MODEL_ID = "meta-llama/Llama-3.1-70B-Instruct"
print(f"Attempting to use ORCHESTRATOR_MODEL_ID: {ORCHESTRATOR_MODEL_ID}")
print(f"Attempting to use INTERNAL_EXPERT_CODE_AGENT_MODEL_ID: {INTERNAL_EXPERT_CODE_AGENT_MODEL_ID}")
print(f"Attempting to use RADIOPEDIA_AGENT_MODEL_ID (for Radiopaedia Expert): {RADIOPEDIA_AGENT_MODEL_ID}")

orchestrator_llm_model = None
internal_expert_llm_model = None
radiopedia_scraping_model = None # For Radiopaedia Expert

try:
    print(f"\nInitializing Orchestrator LLM: {ORCHESTRATOR_MODEL_ID}...")
    orchestrator_llm_model = InferenceClientModel(
        model_id=ORCHESTRATOR_MODEL_ID,
        token=HF_TOKEN,
        max_tokens=4096,  # Ample for complex logic and JSON output
        temperature=0.1,      # Low temp for deterministic code/planning
        **LLM_PROVIDER_DETAILS
    )
    print(f"✅ Orchestrator LLM ({ORCHESTRATOR_MODEL_ID}) initialized.")

    print(f"\nInitializing Internal Reference Expert LLM: {INTERNAL_EXPERT_CODE_AGENT_MODEL_ID}...")
    internal_expert_llm_model = InferenceClientModel(
        model_id=INTERNAL_EXPERT_CODE_AGENT_MODEL_ID,
        token=HF_TOKEN,
        max_tokens=4096, # Needs space for code generation and reasoning
        temperature=0.1,     # Low temp for reliable code generation
        **LLM_PROVIDER_DETAILS
    )
    print(f"✅ Internal Reference Expert LLM ({INTERNAL_EXPERT_CODE_AGENT_MODEL_ID}) initialized.")

    print(f"\nInitializing Radiopaedia Expert LLM: {RADIOPEDIA_AGENT_MODEL_ID}...")
    radiopedia_scraping_model = InferenceClientModel(
        model_id=RADIOPEDIA_AGENT_MODEL_ID,
        token=HF_TOKEN,
        max_tokens=10000, # Room for reasoning and summarizing
        temperature=0.2,     # Slightly higher temp might help summarization creativity
        **LLM_PROVIDER_DETAILS
    )
    print(f"✅ Radiopaedia Expert LLM ({RADIOPEDIA_AGENT_MODEL_ID}) initialized.")

except Exception as e:
    print(f"\n❌ Error initializing one or more InferenceClientModels: {e}")
    print("   Troubleshooting tips:")
    print("   - Verify your HF_TOKEN in Section 1.")
    print("   - Check the model IDs are valid and accessible via your token/provider.")
    print("   - Consider trying smaller models (e.g., 8B versions) if resource limits are suspected.")
    # The model variables were set to None before the try block, so any model
    # whose initialization did not complete is still None at this point.
    print("\nContinuing setup, but agent functionality may be affected if LLMs failed.")

Attempting to use ORCHESTRATOR_MODEL_ID: Qwen/Qwen2.5-Coder-32B-Instruct
Attempting to use INTERNAL_EXPERT_CODE_AGENT_MODEL_ID: Qwen/Qwen2.5-Coder-32B-Instruct
Attempting to use RADIOPEDIA_AGENT_MODEL_ID (for Radiopaedia Expert): Qwen/Qwen2.5-Coder-32B-Instruct

Initializing Orchestrator LLM: Qwen/Qwen2.5-Coder-32B-Instruct...
✅ Orchestrator LLM (Qwen/Qwen2.5-Coder-32B-Instruct) initialized.

Initializing Internal Reference Expert LLM: Qwen/Qwen2.5-Coder-32B-Instruct...
✅ Internal Reference Expert LLM (Qwen/Qwen2.5-Coder-32B-Instruct) initialized.

Initializing Radiopaedia Expert LLM: Qwen/Qwen2.5-Coder-32B-Instruct...
✅ Radiopaedia Expert LLM (Qwen/Qwen2.5-Coder-32B-Instruct) initialized.
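
In the cell above, all three initializations share one `try` block, so an exception raised by the first would skip the remaining two. One possible alternative is to initialize each model independently. The sketch below assumes a `factory` callable standing in for `InferenceClientModel`; `init_models` and `fake_factory` are hypothetical names used only for illustration.

```python
from typing import Any, Callable

def init_models(specs: dict[str, dict], factory: Callable[..., Any]) -> dict[str, Any]:
    """Initialize each model on its own; failed ones map to None."""
    models = {}
    for name, kwargs in specs.items():
        try:
            models[name] = factory(**kwargs)
        except Exception as exc:  # keep going so other models still load
            print(f"Could not initialize {name}: {exc}")
            models[name] = None
    return models

def fake_factory(model_id: str, **kwargs):
    """Hypothetical stand-in for the real model class."""
    if not model_id:
        raise ValueError("empty model_id")
    return {"model_id": model_id, **kwargs}

model_specs = {
    "orchestrator": {"model_id": "Qwen/Qwen2.5-Coder-32B-Instruct", "temperature": 0.1},
    "broken": {"model_id": ""},
}
initialized = init_models(model_specs, fake_factory)
print(initialized)
```

With this layout a single bad model ID leaves the other agents usable, and downstream code can check each entry for `None`.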


### 4.2: Define Initial Diagnosis and Findings Extraction Function
This function uses a fast external LLM call (Groq) to pre-process the user's transcript, separating the likely primary diagnosis from the listed imaging findings. This structured data simplifies the input for our main Orchestrator agent.

In [None]:
# @title
# In Section 4

# (Make sure 'json' is imported, typically done in Section 1)
# import json

def extract_impression_and_findings_with_groq(report_text: str) -> dict:
    """
    Uses a Groq LLM to extract a primary impression and a list of key imaging
    findings from a radiology report, returning them in a dictionary.

    Args:
        report_text (str): The transcribed radiology report.

    Returns:
        dict: A dictionary with keys "primary_diagnosis" (str | None) and
              "imaging_findings" (list[str]). Returns default with None/empty list
              on error or if items are not found.
    """
    global groq_client # Use the client initialized in Section 1
    default_return = {"primary_diagnosis": None, "imaging_findings": []}
    if not groq_client:
        print("⚠️ Groq client not initialized (check Section 1 setup). Cannot extract diagnosis/findings via Groq.")
        return default_return
    if not report_text or not report_text.strip():
        print("⚠️ No report text provided to extract_impression_and_findings_with_groq.")
        return default_return

    print("Attempting to extract primary diagnosis and imaging findings using Groq LLM...")
    try:
        # Ensure model name is current on Groq
        groq_model_name = "llama-3.3-70b-versatile" # Or "llama3-70b-8192", etc.

        chat_completion = groq_client.chat.completions.create(
            messages=[
                {
                    "role": "system",
                    "content": (
                        "You are an expert medical information extraction system. Analyze the provided radiology report text. "
                        "Your task is to identify and extract two specific pieces of information: "
                        "1. The single **primary diagnosis** the radiologist seems to be considering most strongly (e.g., from the 'Impression' section or explicitly stated). If multiple possibilities are listed with equal emphasis, choose the first one mentioned or the most likely one based on common patterns. If no clear primary diagnosis is stated, return null or an empty string for this field. "
                        "2. A comprehensive list of distinct **imaging findings** described in the report (e.g., 'ring enhancement', 'T2 hyperintensity', 'mass effect', 'periventricular lesions', 'cerebellar atrophy'). Extract specific, descriptive findings. "
                        "Return this information strictly as a JSON object with exactly two keys: 'primary_diagnosis' (string or null) and 'imaging_findings' (a list of strings). Output ONLY the JSON object and nothing else.\n"
                        "Example Input: 'Findings: Multiple T2 bright lesions in white matter. Contrast shows ring enhancement in one frontal lesion. Impression: Suggestive of Multiple Sclerosis.'\n"
                        "Example Output: {\"primary_diagnosis\": \"Multiple Sclerosis\", \"imaging_findings\": [\"Multiple T2 bright lesions in white matter\", \"ring enhancement in one frontal lesion\"]}"
                    ),
                },
                {
                    "role": "user",
                    "content": f"Extract the primary diagnosis and key imaging findings from this radiology report:\n\n--- START REPORT ---\n{report_text}\n--- END REPORT ---\n\nOutput only a JSON object with 'primary_diagnosis' and 'imaging_findings' keys.",
                }
            ],
            model=groq_model_name,
            temperature=0.0, # Deterministic extraction
            max_tokens=450,  # Adjusted token limit
            response_format={"type": "json_object"}, # Explicitly request JSON
        )
        response_content = chat_completion.choices[0].message.content
        # print(f"   Groq raw JSON response: {response_content}") # Debugging line

        # Attempt to parse the JSON response
        try:
            parsed_json = json.loads(response_content)
            if not isinstance(parsed_json, dict):
                print(f"   Warning: Groq response was not a dictionary as expected. Type: {type(parsed_json)}")
                return default_return

            # Extract, validate, and clean the diagnosis
            diagnosis_raw = parsed_json.get("primary_diagnosis")
            # Whitespace-only strings are treated as "no diagnosis" (None)
            final_diagnosis = (diagnosis_raw.strip() or None) if isinstance(diagnosis_raw, str) else None

            # Extract, validate, and clean the findings list
            findings_raw = parsed_json.get("imaging_findings", [])
            if isinstance(findings_raw, list):
                # Filter out non-strings, strip whitespace, ensure uniqueness (case-insensitive check), sort
                seen_lower = set()
                cleaned_findings = []
                for f in findings_raw:
                    if isinstance(f, str) and f.strip():
                        f_stripped = f.strip()
                        if f_stripped.lower() not in seen_lower:
                            cleaned_findings.append(f_stripped)
                            seen_lower.add(f_stripped.lower())
                final_findings = sorted(cleaned_findings)
            else:
                print(f"   Warning: 'imaging_findings' from Groq was not a list. Type: {type(findings_raw)}")
                final_findings = [] # Default to empty list if not a list

            result = {"primary_diagnosis": final_diagnosis, "imaging_findings": final_findings}
            print(f"✅ Extracted via Groq: Diagnosis='{result['primary_diagnosis']}', Findings={result['imaging_findings']}")
            return result

        except json.JSONDecodeError:
            print(f"   Warning: Failed to parse JSON response from Groq for diagnosis/findings: {response_content}")
            return default_return
        except Exception as e_parse:
            print(f"   Error processing Groq JSON content: {e_parse}")
            return default_return

    except Exception as e_api:
        print(f"❌ Error calling Groq API for diagnosis/finding extraction: {e_api}")
        import traceback
        traceback.print_exc()
        return default_return

# --- Optional: Test the updated Groq Extraction Function ---
test_report_for_dx_finding = "Impression: Occipital hypodensity, likely representing acute stroke."
print(f"\nTesting Diagnosis/Finding Extraction for: '{test_report_for_dx_finding}'")
extracted_dx_fx = extract_impression_and_findings_with_groq(test_report_for_dx_finding)
print(f"Test Extraction Result: {extracted_dx_fx}")
print("-" * 50)


Testing Diagnosis/Finding Extraction for: 'Impression: Occipital hypodensity, likely representing acute stroke.'
Attempting to extract primary diagnosis and imaging findings using Groq LLM...
✅ Extracted via Groq: Diagnosis='acute stroke', Findings=['Occipital hypodensity']
Test Extraction Result: {'primary_diagnosis': 'acute stroke', 'imaging_findings': ['Occipital hypodensity']}
--------------------------------------------------
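
The extraction result is what eventually reaches the orchestrator, whose instructions in the next subsection refer to a dictionary named `input_data_dict`. A small guard that refuses to build that dictionary when nothing was extracted could look like the sketch below. The helper name `build_orchestrator_input` is an assumption, and the notebook may assemble this input differently.

```python
from typing import Optional

def build_orchestrator_input(extraction: dict) -> Optional[dict]:
    """Return a cleaned input dict, or None if there is nothing to analyze."""
    diagnosis = extraction.get("primary_diagnosis")
    findings = extraction.get("imaging_findings") or []
    if not diagnosis and not findings:
        return None
    return {"primary_diagnosis": diagnosis, "imaging_findings": list(findings)}

extracted_example = {
    "primary_diagnosis": "acute stroke",
    "imaging_findings": ["Occipital hypodensity"],
}
print(build_orchestrator_input(extracted_example))
print(build_orchestrator_input({"primary_diagnosis": None, "imaging_findings": []}))
```

Returning `None` for an empty extraction lets the caller skip the agent run entirely instead of spending LLM calls on an empty report.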


### 4.3: Define Custom Instruction Strings for Agents
These strings contain the detailed operational logic and output requirements for each agent. They will be appended to the default system prompts provided by SmolAgents.

In [None]:
# @title
# In Section 4

# --- Custom Instructions for Radiopaedia Expert Agent ---

radiopaedia_agent_custom_instructions_text = """
--- ADDITIONAL INSTRUCTIONS FOR YOUR ROLE (Radiopaedia Expert - CodeAgent) ---
You are an expert research assistant. Your specific task is to take a **single primary radiological impression** (provided as the `query` argument), find the most relevant Radiopaedia.org article(s) for it, extract their content, and then return it.

You have access to these tools:
1. `web_search(query: str)`: Use this to find the URL(s) of Radiopaedia pages.
2. `radiopaedia_content_extraction_tool(page_url: str)`: Use this to get the text content from a specific Radiopaedia URL.

Your workflow when called with a `query` (which is the primary diagnosis):
1.  **Find Radiopaedia URL(s):**
    *   Use `web_search` with a targeted search query like "`query` site:radiopaedia.org" to find the official Radiopaedia article URL(s) for the given diagnosis.
    *   Prioritize finding the single, most comprehensive Radiopaedia page. If multiple relevant pages are found (e.g., main article, specific subtypes), you should select the most relevant one.
2.  **Extract Content:**
    *   For the selected URL, use `radiopaedia_content_extraction_tool(page_url="URL_from_step_1")` to extract its textual content. Handle potential errors from this tool (e.g., if it returns a string starting with "Error:").
3. **Summarization:**
    * Summarize the information the page content offers on the differential diagnoses of the primary diagnosis in one page. Focus on how the differentials present and how to discern them from the primary diagnosis.
4.  **Handling Failures:**
    *   If you cannot find a relevant Radiopaedia URL via web search, or if content extraction fails for all selected URLs, return a clear error message like "Error: Could not find or process a Radiopaedia page for diagnosis '[query]'."
5.  **Final Output:**
    *   Write your report in `final_markdown_string`. Call `final_answer(final_markdown_string)`.

**Important Considerations:**
*   Generated code can use `json` and `ast` packages. Use `print()` for logging.
*   Handle errors robustly at each stage (parsing, tool calls, accessing results).

**Final Output:** Write your report in `final_markdown_string`. Call `final_answer(final_markdown_string)`.
"""
print("Defined: radiopaedia_agent_custom_instructions_text")

# --- Custom Instructions for Internal Reference Expert Agent (CodeAgent Version) ---

internal_rag_agent_custom_instructions_text = """
--- ADDITIONAL INSTRUCTIONS FOR YOUR ROLE (Internal Reference Expert - CodeAgent) ---
You are a specialized AI agent implemented as a CodeAgent. Your goal is to analyze a list of **imaging findings** to identify potential differential diagnoses (Ddx) for each of them, using information retrieved from an internal PDF reference library via provided tools.
Your final output MUST be a **well-formatted Markdown report** summarizing the potential Ddx identified for each finding, returned via the `final_answer()` function.

**Input:**
*   The list of findings is passed to you as part of your Task prompt.

**Available Tools (for your generated code):**
1. `query_internal_references_tool(query: str, top_k: int)`: Searches internal PDFs. Returns a **JSON string** ('results' list or 'error'). Your code must parse this.
2. `retrieve_neighboring_chunks_tool(...)`: Gets context around a chunk index. Returns a **JSON string** ('results' list or 'error'). Your code must parse this.

**Required Workflow Logic (might need two or more rounds of Python Code Generation):**

1.  **Extract Findings List:** Create a manual Python list of all imaging findings from the Task you received.
2.  **RAG:** Run RAG for each imaging finding to retrieve the chunks relevant to it. To do so, call `query_internal_references_tool(query=query_string, top_k=5)` (retrieve slightly more, e.g., top 5 initially) and expect it to return a JSON string.
3.  **Retrieved Chunks Analysis:**
    *    Read the retrieved chunks and decide which of them discuss diagnoses relevant to the respective imaging finding.
    *    Select the chunks that appear relevant.
4.  **Round 2: Retrieve Neighbors for Selected Chunks:**
    *   For each of the selected chunks:
        *   Call `retrieve_neighboring_chunks_tool` for that index. **Decide dynamically how many neighbors (`num_before`, `num_after`, max 3 each) are needed** based on the initial chunk's relevance or ambiguity (this requires LLM reasoning during code generation).
        *   Store and log the neighbor JSON string result associated with the selected chunks. Handle/log errors.
5.  Read the expanded chunks and check whether they mention any differential diagnoses relevant to the imaging findings. Do not perform a second round of RAG, and do not look further into the neighbor chunks.
6.  **Synthesize and Generate Markdown Report:**
    *   Based on the chunks you read, create a final report of all possible differential diagnoses for each of the imaging findings and mention the reasons why. Also cite the source PDF / related chunk numbers for reference.
    *   Emphasize and prioritize differential diagnoses that are relevant to more than one of the input imaging findings.
    *   Keep your report coherent, concise and relevant.
    *   Your report MUST be in Markdown format.
7.  **Final Output:** Write your report in `final_markdown_string`. Call `final_answer(final_markdown_string)`.

**Important Considerations:**
*   Generated code can use `json` and `ast` packages. Use `print()` for logging.
*   Handle errors robustly at each stage (parsing, tool calls, accessing results).
*   Base Ddx strictly on internal PDF text context.
"""
print("Defined: internal_rag_agent_custom_instructions_text")

# --- Custom Instructions for Orchestrator Agent (`RadiologyReportAnalyzer` - REVISED for Managed Agent Calls) ---
orchestrator_custom_instructions_text = """
--- HIGH-LEVEL GOAL & RESOURCES (RadiologyReportAnalyzer - CodeAgent) ---
You are the top-level Orchestrator CodeAgent. Your goal is to generate differential diagnosis suggestions for a radiologist by analyzing their initial assessment and orchestrating managed expert agents. Your final output MUST be a **Markdown report** string returned via `final_answer()`.

**Input:**
*   A Python dictionary `input_data_dict` (containing `primary_diagnosis`, `imaging_findings`) available in your execution environment.

**Available Resources (Managed Agents - call using `task=`):**
1.  `radiopaedia_expert`: Input: diagnosis string via `task=`. Output: related Radiopaedia article string.
2.  `internal_reference_expert`: Input: findings list string representation via `task=`. Output: Markdown report detailing Ddx for imaging findings.

**Required Task (Implement via Python Code Generation):**

1.  **Process Input:** Generate code to access `input_data_dict`, extract `primary_diagnosis` and `imaging_findings_list`. Handle errors and missing data robustly. Log extracted values. Set `input_ok` flag. If input is invalid, generate code to call `final_answer` with an error Markdown report.
2.  **Delegate Research (Conditional on `input_ok`):** Generate code that proceeds only if input is valid. Initialize variables for agent responses.
    *   **Call Radiopaedia Expert:** If `primary_diagnosis`, generate code to call `radiopaedia_expert` using `task=` with the diagnosis query. Store the returned string response. Include error handling and logging.
    *   **Call Internal Reference Expert:** If `imaging_findings_list`, generate code to construct the task string (including the string representation of the list) and call `internal_reference_expert` using `task=`. Store the returned Markdown string response. Include error handling and logging.
3.  **Synthesize Findings:** Analyze the string responses received from both agents (note that these will be in simple string formats and not JSON). Compare info from the Radiopaedia articles with the internal Ddx report (Markdown). Identify consistencies, conflicts, and noteworthy suggestions. Create a list of concise, actionable suggestion strings on what differentials to consider.
4.  **Generate Final Markdown Report:** Generate code to construct the final Markdown report string (`final_report_markdown`). No need to include the original content the agents returned to you, but instead include a Synthesized Suggestions section to talk about possible differential diagnoses to consider. Format clearly.
5.  **Verification:** Double check your report to make sure no part is missing or is incomplete. If needed, generate it again.
6.  **Final Output:** Generate code to call `final_answer(final_report_markdown)`. The argument MUST be the complete Markdown report string. Import `json` if needed internally by your generated code (e.g., for robust error handling).

**Important Considerations:**
*   Ensure the syntax of your generated code is valid. E.g., avoid generating code that might result in errors such as "f-string expression part cannot include a backslash".
*   Handle errors robustly at each stage (parsing, tool calls, accessing results).
"""
print("Defined: orchestrator_custom_instructions_text")

Defined: radiopaedia_agent_custom_instructions_text
Defined: internal_rag_agent_custom_instructions_text
Defined: orchestrator_custom_instructions_text
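
The internal RAG instructions above require the agent's generated code to parse each tool response, which arrives as a JSON string holding either a 'results' list or an 'error' entry. The following is a minimal sketch of such defensive parsing, not the code the agent actually generates. The helper name `parse_tool_response` and the item fields in the sample payloads (`chunk_index`, `text`) are hypothetical and used only for illustration.

```python
import json


def parse_tool_response(raw):
    """Return the 'results' list from a tool's JSON string, or [] on any problem."""
    try:
        payload = json.loads(raw)
    except (json.JSONDecodeError, TypeError) as exc:
        # Covers malformed JSON as well as non-string input such as None
        print(f"Could not parse tool output: {exc}")
        return []
    if not isinstance(payload, dict):
        print("Unexpected payload type; expected a JSON object.")
        return []
    if "error" in payload:
        print(f"Tool reported an error: {payload['error']}")
        return []
    results = payload.get("results", [])
    return results if isinstance(results, list) else []


# Hypothetical payloads, for illustration only
ok_payload = json.dumps({"results": [{"chunk_index": 43, "text": "..."}]})
err_payload = json.dumps({"error": "index not loaded"})

print(len(parse_tool_response(ok_payload)))
print(parse_tool_response(err_payload))
print(parse_tool_response("not json"))
```

Returning an empty list on every failure path lets the calling loop continue with the next finding instead of stopping on the first bad response.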


### 4.4: Define and Customize Agents
Initialize all three agents (Radiopaedia expert, Internal Reference expert, and Orchestrator) as CodeAgent instances. Then, immediately append the corresponding custom instruction string to each agent's prompt_templates["system_prompt"].
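
The append step can be sketched in isolation. The snippet below is only an illustration under stated assumptions: `StubAgent` is a hypothetical stand-in that mimics nothing more than the `prompt_templates` dictionary, and `append_custom_instructions` is a hypothetical helper, not part of smolagents.

```python
class StubAgent:
    """Hypothetical stand-in for an agent; only mimics `prompt_templates`."""

    def __init__(self, system_prompt):
        self.prompt_templates = {"system_prompt": system_prompt}


def append_custom_instructions(agent, extra_text):
    """Append extra_text to the agent's system prompt.

    Returns False (and changes nothing) when the default prompt is empty,
    mirroring the guard used in the cell below.
    """
    base = agent.prompt_templates.get("system_prompt")
    if not base:
        return False
    agent.prompt_templates["system_prompt"] = base + "\n" + extra_text
    return True


demo_agent = StubAgent("You are a helpful coding agent.")
customized = append_custom_instructions(demo_agent, "Return Markdown only.")
print(customized)
print(demo_agent.prompt_templates["system_prompt"])
```

Appending rather than replacing keeps the default system prompt intact, so the agent still knows how to format its code and call tools.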

In [None]:
# @title
# In Section 4

# --- Define and Customize Radiopaedia Expert Agent (CodeAgent) ---
radiopaedia_expert_agent = None
# Check dependencies (LLM model, tools from Sec 3)
if radiopedia_scraping_model and 'radiopaedia_content_extraction_tool' in locals() and web_search_tool:
    try:
        radiopaedia_expert_agent = CodeAgent(
            tools=[radiopaedia_content_extraction_tool],
            model=radiopedia_scraping_model,
            name="radiopaedia_expert",
            add_base_tools=True, # To pass web_search
            description=( # Short description for Orchestrator
                "Input: radiological impression string. Action: Finds relevant Radiopaedia URL via web search, extracts content, returns the content with URL or error."
            ),
            max_steps=8, # Increased steps for multi-tool use + summarization thought
            verbosity_level=2, # Set to 2 for deep debugging if needed
            additional_authorized_imports=['json', 'ast'] # For parsing its input task string
        )
        # Append custom instructions
        if radiopaedia_expert_agent.prompt_templates["system_prompt"]:
            radiopaedia_expert_agent.prompt_templates["system_prompt"] += "\n" + radiopaedia_agent_custom_instructions_text
            print("✅ Radiopaedia Expert Agent defined and system prompt customized.")
        else: print("⚠️ Radiopaedia Expert Agent: default system prompt template empty during customization.")
    except Exception as e:
        print(f"❌ Error defining/customizing Radiopaedia Expert Agent: {e}")
        radiopaedia_expert_agent = None
else:
    print("⚠️ Skipping Radiopaedia Expert Agent definition due to missing dependencies.")


# --- Define and Customize Internal Reference Expert Agent (CodeAgent) ---
internal_reference_expert_agent = None
# Check dependencies (LLM model, tools from Sec 3, RAG index from Sec 3)
if internal_expert_llm_model and 'query_internal_references_tool' in locals() and 'retrieve_neighboring_chunks_tool' in locals() and pdf_rag_index:
    try:
        internal_reference_expert_agent = CodeAgent(
            model=internal_expert_llm_model,
            tools=[query_internal_references_tool, retrieve_neighboring_chunks_tool],
            add_base_tools=False, # No base tools; this agent only uses the two RAG tools above
            name="internal_reference_expert",
            description=( # Description for Orchestrator
                "Input: task string containing Python list of imaging findings. Action: Generates code to use RAG+neighbor tools for each finding, identifies Ddx & reasons from internal PDFs. Output: Markdown report listing the Ddx and reasoning for each finding, with source citations."
            ),
            max_steps=25, # Increased steps for complex coding loop, multiple tool calls per finding
            verbosity_level=2, # Set to 2 for deep debugging if needed
            additional_authorized_imports=['json', 'ast'] # For parsing its input task string
        )
        # Append custom instructions
        if internal_reference_expert_agent.prompt_templates["system_prompt"]:
            internal_reference_expert_agent.prompt_templates["system_prompt"] += "\n" + internal_rag_agent_custom_instructions_text
            print("✅ Internal Reference Expert Agent (CodeAgent) defined, system prompt customized, 'json' & 'ast' authorized.")
        else: print("⚠️ Internal Reference Expert Agent: default system prompt template empty during customization.")
    except Exception as e:
        print(f"❌ Error defining/customizing Internal Reference Expert Agent: {e}")
        internal_reference_expert_agent = None
else:
    print("⚠️ Skipping Internal Reference Expert Agent definition due to missing dependencies.")


# --- Define and Customize Orchestrator Agent (`RadiologyReportAnalyzer` - CodeAgent) ---
radiology_report_analyzer_agent = None
managed_agents_list = [] # List to hold successfully created expert agents
if radiopaedia_expert_agent: managed_agents_list.append(radiopaedia_expert_agent)
if internal_reference_expert_agent: managed_agents_list.append(internal_reference_expert_agent)

# Check if Orchestrator LLM is ready and at least one expert agent was successfully created
if orchestrator_llm_model and managed_agents_list:
    print(f"\nProceeding to define Orchestrator Agent with {len(managed_agents_list)} managed expert agent(s).")
    try:
        radiology_report_analyzer_agent = CodeAgent(
            model=orchestrator_llm_model,
            managed_agents=managed_agents_list, # Pass the instances of subordinate agents
            tools=[], # Orchestrator delegates specific tool use
            add_base_tools=False, # No base tools; the orchestrator only delegates to managed agents
            name="RadiologyReportAnalyzer",
            description=( # Orchestrator's own description (less critical as it's top-level)
                "Top-level orchestrator for radiology report analysis. Receives primary diagnosis and findings (as the `input_data_dict` argument). Directs expert agents to gather info. Synthesizes results into final Ddx suggestions for the overall case. Outputs a Markdown report via final_answer()."
            ),
            verbosity_level=2, # High verbosity for the main orchestrator
            max_steps=15, # Planning, input parsing, 2 main agent calls, synthesis, final_answer
            additional_authorized_imports=['json', 'ast'] # For parsing its input task string
        )
        # Append custom instructions
        if radiology_report_analyzer_agent.prompt_templates["system_prompt"]:
            radiology_report_analyzer_agent.prompt_templates["system_prompt"] += "\n" + orchestrator_custom_instructions_text
            print("✅ Radiology Report Analyzer (Orchestrator) Agent defined, system prompt customized, 'json' & 'ast' authorized.")
        else: print("⚠️ Orchestrator Agent: default system prompt template empty during customization.")
    except Exception as e:
        print(f"❌ Error defining/customizing Radiology Report Analyzer Agent: {e}")
        radiology_report_analyzer_agent = None
elif not orchestrator_llm_model:
    print("⚠️ Skipping Orchestrator Agent definition: Orchestrator LLM was not successfully initialized.")
else: # managed_agents_list is empty
    print("⚠️ Skipping Orchestrator Agent definition: No subordinate expert agents were successfully created or added. The orchestrator needs managed agents to function.")

✅ Radiopaedia Expert Agent defined and system prompt customized.
✅ Internal Reference Expert Agent (CodeAgent) defined, system prompt customized, 'json' & 'ast' authorized.

Proceeding to define Orchestrator Agent with 2 managed expert agent(s).
✅ Radiology Report Analyzer (Orchestrator) Agent defined, system prompt customized, 'json' & 'ast' authorized.


## Part 5. Multi-Agent Execution Pipeline
With our specialized tools defined (Section 3) and our customized agents configured (Section 4), we now create the function that orchestrates the entire workflow. The run_differential_diagnosis_pipeline function defined below serves as the central engine for our radiology assistant.

Its process is as follows:

1. **Input**: Takes the raw transcribed text from the user's dictation.

2. **Information Extraction**: Calls the extract_impression_and_findings_with_groq function to parse the transcription into a primary diagnosis (if found) and a list of imaging findings.

3. **Task Preparation for Orchestrator**: Constructs the task input for the RadiologyReportAnalyzer agent: a short task string plus the extracted diagnosis and findings, passed as a Python dictionary (input_data_dict) through additional_args.

4. **Orchestrator Invocation**: Executes the radiology_report_analyzer_agent.run() method with the prepared task. The Orchestrator, guided by its comprehensive instructions (appended system prompt), will then:
  - Call the radiopaedia_expert agent with the primary diagnosis.
  - Call the internal_reference_expert agent with the list of imaging findings.
  - Receive the Radiopaedia content and the Markdown Ddx report from the internal references agent.
  - Synthesize this information.
  - Generate overall differential diagnosis suggestions.

5. **Output Processing**: Receives the final output from the Orchestrator, which should be a Markdown string (or an AgentText object, which behaves like a string) containing the complete analysis and suggestions.

6. **Validation**: Checks that the output is not None, is a string or AgentText, and is not empty.

7. **Return Value**: Returns the Markdown report string, or an error string if any step in the pipeline fails.

This function encapsulates the entire multi-agent interaction, triggered by the user's transcribed report.
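
The control flow described above can be sketched without any of the real components. This is a simplified illustration, not the workshop's implementation: `run_pipeline_sketch`, `fake_extract`, and `fake_orchestrate` are hypothetical names, and the two stubs stand in for the Groq extraction call and the Orchestrator agent.

```python
from typing import Callable


def run_pipeline_sketch(
    transcription: str,
    extract: Callable[[str], dict],
    orchestrate: Callable[[dict], object],
) -> str:
    """Validate input, extract fields, run the orchestrator, validate output."""
    if not transcription or not transcription.strip():
        return "Error: Transcription is empty."
    info = extract(transcription)
    try:
        output = orchestrate(info)
    except Exception as exc:
        return f"Error: Orchestrator run failed: {exc}"
    if output is None:
        return "Error: Analysis agent returned None output."
    report = str(output)  # also covers string-like wrapper objects
    if not report.strip():
        return "Error: Analysis agent returned an empty report string."
    return report


# Stubs for illustration only; real extraction and orchestration call LLMs
def fake_extract(text: str) -> dict:
    return {"primary_diagnosis": "stroke",
            "imaging_findings": ["Left occipital hypodensity"]}


def fake_orchestrate(info: dict) -> str:
    return f"## Suggestions\n- Review alternatives to {info['primary_diagnosis']}"


print(run_pipeline_sketch("FINDINGS: ...", fake_extract, fake_orchestrate))
```

Every failure path returns a string with the same "Error:" prefix, so a caller needs only one `startswith("Error:")` check to tell a report from a failure.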

In [None]:
# @title
# In Section 5

# Required imports
import json # Keep for potential internal error JSON checking if needed
import time
from smolagents.agent_types import AgentText # Import AgentText to handle it explicitly

def run_differential_diagnosis_pipeline(transcription: str) -> str: # Return type is now string
    """
    Orchestrates the multi-agent pipeline. Expects the Orchestrator agent
    to return the final result as a Markdown string via final_answer().

    Args:
        transcription (str): The transcribed radiology report text.

    Returns:
        str: If successful, returns the Markdown report string generated by the
             Orchestrator agent. If any part fails, returns an error string.
    """
    # --- Input Validation ---
    if not transcription or not transcription.strip():
        print("❌ Pipeline Error: Input transcription is empty.")
        return "Error: Transcription is empty."

    # --- Agent Readiness Check ---
    if 'radiology_report_analyzer_agent' not in globals() or radiology_report_analyzer_agent is None:
        print("❌ Pipeline Error: Orchestrator Agent not initialized.")
        return "Error: Core analysis agent (Orchestrator) is not initialized."

    print("\n--- 🚀 Starting Differential Diagnosis Pipeline 🚀 ---")
    print(f"Input Transcription (first 200): '{transcription[:200]}...'")

    # --- Step 1: Extract Diagnosis & Findings ---
    print("\n--- Step 1: Extracting Diagnosis & Findings (with Groq) ---")
    extracted_info_dict = extract_impression_and_findings_with_groq(transcription)
    # ... (Warnings about missing diagnosis/findings) ...

    # --- Step 2: Prepare Task and Args for Orchestrator ---
    task_for_orchestrator = (
        "Analyze the radiology case using the data in the `input_data_dict` argument. "
        "Execute your workflow, call managed agents, synthesize results, "
        "and return the final analysis as a **Markdown report string** via `final_answer()`." # Emphasize Markdown output in task too
    )
    orchestrator_args = {"input_data_dict": extracted_info_dict}
    print("\n--- Step 2: Task and Args prepared for Orchestrator Agent ---")
    print(f"   Argument ('input_data_dict'): {orchestrator_args['input_data_dict']}")

    # --- Step 3: Run the Orchestrator Agent ---
    print("\n--- Step 3: Invoking Orchestrator Agent...")
    start_time = time.time()
    final_orchestrator_output = None # Can be AgentText, str, or None

    try:
        # .run() returns the object passed to final_answer()
        final_orchestrator_output = radiology_report_analyzer_agent.run(
            task_for_orchestrator,
            additional_args=orchestrator_args
        )
        end_time = time.time()
        print(f"✅ Orchestrator Agent execution completed in {end_time - start_time:.2f} seconds.")
        # print(f"   Raw output object type from Orchestrator: {type(final_orchestrator_output)}") # Debug type

    except Exception as e:
        print(f"❌ CRITICAL ERROR during Orchestrator Agent execution: {e}")
        # ... (Get last step info) ...
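        # Use the same "Error:" prefix as every other failure path so that
        # callers checking startswith("Error:") also catch this case
        return f"Error: Orchestrator run failed: {str(e)}"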

    # --- Step 4: Validate and Return the Orchestrator's Output (expecting Markdown string) ---
    print("\n--- Step 4: Validating Orchestrator's Final Output ---")

    # Check if output exists
    if final_orchestrator_output is None:
        print("❌ Pipeline Error: Orchestrator Agent returned None.")
        # ... (Get last step info) ...
        return "Error: Analysis agent returned None output."

    # Check if it's a string or an AgentText object (which acts like a string)
    if isinstance(final_orchestrator_output, (str, AgentText)):
        # Convert AgentText to a plain string if necessary
        final_markdown_report = str(final_orchestrator_output)

        if not final_markdown_report.strip():
             print("❌ Pipeline Error: Orchestrator Agent returned an empty string.")
             return "Error: Analysis agent returned an empty report string."
        else:
             print("✅ Successfully received Markdown report string from Orchestrator.")
             print("--- Differential Diagnosis Pipeline Completed Successfully 🎉 ---")
             return final_markdown_report # Return the Markdown string
    else:
        # Output was not a string or AgentText, report error
        print(f"❌ Pipeline Error: Orchestrator Agent returned type {type(final_orchestrator_output)}, expected string or AgentText.")
        print(f"   Raw output received: {str(final_orchestrator_output)[:500]}...")
        return f"Error: Analysis agent returned unexpected output type ({type(final_orchestrator_output)})."

In [None]:
# @title
# --- Example Usage Cell (for testing Section 5 directly) ---
# Runs the pipeline once on a short sample report and prints the Markdown result.

sample_report_for_pipeline_test_revised = """
HISTORY: 65yo female with visual disturbance with PMH of Alzheimer's disease.
FINDINGS: Left occipital hypodensity.
IMPRESSION: Findings are highly suspicious for stroke.
"""

print("\n--- ### RUNNING PIPELINE TEST ### ---")
# Ensure dependent variables (agents, etc.) are not None before running
all_systems_go_revised = (
    orchestrator_llm_model is not None and
    radiopedia_scraping_model is not None and # For radiopaedia expert
    internal_expert_llm_model is not None and # For internal ref expert
    web_search_tool is not None and
    'radiopaedia_content_extraction_tool' in locals() and # Tool check
    'query_internal_references_tool' in locals() and # Tool check
    'retrieve_neighboring_chunks_tool' in locals() and # Tool check
    radiopaedia_expert_agent is not None and
    internal_reference_expert_agent is not None and
    radiology_report_analyzer_agent is not None and
    pdf_rag_index is not None # RAG system check
)

# --- Run the test only if every component was initialized ---
# (The pipeline returns a string: either the Markdown report or an error message)
print("\n--- ### RUNNING REVISED PIPELINE TEST (Expecting Markdown) ### ---")
if all_systems_go_revised: # Check dependencies
    pipeline_test_result_markdown = run_differential_diagnosis_pipeline(sample_report_for_pipeline_test_revised)
    print("\n--- ### REVISED PIPELINE TEST RESULT (MARKDOWN) ### ---")
    if pipeline_test_result_markdown.startswith("Error:"):
        print(f"Pipeline Test Error: {pipeline_test_result_markdown}")
    else:
        # Print the Markdown report
        print(pipeline_test_result_markdown)
else:
    print("⚠️ SKIPPING REVISED PIPELINE TEST: Check component initialization.")
print("-" * 50)

## Part 6. Advanced GUI (Integrated Workflow)
This final section brings all our components together into a single, interactive Gradio application. This interface will allow a radiologist to:

1. Input a Report:
  - Record their report dictation using a microphone.
  - Alternatively, type their report directly into a textbox or edit the transcription.

2. Transcribe: Convert the audio to text using Whisper (if audio was provided).

3. Analyze: Send the (potentially edited) report text to our multi-agent system (run_differential_diagnosis_pipeline).

4. View Results: Display the comprehensive Markdown report generated by the Orchestrator agent, which includes:
  - A summary of the input diagnosis and findings.
  - The concise summary from Radiopaedia regarding the primary diagnosis.
  - The detailed Markdown report from the Internal Reference Expert on Ddx for different findings.
  - The Orchestrator's final synthesized differential diagnosis suggestions.

The UI uses rows, columns, and grouped components for a clear presentation of the workflow and its outputs.
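
The analysis callback in the cell below is written as a generator: Gradio accepts generator functions as event handlers and refreshes the output components on each `yield`, which is how a "processing" message can be shown before the final report arrives. The following sketch shows that pattern in plain Python, without Gradio. The names `analyze_with_status` and `fake_pipeline` are hypothetical, and the stub replaces the real multi-agent pipeline.

```python
from typing import Callable, Iterator, Tuple


def analyze_with_status(
    report_text: str,
    pipeline: Callable[[str], str],
) -> Iterator[Tuple[str, str]]:
    """Yield (status, report) pairs: an interim update, then the final result."""
    if not report_text or not report_text.strip():
        # In a generator, errors must be yielded; a returned value is not emitted
        yield ("Status: Error - Report text is empty.",
               "**Analysis Error:** Report text cannot be empty.")
        return
    yield ("Status: Starting multi-agent analysis...", "Processing your report...")
    result = pipeline(report_text)
    if result.startswith("Error:"):
        yield ("Status: Analysis failed.", f"**Pipeline Error:** {result}")
    else:
        yield ("Status: Analysis complete.", result)


# Stub for illustration only; the real pipeline runs the agents
def fake_pipeline(text: str) -> str:
    return "## Report\nNo real analysis was performed."


for status, body in analyze_with_status("FINDINGS: ...", fake_pipeline):
    print(status)
```

Note the bare `return` after the first `yield`: it ends the generator early so that an empty report never reaches the pipeline.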


In [None]:
# In Section 6

# Ensure Gradio, json, time are imported (usually in Section 1)
import gradio as gr
import json
import time
import os
import shutil
import numpy as np
from datetime import datetime
import soundfile as sf  # Make sure to pip install soundfile

print("Setting up Advanced Gradio Interface (Revised Audio Handling)...")

# Create a directory to store permanent audio files
os.makedirs("saved_audio", exist_ok=True)

# --- Define UI Theme (Optional but recommended) ---
theme = gr.themes.Default(
    primary_hue=gr.themes.colors.blue,
    secondary_hue=gr.themes.colors.neutral
).set(
    button_primary_background_fill="*primary_500",
    button_primary_background_fill_hover="*primary_600",
    button_primary_text_color="white",
    button_secondary_background_fill="*neutral_200",
    button_secondary_background_fill_hover="*neutral_300",
    button_secondary_text_color="*neutral_700",
)

# CSS for styling the output area
css_custom = """
    #final-report-display div { max-height: 70vh; overflow-y: auto; border: 1px solid #e0e0e0; padding: 10px; background-color: #f9f9f9; border-radius: 4px; }
    #report-text-input textarea { font-size: 1.05em; line-height: 1.5; }
    #status-area p { font-style: italic; color: #555; }
"""

with gr.Blocks(theme=theme, title="Radiology Multi-Agent Assistant", css=css_custom) as advanced_radiology_gui:
    gr.Markdown(
        "# 🧠 Advanced Radiology Report Assistant with Multi-Agent AI 🔬"
        "\n*Powered by SmolAgents, Whisper, and LLMs*"
    )

    # This variable will store the current audio data
    current_audio_data = gr.State(value=None)

    # --- This Textbox will hold the current report text for editing and analysis ---
    current_report_text_display = gr.Textbox(
        label="📝 Report Text (Type, Paste, or Edit Transcription)",
        lines=8,
        placeholder="Your report text will appear here. You can type directly or edit the text after transcription.",
        interactive=True, # User can edit this box
        elem_id="report-text-input"
    )

    with gr.Row():
        with gr.Column(scale=3): # Main interaction column
            gr.Markdown(
                "**Workflow:**\n"
                "1. **Dictate or Type Report:** Use microphone or type directly into the 'Report Text' box above.\n"
                "2. **Transcribe (If Voice Used):** After recording, click **'🎤 Transcribe Audio'**. The text appears in the 'Report Text' box.\n"
                "3. **Edit Text:** Freely edit the text in the 'Report Text' box.\n"
                "4. **Analyze:** Once the report text is ready, click **'💡 Analyze with Multi-Agent AI'**.\n"
                "5. **Review Results:** The AI's detailed analysis and suggestions will appear at the bottom."
            )

            # --- Section 1: Input Report (Voice or Text) ---
            with gr.Group():
                gr.Markdown("### Step 1: Provide Your Report")

                # type="numpy" passes the audio to callbacks as a (sample_rate, data) tuple
                audio_input = gr.Audio(
                    sources=["microphone", "upload"],
                    type="numpy",
                    label="🎤 Record Dictation OR Upload Audio File:",
                    elem_id="audio-recorder",
                )

                # Transcribe Button in its own row below audio
                with gr.Row():
                    transcribe_button = gr.Button(
                        value="🎤 Transcribe Audio",
                        variant="secondary",
                    )

            # --- Section 2: AI Analysis Trigger ---
            with gr.Group():
                gr.Markdown("### Step 2: Perform AI Analysis")
                analyze_button = gr.Button(
                    value="💡 Analyze Report with Multi-Agent AI",
                    variant="primary",
                )

            status_update_area = gr.Markdown("Status: Ready", elem_id="status-area")

        with gr.Column(scale=2): # Sidebar column
            if 'brain_image_url' in globals() and brain_image_url:
                brain_caption = brain_image_caption if 'brain_image_caption' in globals() else "Sample Brain Image"
                gr.Image(value = brain_image_url, label = brain_caption, height=350, show_download_button=False)
            else:
                gr.Markdown("*(Sample CT image could be displayed here if `brain_image_url` is set)*")

            gr.Markdown("--- \n*Disclaimer: This is an educational demonstration tool. **Not for clinical diagnostic use.** Results should be critically reviewed by qualified medical professionals.*")


    # --- Section 3: Output Display Area ---
    with gr.Group():
        gr.Markdown("### Step 3: AI-Generated Report & Differential Diagnosis Suggestions")
        final_markdown_report_output = gr.Markdown(
            elem_id="final-report-display",
            latex_delimiters=[]
        )

    # --- Define Button Click Actions (Callbacks) ---

    # Function to save audio data to a permanent file
    def save_audio_to_file(audio_data):
        if audio_data is None:
            return None

        sample_rate, audio_array = audio_data

        # Create a unique filename using timestamp
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        permanent_path = os.path.join("saved_audio", f"audio_{timestamp}.wav")
        # Save the audio data directly
        try:
            sf.write(permanent_path, audio_array, sample_rate)
            print(f"Audio saved permanently to: {permanent_path}")
            return permanent_path
        except Exception as e:
            print(f"Error saving audio: {e}")
            return None

    # 1. Audio Change Event Handler - Keep the latest audio data in state (it is written to disk only at transcription time)
    def on_audio_recorded(audio_data):
        if audio_data is None:
            return None

        # Store the audio data for later use (no saving yet)
        return audio_data

    # Connect the audio recording event
    audio_input.change(
        fn=on_audio_recorded,
        inputs=[audio_input],
        outputs=[current_audio_data]
    )

    # 2. Transcription Button Callback
    def transcribe_and_update_textbox(audio_data, text_currently_in_box):
        if audio_data is None:
            return "Status: No audio provided. Please record or upload audio. 🎤", text_currently_in_box

        # First save the audio data to a permanent file
        audio_path = save_audio_to_file(audio_data)

        if not audio_path:
            return "Status: Failed to save audio. ❌", text_currently_in_box

        # Now transcribe from the permanent file
        try:
            print(f"Transcribing audio from: {audio_path}")
            # Adjust this to use your actual transcription function
            transcription_result = transcribe_audio(audio_path)

            if "Error:" in transcription_result:
                return f"Status: Transcription Failed - {transcription_result} ❌", text_currently_in_box
            else:
                return "Status: Transcription complete. You can now edit the text or proceed to analysis. ✅", transcription_result

        except Exception as e:
            error_msg = f"Error during transcription: {str(e)}"
            print(error_msg)
            return f"Status: {error_msg} ❌", text_currently_in_box

    # Connect the transcribe button
    transcribe_button.click(
        fn=transcribe_and_update_textbox,
        inputs=[current_audio_data, current_report_text_display],
        outputs=[status_update_area, current_report_text_display]
    )

    # 3. Analysis Button Callback
    def analyze_report_and_display(report_text_from_textbox):
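        if not report_text_from_textbox or not report_text_from_textbox.strip():
            # This function is a generator, so the error must be yielded;
            # a value passed to `return` would never reach the UI.
            yield ("Status: Error - Report text is empty. ❌",
                   "**Analysis Error:** Report text cannot be empty...")
            return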

        yield ("Status: Starting multi-agent analysis... 🔄",
               "```markdown\n🤖 **Processing your report...**\n...\n✨ Please be patient! ✨\n```")

        pipeline_output_str = run_differential_diagnosis_pipeline(report_text_from_textbox)

        if isinstance(pipeline_output_str, str) and pipeline_output_str.startswith("Error:"):
            yield ("Status: Analysis Failed ❌",
                   f"**Pipeline Error:**\n\n```text\n{pipeline_output_str}\n```\n\n...")
        elif isinstance(pipeline_output_str, str):
            yield ("Status: Analysis complete! ✅", pipeline_output_str)
        else:
            yield ("Status: Analysis completed with unexpected output. ⚠️",
                   f"**Unexpected Output:** {type(pipeline_output_str)}\n\n"
                   f"Output:\n```\n{str(pipeline_output_str)[:500]}\n```")

    analyze_button.click(
        fn=analyze_report_and_display,
        inputs=[current_report_text_display],
        outputs=[status_update_area, final_markdown_report_output]
    )

# --- Launch the Advanced GUI ---
print("\n🚀 Launching the Advanced Radiology Assistant GUI...")
print("   If running in Colab, a public link will appear. Open it in a new tab for best experience (especially microphone access).")
print("   Please ensure you **allow microphone access** in your browser when prompted for voice input.")

advanced_radiology_gui.queue().launch(share=True, debug=True)

Setting up Advanced Gradio Interface (Revised Audio Handling)...

🚀 Launching the Advanced Radiology Assistant GUI...
   If running in Colab, a public link will appear. Open it in a new tab for best experience (especially microphone access).
   Please ensure you **allow microphone access** in your browser when prompted for voice input.
Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().
* Running on public URL: https://676baf8225ea452966.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)



--- 🚀 Starting Differential Diagnosis Pipeline 🚀 ---
Input Transcription (first 200): 'HISTORY: 65yo female with visual disturbance with PMH of Alzheimer's disease.
FINDINGS: Left occipital hypodensity.
IMPRESSION: Findings are highly suspicious for stroke....'

--- Step 1: Extracting Diagnosis & Findings (with Groq) ---
Attempting to extract primary diagnosis and imaging findings using Groq LLM...
✅ Extracted via Groq: Diagnosis='stroke', Findings=['Left occipital hypodensity']

--- Step 2: Task and Args prepared for Orchestrator Agent ---
   Argument ('input_data_dict'): {'primary_diagnosis': 'stroke', 'imaging_findings': ['Left occipital hypodensity']}

--- Step 3: Invoking Orchestrator Agent...


Radiopaedia Content Extractor: Attempting to fetch URL: https://radiopaedia.org/articles/stroke
   ✅ Successfully extracted content from https://radiopaedia.org/articles/stroke. Final length for agent: 763 chars.


Radiopaedia Content Extractor: Attempting to fetch URL: https://radiopaedia.org/articles/stroke
   ✅ Successfully extracted content from https://radiopaedia.org/articles/stroke. Final length for agent: 763 chars.


Radiopaedia Content Extractor: Attempting to fetch URL: https://radiopaedia.org/articles/stroke
   ✅ Successfully extracted content from https://radiopaedia.org/articles/stroke. Final length for agent: 763 chars.


Local PDF RAG Tool: Received query 'Left occipital hypodensity', top_k=5
   ✅ Local PDF RAG Tool: Retrieved 5 structured items for 'Left occipital hypodensity'.


Local PDF RAG Tool: Received query 'occipital hypodensity', top_k=5
   ✅ Local PDF RAG Tool: Retrieved 5 structured items for 'occipital hypodensity'.


Local PDF RAG Tool: Received query 'occipital lobe hypodensity', top_k=5
   ✅ Local PDF RAG Tool: Retrieved 5 structured items for 'occipital lobe hypodensity'.
Local PDF RAG Tool: Received query 'occipital lobe low density', top_k=5
   ✅ Local PDF RAG Tool: Retrieved 5 structured items for 'occipital lobe low density'.
Local PDF RAG Tool: Received query 'occipital lobe decreased density', top_k=5
   ✅ Local PDF RAG Tool: Retrieved 5 structured items for 'occipital lobe decreased density'.


Neighbor Chunks Tool: Request for target_global_idx=43, num_before=2, num_after=2
   ✅ Neighbor Chunks Tool: Retrieved 5 total context chunks for target_global_idx 43.
Neighbor Chunks Tool: Request for target_global_idx=44, num_before=2, num_after=2
   ✅ Neighbor Chunks Tool: Retrieved 5 total context chunks for target_global_idx 44.
Neighbor Chunks Tool: Request for target_global_idx=46, num_before=2, num_after=2
   ✅ Neighbor Chunks Tool: Retrieved 5 total context chunks for target_global_idx 46.
Neighbor Chunks Tool: Request for target_global_idx=47, num_before=2, num_after=2
   ✅ Neighbor Chunks Tool: Retrieved 5 total context chunks for target_global_idx 47.
Neighbor Chunks Tool: Request for target_global_idx=50, num_before=2, num_after=2
   ✅ Neighbor Chunks Tool: Retrieved 5 total context chunks for target_global_idx 50.
Neighbor Chunks Tool: Request for target_global_idx=51, num_before=2, num_after=2
   ✅ Neighbor Chunks Tool: Retrieved 5 total context chunks for target_global_

✅ Orchestrator Agent execution completed in 263.36 seconds.

--- Step 4: Validating Orchestrator's Final Output ---
✅ Successfully received Markdown report string from Orchestrator.
--- Differential Diagnosis Pipeline Completed Successfully 🎉 ---
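
In the log above, each Neighbor Chunks Tool call with `num_before=2` and `num_after=2` returns 5 chunks: the target plus two on either side. Several targets sit close together (43, 44, 46, 47, 50, 51), so their windows overlap heavily. The sketch below illustrates that windowing arithmetic and how overlapping windows could be merged into one de-duplicated set of indices. It is not the tool's actual implementation, which is defined earlier in the notebook. The function names `neighbor_window` and `merged_windows` and the corpus size `TOTAL_CHUNKS = 100` are assumptions made for illustration only.

```python
from typing import Iterable, List


def neighbor_window(target_idx: int, num_before: int, num_after: int,
                    total_chunks: int) -> List[int]:
    """Return chunk indices around target_idx, clipped to [0, total_chunks - 1]."""
    start = max(0, target_idx - num_before)
    end = min(total_chunks - 1, target_idx + num_after)
    return list(range(start, end + 1))


def merged_windows(targets: Iterable[int], num_before: int, num_after: int,
                   total_chunks: int) -> List[int]:
    """Union the windows of several targets so each chunk index appears once."""
    seen = set()
    for target_idx in targets:
        seen.update(neighbor_window(target_idx, num_before, num_after, total_chunks))
    return sorted(seen)


# Hypothetical corpus size; the real number of chunks depends on the indexed PDFs.
TOTAL_CHUNKS = 100

# One call, as in the log: target 43 with two neighbors on each side.
print(neighbor_window(43, 2, 2, TOTAL_CHUNKS))  # [41, 42, 43, 44, 45]

# The six targets requested in the log, merged into a single index set.
merged = merged_windows([43, 44, 46, 47, 50, 51], 2, 2, TOTAL_CHUNKS)
print(len(merged))  # 13 unique chunks instead of 6 * 5 = 30
```

Under these assumptions, the six separate requests cover only 13 distinct chunks (indices 41 through 53), so a single merged request might pass less redundant context to the agent.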
