<a href="https://colab.research.google.com/github/paulokuriki/prompt_engineering/blob/main/prompt_engineering.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 📑 **Radiology Report Classification with Large Language Models**

## **Learning Objectives**

By the end of this session, you will be able to:


✅ Understand how Large Language Models (LLMs) assist in radiology report classification.  
✅ Apply effective prompting techniques for medical report analysis.  
✅ Develop specialized classifiers for different radiological findings.  

---

## **Introduction**
Radiologists frequently need to classify reports based on specific findings or conditions. **Large Language Models (LLMs)** can streamline this process, improving efficiency and consistency in medical report classification. This tutorial demonstrates how to leverage LLMs for various classification tasks in radiology.

---

## **Dataset: Indiana Chest X-ray Collection**
This notebook processes the **Indiana Chest X-ray Collection**, a publicly available dataset provided by the **National Library of Medicine (NLM), National Institutes of Health (NIH),** in collaboration with **Indiana University**.

### 🎯 **Acknowledgment**
- **Dataset Source**: [Open-i (NLM)](https://openi.nlm.nih.gov/)  
- **Reference Paper**:  
  > **Demner-Fushman D, Kohli MD, Rosenman MB, Shooshan SE, Rodriguez L, Antani S, Thoma GR, McDonald CJ.**  
  > *Preparing a collection of radiology examinations for distribution and retrieval.*  
  > J Am Med Inform Assoc. 2016 Mar;23(2):304-10.  
  > DOI: [10.1093/jamia/ocv080](https://doi.org/10.1093/jamia/ocv080)  
  > PMID: [26133894](https://pubmed.ncbi.nlm.nih.gov/26133894/) | PMCID: [PMC5009925](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5009925/)  

## **Setup and Data Loading**
To begin, we'll install the required libraries and load the dataset.

In [1]:
pip install langchain_openai langchain-ollama

Collecting langchain_openai
  Downloading langchain_openai-0.3.7-py3-none-any.whl.metadata (2.3 kB)
Collecting langchain-ollama
  Downloading langchain_ollama-0.2.3-py3-none-any.whl.metadata (1.9 kB)
Collecting tiktoken<1,>=0.7 (from langchain_openai)
  Downloading tiktoken-0.9.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.7 kB)
Collecting ollama<1,>=0.4.4 (from langchain-ollama)
  Downloading ollama-0.4.7-py3-none-any.whl.metadata (4.7 kB)
Downloading langchain_openai-0.3.7-py3-none-any.whl (55 kB)
Downloading langchain_ollama-0.2.3-py3-none-any.whl (19 kB)
Downloading ollama-0.4.7-py3-none-any.whl (13 kB)
Downloading tiktoken-0.9.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)
Installing colle

In [2]:
import csv
import glob
import json
import os
import random
import shutil
import subprocess
import tarfile
import threading
import time
import xml.etree.ElementTree as ET

from IPython.display import HTML, display
from langchain_openai import ChatOpenAI
from langchain_ollama import ChatOllama
import pandas as pd
import requests
from tqdm import tqdm

## **Dataset Processing**
### **Steps**
1. **Download**: Retrieve the dataset from the official NLM repository.  
2. **Extract**: Unpack the `.tgz` archive.  
3. **Parse**: Extract the **Findings** section from each radiology report.  
4. **Label**: Assign a **normal** or **abnormal** classification based on MeSH (Medical Subject Headings) terms.  
5. **Export**: Save the processed data as a **CSV file**.  
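
Steps 3 and 4 can be illustrated on a single in-memory report before running the full pipeline. The XML below is a made-up miniature that only mimics the element names the parser looks for (`AbstractText` with a `Label` attribute, and `MeSH/major`); real files in the collection contain more fields. The helper name `findings_and_label` is hypothetical and not part of the notebook.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature report; not an actual file from the collection.
SAMPLE_XML = """
<eCitation>
  <MedlineCitation>
    <Article>
      <Abstract>
        <AbstractText Label="COMPARISON">None.</AbstractText>
        <AbstractText Label="FINDINGS">The lungs are clear. Small hiatal hernia.</AbstractText>
      </Abstract>
    </Article>
  </MedlineCitation>
  <MeSH>
    <major>Hernia, Hiatal/small</major>
  </MeSH>
</eCitation>
"""

def findings_and_label(xml_text):
    """Return (findings, label) for one report given as an XML string."""
    root = ET.fromstring(xml_text)
    findings = ""
    for node in root.findall(".//AbstractText"):
        # Keep only the section whose Label is "Findings" (case-insensitive)
        if node.attrib.get("Label", "").lower() == "findings":
            findings = (node.text or "").strip()
            break
    majors = [m.text.strip() for m in root.findall(".//MeSH/major") if m.text]
    mesh_major = "|".join(majors)
    # Same rule as the pipeline: only the exact MeSH value "normal" is normal.
    label = "normal" if mesh_major == "normal" else "abnormal"
    return findings, label

sample_findings, sample_label = findings_and_label(SAMPLE_XML)
print(sample_findings)
print(sample_label)
```

Because the rule is an exact match on `"normal"`, any other MeSH value, including a minor finding, yields the label `abnormal`.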

In [3]:
# Define URLs and paths
TGZ_URL = "https://openi.nlm.nih.gov/imgs/collections/NLMCXR_reports.tgz"
TGZ_FILE = "NLMCXR_reports.tgz"
EXTRACT_DIR = "NLMCXR_reports_extracted"
XML_FOLDER = os.path.join(EXTRACT_DIR, "ecgen-radiology")
OUTPUT_CSV = "converted_reports.csv"

def run_with_feedback(func, description):
    """Run a function with feedback messages."""
    print(f"\n🔄 {description}...")
    func()
    print(f"✅ {description} completed successfully!\n")

def clean_up():
    """Remove old files and directories if they exist."""
    if os.path.exists(TGZ_FILE):
        os.remove(TGZ_FILE)
    if os.path.exists(EXTRACT_DIR):
        shutil.rmtree(EXTRACT_DIR)
    if os.path.exists(OUTPUT_CSV):
        os.remove(OUTPUT_CSV)

def download_file():
    """Download the dataset file."""
    #print("🔄 Downloading dataset...")
    response = requests.get(TGZ_URL, stream=True)
    response.raise_for_status()
    with open(TGZ_FILE, "wb") as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)
    print("✅ Dataset downloaded successfully!")

def extract_tgz():
    """Extract the dataset files."""
    #print("🔄 Extracting dataset files...")
    with tarfile.open(TGZ_FILE, "r:gz") as tar:
        tar.extractall(EXTRACT_DIR)
    print("✅ Files extracted successfully!")

def parse_xml_file(xml_path):
    """Parse a single XML file to extract findings and MeSH terms."""
    try:
        tree = ET.parse(xml_path)
        root = tree.getroot()

        # Extract AbstractText elements
        abstract_texts = root.findall(".//AbstractText")

        # Extract only the "Findings" section
        findings_text = ""
        for abstract in abstract_texts:
            label = abstract.attrib.get("Label", "").lower()
            if label == "findings":
                findings_text = abstract.text.strip() if abstract.text else ""
                break  # Stop after finding the "Findings" section

        # Extract MeSH terms
        mesh_major_list = [m.text.strip() for m in root.findall(".//MeSH/major") if m.text]
        mesh_major = "|".join(mesh_major_list) if mesh_major_list else ""

        return findings_text, mesh_major
    except (ET.ParseError, OSError):
        # Skip unreadable or malformed files instead of aborting the whole run
        return "", ""

def download_prepare_dataset():
    """Handles dataset downloading, extraction, and conversion to CSV."""
    run_with_feedback(clean_up, "Cleaning up old files")

    run_with_feedback(download_file, "Downloading dataset")

    run_with_feedback(extract_tgz, "Extracting dataset")

    xml_files = glob.glob(os.path.join(XML_FOLDER, "*.xml"))

    print(f"\n🔄 Processing {len(xml_files)} XML files...")
    reports = [parse_xml_file(xml_file) for xml_file in tqdm(xml_files, desc="🔍 Parsing XML files")]
    print(f"✅ Processing completed! Total parsed reports: {len(reports)}\n")

    # Save reports to CSV
    print("🔄 Saving extracted data to CSV...")
    with open(OUTPUT_CSV, "w", newline="", encoding="utf-8") as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(["Report", "MeSH Major"])
        for report, mesh_major in reports:
            writer.writerow([report, mesh_major])
    print(f"✅ Data saved to {OUTPUT_CSV}!")

    # Load and clean the dataset
    df = pd.read_csv(OUTPUT_CSV)
    print(f"\n🔄 Cleaning dataset... {len(df)} entries found.")

    df["label"] = df["MeSH Major"].apply(lambda x: "normal" if x == "normal" else "abnormal")
    df.rename(columns={"Report": "report"}, inplace=True)
    df = df.map(lambda x: x.strip() if isinstance(x, str) else x)
    df = df.replace('', pd.NA).dropna().reset_index(drop=True)
    df.to_csv(OUTPUT_CSV, index=False)

    print("✅ Dataset preparation completed!")
    return df

# ✅ Run the full pipeline with feedback
df = download_prepare_dataset()

# Show data distribution
print(f'\n✅ Dataset ready! Total sample reports: {len(df)}')
print(df.label.value_counts())



🔄 Cleaning up old files...
✅ Cleaning up old files completed successfully!


🔄 Downloading dataset...
✅ Dataset downloaded successfully!
✅ Downloading dataset completed successfully!


🔄 Extracting dataset...
✅ Files extracted successfully!
✅ Extracting dataset completed successfully!


🔄 Processing 3955 XML files...


🔍 Parsing XML files: 100%|██████████| 3955/3955 [00:02<00:00, 1879.77it/s]


✅ Processing completed! Total parsed reports: 3955

🔄 Saving extracted data to CSV...
✅ Data saved to converted_reports.csv!

🔄 Cleaning dataset... 3955 entries found.
✅ Dataset preparation completed!

✅ Dataset ready! Total sample reports: 3425
label
abnormal    2219
normal      1206
Name: count, dtype: int64


## **Installing Ollama**

In [4]:
MODEL_NAME = "llama3.1:8b"  # Change model name as needed

def run_command_with_feedback(command, description=None, check_existing=False):
    """Run a shell command with a description and print feedback."""
    if description is None:
        description = command
    print(f"\n🔄 {description}...")

    if check_existing:
        result = subprocess.run(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
        if result.returncode == 0:
            print(f"✅ {description} already completed, skipping.")
            return

    subprocess.run(command, shell=True, check=True)
    print(f"✅ {description} completed!")

def is_ollama_installed():
    """Check if Ollama is already installed."""
    result = subprocess.run("command -v ollama", shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return result.returncode == 0  # If the command returns 0, Ollama is installed

def is_ollama_running():
    """Check if Ollama server is running using its API."""
    try:
        response = requests.get("http://localhost:11434/api/tags", timeout=2)
        return response.status_code == 200
    except requests.RequestException:
        return False  # Server is not running or unreachable

def is_model_downloaded(model_name):
    """Check if a model is already downloaded in Ollama."""
    result = subprocess.run(["ollama", "list"], stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
    return model_name in result.stdout

def install_ollama():
    """Installs Ollama in Google Colab if not already installed."""
    if is_ollama_installed():
        print("✅ Ollama is already installed, skipping installation.")
        return

    print("\n🔄 Installing Ollama. This may take a few minutes...")
    run_command_with_feedback("apt update", "Updating package list")
    run_command_with_feedback("apt install -y pciutils", "Installing pciutils")
    run_command_with_feedback("curl -fsSL https://ollama.com/install.sh | sh", "Downloading and installing Ollama")
    print("✅ Ollama installed successfully!")

def start_ollama_server(silent=True):
    """Starts the Ollama server in a background thread if not already running."""
    if is_ollama_running():
        if not silent:
            print("✅ Ollama server is already running.")
        return True  # Never launch a second server, even in silent mode

    if not silent:
        print("\n🚀 Starting Ollama server...")
    thread = threading.Thread(target=lambda: subprocess.run(["ollama", "serve"], check=True))
    thread.start()
    time.sleep(5)  # Give it time to start
    if is_ollama_running():
        if not silent:
            print("✅ Ollama server is running!")
        return True
    else:
        if not silent:
            print("⚠️ Ollama server failed to start. Check for issues!")
        return False

def pull_model(model_name):
    """Downloads the model for Ollama if not already downloaded."""
    if is_model_downloaded(model_name):
        print(f"✅ Model '{model_name}' is already downloaded, skipping.")
        return

    print(f"\n🔄 Pulling model: {model_name}...")
    subprocess.run(["ollama", "pull", model_name], check=True)
    print(f"✅ Model '{model_name}' downloaded successfully!")

def load_model_into_vram(model_name = MODEL_NAME):
    """Sends a keep-alive request to ensure the model stays loaded in memory."""
    print(f"🔄 Sending keep-alive request for '{model_name}'...")
    try:
        response = requests.post("http://localhost:11434/api/generate",
                                 json={"model": model_name,
                                       "prompt": "Simply respond with 'Hi'",
                                       "stream": False,
                                       "keep_alive": -1},
                                 timeout=20)
        if response.status_code == 200:
            print(f"✅ Model '{model_name}' is now preloaded in memory!")
        else:
            print(f"⚠️ Failed to keep model '{model_name}' in memory. Response: {response.status_code}")
    except requests.RequestException as e:
        print(f"⚠️ Keep-alive request failed: {e}")

# ✅ Run the complete setup with verification
install_ollama()
start_ollama_server()
pull_model(MODEL_NAME)
load_model_into_vram(MODEL_NAME)



🔄 Installing Ollama. This may take a few minutes...

🔄 Updating package list...
✅ Updating package list completed!

🔄 Installing pciutils...
✅ Installing pciutils completed!

🔄 Downloading and installing Ollama...
✅ Downloading and installing Ollama completed!
✅ Ollama installed successfully!

🔄 Pulling model: llama3.1:8b...
✅ Model 'llama3.1:8b' downloaded successfully!
🔄 Sending keep-alive request for 'llama3.1:8b'...
✅ Model 'llama3.1:8b' is now preloaded in memory!


## **Classification with Large Language Models (LLMs)**
To facilitate classification, we define functions that analyze radiology reports using LLMs.

In [5]:
def classify_report(report, template, looking_for=None, examples=None, max_retries=3):
    """
    Classify a radiology report using an LLM with a retry mechanism.

    Before each attempt, check if the Ollama server is actively responding.
    If it appears idle, the function will refresh the server and reload the model.
    """
    model = MODEL_NAME
    for attempt in range(max_retries):
        # Check if the server is active; if not, refresh it.
        if not is_ollama_running():
            start_ollama_server(silent=True)

        try:
            if 'gpt' in model:
                # Initialize the OpenAI model (requires a valid API key)
                llm = ChatOpenAI(
                    model="gpt-4o-mini-2024-07-18",
                    temperature=0,
                    seed=42,
                    model_kwargs={"response_format": {"type": "json_object"}}
                )
            else:
                # Initialize the Ollama model
                llm = ChatOllama(
                    model=model,
                    base_url="http://localhost:11434",
                    format='json',  # Ask Ollama for JSON-formatted output
                    keep_alive=-1,
                    temperature=0,
                    seed=42
                )

            # Prepare the prompt using the provided template.
            if looking_for and examples:
                prompt = template.format(report=report, looking_for=looking_for, examples=examples)
            else:
                prompt = template.format(report=report)

            response = llm.invoke(prompt)
            result = json.loads(response.content.lower())
            return result['classification']
        except Exception as e:
            print(f"⚠️ Issue during classification attempt {attempt + 1}: {e}")
            # If the server appears idle, refresh it.
            if not is_ollama_running():
                start_ollama_server(silent=True)

    print("⚠️ Unable to complete the classification after several attempts. Please re-run the cell if necessary.")
    return 'error'


def classify_multiple_reports(df, n_reports=5, template=None, looking_for=None, examples=None, seed=44):
    """Classify multiple random reports."""
    results = []
    random.seed(seed)  # Set seed for reproducibility
    report_indices = random.sample(range(len(df)), n_reports)

    if not is_ollama_running():
        start_ollama_server(silent=True)

    for idx in tqdm(report_indices, desc="Classifying reports"):
        report = df.iloc[idx]['report']
        label = df.iloc[idx]['label']
        prediction = classify_report(report, template, looking_for, examples)
        results.append({
            'index': idx,
            'report': report,
            'original_label': label,
            'predicted_label': prediction
        })

    return results

def display_results(results, show_original_label=True):
    """Display classification results in a readable format"""

    for r in results:
        print(f"\nReport #{r['index']}:")
        print("=" * 50)
        print("Report text:")
        display(r['report'])
        print("-" * 50)

        if show_original_label:
            # Compare prediction with the original label
            is_correct = r['original_label'].lower() == r['predicted_label'].lower()
            classification = 'Correct' if is_correct else 'Wrong'
            classification_icon = '✅' if is_correct else '👎'
            print(f"Original Label : {r['original_label']}")
            print(f"Predicted Label: {r['predicted_label']}")
            print(f"Classification : {classification} {classification_icon}")

        else:
            # No ground truth is shown here: flag whether the finding was reported as present
            is_present = r['predicted_label'].lower() == "present"
            finding_icon = '✅' if is_present else '👎'
            print(f"Finding: {r['predicted_label']} {finding_icon}")

        print()


    total = len(results)
    print("SUMMARY:")
    print("=" * 50)
    print(f"\nAnalyzed {total} reports")

    # Show accuracy if original labels are available
    if show_original_label:
        correct = sum(1 for r in results if r['original_label'].lower() == r['predicted_label'].lower())

        print(f"Correct predictions: {correct}")
        print(f"Accuracy: {(correct/total)*100:.1f}%\n")
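
`classify_report` trusts whatever value the model puts under `"classification"`. A stricter variant could check the reply against the allowed values before returning it. The sketch below is one way to do that; `parse_classification` and its `allowed` parameter are hypothetical names, not part of the notebook.

```python
import json

# Hypothetical helper, not part of the notebook: validate a raw model reply.
def parse_classification(raw_reply, allowed=("normal", "abnormal")):
    """Return the classification from a JSON reply, or 'error' if it is unusable."""
    try:
        data = json.loads(raw_reply.lower())
    except json.JSONDecodeError:
        return "error"  # The reply was not valid JSON
    if not isinstance(data, dict):
        return "error"  # Valid JSON, but not an object with keys
    value = data.get("classification")
    return value if value in allowed else "error"

print(parse_classification('{"classification": "Abnormal"}'))
print(parse_classification('{"classification": "unsure"}'))
print(parse_classification('not json'))
```

For the condition-specific classifier, the same helper could be called with `allowed=("present", "absent")`.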


# **Let's Try It Out!** 🏥

## 1. Zero-Shot Classification
In **zero-shot classification**, the model determines whether a given chest X-ray report describes a normal or abnormal case without prior examples.


### How It Works:
**Prompt Template:**  
- The model receives a structured instruction to classify reports.
- The response format is **JSON**, containing the key **"classification"** with possible values: `"normal"` or `"abnormal"`.
- The `{report}` placeholder is replaced with an actual chest X-ray report before passing it to the model.
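
The placeholder substitution is plain Python string formatting. The sketch below uses a shortened stand-in template (`demo_template`, a hypothetical name) with the same `{report}` placeholder, and also shows one pitfall: a literal JSON example inside a template needs doubled braces, because `str.format` treats single braces as placeholders.

```python
# Stand-in template with the same {report} placeholder as the notebook's prompt.
demo_template = """
### INSTRUCTION
Classify the report as "normal" or "abnormal" and answer in JSON.

### REPORT TO CLASSIFY
{report}
"""

demo_report = "Both lungs are clear and expanded. Heart and mediastinum normal."
zero_shot_demo_prompt = demo_template.format(report=demo_report)
print(zero_shot_demo_prompt)

# Doubled braces survive formatting as literal braces.
json_hint = 'Answer like {{"classification": "normal"}} for: {report}'
print(json_hint.format(report=demo_report))
```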


In [6]:
zero_shot_prompt_template = """
### INSTRUCTION
You are a specialist in chest X-ray reports. Your task is to classify whether a report is normal or abnormal.
Your response should be in JSON format with the key "classification" and the possible values: "normal" or "abnormal".

### REPORT TO CLASSIFY
{report}
"""


seed = 42
results = classify_multiple_reports(df, n_reports=1, template=zero_shot_prompt_template, seed=seed)
display_results(results, show_original_label=True)


Classifying reports: 100%|██████████| 1/1 [00:00<00:00,  1.43it/s]


Report #2619:
Report text:





'Cardiac and mediastinal contours are within normal limits. The lungs are clear. Bony structures are intact. Small hiatal hernia.'

--------------------------------------------------
Original Label : abnormal
Predicted Label: normal
Classification : Wrong 👎

SUMMARY:

Analyzed 1 reports
Correct predictions: 0
Accuracy: 0.0%



### Testing Classification on Multiple Reports to Measure Performance

In [7]:
results = classify_multiple_reports(df, n_reports=10, template=zero_shot_prompt_template, seed=44)

display_results(results, show_original_label=True)

Classifying reports: 100%|██████████| 10/10 [00:05<00:00,  1.89it/s]


Report #1673:
Report text:





'Both lungs are clear and expanded. Heart and mediastinum normal.'

--------------------------------------------------
Original Label : normal
Predicted Label: normal
Classification : Correct ✅


Report #2130:
Report text:


'Lungs are clear bilaterally. Cardiac and mediastinal silhouettes are normal. Pulmonary vasculature is normal. No pneumothorax or pleural effusion. No acute bony abnormality.'

--------------------------------------------------
Original Label : normal
Predicted Label: normal
Classification : Correct ✅


Report #2219:
Report text:


'Redemonstration of azygos lobe. Redemonstrated left perihilar nodular opacity, similar in size from previous examination. Dense appearing, may be granulomatous. The trachea is midline. Negative for pneumothorax, pleural effusion or focal airspace consolidation. The heart size is normal. XXXX. Limited exam, for evaluation of fractures. However, no evidence for displaced rib fracture.'

--------------------------------------------------
Original Label : abnormal
Predicted Label: abnormal
Classification : Correct ✅


Report #2873:
Report text:


'Heart size normal. Lungs are clear. XXXX are normal. No pneumonia, effusions, edema, pneumothorax, adenopathy, nodules or masses.'

--------------------------------------------------
Original Label : normal
Predicted Label: normal
Classification : Correct ✅


Report #477:
Report text:


'Both lungs are clear and expanded. Heart and mediastinum normal.'

--------------------------------------------------
Original Label : normal
Predicted Label: normal
Classification : Correct ✅


Report #723:
Report text:


'The lungs are clear bilaterally. Specifically, no evidence of focal consolidation, pneumothorax, or pleural effusion.. Scattered calcified granulomas noted. Cardio mediastinal silhouette is unremarkable. Visualized osseous structures of the thorax demonstrate mild multilevel degenerative disc disease of the thoracolumbar spine without acute abnormality.'

--------------------------------------------------
Original Label : abnormal
Predicted Label: normal
Classification : Wrong 👎


Report #1554:
Report text:


'The lungs are clear, and without focal airspace opacity. The cardiomediastinal silhouette is normal in size and contour, and stable. There is no pneumothorax or large pleural effusion. XXXX foreign body in the posterior soft tissues appear stable.'

--------------------------------------------------
Original Label : abnormal
Predicted Label: normal
Classification : Wrong 👎


Report #922:
Report text:


'XXXX XXXX and lateral chest examination was obtained. The heart silhouette is normal in size and contour. Aortic XXXX appear unremarkable. Lungs demonstrate no acute findings. There is no effusion or pneumothorax. Bilateral prominent lung vascularity medially, unchanged.'

--------------------------------------------------
Original Label : abnormal
Predicted Label: normal
Classification : Wrong 👎


Report #1186:
Report text:


'There are XXXX sternotomy XXXX identified. The heart is within normal limits in size. The aorta is calcified and tortuous. There are scattered calcified granulomas throughout both lungs. No focal infiltrate, pleural effusion, or pneumothorax. Mild degenerative changes of the thoracic spine.'

--------------------------------------------------
Original Label : abnormal
Predicted Label: abnormal
Classification : Correct ✅


Report #119:
Report text:


'Mild hypoventilation with bronchovascular crowding and prominent central and basilar interstitial markings. No focal alveolar consolidation, no pleural effusion demonstrated. Considering technical factors heart size XXXX within normal limits.'

--------------------------------------------------
Original Label : abnormal
Predicted Label: abnormal
Classification : Correct ✅

SUMMARY:

Analyzed 10 reports
Correct predictions: 7
Accuracy: 70.0%
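
Accuracy alone does not show which kind of error the model makes. The sketch below counts true and false positives and negatives, treating `abnormal` as the positive class, and derives sensitivity and specificity. The helper names are hypothetical, and the label pairs are copied from the ten-report run above.

```python
def confusion_counts(results, positive="abnormal"):
    """Count TP/FP/TN/FN, treating `positive` as the positive class."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for r in results:
        actual = r["original_label"].lower() == positive
        # Simplification: any other prediction, including 'error', counts as negative.
        predicted = r["predicted_label"].lower() == positive
        if actual and predicted:
            counts["tp"] += 1
        elif predicted:
            counts["fp"] += 1
        elif actual:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts

def sensitivity_specificity(counts):
    """Return (sensitivity, specificity); None where the denominator is zero."""
    pos = counts["tp"] + counts["fn"]
    neg = counts["tn"] + counts["fp"]
    sens = counts["tp"] / pos if pos else None
    spec = counts["tn"] / neg if neg else None
    return sens, spec

# (original, predicted) pairs in the order the reports are displayed above.
pairs = [("normal", "normal"), ("normal", "normal"), ("abnormal", "abnormal"),
         ("normal", "normal"), ("normal", "normal"), ("abnormal", "normal"),
         ("abnormal", "normal"), ("abnormal", "normal"), ("abnormal", "abnormal"),
         ("abnormal", "abnormal")]
demo_results = [{"original_label": o, "predicted_label": p} for o, p in pairs]
demo_counts = confusion_counts(demo_results)
print(demo_counts)
print(sensitivity_specificity(demo_counts))
```

In this sample all three errors are abnormal reports predicted as normal, so specificity is perfect while sensitivity is only one half. Ten reports are far too few to draw conclusions from.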



## 2. Few-Shot Classification

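Zero-shot classification is not always accurate: in the ten-report run above, the model called three reports normal even though they are labeled abnormal for findings such as calcified granulomas or a stable foreign body. To improve performance, we use **few-shot prompting**, where the prompt includes a few examples of the findings that should count as abnormal.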

### How It Works:
**Prompt Template:**  
- The `{examples}` placeholder contains relevant instances of abnormalities.
- The `{report}` placeholder is replaced with an actual chest X-ray report.
- The model is instructed to classify a report as **abnormal** if it contains specific **findings**.
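
The `{examples}` text can also be generated from a Python list, which makes it easier to add or remove findings while experimenting. This is a minimal sketch with a shortened stand-in template; `bullet_list` and `demo_few_shot_template` are hypothetical names, not part of the notebook.

```python
# Stand-in template with the same three placeholders as the notebook's prompt.
demo_few_shot_template = (
    "Classify a report as abnormal if it describes signs of {looking_for}, such as:\n"
    "{examples}\n"
    "### REPORT TO CLASSIFY\n"
    "{report}"
)

def bullet_list(items):
    """Render a list of strings as one '- item' line per entry."""
    return "\n".join(f"- {item}" for item in items)

demo_examples = bullet_list([
    "Low or high lung volume.",
    "Presence of hiatal hernia even if small.",
])
few_shot_demo_prompt = demo_few_shot_template.format(
    looking_for="any abnormalities",
    examples=demo_examples,
    report="The lungs are clear. Small hiatal hernia.",
)
print(few_shot_demo_prompt)
```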

In [13]:
few_shot_template = """
### INSTRUCTION
You are a specialist in chest X-ray reports.
Your task is to classify a report as abnormal if it describes signs of {looking_for}, such as:
{examples}
Consider the finding positive even if it is mild.
If multiple X-rays are reported together, focus only on the chest X-ray report.
Your response should be in JSON format with the key 'classification' and the possible values: 'normal' or 'abnormal'.

### REPORT TO CLASSIFY
{report}
"""

looking_for = 'any abnormalities'

examples = """
- Low or high lung volume.
- Abnormalities in the lungs, bones, heart, mediastinum, or pleura.
- Abnormal calcifications, granulomas, or calcified lymph nodes.
- Post-surgical changes overlying the axilla, neck, or abdominal regions.
- Presence of surgical or closure devices.
- Presence of hiatal hernia even if small.
- Part of the lung was not evaluated.
"""


print(f"\n{looking_for.capitalize()} Classifier")
print(f"Examples: {examples}")

seed = 42
results = classify_multiple_reports(df, n_reports=10,
                                    template=few_shot_template,
                                    looking_for=looking_for,
                                    examples=examples,
                                    seed=seed)

display_results(results, show_original_label=True)


Any abnormalities Classifier
Examples: 
- Low or high lung volume.
- Abnormalities in the lungs, bones, heart, mediastinum, or pleura.
- Abnormal calcifications, granulomas, or calcified lymph nodes.
- Post-surgical changes overlying the axilla, neck, or abdominal regions.
- Presence of surgical or closure devices.
- Presence of hiatal hernia even if small.
- Part of the lung was not evaluated.



Classifying reports: 100%|██████████| 10/10 [00:05<00:00,  1.67it/s]


Report #2619:
Report text:





'Cardiac and mediastinal contours are within normal limits. The lungs are clear. Bony structures are intact. Small hiatal hernia.'

--------------------------------------------------
Original Label : abnormal
Predicted Label: abnormal
Classification : Correct ✅


Report #456:
Report text:


'The cardiomediastinal silhouette is normal in size and contour. No focal consolidation, pneumothorax or large pleural effusion. Negative for acute bone abnormality.'

--------------------------------------------------
Original Label : normal
Predicted Label: normal
Classification : Correct ✅


Report #102:
Report text:


'The heart is normal in size. The pulmonary vascularity is within normal limits in appearance. No pneumothorax or pleural effusion. A wedge-shaped opacity has developed in the right upper lobe. There is also XXXX patchy opacification identified in the left upper lobe. No acute bony abnormality.'

--------------------------------------------------
Original Label : abnormal
Predicted Label: abnormal
Classification : Correct ✅


Report #3037:
Report text:


'There is mild XXXX opacification over both XXXX, XXXX secondary to soft tissue attenuation. There are no focal air space opacities. No pleural effusion or pneumothorax. Cardiomediastinal silhouette is within normal limits. Trachea is midline. No free subdiaphragmatic air.'

--------------------------------------------------
Original Label : abnormal
Predicted Label: abnormal
Classification : Correct ✅


Report #1126:
Report text:


'The lungs are clear bilaterally. Specifically, no evidence of focal consolidation, pneumothorax, or pleural effusion.. Cardio mediastinal silhouette is unremarkable. Visualized osseous structures of the thorax are without acute abnormality.'

--------------------------------------------------
Original Label : normal
Predicted Label: normal
Classification : Correct ✅


Report #1003:
Report text:


'Three images are available for review. The heart size is normal. The mediastinal contour is within normal limits. The lungs are free of any focal infiltrates. There are no nodules or masses. No visible pneumothorax. No visible pleural fluid. The XXXX are grossly normal. There is no visible free intraperitoneal air under the diaphragm.'

--------------------------------------------------
Original Label : normal
Predicted Label: normal
Classification : Correct ✅


Report #914:
Report text:


'Cardiomediastinal silhouette is within normal limits in overall size and appearance. Central vascular markings are symmetric and within normal limits. The lungs are normally inflated with no focal airspace disease, pleural effusion, or pneumothorax. No acute bone abnormality.'

--------------------------------------------------
Original Label : normal
Predicted Label: normal
Classification : Correct ✅


Report #571:
Report text:


'Normal heart and mediastinum. Clear lungs. Trachea is midline. No pneumothorax. No pleural effusion. Radiopaque foreign body overlying left chest.'

--------------------------------------------------
Original Label : abnormal
Predicted Label: abnormal
Classification : Correct ✅


Report #3016:
Report text:


'There is a subtle left medial base opacity. Cardiomediastinal silhouette is normal. Pulmonary vasculature and XXXX are normal. No pneumothorax or large pleural effusion. Osseous structures and soft tissues are normal.'

--------------------------------------------------
Original Label : abnormal
Predicted Label: abnormal
Classification : Correct ✅


Report #419:
Report text:


'No focal consolidation, suspicious pulmonary opacity or definite pleural effusion. Heart size and pulmonary vascularity within normal limits. Stable mediastinal contour. Calcified hilar lymph XXXX. Visualized osseous structures unremarkable.'

--------------------------------------------------
Original Label : abnormal
Predicted Label: normal
Classification : Wrong 👎

SUMMARY:

Analyzed 10 reports
Correct predictions: 9
Accuracy: 90.0%



## 3. Condition-Specific Classification

In some cases, you may need to classify reports based on specific medical conditions rather than general abnormalities.

### How It Works:
**Prompt Template:**  
- The `{looking_for}` placeholder specifies the condition to be detected (e.g., cardiomegaly, COPD).
- The `{examples}` placeholder contains key indicators related to the condition.
- The model classifies the report as **"present"** or **"absent"** based on the findings.

### Example: Cardiomegaly Detection
- **Signs of Cardiomegaly:**
  - Increased heart size
  - Enlarged cardiac silhouette
  - Increased cardiomediastinal silhouette
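
When several conditions need their own classifier, the signs can be kept in one place and turned into the `looking_for` and `examples` arguments on demand. The registry below is a hypothetical sketch: `CONDITION_SIGNS` and `classifier_args` are not part of the notebook, and only the cardiomegaly entry is taken from the list above.

```python
# Hypothetical registry: one entry per condition, each with its key signs.
# Only the cardiomegaly entry comes from this notebook; extend it as needed.
CONDITION_SIGNS = {
    "cardiomegaly": [
        "Increased heart size.",
        "Enlarged cardiac silhouette.",
        "Increased cardiomediastinal silhouette.",
    ],
}

def classifier_args(condition):
    """Return the looking_for/examples keyword arguments for one condition."""
    signs = CONDITION_SIGNS[condition]
    return {
        "looking_for": condition,
        "examples": "\n".join(f"- {sign}" for sign in signs),
    }

cardiomegaly_args = classifier_args("cardiomegaly")
print(cardiomegaly_args["looking_for"])
print(cardiomegaly_args["examples"])
```

The returned dictionary could then be unpacked into `classify_multiple_reports` with `**cardiomegaly_args`, alongside the template and the other arguments.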

In [9]:
finding_specific_template = """
### INSTRUCTION
You are a specialist in chest X-ray reports.
Your task is to classify a report as positive if it describes signs of {looking_for}, such as:
{examples}
Consider the finding positive even if it is mild.
Your response should be in JSON format with the key 'classification' and the possible values: 'present' or 'absent'.

### REPORT TO CLASSIFY
{report}
"""

looking_for = 'cardiomegaly'

examples = """
- Increased heart size.
- Enlarged cardiac silhouette.
- Increased cardiomediastinal silhouette.
"""

print(f"\n{looking_for.capitalize()} Detection")
print(f"Examples: {examples}")


results = classify_multiple_reports(df,
                                    n_reports=10,
                                    template=finding_specific_template,
                                    looking_for=looking_for,
                                    examples=examples,
                                    seed=45)

display_results(results, show_original_label=False)


Cardiomegaly Detection
Examples: 
- Increased heart size.
- Enlarged cardiac silhouette.
- Increased cardiomediastinal silhouette.



Classifying reports: 100%|██████████| 10/10 [00:05<00:00,  1.88it/s]


Report #1113:
Report text:





'Heart size appears enlarged. Mediastinal contours are within normal limits. Lung volumes are low with central bronchovascular crowding and patchy basilar atelectasis.. Osseous structures are within normal limits for patient age.'

--------------------------------------------------
Finding: present ✅


Report #1710:
Report text:


'The cardiomediastinal silhouette is normal in size and contour. Negative for effusion, pneumothorax, or focal airspace consolidation. The lungs are normally aerated.'

--------------------------------------------------
Finding: absent 👎


Report #1998:
Report text:


'Both lungs remain clear and expanded. No focal parenchymal infiltrates or pleural air collections. Heart and aorta are normal. No change in the large hiatus hernia. Pelvis. Bone density is decreased. Hips are normal and symmetric. No fractures, dislocations, or bone destruction. Note XXXX of a severe rotatory dextroscoliosis in the lumbar spine.'

--------------------------------------------------
Finding: absent 👎


Report #1055:
Report text:


'The heart and mediastinum are unremarkable. The lungs are clear without infiltrate. There is no effusion or pneumothorax.'

--------------------------------------------------
Finding: absent 👎


Report #335:
Report text:


'Normal cardiac contours. No pleural effusion or pneumothorax. Bilateral lower lobe bronchial thickening consistent with bronchitis.'

--------------------------------------------------
Finding: absent 👎


Report #1241:
Report text:


'Cardiomediastinal silhouettes are within normal limits. Lungs are clear without focal consolidation, pneumothorax, or pleural effusion. Bony thorax is unremarkable.'

--------------------------------------------------
Finding: absent 👎


Report #1387:
Report text:


'The XXXX examination consists of frontal and lateral radiographs of the chest. The cardiomediastinal contours are within normal limits. Pulmonary vascularity is within normal limits. No focal consolidation, pleural effusion, or pneumothorax identified. Multilevel degenerative changes are seen throughout the thoracic spine. XXXX anchors XXXX over the left humeral head. There is mild bilateral acromioclavicular joint osteoarthritis. Visualized upper abdomen is grossly unremarkable in appearance.'

--------------------------------------------------
Finding: absent 👎


Report #88:
Report text:


'Moderate bilateral interstitial edema, with cardiomegaly and bilateral effusion consistent with moderate cardiac failure. A large calcified right mediastinal adenopathy, XXXX chronic fungal. No pneumothorax.'

--------------------------------------------------
Finding: present ✅


Report #296:
Report text:


'Heart size normal. Lungs XXXX clear. XXXX XXXX normal. No pneumonia, effusions, edema, pneumothorax, adenopathy, nodules or masses.'

--------------------------------------------------
Finding: absent 👎


Report #1983:
Report text:


'The heart size is within normal limits. There are calcified hilar lymph XXXX bilaterally. There are bibasilar airspace opacities with small bilateral pleural effusions, left greater than right. No pneumothorax. No acute bony abnormalities.'

--------------------------------------------------
Finding: absent 👎

SUMMARY:

Analyzed 10 reports
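
The prompt above asks the model to reply in JSON with a `classification` key. The sketch below shows one way such a reply could be parsed defensively, assuming the model may add text around the JSON object. `parse_classification` is a hypothetical helper written for this illustration, not a function from the notebook, and the reply strings are made up.

```python
import json
import re


def parse_classification(raw_reply: str) -> str:
    """Return 'present', 'absent', or 'unknown' from a model reply (hypothetical helper)."""
    # Take the first {...} span so any text around the JSON object is ignored.
    match = re.search(r"\{.*?\}", raw_reply, flags=re.DOTALL)
    if match is None:
        return "unknown"
    try:
        payload = json.loads(match.group(0))
    except json.JSONDecodeError:
        # Covers malformed JSON, e.g. single-quoted keys.
        return "unknown"
    label = str(payload.get("classification", "")).strip().lower()
    return label if label in {"present", "absent"} else "unknown"


# Made-up replies for illustration.
print(parse_classification('{"classification": "present"}'))
print(parse_classification('Here is the result: {"classification": "Absent"}'))
print(parse_classification("I cannot tell."))
```

Mapping anything unexpected to `'unknown'` keeps malformed replies visible instead of silently counting them as `'absent'`.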


# Create Your Own Specialized Classifier! 🚀

You can create a custom classifier for any specific condition by following these steps:

1. **Define the condition** you want to detect (e.g., COPD, pleural effusion).
2. **Specify key indicators** associated with the condition.
3. **Fill the template** by passing the new condition and indicators as `looking_for` and `examples`; the template text itself can stay unchanged.
4. **Run the model** to classify reports based on your selected condition.
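
The first two steps can be sketched as follows. `format_examples` is a hypothetical helper that turns a Python list into the bullet block the template expects, and the pleural effusion indicators are illustrative choices rather than values from the notebook.

```python
def format_examples(indicators: list[str]) -> str:
    """Turn a list of indicators into a bullet block (hypothetical helper)."""
    # Normalize each item to end with exactly one period.
    bullets = [f"- {item.rstrip('.')}." for item in indicators]
    return "\n" + "\n".join(bullets) + "\n"


# Step 1: define the condition.
new_condition = "pleural effusion"

# Step 2: list key indicators (illustrative, not from the notebook).
new_examples = format_examples([
    "Blunted costophrenic angle",
    "Layering pleural fluid",
    "Meniscus sign",
])

print(new_condition)
print(new_examples)

# Steps 3 and 4 would pass these values to the notebook's own function, e.g.:
# results = classify_multiple_reports(df, n_reports=10,
#                                     template=finding_specific_template,
#                                     looking_for=new_condition,
#                                     examples=new_examples, seed=43)
```

Keeping the indicators in a list makes it easier to add or remove one without hand-editing a multi-line string.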

### Example: COPD Detection
- **Signs of COPD:**
  - Hyperinflated lungs
  - Flattened diaphragm
  - Increased retrosternal airspace
  - Narrowed cardiac silhouette
  - Emphysematous changes


In [10]:
finding_specific_template = """
### INSTRUCTION
You are a specialist in chest X-ray reports.
Your task is to classify a report as positive if it describes signs of {looking_for}, such as:
{examples}
Consider the finding positive even if it is mild.
Your response should be in JSON format with the key 'classification' and the possible values: 'present' or 'absent'.

### REPORT TO CLASSIFY
{report}
"""

looking_for = "COPD (Chronic Obstructive Pulmonary Disease)"

examples = """
- Hyperinflated lungs.
- Flattened diaphragm.
- Increased retrosternal airspace.
- Narrowed cardiac silhouette.
- Emphysematous changes.
"""

# Print the name as written: str.capitalize() would lowercase the acronym ("Copd ...").
print(f"\n{looking_for} Detection")
print(f"Examples: {examples}")


results = classify_multiple_reports(df,
                                    n_reports=10,
                                    template=finding_specific_template,
                                    looking_for=looking_for,
                                    examples=examples,
                                    seed=43)

display_results(results, show_original_label=False)


COPD (Chronic Obstructive Pulmonary Disease) Detection
Examples: 
- Hyperinflated lungs.
- Flattened diaphragm.
- Increased retrosternal airspace.
- Narrowed cardiac silhouette.
- Emphysematous changes.



Classifying reports: 100%|██████████| 10/10 [00:05<00:00,  1.88it/s]


Report #157:
Report text:





'The cardiomediastinal silhouette and pulmonary vasculature are within normal limits in size. Dual-XXXX cardiac pacing device is in XXXX, stable position with leads projecting over the right atrium and right ventricle. The lungs are clear of focal airspace disease, pneumothorax, or pleural effusion. There are no acute bony findings.'

--------------------------------------------------
Finding: absent 👎


Report #1171:
Report text:


'Stable left-sided ICD and postsurgical changes consistent with prior CABG. The cardiomediastinal silhouette and vasculature are within normal limits for size and contour. The lungs are normally inflated and clear. Mild degenerative endplate changes of the spine.'

--------------------------------------------------
Finding: absent 👎


Report #2851:
Report text:


'There are no focal areas of consolidation. No suspicious pulmonary opacities. Heart size within normal limits. No pleural effusions. No evidence of pneumothorax. Osseous structures intact.'

--------------------------------------------------
Finding: absent 👎


Report #3123:
Report text:


'The heart size is stable. The aorta is ectatic and atherosclerotic but stable. XXXX sternotomy XXXX are again noted. The scarring in the left lower lobe is again noted and unchanged from prior exam. There are mild bilateral prominent lung interstitial opacities consistent with emphysematous disease. The calcified granulomas are stable.'

--------------------------------------------------
Finding: present ✅


Report #589:
Report text:


'Lungs are hyperexpanded. Bullae are present in the upper lobes. No focal infiltrates. Heart size normal.'

--------------------------------------------------
Finding: present ✅


Report #1894:
Report text:


'Normal heart size. Left chest XXXX tip mid SVC. Right axillary surgical clips. Stable pleural based nodule left mid chest. No acute pulmonary findings.'

--------------------------------------------------
Finding: absent 👎


Report #1515:
Report text:


'Lungs are clear bilaterally. Cardiac and mediastinal silhouettes are normal. Pulmonary vasculature is normal. No pneumothorax or pleural effusion. No acute bony abnormality.'

--------------------------------------------------
Finding: absent 👎


Report #2751:
Report text:


'Heart size, mediastinal contour, and pulmonary vascularity are within normal limits. No focal consolidation, suspicious pulmonary opacity, large pleural effusion, or pneumothorax is identified. There is a rounded lucency seen above the diaphragm on lateral view, suggestive of small hiatal hernia. Visualized osseous structures appear intact. Degenerative changes of the thoracic spine seen.'

--------------------------------------------------
Finding: absent 👎


Report #2860:
Report text:


'The cardiomediastinal silhouette and pulmonary vasculature are within normal limits. There is no pneumothorax or pleural effusion. There are no focal areas of consolidation. Small T-spine osteophytes.'

--------------------------------------------------
Finding: absent 👎


Report #3247:
Report text:


'The heart is normal in size. The mediastinum is unremarkable. XXXX XXXX opacities in right mid lung. The lungs are otherwise grossly clear.'

--------------------------------------------------
Finding: absent 👎

SUMMARY:

Analyzed 10 reports


# Conclusion & Next Steps 📌

Through this notebook, we've explored how **prompt engineering** enhances the classification of radiology reports using LLMs.

### Key Takeaways:
✅ **Zero-shot prompting** enables classification without prior examples.  
✅ **Few-shot prompting** can improve accuracy by giving the model labeled examples.  
✅ **Condition-specific classification** allows targeted detection of medical conditions.  

### What’s Next?
Now that you understand the fundamentals, here are some next steps:
1. **Test additional conditions** by changing the `looking_for` and `examples` values that fill the `{looking_for}` and `{examples}` placeholders.
2. **Experiment with different LLM models** to compare performance.
3. **Refine your prompts** to improve classification accuracy.
4. **Validate on real-world datasets** before relying on the classifier in clinical settings.
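
To compare models or prompt variants, predictions need to be scored against reference labels. The sketch below computes accuracy, sensitivity, and specificity for `'present'`/`'absent'` labels. `binary_metrics` is a hypothetical helper, and both label lists are made up for illustration; they are not results from this notebook.

```python
def binary_metrics(predicted: list[str], reference: list[str]) -> dict[str, float]:
    """Accuracy, sensitivity, and specificity for 'present'/'absent' labels."""
    if not predicted or len(predicted) != len(reference):
        raise ValueError("Label lists must be non-empty and of equal length.")

    pairs = list(zip(predicted, reference))
    tp = sum(p == "present" and r == "present" for p, r in pairs)
    tn = sum(p == "absent" and r == "absent" for p, r in pairs)
    fp = sum(p == "present" and r == "absent" for p, r in pairs)
    fn = sum(p == "absent" and r == "present" for p, r in pairs)

    return {
        "accuracy": (tp + tn) / len(pairs),
        # Undefined ratios (no positives or no negatives) are reported as NaN.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Made-up labels for illustration only.
predicted = ["present", "absent", "absent", "present", "absent"]
reference = ["present", "absent", "present", "present", "absent"]

metrics = binary_metrics(predicted, reference)
print(metrics)
```

Reporting sensitivity and specificity separately matters here because positive findings are usually the minority class, so accuracy alone can look good even when positives are missed.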

By leveraging **LLMs and prompt engineering**, we can enhance efficiency in **radiology report classification**, aiding medical professionals in making informed decisions. 🏥💡

---

### Acknowledgment  
This project uses an **open-access chest X-ray collection** from **Indiana University**, sourced from the **Open-i repository** ([Open-i](https://openi.nlm.nih.gov/)). We thank the dataset creators for making this resource available for research and analysis.

---

### Credits
This notebook is adapted from work by **Paulo Kuriki** and **Felipe Kitamura**.

