From the associated README:

# Medical Document Summarizer

**A patient-centered tool that translates complex medical documentation into clear, understandable summaries using Google's MedGemma model.**

This project addresses the critical gap between the medical information patients receive and their ability to comprehend it. Built for the HAI-DEF Kaggle hackathon, it leverages state-of-the-art medical language models to help patients understand doctors' notes, test results, discharge instructions, and specialist reports without providing medical advice or diagnosis.

## The Problem

Research shows that only 12% of American adults have the health literacy needed to fully understand what their doctors tell them. Patients forget 40-80% of medical information immediately after consultations, and 78% of emergency department patients leave with deficient comprehension of their discharge instructions, often without realizing it. This comprehension gap costs the US healthcare system $106-238 billion annually and is independently associated with a 75% increase in mortality risk among elderly patients.

## The Solution

This application provides three input modes (text entry, voice recording with medical transcription, and image upload) that feed into MedGemma's medical language understanding. The output is a structured, patient-friendly summary with key takeaways, medication explanations, medical term definitions, and suggested follow-up questions. A built-in chat interface allows patients to ask clarifying questions, all grounded in the original medical documentation.

The system runs locally on consumer hardware (preserving patient privacy) and is optimized for mobile viewing, as patients often need to reference medical information away from their computers.

## Sources

Health Literacy:

* 12% proficient health literacy: 2003 National Assessment of Adult Literacy (NAAL) - NCES Publication 2006-483; HHS/AHRQ 2008 Issue Brief

Comprehension Failures:

* 40-80% forgotten immediately: Kessels, Journal of the Royal Society of Medicine, 2003; AAFP, 2018
* 78% ED patients deficient comprehension: Engel et al., Annals of Emergency Medicine, 2009

Economic Impact:

* $106-238 billion annually: Vernon et al., University of Connecticut/National Patient Safety Foundation, 2007

Mortality:

* 75% increased mortality risk (HR 1.75): Sudore et al., Journal of General Internal Medicine, 2006

Privacy Concerns:

* 92% consider privacy a right: AMA/Savvy Cooperative survey, AMA 2022
* 14% trust tech companies: Rock Health 2023 consumer survey

# Usage of this notebook

Consult the README, kaggle_setup, and requirements.txt for more information that may be necessary to get this project running. The models require particular versions of libraries and will not run without compatible versions. As new models and libraries are released, the versions will need to be adjusted accordingly.

## Before running this notebook
You must create a prompt directory and populate it with json_prompt.txt and conversational_prompt.txt. The directory used in the code may differ from yours: search for PROMPT_FILE_PATH (which is built from BASE_DIR) and make sure it matches your path.

You must also create a valid Hugging Face token with access to the Google MedGemma model and add it to this project. On Kaggle, the code reads it from a secret named HUGGING_FACE_HUB_TOKEN.

You can run this base, headless version of the application in Kaggle or other notebook systems. If you want to extend it with code from the project, for instance to run another version, keep these points in mind:

* The source code is organized into projects which have similar function names and possibly variables, so you cannot necessarily combine all functionality into one notebook.
* The project is broken up into packages. You will need to remove package-based imports and consider the order in which you place functionality into cells.
* You will need to remove the standalone testing stub from the JSON parsing code if you reimport it.

Also note that you will need to modify the code if you place the prompts in a directory other than that specified in this code.
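
The prerequisites above can be checked before any model is loaded. The following is only a sketch: the helper name `missing_prerequisites` is hypothetical and not part of the project, and it assumes the token is exposed through the `HUGGING_FACE_HUB_TOKEN` environment variable, as the configuration cell below does. The demonstration runs against a throwaway directory, not a real prompt directory.

```python
import tempfile
from pathlib import Path
from typing import List, Mapping

# File and variable names taken from this notebook's setup notes.
REQUIRED_PROMPTS = ("json_prompt.txt", "conversational_prompt.txt")
TOKEN_VAR = "HUGGING_FACE_HUB_TOKEN"


def missing_prerequisites(prompt_dir: Path, env: Mapping[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the setup looks ready."""
    problems = []
    for name in REQUIRED_PROMPTS:
        if not (prompt_dir / name).is_file():
            problems.append(f"missing prompt file: {prompt_dir / name}")
    if not env.get(TOKEN_VAR):
        problems.append(f"environment variable {TOKEN_VAR} is not set")
    return problems


# Demonstration: one prompt file present, the other missing, and no token set.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    (demo_dir / "json_prompt.txt").write_text("placeholder prompt")
    demo_problems = missing_prerequisites(demo_dir, {})
    for problem in demo_problems:
        print(problem)
```

In a real session you would pass your own prompt directory and `os.environ` instead of the demonstration values.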

In [None]:
%pip install -q bitsandbytes transformers accelerate json5

In [None]:
import os
os.environ['INFRA_ABILITY'] = 'HIGH'

In [None]:
# core/app_config.py
"""
Handles application-wide configuration: environment detection, secrets, and hardware profiles.
This module is the single source of truth for the runtime environment.
"""

import os
import torch
from pathlib import Path
from transformers import BitsAndBytesConfig
from typing import Dict, Any

# --- Environment Detection & Constants ---
IS_KAGGLE_ENV: bool = "KAGGLE_KERNEL_RUN_TYPE" in os.environ
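if IS_KAGGLE_ENV:
    BASE_DIR: Path = Path("/kaggle/input/medgemma-laurie-prompts")
elif "__file__" in globals():
    BASE_DIR = Path(__file__).parent.parent.resolve()
else:
    # `__file__` is undefined in a local notebook; assume the prompts sit in the working directory.
    BASE_DIR = Path.cwd()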
PROMPT_FILE_PATH: Path = BASE_DIR / "json_prompt.txt"
CONVERSATIONAL_PROMPT_FILE_PATH: Path = BASE_DIR / "conversational_prompt.txt"

# --- Secret & Environment Variable Loading ---
if IS_KAGGLE_ENV:
    print("✅ Running in Kaggle environment.")
    try:
        from kaggle_secrets import UserSecretsClient # type: ignore
        user_secrets = UserSecretsClient()
        os.environ["HUGGING_FACE_HUB_TOKEN"] = user_secrets.get_secret("HUGGING_FACE_HUB_TOKEN")
        print("✅ Kaggle secret 'HUGGING_FACE_HUB_TOKEN' loaded into environment.")
    except Exception as e:  # also covers ImportError when kaggle_secrets is unavailable
        print(f"🔥 Failed to get Kaggle secret. Model loading may fail. Error: {e}")
else:
    print("✅ Running in a local environment.")
    try:
        from dotenv import load_dotenv
        load_dotenv()
        print("✅ Local .env file processed.")
    except ImportError:
        print("⚠️ `python-dotenv` not found. Relying on manually set environment variables.")

# --- Core Infrastructure Configuration ---
INFRA_ABILITY: str = os.environ.get("INFRA_ABILITY", "LOW").upper()
# Set MODEL_MODE to "VLM" for text plus image input; the default "TEXT" is text only.
MODEL_MODE: str = os.environ.get("MODEL_MODE", "TEXT").upper() # Used by LLM service

# Defines hardware and performance configurations for all potential services.
INFRA_CONFIGS: Dict[str, Dict[str, Any]] = {
    "LOW": {
        "llm_init_args": {
            "quantization_config": BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4", bnb_4bit_compute_dtype=torch.float16),
            "torch_dtype": torch.bfloat16,
            "device_map": "auto"
        },
        "llm_gen_args": {
            "max_new_tokens": 1536,
            "do_sample": False
        },
        "asr_init_args": {
            "quantization_config": BitsAndBytesConfig(load_in_4bit=True),
            "torch_dtype": torch.float16
        }
    },
    "HIGH": {
        "llm_init_args": {
            "quantization_config": None,
            "torch_dtype": torch.bfloat16,
            "device_map": "auto"
        },
        "llm_gen_args": {
            "max_new_tokens": 2048,
            "do_sample": True,
            "temperature": 0.6,
            "top_p": 0.9
        },
        "asr_init_args": {
            "quantization_config": None,
            "torch_dtype": torch.float16
        }
    },
    "APPLE_SILICON": {
        "llm_init_args": {
            "quantization_config": None,
            "torch_dtype": torch.float32,
            "device_map": "mps"
        },
        "llm_gen_args": {
            "max_new_tokens": 3096,
            "do_sample": True,
            "temperature": 0.3,
            "top_p": 0.9
        },
        "asr_init_args": {
            "quantization_config": None,
            "torch_dtype": torch.float32
        }
    }
}

active_infra_config: Dict[str, Any] = INFRA_CONFIGS.get(INFRA_ABILITY, INFRA_CONFIGS["LOW"])
print(f"✅ App config loaded. Mode: '{INFRA_ABILITY}'")
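
The profile lookup in the cell above falls back to `LOW` when `INFRA_ABILITY` holds an unknown value, while the status line still prints the requested name. The sketch below isolates that lookup and returns the effective profile name alongside its settings. `resolve_profile` and `DEMO_PROFILES` are hypothetical stand-ins introduced for illustration, and the stand-in profiles carry only the token limits from the real `INFRA_CONFIGS`.

```python
from typing import Any, Dict, Mapping, Tuple

# Stand-in profiles; the real INFRA_CONFIGS also holds model and ASR arguments.
DEMO_PROFILES: Dict[str, Dict[str, Any]] = {
    "LOW": {"max_new_tokens": 1536},
    "HIGH": {"max_new_tokens": 2048},
    "APPLE_SILICON": {"max_new_tokens": 3096},
}


def resolve_profile(
    env: Mapping[str, str],
    profiles: Mapping[str, Dict[str, Any]],
    default: str = "LOW",
) -> Tuple[str, Dict[str, Any]]:
    """Return the effective profile name and its settings."""
    requested = env.get("INFRA_ABILITY", default).upper()
    # Unknown names fall back to the default profile, as in the cell above.
    effective = requested if requested in profiles else default
    return effective, profiles[effective]


high_name, high_cfg = resolve_profile({"INFRA_ABILITY": "high"}, DEMO_PROFILES)
fallback_name, fallback_cfg = resolve_profile({"INFRA_ABILITY": "TPU"}, DEMO_PROFILES)
print(high_name, high_cfg)
print(fallback_name, fallback_cfg)
```

Because the configuration cell reads the environment when it runs, variables such as `INFRA_ABILITY` and `MODEL_MODE` need to be set in an earlier cell.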

In [None]:
# services/llm/config.py
"""LLM-specific model configurations."""
from typing import Dict, Any

# --- Prompt Loading ---
try:
    system_prompt_text = PROMPT_FILE_PATH.read_text()
except FileNotFoundError:
    print(f"🔥 CRITICAL: Prompt file not found at '{PROMPT_FILE_PATH}'. Using fallback.")
    system_prompt_text = "You are a helpful medical assistant."

try:
    conversational_prompt_text = CONVERSATIONAL_PROMPT_FILE_PATH.read_text()
except FileNotFoundError:
    print(f"🔥 CRITICAL: Prompt file not found at '{CONVERSATIONAL_PROMPT_FILE_PATH}'. Using fallback.")
    conversational_prompt_text = "You are a helpful medical assistant."

# --- LLM Model Definitions ---
MODEL_CONFIGS: Dict[str, Dict[str, Any]] = {
    "TEXT": {
        "model_id": "google/medgemma-1.5-4b-it",
        "pipeline_task": "text-generation",
        "system_prompt": system_prompt_text,
        "conversational_prompt": conversational_prompt_text
    },
    # --- MODIFIED: VLM configuration now uses MedGemma ---
    "VLM": {
        "model_id": "google/medgemma-1.5-4b-it", # Unified model for both text and vision
        "pipeline_task": "image-text-to-text",        # Correct pipeline for multimodal input
        "system_prompt": system_prompt_text,
        "conversational_prompt": conversational_prompt_text
    }
}
active_model_config: Dict[str, Any] = MODEL_CONFIGS.get(MODEL_MODE, MODEL_CONFIGS["TEXT"])
print(f"✅ LLM Service Mode: '{MODEL_MODE}' using model '{active_model_config['model_id']}'")

In [None]:
# services/llm/engine.py
"""Initializes and provides the core LLM text-generation and image-comprehension function."""
from typing import Optional, Dict, Any, Union, List
import torch
from transformers import (
    AutoProcessor,
    AutoModelForCausalLM,
    BitsAndBytesConfig,
    PreTrainedModel,
    PreTrainedTokenizer,
)

# ==============================================================================
# SETUP: Define the Generator and its Initialization Function
# ==============================================================================
_initialized = False
_model = None
_processor = None

# This dictionary holds the loaded model and processor
GeneratorObjects = Dict[str, Any]

def initialize_generator(
    model_id: str,
    torch_dtype: torch.dtype,
    quantization_config: Optional[BitsAndBytesConfig] = None,
    device_map: str = "auto",
    **kwargs: Any # Catches unused arguments like 'task'
) -> GeneratorObjects:
    """
    Loads and initializes the model and processor, returning them in a dictionary.
    """

    global _initialized
    global _model
    global _processor
    if _initialized:
        return {"model": _model, "processor": _processor}

    print("🚀 Initializing model and processor for direct generation...")
    print(f"   - Model: {model_id}")
    print(f"   - DType: {torch_dtype}")
    print(f"   - Quantization: {'Enabled' if quantization_config else 'Disabled'}")

    try:
        _model = AutoModelForCausalLM.from_pretrained(
            model_id,
            torch_dtype=torch_dtype,
            quantization_config=quantization_config,
            device_map=device_map,
        )
        # Load a processor which can handle both text and images.
        _processor = AutoProcessor.from_pretrained(model_id)
        
        # Many models like Llama don't have a pad token, so we use the EOS token.
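        if _processor.tokenizer.pad_token is None:
            _processor.tokenizer.pad_token = _processor.tokenizer.eos_token
        # Mark the load as complete so later calls reuse the cached model and processor.
        _initialized = True
        return {"model": _model, "processor": _processor}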

    except Exception as e:
        print(f"🔥 CRITICAL: Model initialization failed. Error: {e}")
        return {}


def generate_text(
    generator: GeneratorObjects,
    messages: List[Dict[str, Any]],
    image_input: Optional[str],
    **kwargs: Any
) -> str:
    """
    A stateless text generation function using the model.generate() method.
    This function replaces the call to the `pipe()` object.
    """
    if not generator:
        print("🔥 Generation failed: model and processor not available.")
        return "Error: Generator not initialized."

    model = generator["model"]
    _processor = generator["processor"]
    
    print("Prepare inputs with the processor template")
    # --- Step 1: Prepare inputs using the processor ---
    # Use the processor's internal tokenizer to apply the chat template (both TEXT and VLM modes)
    prompt_text = _processor.tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )

    if image_input:
        # VLM Mode: Load image and process text/image together
        inputs = _processor(text=prompt_text, images=image_input, return_tensors="pt").to(model.device)
    else:
        inputs = _processor.tokenizer(prompt_text, return_tensors="pt", return_attention_mask=True).to(model.device)

    print("Generate the model outputs...")
    # 2. Call model.generate()
    outputs = model.generate(**inputs, **kwargs)

    # 3. Decode only the newly generated tokens, not the original prompt
    input_ids_len = inputs["input_ids"].shape[1]
    newly_generated_ids = outputs[0, input_ids_len:]
    generated_text = _processor.decode(newly_generated_ids, skip_special_tokens=True)

    return generated_text


def generate_summary(prompt_key: str, medical_text: str, image_input: Optional[str] = None) -> str:
    """
    Generates a summary from medical text or provides comprehension for an image.
    - In 'TEXT' mode, it uses `medical_text`.
    - In 'VLM' mode, it uses both `image_input` (as a file path) and `medical_text` (as a prompt).
    """
    # --- Logic is split based on MODEL_MODE ---
    if MODEL_MODE == "VLM":
        if not image_input:
            return "Image input is required for VLM mode."
        if not medical_text or not medical_text.strip():
            return "A text prompt is required for VLM mode."

        try:
            llm_generator = initialize_generator(
                model_id=active_model_config["model_id"],
                **active_infra_config["llm_init_args"]
            )
            # The 'image-text-to-text' pipeline requires a structured chat format.
            # The user's message content is a list containing dicts for the image and text.
            messages = [
                {
                    "role": "system",
                    # The content is a LIST containing a DICT
                    "content": [{"type": "text", "text": active_model_config[prompt_key]}]
                },
                {
                    "role": "user",
                    "content": [
                        {"type": "image", "url": image_input},
                        {"type": "text", "text": medical_text}
                    ]
                }
            ]

            print("⏳ Applying chat template and generating text...")
            output = generate_text(
                generator=llm_generator,
                messages=messages,
                image_input=image_input, # Pass the image path/URL here
                **active_infra_config["llm_gen_args"]
            )
            print("✅ VLM generation complete.")

            # The output structure for image-to-text is slightly different
            return output.strip()
        except FileNotFoundError:
            print(f"🔥 Error: Image file not found at '{image_input}'")
            return f"Error: Image file not found at '{image_input}'"
        except Exception as e:
            print(f"🔥 An error occurred during VLM model inference: {e}")
            return "Sorry, an error occurred while processing your request."

    else: # Handles 'TEXT' mode
        if not medical_text or not medical_text.strip():
            return "Please enter a medical text to process."

        messages = [
            {"role": "system", "content": active_model_config[prompt_key]},
            {"role": "user", "content": medical_text},
        ]
        try:
            print("⏳ Generating text...")

            llm_generator = initialize_generator(
                model_id=active_model_config["model_id"],
                **active_infra_config["llm_init_args"]
            )

            result = generate_text(
                generator=llm_generator,
                messages=messages,
                image_input=None,
                **active_infra_config["llm_gen_args"]
            )

            print("✅ Text generation complete.")
            return result
        except Exception as e:
            print(f"🔥 An error occurred during text model inference: {e}")
            return "Sorry, an error occurred while processing your request."
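
The two branches of `generate_summary` build different message shapes: plain string content in TEXT mode, and lists of typed parts in VLM mode. The sketch below puts both shapes side by side. `build_messages` is a hypothetical helper, not part of the project, and the image path in the example is a placeholder.

```python
from typing import Any, Dict, List, Optional


def build_messages(
    system_prompt: str,
    user_text: str,
    image: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Build chat messages in the TEXT shape, or the VLM shape when an image is given."""
    if image is None:
        # TEXT mode: each message's content is a plain string.
        return [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_text},
        ]
    # VLM mode: each message's content is a list of typed parts.
    return [
        {"role": "system", "content": [{"type": "text", "text": system_prompt}]},
        {
            "role": "user",
            "content": [
                {"type": "image", "url": image},
                {"type": "text", "text": user_text},
            ],
        },
    ]


text_messages = build_messages("You are a helpful medical assistant.", "Explain this note.")
vlm_messages = build_messages(
    "You are a helpful medical assistant.",
    "Explain this scan report.",
    image="path/to/placeholder.png",  # placeholder path, not a real file
)
print(text_messages[1]["content"])
print([part["type"] for part in vlm_messages[1]["content"]])
```

Keeping the shape decision in one place would make it easier to see which structure the chat template receives in each mode.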

In [None]:
# core_logic.py

import json5
import re

def strip_xml_tags(text: str) -> str:
    """
    If the text is wrapped in a single XML-like tag, strips the tag.
    Otherwise, returns the original text.
    Handles tags like <answer>...</answer> or <response>...</response>.
    """
    # This pattern matches only when the whole text is a single <tag>content</tag> wrapper.
    match = re.fullmatch(r"<(?P<tag>\w+)>(?P<content>.*)</(?P=tag)>", text.strip(), re.DOTALL)
    
    if match:
        # If a match is found, return the captured content, stripped of whitespace.
        return match.group("content").strip()
    
    # If no tags are found, return the original text.
    return text.strip()

def get_llm_summary(text, image_path):
    """
    This function encapsulates the entire long-running task.
    It is executed in the background thread by the animator.
    It must RETURN a final string, not yield.
    """
    # 1. Call the long-running AI function
    summary_result = generate_summary("system_prompt", text, image_path)
    # print("Summary result: ", summary_result)

    # 2. Clean and parse the result
    json_str = extract_json_from_text(summary_result)
    if not json_str:
        # Return an error string for the UI
        # return "Error: The AI engine did not return a valid summary structure. Please try again."
        # Better to return a poorly formatted string, rather than just an error message...
        return summary_result

    json_data = json5.loads(json_str)
    
    # 3. Format the JSON into readable text
    formatted_text = format_medical_info(json_data)
    
    # 4. Return the final, display-ready text
    return formatted_text


def get_follow_up_answer(original_text, summary, history, new_question, image_path):
    """
    (New Function) Generates an answer to a follow-up question.
    
    This function constructs a detailed prompt that gives the LLM all necessary
    context to provide a relevant, grounded answer.
    """
    # Construct a history string from the conversation
    history_str = "\n".join([f"User: {q}\nAI: {a}" for q, a in history])

    # Construct the full prompt for the LLM
    chat_text = f"""

<medical_text>
{original_text}
</medical_text>

<summary_of_text>
{summary}
</summary_of_text>

<conversation_history>
{history_str}
</conversation_history>

<user_question>
{new_question}
</user_question>

Provide your answer directly.
"""
    # Make the call to the LLM. This is expected to be a faster, simpler call.
    raw_answer = generate_summary("conversational_prompt", chat_text, image_path) 
    
    # Sometimes the LLM likes to respond with XML...
    clean_answer = strip_xml_tags(raw_answer)

    return clean_answer
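
The behavior described in the `strip_xml_tags` docstring (strip a tag only when it wraps the entire answer) can be illustrated in isolation. This is a standalone sketch under that reading of the docstring: `strip_wrapping_tag` is a hypothetical name, and it uses `re.fullmatch` so that inline tags inside a longer answer are left alone.

```python
import re

# Matches only when the entire string is one <tag>...</tag> wrapper.
_WRAPPED = re.compile(r"<(?P<tag>\w+)>(?P<content>.*)</(?P=tag)>", re.DOTALL)


def strip_wrapping_tag(text: str) -> str:
    """Remove a single wrapping tag; otherwise return the stripped text unchanged."""
    stripped = text.strip()
    match = _WRAPPED.fullmatch(stripped)
    return match.group("content").strip() if match else stripped


wrapped = strip_wrapping_tag("<answer>\nTake it with food.\n</answer>")
inline = strip_wrapping_tag("Take <b>one</b> tablet daily.")
plain = strip_wrapping_tag("  No tags here.  ")
print(wrapped)
print(inline)
print(plain)
```

An answer that merely contains a tag somewhere in the middle is returned whole, which avoids silently discarding the surrounding text.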

In [None]:
# core/parse_json.py

import json5
import re
import html
from typing import Dict, List, Any, Optional


def extract_json_from_text(raw_text: str) -> Optional[str]:
    """
    Extract JSON from messy LLM output with multiple strategies.
    Returns the last valid JSON object found, or None.
    """
    text = raw_text.strip()

    # Strategy 1: Look for ```json blocks and try the last one
    json_blocks = re.findall(r'```json\s*(\{.*?\})\s*```', text, re.DOTALL)
    if json_blocks:
        for block in reversed(json_blocks):
            try:
                json5.loads(block)
                return block
            except ValueError:
                continue
    
    # Strategy 2: Find all complete {...} blocks and try the last valid one
    potential_jsons = []
    depth = 0
    start = -1
    
    for i, char in enumerate(text):
        if char == '{':
            if depth == 0:
                start = i
            depth += 1
        elif char == '}' and depth > 0:  # ignore stray closing braces so depth never goes negative
            depth -= 1
            if depth == 0 and start != -1:
                potential_jsons.append(text[start:i+1])
                start = -1
    
    for candidate in reversed(potential_jsons):
        try:
            json5.loads(candidate)
            return candidate
        except ValueError:
            continue
    
    return None


def safe_get_list(data: Dict, key: str) -> List[str]:
    """Safely extract a list of strings, handling errors and duplicates."""
    value = data.get(key)
    if value is None:
        return []
    if not isinstance(value, list):
        value = [value]
    
    seen = set()
    result = []
    for item in value:
        item_str = html.escape(str(item).strip()) if item is not None else ""
        if item_str and item_str not in seen:
            seen.add(item_str)
            result.append(item_str)
    return result


def safe_get_dict_list(data: Dict, key: str) -> List[Dict]:
    """Safely extract list of dictionaries."""
    value = data.get(key)
    if value is None:
        return []
    if isinstance(value, dict):
        value = [value]
    if not isinstance(value, list):
        return []
    return [item for item in value if isinstance(item, dict)]


def safe_get_string(data: Dict, key: str, default: str = '') -> str:
    """Safely get string value and escape for HTML safety."""
    value = data.get(key)
    if value is None or value == '':
        return default
    return html.escape(str(value).strip())


def format_medical_info(json_data: Dict[str, Any]) -> str:
    """
    Formats the structured medical JSON using semantic HTML tags 
    that Quill's clipboard parser recognizes and maps to its toolbar.
    """
    output_parts = []
    
    # 1. Header & Disclaimer
    output_parts.append("<h2>Medical Summary</h2>")
    output_parts.append("<blockquote><b>⚠️ DISCLAIMER:</b> This is not medical advice. Please consult your medical professional.</blockquote>")

    # 2. Key Takeaways (Bullet List)
    takeaways = safe_get_list(json_data, 'key_takeaways')
    if takeaways:
        output_parts.append("<h3>📌 Key Takeaways</h3><ul>")
        for item in takeaways:
            output_parts.append(f"<li>{item}</li>")
        output_parts.append("</ul>")

    # 3. Medications 
    medications = safe_get_dict_list(json_data, 'medications')
    if medications:
        output_parts.append("<h3>💊 Medications</h3>")
        for med in medications:
            name = safe_get_string(med, 'name', 'Unknown')
            dosage = safe_get_string(med, 'dosage')
            admin = safe_get_string(med, 'administration')
            desc = safe_get_string(med, 'description')
            
            # Use single-level paragraphs with indentation or bullets
            output_parts.append(f"<p><b>• {name}</b></p>")
            if dosage: output_parts.append(f"<p style='margin-left: 20px;'>- Dosage: {dosage}</p>")
            if admin:  output_parts.append(f"<p style='margin-left: 20px;'>- How to take: {admin}</p>")
            if desc:   output_parts.append(f"<p style='margin-left: 20px;'><i>{desc}</i></p>")
        output_parts.append("<p><br></p>")

    # 4. Terms (Bold and Inline)
    terms = safe_get_dict_list(json_data, 'medical_terms')
    if terms:
        output_parts.append("<h3>📖 Terms Explained</h3>")
        for term_obj in terms:
            term = safe_get_string(term_obj, 'term', 'Unknown')
            defn = safe_get_string(term_obj, 'definition', 'N/A')
            output_parts.append(f"<p><b>{term}</b>: {defn}</p>")
        output_parts.append("<p><br></p>")

    # 5. Questions (Ordered List)
    questions = safe_get_list(json_data, 'questions_for_provider')
    if questions:
        output_parts.append("<h3>❓ Questions for Provider</h3><ol>")
        for q in questions:
            output_parts.append(f"<li>{q}</li>")
        output_parts.append("</ol>")
        output_parts.append("<p><br></p>")
    
    return "".join(output_parts)


def format_medical_info_from_string(raw_text: str) -> str:
    """Extracts JSON and converts to HTML directly from LLM string."""
    json_str = extract_json_from_text(raw_text)
    if not json_str:
        return "<p>Could not find valid JSON data.</p>"
    
    try:
        json_data = json5.loads(json_str)
        return format_medical_info(json_data)
    except Exception as e:
        return f"<p>Error parsing JSON: {html.escape(str(e))}</p>"
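
The brace-matching strategy and the JSON keys that `format_medical_info` reads can be seen together in a small standalone example. This is a sketch, not the project's parser: it substitutes the standard-library `json` module for `json5` so it runs without extra packages, `last_json_object` is a hypothetical name, and the sample model output is invented for illustration.

```python
import json
from typing import Any, Dict, Optional


def last_json_object(raw_text: str) -> Optional[Dict[str, Any]]:
    """Return the last top-level {...} span that parses as a JSON object."""
    spans = []
    depth = 0
    start = -1
    for i, char in enumerate(raw_text):
        if char == "{":
            if depth == 0:
                start = i
            depth += 1
        elif char == "}" and depth > 0:  # stray closing braces are ignored
            depth -= 1
            if depth == 0:
                spans.append(raw_text[start:i + 1])
    for candidate in reversed(spans):
        try:
            parsed = json.loads(candidate)
        except ValueError:
            continue
        if isinstance(parsed, dict):
            return parsed
    return None


# Invented sample using the keys that format_medical_info looks up.
sample_output = """Here is the summary you asked for:
{"key_takeaways": ["Kidney function has declined since May."],
 "medications": [{"name": "Flomax", "description": "Helps with urine flow."}],
 "medical_terms": [{"term": "Creatinine", "definition": "A waste product filtered by the kidneys."}],
 "questions_for_provider": ["Should I stop the herbal supplements?"]}
I hope this helps!"""

parsed = last_json_object(sample_output)
print(sorted(parsed))
```

Like the cell above, this scanner does not account for braces that appear inside JSON string values, so it is best treated as a first pass over model output.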

# A test of the system

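Note that the printed output is the HTML-formatted summary built by format_medical_info. If no valid JSON can be extracted from the model's response, the raw model output is printed instead.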

In [None]:
# 1. Define your input text
medical_text = """
The patient is a 68-year-old Korean gentleman with a history of coronary artery disease, hypertension, diabetes and stage III CKD with a creatinine of 1.8 in May 2006 corresponding with the GFR of 40-41 mL/min. The patient had blood work done at Dr. XYZ's office on June 01, 2006, which revealed an elevation in his creatinine up to 2.3. He was asked to come in to see a nephrologist for further evaluation. I am therefore asked by Dr. XYZ to see this patient in consultation for evaluation of acute on chronic kidney failure. The patient states that he was actually taking up to 12 to 13 pills of Chinese herbs and dietary supplements for the past year. He only stopped about two or three weeks ago. He also states that TriCor was added about one or two months ago but he is not sure of the date. He has not had an ultrasound but has been diagnosed with prostatic hypertrophy by his primary care doctor and placed on Flomax. He states that his urinary dribbling and weak stream had not improved since doing this. For the past couple of weeks, he has had dizziness in the morning. This is then associated with low glucose. However the patient's blood glucose this morning was 123 and he still was dizzy. This was worse on standing. He states that he has been checking his blood pressure regularly at home because he has felt so bad and that he has gotten under 100/60 on several occasions. His pulses remained in the 60s.
"""

# 2. Call the generation function
print("\n--- Calling get_llm_summary ---")
summary = get_llm_summary(medical_text, None)

# 3. Print the final result
print("\n--- Final Generated Summary ---")
print(summary)

I don't have the link to the sample images on this computer or in the medgemma notes, so I can't actually test the image work while travelling...

# A second test

In [None]:
# 1. Define your input text

medical_text = """
One day history upper abdominal pain. Elevated inflammatory markers. On physical exam, right sided abdominal tenderness with guarding in right iliac fossa ?appendicitis.

Histopathology report.
Biopsies from around lower pole of left kidney:
Cores of fibrovascular and adipose tissue including a small amount of inflamed renal cortical tissue at limit of one core.
Extensive fibrosis and focal fat necrosis; in areas the background is edematous with moderate chronic inflammation including few lymphoid aggregates and clusters of plasma cells with scattered mast cells. Spindle cell proliferation of bland cells as well as scattered foamy histiocytes. No evidence of malignancy. Immunostaining showing that the spindle cells are histiocytes: CD68 and CD163 positive with strong expression of factor XIIIa and focal staining for S100. Staining for CD1a and Braf is essentially negative.
Abridged genetics report: M117.1, BRAF c.1799T>A p.(Val600Glu) V600E pathogenic hotspot variant detected manually at very low level (3.5% VAF).
Findings consistent with Erdheim Chester disease.
Case Discussion
The renal findings in this case are distinctive and indeed virtually pathognomonic of Erdheim-Chester disease. Nevertheless it is useful to bear in mind mimics of the appearances if they are less florid such as lymphoma.
"""

# 2. Call the generation function
print("\n--- Calling get_llm_summary ---")
summary = get_llm_summary(medical_text, None)

# 3. Print the final result
print("\n--- Final Generated Summary ---")
print(summary)