<a href="https://colab.research.google.com/github/peterruler/med-gemma/blob/main/medgemma.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

A paid Colab subscription (about $10 / month) is required. Select an A100 GPU in the runtime settings.

In [None]:
!pip install transformers
!pip install huggingface_hub
!pip install accelerate
!pip install -U bitsandbytes



In [None]:
import torch
print(torch.cuda.is_available()) # If you intended to use GPU

True


You must log in to Hugging Face (https://huggingface.co) and create an access token with read permission. Store it as a Colab secret named HF_TOKEN, which the next cell reads.

In [None]:
from huggingface_hub import login
from google.colab import userdata

hf_token = userdata.get('HF_TOKEN')
login(token=hf_token)
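
Outside Colab, `google.colab` cannot be imported, so the cell above fails before it reaches `login`. A minimal sketch of a fallback follows, assuming the token is also exported as an environment variable with the same name. The helper name `get_hf_token` is hypothetical and not part of `huggingface_hub`.

```python
import os


def get_hf_token(name: str = "HF_TOKEN"):
    """Return the token from Colab secrets, or from the environment outside Colab."""
    try:
        # google.colab only exists inside a Colab runtime.
        from google.colab import userdata
    except ImportError:
        # Assumption: outside Colab the token was exported as an environment variable.
        return os.environ.get(name)
    return userdata.get(name)
```

The returned value can be passed to `login(token=...)` exactly as in the cell above.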

In [None]:
import torch
from transformers import pipeline
from PIL import Image
import requests
import os
from io import BytesIO

# Set environment variable to disable torch compile (kept as in original code)
os.environ["TORCH_COMPILE_DISABLE"] = "1"

# Check if CUDA is available (already done, but good to be explicit)
if not torch.cuda.is_available():
    print("CUDA is not available. Model loading might fail or be very slow on CPU.")
    device = "cpu"
else:
    device = "cuda"
    print(f"CUDA is available. Using device: {device}")

# Load the pipeline with 4-bit quantization to reduce memory usage.
# This requires the bitsandbytes library, which was installed above.
try:
    pipe = pipeline(
        "image-text-to-text",
        model="google/medgemma-4b-it",
        torch_dtype=torch.bfloat16, # Compute in bfloat16; the 4-bit weights are the main memory saver
        # Use {"load_in_8bit": True} instead for 8-bit quantization (needs more GPU memory).
        model_kwargs={"load_in_4bit": True}
    )
    print("Pipeline loaded successfully using 4-bit quantization.")

except Exception as e:
    print(f"Error loading pipeline with 4-bit quantization: {e}")
    print("Consider a GPU with more memory or a smaller model.")
    # The rest of the notebook needs `pipe`, so stop here instead of continuing without it.
    raise

# Image attribution: Stillwaterising, CC0, via Wikimedia Commons
image_url = "https://upload.wikimedia.org/wikipedia/commons/c/c8/Chest_Xray_PA_3-8-2010.png"

try:
    response = requests.get(image_url, headers={"User-Agent": "example"}, stream=True)
    response.raise_for_status() # Raise an exception for bad status codes
    image = Image.open(BytesIO(response.content)).convert("RGB") # Ensure image is in RGB format
    print("Image loaded successfully.")
except requests.exceptions.RequestException as e:
    print(f"Error fetching image from URL: {e}")
    exit() # Exit if image cannot be loaded
except Exception as e:
    print(f"Error opening image: {e}")
    exit()

messages = [
    {
        "role": "system",
        "content": [{"type": "text", "text": "You are an expert radiologist."}]
    },
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this X-ray"},
            {"type": "image", "image": image}
        ]
    }
]

# The pipeline expects the messages directly as the primary input.
# The `text=` notation is not correct for this pipeline type and message format.
try:
    output = pipe(messages, max_new_tokens=200) # Pass messages directly

    # Process the output to extract the generated text
    # The structure of the output can vary depending on the pipeline and model.
    # Based on the original code's access, it seems like the generated text is
    # a list of message-like segments within output[0]["generated_text"].
    # We will use a safe approach to access the content of the last segment.
    if output and isinstance(output, list) and len(output) > 0 and output[0].get("generated_text") is not None:
        generated_content_list = output[0]["generated_text"]
        if isinstance(generated_content_list, list) and len(generated_content_list) > 0:
            final_segment = generated_content_list[-1]
            if isinstance(final_segment, dict) and "content" in final_segment:
                print("Generated Text:")
                print(final_segment["content"])
            elif isinstance(final_segment, str): # Sometimes it might be a direct string
                print("Generated Text:")
                print(final_segment)
            else:
                print("Could not find 'content' in the last segment of the generated text.")
                print("Full output of the last segment:", final_segment)
        else:
            print("'generated_text' is an empty list.")
            print("Full output of output[0]:", output[0])
    else:
        print("Unexpected output structure from the pipeline.")
        print("Full output:", output)

except Exception as e:
    print(f"An error occurred during pipeline execution: {e}")

The `load_in_4bit` and `load_in_8bit` arguments are deprecated and will be removed in the future versions. Please, pass a `BitsAndBytesConfig` object in `quantization_config` argument instead.
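
The warning above asks for a `BitsAndBytesConfig` object instead of the bare `load_in_4bit` flag. The following is a minimal sketch of the same load in that style, not the code that produced the output in this notebook. The helper names `quantization_options` and `load_quantized_pipeline` are hypothetical. The heavy imports are deferred into the loader so the options helper can be checked without a GPU.

```python
def quantization_options(bits: int = 4) -> dict:
    """Return keyword arguments for transformers.BitsAndBytesConfig."""
    if bits == 4:
        return {"load_in_4bit": True}
    if bits == 8:
        return {"load_in_8bit": True}
    raise ValueError("bits must be 4 or 8")


def load_quantized_pipeline(model_id: str = "google/medgemma-4b-it", bits: int = 4):
    """Build the image-text-to-text pipeline with an explicit quantization config."""
    # Imported here so that quantization_options() works without torch installed.
    import torch
    from transformers import BitsAndBytesConfig, pipeline

    quant_config = BitsAndBytesConfig(**quantization_options(bits))
    return pipeline(
        "image-text-to-text",
        model=model_id,
        torch_dtype=torch.bfloat16,
        # model_kwargs are forwarded to from_pretrained when the model is loaded.
        model_kwargs={"quantization_config": quant_config},
    )
```

Calling `pipe = load_quantized_pipeline()` should then stand in for the `pipeline(...)` call in the cell above, assuming `bitsandbytes` is installed and a CUDA GPU is available.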


CUDA is available. Using device: cuda


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Device set to use cuda:0


Pipeline loaded successfully using 4-bit quantization.


Setting `pad_token_id` to `eos_token_id`:1 for open-end generation.


Image loaded successfully.
Generated Text:
Based on the provided chest X-ray, here's a description:

**Overall Impression:** The X-ray demonstrates a normal posteroanterior (PA) view of the chest. The lungs appear clear bilaterally.

**Specific Findings:**

*   **Heart Size and Shape:** The heart size and overall shape are unremarkable.
*   **Mediastinum:** The mediastinum (the space between the lungs containing the heart, great vessels, trachea, and other structures) is within normal limits. No obvious masses or effusions are seen.
*   **Bones:** The ribs, clavicles, and vertebral bodies appear intact without any evidence of fractures.
*   **Lungs:** The lungs are clear bilaterally. There are no signs of consolidation, effusions, or pneumothorax.

**Conclusion:** The X-ray appears normal.

**Disclaimer:** This is a preliminary interpretation based solely on the provided image. A full interpretation requires clinical context and comparison
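
The nested checks in the cell above can be collapsed into one helper. This is a sketch, assuming the output shape that the cell itself handles: a list whose first item carries a `generated_text` entry that is either a string or a list of message dicts. `extract_reply` is a hypothetical name, and `sample_output` is made-up stand-in data, not real model output.

```python
def extract_reply(output):
    """Return the text of the last generated message, or None if the shape is unexpected."""
    if not isinstance(output, list) or not output:
        return None
    first = output[0]
    if not isinstance(first, dict):
        return None
    generated = first.get("generated_text")
    if isinstance(generated, str):
        return generated
    if isinstance(generated, list) and generated:
        last = generated[-1]
        if isinstance(last, dict):
            return last.get("content")
        if isinstance(last, str):
            return last
    return None


# Stand-in for a pipeline result; real output comes from pipe(messages, ...).
sample_output = [{"generated_text": [
    {"role": "user", "content": "Describe this X-ray"},
    {"role": "assistant", "content": "The lungs appear clear."},
]}]
print(extract_reply(sample_output))
```

The reply shown above stops mid-sentence, most likely because generation reached `max_new_tokens=200`; raising that value should let the model finish.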


In [None]:
try:
    from google.colab import drive
    drive.mount('/content/drive')
    IS_COLAB = True
except ImportError:
    IS_COLAB = False
    print("Not running in Google Colab; skipping Drive mount.")

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
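
Once Drive is mounted, it can help to confirm which image files are actually present before opening them. This is a small sketch using only the standard library; `list_images` is a hypothetical helper, and the folder is taken from the commented-out paths in the next cell, so adjust it to your own layout.

```python
from pathlib import Path


def list_images(folder, extensions=(".jpeg", ".jpg", ".png")):
    """Return image files directly inside `folder`, sorted by name."""
    root = Path(folder)
    if not root.is_dir():
        # A missing folder (for example, Drive not mounted) yields an empty list.
        return []
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )


# Assumption: the folder used in the commented-out paths of the next cell.
for path in list_images("/content/drive/My Drive/dl-udemy/images"):
    print(path.name)
```

Each returned path can be passed to `Image.open(...)` in the same way as the hard-coded paths.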


In [None]:
# Your own code for your MRI / CT images goes here.
# Paths to your images on Google Drive
#image_path1 = "/content/drive/My Drive/dl-udemy/images/wado.jpeg"
#image_path2 = "/content/drive/My Drive/dl-udemy/images/wado-2.jpeg"
#image_path3 = "/content/drive/My Drive/dl-udemy/images/wado-3.jpeg"
#image_path4 = "/content/drive/My Drive/dl-udemy/images/wado-4.jpeg"
#image_path5 = "/content/drive/My Drive/dl-udemy/images/wado-5.jpeg"

# Load the images directly from the file paths
#try:
#    image1 = Image.open(image_path1)
#    image2 = Image.open(image_path2)
#    image3 = Image.open(image_path3)
#    image4 = Image.open(image_path4)
#    image5 = Image.open(image_path5)
