~~~
Copyright 2025 Google LLC

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
~~~

# Prompt MedGemma 1.5 with Whole Slide Digital Pathology Imaging

<table><tbody><tr>
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/google-health/medgemma/blob/main/notebooks/rl_with_trl.ipynb">
      <img alt="Google Colab logo" src="https://www.tensorflow.org/images/colab_logo_32px.png" width="32px"><br> Run in Google Colab
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogle-Health%2Fmedgemma%2Fmain%2Fnotebooks%2Frl_with_trl.ipynb">
      <img alt="Google Cloud Colab Enterprise logo" src="https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN" width="32px"><br> Run in Colab Enterprise
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://github.com/google-health/medgemma/blob/main/notebooks/rl_with_trl.ipynb">
      <img alt="GitHub logo" src="https://github.githubassets.com/assets/GitHub-Mark-ea2971cee799.png" width="32px"><br> View on GitHub
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://huggingface.co/collections/google/medgemma-release-680aade845f90bec6a3f60c4">
      <img alt="Hugging Face logo" src="https://huggingface.co/front/assets/huggingface_logo-noborder.svg" width="32px"><br> View on Hugging Face
    </a>
  </td>
</tr></tbody></table>

This notebook demonstrates how to use digital pathology whole slide imaging to prompt MedGemma 1.5 running on Vertex AI.

Vertex AI makes it easy to serve your model and make it accessible to the world. Learn more about [Vertex AI](https://cloud.google.com/vertex-ai/docs/start/introduction-unified-platform).

### Costs

This tutorial uses billable components of Google Cloud:

* Vertex AI

Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing), [Cloud Storage pricing](https://cloud.google.com/storage/pricing), and use the [Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage.

In [None]:
# @title Install [ez-wsi-dicomweb](https://colab.sandbox.google.com/github/GoogleCloudPlatform/EZ-WSI-DICOMweb/blob/main/ez_wsi_demo.ipynb) Python library
%%capture
! pip install ez-wsi-dicomweb==6.1.3

In [None]:
# @title Authenticate Colab User to connect to DICOM store.
from google.colab import auth


# There will be a popup asking you to sign in with your user account and approve
# access.
auth.authenticate_user()

# Retrieve Digital Pathology Imaging from Imaging Data Commons (IDC)

[Imaging Data Commons (IDC)](https://datacommons.cancer.gov/repository/imaging-data-commons#) is one of the largest publicly available, de-identified repositories for cancer imaging. The repository is funded by the [National Cancer Institute (NCI)](https://www.cancer.gov/), an institute of the [National Institutes of Health (NIH)](https://www.nih.gov/), which is part of the [U.S. Department of Health and Human Services](https://www.hhs.gov/). IDC contains imaging for all major medical imaging modalities. Imaging is stored within the archive as [DICOM](https://www.dicomstandard.org/). Imaging and its associated metadata can be searched and visualized through the [IDC website](https://portal.imaging.datacommons.cancer.gov/explore/), queried through [BigQuery](https://cloud.google.com/healthcare-api/docs/resources/public-datasets/idc), and accessed using DICOMweb ([IDC tutorial](https://learn.canceridc.dev/data/downloading-data/dicomweb-access)).


## [DICOM Information Model](https://learn.canceridc.dev/dicom/data-model)
[DICOM](https://www.dicomstandard.org/) uniquely identifies imaging using three UIDs: a Study Instance UID, a Series Instance UID, and a SOP Instance UID. Conceptually, a Study Instance UID identifies all imaging acquired or generated as a result of a patient exam. Each medical image acquired as part of the exam (e.g., a single digital pathology image) is identified by a unique Series Instance UID. Each image acquired or generated as part of an acquisition is, in turn, identified by a unique SOP Instance UID.
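The three-level hierarchy maps directly onto DICOMweb resource paths. The following is a minimal sketch of that mapping; the helper name `dicomweb_path`, the base URL, and the UIDs are hypothetical and used only for illustration.

```python
def dicomweb_path(base_url, study_uid, series_uid=None, sop_uid=None):
    """Build a DICOMweb resource path from the Study > Series > Instance hierarchy."""
    if sop_uid is not None and series_uid is None:
        raise ValueError("An instance path requires a series UID.")
    path = f"{base_url.rstrip('/')}/studies/{study_uid}"
    if series_uid is not None:
        path += f"/series/{series_uid}"
    if sop_uid is not None:
        path += f"/instances/{sop_uid}"
    return path


# Hypothetical base URL and UIDs, for illustration only.
example_base = "https://example.org/dicomWeb"
print(dicomweb_path(example_base, "1.2.3"))
print(dicomweb_path(example_base, "1.2.3", "1.2.3.4"))
print(dicomweb_path(example_base, "1.2.3", "1.2.3.4", "1.2.3.4.5"))
```

The series-level form is the one this notebook passes to `dicom_path.FromString`.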

At their highest magnification, digital pathology images are gigapixel in size. To enable these images to be rapidly panned and zoomed, they are commonly stored as an image pyramid. Each level of the pyramid is stored as a unique image. This notebook requests imaging from the pyramid level that describes imaging at approximately 10x magnification.
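To get a feel for the sizes involved, the sketch below estimates the pixel dimensions of a lower-magnification pyramid level from the base level. It assumes each level is a uniform downsample of the base image; the base dimensions and the 40x base magnification are hypothetical values, and real slides may store levels with different or non-integer downsample factors.

```python
import math


def level_dimensions(base_width, base_height, base_magnification, target_magnification):
    """Estimate pixel dimensions of a pyramid level at a lower magnification."""
    if target_magnification <= 0 or target_magnification > base_magnification:
        raise ValueError("Target magnification must be in (0, base_magnification].")
    downsample = base_magnification / target_magnification
    # Round up so partially covered pixels at the image edge are retained.
    return math.ceil(base_width / downsample), math.ceil(base_height / downsample)


# Hypothetical base level: 100,000 x 80,000 pixels scanned at 40x.
base_w, base_h = 100_000, 80_000
w, h = level_dimensions(base_w, base_h, base_magnification=40, target_magnification=10)
print(f"Base level: {base_w * base_h / 1e9:.1f} gigapixels")
print(f"10x level: {w} x {h} pixels")
```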

In [None]:
from ez_wsi_dicomweb import credential_factory
from ez_wsi_dicomweb.ml_toolkit import dicom_path
from ez_wsi_dicomweb import dicom_web_interface
from ez_wsi_dicomweb import dicom_slide
from ez_wsi_dicomweb import pixel_spacing

# This notebook uses imaging hosted by: Imaging Data Commons (IDC)
# This notebook utilizes data generated by the National Cancer Institute
# Clinical Proteomic Tumor Analysis Consortium (CPTAC).
# Collection: CPTAC-COAD   Case: 01CO005
study_instance_uid = "2.25.160169972116293851749457993007066200227"
series_instance_uid = "1.3.6.1.4.1.5962.99.1.178766709.385691431.1640856306549.2.0"

# Read DICOM instance metadata for imaging from IDC
#series = f"https://proxy.imaging.datacommons.cancer.gov/current/viewer-only-no-downloads-see-tinyurl-dot-com-slash-3j3d9jyp/dicomWeb/studies/{study_instance_uid}/series/{series_instance_uid}"
series = f"https://healthcare.googleapis.com/v1/projects/nci-idc-data/locations/us-central1/datasets/idc/dicomStores/idc-store-v21/dicomWeb/studies/{study_instance_uid}/series/{series_instance_uid}"
dwi = dicom_web_interface.DicomWebInterface(credential_factory.DefaultCredentialFactory())
slide = dicom_slide.DicomSlide(dwi, dicom_path.FromString(series))
slide.init_slide_frame_cache()

# Select the pyramid level whose pixel spacing corresponds to ~10x magnification.
level = slide.get_level_by_pixel_spacing(pixel_spacing.PixelSpacing.FromMagnificationString("10X"))


In [None]:
# @title Read patches of digital pathology imaging from the DICOM store.

import random

from ez_wsi_dicomweb import patch_generator
import matplotlib.pyplot as plt


# Maximum number of patches to retrieve from the slide imaging.
maximum_number_of_patches = 125
patch_size = 896  # MedGemma input image size.

# Generate non-overlapping patches from tissue-containing regions.
# Setting stride_size equal to patch_size keeps the patches from overlapping.
patch_gen = patch_generator.DicomPatchGenerator(slide, level, patch_size=patch_size, stride_size=patch_size)
sampled_patches = list(patch_gen)
sampled_patches = random.sample(sampled_patches, k=min(maximum_number_of_patches, len(sampled_patches)))

print("Low magnification view of the slide imaging.")
plt.imshow(patch_gen.get_tissue_mask())
plt.axis("off")
plt.show()

print("Visualization of 3 patches randomly sampled from ~10x imaging")
for p in sampled_patches[:3]:
  plt.imshow(p.image_bytes())  # Display the RGB patch pixels.
  plt.axis("off")  # Hide axis labels and ticks.
  plt.show()
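The stride arithmetic behind non-overlapping patches can be sketched in plain Python. This is an illustration only: it does not reproduce the tissue masking that `DicomPatchGenerator` performs, and the function name `grid_patch_origins` and the image dimensions are hypothetical.

```python
import random


def grid_patch_origins(width, height, patch_size, stride):
    """Return (x, y) upper-left corners of patches that fit fully inside the image."""
    return [
        (x, y)
        for y in range(0, height - patch_size + 1, stride)
        for x in range(0, width - patch_size + 1, stride)
    ]


# Hypothetical level dimensions; stride == patch_size gives non-overlapping patches.
origins = grid_patch_origins(width=4000, height=3000, patch_size=896, stride=896)

# Cap the number of patches in the same way as the cell above.
rng = random.Random(0)  # Seeded so the sample is reproducible.
sampled = rng.sample(origins, k=min(5, len(origins)))
print(f"{len(origins)} candidate patches, {len(sampled)} sampled")
```

Using a stride smaller than the patch size would produce overlapping patches and more candidates for the same region.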



In [None]:
# @title Construct MedGemma 1.5 prompt formatted as Chat Completion.


# @markdown This section shows how to construct [chat completions](https://platform.openai.com/docs/api-reference/chat) requests to the endpoint using Vertex AI [prediction](https://cloud.google.com/vertex-ai/docs/predictions/get-online-predictions).

import base64
import io

import numpy as np
import PIL.Image

def _encode(data: np.ndarray) -> str:
  """Encode a pathology patch as a base64 data URL for inline use in the prompt."""
  # Image format used to encode pathology patches.
  # Options: "jpeg" or "png".
  image_format = "jpeg"
  with PIL.Image.fromarray(data) as img:
    with io.BytesIO() as img_bytes:
      img.save(img_bytes, format=image_format)
      encoded_string = base64.b64encode(img_bytes.getvalue()).decode("utf-8")
  return f"data:image/{image_format};base64,{encoded_string}"

# @markdown **Prompt:** Provide a brief descriptive text for the set of pathology patches extracted from a pathology slide. Consider the tissue type and procedure (below) when deciding what to include in the descriptive text.\ncolon, resection:

prompt = "Provide a brief descriptive text for the set of pathology patches extracted from a pathology slide. Consider the tissue type and procedure (below) when deciding what to include in the descriptive text.\ncolon, resection:"

# Generate chat completion formatted prompt.
content = [{"type": "text", "text": prompt}]
for p in sampled_patches:
  content.append({"type": "image_url", "image_url": {"url": _encode(p.image_bytes())}})

instance = {
  "@requestFormat": "chatCompletions",
  "messages": [{"role": "user", "content": content}],
  "max_tokens": 500,
  "temperature": 0,
}
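Because every patch is embedded inline, the request body grows with the number of patches, and base64 encoding adds roughly one third to the size of each image. The sketch below mirrors the request structure above using only the standard library, so the shape and size of a request can be inspected without a slide. The helper names and the placeholder bytes are hypothetical; the placeholders are not valid JPEG images.

```python
import base64
import json


def to_data_url(image_bytes, image_format="jpeg"):
    """Wrap already-encoded image bytes in a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:image/{image_format};base64,{encoded}"


def build_instance(prompt_text, images, max_tokens=500):
    """Build a chat-completions style request with one text part and N image parts."""
    content = [{"type": "text", "text": prompt_text}]
    for image_bytes in images:
        content.append({"type": "image_url", "image_url": {"url": to_data_url(image_bytes)}})
    return {
        "@requestFormat": "chatCompletions",
        "messages": [{"role": "user", "content": content}],
        "max_tokens": max_tokens,
        "temperature": 0,
    }


# Placeholder bytes standing in for encoded patches (not real images).
placeholder_images = [b"placeholder-patch-1", b"placeholder-patch-2"]
example = build_instance("Describe these patches.", placeholder_images)
payload_size = len(json.dumps(example).encode("utf-8"))
print(f"{len(example['messages'][0]['content'])} content parts, {payload_size} bytes")
```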


In [None]:
# @title Display full MedGemma 1.5 prompt.
import json

from IPython.display import display, Markdown


def truncate_prompt(obj, max_len):
  # Clip strings in prompt to avoid displaying excessively large content in colab notebook.
  if isinstance(obj, dict):
    return {k: truncate_prompt(v, max_len) for k, v in obj.items()}
  elif isinstance(obj, list):
    return [truncate_prompt(elem, max_len) for elem in obj]
  elif isinstance(obj, str) and len(obj) > max_len:
    return obj[:max_len] + "..."  # Add ellipsis for truncated strings
  return obj

txt = json.dumps(truncate_prompt(instance, 100), indent=4, sort_keys=True)
display(Markdown(f"```json\n{txt}\n```"))

In [None]:
# @title Configure Colab to call MedGemma 1.5 running in Vertex AI

# @markdown #### Prerequisites

# @markdown 1. Make sure that [billing is enabled](https://cloud.google.com/billing/docs/how-to/modify-project) for your project.

# @markdown 2. Make sure that either the Compute Engine API is enabled or that you have the [Service Usage Admin](https://cloud.google.com/iam/docs/understanding-roles#serviceusage.serviceUsageAdmin) (`roles/serviceusage.serviceUsageAdmin`) role to enable the API.

# @markdown This section sets the default Google Cloud project, enables the Compute Engine API (if not already enabled), and initializes the Vertex AI API.


# Import packages.
import os
from google.cloud import aiplatform

Google_Cloud_Project = ""  # @param {type: "string", placeholder:"e.g., my_project_name"}

# @markdown To get [online predictions](https://cloud.google.com/vertex-ai/docs/predictions/get-online-predictions), you will need a MedGemma [Vertex AI Endpoint](https://cloud.google.com/vertex-ai/docs/general/deployment) that has been deployed from Model Garden. If you have not already done so, go to the [MedGemma model card](https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/medgemma) and click "Deploy options > Vertex AI" to deploy the model.

# @markdown **Note:** The examples in this notebook are intended to be used with instruction-tuned variants. Make sure to use an instruction-tuned model variant to run this notebook.

# @markdown This section gets the Vertex AI Endpoint resource that you deployed from Model Garden to use for online predictions.

# @markdown Fill in the endpoint ID and region below. You can find your deployed endpoint on the [Vertex AI online prediction page](https://console.cloud.google.com/vertex-ai/online-prediction/endpoints).


ENDPOINT_ID = ""  # @param {type: "string", placeholder:"e.g. 123456789"}
ENDPOINT_REGION = ""  # @param {type: "string", placeholder:"e.g. us-central1"}

# @markdown **Note:** This notebook requires a dedicated [Vertex AI endpoint](https://cloud.google.com/blog/products/ai-machine-learning/reliable-ai-with-vertex-ai-prediction-dedicated-endpoints?e=48754805).

os.environ["CLOUDSDK_CORE_PROJECT"] = Google_Cloud_Project
os.environ["GOOGLE_CLOUD_PROJECT"] = Google_Cloud_Project
os.environ["GOOGLE_CLOUD_REGION"] = ENDPOINT_REGION

# Enable the Compute Engine API, if not already enabled.
print("Enabling Compute Engine API.")
! gcloud services enable compute.googleapis.com

# Initialize Vertex AI API.
print("Initializing Vertex AI API.")
aiplatform.init(project=os.environ["GOOGLE_CLOUD_PROJECT"],
                location=os.environ["GOOGLE_CLOUD_REGION"])

endpoint = aiplatform.Endpoint(
    endpoint_name=ENDPOINT_ID,
    project=Google_Cloud_Project,
    location=ENDPOINT_REGION,
)

# Use the endpoint name to check that you are using an appropriate model variant.
# These checks are based on the default endpoint name from the Model Garden
# deployment settings.
ENDPOINT_NAME = endpoint.display_name
if "pt" in ENDPOINT_NAME:
    raise ValueError(
        "The examples in this notebook are intended to be used with "
        "instruction-tuned variants. Please use an instruction-tuned model."
    )
if "text" in ENDPOINT_NAME:
    raise ValueError(
        "You are using a text-only variant which does not support multimodal"
        " inputs. Please use a multimodal model variant with this notebook."
    )

In [None]:
# @title Call MedGemma 1.5 and return prediction

response = endpoint.raw_predict(
    body=json.dumps(instance).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    use_dedicated_endpoint=True,
)
response.raise_for_status()
medgemma_response = response.json()["choices"][0]["message"]["content"]
display(Markdown(f"---\n\n**[ MedGemma ]**\n\n{medgemma_response}\n\n---"))
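The indexing `["choices"][0]["message"]["content"]` raises a bare `KeyError` or `IndexError` if the response body has a different shape, for example when the endpoint returns an error payload. A more defensive variant might look like the sketch below; the helper name `extract_message_text` and the sample response are hypothetical and only mimic the chat-completions shape used above.

```python
def extract_message_text(response_json):
    """Return the text of the first choice, or raise ValueError if the shape is unexpected."""
    try:
        return response_json["choices"][0]["message"]["content"]
    except (KeyError, IndexError, TypeError) as exc:
        raise ValueError(f"Unexpected response shape: {response_json!r}") from exc


# Hypothetical response body, for illustration only.
sample_response = {
    "choices": [{"message": {"role": "assistant", "content": "Example text."}}]
}
print(extract_message_text(sample_response))
```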