https://huggingface.co/collections/google/gemma-3-release-67c6c6f89c4f76621268bb6d \
https://huggingface.co/google/gemma-3-4b-it

In [1]:
import os

In [2]:
# Set CUDA_VISIBLE_DEVICES to expose only device 0
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# Set CUDA_VISIBLE_DEVICES to expose devices 0, 1, 2, and 3
# os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2,3"

In [3]:
from transformers import AutoProcessor, Gemma3ForConditionalGeneration
from PIL import Image
import torch
import pandas as pd


#### Reading the Pathology Test Data

In [4]:
questions = pd.read_csv("../pathology_test_data/questions.csv")
questions

Unnamed: 0,image_name,question,original_responses
0,ck_PTGC_2x.jpeg,How can you best describe the low-power patter...,The lymph node shows a mixed follicular and in...
1,ck_PTGC_2x.jpeg,What are three main differential diagnostic co...,"Based on the low-power pattern, the primary co..."
2,ck_PTGC_5x.jpeg,What is the morphologic alteration being depic...,"The image shows a central enlarged, somewhat i..."
3,ck_PTGC_5x.jpeg,What is the expected immunoarchitecture of the...,This image shows an enlarged secondary follicl...
4,ck_serositis_4x.jpg,What is the most common source for the change ...,"The changes here show extensive serositis, wit..."
5,ck_serositis_4x.jpg,What is the specific anatomic region shown in ...,The right half shows the muscularis propria an...
6,ck_steatohepatitis_100x.jpg,If these histologic changes included conspicuo...,There is fatty liver disease and if concurrent...
7,ck_steatohepatitis_100x.jpg,What would the primary histologic feature to s...,Wilson's disease has many non-specific finding...
8,ck_steatohepatitis_200x.jpg,In an overweight adolescent with mildly increa...,"In the background of steatosis, there is incre..."


In [5]:
images_path = os.path.join(os.getcwd(), "../pathology_test_data/images")
images_list = os.listdir(images_path)
images_list

['ck_steatohepatitis_100x.jpg',
 'ck_PTGC_2x.jpeg',
 'ck_serositis_4x.jpg',
 'ck_steatohepatitis_200x.jpg',
 'ck_PTGC_5x.jpeg']
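Before building prompts, it can help to confirm that every `image_name` in the questions table has a matching file in the images directory. The following is a minimal sketch; `find_missing_images` is a hypothetical helper that is not part of the original notebook, and the `referenced` list is a shortened stand-in for the `image_name` column.

```python
def find_missing_images(referenced_names, available_names):
    """Return referenced file names that are absent from the image directory."""
    available = set(available_names)
    # dict.fromkeys de-duplicates while preserving first-seen order
    return [name for name in dict.fromkeys(referenced_names) if name not in available]


# File names copied from the directory listing above
available = [
    "ck_steatohepatitis_100x.jpg",
    "ck_PTGC_2x.jpeg",
    "ck_serositis_4x.jpg",
    "ck_steatohepatitis_200x.jpg",
    "ck_PTGC_5x.jpeg",
]
# Shortened stand-in for questions["image_name"]
referenced = ["ck_PTGC_2x.jpeg", "ck_PTGC_2x.jpeg", "ck_serositis_4x.jpg"]

missing = find_missing_images(referenced, available)
print(missing)  # an empty list means every referenced image is present
```

In the notebook itself, the same check would be `find_missing_images(questions["image_name"], images_list)`.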

#### Loading the Model Checkpoint and Image Processor

In [6]:
model_id = "google/gemma-3-4b-it"

# Gemma3ForConditionalGeneration ships with transformers, so trust_remote_code is not needed.
model = Gemma3ForConditionalGeneration.from_pretrained(
    model_id, device_map="auto", torch_dtype=torch.bfloat16
).eval()

Loading checkpoint shards: 100%|██████████| 2/2 [00:02<00:00,  1.37s/it]


In [7]:
processor = AutoProcessor.from_pretrained(model_id, padding=True, return_tensors="pt")

Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.


#### Creating the Message Structure

In [8]:
# Build one conversation (system prompt + image + question) per row of the questions table.
messages = []
for _, row in questions.iterrows():
    image_path = os.path.join(images_path, row["image_name"])
    question = row["question"]

    message = [
        {
            "role": "system",
            "content": [{"type": "text", "text": "You are a helpful assistant."}]
        },
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image_path},
                {"type": "text", "text": question}
            ]
        }
    ]

    messages.append(message)
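The per-row conversation built in the loop above can also be factored into a small function, which makes the structure easier to inspect and reuse. This is a sketch only: `build_message` is a hypothetical helper, and the path and question passed to it below are placeholders rather than rows from the dataset.

```python
def build_message(image_path, question, system_prompt="You are a helpful assistant."):
    """Build one chat-template conversation: a system turn, then a user turn with image + text."""
    return [
        {"role": "system", "content": [{"type": "text", "text": system_prompt}]},
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image_path},
                {"type": "text", "text": question},
            ],
        },
    ]


# Placeholder inputs for illustration only
example = build_message("images/example.jpg", "What does this image show?")
print([turn["role"] for turn in example])  # ['system', 'user']
```

With this helper, the loop body reduces to `messages.append(build_message(image_path, question))`.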

#### Running the Batch Inference

In [9]:
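# padding=True pads the prompts to a common length so they stack into one batch tensor.
inputs = processor.apply_chat_template(messages, add_generation_prompt=True, tokenize=True,
                                        padding=True, return_dict=True, return_tensors="pt")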

inputs = inputs.to(model.device)
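The cell above sends every conversation through the processor as a single batch. If GPU memory becomes a constraint with a larger question set, one option is to split `messages` into smaller batches and run each through the same processing and generation steps. A minimal sketch, where `chunked` is a hypothetical helper and the batch size of 4 is an arbitrary assumption:

```python
def chunked(items, batch_size):
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Stand-in for the list of conversations; integers keep the example self-contained
batches = list(chunked(list(range(9)), 4))
print(batches)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8]]
```

Each batch would then go through `processor.apply_chat_template(...)` and `model.generate(...)` in turn, with the decoded texts concatenated in order.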

In [10]:
with torch.inference_mode():
    generated_ids = model.generate(**inputs, max_new_tokens=1024, do_sample=False)
    generated_ids_trimmed = [
        out_ids[len(in_ids) :] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
    ]

The following generation flags are not valid and may be ignored: ['top_p', 'top_k']. Set `TRANSFORMERS_VERBOSITY=info` for more details.
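The slicing in the generation cell works because, for a decoder-only model, `generate` returns each prompt's token IDs followed by the newly generated ones, and every row of a padded batch has the same prompt length. The toy example below reproduces that trimming on plain Python lists; the token IDs are made up, and `trim_prompt_tokens` is a hypothetical name for the list comprehension used above.

```python
def trim_prompt_tokens(input_ids, generated_ids):
    """Drop the leading prompt tokens from each generated sequence."""
    return [out_ids[len(in_ids):] for in_ids, out_ids in zip(input_ids, generated_ids)]


# Two prompts padded to length 4 (0 stands in for a pad token), each followed by 2 new tokens
input_ids = [[0, 0, 11, 12], [21, 22, 23, 24]]
generated_ids = [[0, 0, 11, 12, 101, 102], [21, 22, 23, 24, 201, 202]]

trimmed = trim_prompt_tokens(input_ids, generated_ids)
print(trimmed)  # [[101, 102], [201, 202]]
```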


In [11]:
output_texts = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)

In [12]:
output_texts

["Okay, let's analyze this lymph node histology image. \n\nHere's a description of the low-power pattern, broken down into key observations:\n\n**Overall Impression:**\n\nThe image shows a lymph node with a somewhat disorganized architecture. It appears to be significantly expanded, suggesting a reactive or potentially malignant process.\n\n**Specific Features:**\n\n*   **Diffuse Aggregation:** The lymphoid tissue is diffusely packed within the node, rather than being organized into discrete follicles. This is a key feature suggesting a reactive process.\n*   **Loss of Follicular Architecture:**  Normally, lymph nodes have distinct follicles with germinal centers. In this image, these follicles are largely absent or significantly reduced.\n*   **Dark, Dense Background:** The background staining is quite dark and dense, indicating a high cellularity and likely a significant amount of extracellular matrix.\n*   **Presence of Fibrosis:** There's a prominent fibrous band running through th

#### Storing the Output Responses

In [13]:
os.makedirs("./responses", exist_ok=True)
questions['gemma3-4b-it-response'] = output_texts
questions.to_csv("./responses/gemma3_4b_it_responses.csv", index=False)