Skip to content

Logic Errors in Image_processing_gemma3_fast.py #36806

Closed
@javierchacon262

Description

@javierchacon262

System Info

  • transformers version: 4.50.0.dev0
  • Platform: macOS-15.3.2-arm64-arm-64bit
  • Python version: 3.12.9
  • Huggingface_hub version: 0.29.3
  • Safetensors version: 0.5.3
  • Accelerate version: 1.5.2
  • Accelerate config: not found
  • DeepSpeed version: not installed
  • PyTorch version (GPU?): 2.6.0 (False)
  • Tensorflow version (GPU?): 2.19.0 (False)
  • Flax version (CPU?/GPU?/TPU?): 0.10.4 (cpu)
  • Jax version: 0.5.2
  • JaxLib version: 0.5.1
  • Using distributed or parallel set-up in script?:

Who can help?

@amyeroberts
@qubvel

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Steps to reproduce:

  1. Load the Gemma 3 model locally using a pipeline with an image as input.
  2. Ensure the do_pan_and_scan option is set to False.
  3. Run the script — the error appears when the model tries to process the image input.

Expected behavior

It tries to process the image but encounters some logic errors, they are not major errors but little yet errors:

image_processing_gemma3_fast.py
Line 357: The code references images_list, but this variable is defined only inside the if do_pan_and_scan: condition. When do_pan_and_scan == False, images_list is never initialized, resulting in an UnboundLocalError.

image_text_to_text.py
Line 84: Inside the retrieve_images_in_messages() function, the variable idx_images must be incremented even when the first if condition is met. Otherwise, the final check at line 105 throws an IndexError due to a mismatch in the expected number of images.

I implemented the following changes, which resolved the issues:

In image_processing_gemma3_fast.py, replace:

num_crops = [[0] for images in images_list]
With:

num_crops = [[0] for _ in image_list]

In the same file, replace all references to images_list with image_list after the if do_pan_and_scan: condition to ensure consistency.

In image_text_to_text.py, modify line 84 to increment idx_images inside the first if block:
if key in content:
retrieved_images.append(content[key])
idx_images += 1 # Fix to ensure alignment in the list of images

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions