Logic Errors in Image_processing_gemma3_fast.py

### System Info

- `transformers` version: 4.50.0.dev0
- Platform: macOS-15.3.2-arm64-arm-64bit
- Python version: 3.12.9
- Huggingface_hub version: 0.29.3
- Safetensors version: 0.5.3
- Accelerate version: 1.5.2
- Accelerate config:    not found
- DeepSpeed version: not installed
- PyTorch version (GPU?): 2.6.0 (False)
- Tensorflow version (GPU?): 2.19.0 (False)
- Flax version (CPU?/GPU?/TPU?): 0.10.4 (cpu)
- Jax version: 0.5.2
- JaxLib version: 0.5.1
- Using distributed or parallel set-up in script?: <fill in>

### Who can help?

@amyeroberts 
@qubvel 

### Information

- [ ] The official example scripts
- [x] My own modified scripts

### Tasks

- [ ] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [x] My own task or dataset (give details below)

### Reproduction

Steps to reproduce:

1. Load the Gemma 3 model locally using a pipeline with an image as input.
2. Ensure the do_pan_and_scan option is set to False.
3. Run the script — the error appears when the model tries to process the image input.

### Expected behavior

It tries to process the image but encounters some logic errors, they are not major errors but little yet errors:

image_processing_gemma3_fast.py
Line 357: The code references images_list, but this variable is defined only inside the if do_pan_and_scan: condition. When do_pan_and_scan == False, images_list is never initialized, resulting in an UnboundLocalError.

image_text_to_text.py
Line 84: Inside the retrieve_images_in_messages() function, the variable idx_images must be incremented even when the first if condition is met. Otherwise, the final check at line 105 throws an IndexError due to a mismatch in the expected number of images.

I implemented the following changes, which resolved the issues:

In image_processing_gemma3_fast.py, replace:

num_crops = [[0] for images in images_list]
With:

num_crops = [[0] for _ in image_list]


In the same file, replace all references to images_list with image_list after the if do_pan_and_scan: condition to ensure consistency.

In image_text_to_text.py, modify line 84 to increment idx_images inside the first if block:
if key in content:
    retrieved_images.append(content[key])
    idx_images += 1  # Fix to ensure alignment in the list of images

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Logic Errors in Image_processing_gemma3_fast.py #36806

System Info

Who can help?

Information

Tasks

Reproduction

Expected behavior

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Logic Errors in Image_processing_gemma3_fast.py #36806

Description

System Info

Who can help?

Information

Tasks

Reproduction

Expected behavior

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions