Potential bug in Qwen 2/2.5 VL Image Preprocessor #38003
@ritwickchaudhry correct! There was another PR for this somewhere (#37350) that probably went stale. I had forgotten about it due to low priority. Would you like to open a PR for this? LMK if you can't contribute, and I can finalize and merge the existing PR later next week :)
Thanks @zucchini-nlp! Sure, let me create a PR soon!
Hi @ritwickchaudhry and @zucchini-nlp, I've also encountered this issue and have implemented the fix based on the discussion here.
@anshulsc I'll be releasing the PR very soon, as I've finished most of it. Thanks for the offer though!
@ritwickchaudhry great!
Done, actually! @zucchini-nlp, can you please review the PR: #38076
transformers/src/transformers/models/qwen2_vl/image_processing_qwen2_vl.py, line 278 (commit 5c47d08)

The `temporal_patch_size` is used to group consecutive video frames. However, if the number of frames is not divisible by it, the last frame is repeated. The current number of repetitions is `temporal_patch_size - 1`. While this works for `temporal_patch_size = 2`, it wouldn't work for larger patch sizes. In my opinion, the code should be modified to:
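A minimal sketch of the fix being discussed, written as a hypothetical standalone helper (in transformers the logic lives inline in the image processor): instead of always appending `temporal_patch_size - 1` copies of the last frame, repeat it only as many times as needed to reach the next multiple of `temporal_patch_size`.

```python
import numpy as np


def pad_frames(patches: np.ndarray, temporal_patch_size: int) -> np.ndarray:
    """Pad a (num_frames, ...) stack so num_frames is divisible by temporal_patch_size.

    Hypothetical helper illustrating the proposed change: the last frame is
    repeated `temporal_patch_size - remainder` times rather than a fixed
    `temporal_patch_size - 1` times, which only happens to be correct when
    temporal_patch_size == 2.
    """
    remainder = patches.shape[0] % temporal_patch_size
    if remainder != 0:
        repeats = np.repeat(
            patches[-1][np.newaxis], temporal_patch_size - remainder, axis=0
        )
        patches = np.concatenate([patches, repeats], axis=0)
    return patches
```

With `temporal_patch_size = 2` and an odd frame count this behaves identically to the current code; the difference only shows up for larger patch sizes (e.g. 5 frames with a patch size of 4 need 3 padding frames, not 3 by coincidence of `4 - 1` but because `4 - (5 % 4) == 3`; 6 frames would need 2, where the current code would still add 3).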