fixing more typos #45689

Merged
vasqu merged 49 commits into main from fix-typos2 on Apr 28, 2026
@vasqu (Contributor) commented Apr 28, 2026

As per title

zucchini-nlp and others added 30 commits April 21, 2026 11:43
fix: fix vit_merger weight re-init and video frame extraction in processor
- Add video_token_id to config and model forward for video token masking
- Add _process_visual_features helper to handle NaViT repacking and
  beam search expansion for both image and video inputs
- Cast pixel_values to vision tower dtype to fix float32/bfloat16 mismatch
- Fix video placeholder in processor: compute per-frame tokens correctly
  (patches_per_frame × num_frames) and restore slice_start/slice_end structure
- Video processor now outputs grids_videos and num_frames metadata
- Pass pixel_values_videos/target_sizes_videos through generate pipeline
- Fix concat_frames_as_image canvas color from white (255) to black (0)
  to match the legacy _concat_images line_color=(0,0,0) separator lines
- Add missing `self` parameter to concat_frames_as_image
- Fix refine_size return order (height, width) in resize_and_split_patches
- Remove unused imports (group_videos_by_shape, reorder_videos)
- Rework _preprocess_video to process each visual unit independently
  with per-frame grids and target sizes instead of a single shared grid
- Update processor placeholder generation to iterate per-frame with
  individual grid/token counts via num_patches_per_frame
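
The per-frame placeholder logic described in the bullets above (one grid per frame, token count taken from that frame's own grid rather than a single shared grid) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the `(rows, cols)` grid layout, and the `<video>` token string are hypothetical and are not taken from the actual processor code.

```python
from typing import List, Tuple


def tokens_per_frame(grids: List[Tuple[int, int]]) -> List[int]:
    """Return the placeholder token count for each frame.

    Assumption: each frame has its own (rows, cols) patch grid, and one
    placeholder token is emitted per patch.
    """
    return [rows * cols for rows, cols in grids]


def build_video_placeholder(
    grids: List[Tuple[int, int]], video_token: str = "<video>"
) -> str:
    """Concatenate placeholder tokens frame by frame (hypothetical helper)."""
    # Iterate per frame so frames with different grids get different counts.
    return "".join(video_token * count for count in tokens_per_frame(grids))


if __name__ == "__main__":
    grids = [(2, 3), (1, 4)]  # two frames with different patch grids
    print(tokens_per_frame(grids))
    print(build_video_placeholder(grids).count("<video>"))
```

When every frame shares the same grid, the total reduces to patches_per_frame × num_frames, which is the count the commit message refers to.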
…antics

Reconstruct sub_timestamps using the same double-loop as sample_frames
and apply identical np.linspace downsampling, then group by second via
cursor-based loop (matching legacy _group_stacked_by_second) instead of
fixed-stride slicing — ensures composite-per-second interleaving is
faithful to the original pipeline for all video durations.
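
A minimal sketch of why a cursor-based loop differs from fixed-stride slicing once timestamps have been downsampled. The helper names here are hypothetical, and the even-index selection is a pure-Python stand-in for the `np.linspace` downsampling mentioned above, not the code from this commit.

```python
from typing import List


def downsample_evenly(items: List[float], target: int) -> List[float]:
    """Keep `target` items at evenly spaced positions, endpoints included."""
    n = len(items)
    if target >= n:
        return list(items)
    if target == 1:
        return [items[0]]
    # Stand-in for rounding np.linspace(0, n - 1, target) to integer indices.
    indices = [round(i * (n - 1) / (target - 1)) for i in range(target)]
    return [items[i] for i in indices]


def group_by_second(timestamps: List[float]) -> List[List[float]]:
    """Group consecutive timestamps that fall within the same whole second."""
    groups: List[List[float]] = []
    cursor = 0
    while cursor < len(timestamps):
        second = int(timestamps[cursor])
        end = cursor
        # Advance until the whole-second bucket changes.
        while end < len(timestamps) and int(timestamps[end]) == second:
            end += 1
        groups.append(timestamps[cursor:end])
        cursor = end
    return groups


if __name__ == "__main__":
    kept = downsample_evenly([0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0], 4)
    print(kept)
    print(group_by_second([0.0, 0.5, 1.0, 2.0, 2.25, 2.5]))
```

Because downsampling can leave a different number of frames in each second, the groups may have uneven sizes; a fixed stride would assign frames to the wrong second in that case, whereas the cursor follows the actual timestamps.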
fix: adjust tokenizer config & video processor
@github-actions (Contributor)

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, minicpmv4_6

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@vasqu vasqu merged commit 93bed93 into main Apr 28, 2026
29 checks passed
@vasqu vasqu deleted the fix-typos2 branch April 28, 2026 17:05

4 participants