Fixing issue with video loading for Gemma 4 with relative paths #9201

Merged
Jintao-Huang merged 1 commit into modelscope:main from perone:perone/fix-gemma4-video-paths
Apr 25, 2026

@perone
Contributor

@perone perone commented Apr 24, 2026

PR type

  • Bug Fix
  • New Feature
  • Document Updates
  • More Models or Datasets Support

PR information

When using datasets with relative media paths and --root_image_dir, video and audio file paths are not resolved to absolute paths before being passed to model-specific templates. Image paths are correctly resolved via _load_image, but video and audio paths are passed as raw strings.

This causes a FileNotFoundError for models like Gemma 4 whose templates delegate media loading to the HuggingFace processor, since the processor receives the unresolved relative path.

This fix adds path resolution for inputs.videos and inputs.audios in Template._preprocess_media (swift/template/base.py), matching the existing behavior for images. A relative path is resolved against root_image_dir only when:

  • root_image_dir is set
  • The path is a string (not already loaded)
  • The file does not exist at the original path
  • The path is not a URL
  • The resolved path points to an existing file
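The five conditions above can be sketched as a small standalone function. This is only an illustration of the rule as described in this PR, not the code that was merged: the name `resolve_media_path` and the `root_dir` parameter are hypothetical, and the URL check is assumed to be a simple `http://` / `https://` prefix test.

```python
import os
from typing import Any, Optional


def resolve_media_path(path: Any, root_dir: Optional[str]) -> Any:
    """Resolve `path` against `root_dir` only when every listed condition holds.

    Hypothetical helper for illustration; not part of ms-swift.
    """
    # Condition 1: a root directory must be configured.
    if not root_dir:
        return path
    # Condition 2: only plain strings are candidates; already loaded
    # objects (bytes, arrays, file handles) pass through unchanged.
    if not isinstance(path, str):
        return path
    # Condition 3: a path that already exists is left untouched.
    if os.path.exists(path):
        return path
    # Condition 4: URLs are never rewritten (assumed prefix check).
    if path.startswith(('http://', 'https://')):
        return path
    # Condition 5: use the joined path only if it points to an existing file;
    # otherwise return the original value so later code can report the error.
    candidate = os.path.join(root_dir, path)
    return candidate if os.path.isfile(candidate) else path
```

Returning the original value when nothing matches keeps the function safe to call on every entry: inputs that cannot be resolved are passed through exactly as before the change.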

Experiment results

Verified that Gemma 4 video fine-tuning with relative video paths in the dataset and --root_image_dir correctly loads videos after this fix, where previously it raised FileNotFoundError.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request adds logic to resolve video and audio file paths using ROOT_IMAGE_DIR to ensure absolute paths are available for model-specific templates. The reviewer suggested refactoring the code to eliminate duplication between the video and audio path resolution loops.
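One way to remove the duplication mentioned in the review is to run a single loop over both media fields and apply the same resolver to each entry. The sketch below is an assumption about how that could look, not the refactor proposed in the review thread: `MediaInputs` is a hypothetical stand-in for the template inputs object, and `resolve_media_fields` is an illustrative name.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Sequence


@dataclass
class MediaInputs:
    """Hypothetical stand-in for the template inputs object."""
    videos: List[Any] = field(default_factory=list)
    audios: List[Any] = field(default_factory=list)


def resolve_media_fields(
    inputs: MediaInputs,
    resolve: Callable[[Any], Any],
    fields: Sequence[str] = ('videos', 'audios'),
) -> None:
    """Apply one resolver to every listed media field, in place."""
    for name in fields:
        items = getattr(inputs, name)
        # Rebuild the list so each entry goes through the same code path.
        setattr(inputs, name, [resolve(item) for item in items])
```

Any per-path resolver can be passed in as `resolve`, so the video and audio handling cannot drift apart, and supporting another media field only requires adding its name to `fields`.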

Comment thread swift/template/base.py
@Jintao-Huang
Collaborator

thanks!

@Jintao-Huang
Collaborator

There is also the case where the value is base64-encoded. Could you fix this by referencing the handling here:

def _check_path(path: str) -> Union[str, None]:

@Jintao-Huang
Collaborator

Never mind, I'll open a PR to fix it.

@Jintao-Huang Jintao-Huang merged commit cbe51b2 into modelscope:main Apr 25, 2026
1 of 3 checks passed
@Jintao-Huang Jintao-Huang mentioned this pull request Apr 25, 2026
@perone
Contributor Author

perone commented Apr 25, 2026

> Never mind, I'll open a PR to fix it.

Sorry, I just saw this now. Thanks again!
