fix(Phi4Multimodal): Fix incorrect default vision/audio config initialization in Phi4MultimodalConfig#43480

Merged
Rocketknight1 merged 2 commits into huggingface:main from charlieJ107:fix/phi4mm-vision-config on Jan 26, 2026

@charlieJ107 charlieJ107 commented Jan 26, 2026

🐛 Bug Fix: Phi4MultimodalConfig default sub-config initialization

This PR fixes two issues in Phi4MultimodalConfig.__init__ related to default initialization of multimodal sub-configs.

What does this PR do?

❌ Problems fixed

  1. When vision_config is None, a default Phi4MultimodalVisionConfig() was instantiated but not assigned, leaving self.vision_config as None.
  2. The audio_config default initialization incorrectly checked vision_config is None instead of audio_config is None.
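A minimal sketch of how these two problems could show up, using hypothetical stand-in classes (`VisionConfig`, `AudioConfig`, `BuggyConfig`) rather than the real transformers classes. The faulty branches are reconstructed from the description above and are an assumption, not a copy of the original source.

```python
class VisionConfig:
    """Hypothetical stand-in for Phi4MultimodalVisionConfig."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs


class AudioConfig:
    """Hypothetical stand-in for Phi4MultimodalAudioConfig."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs


class BuggyConfig:
    """Assumed shape of the faulty pattern, based only on the description."""

    def __init__(self, vision_config=None, audio_config=None):
        if isinstance(vision_config, dict):
            vision_config = VisionConfig(**vision_config)
        elif vision_config is None:
            VisionConfig()  # Problem 1: default is created, then discarded
        self.vision_config = vision_config

        if isinstance(audio_config, dict):
            audio_config = AudioConfig(**audio_config)
        elif vision_config is None:  # Problem 2: checks the wrong variable
            audio_config = AudioConfig()
        self.audio_config = audio_config


# No arguments: the vision default is lost, so vision_config stays None.
no_args = BuggyConfig()

# Only a vision config given: the audio default is never created.
vision_only = BuggyConfig(vision_config={"hidden_size": 8})
```

Under these assumptions, `no_args.vision_config` and `vision_only.audio_config` both end up as `None`, which is the kind of value downstream code does not expect.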

✅ Changes

if isinstance(vision_config, dict):
    vision_config = Phi4MultimodalVisionConfig(**vision_config)
elif vision_config is None:
    vision_config = Phi4MultimodalVisionConfig()
self.vision_config = vision_config

if isinstance(audio_config, dict):
    audio_config = Phi4MultimodalAudioConfig(**audio_config)
elif audio_config is None:
    audio_config = Phi4MultimodalAudioConfig()
self.audio_config = audio_config

🎯 Impact

  • Ensures vision_config and audio_config are always valid config objects when omitted

  • Prevents downstream errors caused by unexpected None values

  • Improves consistency with other multimodal config implementations

No behavior change for users explicitly passing configs.
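A possible regression check for the behavior described above, sketched with hypothetical stand-in classes (`VisionConfig`, `AudioConfig`, `FixedConfig`) so that it runs without transformers installed. The `hidden_size` and `num_blocks` keys are illustrative only, and a real test would target `Phi4MultimodalConfig` directly.

```python
class VisionConfig:
    """Hypothetical stand-in for Phi4MultimodalVisionConfig."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs


class AudioConfig:
    """Hypothetical stand-in for Phi4MultimodalAudioConfig."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs


class FixedConfig:
    """Mirrors the corrected initialization logic from this PR."""

    def __init__(self, vision_config=None, audio_config=None):
        if isinstance(vision_config, dict):
            vision_config = VisionConfig(**vision_config)
        elif vision_config is None:
            vision_config = VisionConfig()
        self.vision_config = vision_config

        if isinstance(audio_config, dict):
            audio_config = AudioConfig(**audio_config)
        elif audio_config is None:
            audio_config = AudioConfig()
        self.audio_config = audio_config


def check_defaults():
    # Omitted sub-configs should become default config objects, never None.
    cfg = FixedConfig()
    assert isinstance(cfg.vision_config, VisionConfig)
    assert isinstance(cfg.audio_config, AudioConfig)


def check_passthrough():
    # Explicit objects are kept as-is; dicts are converted to config objects.
    vision = VisionConfig(hidden_size=8)
    cfg = FixedConfig(vision_config=vision, audio_config={"num_blocks": 2})
    assert cfg.vision_config is vision
    assert cfg.audio_config.kwargs == {"num_blocks": 2}


check_defaults()
check_passthrough()
```

The two checks cover the omitted-argument path and the explicit-argument path separately, since the original bug only affected the former.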

Let me know if you'd like tests added or additional adjustments.
This pull request fixes initialization logic for the vision and audio configuration objects in the Phi4MultimodalConfig constructor. The changes ensure that default configuration objects are correctly assigned when none are provided.

Configuration initialization fixes:

  • Corrected the assignment of default vision_config and audio_config objects in the __init__ method of modular_phi4_multimodal.py, ensuring that they are properly instantiated when not provided.

Fixes #43479

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@Cyrilvallez

Note: This PR description was refined by AI. I have reviewed the description and the report, and I take full responsibility for this PR.

@charlieJ107 charlieJ107 changed the title from "fix(config): Fix incorrect default vision/audio config initialization in Phi4MultimodalConfig" to "fix(Phi4Multimodal): Fix incorrect default vision/audio config initialization in Phi4MultimodalConfig" on Jan 26, 2026
@charlieJ107 charlieJ107 force-pushed the fix/phi4mm-vision-config branch 2 times, most recently from 3b311a3 to 1bb4347 on January 26, 2026 10:45

@Rocketknight1 Rocketknight1 left a comment

Wow, yes, this is clearly incorrect. I guess it wasn't spotted because vision/audio configs are explicitly passed in from_pretrained() most of the time. Thank you for the fix!

@Rocketknight1 Rocketknight1 force-pushed the fix/phi4mm-vision-config branch from 1bb4347 to 0015de9 on January 26, 2026 13:38
@Rocketknight1 Rocketknight1 enabled auto-merge (squash) January 26, 2026 13:39
@github-actions

[For maintainers] Suggested jobs to run (before merge)

run-slow: phi4_multimodal

@Rocketknight1 Rocketknight1 merged commit 3abe00a into huggingface:main Jan 26, 2026
19 checks passed
@charlieJ107 charlieJ107 deleted the fix/phi4mm-vision-config branch January 26, 2026 13:53
Development

Successfully merging this pull request may close these issues.

Phi4MultimodalConfig incorrectly initializes default vision/audio configs when passed as None

3 participants