Skip to content

Fix AutoImageProcessor to correctly detect local implementation whe…#44653

Closed
kaixuanliu wants to merge 6 commits intohuggingface:mainfrom
kaixuanliu:image_process_fix
Closed

Fix AutoImageProcessor to correctly detect local implementation whe…#44653
kaixuanliu wants to merge 6 commits intohuggingface:mainfrom
kaixuanliu:image_process_fix

Conversation

@kaixuanliu
Copy link
Copy Markdown
Contributor

@zucchini-nlp, can you help review? Thx! unit tests to reproduce this bug:
tests/models/phi4_multimodal/test_modeling_phi4_multimodal.py::Phi4MultimodalIntegrationTest::test_audio_text_generation

…n `image_processor_type` is None

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
Comment on lines +599 to +609
# If we don't have an image_processor_class yet and config wasn't provided, try to load config
# to check if there's a local implementation registered in IMAGE_PROCESSOR_MAPPING
if image_processor_class is None and config is None:
try:
config = AutoConfig.from_pretrained(
pretrained_model_name_or_path,
trust_remote_code=trust_remote_code,
**kwargs,
)
except ValueError:
pass
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx for the review comment! In original code, it checks image_processor_auto_map is None, while in this test case, it is not None. I will adjust the code directly here to make it concise.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if image_processor_auto_map is not None, then it is a remote code. For Phi4 specifically, the model isn't supported in transformers correctly and works only from certain hub commit

I think the fix is not needed

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get your point and it makes sense. But when I explictly set trust_remote_code=True in the test case:

-        self.processor = AutoProcessor.from_pretrained(self.checkpoint_path, revision=self.revision)
+        self.processor = AutoProcessor.from_pretrained(
+            self.checkpoint_path, revision=self.revision, trust_remote_code=True
+        )

it returns error:

E   ImportError: cannot import name 'CommonKwargs' from 'transformers.processing_utils
' (/root/upstream/transformers/src/transformers/processing_utils.py)

/root/.cache/huggingface/modules/transformers_modules/microsoft/Phi_hyphen_4_hyphen_mu
ltimodal_hyphen_instruct/467a5d77dcfa75a8801cf8b3fc08974c8c89d05b/processing_phi4_multimodal.py:32: ImportError

it seems the custom code has conflict with latest transformers. And I found Phi4MultimodalImageProcessorFast is already registered in transformers, hence I tried to use local code. So the best practice is to contact the model owner to update the remote code, right?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am not sure about the current state of this model and how to use it via transformers. cc @Cyrilvallez who added the model

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is very very likely that the remote implem broke indeed. The model owners never wanted to merge the Transformers implementation unfortunately, which would have ensured the model keeps up-to-date and is always usable. But we cannot assume maintenance of remote code of course

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx for the update. I will close this PR.

@github-actions
Copy link
Copy Markdown
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto

@github-actions
Copy link
Copy Markdown
Contributor

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=44653&sha=376316

@kaixuanliu
Copy link
Copy Markdown
Contributor Author

kaixuanliu commented Mar 18, 2026

@zucchini-nlp Hi, can you help review again? In CI test a lot of mistral4 model test cases failed. It should not be caused by this PR.

@kaixuanliu kaixuanliu closed this Mar 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants