Multimodal example

Hello, 

Thanks for your great work. I am trying out the multimodal example with Qwen2.5-VL. Would the code work as is? Dense.py uses Qwen2_5OmniThinkerForConditionalGeneration, but for a Qwen2.5-VL model it should be Qwen2_5_VLForConditionalGeneration. Would dataset.py also need changing? It uses utils from the omni package, not the vl package.
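To make the class question concrete, here is a minimal sketch of choosing the model class from the checkpoint name. The two class names are the ones exposed by Hugging Face transformers (note the underscore before VL); the helper names and the substring check are assumptions for illustration, not code from this repository, and whether Dense.py can swap classes without other changes is not confirmed.

```python
import importlib

# Class names as exposed by Hugging Face transformers.
MODEL_CLASS_BY_FAMILY = {
    "omni": "Qwen2_5OmniThinkerForConditionalGeneration",
    "vl": "Qwen2_5_VLForConditionalGeneration",
}


def pick_model_class_name(model_name_or_path: str) -> str:
    """Hypothetical helper: guess the model family from the checkpoint name."""
    name = model_name_or_path.lower()
    if "omni" in name:
        return MODEL_CLASS_BY_FAMILY["omni"]
    if "vl" in name:
        return MODEL_CLASS_BY_FAMILY["vl"]
    raise ValueError(f"Cannot infer model family from {model_name_or_path!r}")


def load_model_class(model_name_or_path: str):
    """Resolve the class lazily so transformers is imported only when needed."""
    transformers = importlib.import_module("transformers")
    return getattr(transformers, pick_model_class_name(model_name_or_path))
```

With this, load_model_class("Qwen/Qwen2.5-VL-7B-Instruct").from_pretrained(...) would use the VL class, assuming an installed transformers version that ships it.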

Also, dataset.py contains the line: image = document_info.get('image', None). Should this be replaced with: image = document_info.get('document_image', None)?
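Until the correct key is confirmed, a lookup that tolerates both names is one possible workaround. This is only a sketch: it assumes document_info is a plain dict, and the helper name get_document_image is hypothetical, not part of dataset.py.

```python
from typing import Any, Mapping, Optional, Sequence


def get_document_image(
    document_info: Mapping[str, Any],
    keys: Sequence[str] = ("document_image", "image"),
) -> Optional[Any]:
    """Return the first non-None value found under any of the candidate keys."""
    for key in keys:
        value = document_info.get(key)
        if value is not None:
            return value
    # Neither key is present (or both are None): behave like .get(..., None).
    return None
```

The order of keys decides which field wins when both are present; here document_image is tried first.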

Thanks in advance for your help,
Mehreen

Multimodal example #186
