Add Support for Ovis2.5 Multi-Modal Model #40841

### Model description

Key Features:

- Small Model Performance: Optimized training strategies give small-scale models higher capability density, letting them lead across model-size tiers.
- Enhanced Reasoning Capabilities: Combining instruction tuning with preference learning significantly strengthens Chain-of-Thought (CoT) reasoning.
- Video and Multi-Image Processing: Training incorporates video and multi-image data, improving the handling of complex visual information across frames and images.
- Multilingual Support and OCR: Multilingual OCR extends beyond English and Chinese, and structured data extraction from complex visual elements such as tables and charts is improved.

### Open source status

- [x] The model implementation is available
- [x] The model weights are available

### Provide useful links for the implementation

[1] arXiv: https://arxiv.org/abs/2508.11737
[2] Hugging Face: https://huggingface.co/AIDC-AI/Ovis2-8B
