Skip to content

Support image/video in Inference Command #47

@dongwang218

Description

@dongwang218

Is your feature request related to a problem? Please describe:

Currently matrix inference only support text.

Describe the solution you would like:

  1. Find huggingface dataset contains images (eg cais/hle) and videos
  2. Convert the row into Chat message
  3. Deploy maverick-fp8 on h100 and test it on images, which should work.
  4. Explore video input and model, eg gemma

Additional Context:
We currently integrated vllm v0.8.3, please see if it support video models.

Metadata

Metadata

Assignees

Labels

enhancementNew feature or request

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions