[Question]: How to use img2text with Ollama? #6183

@andrealesani

Description

Self Checks

  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (Language Policy).
  • Non-English title submissions will be closed directly (Language Policy).
  • Please do not modify this template :) and fill in all the required fields.

Describe your problem

I am trying to use vision models on Ollama, such as llava or llama3.2-vision, but I cannot get them working. I have a knowledge base with a single PDF document that contains 3 images. Parsing always returns a single, empty chunk.

This is what I have tried to set:

  • /knowledge/dataset?id=... > Configuration > Layout recognition & OCR > LLava
  • /knowledge/dataset?id=... > Action (on the specific document) > Chunk method > Layout recognition & OCR > LLava
  • /user-setting/model > System model settings > Img2text model > LLava

Am I missing something?
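
For reference, here is a minimal sketch I used to rule out the Ollama side, assuming Ollama is running on its default localhost port and the model has already been pulled (`ollama pull llava`); `test_image.png` is a placeholder for one of the images extracted from the PDF:

```python
import base64
import json
import urllib.request

# Assumption: Ollama is running locally on its default port (11434).
OLLAMA_URL = "http://localhost:11434/api/generate"

# Base64-encode a test image, as Ollama's /api/generate endpoint
# expects for vision models.
with open("test_image.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "model": "llava",
    "prompt": "Describe this image in detail.",
    "images": [image_b64],
    "stream": False,
}

req = urllib.request.Request(
    OLLAMA_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    result = json.load(resp)

# If this prints a sensible description, the model side works and the
# problem is more likely in the knowledge-base configuration.
print(result["response"])
```

In my case this returns a reasonable description of the image, so the model itself seems to respond correctly outside the knowledge base.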
