[FEAT]: Vision Models and embedding Documents #3395

Open
kalfbz opened this issue Mar 5, 2025 · 1 comment
Labels: enhancement, feature request

Comments

kalfbz commented Mar 5, 2025

What would you like to see?

A vision model such as gemini-2.0-flash can extract information from images. Currently, however, an image or document uploaded in the chatbox gets embedded into the workspace instead of being sent to the model for interpretation.

Would it be possible to change this behavior so that uploading an image or document in the chatbox sends it to the model for interpretation, while files intended for embedding into the workspace are uploaded via the upload button in the left-side workspace panel?
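
To make the request concrete, here is a minimal sketch of the routing rule described above: the destination depends only on where the upload happened. All names (`UploadSource`, `Destination`, `route_upload`) are hypothetical and only illustrate the idea; they are not taken from the project's code.

```python
from enum import Enum


class UploadSource(Enum):
    """Where the user dropped the file (hypothetical names)."""
    CHATBOX = "chatbox"                  # attached to a chat message
    WORKSPACE_PANEL = "workspace_panel"  # added via the workspace upload button


class Destination(Enum):
    """What should happen to the file (hypothetical names)."""
    MODEL = "model"        # pass the file to the model for interpretation
    EMBEDDER = "embedder"  # chunk and embed the file into the workspace


def route_upload(source: UploadSource) -> Destination:
    """Choose a destination based only on the upload location."""
    if source is UploadSource.CHATBOX:
        return Destination.MODEL
    return Destination.EMBEDDER
```

With this rule, a chatbox upload never touches the workspace embeddings, and the workspace upload button remains the only way to add documents for RAG.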

kalfbz added the enhancement and feature request labels on Mar 5, 2025

th3f001 commented Mar 8, 2025

...or maybe decouple this functionality from the workspace embedder (which I assume is meant for text-based files that get split into chunks before being sent to the model) and make a Custom Skill out of it?

So at that point the user would have 2 separate flows:
-> standard files to RAG upon -> workspace file embedder
-> image files to be passed to a Vision-enabled model -> Agent with dedicated Skill

Following the same approach, we could also have a 3rd flow:
-> PDFs and other mixed-content files to be passed to an OCR-enabled model (like the latest Mistral-OCR) -> Agent with dedicated Skill
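
As a rough sketch of how these three flows could be told apart, the snippet below dispatches on file extension. The flow names, the extension sets, and the `pick_flow` helper are all assumptions made for illustration; a real implementation might inspect MIME types or file contents instead.

```python
from enum import Enum
from pathlib import Path


class Flow(Enum):
    """The three flows discussed above (hypothetical names)."""
    WORKSPACE_EMBEDDER = "workspace_embedder"  # standard files to RAG upon
    VISION_SKILL = "vision_skill"              # images for a vision-enabled model
    OCR_SKILL = "ocr_skill"                    # PDFs / mixed content for an OCR model


# Assumed extension lists; adjust to whatever the models actually accept.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp"}
OCR_EXTENSIONS = {".pdf"}


def pick_flow(filename: str) -> Flow:
    """Pick a flow from the file extension, defaulting to the embedder."""
    suffix = Path(filename).suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        return Flow.VISION_SKILL
    if suffix in OCR_EXTENSIONS:
        return Flow.OCR_SKILL
    return Flow.WORKSPACE_EMBEDDER
```

Defaulting to the embedder keeps the current behavior for plain text files, so only images and PDFs would be diverted to an agent skill.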

