
> Yeah, if someone could get a PR of a vision model working locally on the project that'd be great I think #101

Closed
Andy1996247 opened this issue Dec 11, 2023 · 5 comments

Comments

@Andy1996247

> Yeah, if someone could get a PR of a vision model working locally on the project that'd be great I think

Would this work? https://llava-vl.github.io/

https://simonwillison.net/2023/Nov/29/llamafile/

Originally posted by @Andy1996247 in #86 (comment)

@BorisMolch

From what I manually tested, it's not better than GPT-4V...
I guess we have to wait for vision models to improve and provide accurate coordinates (see #7),

or take a different approach.

@joshbickett
Contributor

@BorisMolch even though Llava may not perform well, others may be interested to try it and see how they can improve it. If you want to make a PR for running Llava locally, I'd be happy to review it

@BorisMolch

> @BorisMolch even though Llava may not perform well, others may be interested to try it and see how they can improve it. If you want to make a PR for running Llava locally, I'd be happy to review it

By "manually tested" I mean I gave Llava screenshots and checked whether it is capable of giving instructions. It's not (at least not as well as GPT-4V).

@joshbickett
Contributor

Oh OK, understood. Well, thanks for the input nonetheless!

@joshbickett
Contributor

We now have Llava integrated thanks to the PR from @michaelhhogue!
