Support a vision model on webcam live stream #291

Closed
flatsiedatsie opened this issue Feb 6, 2024 · 1 comment

Comments

@flatsiedatsie

It would be very interesting to run a vision model like LLaVA, or the very impressive and tiny new Moondream LLM, in the browser. It would be fantastic if it could analyze images and (live) video.

That would, for example, allow us to create a tool that helps blind people get a sense of the world around them simply by opening a webpage on their phone.

@CharlieFRuan
Contributor

Yep, multimodal models like LLaVA are on our roadmap, as brought up in #276.

2 participants