A simple "Be My Eyes" web app with a llama.cpp/llava backend
Updated Nov 28, 2023 - JavaScript
Sample skill that demonstrates the new Alexa Presentation Language (APL). The multimodal skill works the same way as the Alexa Fact Skill template: when invoked, it selects a fact at random and tells it to the user. It is compatible with devices that have a display.
[CVPR2021] SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events
Amazon Alexa Skill - "Alexa, ask Fork On The Road"
Three-level multimodal emotion recognition framework to detect emotions combining different inputs with different formats.
Shows how to add semantic search to your applications. This sample uses a multimodal model to find images that are semantically similar to a piece of text. New blog coming out soon.
Build and explore multimodal web interactives with pieces of paper!
[ICCV2021 Workshop] Multi-Modal Video Reasoning and Analyzing Competition
A vision assistance multimodal application built on top of Google Gemini Vision Pro.
Turn yourself into a Halloween-styled character and get an original roast with the power of AI.
🧠 | Multimodal Integration of Oncology Data System
TerraWatch is a proof of concept system developed during the TUM AI Hackathon 2024 to detect deforestation from satellite images and reason out the causes and potential environmental effects using computer vision models and multimodal large language models.
A simple application that generates scripts for the user to read aloud. Based on the recorded audio, it scores the user's pronunciation and suggests ways to improve it.
Web-Based Exercise Posture Evaluation and AI Voice Feedback System
Our project enhances Trulens analytics through two key initiatives: developing an interactive visual node for integration in Jupyter notebooks, and creating a comprehensive RAG framework for Trulens documentation. These efforts aim to simplify and enrich the user experience with Trulens, making advanced data analysis more accessible and intuitive.