🧠 | Multimodal Integration of Oncology Data System
Updated May 31, 2024 · JavaScript
TerraWatch is a proof-of-concept system developed during the TUM AI Hackathon 2024 that detects deforestation from satellite images and reasons about its causes and potential environmental effects using computer vision models and multimodal large language models.
Web-Based Exercise Posture Evaluation and AI Voice Feedback System
This is a simple application that generates scripts for the user to read aloud. Based on the recorded audio, the application provides a score for the user's pronunciation and suggests possible ways to improve it.
Our project enhances Trulens analytics through two key initiatives: an interactive visual node that integrates into Jupyter notebooks, and a comprehensive RAG framework over the Trulens documentation. These efforts aim to simplify and enrich the user experience with Trulens, making advanced data analysis more accessible and intuitive.
A three-level multimodal emotion recognition framework that detects emotions by combining inputs in different formats.
Shows how you can add semantic search to your applications: this sample uses a multimodal model to find images that are semantically similar to a text query. A blog post on it is coming soon.
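The sample's exact stack isn't shown here; as a rough illustration of the idea, here is a minimal sketch using the open-source sentence-transformers CLIP model (the model name and image files are assumptions, not necessarily what the sample uses):

```python
from PIL import Image
from sentence_transformers import SentenceTransformer, util

# CLIP maps images and text into the same embedding space,
# so cosine similarity measures semantic relatedness.
model = SentenceTransformer("clip-ViT-B-32")  # assumed model choice

# Hypothetical local files standing in for an image collection.
image_paths = ["cat.jpg", "beach.jpg", "city.jpg"]
image_embeddings = model.encode([Image.open(p) for p in image_paths])

# Embed the text query and rank images by cosine similarity.
query_embedding = model.encode("a sunny day at the seaside")
scores = util.cos_sim(query_embedding, image_embeddings)[0]
best = scores.argmax().item()
print(f"Best match: {image_paths[best]} (score={scores[best].item():.3f})")
```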
Build and explore multimodal web interactives with pieces of paper!
[ICCV2021 Workshop] Multi-Modal Video Reasoning and Analyzing Competition
Amazon Alexa Skill - "Alexa, ask Fork On The Road"
Turn yourself into a Halloween-styled character and get an original roast with the power of AI.
A vision-assistance multimodal application built on top of Google Gemini Pro Vision.
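For context on what such an app's core call might look like, here is a minimal, hypothetical sketch using Google's google-generativeai Python SDK (the prompt and input image are assumptions; the actual app's code may differ):

```python
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")  # placeholder; supply your own key

# Gemini Pro Vision accepts mixed text + image content in one request.
model = genai.GenerativeModel("gemini-pro-vision")

image = Image.open("scene.jpg")  # hypothetical frame from the user's camera
response = model.generate_content(
    ["Describe this scene for a visually impaired user.", image]
)
print(response.text)
```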
[CVPR2021] SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events
Employee Productivity GenAI Assistant Example is an innovative code sample and architecture pattern designed to enhance the efficiency of writing tasks using AWS serverless technologies and Amazon Bedrock's generative AI models.
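The sample's full serverless architecture isn't reproduced here; as a rough sketch of the core call, this is how a Lambda-style handler might invoke a Bedrock-hosted model via boto3 (the model ID and request shape follow the Claude Messages API on Bedrock and are assumptions about what the sample uses):

```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def draft_text(task: str) -> str:
    """Ask a Bedrock-hosted Claude model to help with a writing task."""
    body = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 512,
        "messages": [{"role": "user", "content": f"Improve this draft:\n{task}"}],
    }
    response = bedrock.invoke_model(
        modelId="anthropic.claude-3-sonnet-20240229-v1:0",  # assumed model choice
        body=json.dumps(body),
    )
    payload = json.loads(response["body"].read())
    return payload["content"][0]["text"]

print(draft_text("our team done good work this quarter"))
```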
React component library for crafting user-friendly and engaging conversational experiences
Sample skill that demonstrates the new Alexa Presentation Language (APL). The multimodal skill's functionality matches the Alexa Fact Skill template: when invoked, it selects a fact at random and tells it to the user, and it is compatible with devices that have a display.
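To give a sense of what an APL response involves, here is a minimal, hypothetical handler fragment using the Python ASK SDK that attaches a RenderDocument directive alongside the spoken fact (the APL document layout is an illustrative assumption, not the template's actual design):

```python
from ask_sdk_core.utils import get_supported_interfaces
from ask_sdk_model.interfaces.alexa.presentation.apl import RenderDocumentDirective

# Minimal APL document: a single Text component showing the fact on screen.
APL_DOCUMENT = {
    "type": "APL",
    "version": "1.8",
    "mainTemplate": {
        "parameters": ["payload"],
        "items": [{"type": "Text", "text": "${payload.fact}", "fontSize": "40dp"}],
    },
}

def build_response(handler_input, fact: str):
    builder = handler_input.response_builder.speak(fact)
    # Only send the directive to devices that actually support APL displays.
    if get_supported_interfaces(handler_input).alexa_presentation_apl is not None:
        builder.add_directive(
            RenderDocumentDirective(
                token="factToken",
                document=APL_DOCUMENT,
                datasources={"payload": {"fact": fact}},
            )
        )
    return builder.response
```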
A simple "Be My Eyes" web app with a llama.cpp/llava backend
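For orientation, a llama.cpp server started with a LLaVA model and its multimodal projector exposes an HTTP /completion endpoint that accepts base64 image data; the request shape below follows that server's API as documented, though flags and fields may have changed between versions:

```python
import base64
import requests

# Assumes a local llama.cpp server, started roughly like:
#   ./server -m llava-v1.5-7b.Q4_K.gguf --mmproj mmproj-model-f16.gguf
with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = requests.post(
    "http://localhost:8080/completion",
    json={
        # The [img-12] tag tells the server where to splice in image id 12.
        "prompt": "USER: [img-12] Describe what is in front of me.\nASSISTANT:",
        "image_data": [{"data": image_b64, "id": 12}],
        "n_predict": 128,
    },
)
print(response.json()["content"])
```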