A simple "Be My Eyes" web app with a llama.cpp/llava backend (JavaScript, updated Nov 28, 2023)
[CVPR2021] SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events
Sample skill demonstrating the new Alexa Presentation Language (APL). The multimodal skill's functionality is the same as the Alexa Fact Skill template: when invoked, it selects a fact at random and tells it to the user, and it is compatible with devices that have a display.
Build and explore multimodal web interactives with pieces of paper!
Amazon Alexa Skill - "Alexa, ask Fork On The Road"
How to add semantic search to your applications. This sample shows how you can use a multimodal model to find images that are semantically similar to some text. New blog coming out soon.
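The image-to-text search the sample describes can be sketched as nearest-neighbor lookup by cosine similarity over embeddings. The toy vectors below stand in for the outputs of a real multimodal encoder (such as CLIP); the function names, dimensions, and values are illustrative assumptions, not taken from the sample itself.

```python
import numpy as np

def cosine_similarity(a, b):
    # Normalize rows to unit length, then take dot products.
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

def search(text_embedding, image_embeddings, top_k=2):
    # Rank images by cosine similarity to the query text embedding
    # and return the indices of the top_k closest images.
    sims = cosine_similarity(text_embedding[None, :], image_embeddings)[0]
    return np.argsort(-sims)[:top_k]

# Toy 4-dimensional embeddings standing in for real model outputs.
image_embeddings = np.array([
    [0.90, 0.10, 0.00, 0.00],  # e.g. "a dog on grass"
    [0.00, 0.80, 0.20, 0.00],  # e.g. "a city skyline"
    [0.85, 0.20, 0.10, 0.00],  # e.g. "a puppy playing"
])
text_embedding = np.array([1.0, 0.1, 0.0, 0.0])  # e.g. "photo of a dog"

print(search(text_embedding, image_embeddings))  # indices of best matches
```

In a real system, both the query text and each image are embedded into the same vector space by the multimodal model, so the same similarity ranking works across modalities.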
Three-level multimodal emotion recognition framework that detects emotions by combining inputs in different formats.
A vision-assistance multimodal application built on top of Google Gemini Pro Vision.
Turn yourself into a Halloween-styled character and get an original roast with the power of AI.
Web-Based Exercise Posture Evaluation and AI Voice Feedback System
[ICCV2021 Workshop] Multi-Modal Video Reasoning and Analyzing Competition
🧠 | Multimodal Integration of Oncology Data System
Our project enhances Trulens analytics through two key initiatives: developing an interactive visual node for integration in Jupyter notebooks, and creating a comprehensive RAG framework for Trulens documentation. These efforts aim to simplify and enrich the user experience with Trulens, making advanced data analysis more accessible and intuitive.
TerraWatch is a proof of concept system developed during the TUM AI Hackathon 2024 to detect deforestation from satellite images and reason out the causes and potential environmental effects using computer vision models and multimodal large language models.
A simple application that generates scripts for the user to read aloud. Based on the recorded audio, it scores the user's pronunciation and suggests ways to improve it.