multi-modal
Here are 272 public repositories matching this topic...
Basic implementation code for multimodal models and some applications or fine-tuning tasks based on them.
Updated Jun 8, 2024 - Jupyter Notebook
Start building LLM-empowered multi-agent applications in an easier way.
Updated Jun 8, 2024 - Python
MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone
Updated Jun 8, 2024 - Python
GPT4V-level open-source multi-modal model based on Llama3-8B
Updated Jun 8, 2024 - Python
[ICLR 2024 Spotlight] This is the official code for the paper "SNIP: Bridging Mathematical Symbolic and Numeric Realms with Unified Pre-training"
Updated Jun 8, 2024 - Python
Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
Updated Jun 7, 2024 - Python
Language Modeling Research Hub, a comprehensive compendium for enthusiasts and scholars delving into the fascinating realm of language models (LMs), with a particular focus on large language models (LLMs)
Updated Jun 7, 2024 - Python
Represent, send, store and search multimodal data
Updated Jun 6, 2024 - Python
The TypeScript library for building AI applications.
Updated Jun 6, 2024 - TypeScript
Open Source Routing Engine for OpenStreetMap
Updated Jun 8, 2024 - C++
Time-series forecasting of market price data using a multi-modal Convolutional Neural Network
Updated Jun 6, 2024 - Jupyter Notebook
ModelScope: bring the notion of Model-as-a-Service to life.
Updated Jun 6, 2024 - Python
A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Updated Jun 6, 2024 - Python
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable open-source multimodal chat model with performance approaching GPT-4V.
Updated Jun 6, 2024 - Python
Visualizing the attention of vision-language models
Updated Jun 6, 2024 - Jupyter Notebook
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting GPT-4v, Gemini, QwenVLPlus, 50+ HF models, and 20+ benchmarks
Updated Jun 5, 2024 - Python
VisualRWKV is the visual-enhanced version of the RWKV language model, enabling RWKV to handle various visual tasks.
Updated Jun 4, 2024 - Python
Here we will track the latest AI Multimodal Models, including Multimodal Foundation Models, LLM, Agent, Audio, Image, Video, Music and 3D content. 🔥
Updated Jun 4, 2024
Efficient Retrieval Augmentation and Generation Framework
Updated Jun 4, 2024 - Python