multimodal

Star

Here are 470 public repositories matching this topic...

jina-ai / jina

Star

☁️ Build multimodal AI applications with cloud-native stack

Updated Sep 5, 2024
Python

microsoft / unilm

Star

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Updated Aug 28, 2024
Python

haotian-liu / LLaVA

Star

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

chatbot llama multimodal multi-modality gpt-4 foundation-models visual-language-learning chatgpt instruction-tuning vision-language-model llava llama2 llama-2

Updated Aug 12, 2024
Python

NVIDIA / NeMo

Star

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

machine-translation tts speech-synthesis neural-networks deeplearning speaker-recognition asr multimodal speech-translation large-language-models speaker-diariazation generative-ai

Updated Sep 10, 2024
Python

bentoml / BentoML

Star

The easiest way to serve AI apps and models - Build reliable Inference APIs, LLM apps, Multi-model chains, RAG service, and much more!

python machine-learning deep-learning model-serving multimodal mlops ml-engineering ai-inference llm generative-ai llmops llm-serving model-inference-service llm-inference inference-platform

Updated Sep 10, 2024
Python

facebookresearch / mmf

Star

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

deep-learning dialog pytorch vqa pretrained-models captioning multimodal multi-tasking textvqa hateful-memes

Updated May 25, 2024
Python

SkalskiP / courses

Sponsor

Star

This repository is a curated collection of links to various courses and resources about Artificial Intelligence (AI)

nlp machine-learning natural-language-processing tutorial deep-neural-networks computer-vision deep-learning transformers generative-model multimodal mlops stable-diffusion

Updated Apr 22, 2024
Python

kyegomez / tree-of-thoughts

Sponsor

Star

Plug in and Play Implementation of Tree of Thoughts: Deliberate Problem Solving with Large Language Models that Elevates Model Reasoning by atleast 70%

deep-learning prompt artificial-intelligence multimodal gpt4 prompt-learning prompt-tuning prompt-engineering chatgpt

Updated Aug 29, 2024
Python

IDEA-CCNL / Fengshenbang-LM

Star

Fengshenbang-LM(封神榜大模型)是IDEA研究院认知计算与自然语言研究中心主导的大模型开源体系，成为中文AIGC和认知智能的基础设施。

transformers pytorch chinese-nlp pretrained-models distributed-training multimodal aigc

Updated Aug 13, 2024
Python

jina-ai / discoart

Star

🪩 Create Disco Diffusion artworks in one line

generative-art cross-modal diffusion prompts creative-ai creative-art multimodal clip-guided-diffusion dalle disco-diffusion midjourney imgen discodiffusion latent-diffusion stable-diffusion

Updated May 16, 2023
Python

rom1504 / img2dataset

Star

Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

image big-data deep-learning dataset image-dataset download-images multimodal

Updated Aug 7, 2024
Python

open-mmlab / mmpretrain

Star

OpenMMLab Pre-training Toolbox and Benchmark

deep-learning pytorch image-classification resnet pretrained-models clip mae mobilenet moco multimodal self-supervised-learning constrastive-learning beit vision-transformer swin-transformer masked-image-modeling convnext

Updated Aug 20, 2024
Python

modelscope / ms-swift

Star

Use PEFT or Full-parameter to finetune 300+ LLMs or 80+ MLLMs. (Qwen2, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V-2.6, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)

Updated Sep 10, 2024
Python

NExT-GPT / NExT-GPT

Star

Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model

multimodal gpt-4 foundation-models visual-language-learning large-language-models llm chatgpt instruction-tuning multi-modal-chatgpt

Updated Jan 22, 2024
Python

InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. Now it supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (支持DragGAN、ChatGPT、ImageBind、SAM的在线Demo系统)

Updated Aug 20, 2024
Python

microsoft / torchscale

Star

Foundation Architecture for (M)LLMs

machine-learning natural-language-processing translation computer-vision transformer speech-processing multimodal pretrained-language-model

Updated Apr 11, 2024
Python

docarray / docarray

Star

Represent, send, store and search multimodal data

elasticsearch machine-learning deep-learning protobuf pytorch data-structures nearest-neighbor-search cross-modal multi-modal semantic-search multimodal nested-data weaviate dataclass pydantic fastapi neural-search qdrant docarray

Updated Sep 4, 2024
Python

X-PLUG / MobileAgent

Star

Mobile-Agent: The Powerful Mobile Device Operation Assistant Family

android agent harmony ios app gui automation mobile copilot multimodal mobile-agents mllm multimodal-large-language-models gpt4v multimodal-agent

Updated Sep 3, 2024
Python

InternLM / InternLM-XComposer

Star

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

foundation gpt language-model multimodal multi-modality vision-transformer gpt-4 visual-language-learning llm chatgpt instruction-tuning large-language-model supervised-finetuning mllm vision-language-model large-vision-language-model

Updated Aug 30, 2024
Python

OFA-Sys / OFA

Star

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

prompt chinese image-captioning pretrained-models visual-question-answering multimodal text-to-image-synthesis vision-language pretraining referring-expression-comprehension prompt-tuning

Updated Apr 24, 2024
Python

Improve this page

Add a description, image, and links to the multimodal topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the multimodal topic, visit your repo's landing page and select "manage topics."

Learn more

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

multimodal

Here are 470 public repositories matching this topic...

jina-ai / jina

microsoft / unilm

haotian-liu / LLaVA

NVIDIA / NeMo

bentoml / BentoML

facebookresearch / mmf

SkalskiP / courses

kyegomez / tree-of-thoughts

IDEA-CCNL / Fengshenbang-LM

jina-ai / discoart

rom1504 / img2dataset

open-mmlab / mmpretrain

modelscope / ms-swift

NExT-GPT / NExT-GPT

OpenGVLab / InternGPT

microsoft / torchscale

docarray / docarray

X-PLUG / MobileAgent

InternLM / InternLM-XComposer

OFA-Sys / OFA

Improve this page

Add this topic to your repo