# gpt4v

Here are 14 public repositories matching this topic...

ShareGPT4Omni / ShareGPT4V

An official implementation of ShareGPT4V: Improving Large Multi-modal Models with Better Captions

gpt language-model large-language-models chatgpt instruction-tuning vision-language-model large-vision-language-models gpt4v large-multimodal-models gpt-4v

Updated Jun 6, 2024
Python

danomation / Discord-Vision-Bot

Proof-of-concept GPT-4 Vision bot

discord vision openai pycord gpt4v gpt-4-vision-preview

Updated Nov 7, 2023
Python

Envedity / DAIA

Digital Artificial Intelligence Agent

machine-learning ai ml agi auto-agent ai-agent llm llm-agent ai-vision-model gpt4v gpt4vision

Updated Dec 28, 2023
Python

neka-nat / mylangrobot

Language instructions to mycobot using GPT-4V

whisper mycobot chatgpt segment-anything gpt4v gpt-4-vision-preview gpt-4-vision

Updated Dec 11, 2023
Python

elizabethsiegle / stephensmithify-openaivision-sendgrid

Analyze a video and generate commentary about it with OpenAI's GPT-4V, text-to-speech, LangChain, Streamlit, Replit, Twilio SendGrid, and OpenCV!

openai sendgrid opencv-python replit streamlit openai-api langchain gpt4v openai-v

Updated Dec 14, 2023
Python

Ravi-Teja-konda / TunedLlavaDelights

Explore the rich flavors of Indian desserts with TunedLlavaDelights. Using LLaVA fine-tuning, our project unveils detailed nutritional profiles, taste notes, and optimal consumption times for beloved sweets. Dive into a fusion of AI innovation and culinary tradition.

dessert nutrition nutrition-information finetuning multimodal multi-modality gpt4 tranformers dalle2 stable-diffusion chatgpt vision-language-model llava vision-language-learning llama2 gpt4v

Updated Mar 17, 2024
Python

BUAADreamer / Chinese-LLaVA-Med

Chinese medical multimodal large model: Large Chinese Language-and-Vision Assistant for BioMedicine

ai transformers medical chinese multimodal huggingface-datasets mllm llava minigpt4 gpt4v qwen1-5 llama-factory

Updated May 22, 2024
Python

GraphPKU / CoI

Chain of Images for Intuitively Reasoning

chatbot llama multimodal chatgpt llava visual-language-models gpt4v dalle3 chain-of-throught chain-of-image

Updated Nov 29, 2023
Python

kyegomez / HRTX

Multi-Modal Multi-Embodied Hivemind-like Iteration of RTX-2

machine-learning ai ml artificial-intelligence ensemble multi-modal rtx multi-modality rt-2 gpt4v

Updated Mar 12, 2024
Python

logicalroot / gpt-4v-demos

🤖 GPT-4V Demos • Test the model's vision capabilities in your browser using Streamlit • Easy setup

python openai streamlit gpt-4 gpt4 gpt4v gpt-4v

Updated Dec 3, 2023
Python

kyegomez / MambaByte

Implementation of MambaByte from "MambaByte: Token-free Selective State Space Model" in PyTorch and Zeta

machine-learning ai tokenizer ml artificial-intelligence mamba multi-modality megabyte gpt4v

Updated May 17, 2024
Python

AmberSahdev / Open-Interface

Control Any Computer Using LLMs

python windows macos linux machine-learning automation assistant openai gpt pyinstaller self-driving pyautogui assistant-computer-control self-driving-software gpt4 llm gpt4v gpt4vision

Updated May 12, 2024
Python

X-PLUG / MobileAgent

Mobile-Agent: The Powerful Mobile Device Operation Assistant Family

android agent harmony ios app gui automation mobile copilot multimodal mobile-agents mllm multimodal-large-language-models gpt4v multimodal-agent

Updated Jun 7, 2024
Python

mnotgod96 / AppAgent

AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.

agent gpt4 llm generative-ai chatgpt gpt4v

Updated May 26, 2024
Python
