GPT-4V in Wonderland: LMMs as Smartphone Agents
Updated Jul 17, 2024 · Python
Monitor the performance of OpenAI's GPT-4V model over time.
Vision utilities for web interaction agents 👀
Mobile-Agent: The Powerful Mobile Device Operation Assistant Family
[ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions
Control Any Computer Using LLMs
Multi-Modal Multi-Embodied Hivemind-like Iteration of RTX-2
Implementation of MambaByte from "MambaByte: Token-free Selective State Space Model" in PyTorch and Zeta
Vision-Assisted Camera Orientation
Discover the GPT-4o multimodal model introduced at Microsoft Build 2024, now with text and image capabilities. My prototype enhances chats with real-time camera snapshots, powered by Flask, OpenCV, and Azure OpenAI Service. It's interactive, visual, and simple to use. Give it a try!
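The prototype's actual code is not shown here, but the snapshot-to-chat flow it describes can be sketched with the standard library alone: encode a captured JPEG frame as a base64 data URL and wrap it in the text-plus-image content-parts message shape used by GPT-4o/GPT-4V chat APIs. The function names below are hypothetical, not the repo's.

```python
import base64


def frame_to_data_url(jpeg_bytes: bytes) -> str:
    """Encode a captured JPEG frame (e.g. from OpenCV's imencode)
    as a base64 data URL for an image message part."""
    b64 = base64.b64encode(jpeg_bytes).decode("ascii")
    return f"data:image/jpeg;base64,{b64}"


def build_vision_message(prompt: str, jpeg_bytes: bytes) -> dict:
    """Build one user message mixing text and the camera snapshot,
    in the content-parts format used by GPT-4o / GPT-4V chat APIs."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": frame_to_data_url(jpeg_bytes)},
            },
        ],
    }
```

A Flask route would typically call `build_vision_message` with the latest webcam frame and forward the result in the `messages` list of a chat-completion request.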
Large Chinese Language-and-Vision Assistant for BioMedicine (a Chinese medical multimodal large model)
AI Voiceover with GPT4V
The ultimate sketch-to-code app, made using GPT-4 Vision. Choose your desired framework (React, Next.js, React Native, Flutter) for your app; it will instantly generate code and a preview (sandbox) from a simple hand-drawn paper sketch captured from your webcam.
Explore the rich flavors of Indian desserts with TunedLlavaDelights. Utilizing LLaVA fine-tuning, our project unveils detailed nutritional profiles, taste notes, and optimal consumption times for beloved sweets. Dive into a fusion of AI innovation and culinary tradition.
Convert different model APIs into the OpenAI API format out of the box.
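Translating between provider response shapes is the core of such an adapter. As a minimal sketch (the field names `output` and `stop_reason` are an assumed, hypothetical provider schema, not any specific vendor's), mapping one response onto the OpenAI chat-completion shape looks like:

```python
def to_openai_format(provider_response: dict, model: str) -> dict:
    """Map a hypothetical provider response of the form
    {"output": <text>, "stop_reason": <reason>} onto the
    OpenAI chat-completion response shape."""
    # Translate provider-specific stop reasons to OpenAI's vocabulary;
    # anything unrecognized falls back to "stop".
    finish_map = {"end_turn": "stop", "max_tokens": "length"}
    return {
        "object": "chat.completion",
        "model": model,
        "choices": [
            {
                "index": 0,
                "message": {
                    "role": "assistant",
                    "content": provider_response["output"],
                },
                "finish_reason": finish_map.get(
                    provider_response.get("stop_reason"), "stop"
                ),
            }
        ],
    }
```

A real adapter would also map request parameters, streaming chunks, and error codes in both directions, but the response-shape translation above is the pattern each of those follows.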