👀🧠 GPT-4 Vision x 💪⌨️ Vimium = Autonomous Web Agent

Jiayi-Pan/GPT-V-on-Web

Demo Video: GPT-4V finds a cute cat image through Chrome.

This project uses GPT-4V to build an autonomous, interactive web agent. The action space is discretized by Vimium, a Chrome extension for keyboard-driven browsing.

It is a minimal proof of concept at this stage, and performance can be further improved with more engineering.
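
As a rough illustration of what a Vimium-discretized action space can look like, the sketch below maps a model reply to a keystroke sequence. The JSON action schema and the `action_to_keys` helper are assumptions made up for this example and are not taken from the project's source; the only facts relied on are Vimium's default bindings (`f` shows link hints, `j` and `k` scroll down and up).

```python
import json


def action_to_keys(reply: str) -> list[str]:
    """Translate a JSON action (hypothetical schema) into keystrokes."""
    action = json.loads(reply)
    if "click" in action:
        # Vimium's default `f` binding overlays letter hints on links;
        # typing a hint's letters activates the matching element.
        return ["f", *action["click"].lower()]
    if "scroll" in action:
        # Vimium defaults: `j` scrolls down, `k` scrolls up.
        return ["j"] if action["scroll"] == "down" else ["k"]
    if "type" in action:
        # Plain text is sent one character at a time.
        return list(action["type"])
    raise ValueError(f"unsupported action: {action!r}")


if __name__ == "__main__":
    print(action_to_keys('{"click": "SA"}'))
    print(action_to_keys('{"scroll": "down"}'))
```

Because every action reduces to a short key sequence, the model only has to name a visible hint label or a scroll direction instead of predicting pixel coordinates.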

Get Started

Install

# For Mac users
brew install chromedriver  
# run `brew upgrade chromedriver` if you already have it
pip install git+https://github.com/Jiayi-Pan/GPT-V-on-Web.git

Run

webai
# or, to start at a specific website
webai --start_link "https://www.google.com"

Related Works
