Cross-platform Rust CLI that captures images from your webcam and asks the OpenAI gpt-4.1 model to describe what it sees. The tool discovers connected cameras, lets you pick one, and sends a new vision request each time you press v.
- An OpenAI API key.
- An active webcam or HDMI capture card.
cargo build --release

Provide your OpenAI credentials via one of the following (in priority order):
- CLI flag -k/--key
- Environment variable OPENAI_API_KEY
- .env file in the project directory:

      OPENAI_API_KEY=sk-...
      PROMPT=Describe the image with details. Read any text that you see.

The prompt defaults to "Describe the image with details. Read any text that you see." Override it with -p/--prompt or with a PROMPT entry in your environment or .env file.
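
The priority order above can be illustrated with a minimal sketch. It is not taken from the project's source: `parse_dotenv` and `resolve_api_key` are hypothetical helper names, and the simple `KEY=VALUE` parser assumes unquoted values and `#` comment lines.

```rust
use std::collections::HashMap;

/// Parse KEY=VALUE lines from .env-style text, skipping blank lines and `#` comments.
/// Assumption: values are unquoted and contain no inline comments.
fn parse_dotenv(text: &str) -> HashMap<String, String> {
    text.lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .filter_map(|line| line.split_once('='))
        .map(|(key, value)| (key.trim().to_string(), value.trim().to_string()))
        .collect()
}

/// Return the first non-empty key in priority order:
/// CLI flag, then environment variable, then .env file entry.
fn resolve_api_key(
    cli_key: Option<&str>,
    env_key: Option<&str>,
    dotenv: &HashMap<String, String>,
) -> Option<String> {
    let file_key = dotenv.get("OPENAI_API_KEY").map(String::as_str);
    [cli_key, env_key, file_key]
        .iter()
        .flatten()
        .find(|key| !key.trim().is_empty())
        .map(|key| key.to_string())
}
```

In a real binary, `cli_key` would come from the -k/--key flag and `env_key` from `std::env::var("OPENAI_API_KEY").ok()`. The same lookup order could plausibly be reused for the prompt, with the documented default as the final fallback.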
Launch the CLI with cargo (or the compiled binary):
cargo run --release -- [OPTIONS]

- -k, --key <API_KEY>: supply the OpenAI API key
- -p, --prompt <PROMPT>: custom prompt text sent with each vision request
- -h, --help: display help information

- The app enumerates available webcams and prompts you to choose one (if multiple).
- It verifies that an OpenAI API key is available; otherwise it exits.
- After initialization, the terminal switches to raw mode and waits for keystrokes:
  - Press v to capture a frame, send it to OpenAI, and print the description.
  - Press x, Esc, or Ctrl+C to exit.
- The loop continues, letting you request multiple captures in a single session.
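
The keystroke loop described above can be sketched as a small dispatch function. This is an illustrative sketch only: the `Key` and `Action` enums and the `action_for` and `run_session` functions are hypothetical names, the real tool would read events from a terminal library in raw mode (where Ctrl+C typically arrives as an ordinary key event rather than a signal), and lowercase-only matching is an assumption.

```rust
/// Hypothetical key type; a terminal library would supply its own event enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Key {
    Char(char),
    Esc,
    CtrlC,
}

/// What the loop should do in response to a key press.
#[derive(Debug, PartialEq, Eq)]
enum Action {
    Capture,
    Exit,
    Ignore,
}

/// Map a key to an action: v captures; x, Esc, and Ctrl+C exit; anything else is ignored.
fn action_for(key: Key) -> Action {
    match key {
        Key::Char('v') => Action::Capture,
        Key::Char('x') | Key::Esc | Key::CtrlC => Action::Exit,
        _ => Action::Ignore,
    }
}

/// Process keys in order until an exit key is seen.
/// Calls `on_capture` once per capture request and returns the number of captures.
fn run_session(keys: &[Key], mut on_capture: impl FnMut()) -> usize {
    let mut captures = 0;
    for &key in keys {
        match action_for(key) {
            Action::Capture => {
                on_capture();
                captures += 1;
            }
            Action::Exit => break,
            Action::Ignore => {}
        }
    }
    captures
}
```

Keeping the key-to-action mapping separate from terminal I/O makes the loop easy to test without a camera or a live terminal.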
- Captured frames are resized so the longest side is at most 1024 px, preserving aspect ratio.
- Frames are encoded as JPEG (quality 85) before upload, which keeps detail while lowering latency and bandwidth usage.
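
The resize rule can be expressed as a short calculation. The sketch below covers only the target dimensions, not the pixel resampling or JPEG encoding. `fit_within` is a hypothetical helper name, and leaving smaller frames unscaled is an assumption drawn from the "at most 1024 px" wording.

```rust
/// Compute dimensions whose longest side is at most `max_side`, preserving aspect ratio.
/// Assumption: frames that already fit are returned unchanged (no upscaling).
fn fit_within(width: u32, height: u32, max_side: u32) -> (u32, u32) {
    let longest = width.max(height);
    if longest <= max_side {
        return (width, height);
    }
    // Scale both sides by the same factor so the longest side lands on `max_side`.
    let scale = max_side as f64 / longest as f64;
    // Round to the nearest pixel and never let a side collapse to zero.
    let scaled = |side: u32| ((side as f64 * scale).round() as u32).max(1);
    (scaled(width), scaled(height))
}
```

For example, under this rule a 1920x1080 frame maps to 1024x576, while a 640x480 frame is left as is.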
- No webcams listed: ensure your device is connected and accessible to the operating system. Some platforms require additional permissions (e.g., Windows Camera privacy settings, macOS camera access).
- API errors: check that your key has access to gpt-4.1 and that you have sufficient quota.
- Build issues: install the latest stable Rust toolchain and a linker for your platform. On Linux, extra packages (e.g., libv4l-dev) may be required for camera access.