A real-time video AI agent that identifies movies and TV shows from your screen share or camera, then instantly surfaces reviews, ratings, and AI-powered sentiment summaries from RevApp.
Built for the WeMakeDevs Vision Possible: Agent Protocol Hackathon using Stream's Vision Agents SDK.
- 🖥️ Screen Share Recognition: Share your Netflix or Disney+ window and RevVision identifies what you're watching
- 📸 Camera Recognition: Point your webcam at movie posters, DVD covers, or streaming thumbnails
- 🤖 AI Review Summaries: Hear AI-generated summaries of aggregated critic and user reviews
- ⭐ Instant Ratings: IMDb ratings fetched in real time
- 🗣️ Voice Output: The agent speaks review summaries aloud via ElevenLabs TTS
- ⚡ Ultra-Low Latency: Under 30 ms video latency via Stream's edge network
User Video Feed → Stream Edge Network → RevVision Agent (Python)
↓
Gemini Realtime (Vision AI)
↓
RevApp API (Reviews)
↓
Voice + UI Response
| Component | Technology |
|---|---|
| Agent Backend | Python 3.12+, Vision Agents SDK |
| Vision AI | Google Gemini Realtime (2 fps frame analysis) |
| Text-to-Speech | ElevenLabs |
| Movie Data | RevApp (TMDB + review aggregation + AI summaries) |
| Video Network | Stream Edge Network |
| Frontend | React + Vite + Stream Video React SDK |
- Python 3.12+
- uv (Python package manager)
- Node.js 18+
- API keys for: Stream, Google AI Studio, ElevenLabs
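
Before starting the agent, a small check along these lines could confirm that all the keys are present. This is only a sketch: the variable names (`STREAM_API_KEY`, `STREAM_API_SECRET`, `GOOGLE_API_KEY`, `ELEVENLABS_API_KEY`) are assumptions for illustration, so use whichever names the project's `.env.example` actually defines.

```python
from typing import Mapping, Sequence

# Assumed variable names; check agent/.env.example for the real ones.
REQUIRED_KEYS = (
    "STREAM_API_KEY",
    "STREAM_API_SECRET",
    "GOOGLE_API_KEY",
    "ELEVENLABS_API_KEY",
)


def missing_keys(
    env: Mapping[str, str], required: Sequence[str] = REQUIRED_KEYS
) -> list[str]:
    """Return the required variables that are unset or blank in env."""
    return [name for name in required if not env.get(name, "").strip()]
```

Calling `missing_keys(os.environ)` early in startup and exiting when the list is non-empty gives a clearer error than a failed API call later. The function reads whatever mapping it is given, so it assumes the values from `.env` have already been loaded into the environment.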
Clone the repository:

    git clone https://github.com/YOUR_USERNAME/RevVision.git
    cd RevVision

Set up the agent (from the repository root):

    cd agent
    cp .env.example .env # Fill in your API keys
    uv sync # Install dependencies

Set up the frontend (from the repository root):

    cd frontend
    cp .env.example .env # Add your Stream API key
    npm install

Terminal 1: start the agent:

    cd agent
    uv run python main.py --call-type default --call-id revvision

Terminal 2: start the frontend:

    cd frontend
    npm run dev

Open http://localhost:5173, enter the session ID revvision, and start watching! 🎬
RevVision/
├── agent/
│ ├── main.py # Vision Agent entry point
│ ├── revapp_client.py # RevApp API client
│ ├── pyproject.toml # Python dependencies
│ └── .env # API keys (not committed)
├── frontend/
│ ├── src/
│ │ ├── App.jsx # Main app with Stream Video SDK
│ │ ├── components/
│ │ │ └── MovieCard.jsx # Movie info + review card
│ │ └── index.css # Dark theme styles
│ ├── index.html
│ └── .env # Frontend config (not committed)
└── README.md
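
To give a rough idea of what a module like `revapp_client.py` might contain, here is a minimal sketch of the URL-building and response-parsing halves of a client. Everything specific in it is hypothetical: the base URL `https://revapp.example/api`, the `/search?q=` route, and the `title`, `imdb_rating`, and `summary` field names are placeholders, not the real RevApp API.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional
from urllib.parse import urlencode

# Placeholder base URL; the real RevApp endpoint is not documented here.
BASE_URL = "https://revapp.example/api"


@dataclass(frozen=True)
class ReviewResult:
    title: str
    imdb_rating: Optional[float]
    summary: str


def build_search_url(title: str, base_url: str = BASE_URL) -> str:
    """Build a lookup URL for a title (hypothetical /search?q=... route)."""
    return f"{base_url.rstrip('/')}/search?{urlencode({'q': title})}"


def parse_result(payload: Mapping[str, Any]) -> ReviewResult:
    """Convert a decoded JSON object into a ReviewResult (assumed field names)."""
    rating = payload.get("imdb_rating")
    return ReviewResult(
        title=str(payload["title"]),
        imdb_rating=float(rating) if rating is not None else None,
        summary=str(payload.get("summary", "")),
    )
```

Keeping URL construction and parsing as pure functions means they can be tested without network access; the actual HTTP call can then be a thin wrapper around them.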
- You share your screen (or point a camera) via the web app
- Video frames are sent to the RevVision agent at 2 fps via Stream's edge network
- Gemini Realtime analyzes the frames and identifies the movie or show
- The RevApp API is queried for reviews from IMDb, Twitter, Reddit, and Letterboxd
- An AI summary is generated and spoken aloud via ElevenLabs
- A movie card with ratings, reviews, and sentiment appears in the sidebar
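
The hand-off between identification and the spoken summary can be sketched as follows. At 2 fps the same title is likely to be reported on many consecutive frames, so this sketch assumes the agent only looks up and announces a title when it changes; that policy is an assumption, not confirmed behavior. `ReviewAnnouncer`, `fetch_summary`, and `speak` are hypothetical names standing in for the RevApp lookup and the ElevenLabs step.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ReviewAnnouncer:
    # Hypothetical callables standing in for the RevApp lookup and the TTS step.
    fetch_summary: Callable[[str], Optional[str]]
    speak: Callable[[str], None]
    last_title: Optional[str] = None

    def on_title_detected(self, title: str) -> bool:
        """Handle a title reported for one frame; return True if it was announced."""
        cleaned = title.strip()
        key = cleaned.lower()
        if not key or key == self.last_title:
            return False  # empty, or same title as the last announcement
        summary = self.fetch_summary(cleaned)
        if summary is None:
            return False  # lookup failed; a later frame can retry
        self.last_title = key
        self.speak(f"{cleaned}: {summary}")
        return True


# Usage with stand-in functions: repeated detections are announced only once.
spoken: list[str] = []
announcer = ReviewAnnouncer(
    fetch_summary=lambda title: "Widely praised.",
    speak=spoken.append,
)
for detected in ["Inception", "Inception", "inception ", "Arrival"]:
    announcer.on_title_detected(detected)
# spoken now holds one entry for Inception and one for Arrival
```

Comparing a normalized (trimmed, lowercased) title keeps small variations in the model's output from triggering a second announcement for the same film.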
WeMakeDevs Vision Possible: Agent Protocol Hackathon (Feb 22 – March 1, 2026)
Prizes: $2,000 (1st) · $1,500 (2nd) · $500 (Best Blog)
MIT