Create and edit images using your voice.
This is a realtime demo of voice-powered function calling using Cloudflare Workers, Replicate, and the OpenAI Realtime API.
It generates images using Flux Schnell and edits them using Flux Kontext Pro.
Created from this guide and template: https://replicate.com/docs/guides/openai-realtime?utm_campaign=kontext-realtime&utm_source=project
Here's what you'll need to build this project:
- An OpenAI account. No special plan is required to use the Realtime API Beta.
- A Replicate account.
- Node.js 20 or later.
- Git for cloning the project from GitHub.
- Optional: A Cloudflare account if you want to deploy the app to the web. You can sign up and run workers for free.
- Create a Replicate API token at replicate.com/account/api-tokens
- Create an OpenAI API key at platform.openai.com/api-keys
Copy .dev.vars.example to .dev.vars:
cp .dev.vars.example .dev.vars
Edit .dev.vars and add your OpenAI API key and Replicate API token:
OPENAI_API_KEY=...
REPLICATE_API_TOKEN=...
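Before starting the server, it can help to confirm that the placeholders in .dev.vars were actually replaced. The following is a minimal sketch, not part of the project: `parseDevVars` and `missingKeys` are hypothetical helper names, and the sketch assumes the file uses plain KEY=value lines like the two shown above.

```typescript
// Keys this README asks you to put in .dev.vars (pass a shorter list if you need fewer).
const DEFAULT_KEYS = ["OPENAI_API_KEY", "REPLICATE_API_TOKEN"];

// Parse KEY=value lines; blank lines and lines starting with # are skipped.
function parseDevVars(text: string): Map<string, string> {
  const vars = new Map<string, string>();
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // not a KEY=value line
    const key = line.slice(0, eq).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const value = line.slice(eq + 1).trim().replace(/^(["'])(.*)\1$/, "$2");
    vars.set(key, value);
  }
  return vars;
}

// Return the keys that are absent, empty, or still set to the "..." placeholder.
function missingKeys(text: string, keys: string[] = DEFAULT_KEYS): string[] {
  const vars = parseDevVars(text);
  return keys.filter((key) => {
    const value = vars.get(key);
    return value === undefined || value === "" || value === "...";
  });
}
```

To use it, read .dev.vars as a string (for example with Node's `fs.readFileSync`), pass it to `missingKeys`, and fill in any key it returns.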
Install dependencies:
npm install
Run the local server:
npm run dev
Upload your secrets:
npx wrangler secret put OPENAI_API_KEY
npx wrangler secret put REPLICATE_API_TOKEN
When you first load the app, a modal dialog prompts you to enter your Replicate API token. The token is stored in your browser's localStorage and used for all Replicate API requests. You can create a token at replicate.com/account/api-tokens
You no longer need to set the REPLICATE_API_TOKEN environment variable or use wrangler secret put for this project.
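The browser-side token handling described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the app's actual code: the storage key `replicateApiToken` and the helper names are hypothetical, and the `TokenStore` interface is a small subset of the Web Storage API so the helpers can also run outside a browser. The `Authorization: Bearer` header is the scheme Replicate's HTTP API accepts.

```typescript
// Minimal subset of the Web Storage API; window.localStorage satisfies it.
interface TokenStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical storage key; the app's real key name is not given in this README.
const TOKEN_KEY = "replicateApiToken";

// Save the token the user typed into the modal, without surrounding whitespace.
function saveToken(store: TokenStore, token: string): void {
  store.setItem(TOKEN_KEY, token.trim());
}

// Return the stored token, or null if none has been saved yet.
function loadToken(store: TokenStore): string | null {
  const token = store.getItem(TOKEN_KEY);
  return token !== null && token !== "" ? token : null;
}

// Build headers for a Replicate API request, or return null to signal
// that the token prompt should be shown first.
function replicateHeaders(store: TokenStore): Record<string, string> | null {
  const token = loadToken(store);
  if (token === null) return null;
  return {
    Authorization: `Bearer ${token}`,
    "Content-Type": "application/json",
  };
}
```

In a browser you would pass `window.localStorage` as the store. Keep in mind that anything in localStorage is readable by scripts running on the same origin, so this approach suits a personal demo more than a shared deployment.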
npm run deploy