Speak your intent. The agent transcribes your voice, plans with GPT‑5, executes tools to update a live HTML preview, and replies by voice.
- Modular pipeline: STT → Agent (with tools) → TTS
- Progressive UX: chat updates after each stage
- Composio tool
WRITE_FULL_HTML_PREVIEW
to writepublic/preview.html
- Composio Notion tools:
NOTION_FETCH_DATA
,NOTION_FETCH_BLOCK_CONTENTS
- Next.js App Router (Node runtime)
- Install
npm install
- Environment
Create
.env.local
:
OPENAI_API_KEY=your_openai_api_key
COMPOSIO_API_KEY=your_composio_api_key
- Run
npm run dev
Open http://localhost:3000 and click the mic.
- POST
/api/stt
- Request: multipart form with
audio
(e.g.,audio/webm
) - Response:
{ transcript: string }
- Request: multipart form with
- POST
/api/agent
- Request: JSON
{ text: string }
- Response:
{ text: string }
- Request: JSON
- POST
/api/tts
- Request: JSON
{ text: string }
- Response:
{ audioBase64: string, mimeType: string }
- Request: JSON
- POST
/api/voice-chat
(orchestrator)- Request: multipart form with
audio
- Response:
{ transcript, text, audioBase64, mimeType }
- Request: multipart form with
app/api/_lib/agentState.ts
: minimal global message state + OpenAI message adapterapp/api/_lib/openaiClient.ts
: OpenAI client factoryapp/api/_lib/composioTools.ts
: Composio setup andensureToolsRegistered()
app/api/_lib/stt.ts
: Whisper transcriptionapp/api/_lib/agent.ts
: GPT‑5 reasoning, tool calls, summary + voice‑friendly rewriteapp/api/_lib/tts.ts
: Text‑to‑speech with length‑safe condensation
Notes
- Global state is in‑memory; use persistent storage for multi‑user production.
- Tool list is intentionally small to keep behavior focused and safe.
STT
curl -X POST http://localhost:3000/api/stt \
-F "audio=@input.webm;type=audio/webm"
Agent
curl -X POST http://localhost:3000/api/agent \
-H "Content-Type: application/json" \
-d '{"text":"Create a landing page with a hero and CTA"}'
TTS
curl -X POST http://localhost:3000/api/tts \
-H "Content-Type: application/json" \
-d '{"text":"Your page is ready."}'
- GitHub: Composio
- Website: composio.dev
-
Create API key
- Go to the Composio platform dashboard and create an API key.
- Add it to
.env.local
asCOMPOSIO_API_KEY
(see Environment section above).
-
Connect Notion
- In the Composio platform, Add Notion to your project.
- Complete the Notion OAuth flow. Share the pages/databases you want the agent to read with the Composio integration so it has access.
- Ensure this Notion connection is available as the default connection (aka default user/connection). The agent will use the first connected account by default;
-
Tools used
NOTION_FETCH_DATA
: fetches metadata/data for pages or databases.NOTION_FETCH_BLOCK_CONTENTS
: fetches the block contents of a page.