
Layercode Voice Agent Example

This open source project demonstrates how to build a real-time voice agent using Layercode Voice Pipelines, with a Next.js frontend and backend.

Read the companion guides in the Layercode docs (docs.layercode.com).

Features

  • Browser-based Voice Interaction: Users can speak to the agent directly from their browser.
  • LLM Integration: User queries are sent to an LLM using the Vercel AI SDK and Gemini 2.0 Flash.
  • Streaming Responses: LLM responses are streamed back, and Layercode handles the conversion to speech and playback to the user.

How It Works

  1. Frontend:
    Uses @layercode/react-sdk to connect the browser's microphone and speaker to a Layercode voice pipeline. Connections must be authenticated with a client session key, generated by the /api/authorize route, which calls the Layercode API to create a client session. The React SDK calls this route for you (see the first sketch after this list).

  2. Transcription & Webhook:
    Layercode transcribes user speech. For each complete message, it sends a webhook containing the transcribed text to the /api/agent route.

  3. Backend Processing:
    The Next.js API route uses @layercode/node-server-sdk to handle the webhook. The transcribed text is sent to the LLM (Gemini 2.0 Flash via the Vercel AI SDK) to generate a response.

  4. Streaming & Speech Synthesis:
    As soon as the LLM starts generating a response, the backend streams the output back as SSE messages to Layercode, which converts the text to speech and delivers it to the frontend for playback in real time (see the second sketch after this list).
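Here is a minimal sketch of the authorization route from step 1, assuming a Next.js App Router handler. The Layercode endpoint URL and payload shape shown are illustrative assumptions, not the confirmed API contract; the route shipped in this repo is the reference.

```ts
// app/api/authorize/route.ts -- sketch only; the Layercode endpoint URL and
// payload shape are assumptions, check the Layercode docs for the real contract.
export async function POST(request: Request) {
  // The React SDK POSTs here before connecting; forward its body to Layercode.
  const body = await request.json();

  // Exchange the server-side LAYERCODE_API_KEY for a short-lived client session key.
  const response = await fetch('https://api.layercode.com/v1/pipelines/authorize_session', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.LAYERCODE_API_KEY}`,
    },
    body: JSON.stringify(body),
  });

  if (!response.ok) {
    return new Response(await response.text(), { status: response.status });
  }
  // The SDK uses the returned client session key to open the voice connection.
  return Response.json(await response.json());
}
```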

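And a hand-rolled sketch of steps 2 to 4. The repo itself uses @layercode/node-server-sdk for this; here the SSE stream is written by hand to show the moving parts. The webhook field name (text) and the SSE event types (response.tts, response.end) are assumptions for illustration, and webhook signature verification is omitted.

```ts
// app/api/agent/route.ts -- sketch; the `text` field and SSE event shapes
// are illustrative assumptions, not the confirmed Layercode contract.
import { google } from '@ai-sdk/google';
import { streamText } from 'ai';

export async function POST(request: Request) {
  // NOTE: the real route verifies the webhook signature against
  // LAYERCODE_WEBHOOK_SECRET before processing; omitted here for brevity.
  const { text } = await request.json(); // transcribed user speech from Layercode

  // Ask Gemini 2.0 Flash for a reply, streamed token by token.
  const result = await streamText({
    model: google('gemini-2.0-flash'),
    prompt: text,
  });

  // Relay the tokens back to Layercode as SSE; Layercode converts each chunk
  // to speech and plays it in the user's browser as it arrives.
  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    async start(controller) {
      for await (const delta of result.textStream) {
        controller.enqueue(
          encoder.encode(`data: ${JSON.stringify({ type: 'response.tts', content: delta })}\n\n`)
        );
      }
      controller.enqueue(encoder.encode(`data: ${JSON.stringify({ type: 'response.end' })}\n\n`));
      controller.close();
    },
  });

  return new Response(stream, {
    headers: { 'Content-Type': 'text/event-stream' },
  });
}
```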
Getting Started

Note: Layercode needs to send a webhook to your backend to generate agent responses. So if you're running this locally, you'll need to set up a tunnel to your localhost. See step 5 onwards below.

  1. Clone this repository.
  2. Install dependencies with npm install.
  3. Edit your .env environment variables (see the sample .env after this list). You'll need to add:
    • GOOGLE_GENERATIVE_AI_API_KEY - Your Google AI API key
    • LAYERCODE_API_KEY - Your Layercode API key, found in the Layercode dashboard settings
    • LAYERCODE_WEBHOOK_SECRET - Your Layercode pipeline's webhook secret, found in the Layercode dashboard (go to your pipeline, click Edit in the Your Backend box, and copy the webhook secret shown)
    • NEXT_PUBLIC_LAYERCODE_PIPELINE_ID - The Layercode pipeline ID for your voice agent, found in the Layercode dashboard
  4. Run the development server with npm run dev.
  5. If running locally, set up a tunnel (we recommend cloudflared, which is free for development) to your localhost so the Layercode webhook can reach your backend (see the one-line command after this list). Follow our tunneling guide here: https://docs.layercode.com/tunnelling
  6. If you skipped the tunneling guide and are instead deploying this example to the internet, remember to set the Webhook URL in the Layercode dashboard (click Edit in the Your Backend box) to your publicly accessible backend URL.
  7. Now open http://localhost:3000 in your browser and start speaking to your voice agent!
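For reference, a filled-in .env looks like this (all values are placeholders):

```
GOOGLE_GENERATIVE_AI_API_KEY=your-google-ai-api-key
LAYERCODE_API_KEY=your-layercode-api-key
LAYERCODE_WEBHOOK_SECRET=your-webhook-secret
NEXT_PUBLIC_LAYERCODE_PIPELINE_ID=your-pipeline-id
```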

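If you use cloudflared, a quick tunnel is a single command; it prints a public https URL that you then set as your pipeline's Webhook URL in the Layercode dashboard:

```
cloudflared tunnel --url http://localhost:3000
```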
Extra features

Push-to-talk mode

Layercode supports multiple turn-taking modes, configured in the Transcriber settings of your voice pipeline in the Layercode dashboard. This example uses automatic turn-taking, the default mode for voice pipelines.

Push-to-talk is an alternative mode, where the user must hold the button down to speak. To enable this, go to the Transcriber settings of your voice pipeline in the Layercode dashboard and set the Turn Taking Mode to "Push-to-talk".

Then edit app/page.tsx and swap the VoiceAgent component for the VoiceAgentPushToTalk component. Now the user's speech will only be transcribed while the button (or the spacebar) is held down.
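The swap in app/page.tsx is a one-line change; the import path below is an assumption about the repo's component layout, so adjust it to match:

```tsx
// app/page.tsx -- sketch; adjust the import path to match this repo's layout
import VoiceAgentPushToTalk from './components/VoiceAgentPushToTalk';

export default function Home() {
  // Render the push-to-talk variant in place of <VoiceAgent />.
  return <VoiceAgentPushToTalk />;
}
```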

License

MIT
