# Voice AI Agent

A modern voice-enabled AI agent built with Next.js that lets you have natural conversations with AI using speech recognition and text-to-speech.
## Features

- **Voice Interaction**: Speak to the AI and receive audio responses
- **Real-time Speech Recognition**: Convert speech to text using browser APIs
- **Text-to-Speech**: AI responses are played back as audio
- **Tool Integration**: Powered by Composio for advanced AI capabilities
- **Modern UI**: Clean interface built with Tailwind CSS and Radix UI
- **Responsive Design**: Works seamlessly across desktop and mobile devices
## Tech Stack

- **Framework**: Next.js 15.3.4 with the App Router
- **Language**: TypeScript
- **Styling**: Tailwind CSS
- **UI Components**: Radix UI + shadcn/ui
- **AI**: OpenAI GPT models via LangChain
- **Tools**: Composio Core for AI agent capabilities
- **Speech**: Web Speech API for recognition and synthesis
- **State Management**: Zustand
- **Animations**: Framer Motion
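As an illustration of the Web Speech API item above, here is a minimal TypeScript sketch of feature detection and starting recognition. This is a hypothetical example, not the repository's actual hook: the names `detectSpeechSupport` and `startRecognition` are invented for illustration, and Chromium-based browsers expose the recognition constructor under the `webkit` prefix.

```typescript
// Illustrative sketch only — not the app's actual speech hook.
type SpeechSupport = { recognition: boolean; synthesis: boolean };

// Feature-detect the (possibly vendor-prefixed) Web Speech APIs.
export function detectSpeechSupport(g: any = globalThis): SpeechSupport {
  return {
    recognition: Boolean(g.SpeechRecognition ?? g.webkitSpeechRecognition),
    synthesis: Boolean(g.speechSynthesis),
  };
}

// Start continuous recognition and forward the accumulated transcript.
export function startRecognition(g: any, onResult: (text: string) => void) {
  const Ctor = g.SpeechRecognition ?? g.webkitSpeechRecognition;
  if (!Ctor) throw new Error("Speech recognition is not supported here");
  const rec = new Ctor();
  rec.continuous = true;      // keep listening across pauses
  rec.interimResults = true;  // stream partial transcripts
  rec.onresult = (event: any) => {
    // Join all result alternatives' top transcripts into one string.
    const text = Array.from(event.results)
      .map((r: any) => r[0].transcript)
      .join("");
    onResult(text);
  };
  rec.start();
  return rec;
}
```

The global object is passed as a parameter so the wiring can be exercised outside a browser.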
## Project Structure

```
voice-ai-agent/
├── app/                          # Next.js app directory
│   ├── api/                      # API routes
│   │   ├── chat/                 # Chat endpoint
│   │   └── tts/                  # Text-to-speech endpoint
│   ├── globals.css               # Global styles
│   ├── layout.tsx                # Root layout
│   └── page.tsx                  # Home page
├── components/                   # React components
│   ├── ui/                       # Base UI components
│   ├── chat-header.tsx           # Chat header component
│   ├── chat-input.tsx            # Message input component
│   ├── chat-interface.tsx        # Main chat interface
│   ├── chat-messages.tsx         # Messages display
│   └── settings-modal.tsx        # Settings modal
├── hooks/                        # Custom React hooks
│   ├── use-audio.ts              # Audio playback logic
│   ├── use-chat.ts               # Chat state management
│   ├── use-mounted.ts            # Mount detection
│   └── use-speech-recognition.ts # Speech recognition
├── lib/                          # Utility libraries
│   ├── validators/               # Input validation
│   ├── alias-store.ts            # State management
│   ├── constants.ts              # App constants
│   ├── error-handler.ts          # Error handling
│   └── utils.ts                  # Helper functions
└── public/                       # Static assets
```
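Since the tree lists a dedicated `lib/error-handler.ts`, a centralized handler plausibly normalizes thrown values into one consistent shape before they reach API responses or the UI. The sketch below is hypothetical — the `AppError` shape and `toAppError` name are invented for illustration and may differ from the repository's actual module:

```typescript
// Hypothetical centralized error normalizer (illustrative only).
export interface AppError {
  message: string; // safe, user-facing text
  status: number;  // HTTP status to return from API routes
}

// Convert any thrown value (Error, string, or unknown) into an AppError.
export function toAppError(err: unknown): AppError {
  if (err instanceof Error) {
    return { message: err.message, status: 500 };
  }
  if (typeof err === "string") {
    return { message: err, status: 500 };
  }
  return { message: "Something went wrong", status: 500 };
}
```

Routing every `catch` through one function like this keeps API error payloads uniform across the `chat` and `tts` endpoints.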
## Prerequisites

- Node.js 18+
- npm or yarn
- An OpenAI API key
- A Composio API key (optional)
## Getting Started

1. Clone the repository:

   ```bash
   git clone <repository-url>
   cd voice-ai-agent
   ```

2. Install dependencies:

   ```bash
   npm install
   ```

3. Set up environment variables:

   ```bash
   cp .env.example .env.local
   ```

   Add your API keys to `.env.local`:

   ```env
   OPENAI_API_KEY=your_openai_api_key_here
   COMPOSIO_API_KEY=your_composio_api_key_here
   ```

4. Run the development server:

   ```bash
   npm run dev
   ```

5. Open [http://localhost:3000](http://localhost:3000) in your browser.
## Usage

- **Text Chat**: Type messages in the input field and press Enter
- **Voice Chat**: Click the microphone button to start voice input
- **Audio Responses**: Toggle audio playback in the settings
- **Settings**: Access configuration options via the settings modal
## Available Scripts

```bash
npm run dev    # Start the development server with Turbopack
npm run build  # Build for production
npm run start  # Start the production server
npm run lint   # Run ESLint
```
## Architecture

- **`ChatInterface`**: Main component orchestrating the chat experience
- **`useChat`**: Manages chat state and API communication
- **`useAudio`**: Handles text-to-speech playback
- **`useSpeechRecognition`**: Manages voice input with debouncing
- **API Routes**: Handle chat processing and TTS generation
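The state transitions that `useChat` manages can be sketched as a small pure reducer. This is an illustrative model only: the action names and the `ChatState` shape are assumptions for the sketch, not the hook's actual implementation.

```typescript
// Illustrative model of chat state transitions (not the real useChat hook).
export type Role = "user" | "assistant";
export interface Message { role: Role; content: string }
export interface ChatState { messages: Message[]; isLoading: boolean }

export type ChatAction =
  | { type: "send"; content: string }    // user submits a message
  | { type: "receive"; content: string } // assistant reply arrives
  | { type: "error" };                   // request failed

export function chatReducer(state: ChatState, action: ChatAction): ChatState {
  switch (action.type) {
    case "send":
      // Append the user message and mark the request as in flight.
      return {
        messages: [...state.messages, { role: "user", content: action.content }],
        isLoading: true,
      };
    case "receive":
      // Append the assistant reply and clear the loading flag.
      return {
        messages: [...state.messages, { role: "assistant", content: action.content }],
        isLoading: false,
      };
    case "error":
      // Keep existing messages; just stop the loading indicator.
      return { ...state, isLoading: false };
  }
}
```

Modeling the hook as a pure reducer makes the send/receive/error flow easy to unit-test without rendering components.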
## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test thoroughly
5. Submit a pull request
## License

This project is licensed under the Apache License.