An AI voice assistant that transcribes your voice into text, sends that text to GPT-4o, and synthesizes the response into audio.

shlomoc/ai-voice-chatbot

 
 


AI Voice Assistant (Flask + ReactJS)

This application uses OpenAI's API to transcribe the user's audio input, generate a text response, and synthesize that response back into audio. It has a ReactJS frontend for user interaction and a Flask backend that handles audio processing and API communication.

Demo of the App

AI.mp4

Prerequisites

Before you begin, ensure you have met the following requirements:

  • Python 3.8 or higher
  • Node.js and npm
  • An OpenAI API key

Installation

Clone this repository to your local machine:

git clone https://github.com/krisograbek/ai-voice-minimal
cd ai-voice-minimal

Setting up the Backend

Create a virtual environment in the backend directory:

python -m venv venv

Activate the virtual environment:

  • On Windows: venv\Scripts\activate
  • On macOS/Linux: source venv/bin/activate

Install the required Python dependencies:

pip install -r requirements.txt

Create a .env file based on the provided .env.example, and set OPENAI_API_KEY to your actual OpenAI API key.
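
As a quick sanity check before starting the server, the sketch below reads a .env file and reports whether OPENAI_API_KEY has a non-empty value. It uses only the Python standard library. The helper names read_env_file and has_api_key are hypothetical and not part of this repository, and the parser only handles simple KEY=VALUE lines, so treat it as an illustration rather than how the backend actually loads its configuration.

```python
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines from a .env file into a dict.

    Hypothetical helper for illustration: blank lines, comment lines
    starting with '#', and lines without '=' are skipped, and
    surrounding quotes are stripped from values.
    """
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def has_api_key(path=".env"):
    """Return True if OPENAI_API_KEY is present and non-empty in the file."""
    env_path = Path(path)
    if not env_path.is_file():
        return False
    return bool(read_env_file(env_path).get("OPENAI_API_KEY"))


if __name__ == "__main__":
    print("OPENAI_API_KEY set:", has_api_key())
```

Running the script from the directory that contains .env prints whether the key was found. It does not verify that the key is valid.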

Setting up the Frontend

Navigate to the frontend directory:

cd client

Install the required npm packages:

npm install

Running the Application

Starting the Backend Server

From the project root directory, create the directories to store audio files:

mkdir -p static/audio
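
The mkdir -p form is a Unix shell idiom and may not behave the same way in the Windows Command Prompt. A portable alternative, sketched here with Python's standard library, creates the same static/audio directory. The function name ensure_audio_dir is hypothetical and not part of this repository.

```python
from pathlib import Path


def ensure_audio_dir(root="."):
    """Create <root>/static/audio if it is missing and return its path."""
    audio_dir = Path(root) / "static" / "audio"
    # parents=True creates 'static' as needed; exist_ok=True makes reruns safe.
    audio_dir.mkdir(parents=True, exist_ok=True)
    return audio_dir


if __name__ == "__main__":
    print(ensure_audio_dir())
```

Run it from the project root so that the directory lands next to app.py.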

Ensure your virtual environment is activated, then run:

python app.py

This will start the Flask server on http://localhost:5000/.
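
If the frontend cannot reach the backend, a simple first check is whether anything is listening on port 5000. The sketch below uses only the standard library to attempt a TCP connection. The function name is_port_open is hypothetical, and a successful connection only shows that some process accepts connections on that port, not that it is this Flask app.

```python
import socket


def is_port_open(host="localhost", port=5000, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection tries each resolved address (IPv4 and IPv6).
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        return False


if __name__ == "__main__":
    print("Backend reachable:", is_port_open())
```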

Starting the Frontend Application

Open a new terminal window, navigate to the frontend directory (client), and run:

npm start

This will start the React application and open it in your default web browser at http://localhost:3000/.

Using the Application

Once both servers are running, go to http://localhost:3000/ in your browser. Click on the microphone icon to start recording your message. Click the stop icon to end the recording. The application will then transcribe your message, generate a response, and synthesize this response into audio. Listen to the synthesized response through the audio player that appears.
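
The three stages described above (transcribe, generate, synthesize) can be sketched as a small pipeline. This is an illustration of the flow only, not the repository's backend code: the names VoiceTurn and run_voice_turn are hypothetical, and the three callables are stand-ins for whatever speech-to-text, chat, and text-to-speech calls the backend actually makes to OpenAI's API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurn:
    """Result of one user recording passing through the pipeline."""
    transcript: str
    reply_text: str
    reply_audio: bytes


def run_voice_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    generate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> VoiceTurn:
    """Run one recording through the three stages in order.

    The callables are injected so the flow can be exercised without
    network access; in a real backend they would wrap the API calls.
    """
    transcript = transcribe(audio)        # speech -> text
    reply_text = generate(transcript)     # text -> model response
    reply_audio = synthesize(reply_text)  # response -> speech
    return VoiceTurn(transcript, reply_text, reply_audio)


if __name__ == "__main__":
    # Stand-in stages for a dry run; they do not call any external service.
    turn = run_voice_turn(
        b"fake-audio",
        transcribe=lambda audio: "hello",
        generate=lambda text: f"You said: {text}",
        synthesize=lambda text: text.encode("utf-8"),
    )
    print(turn.transcript, "|", turn.reply_text)
```

Passing the stages in as functions keeps the ordering explicit and makes each stage easy to swap or test in isolation.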
