A simple voice controlled chat GPT style node application.

AGuski/look-ma-no-hands-gpt

Look Ma No Hands GPT

This is a simple prototype of a Node.js-based speech-to-text -> GPT -> text-to-speech pipeline. It uses the Google Cloud Speech-to-Text and Google Cloud Text-to-Speech APIs to convert speech to text and text to speech, respectively. The OpenAI GPT-3.5 model generates the reply from the transcribed speech.
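
The flow described above can be pictured as three stages chained together for each spoken turn. The following is only a minimal sketch under stated assumptions: the `Stages` interface, the `runTurn` function, and the idea of keeping a running message history are hypothetical illustrations, not the actual structure of this repository's source files.

```typescript
// One chat message in the role/content shape used by OpenAI chat models.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Hypothetical stage signatures; the real modules in this project may differ.
interface Stages {
  transcribe(audio: Uint8Array): Promise<string>;     // speech-to-text
  complete(messages: ChatMessage[]): Promise<string>; // GPT reply
  speak(text: string): Promise<void>;                 // text-to-speech
}

// Runs one conversational turn: recorded audio in, spoken answer out.
// `history` is mutated so that later turns keep the conversation context
// (an assumption made for this sketch).
async function runTurn(
  audio: Uint8Array,
  stages: Stages,
  history: ChatMessage[],
): Promise<string> {
  // 1. Convert the recording to text and add it to the conversation.
  const transcript = await stages.transcribe(audio);
  history.push({ role: "user", content: transcript });

  // 2. Ask the language model for a reply to the conversation so far.
  const answer = await stages.complete(history);
  history.push({ role: "assistant", content: answer });

  // 3. Read the reply aloud.
  await stages.speak(answer);
  return answer;
}
```

Because each stage runs only after the previous one has finished, the latency of the three services adds up. Passing the stages in as an object also means a different text-to-speech backend could be supplied without touching `runTurn`.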

Requirements

Setup

  1. Clone this repository
  2. Create a .env file in the root directory of the project and add the following environment variables:
    • GOOGLE_APPLICATION_CREDENTIALS: Path to your Google Cloud Platform service account key file
    • OPENAI_API_KEY: Your OpenAI API key
    • INIT_PROMPT: The initial prompt to use for the GPT model. Optional; a default prompt is used if it is not set.
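
To illustrate how these variables behave (two required, one optional with a fallback), here is a minimal sketch of a startup check. The names `AppConfig`, `readConfig`, `requireVar`, and the `FALLBACK_PROMPT` text are hypothetical and chosen for this example; the project's real default prompt and loading code are not shown here.

```typescript
// Settings read from the environment, mirroring the variables listed above.
interface AppConfig {
  googleCredentialsPath: string;
  openAiApiKey: string;
  initPrompt: string;
}

// Hypothetical fallback text; the project's actual default prompt may differ.
const FALLBACK_PROMPT = "You are a helpful voice assistant.";

type Env = Record<string, string | undefined>;

// Returns the value of a required variable or throws a descriptive error.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Builds the configuration, failing early if a required variable is missing.
function readConfig(env: Env): AppConfig {
  return {
    googleCredentialsPath: requireVar(env, "GOOGLE_APPLICATION_CREDENTIALS"),
    openAiApiKey: requireVar(env, "OPENAI_API_KEY"),
    // INIT_PROMPT is optional: fall back only when the variable is absent.
    initPrompt: env["INIT_PROMPT"] ?? FALLBACK_PROMPT,
  };
}
```

In a Node.js process this could be called as `readConfig(process.env)` once the .env file has been loaded, for example by a loader such as the dotenv package (whether this project uses it is an assumption).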

Text-to-speech alternative Eleven Labs

You can use the Eleven Labs text-to-speech service instead of Google Cloud Text-to-Speech. To do so, create an account on the Eleven Labs website, get an API key, and add the following environment variable to your .env file:

  • ELEVENLABS_API_KEY: Your Eleven Labs API key

Then swap the import in the server.ts file from text-to-speech.ts to text-to-speech-elevenlabs.ts.

Usage

  1. Run npm install to install the dependencies
  2. Run npm start to start the server
  3. Wait until it says "Press the space bar to start recording."
  4. Press the space bar to start recording and wait for the answer. Processing the audio and generating the answer takes a few seconds; it is relatively slow because it has not been optimized with streaming APIs.

Disclaimer

This is a prototype and is not intended for production use. It is not optimized for performance. As with everything where you use your own API keys, you are responsible for the costs incurred by using this software.
