A Python voice assistant in the style of Alexa, built with OpenAI, Replicate, and gTTS, that aims to be a better version of Alexa.
- Keeps a text-file transcript of everything it hears and every reply it gives.
- Transcribes audio to text with OpenAI Whisper and feeds the text to OpenAI for a response.
- Detects the word "image" and returns an image generated with Replicate and Stable Diffusion.
- When an image is created, the reply comes from a model that writes a text description of the generated image, so the assistant actually knows what is in the image it sends back to you.
- Remembers as many lines of the conversation as you want (edit this in pylexa.py).
- The recording length can also be changed in pylexa.py.
- Recommended use is in Visual Studio with a split window: one pane open to image.png and one to trans.txt, so you can watch everything as it happens.
- Get OpenAI and Replicate API keys.
- Clone the repository
- pip install -r requirements.txt
- The OpenAI key goes in pylexa.py where it says 'YOUR_API_KEY'.
- Load your Replicate API key as an environment variable, exactly like so in the terminal: export REPLICATE_API_TOKEN='YOUR_API_KEY'
- Now just run python3 pylexa.py and be ready to record your audio =]
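
To make a few of the points above concrete, here is a minimal sketch of how the "image" keyword check, the limited conversation memory, and the Replicate token check could look. This is not the code in pylexa.py: the names `wants_image`, `ConversationMemory`, `MAX_HISTORY_LINES`, and `replicate_token_is_set` are hypothetical, and the whole-word matching is an assumption about how the keyword is detected. It makes no API calls, so it runs without any keys.

```python
import os
from collections import deque

# Hypothetical setting: how many conversation lines to remember.
# The real value lives in pylexa.py and may have a different name.
MAX_HISTORY_LINES = 6


def wants_image(transcript):
    """Return True if the transcript contains the whole word 'image'.

    Assumption: matching is case-insensitive and ignores surrounding
    punctuation, so "Image!" matches but "imagine" does not.
    """
    words = (w.strip(".,!?;:'\"") for w in transcript.lower().split())
    return "image" in words


class ConversationMemory:
    """Keep only the most recent lines of the conversation."""

    def __init__(self, max_lines=MAX_HISTORY_LINES):
        # A deque with maxlen silently drops the oldest line when full.
        self._lines = deque(maxlen=max_lines)

    def add(self, speaker, text):
        self._lines.append(f"{speaker}: {text}")

    def as_prompt(self):
        """Join the remembered lines into one block of prompt text."""
        return "\n".join(self._lines)


def replicate_token_is_set(env=None):
    """Check that REPLICATE_API_TOKEN is present and non-empty."""
    env = os.environ if env is None else env
    return bool(env.get("REPLICATE_API_TOKEN"))


if __name__ == "__main__":
    memory = ConversationMemory(max_lines=2)
    memory.add("user", "Hello there")
    memory.add("assistant", "Hi! How can I help?")
    memory.add("user", "Show me an image of a cat.")
    print(memory.as_prompt())  # only the last two lines remain
    print(wants_image("Show me an image of a cat."))
    if not replicate_token_is_set():
        print("REPLICATE_API_TOKEN is not set; export it before running.")
```

Using a `deque` with `maxlen` keeps the memory bounded without any manual trimming; raising `MAX_HISTORY_LINES` gives the model more context at the cost of a longer prompt.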