## Audio GPT

The purpose of this notebook is to showcase how to connect to the OpenAI API and interface directly with their GPT-3.5 model to emulate what conversational AI feels like.
The notebook will:
1. Record you speaking for a few seconds, up to a file size of 25 MB (that's the limit for voice transcription on OpenAI)
2. Send the audio to `whisper-1` to transcribe the spoken language into text
3. Send this text to `ChatGPT` (the code below calls the `text-davinci-003` completion model)
4. Convert the response from `ChatGPT` into a realistic voice via the `ElevenLabs` API
5. Play the audio in response to the prompt you spoke
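
The five steps above can be sketched as a single function that takes each stage as a callable. This is only an illustrative outline: the names `run_pipeline`, `record`, `transcribe`, `complete`, and `speak` are hypothetical placeholders rather than part of any library, and the stand-in stages below replace the real microphone and API calls so the outline runs on its own.

```python
from typing import Callable


def run_pipeline(
    record: Callable[[], bytes],
    transcribe: Callable[[bytes], str],
    complete: Callable[[str], str],
    speak: Callable[[str], None],
) -> str:
    """Run one voice round trip and return the model's reply text."""
    audio = record()            # step 1: capture microphone audio
    prompt = transcribe(audio)  # step 2: speech to text
    reply = complete(prompt)    # step 3: text to model reply
    speak(reply)                # steps 4-5: reply to speech, then playback
    return reply


# Stand-in stages (assumptions) so the outline runs without a microphone or API keys.
spoken = []
reply = run_pipeline(
    record=lambda: b"fake-audio",
    transcribe=lambda audio: "What will the future be like for AI?",
    complete=lambda prompt: f"You asked: {prompt}",
    speak=spoken.append,
)
print(reply)
```

Passing the stages in as arguments keeps the flow easy to test, since each real call (recording, Whisper, the completion model, ElevenLabs) can be swapped for a stub.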


**References**
- https://openai.com
- https://platform.openai.com/overview
- [ElevenLabs Developer Docs](https://api.elevenlabs.io/docs#/text-to-speech/Text_to_speech_v1_text_to_speech__voice_id__stream_post)
- [Beginners Guide to GPT4 API & ChatGPT 3.5 Turbo API Tutorial](https://www.youtube.com/watch?v=LX_DXLlaymg&t=32s)

import os
import re
import io
import json
import openai
import requests
import numpy as np
import soundfile as sf
import sounddevice as sd
from pathlib import Path
import ipywidgets as widgets
from playsound import playsound

# read API keys from files; strip the trailing newline that editors often add
api_key_path = Path(Path.home(), 'Desktop', 'api_key.txt')
with open(api_key_path, 'r') as file:
    openai.api_key = file.read().strip()

el_api_key_path = Path(Path.home(), 'Desktop', 'elevenlabs_api_key.txt')
with open(el_api_key_path, 'r') as file:
    elevenlabs_key = file.read().strip()


# temporary audio file
audio_file_path = "temp.mp3"
transcript = ""

duration = 5  # Duration of recording in seconds
sample_rate = 16000  # Sampling rate in Hz
limit_mb = 25  # Limit the file size to 25 MB

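def call_chatgpt(prompt):
    """
    Send the prompt to OpenAI's completion endpoint.

    Note: despite the function name, this calls the `text-davinci-003`
    completion model rather than the chat endpoint.

    Args:
        - prompt (str): Transcribed user speech
    Returns:
        The raw API response; the reply text is in response['choices'][0]['text']
    """
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        temperature=0.9,
        max_tokens=100,
        top_p=1,
        frequency_penalty=0,
        presence_penalty=0.6,
        stop=[" Human:", " AI:"]
    )

    return response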

def text_to_speech(prompt):
    """
    Convert text to voice using ElevenLabs API

    Args:
        - prompt (str): Response from chatgpt
    Returns:
        None
    """

    voice_id = "MF3mGyEYCl7XYWbV9V6O"
    url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"
    headers = {
        "xi-api-key": elevenlabs_key,
        "Content-Type": "application/json"
    }
    data = {
        "text": prompt,
        "voice_id": voice_id,
        "voice_settings": {
            "stability": 0.29,
            "similarity_boost": 0.94
        }
    }

    response = requests.post(url, headers=headers, json=data)
    if response.status_code == 200:
        with open(audio_file_path, "wb") as f:
            f.write(response.content)
        playsound(audio_file_path)

        if os.path.exists(audio_file_path):  # delete temporary audio file
            os.remove(audio_file_path)
    else:
        print(response)
        print(f"Error {response.status_code}: {response.text}")

def on_button_click(button):
    """
    Record audio, transcribe it, send the text to the model, and speak the reply.
    """

    button.description = 'Recording...'
    print(f"Recording audio, limit: {limit_mb}MB...")

    audio_data = []
    recorded_mb = 0

    while True:
        audio = sd.rec(int(duration * sample_rate), samplerate=sample_rate, channels=1)
        sd.wait()
        recorded_mb += len(audio) * audio.dtype.itemsize / 2**20  # calculate size of recorded audio in MB
        audio_data.append(audio)

        # Stop once the accumulated audio reaches the limit. With these settings
        # (16 kHz, mono, 32-bit float) that takes several minutes of recording.
        if recorded_mb >= limit_mb:
            print(f"Reached {limit_mb}MB limit, stopping recording...")
            break
        
    print('Finished recording')
    sf.write(audio_file_path, np.concatenate(audio_data, axis=0), sample_rate)
    button.description = 'Ask ChatGPT'

    print('Transcribing...')
    with open(audio_file_path, 'rb') as audio_file:  # close the file once it has been uploaded
        transcript = openai.Audio.transcribe("whisper-1", audio_file)

    print(f"transcription: {transcript['text']}")
    if os.path.exists(audio_file_path):  # delete temporary audio file
        os.remove(audio_file_path)
    
    print("Speaking to LLM...")
    # Send request to OpenAI text-davinci-003
    response = call_chatgpt(transcript['text'])
    parsed_text = re.sub(r'[\n\t]+', ' ', response['choices'][0]['text'].strip())
    print(parsed_text)

    text_to_speech(parsed_text)
    print()


# Create buttons
ask_button = widgets.Button(description='Ask ChatGPT')
ask_button.on_click(on_button_click)
display(ask_button)
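
To get a feel for the size check in the recording loop, the arithmetic can be worked through without a microphone. This is a rough sketch that assumes `sounddevice` records 32-bit float samples (4 bytes each, its default dtype) in mono; the helper names `chunk_megabytes` and `seconds_until_limit` are made up for this illustration.

```python
duration = 5          # seconds per recorded chunk (as in the notebook)
sample_rate = 16000   # Hz (as in the notebook)
limit_mb = 25         # size limit in MB (as in the notebook)


def chunk_megabytes(duration_s, rate_hz, bytes_per_sample=4, channels=1):
    """In-memory size of one recorded chunk, in MB (1 MB = 2**20 bytes)."""
    n_samples = int(duration_s * rate_hz) * channels
    return n_samples * bytes_per_sample / 2**20


def seconds_until_limit(limit, rate_hz, bytes_per_sample=4, channels=1):
    """Seconds of audio that fit within `limit` MB."""
    bytes_per_second = rate_hz * channels * bytes_per_sample
    return limit * 2**20 / bytes_per_second


print(f"one chunk: {chunk_megabytes(duration, sample_rate):.3f} MB")
print(f"time to reach limit: {seconds_until_limit(limit_mb, sample_rate):.1f} s")
```

Under these assumptions a single 5-second chunk is about 0.3 MB, and roughly 410 seconds of audio fit under 25 MB. The in-memory size is only a proxy: the size of the file that is actually uploaded depends on the encoding used when it is written to disk.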


Recording audio, limit: 25MB...
Reached 25MB limit, stopping recording...
Finished recording
Transcribing...
transcription: What will the future be like for AI?
Speaking to LLM...
The future for AI is likely to be very exciting. AI technology is evolving rapidly and is expected to become more powerful, efficient and widely adopted over time. We can expect AI to continue to assist in automating many mundane tasks and making life easier for humans. AI will also likely increase our capabilities to process and understand large amounts of data. Further advancements in AI could even lead to the development of artificial general intelligence – intelligent machines that can reason, plan, problem-solve and interact like humans.
<Response [401]>
Error 401: {"detail":{"status":"quota_exceeded","message":"This request exceeds your quota. You have 432 characters remaining, while 548 characters are required for this request.","character_used":10116,"character_limit":10000}}
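
The 401 response above is a quota error: the reply was longer than the number of characters left on the ElevenLabs account. One possible way to avoid the failed request is to trim the reply to a character budget before sending it. The sketch below is an illustration under assumptions, not part of the ElevenLabs API: `truncate_to_budget` is a hypothetical helper, and the budget would have to come from somewhere you trust, such as the "remaining" figure quoted in the error message.

```python
import re


def truncate_to_budget(text: str, budget: int) -> str:
    """Keep whole sentences from the start of `text` while they fit in `budget` characters.

    Falls back to a hard cut if even the first sentence is too long.
    """
    if len(text) <= budget:
        return text
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    kept = ""
    for sentence in sentences:
        candidate = f"{kept} {sentence}".strip()
        if len(candidate) > budget:
            break
        kept = candidate
    return kept if kept else text[:budget]


reply = (
    "The future for AI is likely to be very exciting. "
    "AI technology is evolving rapidly. "
    "We can expect AI to automate many mundane tasks."
)
short = truncate_to_budget(reply, budget=60)
print(short)
```

In the notebook, a call such as `text_to_speech(truncate_to_budget(parsed_text, 432))` would have kept the request within the 432 characters reported as remaining. Cutting on sentence boundaries is a design choice so the spoken reply does not stop mid-word.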



## Example

**Voice Recording Transcription**: 
> What's the most inspiring thing that's ever happened?


**ChatGPT response**: 
> The most inspiring thing that's ever happened is the fall of the Berlin Wall in 1989. <br>
> It was a powerful symbol of the triumph of hope, freedom and unity over fear, oppression and division.   <br>
> The peaceful revolution showed the world the power of people uniting together to demand more from their government   <br>
> and to strive for a better future. It also marked the end of the Cold War, which had gripped the world for four decades.
