### Test Case Situations 

##### S1: One conversation between two people (immediate)
- Prompt to ask direct questions about the conversation
- Prompt to ask questions about things which have been mentioned at different instances

##### S2: One conversation between two people (past)
- Prompt to ask direct questions about the conversation
- Prompt to ask questions about things which have been mentioned at different instances

##### S3: One conversation between three (or more) people (immediate)
- Prompt to ask direct questions about the conversation
- Prompt to ask questions about things which have been mentioned at different instances

##### S4: One conversation between three (or more) people (past)
- Prompt to ask direct questions about the conversation
- Prompt to ask questions about things which have been mentioned at different instances

##### S5: One conversation between two conflicting people (immediate)
The two people have a conflicting "opinion" about a subject
- Prompt to ask direct questions about the conversation
- Prompt to ask questions about the opinion of either of the speakers

##### S6: One conversation between two conflicting people (past)
The two people have a conflicting "opinion" about a subject
- Prompt to ask direct questions about the conversation
- Prompt to ask questions about the opinion of either of the speakers

##### S7: One conversation between three (or more) conflicting people (immediate)
The speakers have conflicting "opinions" about a subject
- Prompt to ask direct questions about the conversation
- Prompt to ask questions about the opinion of any of the speakers

##### S8: One conversation between three (or more) conflicting people (past)
The speakers have conflicting "opinions" about a subject
- Prompt to ask direct questions about the conversation
- Prompt to ask questions about the opinion of any of the speakers
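
Situations S1 to S8 are the eight combinations of three independent factors: whether the speakers disagree, how many speakers there are, and whether the conversation is immediate or past. A minimal sketch of enumerating them, which could drive a test loop, is shown here. The function name `build_situations` and the `(id, description)` tuple layout are illustrative assumptions and are not part of the notebook.

```python
from itertools import product

# The three factors that distinguish S1-S8, listed so that
# itertools.product yields them in the same order as the headings above.
CONFLICT = [False, True]
SPEAKERS = ["two", "three (or more)"]
TIMING = ["immediate", "past"]


def build_situations():
    """Return a list of (id, description) pairs for S1-S8."""
    situations = []
    combos = product(CONFLICT, SPEAKERS, TIMING)
    for number, (conflicting, speakers, timing) in enumerate(combos, start=1):
        kind = "conflicting people" if conflicting else "people"
        description = f"One conversation between {speakers} {kind} ({timing})"
        situations.append((f"S{number}", description))
    return situations


if __name__ == "__main__":
    for sid, description in build_situations():
        print(f"{sid}: {description}")
```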

##### S9: Ask a question which requires information from two (or more) different conversations
- Do I know any marketing managers?
- What are the different meetings I have had over the past week? (Specify duration)

##### S10: Ask a question about information which has different forms in different conversations
Example:
- Sarah was promoted to Head of the Marketing Department (in conversation 1)
- Sarah was promoted to Head of the PR Department (in conversation 2)
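
One way to build an S10 test fixture is to store each stated fact together with the conversation it came from, then flag every subject that has more than one distinct value. This is only a sketch: the `(conversation_id, subject, value)` tuple layout and the function name `find_conflicts` are hypothetical, and nothing like this exists in the notebook yet.

```python
from collections import defaultdict


def find_conflicts(facts):
    """Group facts by subject and return subjects with more than one distinct value.

    facts: iterable of (conversation_id, subject, value) tuples.
    Returns a dict mapping subject -> list of (conversation_id, value).
    """
    by_subject = defaultdict(list)
    for conversation_id, subject, value in facts:
        by_subject[subject].append((conversation_id, value))

    conflicts = {}
    for subject, entries in by_subject.items():
        distinct_values = {value for _, value in entries}
        if len(distinct_values) > 1:
            conflicts[subject] = entries
    return conflicts


if __name__ == "__main__":
    facts = [
        (1, "Sarah's promotion", "Head of the Marketing Department"),
        (2, "Sarah's promotion", "Head of the PR Department"),
    ]
    print(find_conflicts(facts))
```

A question such as "What was Sarah promoted to?" can then be checked against both recorded values instead of a single expected answer.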

### Use this cell to make installations

In [4]:
!pip install playsound
!pip install pyttsx3
!pip install -U openai-whisper

### `FFmpeg` Installation

#### On Windows:

##### Download
Go to the FFmpeg Official Website and download the latest build for Windows.

##### Extract
Extract the downloaded ZIP file to a directory, for example, C:\FFmpeg.

##### Environment Variable:
- Right-click on 'This PC' or 'Computer' on your desktop or File Explorer, and select 'Properties'.

- Click on 'Advanced system settings' and then 'Environment Variables'.

- Under 'System Variables', find and select 'Path', then click 'Edit'.

- Click 'New' and add the path to your FFmpeg bin directory, e.g., C:\FFmpeg\bin.

- Click 'OK' to close all dialog boxes.


#### On macOS:

You can install `ffmpeg` using Homebrew:

`brew install ffmpeg`

#### On Linux:
For Ubuntu and other Debian-based distributions, you can install ffmpeg from the apt repository:

`sudo apt update`

`sudo apt install ffmpeg`
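
After installing, it can help to confirm from Python that `ffmpeg` is actually reachable on `PATH`, since Whisper calls the `ffmpeg` executable to decode audio. The check below is a small sketch using only the standard library; the function name `ffmpeg_available` is an illustrative choice.

```python
import shutil


def ffmpeg_available():
    """Return True if an `ffmpeg` executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found at:", shutil.which("ffmpeg"))
    else:
        print("ffmpeg not found; check the installation steps and your PATH.")
```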



### Use this cell to import any libraries

In [37]:
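from openai import OpenAI  # Only for testing purposes
import datetime  # Used by get_Model()
import os
import subprocess  # Used by play_audio() on macOS and Linux
import sys  # Used by play_audio() to detect macOS
import openai
from dotenv import load_dotenv, find_dotenv
from playsound import playsound
import pyaudio
import wave
import numpy as np
import whisper
import warnings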

### Microphone Device Selection

#### The `PyAudio` library requires you to choose a device with which you want to input speech. 

#### The function `getAudio()` has an argument `device_name`. Before running the `main` cell, change its default value (currently `Microphone (Jabra BIZ 1500)`) to the device you want to use.

In [39]:
def list_audio_devices():
    p = pyaudio.PyAudio()
    for i in range(p.get_device_count()):
        device_info = p.get_device_info_by_index(i)
        print(device_info['name'])
    p.terminate()

list_audio_devices()

Microphone (Jabra BIZ 1500)
Microphone (Voicemod Virtual Au
Microphone Array (Realtek(R) Au
AI Noise-cancelling Input (ASUS
Speakers (Jabra BIZ 1500)
Speakers (Realtek(R) Audio)
AI Noise-cancelling Output (ASU
Line (Voicemod Virtual Audio De
Microphone (Jabra BIZ 1500)
Microphone (Voicemod Virtual Audio Device (WDM))
Microphone Array (Realtek(R) Audio)
AI Noise-cancelling Input (ASUS Utility)
Primary Sound Driver
Speakers (Jabra BIZ 1500)
AI Noise-cancelling Output (ASUS Utility)
Line (Voicemod Virtual Audio Device (WDM))
Speakers (Realtek(R) Audio)
AI Noise-cancelling Output (ASUS Utility)
Line (Voicemod Virtual Audio Device (WDM))
Speakers (Jabra BIZ 1500)
Speakers (Realtek(R) Audio)
Microphone (Voicemod Virtual Audio Device (WDM))
Microphone Array (Realtek(R) Audio)
Microphone (Jabra BIZ 1500)
AI Noise-cancelling Input (ASUS Utility)
Microphone (Voicemod VAD Wave)
Line Out (Voicemod VAD Wave)
Microphone Array (Realtek HD Audio Mic Array input)
Microphone (Realtek HD Audio Mic input)
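
The listing above mixes input and output devices, and the same device shows up several times (presumably once per audio host API). A short sketch of narrowing it to unique recording devices is given here. It works on plain dictionaries shaped like the ones returned by `get_device_info_by_index`, so it can be tried without a microphone; the function name `input_device_names` and the sample data are assumptions made for illustration.

```python
def input_device_names(devices):
    """Return unique names of devices that can record audio.

    devices: iterable of dicts shaped like the ones returned by
    PyAudio.get_device_info_by_index (only 'name' and
    'maxInputChannels' are used here).
    """
    names = []
    for info in devices:
        # Output-only devices report zero input channels.
        if info["maxInputChannels"] > 0 and info["name"] not in names:
            names.append(info["name"])
    return names


if __name__ == "__main__":
    # Hand-written sample data standing in for real PyAudio output.
    sample = [
        {"name": "Microphone (Jabra BIZ 1500)", "maxInputChannels": 1},
        {"name": "Speakers (Jabra BIZ 1500)", "maxInputChannels": 0},
        {"name": "Microphone (Jabra BIZ 1500)", "maxInputChannels": 1},
    ]
    print(input_device_names(sample))
```

With a real `pyaudio.PyAudio()` instance `p`, the input list could be built as `[p.get_device_info_by_index(i) for i in range(p.get_device_count())]`.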

### Main Code Cell
#### Recording Audio using `pyAudio`
#### Speech to Text using `Whisper`
#### GPT Model: `gpt-3.5-turbo`
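
The request sent to `gpt-3.5-turbo` in this cell consists of a system message followed by two user messages: the transcribed conversation and the transcribed question. The sketch below isolates that message layout in a small helper so it can be inspected without calling the API. The helper name `build_messages` is hypothetical; the notebook builds the list inline inside `process_prompt()`.

```python
def build_messages(conversation_text, prompt):
    """Return the chat messages in the order used by process_prompt()."""
    return [
        {"role": "system", "content": "You are a memory assistant, listening to my conversations."},
        {"role": "user", "content": conversation_text},  # Conversation context
        {"role": "user", "content": prompt},  # User's question
    ]


if __name__ == "__main__":
    messages = build_messages(
        "Sarah was promoted to Head of the Marketing Department.",
        "What was Sarah promoted to?",
    )
    for message in messages:
        print(message["role"], "->", message["content"])
```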

In [43]:
def get_API():
    """
    Loads the OpenAI API key from the environment variables.

    Returns:
    - str: The OpenAI API key.
    """
    _ = load_dotenv(find_dotenv())  # Load environment variables from .env file

    openai_api_key = os.getenv('OPENAI_API_KEY')
    if not openai_api_key:
        raise ValueError("OpenAI API key is not set. Please set the 'OPENAI_API_KEY' environment variable in your .env file.")
    return openai_api_key

def get_Model(): 
    """
    Determines the appropriate GPT-3.5 model based on the current date.

    Returns:
    - str: The model name to use.
    
    Note: We are not using this function right now 
    because the process_prompt function already decides the model
    """
    current_date = datetime.datetime.now().date()
    target_date = datetime.date(2024, 6, 12)

    # Select the model based on the current date
    if current_date > target_date:
        llm_model = "gpt-3.5-turbo"
    else:
        llm_model = "gpt-3.5-turbo-0301"
    return llm_model

def list_audio_devices():
    """
    Lists all available audio devices (both input and output).

    Returns:
    - list: A list of tuples containing device index, name, max input channels, and default sample rate.
    """
    p = pyaudio.PyAudio()
    devices = []
    for i in range(p.get_device_count()):
        device_info = p.get_device_info_by_index(i)
        devices.append((i, device_info['name'], device_info['maxInputChannels'], device_info['defaultSampleRate']))
    p.terminate()
    return devices

def get_device_index_by_name(name): 
    """
    Finds the index of an audio device by its name.

    Args:
    - name (str): The name of the device.

    Returns:
    - int: The index of the device.
    
    Note: This is a helper function which will be used in getAudio()
    """
    devices = list_audio_devices()
    for index, device_name, _, _ in devices:
        if name.lower() in device_name.lower():
            return index
    return None

def getAudio(output_filename="recorded_speech.wav", device_name="Microphone (Jabra BIZ 1500)", chunk_size=1024, 
             format=pyaudio.paInt16, channels=1, rate=16000, silence_threshold=1000, silence_duration=5):
    """
    Records audio until a period of silence is detected and saves it to a file.

    Args:
    - output_filename (str): Name of the output WAV file.
    - device_name (str): Name of the input audio device.
    - chunk_size (int): Number of frames per buffer.
    - format: Audio format (e.g., pyaudio.paInt16).
    - channels (int): Number of audio channels.
    - rate (int): Sampling rate in Hz.
    - silence_threshold (int): Amplitude threshold for silence detection.
    - silence_duration (int): Duration of silence required to stop recording (in seconds).

    Returns:
    - str: The name of the saved audio file.
    
    Note: Start talking only when you see the message "Please start speaking. Recording..."
    If your conversation/prompt is over but Memoro continues to record, interrupt the kernel.
    """
    device_index = get_device_index_by_name(device_name)
    if device_index is None:
        raise ValueError(f"Device '{device_name}' not found.")

    # Variables to store audio frames and silence detection
    audio_frames = []
    silent_chunks = 0
    # Chunks per second (rate / chunk_size) times the required seconds of silence
    max_silent_chunks = int(rate / chunk_size * silence_duration)

    def is_silent(data, threshold=silence_threshold):
        """Returns 'True' if below the silence threshold."""
        max_amplitude = np.max(np.abs(data))
        return max_amplitude < threshold

    def callback(in_data, frame_count, time_info, status):
        nonlocal silent_chunks, audio_frames
        audio_frames.append(in_data)
        audio_data = np.frombuffer(in_data, dtype=np.int16)
        if is_silent(audio_data):
            silent_chunks += 1
        else:
            silent_chunks = 0
        if silent_chunks > max_silent_chunks:
            return (None, pyaudio.paComplete)
        return (in_data, pyaudio.paContinue)

    # Initialize PyAudio
    p = pyaudio.PyAudio()

    try:
        # Open stream
        stream = p.open(format=format,
                        channels=channels,
                        rate=rate,
                        input=True,
                        frames_per_buffer=chunk_size,
                        stream_callback=callback,
                        input_device_index=device_index)

        print("Please start speaking. Recording...")
        stream.start_stream()

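        # Keep the stream active while recording; sleep briefly between
        # checks so the loop does not spin the CPU at full speed
        import time
        while stream.is_active():
            time.sleep(0.1)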

        # Stop and close the stream
        stream.stop_stream()
        stream.close()

    except KeyboardInterrupt:
        # In a noisy environment the function may keep running even after
        # your prompt/conversation is over. Interrupt the kernel in that situation.
        print("Recording interrupted by user.")
    except Exception as e:
        print(f"An error occurred: {e}")
    finally:
        p.terminate()

    # Save the recorded audio to a file
    try:
        with wave.open(output_filename, 'wb') as wf:
            wf.setnchannels(channels)
            # Use the module-level helper, since `p` has already been terminated
            wf.setsampwidth(pyaudio.get_sample_size(format))
            wf.setframerate(rate)
            wf.writeframes(b''.join(audio_frames))
        print(f"Audio saved to {output_filename}")
    except Exception as e:
        print(f"Failed to save audio file: {e}")

    return output_filename

def speech_to_text():
    """
    Converts recorded audio to text using Whisper model.

    Returns:
    - str: The transcribed text.
    """
    audio = getAudio()

    # Suppress the FP16 warning
    warnings.filterwarnings("ignore", category=UserWarning, message="FP16 is not supported on CPU; using FP32 instead")

    # Load the Whisper model.
    # Choose among the tiny, base, small, medium, and large models.
    # Larger models are more accurate but take considerably longer to transcribe.
    model = whisper.load_model("base")

    print('Processing...')
    # Transcribe the audio file
    result = model.transcribe(audio)
    print('Processed!')
    text = result['text']
    write_to_file(text)
    return text

def text_to_speech(text):
    """
    Converts text to speech and plays the audio.

    Args:
    - text (str): The text to be converted to speech.
    """
    response = openai.audio.speech.create(
        model="tts-1",
        voice="onyx",
        input=text
    )
    file_path = 'response_voice.mp3' #Contains the audio you hear when Memoro responds

    warnings.filterwarnings("ignore", category=DeprecationWarning)
    response.stream_to_file(file_path)
    play_audio(file_path)

def write_to_file(text):
    """
    Writes the text to a file.

    Args:
    - text (str): The text to be written.

    Returns:
    - str: The file path.
    """
    file_path = os.path.join(os.getcwd(), 'STT_file.txt')
    with open(file_path, 'w', encoding='utf-8') as file:
        file.write(text)
    return file_path

def read_from_file(file_path):
    """
    Reads text from a file.

    Args:
    - file_path (str): The path of the file.

    Returns:
    - str: The read text.
    
    Note: We are not using this function right now and may discard it after 
    integrating Memoro with PineCone
    """
    with open(file_path, 'r', encoding='utf-8') as file:
        text = file.read()
    return text

def play_audio(file_path):
    """
    Plays an audio file.

    Args:
    - file_path (str): The path of the audio file.
    """
    if os.name == 'nt':  # For Windows
        os.startfile(file_path)
    elif os.name == 'posix':  # For macOS and Linux
        # macOS ships `open`; most Linux desktops provide `xdg-open`
        opener = 'open' if sys.platform == 'darwin' else 'xdg-open'
        subprocess.call([opener, file_path])

def process_prompt(conversation_text):
    """
    Processes the user's prompt by converting it to text, 
    querying OpenAI, and converting the response to speech.

    Args:
    - conversation_text (str): The text of the conversation.
    """
    intro_path = os.path.join(os.getcwd(), 'intro', 'intro_prompt_voice.mp3')
    play_audio(intro_path)
    prompt = speech_to_text()
    print("Recognized Prompt:", prompt)  

    if prompt:
        response = openai.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": "You are a memory assistant, listening to my conversations."},
                {"role": "user", "content": conversation_text},  # Provide the conversation context
                {"role": "user", "content": prompt}  # User's question
            ]
        )

        response_text = response.choices[0].message.content
        text_to_speech(response_text)

# Example usage
api = get_API()
text = speech_to_text()
process_prompt(text)

Please start speaking. Recording...
Audio saved to recorded_speech.wav
Processing...
Processed!
Please start speaking. Recording...
Audio saved to recorded_speech.wav
Processing...
Processed!
Recognized Prompt:  What is the name I gave you? What was the name I gave you? This.
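
The silence cut-off in `getAudio()` can be checked by hand. With the defaults `rate=16000`, `chunk_size=1024` and `silence_duration=5`, each chunk covers 1024 / 16000 = 0.064 seconds, so `max_silent_chunks = int(16000 / 1024 * 5) = int(78.125) = 78`, and recording stops on the 79th consecutive silent chunk. The sketch below reproduces that counter on a list of per-chunk peak amplitudes instead of live audio; the function names and the synthetic input are assumptions for illustration only.

```python
def max_silent_chunks(rate=16000, chunk_size=1024, silence_duration=5):
    """Number of consecutive silent chunks tolerated before stopping."""
    return int(rate / chunk_size * silence_duration)


def chunk_where_recording_stops(peak_amplitudes, silence_threshold=1000, limit=None):
    """Return the 1-based index of the chunk that ends the recording, or None.

    peak_amplitudes: one peak absolute amplitude per audio chunk.
    """
    if limit is None:
        limit = max_silent_chunks()
    silent = 0
    for index, peak in enumerate(peak_amplitudes, start=1):
        # Same rule as the recording callback: a loud chunk resets the counter.
        silent = silent + 1 if peak < silence_threshold else 0
        if silent > limit:
            return index
    return None


if __name__ == "__main__":
    # Ten loud chunks of "speech" followed by 100 silent chunks.
    peaks = [5000] * 10 + [0] * 100
    print(max_silent_chunks())
    print(chunk_where_recording_stops(peaks))
```

Lowering `silence_duration` or raising `silence_threshold` makes the recording stop sooner, which may help in the noisy environments mentioned in the `getAudio()` docstring.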
