<a href="https://colab.research.google.com/github/wassimchouchen/Automatic-Speech-Recognition-/blob/main/Whisper_%2B_Falcon_%2B_Bark_v2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Install Dependencies

In [None]:
!pip install GoogleAudio
!pip install monsterapi

from IPython.display import Audio, display, HTML

def play_audio_from_json(json_data):
    try:
        # Extract the audio URL from the JSON data
        audio_url = json_data["output"][0]
        # Generate HTML to embed an audio player
        audio_html = f'<audio controls autoplay><source src="{audio_url}" type="audio/mpeg"></audio>'
        # Display the audio player
        display(HTML(audio_html))
    except (KeyError, IndexError):
        print("Invalid JSON format. Missing required keys.")
    except Exception as e:
        print(f"An error occurred: {str(e)}")

Collecting GoogleAudio
  Downloading GoogleAudio-0.0.3-py3-none-any.whl (3.4 kB)
Collecting ffmpeg-python (from GoogleAudio)
  Downloading ffmpeg_python-0.2.0-py3-none-any.whl (25 kB)
Installing collected packages: ffmpeg-python, GoogleAudio
Successfully installed GoogleAudio-0.0.3 ffmpeg-python-0.2.0
Collecting monsterapi
  Downloading monsterapi-1.0.2-py3-none-any.whl (11 kB)
Collecting requests-toolbelt (from monsterapi)
  Downloading requests_toolbelt-1.0.0-py2.py3-none-any.whl (54 kB)
Installing collected packages: requests-toolbelt, monsterapi
Successfully installed monsterapi-1.0.2 requests-toolbelt-1.0.0
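
The `play_audio_from_json` helper defined in the cell above indexes into `json_data["output"][0]`, so it expects a dictionary whose `"output"` key holds a list with the audio URL first. Below is a minimal sketch of that lookup on its own, so the expected shape can be checked without rendering a player. The function name `extract_audio_url` and the example URL are hypothetical and not part of MonsterAPI; the real result may carry additional fields.

```python
def extract_audio_url(json_data):
    """Return the first entry of json_data["output"], or None if it is absent."""
    output = json_data.get("output") if isinstance(json_data, dict) else None
    if isinstance(output, (list, tuple)) and output:
        return output[0]
    return None


# Hypothetical payload, shaped like the one play_audio_from_json indexes into.
example_result = {"output": ["https://example.com/audio.wav"]}
print(extract_audio_url(example_result))

# A payload without an "output" list yields None instead of raising.
print(extract_audio_url({"status": "failed"}))
```

Returning `None` rather than raising keeps the check cheap to call before attempting playback.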


### Initialize MonsterAPI Client

Copy and paste your MonsterAPI key into the cell below before running it. If you don't have a key, sign up on [MonsterAPI](https://monsterapi.ai/signup).

In [None]:
# Import necessary libraries
import os
from monsterapi import client
from googleaudio import colabaudio

# Set the Monster API key as an environment variable
os.environ['MONSTER_API_KEY'] = 'YOUR_MONSTER_API_KEY'
# Initialize the Monster API client
monster_client = client()
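
Hardcoding the key in a notebook makes it easy to leak when the notebook is shared. One possible alternative, sketched below, prompts for the key only when the `MONSTER_API_KEY` environment variable is missing or still holds the placeholder. The helper name `resolve_api_key` and the `PLACEHOLDER` constant are assumptions introduced for illustration; only `os.environ` and `getpass` from the standard library are used.

```python
import os
from getpass import getpass

# Placeholder string used in the cell above (assumed to mean "no key set yet").
PLACEHOLDER = "YOUR_MONSTER_API_KEY"


def resolve_api_key(env=os.environ, ask=getpass):
    """Return the key from env, prompting for it if it is missing or a placeholder."""
    key = env.get("MONSTER_API_KEY", "")
    if not key or key == PLACEHOLDER:
        # getpass hides the typed value, so the key is not echoed into cell output.
        key = ask("Enter your MonsterAPI key: ")
        env["MONSTER_API_KEY"] = key
    return key
```

Calling `resolve_api_key()` before `client()` would then replace the hardcoded assignment, assuming the client reads the key from the environment as the cell above implies.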

### Speak and instruct LLM to perform a task and get an audio response!

Record your audio, then click "press stop" when you are done. You can speak an instruction such as:
- Write an essay for me on Steve Jobs
- Can you explain how quantum mechanics works?
- Write an email copy explaining the concepts of large language models


Then wait a few seconds to get a text response from Falcon 7B and an audio response from Bark.

In [None]:
# Record audio and save it to a file
audio_name = 'audio.mp3'
audio, sr = colabaudio.get_audio()
colabaudio.saveaudio(audio_name, audio, sr)

# Transcribe the audio using Monster API's speech-to-text API (reuses the file saved above)
transcript = monster_client.generate(model='whisper', data={"file": audio_name})
transcribed_text = transcript['text']
print(f"Instruction provided by you: {transcribed_text}")

print("Processing your request...")
# Send the transcribed text to Monster API's Falcon 7B API to generate a text response
response = monster_client.generate(model='falcon-7b-instruct', data={"prompt": transcribed_text, "max_length": 64})
# Extract and print the generated response
falcon_output = response['text']
print(f"Output: {falcon_output}")

print("Generating audio response...")
# Send the response text to Monster API's Text to Speech API to generate an audio response
response = monster_client.get_response(
    model='sunoai-bark',
    data={
        "prompt": falcon_output,
        "speaker": "de_speaker_4",
        "text_temp": 0.3,
        "wave_temp": 0.3,
        "sample_rate": 27000,
    },
)
audio = monster_client.wait_and_get_result(response['process_id'], timeout=200)
print("Audio response generated, playing it back now:")
# Play the Generated Audio
play_audio_from_json(audio)

Instruction provided by you:  What is the largest river in the world?
Processing your request...
Output:  The largest river in the world is the Amazon River, which runs through South America. It's over 5,000 miles long and can hold up to 90 million cubic miles of water.
Generating audio response...
Audio response generated, playing it back now:
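
The cell above chains three stages: speech to text (Whisper), text generation (Falcon 7B), and text to speech (Bark). The sketch below shows the same flow with each stage passed in as a plain callable, which makes the order of operations explicit and adds a guard for an empty transcript. The function `run_voice_pipeline` and the three stand-in callables are hypothetical; in the notebook they would wrap the `monster_client` calls shown above.

```python
def run_voice_pipeline(audio_path, transcribe, generate_text, synthesize):
    """Run speech-to-text, text generation, and text-to-speech in sequence."""
    instruction = transcribe(audio_path).strip()
    if not instruction:
        # Nothing was recognized, so there is no prompt to send onward.
        raise ValueError("Transcription was empty; record the instruction again.")
    reply = generate_text(instruction)
    audio_result = synthesize(reply)
    return instruction, reply, audio_result


# Stand-in stages so the sketch runs without an API key (hypothetical values).
def fake_transcribe(path):
    return "  What is the largest river in the world? "


def fake_generate_text(prompt):
    return f"Answer to: {prompt}"


def fake_synthesize(text):
    return {"output": ["https://example.com/reply.wav"]}


instruction, reply, audio_result = run_voice_pipeline(
    "audio.mp3", fake_transcribe, fake_generate_text, fake_synthesize
)
print(instruction)
print(reply)
print(audio_result["output"][0])
```

Passing the stages as arguments keeps the control flow testable offline; swapping in the real client calls does not change the structure.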
