# **Audio to Text with GroqAIAudio**

### **Introduction:**  
GroqAudio provides advanced audio processing models optimized for speed and accuracy, enabling efficient audio-to-text conversion tasks. GroqAudio's models are designed for developers and researchers who need high-performance solutions for transcribing audio content.  

This notebook walks through setting up GroqAIAudio and using it for several audio-to-text use cases. It will:

- Introduce the setup and initialization of GroqAIAudio within **Swarmauri**.
- Demonstrate **synchronous** and **asynchronous** audio transcription, as well as **batch processing** of multiple audio files.

By the end of this notebook, you will have a comprehensive understanding of how to use GroqAudio's models to efficiently and effectively transcribe audio content.

# **Setup and Configuration**

## Import the GroqAIAudio class

In [3]:
from swarmauri.llms.concrete.GroqAIAudio import GroqAIAudio

### Load your API key from your environment variables
- Make sure `python-dotenv` is installed. If it is not, run `pip install python-dotenv`.
- Get your API key [here](https://console.groq.com/keys).

In [4]:
import os
from dotenv import load_dotenv
load_dotenv()

GROQ_API_KEY = os.getenv("GROQ_API_KEY")
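
`os.getenv` returns `None` when the variable is missing, so a quick check can surface a missing key before any request is made. The following is a minimal sketch; the helper name `require_env` is hypothetical and is not part of Swarmauri or `python-dotenv`.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it in your shell."
        )
    return value
```

With this helper defined, `GROQ_API_KEY = require_env("GROQ_API_KEY")` could stand in for the plain `os.getenv` call, failing early with a readable message instead of passing `None` to the model.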

### Initialize GroqAIAudio Model
- Note: You can also pass your API key directly, but loading it from a `.env` file is the better practice.
- The `name` argument is optional and lets you pick a model from the class's list of `allowed_models`.

In [5]:
model = GroqAIAudio(api_key=GROQ_API_KEY, name="distil-whisper-large-v3-en") 

### Transcribe Audio using the `predict` method from the GroqAIAudio class

In [6]:
# Path to a sample audio file
audio_file_path = "swarmauri.mp3"

# Transcribe the audio file
transcription = model.predict(audio_file_path)

print(transcription)

 Swarmory is a Python library designed for working with large language models, LLMs, image generation models, vision models, audio models, vector databases, tools, and agents. It provides a comprehensive framework for integrating and utilizing various AI models and tools in a seamless manner, enabling developers to build sophisticated AI applications efficiently.


As you can see, the `GroqAIAudio` model returns the transcribed text from the audio file.  

#### To transcribe the audio file asynchronously, use the `apredict` method

In [18]:
transcription = await model.apredict(audio_file_path)
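
The top-level `await` above works because Jupyter runs its own event loop. In a plain Python script the coroutine has to be driven explicitly. A minimal sketch, assuming `apredict` is a coroutine method as used above; the helper name `transcribe_from_script` is hypothetical:

```python
import asyncio


def transcribe_from_script(model, audio_path: str) -> str:
    """Run the model's `apredict` coroutine to completion from synchronous code.

    Intended for scripts; asyncio.run() raises if an event loop is already
    running, which is the case inside a notebook cell.
    """
    return asyncio.run(model.apredict(audio_path))
```

In a script, `transcription = transcribe_from_script(model, audio_file_path)` would then play the role of the `await` line.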

## Transcribing batch audio files 

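To process several audio files in one call, use the `batch` method. It takes a dictionary that maps each audio file path to the task to run on it, such as `"transcription"` or `"translation"`.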

In [11]:
path_task_dict = {
    "swarmauri.mp3": "translation",
    "mission.mp3": "transcription",
}

In [12]:
results = model.batch(path_task_dict=path_task_dict)

In [14]:
for result in results:
    print(result)

 Swarmari is a Python library designed for working with large language models , image generation models, vision models, audio models, vector databases, tools, and agents. It provides a comprehensive framework for integrating and utilizing various AI models and tools in a seamless manner, enabling developers to build sophisticated AI applications efficiently.
 A man on a mission without a vision will end up in confusion.
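
The printed results carry no file names, so it can help to pair each result with its source path. A minimal sketch, assuming `batch` returns one result per entry in the same order as the dictionary (as the output above suggests); `label_results` is a hypothetical helper, not a Swarmauri function:

```python
def label_results(path_task_dict: dict, results: list) -> dict:
    """Map each audio path to its result, assuming results follow the dict's insertion order."""
    if len(results) != len(path_task_dict):
        raise ValueError("number of results does not match number of input files")
    # Iterating a dict yields its keys (the file paths) in insertion order.
    return dict(zip(path_task_dict, results))
```

For example, `label_results(path_task_dict, results)["mission.mp3"]` would return the text produced for that file.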


Feel free to try this out with your own audio files.

To process audio files in batch asynchronously, use the `abatch` method.
Run the code below to transcribe the same files without blocking.

In [None]:
result_transcriptions = await model.abatch(path_task_dict=path_task_dict)
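
For finer control over how many requests run at once, the same effect can be approximated by hand on top of `apredict`. This is a sketch under stated assumptions, not how Swarmauri implements its batch methods: `transcribe_many` and its `max_concurrent` parameter are hypothetical, and `apredict` is called with a path only, as shown earlier in this notebook.

```python
import asyncio


async def transcribe_many(model, audio_paths, max_concurrent: int = 3):
    """Transcribe several files concurrently, with at most `max_concurrent` calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def _one(path):
        # The semaphore blocks here once `max_concurrent` calls are running.
        async with semaphore:
            return await model.apredict(path)

    # gather returns results in the same order as the input paths.
    return await asyncio.gather(*(_one(p) for p in audio_paths))
```

In a notebook cell this could be used as `texts = await transcribe_many(model, ["swarmauri.mp3", "mission.mp3"])`. Capping concurrency is a common way to stay within an API's rate limits.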

# Notebook Metadata

In [15]:
from swarmauri.utils import print_notebook_metadata

# The helper prints the metadata itself and returns None, so no extra print() is needed
print_notebook_metadata.print_notebook_metadata("Victory Nnaji", "3rd-Son")

Author: Victory Nnaji
GitHub Username: 3rd-Son
Notebook File: Notebook_03_Audio_To_Text_with_GroqAudio.ipynb
Last Modified: 2024-12-30 13:10:57.651688
Platform: Darwin 24.1.0
Python Version: 3.11.11 (main, Dec 11 2024, 10:25:04) [Clang 14.0.6 ]
Swarmauri Version: 0.5.2
