# **Text to Audio with OpenAIAudioTTS**

### **Introduction:**  
OpenAIAudioTTS gives Swarmauri access to OpenAI's text-to-speech models, which are optimized for speed and natural-sounding output. It is aimed at developers and researchers who need to generate audio content from text.  

This notebook walks through setting up OpenAIAudioTTS and using it for several text-to-audio use cases. It will:  

- Introduce the setup and initialization of OpenAIAudioTTS within **Swarmauri**.  
- Demonstrate **synchronous** and **asynchronous** text-to-audio conversion, as well as **batch processing** of multiple text inputs.  

By the end of this notebook, you will know how to use OpenAIAudioTTS to generate audio files from text, one at a time or in batches.

# **Setup and Configuration**

## Import the OpenAIAudioTTS class

In [2]:
from swarmauri.llms.concrete.OpenAIAudioTTS import OpenAIAudioTTS

### Load your API key from your environment variables
- Make sure you have python-dotenv installed. If not, install it with `pip install python-dotenv`.
- Get your API key [HERE](https://platform.openai.com/settings/organization/api-keys)

In [3]:
import os
from dotenv import load_dotenv
load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
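
`os.getenv` returns `None` when the variable is missing, so a typo in the `.env` file only shows up later as an authentication error. A minimal sketch of a fail-fast check is shown below; `require_env` is a hypothetical helper name and is not part of Swarmauri or python-dotenv.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or export it in your shell."
        )
    return value
```

With this helper, the assignment above could be written as `OPENAI_API_KEY = require_env("OPENAI_API_KEY")`.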

### Initialize OpenAIAudioTTS Model
- Note: You can also pass your API key directly, but loading it from an env file is safer.
- The optional `name` argument selects a model from the list of allowed_models.
- The `voice` argument lets you choose a voice from the available voices.

In [4]:
model = OpenAIAudioTTS(api_key=OPENAI_API_KEY, name="tts-1", voice="alloy") 

#### To quickly see the allowed_voices, run the code below

In [5]:
available_voices = model.allowed_voices
print(available_voices)

['alloy', 'echo', 'fable', 'onyx', 'nova', 'shimmer']
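
If the voice name comes from user input or a config file, it can be checked against this list before the model is created. The sketch below is one way to do that; `check_voice` is a hypothetical helper, and the list is copied from the output above rather than read from the model.

```python
def check_voice(voice: str, allowed: list[str]) -> str:
    """Return `voice` unchanged if it is allowed, otherwise raise a ValueError."""
    if voice not in allowed:
        raise ValueError(f"Unknown voice {voice!r}; choose one of {allowed}")
    return voice


# Voices printed by the cell above (copied here so the sketch runs on its own)
allowed = ['alloy', 'echo', 'fable', 'onyx', 'nova', 'shimmer']
print(check_voice("nova", allowed))
```

In the notebook itself, `model.allowed_voices` can be passed in place of the hard-coded list.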


### Generate Audio using the `predict` method from the OpenAIAudioTTS class

In [None]:
# Add a simple text input
input_text = "Swarmauri is a Python library designed for working with large language models (LLMs), image generation models, vision models, audio models, vector databases, tools, and agents. It provides a comprehensive framework for integrating and utilizing various AI models and tools in a seamless manner, enabling developers to build sophisticated AI applications efficiently."
file_path = "swarmauri.mp3"

# Generate the audio file
audio_file_path = model.predict(text=input_text, audio_path=file_path)

print(f"Successfully generated audio and saved it at {file_path}")

Successfully generated audio and saved it at swarmauri.mp3


The `predict` method returns the path to the generated audio file. Navigate to that location to find the file and play it.
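
A quick way to confirm that the file was actually written is to check that it exists and is not empty. The sketch below uses only the standard library; `describe_audio_file` is a hypothetical helper name, not part of Swarmauri.

```python
from pathlib import Path


def describe_audio_file(path: str) -> str:
    """Return a short status line for the file at `path`."""
    file = Path(path)
    if not file.is_file():
        return f"{file} was not found"
    return f"{file} exists ({file.stat().st_size} bytes)"


# Reports on the file produced by the cell above, if it is present
print(describe_audio_file("swarmauri.mp3"))
```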

#### To generate the audio file asynchronously, use the code below

In [18]:
audio_file_path = await model.apredict(text=input_text, audio_path=file_path)
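
A bare top-level `await` like the one above works in a notebook because the notebook already runs an event loop. In a plain Python script, the call has to live inside a coroutine that is started with `asyncio.run`. The sketch below shows that pattern; `FakeTTS` is a hypothetical stand-in that only mimics the `apredict` signature used in this notebook, so the example runs without an API key.

```python
import asyncio


class FakeTTS:
    """Stand-in for the real model (an assumption, for illustration only)."""

    async def apredict(self, text: str, audio_path: str) -> str:
        await asyncio.sleep(0)  # placeholder for the network round trip
        return audio_path


async def main(model, text: str, audio_path: str) -> str:
    # Same call as in the notebook cell, wrapped in a coroutine
    return await model.apredict(text=text, audio_path=audio_path)


# asyncio.run is meant for scripts; it raises if an event loop is already running
result = asyncio.run(main(FakeTTS(), "Hello from a script", "hello.mp3"))
print(result)
```

Replacing `FakeTTS()` with the `model` created earlier should give the same flow against the real API.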

## Generating batch audio files 

To generate several audio files at once, use the `batch` method. It takes a dictionary that maps each input text to the path where its audio should be saved.

In [14]:
text_path_dict = {
    "Data is the new oil": "data.mp3",
    "A man on a mission, without a vision, will end up in confusion": "mission.mp3",
    "I am a python developer": "python.mp3",
}

results = model.batch(text_path_dict=text_path_dict)

The three audio files are saved as .mp3 files at the paths given in the dictionary.
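
For longer lists of texts, the dictionary can be built programmatically instead of by hand. The sketch below derives a file name from each text; `build_text_path_dict` and its naming scheme are assumptions made for illustration, not part of Swarmauri.

```python
import re


def build_text_path_dict(texts: list[str], suffix: str = ".mp3") -> dict[str, str]:
    """Map each text to a numbered, slug-style file name."""
    mapping = {}
    for index, text in enumerate(texts, start=1):
        # Lowercase, collapse anything that is not a letter or digit to "_", cap the length
        slug = re.sub(r"[^a-z0-9]+", "_", text.lower())[:30].strip("_")
        mapping[text] = f"{index:02d}_{slug}{suffix}"
    return mapping


texts = ["Data is the new oil", "I am a python developer"]
print(build_text_path_dict(texts))
```

Because the texts are used as dictionary keys, two identical texts collapse into a single entry and produce only one file.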

To generate audio files in batch asynchronously, use the `abatch` method with the same dictionary, as in the code below.

In [None]:
result_audio_files = await model.abatch(text_path_dict=text_path_dict)

# Notebook Metadata

In [15]:
from swarmauri.utils import print_notebook_metadata

# The function prints the metadata itself and returns None, so there is nothing to print afterwards
print_notebook_metadata.print_notebook_metadata("Victory Nnaji", "3rd-Son")

Author: Victory Nnaji
GitHub Username: 3rd-Son
Notebook File: Notebook_02_Text_To_Audio_with_OpenAIAudioTTS.ipynb
Last Modified: 2024-12-30 13:59:54.410822
Platform: Darwin 24.1.0
Python Version: 3.11.11 (main, Dec 11 2024, 10:25:04) [Clang 14.0.6 ]
Swarmauri Version: 0.5.2
