# **Notebook 04: Audio Parameters**
# **Introduction:** 

In this notebook, we explore how audio parameters affect text-to-speech output, using the OpenAIAudioTTS class from the Swarmauri framework. Changing the model (tts-1 or tts-1-hd), the voice, and the concurrency limit used in async batch processing gives us control over how the audio sounds and how quickly many files are generated. We also cover streaming and async calls, so the same parameters can be applied to both interactive and high-volume TTS applications.

**Import dependencies**

In [2]:
import os
import asyncio
from dotenv import load_dotenv
from swarmauri.llms.concrete.OpenAIAudioTTS import OpenAIAudioTTS
from pathlib import Path

**Load environment variables and setup**

In [3]:

load_dotenv()
API_KEY = os.getenv("OPENAI_API_KEY")
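
If the key is missing, `os.getenv` returns `None` and the problem only surfaces later, at the first API call. A minimal sketch of an early check follows; the helper name `require_env` is hypothetical and is not part of Swarmauri or python-dotenv.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.getenv(name)
    if not value:
        # Covers both an unset variable and an empty string.
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it in your shell."
        )
    return value
```

In this notebook it could replace the plain lookup, for example `API_KEY = require_env("OPENAI_API_KEY")` after `load_dotenv()`.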

**Initialize TTS model**

In [4]:

tts = OpenAIAudioTTS(api_key=API_KEY)

**Display model capabilities**

In [5]:

print("OpenAI TTS Configuration:")
print(f"Available Models: {tts.allowed_models}")
print(f"Available Voices: {tts.allowed_voices}")
print(f"Default Model: {tts.name}")
print(f"Default Voice: {tts.voice}")

OpenAI TTS Configuration:
Available Models: ['tts-1', 'tts-1-hd']
Available Voices: ['alloy', 'echo', 'fable', 'onyx', 'nova', 'shimmer']
Default Model: tts-1
Default Voice: alloy
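
The two lists printed above can be used to catch typos before a request is sent. The sketch below is one possible way to do that; `validate_choice` is a hypothetical helper, the lists are copied from the output above, and whether OpenAIAudioTTS performs its own validation is not assumed here.

```python
def validate_choice(value: str, allowed: list[str], kind: str) -> str:
    """Return value unchanged if it is in allowed, otherwise raise ValueError."""
    if value not in allowed:
        raise ValueError(f"Unknown {kind} {value!r}; expected one of {allowed}")
    return value


# Values copied from the configuration output above.
ALLOWED_MODELS = ["tts-1", "tts-1-hd"]
ALLOWED_VOICES = ["alloy", "echo", "fable", "onyx", "nova", "shimmer"]

model = validate_choice("tts-1-hd", ALLOWED_MODELS, "model")
voice = validate_choice("nova", ALLOWED_VOICES, "voice")
print(model, voice)  # tts-1-hd nova
```

In the notebook, `tts.allowed_models` and `tts.allowed_voices` could be passed instead of the hard-coded lists.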


**Setup output directory**

In [6]:

output_dir = Path("output")
output_dir.mkdir(exist_ok=True)

## **1. Basic Text-to-Speech Generation**

In [7]:
sample_text = "Welcome to the text-to-speech demonstration using OpenAI's TTS service."

**Generate speech with default settings**

In [11]:
output_path = output_dir / "basic_output.mp3"
print("\nGenerating basic TTS output...")
audio_file = tts.predict(
    text=sample_text,
    audio_path=str(output_path)
)

print("\nDemonstration complete! Check the output directory for generated audio files.")


Generating basic TTS output...

Demonstration complete! Check the output directory for generated audio files.


## **2. Try Different Voices**

In [None]:

print("\nGenerating samples with different voices...")
for voice in tts.allowed_voices:
    tts.voice = voice
    voice_path = output_dir / f"voice_{voice}.mp3"
    audio_file = tts.predict(
        text=sample_text,
        audio_path=str(voice_path)
    )
    print(f"Saved the {voice} voice sample to {voice_path}")
print("\nDemonstration complete! Check the output directory for the generated voice samples.")
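
The loop above leaves `tts.voice` set to the last voice in the list, so later cells keep using it. One way to avoid that side effect is a small context manager that restores the previous value. This is a sketch only: `temporary_attr` is a hypothetical helper, and a stand-in object is used so the example runs without an API key.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def temporary_attr(obj, name, value):
    """Set obj.<name> to value inside the with-block, then restore the old value."""
    previous = getattr(obj, name)
    setattr(obj, name, value)
    try:
        yield obj
    finally:
        # Runs even if the body raises, so the object is never left modified.
        setattr(obj, name, previous)


# Stand-in for the tts object; in the notebook, pass tts itself.
fake_tts = SimpleNamespace(voice="alloy")
with temporary_attr(fake_tts, "voice", "nova"):
    print(fake_tts.voice)  # nova
print(fake_tts.voice)  # alloy
```

The same helper would work for the model, for example `with temporary_attr(tts, "name", "tts-1-hd"):`.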

## **3. Streaming Audio Generation**


In [None]:
print("\nDemonstrating streaming audio generation...")
stream_text = "This is a streaming audio demonstration."
print("Collecting audio chunks...")
chunk_count = 0
for chunk in tts.stream(text=stream_text):
    chunk_count += 1
print(f"Received {chunk_count} audio chunks")
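
The cell above only counts chunks and then discards them. To keep the audio, the chunks can be written to a file as they arrive. The sketch below assumes that `tts.stream` yields `bytes` objects; `save_stream` is a hypothetical helper, and fake chunks stand in for real audio so the example runs offline.

```python
import tempfile
from pathlib import Path
from typing import Iterable


def save_stream(chunks: Iterable[bytes], path: Path) -> int:
    """Write chunks to path as they arrive and return the total number of bytes."""
    total = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
            total += len(chunk)
    return total


# Stand-in for tts.stream(text=...): three fake 1 KiB chunks instead of real audio.
fake_chunks = (b"\x00" * 1024 for _ in range(3))
demo_path = Path(tempfile.gettempdir()) / "stream_demo.bin"
print(save_stream(fake_chunks, demo_path))  # 3072
```

In the notebook, the call would look like `save_stream(tts.stream(text=stream_text), output_dir / "stream_output.mp3")`, where the file name is only an example.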

## **4. Batch Processing**

In [None]:

print("\nDemonstrating batch processing...")
batch_texts = {
    "Hello world": str(output_dir / "batch_1.mp3"),
    "This is a test": str(output_dir / "batch_2.mp3"),
    "Batch processing demo": str(output_dir / "batch_3.mp3")
}

In [None]:
batch_results = tts.batch(text_path_dict=batch_texts)
print("Batch processing results:")
for result in batch_results:
    print(f"Generated: {result}")
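
Because the batch input is a dictionary keyed by text, two identical texts cannot appear in the same batch: the second silently replaces the first. The sketch below builds the dictionary from a list and rejects duplicates. `build_batch` and its `prefix` parameter are hypothetical and not part of Swarmauri.

```python
from pathlib import Path


def build_batch(texts: list[str], out_dir: Path, prefix: str = "batch") -> dict[str, str]:
    """Map each text to a numbered .mp3 path, rejecting duplicate texts."""
    if len(set(texts)) != len(texts):
        raise ValueError("Duplicate texts would overwrite each other as dict keys")
    return {
        text: str(out_dir / f"{prefix}_{i}.mp3")
        for i, text in enumerate(texts, start=1)
    }


batch_texts = build_batch(
    ["Hello world", "This is a test", "Batch processing demo"], Path("output")
)
for text, path in batch_texts.items():
    print(f"{text!r} -> {path}")
```

The result has the same shape as the hand-written `batch_texts` above and could be passed as `text_path_dict`.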

## **5. Async Operations**

**Async prediction**

In [None]:
print("\nDemonstrating async operations...")
async_output = output_dir / "async_output.mp3"
result = await tts.apredict(
    text="This is an async TTS test",
    audio_path=str(async_output)
)
print("\nDemonstration complete! Check the output directory for generated audio files.")

**Async batch processing**

In [None]:
async_batch = {
    "First async message": str(output_dir / "async_1.mp3"),
    "Second async message": str(output_dir / "async_2.mp3"),
}
results = await tts.abatch(text_path_dict=async_batch, max_concurrent=2)
print("Async batch results:")
for result in results:
    print(f"Generated: {result}")
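
The `max_concurrent` argument caps how many requests are in flight at once. A common way to implement such a cap in asyncio is a semaphore. The sketch below illustrates the idea with fake jobs; it is not taken from Swarmauri's implementation, and `run_limited`, `demo`, and `fake_tts_call` are hypothetical names.

```python
import asyncio


async def run_limited(coros, max_concurrent: int):
    """Run coroutines with at most max_concurrent in flight; results keep input order."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(coro):
        # Each job waits here until one of the max_concurrent slots is free.
        async with semaphore:
            return await coro

    return await asyncio.gather(*(guarded(c) for c in coros))


async def demo(n_tasks: int = 5, max_concurrent: int = 2) -> int:
    """Return the peak number of fake jobs that ran at the same time."""
    active = 0
    peak = 0

    async def fake_tts_call(i: int) -> int:
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)  # stands in for the network request
        active -= 1
        return i

    await run_limited([fake_tts_call(i) for i in range(n_tasks)], max_concurrent)
    return peak


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Inside a notebook an event loop is already running, so the call would be `await demo()` rather than `asyncio.run(demo())`. A lower limit reduces the chance of hitting API rate limits, at the cost of a longer total run time.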

## **6. High-Definition Model**

In [10]:


print("\nTrying HD model...")
tts.name = "tts-1-hd"
hd_output = output_dir / "hd_output.mp3"
hd_result = tts.predict(
    text="This is high-definition text-to-speech audio.",
    audio_path=str(hd_output)
)

print("\nDemonstration complete! Check the output directory for generated audio files.")


Trying HD model...

Demonstration complete! Check the output directory for generated audio files.
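
Rather than checking the output directory by hand, the generated files can be listed with their sizes. This is a minimal sketch; `list_audio_files` is a hypothetical helper, and it assumes the files were written to the `output` directory created earlier.

```python
from pathlib import Path


def list_audio_files(directory: Path, pattern: str = "*.mp3") -> list[tuple[str, int]]:
    """Return (file name, size in bytes) for matching files, sorted by name."""
    return sorted((p.name, p.stat().st_size) for p in directory.glob(pattern))


output_dir = Path("output")
if output_dir.is_dir():
    for name, size in list_audio_files(output_dir):
        print(f"{name}: {size} bytes")
```

A file that is missing from the list, or has a size of zero, would suggest that the corresponding cell did not complete.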


# **Conclusion:**

In this notebook, we explored the main audio parameters of OpenAIAudioTTS: model selection (tts-1 versus tts-1-hd), voice variation, streaming, and batch processing with concurrency control.
Together they cover use cases ranging from real-time streaming to high-volume generation: choose the model and voice for the sound you want, stream when audio should arrive incrementally, and use batch or async batch calls with max_concurrent when generating many files.