# UTTS - Compare Text-to-Speech Models

### Installation

In [None]:
!pip install --upgrade git+https://github.com/arch1baald/utts.git

Obtain API keys for the services you want to use:
- [OpenAI](https://platform.openai.com/settings/api-keys)
- [ElevenLabs](https://elevenlabs.io/app/settings/api-keys)
- [Replicate](https://replicate.com/account/api-tokens) (for Kokoro and Orpheus)
- [Zyphra/Zonos](https://playground.zyphra.com/settings/api-keys)
- [Hume AI](https://platform.hume.ai/settings/keys)
- [Cartesia](https://play.cartesia.ai/keys)

In [None]:
from IPython.display import Audio

import utts
from utts.utils import batch_generate, random_choice_enum


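# Fill in the keys for the services you plan to use; leave the rest blank.
# Note: the cell below overwrites any existing .env file in the working directory.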
api_keys = """
OPENAI__API_KEY=
ELEVENLABS__API_KEY=
REPLICATE__API_KEY=
ZYPHRA__API_KEY=
HUME__API_KEY=
CARTESIA__API_KEY=
"""
with open('.env', 'w') as fout:
    fout.write(api_keys)
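
Before running a batch, it can help to check which services actually have a key filled in. The sketch below parses the same `KEY=VALUE` text that the cell above writes to `.env`. The helper names `parse_env_keys` and `configured_services` are hypothetical and not part of `utts`, and the key values in the example are placeholders.

```python
def parse_env_keys(env_text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    keys = {}
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        keys[name.strip()] = value.strip()
    return keys


def configured_services(env_text: str) -> list[str]:
    """Return lowercase service prefixes whose `<SERVICE>__API_KEY` is non-empty."""
    suffix = "__API_KEY"
    return sorted(
        name[: -len(suffix)].lower()
        for name, value in parse_env_keys(env_text).items()
        if name.endswith(suffix) and value
    )


# Placeholder values for illustration only.
example = """
OPENAI__API_KEY=sk-example
ELEVENLABS__API_KEY=
REPLICATE__API_KEY=r8-example
"""
print(configured_services(example))  # ['openai', 'replicate']
```

Passing the `api_keys` string from the cell above to `configured_services` would list the services that are ready to use.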

### Quick Start

In [None]:
audio = utts.elevenlabs.generate('Hello, world!')
Audio(audio)

### Batch generation
With default voices and models:

In [None]:
text = "Hello, world!"

batch = [
    (utts.openai.generate, text),
    (utts.elevenlabs.generate, text),
    (utts.cartesia.generate, text),
    (utts.kokoro.generate, text),
    (utts.hume.generate, text),
    (utts.zyphra.generate, text),
]

res = batch_generate(batch)
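
Each batch entry above is a tuple of a generate function, the text, and optionally a dict of keyword arguments. The internals and return type of `batch_generate` are not shown in this notebook, so the following is only a minimal sketch of how such job tuples could be run concurrently. `run_batch`, `fake_tts`, and `failing_tts` are hypothetical stand-ins, not `utts` APIs.

```python
from concurrent.futures import ThreadPoolExecutor


def run_batch(batch, max_workers=4):
    """Run (func, text) or (func, text, kwargs) jobs concurrently.

    Results come back in the same order as the jobs. A job that raises
    yields its exception object instead of aborting the whole batch.
    """
    def run_one(job):
        func, text, *rest = job
        kwargs = rest[0] if rest else {}
        try:
            return func(text, **kwargs)
        except Exception as exc:  # e.g. a missing API key for one provider
            return exc

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, batch))


# Stand-in generators so the sketch runs without any API keys.
def fake_tts(text, voice="default"):
    return f"{voice}:{text}".encode()


def failing_tts(text):
    raise RuntimeError("missing API key")


results = run_batch([
    (fake_tts, "Hello"),
    (fake_tts, "Hello", {"voice": "alto"}),
    (failing_tts, "Hello"),
])
for item in results:
    print(type(item).__name__)
```

Threads suit this kind of workload because each job mostly waits on a network response.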

With random voices and models (Hume is left on its defaults):

In [None]:
text = "Hello, world!"

batch = [
    (utts.openai.generate, text, {"voice": random_choice_enum(utts.openai.Voice), "model": random_choice_enum(utts.openai.Model)}),
    (utts.elevenlabs.generate, text, {"voice": random_choice_enum(utts.elevenlabs.Voice), "model": random_choice_enum(utts.elevenlabs.Model)}),
    (utts.cartesia.generate, text, {"voice": random_choice_enum(utts.cartesia.Voice), "model": random_choice_enum(utts.cartesia.Model)}),
    (utts.hume.generate, text),
    (utts.kokoro.generate, text, {"voice": random_choice_enum(utts.kokoro.Voice), "model": random_choice_enum(utts.kokoro.Model)}),
    (utts.zyphra.generate, text, {"voice": random_choice_enum(utts.zyphra.Voice), "model": random_choice_enum(utts.zyphra.Model)}),
]

res = batch_generate(batch)
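
The batch above relies on `random_choice_enum` to pick a voice and a model from each provider's enums. Its implementation is not shown here; a minimal sketch of the idea, assuming the `Voice` and `Model` attributes are standard `enum.Enum` classes, could look like this. The `Voice` enum and `pick_random_member` below are hypothetical illustrations.

```python
import random
from enum import Enum


class Voice(Enum):
    """Hypothetical enum standing in for a provider's voice list."""
    ALLOY = "alloy"
    ECHO = "echo"
    NOVA = "nova"


def pick_random_member(enum_cls, rng=random):
    """Return one member of enum_cls chosen uniformly at random."""
    return rng.choice(list(enum_cls))


# A seeded generator makes a comparison run repeatable.
rng = random.Random(0)
picked = pick_random_member(Voice, rng)
print(picked.name, picked.value)
```

Seeding the generator is optional, but it lets you regenerate the same voice and model combination when comparing outputs later.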