<a href="https://colab.research.google.com/github/Troyanovsky/awesome-TTS-Colab/blob/main/piper1_gpl_TTS.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🗣️ Piper GPL TTS Google Colab

## 📄 Description  
This Colab notebook uses Piper to generate speech from text. It is designed for fast, local execution.

**Capabilities**: Text-to-speech, Multi-language support (20+ languages), Multiple voices, Local execution, Customizable voices (training support)

---

## How to use

- Follow the instructions from the comments to change the text_to_generate
- Run all cells in the section you need
- The generated output will be in `output.wav`

---

## 🔗 Resources

- **GitHub Repository:** https://github.com/OHF-Voice/piper1-gpl
- **Model Availability:** https://huggingface.co/rhasspy/piper-voices

---

## 🎙️ Explore More TTS Models  
Want to try out additional TTS models? Check out the curated collection here:  
👉 [awesome-TTS-Colab](https://github.com/Troyanovsky/awesome-TTS-Colab)

## General TTS

In [1]:
!pip install piper-tts -q
!pip install onnxruntime-gpu -q

[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m13.8/13.8 MB[0m [31m93.0 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m16.5/16.5 MB[0m [31m100.3 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m46.0/46.0 kB[0m [31m3.3 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m86.8/86.8 kB[0m [31m6.0 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m283.2/283.2 MB[0m [31m4.4 MB/s[0m eta [36m0:00:00[0m
[?25h

In [2]:
import os
import wave
from piper import PiperVoice, SynthesisConfig
from IPython.display import Audio, display
import ipywidgets as widgets
from IPython.display import display, HTML

# Check if piper-tts is installed, if not, install it.
try:
    import piper
except ImportError:
    print("Installing piper-tts...")
    !pip install piper-tts -q
    print("piper-tts installed.")

# Check if onnxruntime-gpu is needed and can be installed
# This part is optional but good for performance on Colab's GPUs
try:
    import torch
    if torch.cuda.is_available():
        print("CUDA detected. Installing onnxruntime-gpu for faster inference...")
        !pip install onnxruntime-gpu -q
        print("onnxruntime-gpu installed.")
    else:
        print("No CUDA detected or GPU not available. Using CPU for inference.")
except Exception as e:
    print(f"Could not check for CUDA or install onnxruntime-gpu: {e}")
    print("Proceeding with CPU inference.")

CUDA detected. Installing onnxruntime-gpu for faster inference...
onnxruntime-gpu installed.


In [4]:
# @markdown ### 📝 Enter your text here:
text_to_generate = "This was generatd by Piper TTS. Hello!" # @param {type:"string"}

# @markdown ### ⚙️ Synthesis Configuration:
# @markdown Adjust these parameters to change the output speech characteristics.
volume = 1.0 # @param {type:"slider", min:0.1, max:2.0, step:0.1}
length_scale = 1.0 # @param {type:"slider", min:0.5, max:2.0, step:0.05}
noise_scale = 0.667 # @param {type:"slider", min:0.0, max:1.0, step:0.01}
noise_w_scale = 0.8 # @param {type:"slider", min:0.0, max:1.0, step:0.01}
normalize_audio = True # @param {type:"boolean"}

# Download a voice model
# You can choose other voices from https://huggingface.co/rhasspy/piper-voices
# For example, 'en_US-lessac-medium' or 'en_US-libritts-medium'
voice_name = "en_US-lessac-medium"
model_dir = "piper_voices"
model_path = os.path.join(model_dir, f"{voice_name}.onnx")

if not os.path.exists(model_path):
    print(f"Downloading voice model: {voice_name}...")
    # Ensure the directory exists
    os.makedirs(model_dir, exist_ok=True)
    # Corrected argument: use --download-dir instead of -o
    !python3 -m piper.download_voices {voice_name} --download-dir {model_dir}
    print(f"Voice model downloaded to {model_path}")
else:
    print(f"Voice model {voice_name} already downloaded.")

# Load the voice model
use_cuda = False
try:
    import torch
    if torch.cuda.is_available() and "onnxruntime_gpu" in globals(): # Check if onnxruntime-gpu was successfully imported/installed
        use_cuda = True
        print("Loading voice model with CUDA acceleration.")
    else:
        print("Loading voice model using CPU.")
except Exception:
    print("Loading voice model using CPU (torch not found or CUDA not available).")


try:
    voice = PiperVoice.load(model_path, use_cuda=use_cuda)
except Exception as e:
    print(f"Error loading voice model: {e}")
    print("Attempting to load without CUDA acceleration if it failed with CUDA.")
    voice = PiperVoice.load(model_path, use_cuda=False)


# Set up synthesis configuration
syn_config = SynthesisConfig(
    volume=volume,
    length_scale=length_scale,
    noise_scale=noise_scale,
    noise_w_scale=noise_w_scale,
    normalize_audio=normalize_audio,
)

# Synthesize speech and save to WAV file
output_filename = "output.wav"
print(f"Generating speech to '{output_filename}'...")
try:
    with wave.open(output_filename, "wb") as wav_file:
        voice.synthesize_wav(text_to_generate, wav_file, syn_config=syn_config)
    print("Speech generation complete!")

    # 7. Display the generated audio
    print("\n--- Generated Audio ---")
    display(Audio(output_filename))

except Exception as e:
    print(f"An error occurred during speech synthesis: {e}")
    print("Please check the input text and ensure the model loaded correctly.")

Downloading voice model: en_US-lessac-medium...
INFO:__main__:Downloaded: en_US-lessac-medium
Voice model downloaded to piper_voices/en_US-lessac-medium.onnx
Loading voice model using CPU.
Generating speech to 'output.wav'...
Speech generation complete!

--- Generated Audio ---
