In [None]:
# Spanish Voice Cloning with Tortoise TTS

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/juanvolpe/voiceJuan/blob/main/colab_spanish_tts.ipynb)

This notebook will help you:
1. Set up the Spanish voice cloning system
2. Upload your voice samples
3. Generate Spanish speech with your voice

First, let's set up the environment:


In [None]:
# Clone the repository
!git clone https://github.com/juanvolpe/voiceJuan.git
%cd voiceJuan

# Install dependencies
!pip install -r requirements.txt
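
Re-running the setup cell in the same Colab session makes `git clone` fail because the `voiceJuan` directory already exists. A minimal sketch of a guard against that is shown here; `clone_if_missing` and its `run` parameter are hypothetical names introduced for illustration and are not part of the repository.

```python
import subprocess
from pathlib import Path

REPO_URL = "https://github.com/juanvolpe/voiceJuan.git"


def clone_if_missing(url, dest, run=subprocess.run):
    """Clone `url` into `dest` unless `dest` already exists.

    Returns True if a clone was attempted and False if it was skipped.
    `run` is injectable (hypothetical parameter) so the function can be
    exercised without network access.
    """
    dest = Path(dest)
    if dest.exists():
        return False
    run(["git", "clone", url, str(dest)], check=True)
    return True
```

In the notebook this could be called as `clone_if_missing(REPO_URL, "voiceJuan")` before `%cd voiceJuan`, assuming the working directory is still the Colab default.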


In [None]:
## Upload Voice Samples

Now upload your WAV files. Each sample should be:
- Clear Spanish speech
- In WAV format (the upload cell skips any other file type)
- Good-quality audio with minimal background noise

Run the cell below and use the 'Choose Files' button to upload your samples:


In [None]:
from google.colab import files
import os

# Create directories
!mkdir -p tortoise/voices/juan_es/samples

# Upload interface
print("Please upload your WAV files...")
uploaded = files.upload()

# Save files
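for filename, data in uploaded.items():
    # Accept .wav in any letter case and report anything that is skipped
    if filename.lower().endswith('.wav'):
        path = f'tortoise/voices/juan_es/samples/{filename}'
        with open(path, 'wb') as f:
            f.write(data)
        print(f'Saved {filename}')
    else:
        print(f'Skipped {filename} (not a WAV file)')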

# List all uploaded samples
print("\nUploaded voice samples:")
!ls tortoise/voices/juan_es/samples/
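
A file can carry a `.wav` extension and still be unreadable or have unexpected properties. One way to sanity-check the uploaded samples is sketched below, using only the standard-library `wave` module. The helper names `describe_wav` and `describe_samples` are hypothetical, and the sketch makes no assumption about which sample rate or duration `SpanishTTS` expects; it only reports what each file contains.

```python
import wave
from pathlib import Path


def describe_wav(path):
    """Return basic properties of a PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "seconds": frames / rate if rate else 0.0,
        }


def describe_samples(folder):
    """Map each .wav file name in `folder` to its properties or an error string."""
    report = {}
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() != ".wav":
            continue
        try:
            report[path.name] = describe_wav(path)
        except (wave.Error, EOFError) as exc:
            # wave.Error covers non-PCM or malformed files; EOFError covers truncated ones
            report[path.name] = f"unreadable: {exc}"
    return report
```

For the folder used in this notebook, the call would be `describe_samples('tortoise/voices/juan_es/samples')`.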


In [None]:
## Generate Speech

Now you can generate speech from your voice samples. When `SpanishTTS` starts, it asks whether to:
1. Use the existing voice cache (faster)
2. Reprocess the voice samples (choose this if you have added new samples)

Run the cell below to start generating:


In [None]:
from spanish_tortoise import SpanishTTS
from IPython.display import Audio

# Initialize TTS
tts = SpanishTTS()  # Will ask about cache usage

# Get text input
text = input("Enter Spanish text: ")

# Available presets
presets = ['ultra_fast', 'fast', 'standard', 'high_quality']
print("\nAvailable quality presets:")
for i, p in enumerate(presets, 1):
    print(f"{i}. {p}")

# Get preset choice
while True:
    choice = input("\nSelect quality (1-4) [default=2]: ").strip()
    if not choice:
        preset = 'fast'
        break
    try:
        idx = int(choice) - 1
        if 0 <= idx < len(presets):
            preset = presets[idx]
            break
    except ValueError:
        pass
    print("Please enter a number between 1 and 4")

# Generate speech
output_file = tts.generate_speech(text, preset=preset)

# Play the generated audio
Audio(output_file)
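
The menu logic above depends on `input()`, which blocks when the notebook is run non-interactively. As a sketch, the same mapping from a menu number to a preset name can be written as a pure function; `resolve_preset` is a hypothetical helper, and the preset names are copied from the cell above rather than taken from the `SpanishTTS` API.

```python
PRESETS = ['ultra_fast', 'fast', 'standard', 'high_quality']


def resolve_preset(choice, presets=PRESETS, default='fast'):
    """Translate a 1-based menu choice (as text) into a preset name.

    Empty input yields `default`; any other input that is not a valid
    menu number raises ValueError.
    """
    choice = choice.strip()
    if not choice:
        return default
    idx = int(choice) - 1  # raises ValueError for non-numeric input
    if not 0 <= idx < len(presets):
        raise ValueError(f"choice must be between 1 and {len(presets)}")
    return presets[idx]
```

With this helper, a fixed choice such as `resolve_preset("3")` could replace the prompt loop when the notebook is run end to end.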


In [None]:
## Download Generated Audio

Download the generated audio file to your computer (run the generation cell first so that `output_file` is defined):


In [None]:
files.download(output_file)
