# 🔊 Bark Text-to-Audio Generator

Generate realistic audio from text using Suno's Bark model. This notebook runs in Colab's free tier and requires no API key.

## Model Information
Bark is an open-source, transformer-based text-to-audio model from Suno. It offers:
- Natural-sounding speech synthesis
- Multilingual support (quality varies by language; English is generally the strongest)
- Ability to generate music and sound effects
- A library of built-in speaker presets (Bark does not officially support cloning custom voices)

While newer models exist, Bark remains a practical and versatile option for Colab users: it needs no API key and runs on a Colab GPU runtime. Generation on a CPU-only runtime works but is considerably slower.

## Features
- Generate natural-sounding speech
- Multiple voice presets
- Support for sound effects and music
- Adjustable generation parameters

## Setup
First, let's install the required packages:

In [None]:
!pip install -q git+https://github.com/suno-ai/bark.git
!pip install -q gradio
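
If the Colab runtime runs short on GPU memory, Bark's README describes two environment variables that trade quality and speed for a smaller footprint. This is an optional sketch: whether you need it depends on the GPU Colab assigns, and the variables only take effect if they are set before `bark` is imported for the first time.

```python
import os

# Optional: must run BEFORE the first `import bark` in the session.
# SUNO_USE_SMALL_MODELS loads smaller checkpoints (lower quality, less memory).
# SUNO_OFFLOAD_CPU keeps idle sub-models on the CPU between generation stages.
os.environ["SUNO_USE_SMALL_MODELS"] = "True"
os.environ["SUNO_OFFLOAD_CPU"] = "True"

print({k: v for k, v in os.environ.items() if k.startswith("SUNO_")})
```

If `bark` has already been imported, restart the runtime before running this cell.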

## Import Dependencies

In [None]:
from bark import SAMPLE_RATE, generate_audio, preload_models
from scipy.io.wavfile import write as write_wav
import gradio as gr
import numpy as np
import torch
import os

## Load Models

In [None]:
# Download and load all models
preload_models()

print(f"Models loaded on: {'GPU' if torch.cuda.is_available() else 'CPU'}")

## Define Voice Presets

In [None]:
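# Friendly labels mapped to Bark speaker presets. Gender labels follow Bark's
# speaker library, which lists en_speaker_9 as the only female English v2
# voice; check the library if you change these.
VOICE_PRESETS = {
    "Announcer (Male)": "v2/en_speaker_6",
    "Narrator (Female)": "v2/en_speaker_9",
    "Professional (Male)": "v2/en_speaker_3",
    "Casual (Male)": "v2/en_speaker_7",
    "Elder (Male)": "v2/en_speaker_5",
    "Young (Male)": "v2/en_speaker_8",
}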
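
The preset strings follow a regular pattern, `v2/<language>_speaker_<index>`, so presets for other languages can be built the same way. The helper below is a minimal sketch: `preset_id` is a hypothetical name, and the language codes and the 0 to 9 index range are assumptions based on Bark's published speaker library, so verify them against the library before relying on them.

```python
# Language codes assumed from Bark's README; verify against the speaker library.
SUPPORTED_LANGS = {
    "en", "de", "es", "fr", "hi", "it", "ja",
    "ko", "pl", "pt", "ru", "tr", "zh",
}


def preset_id(lang: str, index: int) -> str:
    """Build a v2 preset string such as 'v2/en_speaker_6' (hypothetical helper)."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"Unsupported language code: {lang!r}")
    if not 0 <= index <= 9:
        raise ValueError("Speaker index is assumed to be in the range 0-9")
    return f"v2/{lang}_speaker_{index}"


print(preset_id("de", 3))
```

A new entry can then be added with, for example, `VOICE_PRESETS["German (3)"] = preset_id("de", 3)`.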

## Create Generation Function

In [None]:
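def generate_audio_from_text(text, voice_preset, temperature=0.7):
    """Synthesize `text` with the chosen preset and return the path to a WAV file."""
    # Bark's generate_audio() has no `temperature` argument; it takes separate
    # text_temp and waveform_temp values, so the slider value drives both.
    audio_array = generate_audio(
        text,
        history_prompt=VOICE_PRESETS[voice_preset],
        text_temp=temperature,
        waveform_temp=temperature
    )

    # Save audio to a WAV file (overwritten on each call)
    output_path = "output.wav"
    write_wav(output_path, SAMPLE_RATE, audio_array)

    return output_path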
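
Bark produces roughly 13 seconds of audio per call, so longer inputs tend to get cut off or drift. A common workaround is to split the text at sentence boundaries and synthesize each piece separately. The sketch below covers only the splitting step; `split_into_chunks` is a hypothetical helper, and the 200-character default is an assumed starting point, not a documented limit.

```python
import re


def split_into_chunks(text: str, max_chars: int = 200) -> list:
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept intact as its own chunk.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


for chunk in split_into_chunks("One. Two. Three.", max_chars=9):
    print(chunk)
```

Each chunk can be passed to `generate_audio` with the same `history_prompt`, and the resulting arrays joined with `np.concatenate` before writing a single WAV file.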

## Create Gradio Interface

In [None]:
interface = gr.Interface(
    fn=generate_audio_from_text,
    inputs=[
        gr.Textbox(
            label="Text",
            placeholder="Enter the text you want to convert to speech...",
            lines=5
        ),
        gr.Dropdown(
            choices=list(VOICE_PRESETS.keys()),
            value="Narrator (Female)",
            label="Voice Preset"
        ),
        gr.Slider(
            minimum=0.1,
            maximum=1.0,
            value=0.7,
            step=0.1,
            label="Temperature",
            info="Higher values = more variation"
        )
    ],
    outputs=gr.Audio(label="Generated Audio"),
    title="Bark Text-to-Audio Generator",
    description="Convert text to natural-sounding speech using Suno's Bark model.",
    examples=[
        ["Welcome to the future of text-to-speech technology. This audio was generated using artificial intelligence.",
         "Announcer (Male)",
         0.7],
        ["Once upon a time, in a galaxy far, far away, there lived a young adventurer who dreamed of exploring the stars.",
         "Narrator (Female)",
         0.7],
        ["Breaking news: Scientists have discovered a new species of deep-sea creatures living in the Mariana Trench.",
         "Professional (Male)",
         0.7]
    ]
)

interface.launch(share=True)