<a href="https://colab.research.google.com/github/Amey-Thakur/DEEPFAKE-AUDIO/blob/main/DEEPFAKE-AUDIO.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<h1 align="center">🎙️ Deepfake Audio</h1>
<h3 align="center"><i>A neural voice cloning studio powered by SV2TTS technology</i></h3>

<div align="center">

| **Author** | **Profiles** |
|:---:|:---|
| **Amey Thakur** | [![GitHub](https://img.shields.io/badge/GitHub-Amey--Thakur-181717?logo=github)](https://github.com/Amey-Thakur) [![ORCID](https://img.shields.io/badge/ORCID-0000--0001--5644--1575-A6CE39?logo=orcid)](https://orcid.org/0000-0001-5644-1575) [![Google Scholar](https://img.shields.io/badge/Google_Scholar-Amey_Thakur-4285F4?logo=google-scholar&logoColor=white)](https://scholar.google.ca/citations?user=0inooPgAAAAJ&hl=en) [![Kaggle](https://img.shields.io/badge/Kaggle-Amey_Thakur-20BEFF?logo=kaggle)](https://www.kaggle.com/ameythakur20) |
| **Mega Satish** | [![GitHub](https://img.shields.io/badge/GitHub-msatmod-181717?logo=github)](https://github.com/msatmod) [![ORCID](https://img.shields.io/badge/ORCID-0000--0002--1844--9557-A6CE39?logo=orcid)](https://orcid.org/0000-0002-1844-9557) [![Google Scholar](https://img.shields.io/badge/Google_Scholar-Mega_Satish-4285F4?logo=google-scholar&logoColor=white)](https://scholar.google.ca/citations?user=7Ajrr6EAAAAJ&hl=en) [![Kaggle](https://img.shields.io/badge/Kaggle-Mega_Satish-20BEFF?logo=kaggle)](https://www.kaggle.com/megasatish) |

---

**Attribution:** This project builds upon the foundational work of [CorentinJ/Real-Time-Voice-Cloning](https://github.com/CorentinJ/Real-Time-Voice-Cloning).

🚀 **Live Demo:** [Hugging Face Space](https://huggingface.co/spaces/ameythakur/Deepfake-Audio) | 🎥 **Video Demo:** [YouTube](https://youtu.be/i3wnBcbHDbs) | 💻 **Repository:** [GitHub](https://github.com/Amey-Thakur/DEEPFAKE-AUDIO)

<a href="https://youtu.be/i3wnBcbHDbs">
  <img src="https://img.youtube.com/vi/i3wnBcbHDbs/0.jpg" alt="Video Demo" width="60%">
</a>

</div>

## 📖 Introduction

> **An audio deepfake is when a “cloned” voice that is potentially indistinguishable from the real person’s is used to produce synthetic audio.**

This research notebook demonstrates the **SV2TTS (Speaker Verification to Text-to-Speech)** framework, a three-stage deep learning pipeline capable of cloning a voice from a mere 5 seconds of audio.

### The Pipeline
1.  **Speaker Encoder**: Creates a fixed-dimensional embedding (fingerprint) from the reference audio.
2.  **Synthesizer**: Generates a Mel Spectrogram from text, conditioned on the speaker embedding.
3.  **Vocoder**: Converts the Mel Spectrogram into a raw time-domain waveform (audible speech).
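
The data flow between the three stages can be sketched with placeholder functions. This is only an illustration: `encode_speaker`, `synthesize`, and `vocode` are hypothetical stand-ins, not the notebook's models, and the 80 mel channels and 200-sample hop are assumed values picked for the example. The 256-value embedding matches the heatmap plotted later in this notebook.

```python
# Hypothetical stand-ins for the three SV2TTS stages. They only mimic the
# shapes of the data handed from one stage to the next, not the real networks.
EMBED_DIM = 256   # embedding size, matching the 256-D heatmap in this notebook
N_MELS = 80       # assumed number of mel channels (illustrative)
HOP = 200         # assumed audio samples per spectrogram frame (illustrative)

def encode_speaker(reference_wav):
    """Stage 1: reference audio -> fixed-size speaker embedding."""
    mean = sum(reference_wav) / max(len(reference_wav), 1)
    return [mean] * EMBED_DIM

def synthesize(text, embedding):
    """Stage 2: text + embedding -> mel spectrogram (N_MELS rows x frames)."""
    frames = 5 * len(text)  # placeholder: a few frames per character
    return [[embedding[0]] * frames for _ in range(N_MELS)]

def vocode(mel):
    """Stage 3: mel spectrogram -> time-domain waveform samples."""
    frames = len(mel[0])
    return [0.0] * (frames * HOP)

reference = [0.0, 0.1, -0.1, 0.05]  # toy "audio"
embed = encode_speaker(reference)
mel = synthesize("Hello", embed)
wav = vocode(mel)
print(len(embed), len(mel), len(mel[0]), len(wav))  # 256 80 25 5000
```

The point of the sketch is the interface: the embedding is computed once per speaker and can be reused for any number of text prompts.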

## ☁️ Cloud Environment Setup
Execute the following cell **only** if you are running this notebook in a cloud environment like **Google Colab** or **Kaggle**.

This script will:
1.  **Clone the Repository**: Tries GitHub first, then falls back to the **personal Hugging Face Space** (`ameythakur/Deepfake-Audio`) if the GitHub clone fails.
2.  **Detect the Environment**: Automatically distinguishes **Kaggle** from **Colab**.
3.  **Retrieve Data**:
    *   **Kaggle**: Links assets directly from `/kaggle/input/deepfakeaudio` (no download needed).
    *   **Others**: Attempts a Git LFS pull.
4.  **Fall Back to Kagglehub**: If the LFS budget is exceeded, downloads the assets from `ameythakur20/deepfakeaudio`.
5.  **Install Dependencies**: Installs all required Python and system packages.
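
Step 2 above, telling the platforms apart, could be sketched roughly as follows. `detect_environment` is a hypothetical helper, not part of the repository. It assumes that Colab preloads the `google.colab` module and that Kaggle kernels set the `KAGGLE_KERNEL_RUN_TYPE` environment variable.

```python
import os
import sys

def detect_environment(modules=None, environ=None):
    """Return 'colab', 'kaggle', or 'local' (hypothetical helper).

    Assumptions: Colab preloads the `google.colab` module, and Kaggle
    kernels set a non-empty KAGGLE_KERNEL_RUN_TYPE environment variable.
    """
    modules = sys.modules if modules is None else modules
    environ = os.environ if environ is None else environ
    if "google.colab" in modules:
        return "colab"
    if environ.get("KAGGLE_KERNEL_RUN_TYPE"):
        return "kaggle"
    return "local"

print(detect_environment())
```

Passing `modules` and `environ` as arguments keeps the function testable outside a notebook.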

In [1]:
import os
import sys
import shutil

# Detect Cloud Environment (Colab/Kaggle)
try:
    shell = get_ipython()
    if 'google.colab' in str(shell):
        print("💻 Detected Google Colab Environment. Initiating setup...")

        # 1. Clone the Repository (GitHub with HF Fallback)
        if not os.path.exists("DEEPFAKE-AUDIO"):
            print("⬇️ Cloning DEEPFAKE-AUDIO repository from MAIN (GitHub)...")
            clone_status = shell.system("git clone https://github.com/Amey-Thakur/DEEPFAKE-AUDIO")

            # Fallback to Hugging Face if GitHub clone failed (folder empty or not created)
            if not os.path.exists("DEEPFAKE-AUDIO") or not os.listdir("DEEPFAKE-AUDIO"):
                print("⚠️ GitHub Clone Failed. Attempting Fallback: Personal Hugging Face Space...")
                if os.path.exists("DEEPFAKE-AUDIO"): shutil.rmtree("DEEPFAKE-AUDIO")
                shell.system("git clone https://huggingface.co/spaces/ameythakur/Deepfake-Audio DEEPFAKE-AUDIO")
                print("✅ Cloned from Hugging Face Space.")

        os.chdir("/content/DEEPFAKE-AUDIO")

        # Install Dependencies (Colab)
        print("🔧 Installing dependencies...")
        shell.system("apt-get install -y libsndfile1")
        # Quote every version specifier: an unquoted '>' is treated by the shell as an output redirect
        shell.system("pip install librosa==0.9.2 unidecode webrtcvad inflect umap-learn 'scikit-learn>=1.3' tqdm scipy 'matplotlib>=3.7,<3.9' 'Pillow>=10.2' soundfile huggingface_hub")



        # 2. Attempt Git LFS (Colab)
        print("📦 Attempting Git LFS pull...")
        shell.system("git lfs install")
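        shell.system("git lfs pull")
        # shell.system() returns None; IPython stores the command's exit status in the user namespace as `_exit_code`
        lfs_status = shell.user_ns.get("_exit_code", 1)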

        # 3. Check for Fallback (If LFS failed or budget exceeded)
        sample_trigger = "Dataset/samples/Steve Jobs.wav"
        if lfs_status != 0 or not os.path.exists(sample_trigger) or os.path.getsize(sample_trigger) < 1000:
            print("⚠️ GitHub LFS Budget Exceeded or Pull Failed. Using Kaggle Fallback...")
            shell.system("pip install kagglehub")
            import kagglehub
            print("🚀 Downloading assets from Kagglehub (ameythakur20/deepfakeaudio)...")
            k_path = kagglehub.dataset_download("ameythakur20/deepfakeaudio")

            # Link samples (replace any earlier symlink or directory)
            k_samples = os.path.join(k_path, "samples")
            if os.path.exists(k_samples):
                 if os.path.islink("Dataset/samples"):
                      os.unlink("Dataset/samples")  # shutil.rmtree refuses to operate on symlinks
                 elif os.path.exists("Dataset/samples"):
                      shutil.rmtree("Dataset/samples")
                 os.makedirs("Dataset", exist_ok=True)
                 os.symlink(k_samples, "Dataset/samples")
                 print("✅ Samples linked from Kaggle.")

            # Link models
            for model in ["encoder.pt", "synthesizer.pt", "vocoder.pt"]:
                 k_model = os.path.join(k_path, model)
                 if os.path.exists(k_model):
                      target = os.path.join("Dataset", model)
                      if os.path.lexists(target): os.remove(target)  # lexists also catches broken symlinks
                      os.symlink(k_model, target)
            print("✅ Models linked from Kaggle.")

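    # KAGGLE_KERNEL_RUN_TYPE typically holds values such as "Interactive" or "Batch",
    # so test for its presence rather than for the substring "kaggle"
    elif os.environ.get("KAGGLE_KERNEL_RUN_TYPE") or os.path.isdir("/kaggle/working"):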
        print("💻 Detected Kaggle Environment. Initiating setup...")
        os.chdir("/kaggle/working")

        # 1. Clone the Repository (GitHub with HF Fallback)
        if not os.path.exists("DEEPFAKE-AUDIO"):
            print("⬇️ Cloning DEEPFAKE-AUDIO repository from MAIN (GitHub)...")
            shell.system("git clone https://github.com/Amey-Thakur/DEEPFAKE-AUDIO")

            if not os.path.exists("DEEPFAKE-AUDIO") or not os.listdir("DEEPFAKE-AUDIO"):
                print("⚠️ GitHub Clone Failed. Attempting Fallback: Personal Hugging Face Space...")
                if os.path.exists("DEEPFAKE-AUDIO"): shutil.rmtree("DEEPFAKE-AUDIO")
                shell.system("git clone https://huggingface.co/spaces/ameythakur/Deepfake-Audio DEEPFAKE-AUDIO")
                print("✅ Cloned from Hugging Face Space.")

        os.chdir("/kaggle/working/DEEPFAKE-AUDIO")

        # 2. Priority: Link Kaggle Dataset (Skip LFS pull if dataset exists)
        kaggle_input = "/kaggle/input/deepfakeaudio"
        if os.path.exists(kaggle_input):
            print(f"✅ Kaggle Dataset Detected at {kaggle_input}. Linking assets...")
            # Link logic specific to Kaggle structure
            if os.path.islink("Dataset/samples"):
                 os.unlink("Dataset/samples")  # shutil.rmtree refuses to operate on symlinks
            elif os.path.exists("Dataset/samples"):
                 shutil.rmtree("Dataset/samples")
            os.makedirs("Dataset", exist_ok=True)
            # Attempt to symlink the samples folder and each model checkpoint
            try:
                if os.path.exists(os.path.join(kaggle_input, "samples")):
                    os.symlink(os.path.join(kaggle_input, "samples"), "Dataset/samples")
                for model in ["encoder.pt", "synthesizer.pt", "vocoder.pt"]:
                    src = os.path.join(kaggle_input, model)
                    dst = os.path.join("Dataset", model)
                    if os.path.exists(src):
                        if os.path.lexists(dst): os.remove(dst)  # lexists also catches broken symlinks
                        os.symlink(src, dst)
            except Exception as e: print(f"Warning during linking: {e}")
            print("✅ Assets linked from Kaggle Input.")
        else:
            print("⚠️ Kaggle Input not found. Attempting standard LFS pull...")
            shell.system("git lfs install")
            shell.system("git lfs pull")

        # Install Dependencies
        print("🔧 Installing dependencies...")
        shell.system("apt-get install -y libsndfile1")
        # Quote every version specifier: an unquoted '>' is treated by the shell as an output redirect
        shell.system("pip install librosa==0.9.2 unidecode webrtcvad inflect umap-learn 'scikit-learn>=1.3' tqdm scipy 'matplotlib>=3.7,<3.9' 'Pillow>=10.2' soundfile huggingface_hub")

        # 3. Check for Fallback (if assets are still missing or the LFS budget was exceeded)
        # Detection: the sample file is missing or is still a small LFS pointer file.
        # The LFS pull is not repeated here; it already ran above when the Kaggle input was absent.
        sample_trigger = "Dataset/samples/Steve Jobs.wav"
        if not os.path.exists(sample_trigger) or os.path.getsize(sample_trigger) < 1000:
            print("⚠️ GitHub LFS Budget Exceeded or Pull Failed. Using Kaggle Fallback...")
            shell.system("pip install kagglehub")
            import kagglehub

            # Pull from public Kaggle dataset
            print("🚀 Downloading assets from Kagglehub (ameythakur20/deepfakeaudio)...")
            k_path = kagglehub.dataset_download("ameythakur20/deepfakeaudio")

            # Link samples (replace any earlier symlink or directory)
            k_samples = os.path.join(k_path, "samples")
            if os.path.exists(k_samples):
                 if os.path.islink("Dataset/samples"):
                      os.unlink("Dataset/samples")  # shutil.rmtree refuses to operate on symlinks
                 elif os.path.exists("Dataset/samples"):
                      shutil.rmtree("Dataset/samples")
                 os.makedirs("Dataset", exist_ok=True)
                 os.symlink(k_samples, "Dataset/samples")
                 print("✅ Samples linked from Kaggle.")

            # Link models
            for model in ["encoder.pt", "synthesizer.pt", "vocoder.pt"]:
                 k_model = os.path.join(k_path, model)
                 if os.path.exists(k_model):
                      target = os.path.join("Dataset", model)
                      if os.path.lexists(target): os.remove(target)  # lexists also catches broken symlinks
                      os.symlink(k_model, target)
            print("✅ Models linked from Kaggle.")

        # 4. Pull Latest Code Changes
        print("🔄 Synchronizing with remote repository...")
        shell.system("git pull")

        # 5. Install System Dependencies
        print("🔧 Installing system dependencies (libsndfile1)...")
        shell.system("apt-get install -y libsndfile1")

        # 6. Install Python Dependencies
        print("📦 Installing Python libraries...")
        # Quote every version specifier: an unquoted '>' is treated by the shell as an output redirect
        shell.system("pip install librosa==0.9.2 unidecode webrtcvad inflect umap-learn 'scikit-learn>=1.3' tqdm scipy 'matplotlib>=3.7,<3.9' 'Pillow>=10.2' soundfile huggingface_hub")

        print("✅ Environment setup complete. Ready for cloning.")
    else:
        print("🏠 Running in local or custom environment. Skipping cloud setup.")
except NameError:
    print("🏠 Running in local or custom environment. Skipping cloud setup.")

💻 Detected Google Colab Environment. Initiating setup...
⬇️ Cloning DEEPFAKE-AUDIO repository from MAIN (GitHub)...
Cloning into 'DEEPFAKE-AUDIO'...
remote: Enumerating objects: 682, done.
remote: Counting objects: 100% (62/62), done.
remote: Compressing objects: 100% (55/55), done.
remote: Total 682 (delta 7), reused 7 (delta 7), pack-reused 620 (from 1)
Receiving objects: 100% (682/682), 71.95 MiB | 12.37 MiB/s, done.
Resolving deltas: 100% (328/328), done.
Downloading Dataset/encoder.pt (17 MB)
Error downloading object: Dataset/encoder.pt (39373b8): Smudge error: Error downloading Dataset/encoder.pt (39373b86598fa3da9fcddee6142382efe09777e8d37dc9c0561f41f0070f134e): batch response: This repository exceeded its LFS budget. The account responsible for the budget should increase it to restore access.

Errors logged to /content/DEEPFAKE-AUDIO/.git/lfs/logs/20260129T113138.869344323.log
Use `git lfs logs last` to view the log.
error: external filter 'git-lfs filter-process' f

100%|██████████| 550M/550M [00:25<00:00, 22.7MB/s]

Extracting files...





✅ Samples linked from Kaggle.
✅ Models linked from Kaggle.


## 1️⃣ Model & Data Initialization

We prioritize data availability to ensure the notebook runs smoothly regardless of the platform. The system checks for checkpoints in this order:

1.  **Repository Local** (`Dataset/` / `Source Code/`): Fast local access if cloned.
2.  **Kaggle Dataset** (`/kaggle/input/deepfakeaudio/`): Pre-loaded environment data.
    *   *Reference*: [Amey Thakur's Kaggle Dataset](https://www.kaggle.com/datasets/ameythakur20/deepfakeaudio)
3.  **Personal Backup** (Hugging Face Space): `ameythakur/Deepfake-Audio`.
    *   *Reference*: [Amey Thakur's HF Space](https://huggingface.co/spaces/ameythakur/Deepfake-Audio)
4.  **Hugging Face Auto-Download**: Last-resort fallback for fresh environments, fetched through the repository's `utils.default_models` helper.
    *   *Reference*: [CorentinJ's SV2TTS Repository](https://huggingface.co/CorentinJ/SV2TTS)
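
The priority order above amounts to "take the first candidate that is a real file". A minimal sketch of that idea follows. It is not the notebook's own check: the notebook treats files under roughly 1 KB as LFS pointers, whereas this sketch swaps in a test for the Git LFS pointer header. `is_lfs_pointer`, `first_valid`, and the temporary file names are hypothetical.

```python
import os
import tempfile

# Git LFS pointer files are small text files that begin with this line.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"

def is_lfs_pointer(path):
    """True if the file starts with the Git LFS pointer header."""
    with open(path, "rb") as f:
        return f.read(len(LFS_HEADER)) == LFS_HEADER

def first_valid(candidates):
    """Return the first candidate that exists and is not an LFS pointer."""
    for path in candidates:
        if os.path.isfile(path) and not is_lfs_pointer(path):
            return path
    return None

# Demo with temporary files standing in for real checkpoints.
with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "missing.pt")
    pointer = os.path.join(tmp, "repo_encoder.pt")
    real = os.path.join(tmp, "kaggle_encoder.pt")
    with open(pointer, "wb") as f:
        f.write(LFS_HEADER + b"\noid sha256:0000\nsize 17\n")
    with open(real, "wb") as f:
        f.write(b"\x00" * 2048)  # stand-in for real model weights
    chosen = first_valid([missing, pointer, real])
    print(os.path.basename(chosen))  # kaggle_encoder.pt
```

A header check avoids misclassifying a legitimately small file, at the cost of opening each candidate.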

In [2]:
import sys
import os
from pathlib import Path
import zipfile
import shutil

# Determine if running in Google Colab
IS_COLAB = 'google.colab' in sys.modules

# Register 'Source Code' to Python path for module imports
source_path = os.path.abspath("Source Code")
if source_path not in sys.path:
    sys.path.append(source_path)

print(f"📂 Working Directory: {os.getcwd()}")
print(f"✅ Module Path Registered: {source_path}")

# Define paths for model checkpoints
extract_path = "pretrained_models"

if not os.path.exists(extract_path):
    os.makedirs(extract_path)

# --- 🧠 Checkpoint Verification Strategy ---
print("⬇️ Verifying Model Availability...")

# Priority 1: Check Local Repository 'Dataset/' folder
core_models = ["encoder.pt", "synthesizer.pt", "vocoder.pt"]

def is_valid_pt(p):
    """Checks if a file exists and is not an LFS pointer (typically < 1KB)."""
    return os.path.exists(p) and os.path.getsize(p) > 1000

dataset_models_present = all([is_valid_pt(os.path.join("Dataset", m)) for m in core_models])

if dataset_models_present:
    print("✅ Found high-priority local models in 'Dataset/'. Verified.")
else:
    # Priority 2: Check Kaggle Dataset (Online Pre-loaded environment data)
    kaggle_path = "/kaggle/input/deepfakeaudio"
    kaggle_models_present = all([is_valid_pt(os.path.join(kaggle_path, m)) for m in core_models])

    if kaggle_models_present:
        print(f"✅ Found hardcoded Kaggle Dataset models at {kaggle_path}. Skipping download.")
    else:
        print("⚠️ Models not found or are LFS pointers. Attempting fallback download...")

        # Priority 3: Personal Hugging Face Space (ameythakur/Deepfake-Audio)
        personal_hf_success = False
        try:
            print("🚀 Attempting download from Personal Hugging Face Space (ameythakur/Deepfake-Audio)...")
            from huggingface_hub import hf_hub_download
            # Store checkpoints under pretrained_models/default, where the loader in the next cell looks for fallbacks
            default_dir = os.path.join("pretrained_models", "default")
            os.makedirs(default_dir, exist_ok=True)
            for model in core_models:
                 try:
                     fpath = hf_hub_download(repo_id="ameythakur/Deepfake-Audio", filename=f"Dataset/{model}", repo_type="space", local_dir="pretrained_models")
                 except Exception:
                     # The file may sit at the repository root instead of Dataset/
                     fpath = hf_hub_download(repo_id="ameythakur/Deepfake-Audio", filename=model, repo_type="space", local_dir="pretrained_models")
                 target = os.path.join(default_dir, model)
                 if os.path.abspath(fpath) != os.path.abspath(target) and os.path.exists(fpath): shutil.move(fpath, target)
            if os.path.exists(os.path.join("pretrained_models", "Dataset")): shutil.rmtree(os.path.join("pretrained_models", "Dataset"))
            print("✅ Models successfully acquired via Personal Hugging Face fallback.")
            personal_hf_success = True
        except Exception as e_hf:
            print(f"⚠️ Personal HF Checkpoint failed: {e_hf}. Trying External Fallback...")

        # Priority 4 (Fallback): Auto-download from HuggingFace via utils script
        if not personal_hf_success:
            try:
                from utils.default_models import ensure_default_models
                ensure_default_models(Path("pretrained_models"))
                print("✅ Models successfully acquired via External HuggingFace fallback.")
            except Exception as e:
                print(f"⚠️ Critical: Could not auto-download models. Error: {e}")

📂 Working Directory: /content/DEEPFAKE-AUDIO
✅ Module Path Registered: /content/DEEPFAKE-AUDIO/Source Code
⬇️ Verifying Model Availability...
✅ Found high-priority local models in 'Dataset/'. Verified.


## 2️⃣ Architecture Loading

We now initialize the three distinct neural networks that comprise the SV2TTS framework. Please ensure you are running on a **GPU Runtime** (e.g., T4 on Colab) for optimal performance.

In [3]:
from encoder import inference as encoder
from synthesizer.inference import Synthesizer
from vocoder import inference as vocoder
import numpy as np
import torch
from pathlib import Path

# Hardware Acceleration Check
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"🎯 Computation Device: {device}")

def resolve_checkpoint(component_name, legacy_path_suffix):
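    """
    Resolves the path to a model checkpoint, checking locations in priority order:
    1. Repository Dataset/ folder.
    2. Kaggle input directory (hardcoded: /kaggle/input/deepfakeaudio/).
    3. Auto-downloaded 'pretrained_models/default'.
    4. Legacy/manual paths under 'pretrained_models'.
    Files of 1000 bytes or less are treated as Git LFS pointers and skipped.
    Returns None if no valid checkpoint is found.
    """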

    def is_valid(p):
        return p.exists() and p.stat().st_size > 1000

    # 1. Repository Local (Dataset/)
    dataset_p = Path("Dataset") / f"{component_name.lower()}.pt"
    if is_valid(dataset_p):
        print(f"🟢 Loading {component_name} from Repository: {dataset_p}")
        return dataset_p

    # 2. Kaggle Environment (Hardcoded Path: /kaggle/input/deepfakeaudio/)
    kaggle_p = Path("/kaggle/input/deepfakeaudio") / f"{component_name.lower()}.pt"
    if is_valid(kaggle_p):
        print(f"🟢 Loading {component_name} from Kaggle: {kaggle_p}")
        return kaggle_p

    # 3. Default / Auto-Downloaded Fallback
    default_p = Path("pretrained_models/default") / f"{component_name.lower()}.pt"
    if is_valid(default_p):
        print(f"🟢 Loading {component_name} from Fallback: {default_p}")
        return default_p

    # 4. Legacy/Manual Paths
    legacy_p = Path("pretrained_models") / legacy_path_suffix
    if legacy_p.exists():
         if legacy_p.is_dir():
             pts = [f for f in legacy_p.glob("*.pt") if is_valid(f)]
             if pts: return pts[0]
             pts_rec = [f for f in legacy_p.rglob("*.pt") if is_valid(f)]
             if pts_rec: return pts_rec[0]
         elif is_valid(legacy_p):
             return legacy_p

    print(f'⚠️ Warning: {component_name} checkpoint not found! Falling back to dynamic search...')
    return None

print("⏳ Loading Neural Networks (SV2TTS Pipeline)...")

try:
    # 1. Encoder: Extract speaker embedding
    encoder_path = resolve_checkpoint("Encoder", "encoder/saved_models")
    encoder.load_model(encoder_path)

    # 2. Synthesizer: Generates spectrograms from text
    synth_path = resolve_checkpoint("Synthesizer", "synthesizer/saved_models/logs-pretrained/taco_pretrained")
    synthesizer = Synthesizer(synth_path)

    # 3. Vocoder: Converts spectrograms to audio waveforms
    vocoder_path = resolve_checkpoint("Vocoder", "vocoder/saved_models/pretrained")
    vocoder.load_model(vocoder_path)

    print("✅ Pipeline operational. All components loaded correctly.")
except Exception as e:
    print(f"❌ Architecture Error: {e}")

🎯 Computation Device: cuda
⏳ Loading Neural Networks (SV2TTS Pipeline)...
🟢 Loading Encoder from Repository: Dataset/encoder.pt
Loaded encoder "encoder.pt" trained to step 1564501
🟢 Loading Synthesizer from Repository: Dataset/synthesizer.pt
Synthesizer using device: cuda
🟢 Loading Vocoder from Repository: Dataset/vocoder.pt
Building Wave-RNN
Trainable Parameters: 4.481M
Loading model weights at Dataset/vocoder.pt
✅ Pipeline operational. All components loaded correctly.


## 3️⃣ Inference Interface

Select your **Input Method** below to begin cloning.

*   **Presets**: Choose from a list of high-quality celebrity samples.
*   **Upload**: Use your own `.wav` or `.mp3` file (5-10 seconds recommended).
*   **Record**: Capture your voice directly in the browser (Colab only).
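
To check whether a clip falls inside the recommended 5-10 second window before uploading it, something like the sketch below could be used. It relies only on the standard-library `wave` module, so it handles uncompressed PCM `.wav` files but not `.mp3`. `wav_duration_seconds` and `is_recommended_length` are hypothetical helpers, not part of the notebook.

```python
import io
import wave

def wav_duration_seconds(source):
    """Return the duration of a PCM WAV file (path or binary file object)."""
    with wave.open(source, "rb") as w:
        return w.getnframes() / w.getframerate()

def is_recommended_length(seconds, low=5.0, high=10.0):
    """True if the clip length lies in the recommended window."""
    return low <= seconds <= high

# Build a 6-second, 16 kHz, 16-bit mono clip of silence in memory.
rate, seconds = 16000, 6
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(rate)
    w.writeframes(b"\x00\x00" * (rate * seconds))
buf.seek(0)

duration = wav_duration_seconds(buf)
print(duration, is_recommended_length(duration))  # 6.0 True
```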

In [4]:
import ipywidgets as widgets
from IPython.display import display, Javascript, Audio
try:
    from google.colab import output
    HAS_COLAB = True
except ImportError:
    HAS_COLAB = False
from base64 import b64decode
import io
import librosa
import librosa.display
import os
import soundfile as sf
import matplotlib.pyplot as plt
import numpy as np
import glob

RECORD = """
const sleep = time => new Promise(resolve => setTimeout(resolve, time))
const b2text = blob => new Promise(resolve => {
  const reader = new FileReader()
  reader.onloadend = e => resolve(e.srcElement.result)
  reader.readAsDataURL(blob)
})
var record = time => new Promise(async resolve => {
  stream = await navigator.mediaDevices.getUserMedia({ audio: true })
  recorder = new MediaRecorder(stream)
  chunks = []
  recorder.ondataavailable = e => chunks.push(e.data)
  recorder.start()
  await sleep(time)
  recorder.onstop = async ()=>{
    blob = new Blob(chunks)
    text = await b2text(blob)
    resolve(text)
  }
  recorder.stop()
})"""

def record_audio(sec=10):
    if not HAS_COLAB:
        raise RuntimeError("Recording is only available in a Google Colab environment.")
    print("🔴 Recording active for %d seconds..." % sec)
    display(Javascript(RECORD))
    s = output.eval_js('record(%d)' % (sec*1000))
    print("✅ Recording saved.")
    binary = b64decode(s.split(',')[1])
    with open('recording.wav', 'wb') as f:
        f.write(binary)
    return 'recording.wav'

def visualize_results(original_wav, generated_wav, spec, embed, title="Analysis"):
    try:
        fig, axes = plt.subplots(3, 1, figsize=(10, 12))
        axes[0].set_title("Input Voice vs. Cloned Voice (Waveform)")
        try:
            librosa.display.waveshow(original_wav, alpha=0.5, ax=axes[0], label="Original")
            librosa.display.waveshow(generated_wav, alpha=0.5, ax=axes[0], label="Cloned", color='r')
            axes[0].legend()
        except Exception:
            # Fall back to plain line plots if waveshow is unavailable or fails
            axes[0].plot(original_wav, alpha=0.5, label="Original")
            axes[0].plot(generated_wav, alpha=0.5, label="Cloned", color='r')
            axes[0].legend()

        axes[1].set_title("Generated Mel Spectrogram")
        im = axes[1].imshow(spec, aspect="auto", origin="lower", interpolation='none')
        fig.colorbar(im, ax=axes[1])

        axes[2].set_title("Speaker Embedding (256-D Heatmap)")
        if len(embed) == 256:
            axes[2].imshow(embed.reshape(16, 16), aspect='auto', cmap='viridis')
        else:
            axes[2].plot(embed)

        plt.tight_layout()
        plt.show()
    except Exception as e:
        print(f"⚠️ Graphs partially failed: {e}. Audio was successful.")

# --- 🛡️ IMPROVED SAMPLE DISCOVERY ---
def find_samples_dir():
    """Locates reference samples with high persistence across all environments."""
    # Priority paths
    priority_roots = [
        "Source Code/samples",
        "Dataset/samples",
        "D:/GitHub/DEEPFAKE-AUDIO/Source Code/samples",
        "D:/GitHub/DEEPFAKE-AUDIO/Dataset/samples",
        "/content/DEEPFAKE-AUDIO/Source Code/samples",
        "/kaggle/input/deepfakeaudio/samples",
        "/kaggle/input/deepfakeaudio"
    ]

    def filter_real_audio(d):
        if not os.path.exists(d): return []
        # Check if files are real audio (not small LFS pointers < 1KB)
        return [f for f in os.listdir(d) if f.lower().endswith((".wav", ".mp3")) and os.path.getsize(os.path.join(d, f)) > 1024]

    for d in priority_roots:
        files = filter_real_audio(d)
        if files:
            print(f"✅ Samples located at: {os.path.abspath(d)}")
            return d, files

    # More aggressive glob search
    print("🔍 Searching folders for audio samples...")
    potential_matches = glob.glob("**/samples/*.wav", recursive=True) + glob.glob("**/samples/*.mp3", recursive=True)
    valid_matches = [m for m in potential_matches if os.path.getsize(m) > 1024]

    if valid_matches:
        root = os.path.dirname(valid_matches[0])
        files = [os.path.basename(f) for f in glob.glob(os.path.join(root, "*.*")) if f.lower().endswith((".wav", ".mp3")) and os.path.getsize(f) > 1024]
        print(f"✨ Located samples via glob at: {os.path.abspath(root)}")
        return root, files

    return None, []

print("Select Input Method:")
tab = widgets.Tab()

samples_dir, preset_files = find_samples_dir()
if samples_dir:
    preset_files.sort()
    for name in reversed(["Donald Trump.wav", "Steve Jobs.wav"]):
        if name in preset_files:
            preset_files.insert(0, preset_files.pop(preset_files.index(name)))
else:
    print("⚠️ Warning: No reference samples found. Please run the setup cell or upload manually.")

dropdown = widgets.Dropdown(options=preset_files,
                            value=preset_files[0] if preset_files else None,
                            description='Preset:')
uploader = widgets.FileUpload(accept='.wav,.mp3', multiple=False)
record_btn = widgets.Button(description="Start Recording (10s)", button_style='danger')
record_out = widgets.Output()

def on_record_click(b):
    with record_out:
        record_btn.disabled = True
        try: record_audio(10)
        except Exception as e: print(f"Error: {e}.")
        record_btn.disabled = False
record_btn.on_click(on_record_click)

# Tab assignment MUST use .children attribute
tab.children = [
    widgets.VBox([dropdown]),
    widgets.VBox([uploader]),
    widgets.VBox([record_btn, record_out])
]
tab.set_title(0, '🎵 Presets')
tab.set_title(1, '📂 Upload')
tab.set_title(2, '🔴 Record')
display(tab)

text_input = widgets.Textarea(
    value="Hello, I'm Elon Musk. Welcome to Deepfake Audio by Amey Thakur and Mega Satish. Explore AI voice Go!",
    placeholder='Enter text to synthesize...',
    description='Text:',
    layout=widgets.Layout(width='50%', height='100px')
)
clone_btn = widgets.Button(description="Clone Voice! 🚀", button_style='primary')
out = widgets.Output()
display(text_input, clone_btn, out)

def run_cloning(b):
    with out:
        out.clear_output()
        active_tab = tab.selected_index
        input_path = None
        try:
            if active_tab == 0:
                 if not dropdown.value: return print("❌ No preset selected.")
                 input_path = os.path.join(samples_dir, dropdown.value)
                 print(f"🎙️ Source: Preset ({dropdown.value})")
            elif active_tab == 1:
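                 if not uploader.value: return print("❌ No file uploaded.")
                 # ipywidgets 7 exposes a dict keyed by filename; ipywidgets 8 exposes a tuple of dicts
                 if isinstance(uploader.value, dict):
                     fname, meta = next(iter(uploader.value.items()))
                 else:
                     meta = uploader.value[0]
                     fname = meta['name']
                 content = bytes(meta['content'])
                 input_path = "uploaded_sample.wav"
                 with open(input_path, "wb") as f: f.write(content)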
                 print(f"🎙️ Source: Upload ({fname})")
            elif active_tab == 2:
                 if not os.path.exists("recording.wav"): return print("❌ No recording found.")
                 input_path = "recording.wav"
                 print("🎙️ Source: Microphone")

            print("⏳ Step 1/3: Encoding speaker identity...")
            original_wav, sampling_rate = librosa.load(input_path)
            preprocessed_wav = encoder.preprocess_wav(original_wav, sampling_rate)
            embed = encoder.embed_utterance(preprocessed_wav)
            print("⏳ Step 2/3: Synthesizing speech...")
            specs = synthesizer.synthesize_spectrograms([text_input.value], [embed])
            spec = specs[0]
            print("⏳ Step 3/3: Generating waveform...")
            generated_wav = vocoder.infer_waveform(spec)
            print("🎉 Synthesis Complete!")
            display(Audio(generated_wav, rate=synthesizer.sample_rate))
            print("\n📊 Generating Analysis...")
            visualize_results(original_wav, generated_wav, spec, embed)
        except Exception as e: print(f"❌ Error: {e}")
clone_btn.on_click(run_cloning)

Select Input Method:
✅ Samples located at: /content/DEEPFAKE-AUDIO/Dataset/samples

