# Voice Cloning & Augmentation for Training Dataset

This notebook takes **every audio file** in each category (`buka` = open, `tutup` = close) under the `data_voice` folder and produces **exactly 100 files** per category with consistent names (buka1, buka2, ..., buka100 and tutup1, tutup2, ..., tutup100).

**Source Files:**
- Buka category: 10 files (Buka_1.m4a - Buka_10.m4a)
- Tutup category: 10 files (Tutup_1.m4a - Tutup_10.m4a)

Augmentation techniques used:
- Pitch Shifting
- Time Stretching
- Adding Noise
- Speed Change
- Volume Change
- Random combinations of the techniques above

## 1. Install Dependencies

In [1]:
# Install required libraries
!pip install librosa soundfile audiomentations pydub numpy scipy


In [2]:
!pip install tqdm


## 2. Import Libraries

In [3]:
import os
import librosa
import soundfile as sf
import numpy as np
from audiomentations import Compose, AddGaussianNoise, TimeStretch, PitchShift, Shift
from pydub import AudioSegment
import random
from tqdm import tqdm

## 3. Setup Paths

In [4]:
base_path = "/workspaces/PSD/PSD/Data/data_voice"
output_path = "/workspaces/PSD/PSD/Data/voice_augmented"

# Target number of files per category
TARGET_FILES_PER_CATEGORY = 100

# Make sure the output folders (buka & tutup) exist
os.makedirs(os.path.join(output_path, "buka"), exist_ok=True)
os.makedirs(os.path.join(output_path, "tutup"), exist_ok=True)

# === FUNCTION: Get all audio files ===
def get_audio_files(category_path):
    """Get all audio files (mp3, wav, m4a, flac) from a directory, sorted by name"""
    audio_extensions = ['.mp3', '.wav', '.m4a', '.flac']
    files = []
    if os.path.exists(category_path):
        for file in os.listdir(category_path):
            if any(file.lower().endswith(ext) for ext in audio_extensions):
                files.append(file)
    return sorted(files)

# === SCAN ALL AUDIO FILES ===
SOURCE_FILES = {
    'buka': get_audio_files(os.path.join(base_path, 'buka')),
    'tutup': get_audio_files(os.path.join(base_path, 'tutup'))
}

# === PRINT CHECK ===
print(f"Base path: {base_path}")
print(f"Output path: {output_path}")
print(f"Target: {TARGET_FILES_PER_CATEGORY} files per category")
print(f"\nSource files found:")

for category, files in SOURCE_FILES.items():
    print(f"  {category}: {len(files)} files")
    for file in files:
        print(f"    - {file}")


Base path: /workspaces/PSD/PSD/Data/data_voice
Output path: /workspaces/PSD/PSD/Data/voice_augmented
Target: 100 files per category

Source files found:
  buka: 10 files
    - Buka_1.m4a
    - Buka_10.m4a
    - Buka_2.m4a
    - Buka_3.m4a
    - Buka_4.m4a
    - Buka_5.m4a
    - Buka_6.m4a
    - Buka_7.m4a
    - Buka_8.m4a
    - Buka_9.m4a
  tutup: 10 files
    - Tutup_1.m4a
    - Tutup_10.m4a
    - Tutup_2.m4a
    - Tutup_3.m4a
    - Tutup_4.m4a
    - Tutup_5.m4a
    - Tutup_6.m4a
    - Tutup_7.m4a
    - Tutup_8.m4a
    - Tutup_9.m4a
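
The listing above is in lexicographic order, which is why `Buka_10.m4a` appears before `Buka_2.m4a`. That does not affect the augmentation itself, but if numeric order matters, a natural-sort key is one option. A minimal sketch, where `natural_key` is a hypothetical helper that is not part of the notebook:

```python
import re

def natural_key(name):
    """Split a filename into text and integer chunks so that 2 sorts before 10."""
    # re.split with a capturing group keeps the digit runs in the result,
    # always at the odd positions, so str/int chunks line up between names
    parts = re.split(r"(\d+)", name)
    return [int(p) if p.isdigit() else p.lower() for p in parts]

files = ["Buka_1.m4a", "Buka_10.m4a", "Buka_2.m4a", "Buka_9.m4a"]
ordered = sorted(files, key=natural_key)
print(ordered)
```

Passing `key=natural_key` to the `sorted(files)` call in `get_audio_files` would be one way to apply it.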


## 4. Audio Augmentation Functions

In [5]:
import os
import librosa
import soundfile as sf
import audioread

def load_audio(file_path):
    """
    Load audio file and return audio data and sample rate
    """
    try:
        audio, sr = librosa.load(file_path, sr=None)
        return audio, sr
    except Exception as e:
        print(f"Error loading {file_path}: {e}")
        return None, None


def save_audio(audio, sr, output_path):
    """
    Save audio data to file (make sure folder exists)
    """
    try:
        # Make sure the destination folder exists (dirname is '' for a bare filename)
        out_dir = os.path.dirname(output_path)
        if out_dir:
            os.makedirs(out_dir, exist_ok=True)
        sf.write(output_path, audio, sr)
    except Exception as e:
        print(f"Error saving {output_path}: {e}")


In [6]:
import numpy as np
import librosa

def pitch_shift_augmentation(audio, sr, n_steps_list=(2, -2, 3, -3)):
    """
    Pitch shifting: change the pitch of the voice by n_steps semitones
    """
    augmented_audios = []
    for n_steps in n_steps_list:
        shifted = librosa.effects.pitch_shift(audio, sr=sr, n_steps=n_steps)
        augmented_audios.append((shifted, f"pitch_{n_steps}"))
    return augmented_audios


def time_stretch_augmentation(audio, rates=(0.9, 1.1, 0.85, 1.15)):
    """
    Time stretching: change the speed without changing the pitch
    """
    augmented_audios = []
    for rate in rates:
        try:
            stretched = librosa.effects.time_stretch(audio, rate=rate)
            augmented_audios.append((stretched, f"timestretch_{rate}"))
        except Exception as e:
            print(f"⚠️ Time-stretch failed ({rate}): {e}")
    return augmented_audios


def add_noise_augmentation(audio, sr, noise_levels=(0.005, 0.01, 0.015)):
    """
    Add Gaussian noise: add noise to the audio, then peak-normalize
    """
    augmented_audios = []
    for noise_level in noise_levels:
        noise = np.random.randn(len(audio)) * noise_level
        noisy_audio = audio + noise
        # Normalize to peak 1.0 (skip all-zero signals to avoid dividing by 0)
        max_val = np.max(np.abs(noisy_audio))
        if max_val > 0:
            noisy_audio = noisy_audio / max_val
        augmented_audios.append((noisy_audio, f"noise_{noise_level}"))
    return augmented_audios


def speed_change_augmentation(audio, sr, speed_factors=(1.1, 0.9, 1.2, 0.8)):
    """
    Speed change: resample by index selection, which changes both speed and pitch
    """
    augmented_audios = []
    for speed in speed_factors:
        # Step through the signal in increments of `speed` and keep the nearest sample
        indices = np.round(np.arange(0, len(audio), speed)).astype(int)
        indices = indices[indices < len(audio)]
        if len(indices) > 0:
            changed = audio[indices]
            augmented_audios.append((changed, f"speed_{speed}"))
    return augmented_audios


def volume_change_augmentation(audio, volume_factors=(1.2, 0.8, 1.3, 0.7)):
    """
    Volume change: scale the amplitude of the audio
    """
    augmented_audios = []
    for volume in volume_factors:
        changed = audio * volume
        # Clip to [-1.0, 1.0] to keep samples within the valid range
        changed = np.clip(changed, -1.0, 1.0)
        augmented_audios.append((changed, f"volume_{volume}"))
    return augmented_audios
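
`speed_change_augmentation` above keeps the nearest existing sample for each output position (nearest-neighbour resampling), which shifts pitch along with speed. Linear interpolation is a slightly smoother alternative, though neither approach applies an anti-aliasing filter. The sketch below is an illustration under that assumption, not the notebook's method; `speed_change_interp` is a hypothetical name.

```python
import numpy as np

def speed_change_interp(audio, speed):
    """Resample `audio` by `speed` using linear interpolation (pitch changes too)."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    n_out = max(1, int(round(len(audio) / speed)))
    # Fractional positions in the input that each output sample reads from
    positions = np.arange(n_out) * speed
    # np.interp clamps positions beyond the last sample to the final value
    return np.interp(positions, np.arange(len(audio)), audio)

ramp = np.arange(1000, dtype=float)
faster = speed_change_interp(ramp, 2.0)   # half as many samples
slower = speed_change_interp(ramp, 0.5)   # twice as many samples
print(len(faster), len(slower))
```

On the ramp signal, `slower[1]` falls halfway between the first two input samples, which nearest-neighbour selection cannot produce.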


In [7]:
import numpy as np
import librosa
import random

def generate_diverse_augmentation(audio, sr):
    """
    Generate a wide range of augmented variations using more varied parameters
    """
    augmented_audios = []
    
    # 1. Pitch shift variations
    pitch_steps = [-4, -3, -2, -1, 1, 2, 3, 4, -2.5, 2.5]
    for step in pitch_steps:
        try:
            shifted = librosa.effects.pitch_shift(audio, sr=sr, n_steps=step)
            augmented_audios.append(shifted)
        except Exception as e:
            print(f"⚠️ Pitch shift failed ({step}): {e}")
    
    # 2. Time stretch variations
    time_rates = [0.8, 0.85, 0.9, 0.95, 1.05, 1.1, 1.15, 1.2, 0.92, 1.08]
    for rate in time_rates:
        try:
            stretched = librosa.effects.time_stretch(audio, rate=rate)
            augmented_audios.append(stretched)
        except Exception as e:
            print(f"⚠️ Time stretch failed ({rate}): {e}")
    
    # 3. Noise variations
    noise_levels = [0.003, 0.005, 0.007, 0.01, 0.012, 0.015, 0.02, 0.004, 0.008, 0.018]
    for noise_level in noise_levels:
        noise = np.random.randn(len(audio)) * noise_level
        noisy = audio + noise
        max_val = np.max(np.abs(noisy))
        if max_val > 0:
            noisy = noisy / max_val
        augmented_audios.append(noisy)
    
    # 4. Speed variations
    speed_factors = [0.75, 0.85, 0.9, 0.95, 1.05, 1.1, 1.15, 1.25, 0.88, 1.12]
    for speed in speed_factors:
        indices = np.round(np.arange(0, len(audio), speed)).astype(int)
        indices = indices[indices < len(audio)]
        if len(indices) > 0:
            changed = audio[indices]
            augmented_audios.append(changed)
    
    # 5. Volume variations
    volume_factors = [0.6, 0.7, 0.8, 0.9, 1.1, 1.2, 1.3, 1.4, 0.75, 1.25]
    for volume in volume_factors:
        changed = audio * volume
        changed = np.clip(changed, -1.0, 1.0)
        augmented_audios.append(changed)
    
    # 6. Combined augmentations (random combinations)
    num_combinations = 50
    for i in range(num_combinations):
        aug_audio = audio.copy()
        
        # Random pitch (50% chance)
        if random.random() > 0.5:
            n_steps = random.uniform(-3, 3)
            try:
                aug_audio = librosa.effects.pitch_shift(aug_audio, sr=sr, n_steps=n_steps)
            except Exception as e:
                print(f"⚠️ Random pitch failed ({n_steps:.2f}): {e}")
        
        # Random time stretch (50% chance)
        if random.random() > 0.5:
            rate = random.uniform(0.85, 1.15)
            try:
                aug_audio = librosa.effects.time_stretch(aug_audio, rate=rate)
            except Exception as e:
                print(f"⚠️ Random time stretch failed ({rate:.2f}): {e}")
        
        # Random noise (60% chance)
        if random.random() > 0.4:
            noise_level = random.uniform(0.003, 0.015)
            noise = np.random.randn(len(aug_audio)) * noise_level
            aug_audio = aug_audio + noise
            max_val = np.max(np.abs(aug_audio))
            if max_val > 0:
                aug_audio = aug_audio / max_val
        
        # Random volume (50% chance)
        if random.random() > 0.5:
            volume = random.uniform(0.7, 1.3)
            aug_audio = aug_audio * volume
            aug_audio = np.clip(aug_audio, -1.0, 1.0)
        
        augmented_audios.append(aug_audio)
    
    return augmented_audios
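
The combined step above draws each transform independently: pitch, time stretch, and volume with probability 0.5, noise with probability 0.6. To log or replay which parameters produced a sample, the draws can be separated from the audio processing. A minimal sketch, assuming a hypothetical helper `sample_combo_params` and a local `random.Random` instance instead of the global seed:

```python
import random

def sample_combo_params(rng):
    """Draw one random augmentation recipe with the same ranges as the combined step."""
    params = {}
    if rng.random() > 0.5:                      # pitch shift, 50% chance
        params["n_steps"] = rng.uniform(-3, 3)
    if rng.random() > 0.5:                      # time stretch, 50% chance
        params["rate"] = rng.uniform(0.85, 1.15)
    if rng.random() > 0.4:                      # Gaussian noise, 60% chance
        params["noise_level"] = rng.uniform(0.003, 0.015)
    if rng.random() > 0.5:                      # volume change, 50% chance
        params["volume"] = rng.uniform(0.7, 1.3)
    return params

rng = random.Random(42)
recipes = [sample_combo_params(rng) for _ in range(50)]
print(recipes[0])
```

Each recipe is a plain dict, so it could be stored next to the output file to record how that file was made.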


## 5. Augmentation Function to Generate 100 Files from Multiple Source Files

In [8]:
import os
import random
import gc

def augment_multiple_files_to_target(source_files, category, base_path, output_path, target_count=100):
    """
    Augment multiple audio files into exactly target_count files with consistent naming
    """
    print(f"\n{'='*60}")
    print(f"Processing {category.upper()}")
    print(f"Source files: {len(source_files)} files")
    print(f"Target: {target_count} files")
    print(f"{'='*60}")
    
    if not source_files:
        print(f"Error: No source files found for {category}")
        return 0
    
    output_dir = os.path.join(output_path, category)
    os.makedirs(output_dir, exist_ok=True)   # make sure the output folder exists
    
    files_created = 0
    all_augmented = []
    
    # Process each source file
    for idx, filename in enumerate(source_files, 1):
        source_file = os.path.join(base_path, category, filename)
        
        print(f"\n[{idx}/{len(source_files)}] Loading: {filename}")
        
        # Load audio
        audio, sr = load_audio(source_file)
        if audio is None:
            print(f"  ⚠️ Error: Failed to load {source_file}")
            continue
        
        # Add original
        all_augmented.append((audio, sr, f"original_{filename}"))
        
        # Generate augmented versions
        print(f"  Generating augmentations...")
        try:
            augmented_audios = generate_diverse_augmentation(audio, sr)
            for aug_audio in augmented_audios:
                all_augmented.append((aug_audio, sr, f"aug_{filename}"))
            print(f"  Generated: {len(augmented_audios)} variations")
        except Exception as e:
            print(f"  ⚠️ Augmentation error for {filename}: {e}")
        
        # optional memory release per file
        gc.collect()
    
    # Shuffle all augmented files for variety
    print(f"\nTotal generated: {len(all_augmented)} audio samples")
    print(f"Selecting {target_count} samples...")
    random.shuffle(all_augmented)
    
    # Select exactly target_count files
    selected = all_augmented[:target_count]
    
    # Save with sequential numbering
    print(f"\nSaving {len(selected)} files to: {output_dir}")
    for i, (aug_audio, sr, source_info) in enumerate(selected, 1):
        output_file = os.path.join(output_dir, f"{category}{i}.wav")
        # save_audio() catches its own exceptions, so confirm the file exists before counting it
        save_audio(aug_audio, sr, output_file)
        if os.path.exists(output_file):
            files_created += 1
        else:
            print(f"  ⚠️ Failed to save file {output_file}")
        
        # Progress indicator
        if i % 10 == 0 or i == len(selected):
            print(f"  Progress: {i}/{len(selected)} files saved")
    
    print(f"✓ Completed: {files_created} files created for '{category}'")
    return files_created


def augment_dataset_100_per_category(base_path, output_path, source_files, target_count=100):
    """
    Augment the dataset: multiple files per category become target_count files per category
    """
    stats = {}
    
    for category, filenames in source_files.items():
        if not filenames:
            print(f"Warning: No source files found for category: {category}")
            stats[category] = 0
            continue
        
        files_created = augment_multiple_files_to_target(
            filenames,
            category, 
            base_path,
            output_path, 
            target_count
        )
        stats[category] = files_created
    
    print("\n=== SUMMARY ===")
    for cat, count in stats.items():
        print(f"{cat}: {count} files generated")
    
    return stats
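
Because `augment_multiple_files_to_target` shuffles all samples together and keeps the first `target_count`, some source recordings may contribute more output files than others, and an original recording is only kept if the shuffle happens to pick it. If a more even spread is wanted, a round-robin pick across sources is one option. A minimal sketch under that assumption; `select_balanced` is hypothetical and works on any per-source lists:

```python
import random

def select_balanced(samples_by_source, target_count, rng=None):
    """Pick up to target_count items, cycling through sources so each contributes evenly."""
    rng = rng or random.Random()
    # Shuffle a copy of each source's pool so the picks within a source are random
    pools = {src: rng.sample(items, len(items)) for src, items in samples_by_source.items()}
    selected = []
    while len(selected) < target_count and any(pools.values()):
        for src in sorted(pools):
            if pools[src] and len(selected) < target_count:
                selected.append((src, pools[src].pop()))
    return selected

# Toy stand-in: 10 sources with 101 sample ids each (1 original + 100 variations)
samples = {f"Buka_{i}.m4a": list(range(101)) for i in range(1, 11)}
picked = select_balanced(samples, 100, random.Random(42))
per_source = {src: sum(1 for s, _ in picked if s == src) for src in samples}
print(per_source)
```

With 10 sources and a target of 100, every source ends up with 10 picks; if the pools run out first, the function simply returns fewer items.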


## 6. Run the Augmentation (Multiple Source Files → 100 Files per Category)

In [9]:
import random
import numpy as np

# Set random seeds (so the augmentation results are reproducible)
random.seed(42)
np.random.seed(42)

print("="*60)
print("VOICE AUGMENTATION: Multiple Files -> 100 Files per Category")
print("="*60)

# Make sure the paths are correct (use absolute paths)
base_path = "/workspaces/PSD/PSD/Data/data_voice"
output_path = "/workspaces/PSD/PSD/Data/voice_augmented"

# Run the main augmentation
stats = augment_dataset_100_per_category(
    base_path=base_path, 
    output_path=output_path, 
    source_files=SOURCE_FILES, 
    target_count=TARGET_FILES_PER_CATEGORY
)

print("\n=== FINAL SUMMARY ===")
for cat, count in stats.items():
    print(f"{cat}: {count} files created")


VOICE AUGMENTATION: Multiple Files -> 100 Files per Category

Processing BUKA
Source files: 10 files
Target: 100 files

[1/10] Loading: Buka_1.m4a


  audio, sr = librosa.load(file_path, sr=None)
	Deprecated as of librosa version 0.10.0.
	It will be removed in librosa version 1.0.
  y, sr_native = __audioread_load(path, offset, duration, dtype)


  Generating augmentations...
  Generated: 100 variations

[2/10] Loading: Buka_10.m4a
  Generating augmentations...
  Generated: 100 variations

[3/10] Loading: Buka_2.m4a
  Generating augmentations...
  Generated: 100 variations

[4/10] Loading: Buka_3.m4a
  Generating augmentations...
  Generated: 100 variations

[5/10] Loading: Buka_4.m4a
  Generating augmentations...
  Generated: 100 variations

[6/10] Loading: Buka_5.m4a
  Generating augmentations...
  Generated: 100 variations

[7/10] Loading: Buka_6.m4a
  Generating augmentations...
  Generated: 100 variations

[8/10] Loading: Buka_7.m4a
  Generating augmentations...
  Generated: 100 variations

[9/10] Loading: Buka_8.m4a
  Generating augmentations...
  Generated: 100 variations

[10/10] Loading: Buka_9.m4a
  Generating augmentations...
  Generated: 100 variations

Total generated: 1010 audio samples
Selecting 100 samples...

Saving 100 files to: /workspaces/PSD/PSD/Data/voice_augmented/buka
  Progress: 10/100 files saved
  Progress: 20/100 files saved
  Progress: 30/100 files saved
  Progress: 40/100 files saved
  Progress: 50/100 files saved
  Progress: 60/100 files saved
  Progress: 70/100 files saved
  Progress: 80/100 files saved
  Progress: 90/100 files saved
  Progress: 100/100 files saved
✓ Completed: 100 files created for 'buka'

Processing TUTUP
Source files: 10 files
Target: 100 files

[1/10] Loading: Tutup_1.m4a
  Generating augmentations...
  Generated: 100 variations

[2/10] Loading: Tutup_10.m4a
  Generating augmentations...
  Generated: 100 variations

[3/10] Loading: Tutup_2.m4a
  Generating augmentations...
  Generated: 100 variations

[4/10] Loading: Tutup_3.m4a
  Generating augmentations...
  Generated: 100 variations

[5/10] Loading: Tutup_4.m4a
  Generating augmentations...
  Generated: 100 variations

[6/10] Loading: Tutup_5.m4a
  Generating augmentations...
  Generated: 100 variations

[7/10] Loading: Tutup_6.m4a
  Generating augmentations...
  Generated: 100 variations

[8/10] Loading: Tutup_7.m4a
  Generating augmentations...
  Generated: 100 variations

[9/10] Loading: Tutup_8.m4a
  Generating augmentations...
  Generated: 100 variations

[10/10] Loading: Tutup_9.m4a
  Generating augmentations...
  Generated: 100 variations

Total generated: 1010 audio samples
Selecting 100 samples...

Saving 100 files to: /workspaces/PSD/PSD/Data/voice_augmented/tutup
  Progress: 10/100 files saved
  Progress: 20/100 files saved
  Progress: 30/100 files saved
  Progress: 40/100 files saved
  Progress: 50/100 files saved
  Progress: 60/100 files saved
  Progress: 70/100 files saved
  Progress: 80/100 files saved
  Progress: 90/100 files saved
  Progress: 100/100 files saved
✓ Completed: 100 files created for 'tutup'

=== SUMMARY ===
buka: 100 files generated
tutup: 100 files generated

=== FINAL SUMMARY ===
buka: 100 files created
tutup: 100 files created
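
The count of 1010 samples per category in the log follows from the loop sizes: each of the 10 source files contributes 1 original plus 100 variations (five groups of 10 fixed settings and 50 random combinations). A quick check of that arithmetic, including how many untouched originals a uniform random pick of 100 is expected to keep:

```python
n_sources = 10
fixed_groups = 5          # pitch, time stretch, noise, speed, volume
settings_per_group = 10
random_combinations = 50

variations = fixed_groups * settings_per_group + random_combinations
total = n_sources * (1 + variations)

target = 100
# Each sample is equally likely to be kept, so the expected number of originals is
# target * (number of originals / total)
expected_originals = target * n_sources / total

print(variations, total, round(expected_originals, 2))  # prints: 100 1010 0.99
```

So on average about one of the 100 saved files per category is an unmodified recording, and a given run may contain none.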


## 7. Verify Results & Statistics

In [10]:
print("\n" + "="*60)
print("AUGMENTATION RESULTS")
print("="*60)

for category, count in stats.items():
    print(f"\n{category.upper()}: {count} files")
    
    # Path to the augmented output folder
    category_path = os.path.join(output_path, category)
    
    if not os.path.exists(category_path):
        print(f"  ⚠️ Folder not found: {category_path}")
        continue
    
    # Collect all .wav files in the category folder
    actual_files = [f for f in os.listdir(category_path) if f.endswith('.wav')]
    actual_files_sorted = sorted(actual_files, key=lambda x: int(''.join(filter(str.isdigit, x)) or 0))
    print(f"  Verified: {len(actual_files)} files in folder")
    
    # Show a few sample filenames
    print(f"  Sample filenames:")
    sample_indices = [1, 2, 3, 50, 99, 100]
    for i in sample_indices:
        expected_file = f"{category}{i}.wav"
        if expected_file in actual_files:
            print(f"    ✓ {expected_file}")
        else:
            print(f"    ✗ {expected_file} (missing)")
    
total_files = sum(stats.values())
print(f"\n{'='*60}")
print(f"TOTAL FILES CREATED: {total_files}")
print(f"{'='*60}")

# === Additional verification ===
print("\n" + "="*60)
print("FOLDER STRUCTURE")
print("="*60)
for category in ['buka', 'tutup']:
    category_path = os.path.join(output_path, category)
    if os.path.exists(category_path):
        # Plain sorted() is lexicographic, so 'buka99.wav' ends up last, after 'buka100.wav'
        files = sorted([f for f in os.listdir(category_path) if f.endswith('.wav')])
        print(f"\n{category_path}")
        print(f"  Total: {len(files)} files")
        if files:
            print(f"  First: {files[0]}")
            print(f"  Last:  {files[-1]}")
    else:
        print(f"⚠️ Folder {category_path} not found.")



AUGMENTATION RESULTS

BUKA: 100 files
  Verified: 100 files in folder
  Sample filenames:
    ✓ buka1.wav
    ✓ buka2.wav
    ✓ buka3.wav
    ✓ buka50.wav
    ✓ buka99.wav
    ✓ buka100.wav

TUTUP: 100 files
  Verified: 100 files in folder
  Sample filenames:
    ✓ tutup1.wav
    ✓ tutup2.wav
    ✓ tutup3.wav
    ✓ tutup50.wav
    ✓ tutup99.wav
    ✓ tutup100.wav

TOTAL FILES CREATED: 200

FOLDER STRUCTURE

/workspaces/PSD/PSD/Data/voice_augmented/buka
  Total: 100 files
  First: buka1.wav
  Last:  buka99.wav

/workspaces/PSD/PSD/Data/voice_augmented/tutup
  Total: 100 files
  First: tutup1.wav
  Last:  tutup99.wav
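
The check above samples six filenames per category. `Last: buka99.wav` appears because a plain `sorted()` is lexicographic, so `buka99.wav` sorts after `buka100.wav`; nothing is missing. To confirm the full sequence instead of a sample, the expected names can be compared as sets. A minimal sketch; `find_missing` is a hypothetical helper, and the list passed in would come from `os.listdir` in the notebook:

```python
def find_missing(category, filenames, target_count=100):
    """Return (missing, unexpected) .wav names relative to category1..categoryN."""
    expected = {f"{category}{i}.wav" for i in range(1, target_count + 1)}
    actual = {f for f in filenames if f.endswith(".wav")}
    return sorted(expected - actual), sorted(actual - expected)

# Toy listing with one gap and one stray file
listing = [f"buka{i}.wav" for i in range(1, 101) if i != 57] + ["buka_old.wav"]
missing, unexpected = find_missing("buka", listing)
print(missing, unexpected)
```

Two empty lists would mean the folder holds exactly the expected 100 names.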


## 8. Visualize Audio Samples

In [11]:
import matplotlib.pyplot as plt
import librosa
import librosa.display
import numpy as np
import os

def visualize_augmented_samples(category='buka', sample_indices=(1, 25, 50, 75, 100)):
    """
    Visualize several augmented samples (waveform + spectrogram)
    """
    output_dir = os.path.join(output_path, category)
    
    # Make sure the folder exists
    if not os.path.exists(output_dir):
        print(f"⚠️ Folder {output_dir} not found.")
        return
    
    # Setup figure
    fig, axes = plt.subplots(len(sample_indices), 2, figsize=(15, 3*len(sample_indices)))
    if len(sample_indices) == 1:
        axes = np.array([axes])  # keep the indexing 2D
    
    for idx, file_num in enumerate(sample_indices):
        file_name = f"{category}{file_num}.wav"
        file_path = os.path.join(output_dir, file_name)
        
        if os.path.exists(file_path):
            audio, sr = load_audio(file_path)
            if audio is None:
                # If loading failed, show a message instead
                for ax in axes[idx]:
                    ax.text(0.5, 0.5, f"Failed to load {file_name}", ha='center', va='center', transform=ax.transAxes)
                    ax.axis("off")
                continue

            # Waveform
            axes[idx, 0].plot(audio, color='tab:blue')
            axes[idx, 0].set_title(f"Waveform: {file_name}", fontsize=10)
            axes[idx, 0].set_xlabel("Sample")
            axes[idx, 0].set_ylabel("Amplitude")
            axes[idx, 0].grid(True, alpha=0.3)

            # Spectrogram
            D = np.abs(librosa.stft(audio))
            D_db = librosa.amplitude_to_db(D, ref=np.max)
            img = librosa.display.specshow(D_db, sr=sr, x_axis='time', y_axis='hz', ax=axes[idx, 1])
            axes[idx, 1].set_title(f"Spectrogram: {file_name}", fontsize=10)
            fig.colorbar(img, ax=axes[idx, 1], format="%+2.0f dB")
        else:
            for ax in axes[idx]:
                ax.text(0.5, 0.5, f"File not found:\n{file_name}", ha='center', va='center', transform=ax.transAxes)
                ax.axis("off")
    
    plt.tight_layout()
    save_path = f'augmentation_samples_{category}.png'
    plt.savefig(save_path, dpi=150, bbox_inches='tight', facecolor='white')
    plt.show()
    
    print(f"✅ Visualization saved as '{save_path}'")


## 9. Utility Function to Load the Dataset

In [12]:
import os
from tqdm import tqdm

def load_augmented_dataset(augmented_path):
    """
    Load the augmented dataset for training.
    Returns: list of dicts {'audio', 'sr', 'label', 'filename'}
    """
    dataset = []
    categories = ['buka', 'tutup']

    print("="*60)
    print("LOADING AUGMENTED DATASET")
    print("="*60)

    for category in categories:
        category_path = os.path.join(augmented_path, category)

        # Make sure the folder exists
        if not os.path.exists(category_path):
            print(f"⚠️ Folder not found: {category_path}")
            continue

        audio_files = [f for f in os.listdir(category_path) if f.endswith('.wav')]
        audio_files = sorted(audio_files, key=lambda x: int(''.join(filter(str.isdigit, x)) or 0))

        print(f"\nLoading {category.upper()}: {len(audio_files)} files")

        for audio_file in tqdm(audio_files, desc=f"Loading {category}", ncols=80):
            file_path = os.path.join(category_path, audio_file)
            audio, sr = load_audio(file_path)
            
            # Make sure the audio is valid
            if audio is not None and len(audio) > 100:  # minimum length for valid audio
                dataset.append({
                    'audio': audio,
                    'sr': sr,
                    'label': category,
                    'filename': audio_file
                })
            else:
                print(f"  ⚠️ Skipped {audio_file} (invalid or too short)")

    print(f"\n{'='*60}")
    print(f"✅ Total dataset loaded: {len(dataset)} samples")
    print(f"{'='*60}")
    return dataset


In [13]:
from collections import Counter

print("="*60)
print("LOADING AUGMENTED DATASET & CHECKING DISTRIBUTION")
print("="*60)

# Load augmented dataset
print("Loading augmented dataset...")
augmented_dataset = load_augmented_dataset(output_path)

# Check the load result
if not augmented_dataset:
    print("⚠️ Dataset is empty or failed to load.")
else:
    # Count the label distribution
    label_counts = Counter([item['label'] for item in augmented_dataset])
    
    print("\nDataset distribution:")
    total_samples = sum(label_counts.values())
    for label, count in label_counts.items():
        perc = (count / total_samples) * 100 if total_samples > 0 else 0
        print(f"  {label}: {count} samples ({perc:.1f}%)")
    
    print(f"\nTotal samples: {total_samples}")
    print("="*60)


LOADING AUGMENTED DATASET & CHECKING DISTRIBUTION
Loading augmented dataset...
LOADING AUGMENTED DATASET

Loading BUKA: 100 files


Loading buka: 100%|█████████████████████████| 100/100 [00:00<00:00, 1641.93it/s]



Loading TUTUP: 100 files


Loading tutup: 100%|████████████████████████| 100/100 [00:00<00:00, 1967.60it/s]


✅ Total dataset loaded: 200 samples

Dataset distribution:
  buka: 100 samples (50.0%)
  tutup: 100 samples (50.0%)

Total samples: 200
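
Time stretching and speed changes alter duration, so the loaded clips are unlikely to share a length. Many training pipelines expect fixed-size inputs, and padding or trimming is one common way to get there. A minimal sketch, assuming a hypothetical `pad_or_trim` helper and a 1-second target that is only an example value:

```python
import numpy as np

def pad_or_trim(audio, target_len):
    """Zero-pad at the end or cut the tail so the clip has exactly target_len samples."""
    audio = np.asarray(audio, dtype=np.float32)
    if len(audio) >= target_len:
        return audio[:target_len]
    return np.pad(audio, (0, target_len - len(audio)))

sr = 16000                      # example sample rate, not taken from the dataset
target_len = sr * 1             # 1 second
short_clip = np.ones(12000, dtype=np.float32)
long_clip = np.ones(20000, dtype=np.float32)
batch = np.stack([pad_or_trim(c, target_len) for c in (short_clip, long_clip)])
print(batch.shape)
```

In the notebook, the same helper could be applied to each `item['audio']` in `augmented_dataset`, with `target_len` chosen from the actual sample rate and clip lengths.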





## 10. Export to Another Format (Optional)

In [14]:
from pydub import AudioSegment
from tqdm import tqdm
import os
import shutil

def convert_to_mp3(wav_path, mp3_path):
    """
    Convert WAV to MP3 (optional, to save storage space)
    """
    try:
        audio = AudioSegment.from_wav(wav_path)
        audio.export(mp3_path, format="mp3", bitrate="192k")
    except Exception as e:
        print(f"⚠️ Failed to convert {wav_path}: {e}")


def convert_augmented_to_mp3(output_path, mp3_output_path="voice_augmented_mp3", overwrite=False):
    """
    Convert all augmented WAV files to MP3 for every category (buka, tutup)
    """
    print("="*60)
    print("CONVERTING AUGMENTED DATASET TO MP3")
    print("="*60)

    os.makedirs(mp3_output_path, exist_ok=True)
    categories = ['buka', 'tutup']

    for category in categories:
        wav_dir = os.path.join(output_path, category)
        mp3_dir = os.path.join(mp3_output_path, category)
        os.makedirs(mp3_dir, exist_ok=True)

        if not os.path.exists(wav_dir):
            print(f"⚠️ Folder {wav_dir} not found, skipping category {category}")
            continue

        wav_files = [f for f in os.listdir(wav_dir) if f.endswith('.wav')]
        print(f"\nConverting {category}: {len(wav_files)} files")

        for wav_file in tqdm(wav_files, desc=f"Converting {category}", ncols=80):
            wav_path = os.path.join(wav_dir, wav_file)
            # Swap only the extension, even if '.wav' appears elsewhere in the name
            mp3_file = os.path.splitext(wav_file)[0] + '.mp3'
            mp3_path = os.path.join(mp3_dir, mp3_file)

            # Skip if the file already exists and overwrite=False
            if not overwrite and os.path.exists(mp3_path):
                continue

            convert_to_mp3(wav_path, mp3_path)

    print("\n✅ Conversion finished.")
    print(f"MP3 files saved to: {os.path.abspath(mp3_output_path)}")


In [15]:
# Run the conversion on the augmented output
convert_augmented_to_mp3(
    output_path="/workspaces/PSD/PSD/Data/voice_augmented",
    mp3_output_path="/workspaces/PSD/PSD/Data/voice_augmented_mp3",
    overwrite=False  # set to True to overwrite existing files
)


CONVERTING AUGMENTED DATASET TO MP3

Converting buka: 100 files


Converting buka: 100%|████████████████████████| 100/100 [00:08<00:00, 11.38it/s]



Converting tutup: 100 files


Converting tutup: 100%|███████████████████████| 100/100 [00:08<00:00, 11.32it/s]


✅ Conversion finished.
MP3 files saved to: /workspaces/PSD/PSD/Data/voice_augmented_mp3





## Summary

This notebook has:
1. ✅ Used **all files** from each category in the `data_voice` folder
   - Buka: 10 files (Buka_1.m4a - Buka_10.m4a)
   - Tutup: 10 files (Tutup_1.m4a - Tutup_10.m4a)
2. ✅ Produced **exactly 100 files** per category through augmentation
3. ✅ Used consistent naming: **buka1, buka2, ..., buka100** and **tutup1, tutup2, ..., tutup100**
4. ✅ Applied 6 augmentation techniques (pitch shift, time stretch, noise, speed change, volume change, random combinations)
5. ✅ Randomly sampled the output from the augmentations of all source files for more varied results
6. ✅ Built a total dataset of **200 files** (100 buka + 100 tutup)

The augmented dataset is ready to use for training a voice recognition model!

**Output Location:** `voice_augmented/`
- `voice_augmented/buka/` : 100 files (buka1.wav - buka100.wav)
- `voice_augmented/tutup/` : 100 files (tutup1.wav - tutup100.wav)