### Voice Analysis

#### - Import the required libraries

Import librosa for audio processing, pandas for data handling, os for file navigation, and TSFEL (tsfel) for feature extraction. numpy and tqdm (progress bars) are imported as well.


In [3]:
import os
import librosa
import numpy as np
import pandas as pd
from tqdm import tqdm
import tsfel 

#### - Feature Extraction

In [4]:
# Path to the augmented dataset
augmented_path = "C:\\Dokumen\\PSD\\dataset\\voice_augmented"
categories = ["buka", "tutup"]  # class labels: "buka" (open) and "tutup" (close)
all_data_dfs = []  # collects one single-row DataFrame per audio file

# Build the TSFEL configuration.
# Called without arguments, this selects the default features from every
# domain (such as statistical, temporal, and spectral).
try:
    cfg = tsfel.get_features_by_domain()
except Exception as e:
    print(f"Error saat memuat konfigurasi TSFEL: {e}")
    print("Pastikan Anda telah menginstal TSFEL: pip install tsfel")
    # Leave cfg as None so the extraction step below is skipped cleanly
    cfg = None

print(f"Memulai ekstraksi fitur TSFEL dari {augmented_path}...")

def extract_tsfel_features(file_path, cfg):
    """Load an audio file and extract its TSFEL features."""
    try:
        # librosa loads the waveform efficiently and also returns the
        # sample rate (sr); sr=None keeps the file's native rate
        audio, sr = librosa.load(file_path, sr=None)

        # Extract the TSFEL features.
        # This returns a single-row DataFrame holding every feature
        # computed from 'audio'
        features_df = tsfel.time_series_features_extractor(cfg, audio, fs=sr)
        return features_df

    except Exception as e:
        print(f"Error memproses {file_path}: {e}")
        return None

# Only continue if the configuration was created successfully
if cfg is not None:
    # Process all 200 files
    for category in categories:
        category_path = os.path.join(augmented_path, category)
        files = [f for f in os.listdir(category_path) if f.endswith('.wav')]

        for file_name in tqdm(files, desc=f"Memproses {category}"):
            file_path = os.path.join(category_path, file_name)

            # Get the single-row DataFrame from TSFEL
            features_df = extract_tsfel_features(file_path, cfg)

            if features_df is not None:
                # Attach the class label to this row
                features_df['label'] = category

                # Add the row to the list
                all_data_dfs.append(features_df)

    print("\nEkstraksi fitur selesai.")

    # Concatenate the single-row DataFrames into one large DataFrame
    if all_data_dfs:
        data = pd.concat(all_data_dfs, ignore_index=True)

        print(data.head())

        # Save to a new CSV file
        output_filename = "voice_features_tsfel.csv"
        data.to_csv(output_filename, index=False)
        print(f"Berhasil! Data fitur ({len(data)} baris) telah disimpan ke {output_filename}")
    else:
        print("Tidak ada data yang berhasil diekstrak.")
else:
    print("Ekstraksi fitur dibatalkan karena konfigurasi TSFEL gagal dimuat.")

Memulai ekstraksi fitur TSFEL dari C:\Dokumen\PSD\dataset\voice_augmented...


Memproses buka: 100%|██████████| 100/100 [01:58<00:00,  1.19s/it]
Memproses tutup: 100%|██████████| 100/100 [01:45<00:00,  1.05s/it]


Ekstraksi fitur selesai.
   0_Absolute energy  0_Area under the curve  0_Autocorrelation  \
0        2602.918984                0.174058               19.0   
1        1550.156331                0.126149               24.0   
2        2348.326999                0.153402               18.0   
3        2169.795192                0.158505               19.0   
4        3579.066433                0.202078               22.0   

   0_Average power  0_Centroid  0_ECDF Percentile Count_0  \
0      1142.246928    1.552798                    21876.0   
1       876.867916    0.831794                    16971.0   
2      1223.100250    1.026031                    18432.0   
3      1082.023471    1.371756                    19251.0   
4      1242.740390    1.025401                    27648.0   

   0_ECDF Percentile Count_1  0_ECDF Percentile_0  0_ECDF Percentile_1  \
0                    87505.0            -0.019836             0.020325   
1                    67885.0            -0.034943       
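
The extraction loop collects one single-row DataFrame per file and concatenates them once at the end. A minimal sketch of that pattern follows. The column names and values are invented for illustration and are not real TSFEL output:

```python
import pandas as pd

# Hypothetical stand-ins for the one-row DataFrames returned per audio file.
# Column names and values are made up for this illustration.
rows = []
for label, energy, centroid in [("buka", 2.5, 1.4), ("buka", 1.6, 0.9), ("tutup", 3.1, 1.1)]:
    features_df = pd.DataFrame({"energy": [energy], "centroid": [centroid]})
    features_df["label"] = label  # attach the class label, as in the extraction loop
    rows.append(features_df)

# ignore_index=True renumbers the rows 0..n-1 instead of repeating index 0.
data_demo = pd.concat(rows, ignore_index=True)
print(data_demo)
```

Concatenating once after the loop is usually cheaper than growing a DataFrame row by row, because every concatenation copies the data.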




#### - Model Building (Random Forest)

Import libraries: pandas, joblib (for saving the model), and several sklearn modules (for splitting the data, scaling, training the model, and evaluation).

In [6]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
import joblib

Load the data: read voice_features_tsfel.csv and separate it into features (X) and labels (y).
Then split the dataset into 80% training data and 20% test data with train_test_split. Passing stratify=y keeps the proportion of the "buka" and "tutup" classes the same in both sets.

In [15]:
# 1. Load the feature table produced by the extraction step
data = pd.read_csv("voice_features_tsfel.csv")

# 2. Separate the features (X) from the labels (y)
X = data.drop('label', axis=1)
y = data['label']

# 3. Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)

print("Data berhasil dipisah menjadi training dan testing : ")
print(f" - Ukuran data training: {X_train.shape[0]} sampel")
print(f" - Ukuran data testing: {X_test.shape[0]} sampel")


Data berhasil dipisah menjadi training dan testing : 
 - Ukuran data training: 160 sampel
 - Ukuran data testing: 40 sampel
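
To illustrate what stratify=y does, here is a small sketch on synthetic data. The arrays X_demo and y_demo are assumptions that stand in for the real feature table (200 samples, 100 per class):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real feature table: 200 samples, 100 per class.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
y_demo = np.array(["buka"] * 100 + ["tutup"] * 100)

_, _, y_train_demo, y_test_demo = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# With stratify, each split keeps the 50/50 class ratio of the full dataset.
train_counts = dict(zip(*np.unique(y_train_demo, return_counts=True)))
test_counts = dict(zip(*np.unique(y_test_demo, return_counts=True)))
print("train:", train_counts)
print("test:", test_counts)
```

Without stratify, the class counts in a 40-sample test set can drift away from 20/20 by chance, which makes per-class metrics harder to compare.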


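Feature scaling: apply StandardScaler to the features (X) so that each feature has zero mean and unit variance. Tree-based models such as Random Forest are largely insensitive to feature scale, so this step is not strictly required here, but it keeps the pipeline reusable for scale-sensitive models. The scaler is fit on the training data only, so no statistics from the test set leak into training.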

In [17]:
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print("Scaling fitur selesai.")


Scaling fitur selesai.
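
A short sketch of what fitting on the training data only means in practice. The data is synthetic and the variable names ending in _demo are assumptions for this example:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic features on very different scales (illustrative values only).
rng = np.random.default_rng(0)
X_train_demo = rng.normal(loc=[0.0, 1000.0], scale=[1.0, 250.0], size=(160, 2))
X_test_demo = rng.normal(loc=[0.0, 1000.0], scale=[1.0, 250.0], size=(40, 2))

scaler_demo = StandardScaler()
X_train_demo_scaled = scaler_demo.fit_transform(X_train_demo)  # learns mean/std from train only
X_test_demo_scaled = scaler_demo.transform(X_test_demo)        # reuses the training statistics

# The transform is (x - mean_) / scale_, with both taken from the training set.
manual = (X_test_demo - scaler_demo.mean_) / scaler_demo.scale_
print(np.allclose(manual, X_test_demo_scaled))
```

The scaled test set will generally not have exactly zero mean and unit variance, which is expected: it is standardized with the training statistics, just as new audio would be at prediction time.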


Model training: fit a RandomForestClassifier (an ensemble of decision trees) on the scaled training data with .fit().

In [18]:
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train_scaled, y_train)

Evaluate the model on the test data, which it has not seen during training.

In [19]:
# Evaluate the model
from sklearn.metrics import classification_report, confusion_matrix

print("\nMemulai evaluasi model...")
y_pred = model.predict(X_test_scaled)
print("\nConfusion Matrix:")
print(confusion_matrix(y_test, y_pred))
print("\nClassification Report:")
print(classification_report(y_test, y_pred))



Memulai evaluasi model...

Confusion Matrix:
[[19  1]
 [ 0 20]]

Classification Report:
              precision    recall  f1-score   support

        buka       1.00      0.95      0.97        20
       tutup       0.95      1.00      0.98        20

    accuracy                           0.97        40
   macro avg       0.98      0.97      0.97        40
weighted avg       0.98      0.97      0.97        40
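
The figures in the classification report can be re-derived by hand from the confusion matrix, where rows are the true classes and columns are the predicted classes. A plain-Python sketch of that arithmetic:

```python
# Confusion matrix from the evaluation output:
# rows = true class, columns = predicted class, in the order ["buka", "tutup"].
labels = ["buka", "tutup"]
cm = [[19, 1],
      [0, 20]]

total = sum(sum(row) for row in cm)
accuracy = sum(cm[i][i] for i in range(len(labels))) / total

metrics = {}
for i, name in enumerate(labels):
    tp = cm[i][i]
    predicted_as_i = sum(cm[r][i] for r in range(len(labels)))  # column sum
    actually_i = sum(cm[i])                                     # row sum
    precision = tp / predicted_as_i
    recall = tp / actually_i
    f1 = 2 * precision * recall / (precision + recall)
    metrics[name] = (precision, recall, f1)
    print(f"{name}: precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")

print(f"accuracy={accuracy:.3f}")
```

The single off-diagonal entry means one "buka" sample was predicted as "tutup", which lowers the recall of "buka" to 19/20 and the precision of "tutup" to 20/21. The remaining 39 of 40 test samples are classified correctly.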



Save two important files to disk with joblib.dump:

- model_voice.joblib: the trained Random Forest model.

- scaler_voice.joblib: the scaler, which stores the statistics (mean/std) of the training data. It is needed to preprocess new data (for example, real-time audio input in Streamlit) in the same way before passing it to the model for prediction.

In [11]:
# 7. Save the model and the scaler
# These are the files that the Streamlit app will use
joblib.dump(model, "model_voice.joblib")
joblib.dump(scaler, "scaler_voice.joblib")

print("\nBerhasil! Model disimpan ke 'model_voice.joblib'")
print("Berhasil! Scaler disimpan ke 'scaler_voice.joblib'")


Berhasil! Model disimpan ke 'model_voice.joblib'
Berhasil! Scaler disimpan ke 'scaler_voice.joblib'
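
A minimal sketch of how the two saved artifacts could be reloaded and used together at prediction time. It trains a small model on synthetic data and writes to a temporary directory; the file names model_demo.joblib and scaler_demo.joblib are hypothetical, and the random features stand in for real TSFEL output:

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic two-class data standing in for the TSFEL feature table.
rng = np.random.default_rng(0)
X_demo = np.vstack([rng.normal(0.0, 1.0, (50, 4)), rng.normal(3.0, 1.0, (50, 4))])
y_demo = np.array(["buka"] * 50 + ["tutup"] * 50)

scaler_demo = StandardScaler().fit(X_demo)
model_demo = RandomForestClassifier(n_estimators=20, random_state=42)
model_demo.fit(scaler_demo.transform(X_demo), y_demo)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file names, written to a temporary directory for the demo.
    model_path = os.path.join(tmp_dir, "model_demo.joblib")
    scaler_path = os.path.join(tmp_dir, "scaler_demo.joblib")
    joblib.dump(model_demo, model_path)
    joblib.dump(scaler_demo, scaler_path)

    # At inference time: load both artifacts before predicting.
    loaded_model = joblib.load(model_path)
    loaded_scaler = joblib.load(scaler_path)

# Scale a new feature row with the loaded scaler first, then predict.
new_sample = rng.normal(3.0, 1.0, (1, 4))
prediction = loaded_model.predict(loaded_scaler.transform(new_sample))

# The reloaded pair reproduces the predictions of the original pair.
same = np.array_equal(
    loaded_model.predict(loaded_scaler.transform(X_demo)),
    model_demo.predict(scaler_demo.transform(X_demo)),
)
print(prediction[0], same)
```

In the real application, a new recording would first need to go through the same TSFEL extraction, with the same feature columns in the same order, before being passed to the scaler.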
