# BayesSearchCV

1. What is hyperparameter tuning?
2. Explain the BayesSearchCV method.
3. How does BayesSearchCV work?
4. What advantages does BayesSearchCV have over other hyperparameter tuning methods?



1. The process of searching for the combination of hyperparameter values that gives the model its best performance.
2. A hyperparameter tuning method based on Bayesian optimization, which steers the search using the results of earlier trials.
3. It starts with random trials, fits a probabilistic model to their results, predicts the most promising region of the search space, evaluates a candidate there, and repeats until the iteration budget is used up, returning the best combination found.
4. It is adaptive and usually needs fewer trials than Grid Search or Random Search, which makes it well suited to models that are expensive to train.
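
The loop described in answer 3 can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not how skopt implements BayesSearchCV: it assumes a one-dimensional toy objective (`toy_score`, hypothetical), a small Gaussian-process surrogate with an RBF kernel, and an upper-confidence-bound rule for choosing the next candidate. Every name and constant here is chosen for illustration only.

```python
import math
import random

def rbf(a, b, length_scale=0.15):
    """RBF kernel: similarity between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(A, b):
    """Solve A x = b with Gaussian elimination and partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def make_posterior(xs, ys, noise=1e-4):
    """Fit the surrogate; return a function giving (mean, std) at a new point."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    mean_y = sum(ys) / len(ys)
    alpha = solve(K, [y - mean_y for y in ys])

    def predict(x_new):
        k = [rbf(a, x_new) for a in xs]
        mu = mean_y + sum(ki * ai for ki, ai in zip(k, alpha))
        v = solve(K, k)
        var = max(1.0 - sum(ki * vi for ki, vi in zip(k, v)), 0.0)
        return mu, math.sqrt(var)

    return predict

def upper_confidence_bound(x, predict, kappa=2.0):
    """Score a candidate: predicted mean plus a bonus for uncertainty."""
    mu, sigma = predict(x)
    return mu + kappa * sigma

def toy_score(x):
    """Hypothetical stand-in for a cross-validated score; its peak is at x = 0.7."""
    return -(x - 0.7) ** 2

rng = random.Random(0)
candidates = [i / 50 for i in range(51)]

# Step 1: start with a few random trials.
xs = [rng.uniform(0.0, 1.0) for _ in range(3)]
ys = [toy_score(x) for x in xs]

for _ in range(12):
    # Step 2: fit the probabilistic model to all results so far.
    predict = make_posterior(xs, ys)
    # Step 3: pick the candidate that looks most promising.
    x_next = max(candidates, key=lambda x: upper_confidence_bound(x, predict))
    # Step 4: evaluate it and add the result to the history.
    xs.append(x_next)
    ys.append(toy_score(x_next))

best_y, best_x = max(zip(ys, xs))
print(f"best x found: {best_x:.2f} (score {best_y:.4f}) after {len(xs)} trials")
```

The uncertainty bonus (`kappa`) is what makes the search adaptive: unexplored regions get tried early, and later trials concentrate around the best scores seen so far.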

In [1]:
# Install scikit-optimize, which provides BayesSearchCV
!pip install scikit-optimize  # run only if it is not installed yet

Collecting scikit-optimize
  Downloading scikit_optimize-0.10.2-py2.py3-none-any.whl.metadata (9.7 kB)
Collecting pyaml>=16.9 (from scikit-optimize)
  Downloading pyaml-25.7.0-py3-none-any.whl.metadata (12 kB)
Downloading scikit_optimize-0.10.2-py2.py3-none-any.whl (107 kB)
Downloading pyaml-25.7.0-py3-none-any.whl (26 kB)
Installing collected packages: pyaml, scikit-optimize
Successfully installed pyaml-25.7.0 scikit-optimize-0.10.2


In [4]:
# Load Dataset
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.75, random_state=0)

In [8]:
from sklearn.svm import SVC
from skopt.space import Real, Categorical, Integer
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

# Define the model
# Use a Pipeline to chain the scaler and the classifier
model = Pipeline([
    ('scaler', StandardScaler()),  # feature scaling step
    ('svc', SVC(random_state=0))  # SVC classifier
])

# Define the hyperparameter search space
# Parameter names follow the Pipeline convention: step_name__parameter_name
param_space = {
    'svc__C': Real(1e-6, 1e+6, prior='log-uniform'),
    'svc__gamma': Real(1e-6, 1e+6, prior='log-uniform'),
    'svc__kernel': Categorical(['rbf', 'poly', 'sigmoid']),
    'svc__degree': Integer(1, 5) # Only relevant for 'poly' kernel
}

print("Model defined:", model)
print("Hyperparameter space defined:", param_space)

Model defined: Pipeline(steps=[('scaler', StandardScaler()), ('svc', SVC(random_state=0))])
Hyperparameter space defined: {'svc__C': Real(low=1e-06, high=1000000.0, prior='log-uniform', transform='identity'), 'svc__gamma': Real(low=1e-06, high=1000000.0, prior='log-uniform', transform='identity'), 'svc__kernel': Categorical(categories=('rbf', 'poly', 'sigmoid'), prior=None), 'svc__degree': Integer(low=1, high=5, prior='uniform', transform='identity')}
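
`prior='log-uniform'` means values are drawn uniformly on a logarithmic scale, so every decade between 1e-6 and 1e+6 is equally likely. The standard-library sketch below mimics that behaviour by hand and contrasts it with a plain uniform draw over the same range. `sample_log_uniform` is a hypothetical helper written for this illustration, not part of skopt.

```python
import math
import random

def sample_log_uniform(low, high, rng):
    """Draw a value whose log10 is uniform between log10(low) and log10(high)."""
    exponent = rng.uniform(math.log10(low), math.log10(high))
    return 10.0 ** exponent

rng = random.Random(0)
n_draws = 10_000
log_samples = [sample_log_uniform(1e-6, 1e6, rng) for _ in range(n_draws)]
uniform_samples = [rng.uniform(1e-6, 1e6) for _ in range(n_draws)]

# Share of draws that land below 1.0, i.e. in the lower six decades.
share_log_below_1 = sum(s < 1.0 for s in log_samples) / n_draws
share_uniform_below_1 = sum(s < 1.0 for s in uniform_samples) / n_draws

print(f"log-uniform draws below 1.0: {share_log_below_1:.1%}")
print(f"uniform draws below 1.0:     {share_uniform_below_1:.1%}")
```

About half of the log-uniform draws fall below 1.0, whereas a plain uniform draw almost never does. That is why a log scale is the usual choice for parameters such as `C` and `gamma` that span many orders of magnitude.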


# Optimization with BayesSearchCV

In [9]:
from skopt import BayesSearchCV
from sklearn.model_selection import StratifiedKFold

# Initialize BayesSearchCV

bayes_search = BayesSearchCV(
    estimator=model,  # the Pipeline defined above
    search_spaces=param_space,  # keys use the Pipeline's step__parameter names
    n_iter=50,
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    n_jobs=-1,
    random_state=0,
    verbose=2
)

print("BayesSearchCV initialized:", bayes_search)

BayesSearchCV initialized: BayesSearchCV(cv=StratifiedKFold(n_splits=5, random_state=0, shuffle=True),
              estimator=Pipeline(steps=[('scaler', StandardScaler()),
                                        ('svc', SVC(random_state=0))]),
              n_jobs=-1, random_state=0,
              search_spaces={'svc__C': Real(low=1e-06, high=1000000.0, prior='log-uniform', transform='identity'),
                             'svc__degree': Integer(low=1, high=5, prior='uniform', transform='identity'),
                             'svc__gamma': Real(low=1e-06, high=1000000.0, prior='log-uniform', transform='identity'),
                             'svc__kernel': Categorical(categories=('rbf', 'poly', 'sigmoid'), prior=None)},
              verbose=2)


In [10]:
# Run the optimization
# This searches for the best hyperparameter combination
# based on cross-validated performance on the training data
bayes_search.fit(X_train, y_train)  # scaling now happens inside the pipeline

print("Optimization finished.")
print("Best result:")
print("Best score:", bayes_search.best_score_)
print("Best parameters:", bayes_search.best_params_)

Fitting 5 folds for each of 1 candidates, totalling 5 fits
Fitting 5 folds for each of 1 candidates, totalling 5 fits
Fitting 5 folds for each of 1 candidates, totalling 5 fits
Fitting 5 folds for each of 1 candidates, totalling 5 fits
Fitting 5 folds for each of 1 candidates, totalling 5 fits
Fitting 5 folds for each of 1 candidates, totalling 5 fits
Fitting 5 folds for each of 1 candidates, totalling 5 fits
Fitting 5 folds for each of 1 candidates, totalling 5 fits
Fitting 5 folds for each of 1 candidates, totalling 5 fits
Fitting 5 folds for each of 1 candidates, totalling 5 fits
Fitting 5 folds for each of 1 candidates, totalling 5 fits
Fitting 5 folds for each of 1 candidates, totalling 5 fits
Fitting 5 folds for each of 1 candidates, totalling 5 fits
Fitting 5 folds for each of 1 candidates, totalling 5 fits
Fitting 5 folds for each of 1 candidates, totalling 5 fits
Fitting 5 folds for each of 1 candidates, totalling 5 fits
Fitting 5 folds for each of 1 candidates, totalling 5 fits
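
Each log line above reports one candidate evaluated on 5 folds. As a rough budget check, the arithmetic below works out how many model fits `n_iter=50` implies and compares that with an exhaustive grid over the same space. The grid resolution (13 values each for `C` and `gamma`, one per decade) is a hypothetical choice for illustration, and the extra final fit assumes the default `refit=True`.

```python
import math

n_iter = 50              # from the BayesSearchCV call
candidates_per_iter = 1  # "each of 1 candidates" in the log
n_folds = 5              # StratifiedKFold(n_splits=5)

bayes_cv_fits = n_iter * candidates_per_iter * n_folds  # 250
bayes_total_fits = bayes_cv_fits + 1                    # plus one refit on all training data

# Hypothetical grid: one value per decade for C and gamma, every kernel and degree.
grid_sizes = {"svc__C": 13, "svc__gamma": 13, "svc__kernel": 3, "svc__degree": 5}
grid_candidates = math.prod(grid_sizes.values())        # 2535
grid_cv_fits = grid_candidates * n_folds                # 12675

print(f"Bayesian search: {bayes_cv_fits} cross-validation fits")
print(f"Grid search:     {grid_cv_fits} cross-validation fits")
print(f"Ratio:           {grid_cv_fits / bayes_cv_fits:.1f}x")
```

Under these assumptions the grid needs roughly 50 times as many fits, and it still only tests `C` and `gamma` at 13 fixed values each.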

In [12]:
# Evaluate the model
from sklearn.metrics import accuracy_score

y_pred = bayes_search.best_estimator_.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

print("Model evaluation finished.")
print(f"Accuracy on the test data: {accuracy:.4f}")

Model evaluation finished.
Accuracy on the test data: 0.9737
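
With `train_size=0.75` on the 150-sample iris data, the test set is small, so the reported accuracy is a coarse number. The short check below is plain arithmetic, assuming scikit-learn rounds the training size down (112 training rows, leaving 38 test rows). The count of 37 correct predictions is inferred from the printed value, not read from the notebook.

```python
import math

n_samples = 150                           # size of the iris dataset
n_train = math.floor(0.75 * n_samples)    # 112 (assumed rounding rule)
n_test = n_samples - n_train              # 38

n_correct = 37                            # inferred: the only count that rounds to 0.9737
accuracy = n_correct / n_test

print(f"{n_correct}/{n_test} correct = {accuracy:.4f}")
print(f"one step lower: {n_correct - 1}/{n_test} = {(n_correct - 1) / n_test:.4f}")
```

A single test sample shifts the score by about 2.6 percentage points, so small differences in test accuracy between tuning methods on this dataset should be read with caution.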
