<a href="https://colab.research.google.com/github/raven-gith/machinelearning1/blob/main/07.%20Chapter%2007/chapter_07_ensemble_learning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Chapter 7: Ensemble Learning and Random Forests

This notebook reproduces and explains the content of Chapter 7 of the book _Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow_ by Aurélien Géron.

Main topics:
- Voting Classifier
- Bagging and Random Forest
- Out-of-Bag evaluation
- Boosting (AdaBoost & Gradient Boosting)


In [1]:

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Dataset
X, y = make_moons(n_samples=500, noise=0.3, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

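# Base classifiers for the voting ensemble.
# SVC needs probability=True so it exposes predict_proba, which soft voting requires.
log_clf = LogisticRegression()
tree_clf = DecisionTreeClassifier()
svm_clf = SVC(probability=True)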

voting_clf = VotingClassifier(
    estimators=[('lr', log_clf), ('dt', tree_clf), ('svc', svm_clf)],
    voting='soft'
)

# Fit each classifier on the training set and report its test accuracy.
# The loop also fits voting_clf, so no separate fit call is needed.
for clf in (log_clf, tree_clf, svm_clf, voting_clf):
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    print(clf.__class__.__name__, "accuracy:", accuracy_score(y_test, y_pred))


LogisticRegression accuracy: 0.864
DecisionTreeClassifier accuracy: 0.872
SVC accuracy: 0.896
VotingClassifier accuracy: 0.928
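
To make the `voting='soft'` setting above concrete, the following is a minimal sketch that recomputes the soft vote by hand: it averages the class probabilities of the fitted sub-estimators and picks the class with the highest mean. The `random_state` arguments and the names `probas`, `mean_proba`, and `manual_pred` are assumptions added for this illustration and are not part of the original cell, so the accuracies it implies may differ from the output shown above.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Same data and split as the cell above.
X, y = make_moons(n_samples=500, noise=0.3, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# random_state values are an assumption added here for reproducibility.
voting_clf = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(random_state=42)),
        ("dt", DecisionTreeClassifier(random_state=42)),
        ("svc", SVC(probability=True, random_state=42)),
    ],
    voting="soft",
)
voting_clf.fit(X_train, y_train)

# Soft voting by hand: stack the class probabilities of the fitted
# sub-estimators, average them, then take the most probable class.
probas = np.array([est.predict_proba(X_test) for est in voting_clf.estimators_])
mean_proba = probas.mean(axis=0)
manual_pred = voting_clf.classes_[mean_proba.argmax(axis=1)]

print("Matches VotingClassifier.predict:",
      np.array_equal(manual_pred, voting_clf.predict(X_test)))
```

With `voting='hard'` the ensemble would instead count the predicted labels of each sub-estimator and return the majority class, so `predict_proba` would not be needed.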


In [2]:

from sklearn.ensemble import RandomForestClassifier

rf_clf = RandomForestClassifier(n_estimators=100, max_leaf_nodes=16, n_jobs=-1, random_state=42)
rf_clf.fit(X_train, y_train)

y_pred_rf = rf_clf.predict(X_test)
print("Random Forest accuracy:", accuracy_score(y_test, y_pred_rf))


Random Forest accuracy: 0.928
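
Out-of-bag evaluation, listed among the main topics, can be sketched with a `BaggingClassifier` of decision trees. Because each tree is trained on a bootstrap sample, the instances it never saw can serve as a built-in validation set. The hyperparameter values below (`n_estimators=200`, `max_samples=100`) are illustrative assumptions, not settings taken from this notebook.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Same data and split as the earlier cells.
X, y = make_moons(n_samples=500, noise=0.3, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Hyperparameter values are illustrative assumptions.
bag_clf = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=200,
    max_samples=100,   # each tree is trained on 100 sampled instances
    bootstrap=True,    # sample with replacement (bagging)
    oob_score=True,    # evaluate each instance with the trees that never saw it
    random_state=42,
)
bag_clf.fit(X_train, y_train)

print("OOB accuracy estimate:", bag_clf.oob_score_)
print("Test accuracy:", accuracy_score(y_test, bag_clf.predict(X_test)))
```

The OOB estimate is computed from the training set alone, so comparing it with the test accuracy gives a rough check of how well it anticipates generalization performance.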


In [6]:

from sklearn.ensemble import AdaBoostClassifier

ada_clf = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1),
    n_estimators=200,
    algorithm="SAMME",  # discrete AdaBoost; newer scikit-learn versions deprecate this parameter
    learning_rate=0.5,
    random_state=42
)

ada_clf.fit(X_train, y_train)
y_pred_ada = ada_clf.predict(X_test)
print("AdaBoost accuracy:", accuracy_score(y_test, y_pred_ada))


AdaBoost accuracy: 0.896


In [5]:

from sklearn.ensemble import GradientBoostingClassifier

gb_clf = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, max_depth=3, random_state=42)
gb_clf.fit(X_train, y_train)

y_pred_gb = gb_clf.predict(X_test)
print("Gradient Boosting accuracy:", accuracy_score(y_test, y_pred_gb))


Gradient Boosting accuracy: 0.888
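
The number of trees in the gradient boosting cell above is fixed at 100. One possible way to choose it from data is to track validation accuracy as trees are added, using `staged_predict`. The sketch below assumes an extra validation split carved out of the training set (`X_tr`, `X_val`), which the notebook itself does not use; `val_acc`, `best_n`, and `gb_best` are likewise names introduced only for this illustration.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Same data and split as the earlier cells.
X, y = make_moons(n_samples=500, noise=0.3, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Assumption of this sketch: hold out part of the training set for validation.
X_tr, X_val, y_tr, y_val = train_test_split(X_train, y_train, random_state=42)

gb_clf = GradientBoostingClassifier(
    n_estimators=100, learning_rate=0.1, max_depth=3, random_state=42
)
gb_clf.fit(X_tr, y_tr)

# staged_predict yields the ensemble's predictions after 1, 2, ..., 100 trees.
val_acc = [accuracy_score(y_val, y_pred) for y_pred in gb_clf.staged_predict(X_val)]
best_n = int(np.argmax(val_acc)) + 1  # stages are 1-indexed

# Retrain with the number of trees that scored best on the validation set.
gb_best = GradientBoostingClassifier(
    n_estimators=best_n, learning_rate=0.1, max_depth=3, random_state=42
)
gb_best.fit(X_tr, y_tr)

print("Best number of trees:", best_n)
print("Test accuracy:", accuracy_score(y_test, gb_best.predict(X_test)))
```

With only a few hundred training instances the validation curve is noisy, so the selected tree count should be read as a rough guide rather than a precise optimum.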
