What Is Stacking?

- Stacking is an ensemble method that combines the predictions of different machine learning models (the base models) to build a stronger predictive model (the meta model).

- The basic idea is to compensate for the weaknesses of the individual models and to improve generalization.

Stacking: General Process

- The training data is given to the base models, and each one produces its own predictions.

- These predictions are used as a new feature set for training the meta model.

- The meta model makes the final prediction.

- Example: combine the predictions of different models such as logistic regression, decision trees, and SVM, then train a linear regression model on top of them.
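
The steps above can be sketched by hand to make the data flow visible. This is a minimal illustration, not a description of how any library implements stacking internally. It assumes the meta-features are out-of-fold probabilities obtained with `cross_val_predict`, a detail the list does not spell out, and the names `X_tr`, `train_meta`, `test_meta`, and `manual_acc` are made up for this example.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

# Base models chosen only for illustration
base = [
    DecisionTreeClassifier(max_depth=5, random_state=42),
    SVC(probability=True, random_state=42),
]

# Step 1: out-of-fold probabilities on the training set, so the meta model
# never sees a prediction made on a row the base model was fitted on
train_meta = np.column_stack([
    cross_val_predict(m, X_tr, y_tr, cv=5, method="predict_proba")[:, 1]
    for m in base
])

# Step 2: refit each base model on the full training set, then build the
# same kind of features for the test set
test_meta = np.column_stack([
    m.fit(X_tr, y_tr).predict_proba(X_te)[:, 1] for m in base
])

# Step 3: the meta model learns from the base-model predictions
meta = LogisticRegression().fit(train_meta, y_tr)
manual_acc = meta.score(test_meta, y_te)
print(f"Manual stacking accuracy: {manual_acc:.3f}")
```

Using out-of-fold predictions in step 1 is a design choice: if the base models predicted on the same rows they were trained on, the meta model would learn from overly optimistic inputs.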

In [1]:
# import libraries
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split

In [2]:
# load data set: breast cancer
data = load_breast_cancer()
X = data.data
y = data.target

# train test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

In [3]:
# define base models: DT, SVM
base_models = [
    ("dt", DecisionTreeClassifier(max_depth=5, random_state=42)),
    ("svc", SVC(probability=True, kernel="rbf", C=1, random_state=42))
]

# Define Meta model: Logistic Regression
meta_model = LogisticRegression()

# Define Stacking Classifier
stacking_clf = StackingClassifier(
    estimators=base_models,           # base models: dt, svc
    final_estimator=meta_model,       # meta model: logistic regression
    cv=5,                             # 5-fold CV used to build the meta model's training features
    stack_method="predict_proba"      # feed predicted class probabilities to the meta model
)

In [4]:
# model training and testing
stacking_clf.fit(X_train, y_train)
y_pred = stacking_clf.predict(X_test)
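
To see what `stack_method="predict_proba"` actually hands to the meta model, the fitted classifier can be inspected. The sketch below rebuilds the same setup so that it runs on its own. The names `clf` and `meta_features` are illustrative, and the exact number of meta-feature columns may depend on the scikit-learn version, so no shape is asserted in the comments.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

clf = StackingClassifier(
    estimators=[
        ("dt", DecisionTreeClassifier(max_depth=5, random_state=42)),
        ("svc", SVC(probability=True, kernel="rbf", C=1, random_state=42)),
    ],
    final_estimator=LogisticRegression(),
    cv=5,
    stack_method="predict_proba",
)
clf.fit(X_train, y_train)

# transform() returns the base-model outputs that the meta model consumes
meta_features = clf.transform(X_test)
print(meta_features.shape)

# The fitted base models and the fitted meta model are exposed as attributes
print(list(clf.named_estimators_))
print(clf.final_estimator_.coef_)
```

The coefficients of the meta model give a rough sense of how much weight each base model's probability receives in the final decision.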

In [5]:
# evaluation: accuracy score, classification report
print(f"Stacking Classifier Acc: {accuracy_score(y_test, y_pred)}")
print(f"Classification report:\n{classification_report(y_test, y_pred)}")


Stacking Classifier Acc: 0.9649122807017544
Classification report:
              precision    recall  f1-score   support

           0       0.95      0.95      0.95        63
           1       0.97      0.97      0.97       108

    accuracy                           0.96       171
   macro avg       0.96      0.96      0.96       171
weighted avg       0.96      0.96      0.96       171
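
A single accuracy figure does not show whether stacking helped. One way to check, sketched here under the same split and hyperparameters, is to score each base model on its own next to the stacked model. The `scores` dictionary is a name introduced for this example, and differences on one 171-sample test split may well be within noise, so no particular ranking is claimed.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

base_models = [
    ("dt", DecisionTreeClassifier(max_depth=5, random_state=42)),
    ("svc", SVC(probability=True, kernel="rbf", C=1, random_state=42)),
]

# Candidates: each base model alone, plus the stacked ensemble
candidates = dict(base_models)
candidates["stacking"] = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(),
    cv=5,
    stack_method="predict_proba",
)

# Fit every candidate on the same training split and score on the same test split
scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {scores[name]:.4f}")
```

For a more reliable comparison, the same loop could be run with repeated cross-validation instead of a single train/test split.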

