# Ensemble Learning

### Random Forest & Gradient Boosting
* Methods that combine several models (possibly built with different algorithms) into a single predictor are called ensembles.
* A Random Forest is an ensemble made up of multiple decision trees.


In [5]:
from sklearn.datasets import make_moons

In [6]:
x,y = make_moons(n_samples=100,noise=0.25,random_state=3)

In [7]:
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test = train_test_split(x,y,stratify=y)

In [9]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

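# Fit each base model on its own so it can be scored individually
log = LogisticRegression(solver='lbfgs').fit(x_train,y_train)
rnd = RandomForestClassifier(n_estimators=10).fit(x_train,y_train)
svm = SVC(gamma='auto').fit(x_train,y_train)
# Hard voting: predict the class chosen by the majority of the three models
voting = VotingClassifier([('lr',log),('rf',rnd),('svc',svm)],
                          voting='hard').fit(x_train,y_train)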

In [10]:
print(log.score(x_test,y_test))
print(rnd.score(x_test,y_test))
print(svm.score(x_test,y_test))
print(voting.score(x_test,y_test))

0.84
0.72
0.88
0.88
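To make the `voting='hard'` setting concrete, here is a minimal sketch of majority voting written in plain Python. The function name `hard_vote` and the three prediction lists are hypothetical and only illustrate the idea; this is not how scikit-learn implements it internally, and ties here are resolved by whichever label appears first, which may differ from the library's behavior.

```python
from collections import Counter

def hard_vote(predictions_per_model):
    """Return the majority label for each sample.

    predictions_per_model: list of equal-length label lists, one per model.
    """
    voted = []
    # zip(*...) groups the predictions of all models for the same sample
    for sample_preds in zip(*predictions_per_model):
        # most_common(1) returns [(label, count)] for the most frequent label
        label, _ = Counter(sample_preds).most_common(1)[0]
        voted.append(label)
    return voted

# Hypothetical predictions from three models on four samples
lr_pred = [0, 1, 1, 0]
rf_pred = [0, 0, 1, 1]
svc_pred = [1, 1, 1, 0]

majority = hard_vote([lr_pred, rf_pred, svc_pred])
print(majority)  # [0, 1, 1, 0]
```

Each model is wrong on a different sample in this toy input, yet the vote is only wrong where at least two models agree on the wrong label. That is why an ensemble can match or beat its best member.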


## Bagging and Bootstrap

In [11]:
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_blobs

In [12]:
from sklearn.model_selection import train_test_split
x,y = make_blobs(n_samples=400,centers=5,random_state=0,cluster_std=1)
x_train,x_test,y_train,y_test = train_test_split(x,y,stratify=y)

In [13]:
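# Single decision tree, used as the baseline
tree = DecisionTreeClassifier().fit(x_train,y_train)
# 100 trees, each trained on a bootstrap sample of 80% of the training rows
bag = BaggingClassifier(tree,
                        n_estimators=100,
                        max_samples=0.8,
                        n_jobs=-1,
                        random_state=1).fit(x_train,y_train)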

In [14]:
print(tree.score(x_test,y_test))
print(bag.score(x_test,y_test))

0.92
0.94
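The bootstrap step behind bagging can be sketched with NumPy alone. The snippet below draws one bootstrap sample of row indices, using `max_samples=0.8` as in the cell above and assuming 300 training rows (the default 75% split of the 400 generated samples). The variable names and the seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)   # seed chosen only for repeatability
n_samples = 300                  # assumed size of the training set
max_samples = 0.8                # same fraction as in the BaggingClassifier above

# Draw row indices WITH replacement: this is one bootstrap sample
n_draw = int(max_samples * n_samples)
idx = rng.integers(0, n_samples, size=n_draw)

# Because of replacement, some rows repeat and others are never drawn
unique_rows = np.unique(idx).size
print(f"drawn: {n_draw}, distinct rows: {unique_rows}")
```

Every estimator in the bag sees a different resampled view of the data, so the individual trees differ from one another and their averaged vote tends to have lower variance than a single tree.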


## Gradient Boosted Trees
* Gradient boosted trees are a machine learning algorithm that builds a strong model from weak learners added one after another, with each new tree trying to correct the errors of the trees before it.
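The idea of correcting earlier errors can be sketched by hand. The example below is a simplification, assuming a regression task with squared loss rather than the classification task used in this notebook: each round fits a depth-1 tree to the current residuals and adds a scaled-down version of its prediction. The synthetic data, `learning_rate`, and `n_rounds` are illustrative choices, not values from the notebook.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic 1-D regression data (assumed for illustration)
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(-3, 3, size=(200, 1)), axis=0)
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1
n_rounds = 50

# Start from a constant prediction (the mean minimizes squared error)
pred = np.full_like(y, y.mean())
mse_history = [np.mean((y - pred) ** 2)]
trees = []

for _ in range(n_rounds):
    # For squared loss, the negative gradient is simply the residual
    residual = y - pred
    stump = DecisionTreeRegressor(max_depth=1).fit(X, residual)
    # Take a small step in the direction that reduces the residual
    pred += learning_rate * stump.predict(X)
    trees.append(stump)
    mse_history.append(np.mean((y - pred) ** 2))

print(f"MSE before boosting: {mse_history[0]:.4f}")
print(f"MSE after {n_rounds} rounds: {mse_history[-1]:.4f}")
```

Each stump on its own is a weak predictor, but the training error shrinks as rounds accumulate. `GradientBoostingClassifier` follows the same pattern with a classification loss in place of squared error.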

In [16]:
from sklearn.ensemble import GradientBoostingClassifier
gbrt = GradientBoostingClassifier(random_state=0).fit(x_train,y_train)

In [18]:
print(f"Train Acc {gbrt.score(x_train,y_train)}")
print(f"Test Acc {gbrt.score(x_test,y_test)}")

Train Acc 1.0
Test Acc 0.93


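* The model is overfitting: training accuracy is 1.0 while test accuracy is 0.93. Let's reduce this by limiting tree depth with the max_depth parameter (the default is 3).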

In [21]:
from sklearn.ensemble import GradientBoostingClassifier
gbrt = GradientBoostingClassifier(max_depth=1,random_state=0).fit(x_train,y_train)

In [22]:
print(f"Train Acc {gbrt.score(x_train,y_train)}")
print(f"Test Acc {gbrt.score(x_test,y_test)}")

Train Acc 0.9833333333333333
Test Acc 0.95
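To see how tree depth affects the gap between training and test accuracy, a small sweep over `max_depth` can help. This is a sketch under one added assumption: `random_state=0` is passed to `train_test_split` so the split is repeatable, which the cells above do not do, so the printed numbers will not necessarily match the outputs shown earlier.

```python
from sklearn.datasets import make_blobs
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

x, y = make_blobs(n_samples=400, centers=5, random_state=0, cluster_std=1)
# random_state=0 is an assumption added here so the split is repeatable
x_train, x_test, y_train, y_test = train_test_split(
    x, y, stratify=y, random_state=0)

results = {}
for depth in (1, 2, 3):
    model = GradientBoostingClassifier(max_depth=depth, random_state=0)
    model.fit(x_train, y_train)
    # Store (train accuracy, test accuracy) for each depth
    results[depth] = (model.score(x_train, y_train),
                      model.score(x_test, y_test))
    print(f"max_depth={depth}: "
          f"train={results[depth][0]:.3f}, test={results[depth][1]:.3f}")
```

A shrinking gap between the two scores as depth decreases suggests less overfitting. Lowering `learning_rate` or `n_estimators` are other common ways to regularize the same model.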
