### AdaBoost

This activity focuses on using the `AdaBoostClassifier` and on how performance changes with the base classifier that is used.  As discussed in the lectures, adaptive boosting fits a set number of estimators in sequence, reweighting the training data after each one so that later estimators concentrate on the samples that earlier ones misclassified.  These weighted estimators form the ensemble, and its predictions are a weighted combination of the individual estimators' predictions.

- [Problem 1](#-Problem-1)
- [Problem 2](#-Problem-2)
- [Problem 3](#-Problem-3)
- [Problem 4](#-Problem-4)
- [Problem 5](#-Problem-5)
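
The reweighting idea described above can be made concrete with a single boosting round. The following is a minimal sketch of the textbook binary AdaBoost update with labels in {-1, +1}; the toy labels and predictions are made up for illustration, and scikit-learn's internal update differs in its details, so treat this as an aid to intuition rather than the library's implementation.

```python
import math

# Toy labels in {-1, +1} and one weak learner's predictions (hypothetical values).
y_true = [1, 1, -1, -1, 1]
y_pred = [1, -1, -1, -1, 1]   # the second sample is misclassified

# Start from uniform sample weights.
weights = [1 / len(y_true)] * len(y_true)

# Weighted error of the weak learner: total weight of the misclassified samples.
err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)

# Estimator weight: larger when the weighted error is smaller.
alpha = 0.5 * math.log((1 - err) / err)

# Up-weight misclassified samples, down-weight correct ones, then renormalize.
weights = [w * math.exp(-alpha * t * p)
           for w, t, p in zip(weights, y_true, y_pred)]
total = sum(weights)
weights = [w / total for w in weights]

print(f"error = {err:.2f}, alpha = {alpha:.3f}")
print([round(w, 3) for w in weights])
```

After the update, the single misclassified sample carries half of the total weight, so the next weak learner is pushed to get it right.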

In [6]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.model_selection import GridSearchCV, train_test_split

In [9]:
df = pd.read_csv('codio_21_5_solution/data/fetal.zip', compression = 'zip') 
df.head()

| Unnamed: 0 | baseline value | accelerations | fetal_movement | uterine_contractions | light_decelerations | severe_decelerations | prolongued_decelerations | abnormal_short_term_variability | mean_value_of_short_term_variability | percentage_of_time_with_abnormal_long_term_variability | ... | histogram_min | histogram_max | histogram_number_of_peaks | histogram_number_of_zeroes | histogram_mode | histogram_mean | histogram_median | histogram_variance | histogram_tendency | fetal_health |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 120.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 73.0 | 0.5 | 43.0 | ... | 62.0 | 126.0 | 2.0 | 0.0 | 120.0 | 137.0 | 121.0 | 73.0 | 1.0 | 2.0 |
| 1 | 132.0 | 0.006 | 0.0 | 0.006 | 0.003 | 0.0 | 0.0 | 17.0 | 2.1 | 0.0 | ... | 68.0 | 198.0 | 6.0 | 1.0 | 141.0 | 136.0 | 140.0 | 12.0 | 0.0 | 1.0 |
| 2 | 133.0 | 0.003 | 0.0 | 0.008 | 0.003 | 0.0 | 0.0 | 16.0 | 2.1 | 0.0 | ... | 68.0 | 198.0 | 5.0 | 1.0 | 141.0 | 135.0 | 138.0 | 13.0 | 0.0 | 1.0 |
| 3 | 134.0 | 0.003 | 0.0 | 0.008 | 0.003 | 0.0 | 0.0 | 16.0 | 2.4 | 0.0 | ... | 53.0 | 170.0 | 11.0 | 0.0 | 137.0 | 134.0 | 137.0 | 13.0 | 1.0 | 1.0 |
| 4 | 132.0 | 0.007 | 0.0 | 0.008 | 0.0 | 0.0 | 0.0 | 16.0 | 2.4 | 0.0 | ... | 53.0 | 170.0 | 9.0 | 0.0 | 137.0 | 136.0 | 138.0 | 11.0 | 1.0 | 1.0 |


In [18]:
X = df.drop('fetal_health',axis = 1)
y = df['fetal_health']

In [19]:
X_train, X_test, y_train, y_test = train_test_split(X, y, 
                                                   random_state=42)

### Problem 1

#### `AdaBoostClassifier`

What is the default estimator in the `AdaBoostClassifier`?  Instantiate the default estimator with the correct hyperparameters and assign it to `ans1` below.

In [12]:
ans1 = DecisionTreeClassifier(max_depth = 1)
ans1

### Problem 2

#### Fitting the Ensemble

Below, use the `AdaBoostClassifier` to fit the training data.  Use all default settings and assign the accuracy of the model on the test data to `model_1_acc` below.

In [20]:
model_1 = AdaBoostClassifier().fit(X_train,y_train)
model_1

In [21]:
model_1_acc = model_1.score(X_test,y_test)
model_1_acc

0.9097744360902256
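
To see how accuracy develops as estimators are added to the ensemble, `staged_score` yields the score after each boosting round. The sketch below uses synthetic data and only 20 estimators so that it runs quickly; both choices are assumptions made for illustration, and the variable names are not part of the activity.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data, split the same way as in the activity.
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, random_state=42)

demo_model = AdaBoostClassifier(n_estimators=20, random_state=42).fit(X_tr, y_tr)

# staged_score yields the test accuracy after each boosting round.
staged_acc = list(demo_model.staged_score(X_te, y_te))
for i, acc in enumerate(staged_acc, start=1):
    print(f"{i:2d} estimators: accuracy = {acc:.3f}")
```

The same call on `model_1` with `X_test` and `y_test` would show whether the default number of estimators is more than this dataset needs.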

### Problem 3

#### Grid Searching the Ensemble

As the documentation states [here](https://scikit-learn.org/stable/modules/ensemble.html#usage), the main parameters to search are the number of estimators and the complexity of the base estimator.  Create a parameter grid that considers the following parameters:

- *number of estimators*: 100, 200
- *max_depths*: 1, 2, 3

Assign the grid to `params` below.  Use it with the `AdaBoostClassifier` in a `GridSearchCV` named `tree_grid`, fit on the training data.  Assign the score on the test data to `grid_acc`.  Be sure to set `random_state = 42` in your `AdaBoostClassifier`.

**NOTE:** This is computationally expensive. It may take up to two minutes for the answer check (`print(grid_acc)`) results to appear. It is also advised that you do NOT run the blank cell underneath (the grading cell) until you see the results from your answer.


In [26]:
params = {'n_estimators': [100, 200],
         'estimator__max_depth': [1, 2, 3]}

In [27]:
# The 'estimator__' prefix in params reaches the tree nested inside AdaBoost.
tree_grid = GridSearchCV(
    AdaBoostClassifier(estimator=DecisionTreeClassifier(), random_state=42),
    param_grid=params,
).fit(X_train, y_train)
tree_grid

In [28]:
grid_acc = tree_grid.score(X_test,y_test)
grid_acc

0.9548872180451128

### Problem 4

#### A Different Base Estimator

Consider using a different base estimator, such as `LogisticRegression`.  Explore the regularization parameter `C` with

- `C = [.001, 0.01, 0.1, 1.0, 10.0]`

Create a `Pipeline` that scales the data first and then applies an `AdaBoostClassifier` with `random_state = 42` and a `LogisticRegression` base estimator.  Grid search the pipeline over these values of `C` and assign the score on the test data to `score2`.

**Note:** Again, this one is computationally expensive. Be patient with the results, and do NOT run the blank cell underneath (the grading cell) until you see the results from your answer.


In [31]:
params = {'mod__estimator__C':[0.001,0.01,0.1,1.0,10.0]}

In [32]:
p = Pipeline([('scaler',StandardScaler()),
             ('mod',AdaBoostClassifier(estimator = LogisticRegression(),
                                      random_state = 42))])
p

In [33]:
g = GridSearchCV(p,param_grid = params)
g.fit(X_train,y_train)
score2 = g.score(X_test,y_test)
score2

0.8458646616541353

### Problem 5

#### Evaluating the models

Which model performed the best on the test data?

- `a`: Base `AdaBoostClassifier`
- `b`: Grid Searched Tree Model
- `c`: Grid Searched Logistic Model
- `d`: None of the above

Assign your answer as a string to `ans5` below.

In [34]:
ans5 = 'b'