In [1]:
from IPython.core.display import HTML
import urllib2
HTML(urllib2.urlopen('https://gist.githubusercontent.com/mattlewissf/83989910849fdb4a04a72d431e84053f/raw/cefa015a9065665faccd0219774c7087be7d21a8/skeleton.css').read())

#### MIMIC Deep Dive - Results, Analysis, and Future Work
**[Intro](#intro)**   
**[30 Day Readmission](#30_day_readmission)**  
**[The MIMIC Dataset](#mimic_dataset)**  
**[Setting up the database](#setting_up_db)** 

As we've seen before, GBC gives us the highest AUC: 

- **GradientBoostingClassifier - 0.8127**
- AdaBoostClassifier - 0.8037
- LogisticRegression -  0.7769
- LogisticRegressionCV - 0.7718
- DecisionTreeClassifier - 0.7452
- RandomForestClassifier -  0.6706
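
As a rough sketch of how a number like the AUCs above gets computed: fit a classifier, take the predicted probability of the positive class on held-out data, and pass it to `roc_auc_score`. The data here is synthetic (`make_classification`), a stand-in rather than the MIMIC readmission features, and the split sizes and `random_state` values are assumptions for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; NOT the MIMIC readmission features.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# AUC is computed from predicted probabilities of the positive class,
# not from hard 0/1 predictions.
scores = clf.predict_proba(X_test)[:, 1]
auc_value = roc_auc_score(y_test, scores)

# roc_curve returns the points of the curve whose area is the AUC.
fpr, tpr, thresholds = roc_curve(y_test, scores)
print("AUC: %.4f" % auc_value)
```

The `fpr` and `tpr` arrays are what you would hand to a plotting library to draw the ROC curve for each model.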










In [None]:
# put in ROC curves (super small, maybe stitched together, for all of them)

#### GradientBoostingClassifier

Sk-learn's GradientBoostingClassifier (GBC) is (unsurprisingly) an example of a gradient boosted model (GBM). GBMs, especially the XGBoost implementation, are popular ensemble models for classification (and other) tasks. **Boosting** refers to sequentially combining **weak learners** into a single strong ensemble classifier. In short: GBC starts with a rough prediction and then builds a series of decision trees, with each tree in the series trying to correct the prediction error of the trees before it.

<br></br>

To quote a [higher power](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3885826/):  
> The common ensemble techniques like random forests rely on simple averaging of models in the ensemble. The family of boosting methods is based on a different.... add[ing] new models to the ensemble sequentially. At each particular iteration, a new weak, base-learner model is trained with respect to the error of the whole ensemble learnt so far....In gradient boosting machines, or simply, GBMs, the learning procedure consecutively fits new models to provide a more accurate estimate of the response variable.
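
To make the "each tree corrects the previous error" idea concrete, here is a stripped-down sketch of boosting written by hand. It is a simplification, not what GBC does internally: it uses regression with squared error (so the thing each tree fits is just the residual), whereas GBC optimizes a classification loss. The toy sine data, the learning rate, and the tree depth are all assumptions picked for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy regression data (an assumption for illustration, not MIMIC data).
rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1
n_estimators = 100

# Step 1: start with a rough prediction (the mean of the target).
prediction = np.full(y.shape, y.mean())
initial_mse = np.mean((y - prediction) ** 2)

trees = []
for _ in range(n_estimators):
    # Step 2: the residual is what the ensemble still gets wrong.
    residual = y - prediction
    # Step 3: fit a shallow tree (a weak learner) to that residual.
    tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, residual)
    # Step 4: add a damped version of the tree's output to the ensemble.
    prediction += learning_rate * tree.predict(X)
    trees.append(tree)

final_mse = np.mean((y - prediction) ** 2)
print("training MSE: %.4f -> %.4f" % (initial_mse, final_mse))
```

The `learning_rate` and `n_estimators` names mirror the GBC hyperparameters of the same name: a smaller learning rate means each tree contributes less, so more trees are usually needed.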


Here are some additional resources on GBMs: 
- Amazing interactive graphics: http://arogozhnikov.github.io/2016/06/24/gradient_boosting_explained.html
- A paper with a gentle introduction: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3885826/


Our highest AUC comes from using GBC - but can we get more out of the classifier itself? Most of our work so far has gone into extracting features from the data to give ourselves the best chance of good predictive ability. **Model parameters** are a different thing: they are learned during training, and learning them is what happens when we fit the model to our data. 

**Hyperparameters** are properties of the model itself that can't be learned directly from the data, and so need to be set before training. These can be things like the maximum number of leaves in a tree model. For GBC, sk-learn gives really easy access to the tunable hyperparameters. Here are the default settings for the sk-learn version used in this notebook: 

``` 
GradientBoostingClassifier(loss='deviance', learning_rate=0.1, n_estimators=100, subsample=1.0, criterion='friedman_mse', min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_depth=3, min_impurity_split=1e-07, init=None, random_state=None, max_features=None, verbose=0, max_leaf_nodes=None, warm_start=False, presort='auto')
```
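
The split between hyperparameters and learned parameters shows up directly in the estimator's API. As a small sketch: `get_params()` lists the hyperparameters before the model has seen any data, while attributes with a trailing underscore (such as `estimators_`) only exist after `fit`. The data below is synthetic and stands in for the real features; the exact set of hyperparameters printed depends on the installed sk-learn version and may not match the list above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

gbc = GradientBoostingClassifier(random_state=0)

# Hyperparameters: set before training, available without any data.
hyperparams = gbc.get_params()
for name in sorted(hyperparams):
    print("%s = %r" % (name, hyperparams[name]))

# Synthetic stand-in data; NOT the MIMIC features.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
gbc.fit(X, y)

# Learned state: exists only after fit (trailing underscore by convention).
print(gbc.estimators_.shape)           # array of fitted trees, one row per stage
print(gbc.feature_importances_.shape)  # one importance value per feature
```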

After reading a decent amount about tuning GBC classifiers with sk-learn, I decided to experiment a bit with some of the more commonly tuned hyperparameters to see if I could squeeze some additional AUC out of our GBC. I focused on varying `max_depth`, the number of estimators (`n_estimators`), and the learning rate (`learning_rate`), plus one run that sets `max_features`. 


In [None]:
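from sklearn.ensemble import GradientBoostingClassifier

# Candidate GBC configurations, keyed by a short descriptive label.
GBC_attempts = {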
                'standard': GradientBoostingClassifier(),
                'depth_5': GradientBoostingClassifier(max_depth=5), 
                'depth_2': GradientBoostingClassifier(max_depth=2), 
                'depth_1': GradientBoostingClassifier(max_depth=1), 
                'max_features_auto': GradientBoostingClassifier(max_features="auto"),
                'n_est_200': GradientBoostingClassifier(n_estimators=200), 
                'n_est_40': GradientBoostingClassifier(n_estimators=40),
                'n_est_40_l_rate_05': GradientBoostingClassifier(n_estimators=40, learning_rate=.05) }
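
A dictionary like `GBC_attempts` can be scored with a simple loop: fit each candidate on the same training split and compare AUC on the same held-out split. The sketch below is an assumption about how such a comparison could be run, not the exact code behind the numbers in this notebook. It uses synthetic data in place of the MIMIC features, only a few of the candidates to keep it quick, and a fixed `random_state` so the runs are repeatable.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; NOT the MIMIC readmission features.
X, y = make_classification(n_samples=600, n_features=15, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# A small subset of candidates, in the same label -> estimator shape.
candidates = {
    'standard': GradientBoostingClassifier(random_state=0),
    'depth_5': GradientBoostingClassifier(max_depth=5, random_state=0),
    'n_est_40_l_rate_05': GradientBoostingClassifier(
        n_estimators=40, learning_rate=.05, random_state=0),
}

results = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    # Score with positive-class probabilities on the held-out split.
    scores = model.predict_proba(X_test)[:, 1]
    results[name] = roc_auc_score(y_test, scores)

# Print from highest to lowest AUC.
for name in sorted(results, key=results.get, reverse=True):
    print("%-20s %.4f" % (name, results[name]))
```

Evaluating every candidate on one shared split keeps the comparison fair; with differences as small as a point or two of AUC, cross-validation would give a steadier estimate than a single split.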


So what happened when I tried to turn all of the knobs? Here are the results, pitted against the out-of-the-box setup: 

Some stuff