In [1]:
from IPython.core.display import HTML
import urllib2
HTML(urllib2.urlopen('https://gist.githubusercontent.com/mattlewissf/83989910849fdb4a04a72d431e84053f/raw/cefa015a9065665faccd0219774c7087be7d21a8/skeleton.css').read())

#### MIMIC Deep Dive - Fitting the Model and Plotting the ROC Curve
**[Intro](#intro)**   
**[30 Day Readmission](#30_day_readmission)**  
**[The MIMIC Dataset](#mimic_dataset)**  
**[Setting up the database](#setting_up_db)** 

Now that we have our cross-validation set up, it is time for us to start actually fitting different classifiers to our data. But before we do that: what do we mean by 'fitting' a model? 

In terms of what we are doing code-wise: not much. Here's an example of us fitting a classifier to our data, again using our k-fold cross-validation.

In [None]:
for train, test in kf.split(X):
    # kf.split yields positional indices, so index with .iloc
    clf.fit(X.iloc[train], y.iloc[train])
    prob = clf.predict_proba(X.iloc[test])

Conceptually, we're applying a given algorithm to our data and getting back a representation of the relationship between our features (or predictors) and outcomes, so that we can predict future outcomes as accurately as possible. In our case, these algorithms are different classifier implementations that we're importing from scikit-learn.
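To make 'fitting' concrete, here is a minimal sketch on synthetic data. The data set from `make_classification`, the choice of `LogisticRegression`, and the `*_demo` names are assumptions for illustration only; they are not the MIMIC features or the exact setup used in this series.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the readmission data (an assumption, not MIMIC).
X_demo, y_demo = make_classification(n_samples=200, n_features=5, random_state=0)

clf_demo = LogisticRegression(max_iter=1000)
clf_demo.fit(X_demo, y_demo)  # learn the feature -> outcome relationship

# predict_proba returns one column per class; column 1 is P(outcome == 1).
prob_demo = clf_demo.predict_proba(X_demo[:3])
print(prob_demo.shape)
```

Each row of `prob_demo` sums to 1, and it is the second column that gets passed on to the ROC calculation later.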

Since we're using the k-fold technique, we're actually re-fitting the model to new data each time we iterate through the different slices of our larger data set, and we're likely to get a different 'fit' each time. Since what we want to measure at this point is the AUC of a given classifier on our data, we'll average the ROC curves from the individual folds and take the AUC of that mean curve as our mean AUC.
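To see how much the fit can vary from fold to fold, the sketch below computes one AUC per fold and then averages those values. This is a simpler variant of the idea, and it is not identical to taking the AUC of an averaged ROC curve. The synthetic data, the 5 folds, and the `*_demo` names are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import KFold

# Synthetic stand-in data (an assumption, not the MIMIC features).
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=0)
kf_demo = KFold(n_splits=5, shuffle=True, random_state=0)

fold_aucs = []
for train_idx, test_idx in kf_demo.split(X_demo):
    model = LogisticRegression(max_iter=1000)
    model.fit(X_demo[train_idx], y_demo[train_idx])          # re-fit on this fold's training slice
    prob = model.predict_proba(X_demo[test_idx])[:, 1]       # P(class 1) on the held-out slice
    fold_aucs.append(roc_auc_score(y_demo[test_idx], prob))  # AUC for this fold only

mean_fold_auc = float(np.mean(fold_aucs))
print(fold_aucs, mean_fold_auc)
```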

<br></br>

Here's some code that calculates the mean AUC for a given classifier and dataset, leaning heavily on scikit-learn functions like roc_curve() and auc(), and on the classifier's predict_proba() method.

In [None]:
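import numpy as np
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import KFold

def get_mean_auc(df, clf):
    # X (features) and y (labels) are assumed to have been derived from df already
    kf = KFold(n_splits=10)
    mean_tpr = 0.0
    mean_fpr = np.linspace(0, 1, 100)  # shared FPR grid for averaging
    plots = []  # per-fold [fpr, tpr] pairs, for plotting

    for train, test in kf.split(X):
        # kf.split yields positional indices, so index with .iloc
        clf.fit(X.iloc[train], y.iloc[train])
        prob = clf.predict_proba(X.iloc[test])
        # roc_curve returns (fpr, tpr, thresholds)
        fpr, tpr, _ = roc_curve(y.iloc[test], prob[:, 1])
        # interpolate this fold's TPR onto the shared FPR grid
        mean_tpr += np.interp(mean_fpr, fpr, tpr)
        mean_tpr[0] = 0.0
        roc_auc = auc(fpr, tpr)  # AUC for this fold

        plots.append([fpr, tpr])  # for plotting

    mean_tpr /= kf.get_n_splits(X, y)
    mean_tpr[-1] = 1.0
    mean_auc = auc(mean_fpr, mean_tpr)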
    #... 
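The interpolation step is the least obvious part of the function above: each fold's ROC curve has its own set of FPR points, so the TPR values have to be resampled onto a shared grid before they can be averaged. Here is a small sketch of that step using two made-up curves. The curve values and the 5-point grid are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
import numpy as np
from sklearn.metrics import auc

grid = np.linspace(0, 1, 5)  # shared FPR grid: 0, 0.25, 0.5, 0.75, 1

# Two hypothetical per-fold ROC curves defined at different FPR points.
fpr_a, tpr_a = np.array([0.0, 0.5, 1.0]), np.array([0.0, 0.8, 1.0])
fpr_b, tpr_b = np.array([0.0, 0.25, 1.0]), np.array([0.0, 0.6, 1.0])

# Linearly interpolate each curve's TPR onto the shared grid.
tpr_a_grid = np.interp(grid, fpr_a, tpr_a)
tpr_b_grid = np.interp(grid, fpr_b, tpr_b)

# Average point by point, then take the area under the mean curve.
mean_tpr_demo = (tpr_a_grid + tpr_b_grid) / 2
mean_auc_demo = auc(grid, mean_tpr_demo)
print(mean_tpr_demo, mean_auc_demo)
```

A finer grid (the function above uses 100 points) gives a smoother mean curve and a better approximation of its area.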

We're also going to want to be able to plot the ROC curve for each fold. We'll be using [matplotlib.pyplot](https://matplotlib.org/api/pyplot_summary.html) for this:

In [None]:
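import matplotlib.pyplot as plt

for fpr, tpr in plots:
    plt.plot(fpr, tpr, color='b')  # one ROC curve per fold
# draw the mean curve and show the figure once, after all folds are plotted
plt.plot(mean_fpr, mean_tpr, color='r')
plt.xlabel('{0}  -- mean auc = {1}'.format(clf, mean_auc))
plt.show()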

And just like that, we're in a position to see what our ROC curves look like for different classifiers, and we can start to determine which classifier gives us the best predictive ability for our data.

<br></br>

Without further ado, let's take a look at the calculated AUC for all of our classifiers, based on the maximum number of features that we've extracted:

- RandomForestClassifier -  0.6706
- AdaBoostClassifier - 0.8037
- **GradientBoostingClassifier - 0.8127**
- DecisionTreeClassifier - 0.7452
- LogisticRegression -  0.7769
- LogisticRegressionCV - 0.7718


GradientBoostingClassifier (GBC) comes out on top here, with AdaBoostClassifier close behind (0.8127 vs. 0.8037). Next, we'll look into hyperparameter tuning, and do some analysis of our results.
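A comparison like the one above can be sketched as a loop over candidate classifiers. This version uses scikit-learn's `cross_val_score` with `scoring="roc_auc"`, which averages per-fold AUCs rather than taking the AUC of a mean ROC curve, so it is an alternative to the approach in this post and not a copy of it. The synthetic data, the three candidates, and their settings are assumptions; the scores it prints will not match the MIMIC numbers listed above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (an assumption, not the MIMIC features).
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=0)

candidates = {
    "RandomForestClassifier": RandomForestClassifier(n_estimators=50, random_state=0),
    "GradientBoostingClassifier": GradientBoostingClassifier(random_state=0),
    "LogisticRegression": LogisticRegression(max_iter=1000),
}

mean_aucs = {}
for name, model in candidates.items():
    # One AUC per fold; the mean summarises the classifier on this data.
    scores = cross_val_score(model, X_demo, y_demo, cv=5, scoring="roc_auc")
    mean_aucs[name] = scores.mean()

best_name = max(mean_aucs, key=mean_aucs.get)
for name, score in sorted(mean_aucs.items(), key=lambda kv: -kv[1]):
    print("{0} - {1:.4f}".format(name, score))
```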

#####  **Next  |   [Results and Analysis]()**