## Comparing models

In this notebook I'm going to show how to compare the scores of different models. Models can perform better or worse on a given problem because each is more or less suited to certain types of problems. So I often run code similar to the code below to see which kind of model works best and which direction to continue in.

In [24]:
from sklearn import datasets

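# load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this cell needs an older scikit-learn release to run as written.
X, y = datasets.load_boston(return_X_y=True)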

We've loaded the dataset above. Now I'm going to define several models that seem suitable and see how they compare.

In [25]:
from sklearn.linear_model import LinearRegression, Lasso
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor

lr = LinearRegression()
lasso = Lasso(alpha=0.01)
rf = RandomForestRegressor(min_samples_leaf=5, min_samples_split=10)
gbm = GradientBoostingRegressor()

I'm also interested in the performance of a stacked model, that is, a model that takes the predictions of several base models and feeds them to a second-level model.
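To make the stacking idea concrete, here is a minimal hand-rolled sketch of the general technique using only scikit-learn. It is not mlxtend's implementation: the synthetic data from `make_regression`, the `Ridge` second-level model, and the names `X_demo`, `y_demo`, `meta_features` and `stacked_predict` are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import KFold, cross_val_predict

# Synthetic stand-in data (an assumption, not the dataset used in this notebook).
X_demo, y_demo = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)

base_models = [
    LinearRegression(),
    RandomForestRegressor(n_estimators=50, random_state=0),
]
folds = KFold(n_splits=5, shuffle=True, random_state=0)

# Level 1: out-of-fold predictions, one column per base model.
# Each row is predicted by a model that never saw that row during fitting.
meta_features = np.column_stack(
    [cross_val_predict(m, X_demo, y_demo, cv=folds) for m in base_models]
)

# Level 2: the second-level model learns how to combine the base predictions.
meta_model = Ridge(alpha=1.0).fit(meta_features, y_demo)

# For new data, refit the base models on everything, then chain the two levels.
fitted_base = [m.fit(X_demo, y_demo) for m in base_models]


def stacked_predict(X_new):
    """Predict with the base models, then combine with the second-level model."""
    level1 = np.column_stack([m.predict(X_new) for m in fitted_base])
    return meta_model.predict(level1)


print(stacked_predict(X_demo[:5]))
```

Training the second-level model on out-of-fold predictions matters: if it were trained on in-sample predictions, it would tend to over-trust whichever base model overfits the most.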

In [31]:
from mlxtend.regressor import StackingCVRegressor

stacked = StackingCVRegressor(
    regressors=(lr, lasso, rf, gbm),
    meta_regressor=Lasso(alpha=15),
    cv=10
)

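Now for the comparison. We define a list of the models we're interested in and print the cross-validated score for each. For regressors, `cross_val_score` uses the estimator's default scorer, which is R²: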

In [33]:
from sklearn.model_selection import cross_val_score, ShuffleSplit

models = [
    ('Lasso', lasso),
    ('LinearRegression', lr),
    ('RandomForestRegressor', rf),
    ('GradientBoostingRegressor', gbm),
    ('Stacked', stacked),
]

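for label, model in models:
    # 10 random train/test splits, each holding out 15% of the data for testing
    scores = cross_val_score(model, X, y, cv=ShuffleSplit(n_splits=10, test_size=0.15))
    # Report the mean and standard deviation of the scores across the splits
    print("Model %-26s Score: %0.2f (+/- %0.2f)" % (label, scores.mean(), scores.std()))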

Model Lasso                      Score: 0.73 (+/- 0.07)
Model LinearRegression           Score: 0.72 (+/- 0.03)
Model RandomForestRegressor      Score: 0.85 (+/- 0.03)
Model GradientBoostingRegressor  Score: 0.85 (+/- 0.05)
Model Stacked                    Score: 0.86 (+/- 0.03)


As you can see, random forest and gradient boosting are the best single models, both at about 0.85. The simple linear models score noticeably lower (0.72 to 0.73), so they seem less suited here. The stacked model scores 0.86, which is within one standard deviation of the best single models, so the gain so far is small, but stacking still looks like a direction worth exploring further for this dataset!
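One caveat about the comparison loop: it builds a new `ShuffleSplit` without a `random_state` for every model, so each model is scored on different random splits and the numbers change from run to run. A minimal sketch of a variant that scores every model on identical splits is below. The synthetic data, the seed value 42, and the names `shared_cv`, `demo_models` and `results` are assumptions for illustration only.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.model_selection import ShuffleSplit, cross_val_score

# Synthetic stand-in data (an assumption, not the dataset used in this notebook).
X_demo, y_demo = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)

# One splitter with a fixed integer seed: every call to split() yields the same
# train/test indices, so all models are evaluated on exactly the same data.
shared_cv = ShuffleSplit(n_splits=10, test_size=0.15, random_state=42)

demo_models = [
    ('LinearRegression', LinearRegression()),
    ('Lasso', Lasso(alpha=0.01)),
]

results = {}
for label, model in demo_models:
    # scoring='r2' spells out the default scorer for regressors
    scores = cross_val_score(model, X_demo, y_demo, cv=shared_cv, scoring='r2')
    results[label] = scores
    print("Model %-26s Score: %0.2f (+/- %0.2f)" % (label, scores.mean(), scores.std()))
```

With shared splits, differences between models reflect the models themselves rather than luck in the split, and the per-split scores can be compared pairwise.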