MRG Training Score in Gridsearch #1742

Open
wants to merge 1 commit into from

3 participants

@amueller
Owner

This PR adds training scores to the GridSearchCV output, as requested by @ogrisel.
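Roughly, the intended usage looks like this (a sketch against this branch; the `compute_training_score` parameter and the extended `cv_scores_` tuples are the ones defined in the diff below, not part of any released version):

```python
# Sketch of the API proposed in this PR: compute_training_score and the
# extended cv_scores_ namedtuples come from this branch, not a release.
from sklearn import datasets, svm
from sklearn.grid_search import GridSearchCV

iris = datasets.load_iris()
search = GridSearchCV(svm.SVC(), param_grid={'C': [0.1, 1.0, 10.0]},
                      compute_training_score=True)
search.fit(iris.data, iris.target)

for point in search.cv_scores_:
    # each entry now also carries the mean training score and mean timings
    print(point.parameters, point.mean_validation_score,
          point.mean_training_score, point.training_time,
          point.prediction_time)
```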

sklearn/grid_search.py
@@ -483,12 +511,12 @@ def _fit(self, X, y, parameter_iterator, **params):
self._set_methods()
# Store the computed scores
- CVScoreTuple = namedtuple('CVScoreTuple', ('parameters',
- 'mean_validation_score',
- 'cv_validation_scores'))
+ CVScoreTuple = namedtuple('CVScoreTuple',
+ ('parameters', 'mean_test_score',
+ 'mean_training_score', 'cv_test_scores'))
@ogrisel Owner
ogrisel added a note

Why did you rename *_validation_score to *_test_score? Validation sounds more correct in a CV setting, don't you think?

@amueller Owner
amueller added a note

First I thought training and test made a nicer pair. Then I thought validation would be better, but didn't change it back. Will do once my slides are done ;)

@ogrisel Owner
ogrisel added a note

Alright, as you wish. I don't have any strong opinion on this either.

@ogrisel
Owner

What about measuring the training_duration and testing_duration as well? That should be cheap, with essentially no overhead.

@amueller
Owner

It's on the todo. Is there a better way than using time.time?

@ogrisel
Owner

I think time.time is good enough for a start. I don't see any better way.
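For reference, the pattern is simply the following (placeholder data, split and estimator, just to show the time.time bookkeeping that fit_grid_point uses below):

```python
# Wall-clock timing with time.time around fit() and score(), as done in
# fit_grid_point in this PR. Dataset, split and estimator are placeholders.
from time import time

import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import SVC

iris = load_iris()
rng = np.random.RandomState(0)
idx = rng.permutation(len(iris.target))
X, y = iris.data[idx], iris.target[idx]
X_train, y_train, X_test, y_test = X[:100], y[:100], X[100:], y[100:]

clf = SVC()
start = time()
clf.fit(X_train, y_train)
training_time = time() - start

start = time()
test_score = clf.score(X_test, y_test)
prediction_time = time() - start
print(training_time, prediction_time, test_score)
```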

@amueller
Owner

Fixed doctests, rebased and squashed. Should be good to go.

examples/svm/plot_rbf_parameters.py
((15 lines not shown))
-pl.xticks(np.arange(len(gamma_range)), gamma_range, rotation=45)
-pl.yticks(np.arange(len(C_range)), C_range)
+# We extract validation and training scores, as well as training and prediction
+# times
+_, val_scores, _, train_scores, train_time, pred_time = zip(*score_dict)
+
+arrays = [val_scores, train_scores, train_time, pred_time]
+titles = ["Validation Score", "Training Score", "Training Time",
+ "Prediction Time"]
+
+# for each value draw heatmap as a function of gamma and C
+pl.figure(figsize=(8, 8))
+for i, (arr, title) in enumerate(zip(arrays, titles)):
+ pl.subplot(2, 2, i + 1)
+ arr = np.array(arr).reshape(len(C_range), len(gamma_range))
+ #pl.subplots_adjust(left=0.05, right=0.95, bottom=0.15, top=0.95)
@ogrisel Owner
ogrisel added a note

Is this a leftover from some experiment? It should be removed if it's not useful.

@amueller Owner

Whoops. Actually I still need to have a look at how it renders on the website.

@ogrisel ogrisel commented on the diff
examples/svm/plot_rbf_parameters.py
@@ -14,10 +14,19 @@
the decision surface smooth, while a high C aims at classifying
all training examples correctly.
-Two plots are generated. The first is a visualization of the
-decision function for a variety of parameter values, and the second
-is a heatmap of the classifier's cross-validation accuracy as
-a function of `C` and `gamma`.
+Two plots are generated. The first is a visualization of the decision function
+for a variety of parameter values, and the second is a heatmap of the
+classifier's cross-validation accuracy and training time as a function of `C`
+and `gamma`.
+
+An interesting observation on overfitting can be made when comparing validation
+and training error: higher C always result in lower training error, as it
+inceases complexity of the classifier.
@ogrisel Owner
ogrisel added a note

You should add a note here on which areas correspond to under- / overfitting.

examples/svm/plot_rbf_parameters.py
@@ -14,10 +14,19 @@
the decision surface smooth, while a high C aims at classifying
all training examples correctly.
-Two plots are generated. The first is a visualization of the
-decision function for a variety of parameter values, and the second
-is a heatmap of the classifier's cross-validation accuracy as
-a function of `C` and `gamma`.
+Two plots are generated. The first is a visualization of the decision function
+for a variety of parameter values, and the second is a heatmap of the
+classifier's cross-validation accuracy and training time as a function of `C`
+and `gamma`.
+
+An interesting observation on overfitting can be made when comparing validation
+and training error: higher C always result in lower training error, as it
+inceases complexity of the classifier.
+For the validation set on the other hand, there is a tradeoff of goodness of
+fit and generalization.
@ogrisel Owner
ogrisel added a note

Actually better here: this is basically just adding an alternative phrasing of what you already say, but I think it's better to repeat those concepts over and over again to teach them (and to increase googlability).

@amueller Owner

I added something, but I'm not entirely sure what you wanted. You can always edit later ;)

@ogrisel Owner
ogrisel added a note

Something like:

We can observe that the lower-right half of the parameters (below the diagonal, when both C and gamma are high) is characteristic of parameters that yield an overfitting model: the training score is very high but there is a wide gap.

The top and top-left parts of the parameter plots show underfitting models: the C and gamma values can individually or in conjunction constrain the model too much, leading to low training scores (hence low validation scores too, as validation scores are on average upper bounded by training scores).

@amueller Owner

Done and also made the plots look better.

@ogrisel
Owner

Please add some smoke tests for the new tuple items: for instance check that all of them are positive and that train_score is lower than 1.0.
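Something along these lines would do (just a sketch, using the field names and the compute_training_score flag from this diff):

```python
# Smoke test sketch for the new _CVScoreTuple fields added in this PR.
from sklearn.datasets import make_classification
from sklearn.grid_search import GridSearchCV
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=200, n_features=100, random_state=0)
search = GridSearchCV(LinearSVC(random_state=0), {'C': [0.1, 1.0]},
                      compute_training_score=True)
search.fit(X, y)
for point in search.cv_scores_:
    assert 0.0 < point.mean_training_score <= 1.0
    assert point.training_time > 0
    assert point.prediction_time > 0
```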

@ogrisel
Owner

Other than the above comments this looks good to me.

@amueller
Owner

Also added some tests.

@ogrisel ogrisel commented on the diff
examples/svm/plot_rbf_parameters.py
((7 lines not shown))
-a function of `C` and `gamma`.
+Two plots are generated. The first is a visualization of the decision function
+for a variety of parameter values, and the second is a heatmap of the
+classifier's cross-validation accuracy and training time as a function of `C`
+and `gamma`.
+
+An interesting observation on overfitting can be made when comparing validation
+and training error: higher C always result in lower training error, as it
+inceases complexity of the classifier.
+
+For the validation set on the other hand, there is a tradeoff of goodness of
+fit and generalization.
+
+We can observe that the lower right half of the parameters (below the diagonal
+with high C and gamma values) is characteristic of parameters that yields an
+overfitting model: the trainin score is very high but there is a wide gap. The
@ogrisel Owner
ogrisel added a note

typo: trainin (my fault)

@ogrisel Owner
ogrisel added a note

" wide gap ... with the validation score"

@ogrisel ogrisel commented on the diff
examples/svm/plot_rbf_parameters.py
((13 lines not shown))
+An interesting observation on overfitting can be made when comparing validation
+and training error: higher C always result in lower training error, as it
+inceases complexity of the classifier.
+
+For the validation set on the other hand, there is a tradeoff of goodness of
+fit and generalization.
+
+We can observe that the lower right half of the parameters (below the diagonal
+with high C and gamma values) is characteristic of parameters that yields an
+overfitting model: the trainin score is very high but there is a wide gap. The
+top and left parts of the parameter plots show underfitting models: the C and
+gamma values can individually or in conjunction constrain the model too much
+leading to low training scores (hence low validation scores too as validation
+scores are on average upper bounded by training scores).
+
+
@ogrisel Owner
ogrisel added a note

Please remove one of the blank lines. I'll let you choose which one :)

@ogrisel ogrisel commented on the diff
examples/svm/plot_rbf_parameters.py
((16 lines not shown))
+
+For the validation set on the other hand, there is a tradeoff of goodness of
+fit and generalization.
+
+We can observe that the lower right half of the parameters (below the diagonal
+with high C and gamma values) is characteristic of parameters that yields an
+overfitting model: the trainin score is very high but there is a wide gap. The
+top and left parts of the parameter plots show underfitting models: the C and
+gamma values can individually or in conjunction constrain the model too much
+leading to low training scores (hence low validation scores too as validation
+scores are on average upper bounded by training scores).
+
+
+We can also see that the training time is quite sensitive to the parameter
+setting, while the prediction time is not impacted very much. This is probably
+a consequence of the small size of the data set.
@ogrisel Owner
ogrisel added a note

We can also notice that the time plots look noisy for the same reason. A higher number of cross-validation iterations would be required to properly evaluate the impact of the parameters on the training and prediction times.
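For instance (a sketch with a placeholder dataset; compute_training_score as on this branch):

```python
# Sketch: use more cross-validation folds than the example's 3 so the timing
# heatmaps average over more fits and look less noisy.
import numpy as np
from sklearn.cross_validation import StratifiedKFold
from sklearn.datasets import load_iris
from sklearn.grid_search import GridSearchCV
from sklearn.svm import SVC

iris = load_iris()
X, Y = iris.data, iris.target
param_grid = dict(gamma=10.0 ** np.arange(-3, 2), C=10.0 ** np.arange(-1, 3))

cv = StratifiedKFold(y=Y, n_folds=10)
grid = GridSearchCV(SVC(), param_grid=param_grid, cv=cv,
                    compute_training_score=True)
grid.fit(X, Y)
```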

@jnothman
Owner

See an alternative patch at https://github.com/jnothman/scikit-learn/tree/grid_search_more_info

Note that I have chosen different field names, aiming for consistency and memorability, if not precision of naming.

@amueller
Owner

@jnothman btw, does your version work with lists of dicts as param_grid and with RandomizedSearchCV?
Thinking about it a bit more, I'm not sure your interface is better if the parameter space has a more complicated form. Could you maybe issue a PR? That would make tracking the changes easier.
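For concreteness, this is the kind of more complicated parameter space I mean: a param_grid given as a list of dicts rather than a single grid, where the flat ordering of results no longer maps onto one rectangular array (sketch only):

```python
# param_grid as a list of dicts: two sub-grids of different shape, so the
# results cannot be reshaped into a single 2-D array.
param_grid = [
    {'kernel': ['linear'], 'C': [0.1, 1.0, 10.0]},
    {'kernel': ['rbf'], 'C': [0.1, 1.0, 10.0], 'gamma': [0.01, 0.1]},
]
```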

@jnothman
Owner

I don't think it's better, but it's certainly no worse: it provides exactly the same ordering according to parameter_iterator as your solution did. If that ordering is meaningful, then the data can be reshaped! If it is not, then you've lost nothing.

It doesn't do anything particular to GridSearchCV, though I see now why you might not want to call the attribute grid_results_. But params_results_ is not nice; point_results_ might work, but fit_grid_point actually fits one fold, not one point.

PR forthcoming.
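To illustrate the reshaping point (a sketch, assuming this branch's cv_scores_ and a plain C x gamma grid like the one in the RBF example below):

```python
# When the parameter space is a plain grid, the flat cv_scores_ list follows
# the parameter iterator's ordering and can simply be reshaped, as the RBF
# example below does.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.grid_search import GridSearchCV
from sklearn.svm import SVC

iris = load_iris()
C_range = 10.0 ** np.arange(-1, 3)
gamma_range = 10.0 ** np.arange(-3, 2)
grid = GridSearchCV(SVC(), param_grid=dict(C=C_range, gamma=gamma_range))
grid.fit(iris.data, iris.target)

scores = np.array([point.mean_validation_score for point in grid.cv_scores_])
scores = scores.reshape(len(C_range), len(gamma_range))
```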

@amueller
Owner
@amueller amueller ENH add training score to GridSearchCV.cv_scores_
add docstring for GridSearchCV, RandomizedSearchCV and fit_grid_point. In "fit_grid_point" I used test_score rather than validation_score, as the split is given to the function.
rbf svm grid search example now also shows training scores - which illustrates overfitting for high C, and training/prediction times... which basically serve to illustrate that this is possible. Maybe random forests would be better to evaluate training times?
52ceff3
Commits on May 7, 2013
  1. @amueller

    ENH add training score to GridSearchCV.cv_scores_

    amueller authored
    add docstring for GridSearchCV, RandomizedSearchCV and fit_grid_point. In "fit_grid_point" I used test_score rather than validation_score, as the split is given to the function.
    rbf svm grid search example now also shows training scores - which illustrates overfitting for high C, and training/prediction times... which basically serve to illustrate that this is possible. Maybe random forests would be better to evaluate training times?
2  doc/tutorial/statistical_inference/model_selection.rst
@@ -144,7 +144,7 @@ estimator during the construction and exposes an estimator API::
>>> clf = GridSearchCV(estimator=svc, param_grid=dict(gamma=gammas),
... n_jobs=-1)
>>> clf.fit(X_digits[:1000], y_digits[:1000]) # doctest: +ELLIPSIS
- GridSearchCV(cv=None,...
+ GridSearchCV(compute_training_score=False,...
>>> clf.best_score_ # doctest: +ELLIPSIS
0.9889...
>>> clf.best_estimator_.gamma
67 examples/svm/plot_rbf_parameters.py
@@ -14,10 +14,30 @@
the decision surface smooth, while a high C aims at classifying
all training examples correctly.
-Two plots are generated. The first is a visualization of the
-decision function for a variety of parameter values, and the second
-is a heatmap of the classifier's cross-validation accuracy as
-a function of `C` and `gamma`.
+Two plots are generated. The first is a visualization of the decision function
+for a variety of parameter values, and the second is a heatmap of the
+classifier's cross-validation accuracy and training time as a function of `C`
+and `gamma`.
+
+An interesting observation on overfitting can be made when comparing validation
+and training error: higher C always result in lower training error, as it
+inceases complexity of the classifier.
+
+For the validation set on the other hand, there is a tradeoff of goodness of
+fit and generalization.
+
+We can observe that the lower right half of the parameters (below the diagonal
+with high C and gamma values) is characteristic of parameters that yields an
+overfitting model: the trainin score is very high but there is a wide gap. The
+top and left parts of the parameter plots show underfitting models: the C and
+gamma values can individually or in conjunction constrain the model too much
+leading to low training scores (hence low validation scores too as validation
+scores are on average upper bounded by training scores).
+
+
+We can also see that the training time is quite sensitive to the parameter
+setting, while the prediction time is not impacted very much. This is probably
+a consequence of the small size of the data set.
'''
print(__doc__)
@@ -65,7 +85,8 @@
gamma_range = 10.0 ** np.arange(-5, 4)
param_grid = dict(gamma=gamma_range, C=C_range)
cv = StratifiedKFold(y=Y, n_folds=3)
-grid = GridSearchCV(SVC(), param_grid=param_grid, cv=cv)
+grid = GridSearchCV(SVC(), param_grid=param_grid, cv=cv,
+ compute_training_score=True)
grid.fit(X, Y)
print("The best classifier is: ", grid.best_estimator_)
@@ -108,18 +129,28 @@
# cv_scores_ contains parameter settings and scores
score_dict = grid.cv_scores_
-# We extract just the scores
-scores = [x[1] for x in score_dict]
-scores = np.array(scores).reshape(len(C_range), len(gamma_range))
-
-# draw heatmap of accuracy as a function of gamma and C
-pl.figure(figsize=(8, 6))
-pl.subplots_adjust(left=0.05, right=0.95, bottom=0.15, top=0.95)
-pl.imshow(scores, interpolation='nearest', cmap=pl.cm.spectral)
-pl.xlabel('gamma')
-pl.ylabel('C')
-pl.colorbar()
-pl.xticks(np.arange(len(gamma_range)), gamma_range, rotation=45)
-pl.yticks(np.arange(len(C_range)), C_range)
+# We extract validation and training scores, as well as training and prediction
+# times
+_, val_scores, _, train_scores, train_time, pred_time = zip(*score_dict)
+
+arrays = [val_scores, train_scores, train_time, pred_time]
+titles = ["Validation Score", "Training Score", "Training Time",
+ "Prediction Time"]
+
+# for each value draw heatmap as a function of gamma and C
+pl.figure(figsize=(12, 8))
+for i, (arr, title) in enumerate(zip(arrays, titles)):
+ pl.subplot(2, 2, i + 1)
+ arr = np.array(arr).reshape(len(C_range), len(gamma_range))
+ pl.title(title)
+ pl.imshow(arr, interpolation='nearest', cmap=pl.cm.spectral)
+ pl.xlabel('gamma')
+ pl.ylabel('C')
+ pl.colorbar()
+ pl.xticks(np.arange(len(gamma_range)), ["%.e" % g for g in gamma_range],
+ rotation=45)
+ pl.yticks(np.arange(len(C_range)), ["%.e" % C for C in C_range])
+
+pl.subplots_adjust(top=.95, hspace=.35, left=.0, right=.8, wspace=.05)
pl.show()
173 sklearn/grid_search.py
@@ -14,7 +14,7 @@
from itertools import product
import numbers
import operator
-import time
+from time import time
import warnings
import numpy as np
@@ -190,8 +190,8 @@ def __len__(self):
return self.n_iter
-def fit_grid_point(X, y, base_clf, clf_params, train, test, scorer,
- verbose, loss_func=None, **fit_params):
+def fit_grid_point(X, y, base_clf, clf_params, train, test, scorer, verbose,
+ loss_func=None, compute_training_score=False, **fit_params):
"""Run fit on one set of parameters.
Parameters
@@ -218,6 +218,9 @@ def fit_grid_point(X, y, base_clf, clf_params, train, test, scorer,
If provided must be a scoring object / function with signature
``scorer(estimator, X, y)``.
+ compute_training_score : bool, default=False
+ Whether to compute the training loss. If False, None is returned.
+
verbose : int
Verbosity level.
@@ -227,8 +230,18 @@ def fit_grid_point(X, y, base_clf, clf_params, train, test, scorer,
Returns
-------
- score : float
- Score of this parameter setting on given training / test split.
+ test_score : float
+ Test score of this parameter setting on given training / test split.
+
+ training_score : float or None
+ Training score of this parameter setting or None if
+ ``compute_training_score=False`` (default).
+
+ training_time : float
+ Training time for this parameter setting in seconds.
+
+ prediction_time : float
+ Prediction time for the given test set in seconds.
estimator : estimator object
Estimator object of type base_clf that was fitted using clf_params
@@ -238,7 +251,7 @@ def fit_grid_point(X, y, base_clf, clf_params, train, test, scorer,
Number of test samples in this split.
"""
if verbose > 1:
- start_time = time.time()
+ start_time = time()
msg = '%s' % (', '.join('%s=%s' % (k, v)
for k, v in clf_params.items()))
print("[GridSearchCV] %s %s" % (msg, (64 - len(msg)) * '.'))
@@ -269,34 +282,49 @@ def fit_grid_point(X, y, base_clf, clf_params, train, test, scorer,
X_train = X[safe_mask(X, train)]
X_test = X[safe_mask(X, test)]
+ score_func = (clf.score if scorer is None
+ else lambda X_, y_: scorer(clf, X_, y_))
+
if y is not None:
y_test = y[safe_mask(y, test)]
y_train = y[safe_mask(y, train)]
+ start = time()
+ # do actual fitting
clf.fit(X_train, y_train, **fit_params)
-
- if scorer is not None:
- this_score = scorer(clf, X_test, y_test)
- else:
- this_score = clf.score(X_test, y_test)
+ training_time = time() - start
+ start = time()
+ test_score = score_func(X_test, y_test)
+ predict_time = time() - start
else:
+ start = time()
+ # do actual fitting
clf.fit(X_train, **fit_params)
- if scorer is not None:
- this_score = scorer(clf, X_test)
+ training_time = time() - start
+ start = time()
+ test_score = score_func(X_test)
+ predict_time = time() - start
+
+ if compute_training_score:
+ if y is not None:
+ training_score = score_func(X_train, y_train)
else:
- this_score = clf.score(X_test)
+ training_score = score_func(X_train)
+ else:
+ training_score = None
- if not isinstance(this_score, numbers.Number):
+ if not isinstance(test_score, numbers.Number):
raise ValueError("scoring must return a number, got %s (%s)"
- " instead." % (str(this_score), type(this_score)))
+ " instead." % (str(test_score), type(test_score)))
if verbose > 2:
- msg += ", score=%f" % this_score
+ msg += ", score=%f" % test_score
if verbose > 1:
end_msg = "%s -%s" % (msg,
- logger.short_format_time(time.time() -
+ logger.short_format_time(time() -
start_time))
print("[GridSearchCV] %s %s" % ((64 - len(end_msg)) * '.', end_msg))
- return this_score, clf_params, _num_samples(X_test)
+ return (test_score, training_score, training_time, predict_time,
+ clf_params, _num_samples(X_test))
def _check_param_grid(param_grid):
@@ -318,8 +346,10 @@ def _check_param_grid(param_grid):
_CVScoreTuple = namedtuple('_CVScoreTuple',
- ('parameters', 'mean_validation_score',
- 'cv_validation_scores'))
+ ('parameters', 'mean_validation_score',
+ 'cv_validation_scores',
+ 'mean_training_score', 'training_time',
+ 'prediction_time'))
class BaseSearchCV(six.with_metaclass(ABCMeta, BaseEstimator,
@@ -330,8 +360,10 @@ class BaseSearchCV(six.with_metaclass(ABCMeta, BaseEstimator,
@abstractmethod
def __init__(self, estimator, scoring=None, loss_func=None,
score_func=None, fit_params=None, n_jobs=1, iid=True,
- refit=True, cv=None, verbose=0, pre_dispatch='2*n_jobs'):
+ refit=True, cv=None, verbose=0, pre_dispatch='2*n_jobs',
+ compute_training_score=False):
+ self.compute_training_score = compute_training_score
self.scoring = scoring
self.estimator = estimator
self.loss_func = loss_func
@@ -452,35 +484,59 @@ def _fit(self, X, y, parameter_iterator, **params):
pre_dispatch=pre_dispatch)(
delayed(fit_grid_point)(
X, y, base_clf, clf_params, train, test, scorer,
- self.verbose, **self.fit_params) for clf_params in
+ self.verbose,
+ compute_training_score=self.compute_training_score,
+ **self.fit_params) for clf_params in
parameter_iterator for train, test in cv)
-
+ # type and list for storing results
+ cv_scores = []
# Out is a list of triplet: score, estimator, n_test_samples
n_param_points = len(list(parameter_iterator))
n_fits = len(out)
n_folds = n_fits // n_param_points
- scores = list()
- cv_scores = list()
- for grid_start in range(0, n_fits, n_folds):
+ for start in range(0, n_fits, n_folds):
n_test_samples = 0
- score = 0
- these_points = list()
- for this_score, clf_params, this_n_test_samples in \
- out[grid_start:grid_start + n_folds]:
- these_points.append(this_score)
+ mean_validation_score, mean_training_score = 0, 0
+ # lists for accumulating statistics over fold
+ test_points, training_times, prediction_times = [], [], []
+ for (test_score, training_score, training_time, prediction_time,
+ clf_params, this_n_test_samples) in out[start:start +
+ n_folds]:
+ test_points.append(test_score)
+ training_times.append(training_time)
+ prediction_times.append(prediction_time)
if self.iid:
- this_score *= this_n_test_samples
- n_test_samples += this_n_test_samples
- score += this_score
+ test_score *= this_n_test_samples
+ # assumes n_train + n_test = len(X)
+ mean_validation_score += test_score
+
+ if self.compute_training_score:
+ if self.iid:
+ training_score *= n_samples - this_n_test_samples
+ mean_training_score += training_score
+
+ n_test_samples += this_n_test_samples
+
if self.iid:
- score /= float(n_test_samples)
+ mean_validation_score /= float(n_test_samples)
+ else:
+ mean_validation_score /= n_folds
+
+ if self.compute_training_score:
+ if self.iid:
+ # again, we assume n_train + n_test = len(X)
+ mean_training_score /= (n_folds * n_samples
+ - float(n_test_samples))
+ else:
+ mean_training_score /= n_folds
else:
- score /= float(n_folds)
- scores.append((score, clf_params))
- cv_scores.append(these_points)
+ mean_training_score = None
- cv_scores = np.asarray(cv_scores)
+ cv_scores.append(_CVScoreTuple(
+ clf_params, mean_validation_score,
+ test_points, mean_training_score,
+ np.mean(training_times), np.mean(prediction_times)))
# Note: we do not use max(out) to make ties deterministic even if
# comparison on estimator instances is not deterministic
@@ -494,14 +550,17 @@ def _fit(self, X, y, parameter_iterator, **params):
else:
best_score = np.inf
- for score, params in scores:
+ for point in cv_scores:
+ score = point.mean_validation_score
if ((score > best_score and greater_is_better)
- or (score < best_score and not greater_is_better)):
+ or (score < best_score
+ and not greater_is_better)):
best_score = score
- best_params = params
+ best_params = point.parameters
self.best_params_ = best_params
self.best_score_ = best_score
+ self.cv_scores_ = cv_scores
if self.refit:
# fit the best estimator using the entire dataset
@@ -513,11 +572,6 @@ def _fit(self, X, y, parameter_iterator, **params):
best_estimator.fit(X, **self.fit_params)
self.best_estimator_ = best_estimator
- # Store the computed scores
- self.cv_scores_ = [
- _CVScoreTuple(clf_params, score, all_scores)
- for clf_params, (score, _), all_scores
- in zip(parameter_iterator, scores, cv_scores)]
return self
@@ -598,7 +652,7 @@ class GridSearchCV(BaseSearchCV):
>>> clf = grid_search.GridSearchCV(svr, parameters)
>>> clf.fit(iris.data, iris.target)
... # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
- GridSearchCV(cv=None,
+ GridSearchCV(compute_training_score=False, cv=None,
estimator=SVC(C=1.0, cache_size=..., coef0=..., degree=...,
gamma=..., kernel='rbf', max_iter=-1, probability=False,
shrinking=True, tol=...),
@@ -617,6 +671,12 @@ class GridSearchCV(BaseSearchCV):
* ``mean_validation_score``, the mean score over the
cross-validation folds
* ``cv_validation_scores``, the list of scores for each fold
+ * ``mean_training_score``, the mean of the training score
+ over cross-validation folds. Only available if
+ ``compute_training_score=True``.
+ * ``training_time``, the mean training time in seconds.
+ * ``prediction_time``, the mean prediction time over the test set
+ in seconds.
`best_estimator_` : estimator
Estimator that was chosen by the search, i.e. estimator
@@ -656,10 +716,11 @@ class GridSearchCV(BaseSearchCV):
def __init__(self, estimator, param_grid, scoring=None, loss_func=None,
score_func=None, fit_params=None, n_jobs=1, iid=True,
- refit=True, cv=None, verbose=0, pre_dispatch='2*n_jobs'):
+ refit=True, cv=None, verbose=0, pre_dispatch='2*n_jobs',
+ compute_training_score=False):
super(GridSearchCV, self).__init__(
estimator, scoring, loss_func, score_func, fit_params, n_jobs, iid,
- refit, cv, verbose, pre_dispatch)
+ refit, cv, verbose, pre_dispatch, compute_training_score)
self.param_grid = param_grid
_check_param_grid(param_grid)
@@ -773,6 +834,12 @@ class RandomizedSearchCV(BaseSearchCV):
* ``mean_validation_score``, the mean score over the
cross-validation folds
* ``cv_validation_scores``, the list of scores for each fold
+ * ``mean_training_score``, the mean of the training score
+ over cross-validation folds. Only available if
+ ``compute_training_score=True``.
+ * ``training_time``, the mean training time in seconds.
+ * ``prediction_time``, the mean prediction time over the test set
+ in seconds.
`best_estimator_` : estimator
Estimator that was chosen by the search, i.e. estimator
@@ -814,13 +881,13 @@ class RandomizedSearchCV(BaseSearchCV):
def __init__(self, estimator, param_distributions, n_iter=10, scoring=None,
loss_func=None, score_func=None, fit_params=None, n_jobs=1,
iid=True, refit=True, cv=None, verbose=0,
- pre_dispatch='2*n_jobs'):
+ pre_dispatch='2*n_jobs', compute_training_score=False):
self.param_distributions = param_distributions
self.n_iter = n_iter
super(RandomizedSearchCV, self).__init__(
estimator, scoring, loss_func, score_func, fit_params, n_jobs, iid,
- refit, cv, verbose, pre_dispatch)
+ refit, cv, verbose, pre_dispatch, compute_training_score)
def fit(self, X, y=None, **params):
"""Run fit on the estimator with randomly drawn parameters.
26 sklearn/tests/test_grid_search.py
@@ -17,6 +17,7 @@
from sklearn.utils.testing import assert_equal
from sklearn.utils.testing import assert_raises
from sklearn.utils.testing import assert_raise_message
+from sklearn.utils.testing import assert_greater
from sklearn.utils.testing import assert_true
from sklearn.utils.testing import assert_array_equal
from sklearn.utils.testing import assert_almost_equal
@@ -227,20 +228,22 @@ def test_grid_search_iid():
# once with iid=True (default)
grid_search = GridSearchCV(svm, param_grid={'C': [1, 10]}, cv=cv)
grid_search.fit(X, y)
- _, average_score, scores = grid_search.cv_scores_[0]
+ scores = grid_search.cv_scores_[0].cv_validation_scores
assert_array_almost_equal(scores, [1, 1. / 3.])
# for first split, 1/4 of dataset is in test, for second 3/4.
# take weighted average
+ average_score = grid_search.cv_scores_[0].mean_validation_score
assert_almost_equal(average_score, 1 * 1. / 4. + 1. / 3. * 3. / 4.)
# once with iid=False (default)
grid_search = GridSearchCV(svm, param_grid={'C': [1, 10]}, cv=cv,
iid=False)
grid_search.fit(X, y)
- _, average_score, scores = grid_search.cv_scores_[0]
+ scores = grid_search.cv_scores_[0].cv_validation_scores
# scores are the same as above
assert_array_almost_equal(scores, [1, 1. / 3.])
# averaged score is just mean of scores
+ average_score = grid_search.cv_scores_[0].mean_validation_score
assert_almost_equal(average_score, np.mean(scores))
@@ -412,6 +415,21 @@ def test_grid_search_precomputed_kernel_error_kernel_function():
assert_raises(ValueError, cv.fit, X_, y_)
+def test_grid_search_training_score():
+ # test that the training score contains sensible numbers
+ X, y = make_classification(n_samples=200, n_features=100, random_state=0)
+ clf = LinearSVC(random_state=0)
+ cv = GridSearchCV(clf, {'C': [0.1, 1.0]}, compute_training_score=True)
+ cv.fit(X, y)
+ for grid_point in cv.cv_scores_:
+ assert_greater(grid_point.mean_training_score,
+ grid_point.mean_validation_score)
+ # hacky greater-equal
+ assert_greater(1 + 1e-10, grid_point.mean_training_score)
+ assert_greater(grid_point.training_time, 0)
+ assert_greater(grid_point.prediction_time, 0)
+
+
class BrokenClassifier(BaseEstimator):
"""Broken classifier that cannot be fit twice"""
@@ -510,9 +528,9 @@ def test_grid_search_score_consistency():
grid_search = GridSearchCV(clf, {'C': Cs}, scoring=score)
grid_search.fit(X, y)
cv = StratifiedKFold(n_folds=3, y=y)
- for C, scores in zip(Cs, grid_search.cv_scores_):
+ for C, result in zip(Cs, grid_search.cv_scores_):
clf.set_params(C=C)
- scores = scores[2] # get the separate runs from grid scores
+ scores = result[2] # get the separate runs from grid scores
i = 0
for train, test in cv:
clf.fit(X[train], y[train])