Estimators in the forest module (random forest and extra trees) and in the bagging module can compute out-of-bag estimates of the ensemble's performance.
A nice addition would be to allow the choice of the scoring function through the scorer interface. The oob_score parameter could then be set to the string naming the appropriate scorer.
The docstring would then read:
oob_score : bool, string or callable, (default=False)
Whether to use out-of-bag samples to estimate
the generalization error. The oob scoring function could be chosen
by passing a string (see model evaluation documentation) or
a scorer callable object / function with signature
``scorer(estimator, X, y)``.
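
To make the proposed dispatch concrete, here is a minimal, dependency-free sketch of how an `oob_score` value could be resolved to a scorer and applied to out-of-bag predictions. Everything in it is an assumption made for illustration: `DummyPredictor`, `SCORERS`, `resolve_oob_scorer` and `compute_oob_score` are hypothetical names, not scikit-learn API, and using accuracy as the default for `oob_score=True` is only a placeholder choice.

```python
# Hypothetical sketch only: none of these names exist in scikit-learn.


class DummyPredictor:
    """Stand-in estimator that replays precomputed out-of-bag predictions."""

    def __init__(self, oob_predictions):
        self.oob_predictions = list(oob_predictions)

    def predict(self, X):
        # X is ignored: every training sample already has its OOB prediction.
        return self.oob_predictions


def accuracy_scorer(estimator, X, y):
    """Scorer with the signature scorer(estimator, X, y)."""
    y_pred = estimator.predict(X)
    return sum(p == t for p, t in zip(y_pred, y)) / len(y)


# Hypothetical registry mapping scorer names to scorer callables.
SCORERS = {"accuracy": accuracy_scorer}


def resolve_oob_scorer(oob_score, default="accuracy"):
    """Turn a bool, string or callable into a scorer (or None if disabled)."""
    if oob_score is None or oob_score is False:
        return None
    if oob_score is True:
        return SCORERS[default]
    if isinstance(oob_score, str):
        return SCORERS[oob_score]
    if callable(oob_score):
        return oob_score
    raise ValueError("oob_score must be a bool, a string or a callable")


def compute_oob_score(oob_predictions, X, y, oob_score):
    """Score precomputed OOB predictions with the requested scorer."""
    scorer = resolve_oob_scorer(oob_score)
    if scorer is None:
        return None
    return scorer(DummyPredictor(oob_predictions), X, y)
```

With this shape, `compute_oob_score(preds, X, y, "accuracy")` and `compute_oob_score(preds, X, y, my_scorer)` go through the same code path, so the ensemble itself would not need to reimplement any loss.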
In fact, one of the things I don't like about the current implementation of oob scores is that they don't even rely on the underlying score method of the ensemble. By default, a better implementation should do that (instead of reimplementing the zero-one loss and the squared error loss in the forest module), or use a given scorer, if one is provided, as you suggest.
Also, I think such a refactoring should go hand in hand with #3436, which again adds boilerplate code reimplementing the zero-one loss and the squared error loss.
+1 for the refactoring
Hi - Is this ticket available? I'm interested in working on it.
[WIP] Choose out of bag scoring metric. Fixes #3455
Hi - I created a WIP pull request. Please let me know if this approach with a DummyPredictor seems reasonable; see the description in the pull request: #3723