
Commit

Merge pull request #138 from ilarischeinin/master
Fix a couple of typos in manual pages.
topepo committed Apr 8, 2015
2 parents d3b6bfd + c3eb5aa commit b303cb7
Showing 2 changed files with 6 additions and 6 deletions.
10 changes: 5 additions & 5 deletions pkg/caret/man/selection.Rd
@@ -19,22 +19,22 @@ tolerance(x, metric, tol = 1.5, maximize)
\item{tol}{the acceptable percent tolerance (for \code{tolerance} only)}
}
\details{
- These functions can be used by \code{\link{train}} to select the "optimal" model form a series of models. Each requires the user to select a metric that will be used to judge performance. For regression models, values of \code{"RMSE"} and \code{"Rsquared"} are applicable. Classification models use either \code{"Accuracy"} or \code{"Kappa"} (for unbalanced class distributions.
+ These functions can be used by \code{\link{train}} to select the "optimal" model from a series of models. Each requires the user to select a metric that will be used to judge performance. For regression models, values of \code{"RMSE"} and \code{"Rsquared"} are applicable. Classification models use either \code{"Accuracy"} or \code{"Kappa"} (for unbalanced class distributions.

More details on these functions can be found at \url{http://topepo.github.io/caret/training.html#custom}.

By default, \code{\link{train}} uses \code{best}.

\code{best} simply chooses the tuning parameter associated with the largest (or lowest for \code{"RMSE"}) performance.

- \code{oneSE} is a rule in the spirit of the "one standard error" rule of Breiman et al. (1984), who suggest that the tuning parameter associated with eh best performance may over fit. They suggest that the simplest model within one standard error of the empirically optimal model is the better choice. This assumes that the models can be easily ordered from simplest to most complex (see the Details section below).
+ \code{oneSE} is a rule in the spirit of the "one standard error" rule of Breiman et al. (1984), who suggest that the tuning parameter associated with the best performance may over fit. They suggest that the simplest model within one standard error of the empirically optimal model is the better choice. This assumes that the models can be easily ordered from simplest to most complex (see the Details section below).
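For illustration, a minimal sketch of the one-SE rule on made-up numbers (this is not caret's internal code; candidates are assumed to be ordered simplest to most complex):

    perf    <- c(0.80, 0.84, 0.86, 0.85)   # made-up resampled accuracies
    perf_se <- c(0.02, 0.02, 0.02, 0.03)   # their standard errors
    top     <- which.max(perf)             # what caret's best() would pick (model 3)
    cutoff  <- perf[top] - perf_se[top]    # one standard error below the best
    min(which(perf >= cutoff))             # simplest model within one SE: model 2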

\code{tolerance} takes the simplest model that is within a percent tolerance of the empirically optimal model. For example, if the largest Kappa value is 0.5 and a simpler model within 3 percent is acceptable, we score the other models using \code{(x - 0.5)/0.5 * 100}. The simplest model whose score is not less than -3 is chosen (in this case, a model with a Kappa value of at least 0.485 is acceptable).
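The scoring arithmetic can be reproduced directly (a sketch with made-up Kappa values, ordered simplest model first):

    kappa <- c(0.35, 0.47, 0.49, 0.50)                # made-up values, simplest first
    score <- (kappa - max(kappa)) / max(kappa) * 100  # 0 for the best, negative otherwise
    min(which(score >= -3))                           # simplest model within 3 percent: model 3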

User--defined functions can also be used. The argument \code{selectionFunction} in \code{\link{trainControl}} can be used to pass the function directly or to pass the function by name.
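A sketch of both options, assuming the argument convention shown in the usage above (the function receives the data frame of tuning results x, the metric name, and maximize, and returns a single row index); pickSimplest is a hypothetical name:

    library(caret)

    # Hypothetical rule: simplest candidate within 0.01 of the best performance
    pickSimplest <- function(x, metric, maximize) {
      perf <- x[, metric]
      ok <- if (maximize) perf >= max(perf) - 0.01 else perf <= min(perf) + 0.01
      min(which(ok))   # row index of the chosen model
    }
    ctrl <- trainControl(method = "cv", selectionFunction = pickSimplest)
    # or pass a built-in rule by name:
    ctrl <- trainControl(method = "cv", selectionFunction = "oneSE")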
}
\value{
- an row index
+ a row index
}
\references{Breiman, Friedman, Olshen, and Stone. (1984) \emph{Classification and Regression Trees}. Wadsworth.}
\author{Max Kuhn}
@@ -49,9 +49,9 @@ RBF SVM models are ordered first by the cost parameter, then by the kernel param

Neural networks are ordered by the number of hidden units and then the amount of weight decay.

- $k$--nearest neighbor models are ordered from most neighbors to least (i.e. smoothest to model jagged decision boundaries).
+ k--nearest neighbor models are ordered from most neighbors to least (i.e. smoothest to model jagged decision boundaries).

- Elastic net models are ordered first n the L1 penalty, then by the L2 penalty.
+ Elastic net models are ordered first on the L1 penalty, then by the L2 penalty.
}
\seealso{\code{\link{train}}, \code{\link{trainControl}}}
\examples{
2 changes: 1 addition & 1 deletion pkg/caret/man/trainControl.Rd
@@ -61,7 +61,7 @@ trainControl(method = "boot",
\details{
When setting the seeds manually, the number of models being evaluated is required. This may not be obvious, as \code{train} does some optimizations for certain models. For example, when tuning over a PLS model, the only model that is fit is the one with the largest number of components. So if the model is being tuned over \code{ncomp = 1:10}, the only model fit is \code{ncomp = 10}. However, if the vector of integers used in the \code{seeds} argument is longer than actually needed, no error is thrown.
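For that PLS scenario with 10-fold CV, a sketch of a conforming seeds object (10 resamples plus one final element; each vector holds one seed per candidate value of ncomp):

    library(caret)
    set.seed(1)
    seeds <- vector(mode = "list", length = 11)           # B = 10 resamples, plus 1
    for (i in 1:10) seeds[[i]] <- sample.int(10000, 10)   # one seed per candidate model
    seeds[[11]] <- sample.int(10000, 1)                   # single seed for the final fit
    ctrl <- trainControl(method = "cv", number = 10, seeds = seeds)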

- Using \code{method = "none"} and specifying model than one model in \code{\link{train}}'s \code{tuneGrid} or \code{tuneLength} arguments will result in an error.
+ Using \code{method = "none"} and specifying more than one model in \code{\link{train}}'s \code{tuneGrid} or \code{tuneLength} arguments will result in an error.
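In other words, the tuning grid must collapse to a single candidate, e.g. (a sketch using the built-in iris data):

    library(caret)
    ctrl <- trainControl(method = "none")
    # works: exactly one tuning combination
    fit <- train(Species ~ ., data = iris, method = "knn",
                 tuneGrid = data.frame(k = 5), trControl = ctrl)
    # train(..., tuneLength = 3, trControl = ctrl) would error: three candidates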
When adaptive resampling is used (\code{method} is either \code{"adaptive_cv"}, \code{"adaptive_boot"} or \code{"adaptive_LGOCV"}), the full set of resamples is not run for each model. As resampling continues, a futility analysis is conducted and models with a low probability of being optimal are removed. These features are experimental. See Kuhn (2014) for more details. The options for this procedure are:
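A sketch of requesting adaptive resampling, assuming the adaptive list of options (min, alpha, method, complete) documented in current caret releases:

    library(caret)
    ctrl <- trainControl(method = "adaptive_cv",
                         number = 10, repeats = 5,
                         adaptive = list(min = 5,           # resamples before the first cull
                                         alpha = 0.05,      # confidence level for removal
                                         method = "gls",    # "gls" or "BT" (Bradley-Terry)
                                         complete = TRUE))  # fit the full set for the winner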
