add case weights to summary function #1

topepo · 2014-05-16T16:23:21Z

The original request is from here. This is the request:

I'm using R's caret package to do some grid search and model evaluation. I have a custom evaluation metric that is a weighted average of absolute error. Weights are assigned at the observation level.

X <- c(1,1,2,0,1) #feature 1
w <- c(1,2,2,1,1) #weights
Y <- 1:5 #target, continuous

#assume I run a model using X as features and Y as target and get a vector of predictions

mymetric <- function(predictions, target, weights){
  v <- sum(abs(target-predictions)*weights)/sum(weights)
  return(v)
}

Here an example is given on how to use summaryFunction to define a custom evaluation metric for caret's train().
To quote:

The trainControl function has an argument called summaryFunction that specifies a function for computing performance. The function should have these arguments:

data is a reference for a data frame or matrix with columns called obs
and pred for the observed and predicted outcome values (either numeric
data for regression or character values for classification).
Currently, class probabilities are not passed to the function. The
values in data are the held-out predictions (and their associated
reference values) for a single combination of tuning parameters. If
the classProbs argument of the trainControl object is set to TRUE,
additional columns in data will be present that contain the class
probabilities. The names of these columns are the same as the class
levels.

lev is a character string that has the outcome factor levels
taken from the training data. For regression, a value of NULL is
passed into the function.

model is a character string for the model
being used (i.e. the value passed to the method argument of train).

I cannot quite figure out how to pass the observation weights to summaryFunction.

topepo · 2014-06-04T19:40:15Z

Resolved (and test cases added) as of caret version 6.0-29

myloginid · 2015-12-03T11:01:56Z

Hi, I am using caret 6.0-58 but am still not able to use a custom summary function. My code and the error are below:

mymetric <- function(predictions, target, weights){
  v <- sum(abs(target-predictions)*weights)/sum(weights)
  return(v)
}

number = 10
tmethod = "boot"
tc = trainControl(method = "boot",
                  number = ifelse(grepl("cv", tmethod), 10, 25),
                  repeats = ifelse(grepl("cv", tmethod), 1, number),
                  p = 0.75,
                  search = "grid",
                  initialWindow = NULL,
                  horizon = 1,
                  fixedWindow = TRUE,
                  verboseIter = FALSE,
                  returnData = TRUE,
                  returnResamp = "final",
                  savePredictions = FALSE,
                  classProbs = FALSE,
                  # summaryFunction = mymetric,
                  selectionFunction = "best",
                  preProcOptions = list(thresh = 0.95, ICAcomp = 3, k = 5),
                  sampling = NULL,
                  index = NULL,
                  indexOut = NULL,
                  timingSamps = 0,
                  predictionBounds = rep(FALSE, 2),
                  seeds = NA,
                  adaptive = list(min = 5, alpha = 0.05, method = "gls", complete = TRUE),
                  trim = FALSE,
                  allowParallel = TRUE)

tglm = train( x = wip[,c(basef,retf_p1)] , y = wip$Ret_2, method = "glm", weights = wip$Weight_Intraday, trControl = tc)

Error in FUN(left, right) : non-numeric argument to binary operator
7 eval(expr, envir, enclos)
6 eval(f)
5 Ops.data.frame(abs(target - predictions), weights)
4 ctrl$summaryFunction(testOutput, lev, method)
3 evalSummaryFunction(y, wts = weights, ctrl = trControl, lev = classLevels,
metric = metric, method = method)
2 train.default(x = wip[, c(basef, retf_p1)], y = wip$Ret_2, method = "glm",
weights = wip$Weight_Intraday, trControl = tc)
1 train(x = wip[, c(basef, retf_p1)], y = wip$Ret_2, method = "glm",
weights = wip$Weight_Intraday, trControl = tc)

I have supplied the weights in the call to train. Please let me know if there is a mistake in the call.

I also tried renaming the function's arguments to match the column names in my data, like this:
mymetric <- function(predictions, Ret_2, Weight_Intraday){
  v <- sum(abs(Ret_2-predictions)*Weight_Intraday)/sum(Weight_Intraday)
  return(v)
}

But it still failed.

Thanks,
Manish

topepo · 2015-12-03T16:13:09Z

The (lack of) details are here.

Basically, when the summary function is called, there is a data frame called data available within the R function that you supply. Normally, it has columns for the holdout data called obs, pred, and rowIndex. If you use weights in the function call to train, then there is an additional column called weights. For example, the data object might look like this:

          pred        obs   weights rowIndex
123  virginica     setosa 0.8394404      107
86  versicolor  virginica 0.7548209      136
35      setosa  virginica 0.4112744       62
17      setosa versicolor 0.4314737      116
121  virginica versicolor 0.3823880       23
137  virginica     setosa 0.3162717        2

Your summary function can use this weight column for its calculations.
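As a minimal sketch of what that could look like for the weighted absolute error from the original request: the traceback above shows the function being called as ctrl$summaryFunction(testOutput, lev, method), so the arguments are matched by position and the first one receives the whole data frame. The name weighted_mae, the metric name wMAE, and the mock holdout data frame are made up for illustration; the fallback to equal weights is an assumption, not something caret requires.

```r
# Weighted mean absolute error written in the (data, lev, model) form that
# caret passes to a summary function. `weighted_mae` and "wMAE" are
# illustrative names, not part of caret.
weighted_mae <- function(data, lev = NULL, model = NULL) {
  # Assumption: fall back to equal weights when train() was called
  # without `weights`, so no `weights` column is present.
  w <- if ("weights" %in% names(data)) data$weights else rep(1, nrow(data))
  # Return a named numeric vector; the name is what `metric` refers to.
  c(wMAE = sum(abs(data$obs - data$pred) * w) / sum(w))
}

# Mock hold-out data shaped like the `data` object described above
# (regression case, so obs and pred are numeric).
holdout <- data.frame(
  pred    = c(1.5, 2.0, 2.5, 4.0, 6.0),
  obs     = 1:5,
  weights = c(1, 2, 2, 1, 1)
)
result <- weighted_mae(holdout)
print(result)

# With caret loaded, it would be wired in roughly like this (not run here):
#   tc  <- trainControl(method = "boot", summaryFunction = weighted_mae)
#   fit <- train(x, y, method = "glm", weights = w, trControl = tc,
#                metric = "wMAE", maximize = FALSE)
```

Since this is an error measure, maximize = FALSE tells train to prefer smaller values when choosing tuning parameters.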

Please note that not all R model functions can use case weights, so if you want to use a column of your data for case weights, you will have to look at the underlying model function using getModelInfo. I'm working on tagging models that can use weights so a list of them will be available.
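One rough way to do that inspection is sketched below. It relies on a heuristic that is an assumption here: a model's fit function receives the case weights in an argument named wts, so a fit function whose body never mentions wts presumably ignores them. The helper mentions_wts and the two stand-in fit functions are hypothetical. The caret lookup at the end only runs if caret is installed.

```r
# Heuristic check: does the body of a fit function refer to `wts` at all?
# `mentions_wts` is a hypothetical helper, not part of caret.
mentions_wts <- function(fit_fun) {
  "wts" %in% all.names(body(fit_fun))
}

# Stand-in fit functions for illustration only; they are never called.
fit_with_weights    <- function(x, y, wts, ...) lm(y ~ ., data = x, weights = wts)
fit_without_weights <- function(x, y, wts, ...) lm(y ~ ., data = x)

uses    <- mentions_wts(fit_with_weights)
ignores <- mentions_wts(fit_without_weights)

# With caret installed, the same check can be pointed at a real model entry.
if (requireNamespace("caret", quietly = TRUE)) {
  info <- caret::getModelInfo("glm", regex = FALSE)[["glm"]]
  print(mentions_wts(info$fit))
}
```

A mention of wts is only a hint, so reading the printed fit function itself remains the more reliable check.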

topepo added the bug label May 16, 2014

topepo self-assigned this May 16, 2014

topepo closed this as completed Jun 4, 2014
