k-fold prediction functions #468

Open
liamkendall opened this Issue Jul 5, 2018 · 10 comments

liamkendall commented Jul 5, 2018

Hi Paul,

I was wondering if it would be possible for the k-fold function to return the mean absolute error (MAE) and the root mean square error (RMSE) of each training/test set along with the k-fold IC? These are common k-fold metrics and it'd be great to get them as well for model comparison (especially the RMSE, as it is in the units of the response variable).

Thanks,
Liam

paul-buerkner commented Jul 5, 2018

@avehtari is our expert for that. Aki, what do you think about this idea?

avehtari commented Jul 5, 2018

to return the mean absolute error (MAE) and the root mean square error (RMSE) of each training/test set

Could you elaborate on why you would like to have these for each training set?
Could you elaborate on why you would like to have these for each test set instead of each observation?
How are these connected to your prediction task?

to return the mean absolute error (MAE) and the root mean square error (RMSE)

These are cost functions for point estimates. Which point estimate would you like to use? Different point estimates are optimal for MAE and RMSE.

to return the mean absolute error (MAE)

Why mean absolute error and not median absolute error?

Since there are many different options for cost/utility functions, I think it would be better to add kfold prediction functions similar to the loo prediction functions https://www.rdocumentation.org/packages/rstantools/versions/1.5.0/topics/loo-prediction
and let the user compute the rest. For example:

kcv <- kfold(fit)
# proposed helper, analogous to loo_linpred(): draws for the held-out observations
predicts <- kfold_linpred(fit, kcv)
rmse(data$y, colMeans(predicts))         # posterior mean as the point estimate for RMSE
mae(data$y, apply(predicts, 2, median))  # posterior median as the point estimate for MAE
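
(Here rmse() and mae() are not base R functions; a minimal sketch of such helpers, purely for illustration:)

rmse <- function(y, yhat) sqrt(mean((y - yhat)^2))
mae <- function(y, yhat) mean(abs(y - yhat))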

I guess in brms style it would also be possible to have a function to add these to the summary printouts as different *IC's.

liamkendall commented Jul 5, 2018

Hi there,

Could you elaborate on why you would like to have these for each training set?
Could you elaborate on why you would like to have these for each test set instead of each observation?

To clarify, I meant I'd like to get the RMSE/MAE for the actual vs. predicted values of each test set so that I could then calculate the mean/median (plus a measure of variance, i.e. the SE) of either across sets, as is usually done in CV, or even have the mean/median + SE across sets provided in the summary.

The RMSE is useful from a reader's point of view as it is easy to interpret, given it is in the units of the response variable. In my case, I am estimating trait values, so I can provide an estimate of prediction error in millimetres or milligrams depending on the trait.
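
(As an illustration of the computation I mean, assuming the kcv/predicts objects from the example above and a hypothetical fold index stored in the kfold output:)

yhat <- colMeans(predicts)                     # point predictions for all observations
folds <- kcv$fold                              # hypothetical: which fold each observation was held out in
rmse_k <- sapply(split(seq_along(yhat), folds),
                 function(i) sqrt(mean((data$y[i] - yhat[i])^2)))
mean(rmse_k)                                   # mean RMSE across the K test sets
sd(rmse_k) / sqrt(length(rmse_k))              # SE across the K test sets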

Why mean absolute error and not median absolute error?

No real reason why MAE over median absolute error, this is just what I have used in the past.

I'd be happy to calculate these myself as you have suggested in your example but I noticed I wasn't able to extract the actual/predicted values of each test set from the kfold output in brms.

liamkendall commented Jul 5, 2018

I would add that these metrics are also useful when comparing the accuracy/error of your 'new' model with pre-existing predictive models for which the original data may not be available.

avehtari commented Jul 5, 2018

Sorry for not being more clear. I don't object to RMSE and MAE, and I think there should be an easy way to compute them. They are useful as they are on the scale of the measurements (although I think it would often be more useful to report some tail quantile of the error distribution, e.g. state that 95% of prediction errors will be less than x).

I object to calculating RMSE/MAE for each test set by default, as this is in most cases the wrong thing to do. I asked what your prediction task is, in case you are in the rare situation where the task really is to predict groups of observations whose size equals the test set sizes in k-fold CV. I know it's common to do as you describe, but that doesn't make it right. It is more likely that the correct computation for you is to make predictions for all i, then proceed as in loo, and compute RMSE/MAE and corresponding SEs or other uncertainty estimates using n values.
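
(A sketch of that pointwise computation, assuming predicts from the earlier example, including the tail-quantile summary mentioned above:)

yhat <- colMeans(predicts)        # point predictions from the kfold_linpred() example above
err <- data$y - yhat              # one error per observation, n values in total
sqrt(mean(err^2))                 # RMSE over all n observations
sd(err^2) / sqrt(length(err))     # SE of the MSE, computed over n values as in loo
quantile(abs(err), 0.95)          # e.g. "95% of prediction errors are less than x"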

Since different users are likely to have different prediction tasks, different point estimate predictions, and different cost/utility functions, there is a combinatorial explosion, and it's better to make it easy for users to compute what they want and give some examples for the most common ones like RMSE/MAE.

I'd be happy to calculate these myself as you have suggested in your example

Great!

but I noticed I wasn't able to extract the actual/predicted values of each test set from the kfold output in brms.

What I proposed is to add functions which would make this easy, and the example was to illustrate what such a function might look like. With these helper functions you could then choose to compute RMSE over the n individual observations, or to compute RMSE first for different groups and then combine the results however you like. Model comparison is also easier with pointwise differences.

Remember that we have a predictive distribution, but RMSE/MAE measure the error of a point estimate. You also have to decide explicitly how you compute your point prediction. Depending on your model, for example, mean and median point predictions can differ and would give different RMSE/MAE.

@paul-buerkner so maybe make a new issue for adding kfold prediction functions? Pinging also @jgabry

paul-buerkner changed the title from "RMSE and MAE in k-fold output" to "k-fold prediction functions" Jul 6, 2018

paul-buerkner commented Jul 6, 2018

Yeah, that could be useful. I just changed the title of this issue to reflect that. What could an interface for such a method look like?

avehtari commented Jul 6, 2018

The interface depends on what kfold() returns, and kfold() might need to change for that. We probably don't want to store draws from the predictive distribution by default, so either the K fit objects would need to be stored, or kfold() would already need to decide which point predictions are stored.
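
(Purely as a hypothetical sketch of what such an interface could look like; neither the argument nor the helper below exists at this point:)

kcv <- kfold(fit, save_fits = TRUE)   # hypothetical argument: keep the K refitted models in the kfold object
pred <- kfold_predict(kcv)            # hypothetical helper: predictions for the held-out observations of each fold
# pred could then contain, e.g., the observed y and a draws matrix for all n held-out observations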

liamkendall commented Jul 9, 2018

Thank you both for commenting on my post and seeing the utility in this feature. I am most curious about why calculating an RMSE for each test set and then taking a mean/median across sets is incorrect in your mind @avehtari? The aim is to test on 'untested' data, and this seems the only way to get the metric (RMSE) plus a measure of variance (SE) for that 'untested' data. Apologies if I've misunderstood what you said, and thanks again.

avehtari commented Jul 9, 2018

I am most curious about why calculating an RMSE for each test set and then taking a mean/median across sets is incorrect in your mind @avehtari?

First, I didn't say it's always incorrect. Second, a discussion of the details is not that relevant for this issue, but it might be interesting for others who don't see this issue, so @liamkendall could you ask your question again at http://discourse.mc-stan.org/ and I'll answer there?

liamkendall commented Jul 9, 2018

Done! Thanks again 👍
