Performance for logistic regression #38
Ok, maybe we can implement some of those tests, though I hesitate to include too many tests particularly for logistic regression, because the package could become "biased" towards this regression type. But maybe that's no real concern. If these tests are rarely implemented in R, I think it's good to have them in performance.
I agree, this could be an added value to the package. And I think it works quite nicely with the scope of the package as a general toolbox, offering users an easy way to discover, try, select and use the tools they want.
Should we find a common prefix for the goodness-of-fit functions? Like [...]?
Do you consider r2 and all of the other indices as goodness-of-fit metrics? Initially, we thought about prefixing all "performance" functions (in the very broad sense, including goodness of fit) by [...]. However, if the number of indices that we provide increases, we could indeed re-prefix the functions, and provide "shortcuts" for the most popular ones (like [...]).
No, I was not thinking of R2, but rather Hosmer-Lemeshow, or chisq-gof, maybe error-rate - those GoF tests that do some significance testing. So these might get a prefix like [...].
oh right (although r2 for lm also has some significance testing and is considered a goodness-of-fit index 😁) tbh I don't find [...] great, but I am not sure of the alternatives tho, [...].
:-D
Indeed, I haven't even seen one episode yet, though I'm at least a bit addicted to fantasy.
😮
^^ But to come back to the main matter: how do you separate indices like r2 from the other goodness-of-fit indices that would be prefixed? Because they have some level of "testing" (with p-value and all)?
Yes, it's not an "index" (like rmse or r2), but a significance test (except error rate, actually) - at least this could be a rough distinction.
right, I see your point! Then maybe something like [...].
Hi - sorry to be late to the party here, but I have used logistic regressions quite a bit, and I have a standard battery of goodness-of-fit tests I run for these models. For example:
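(A hedged sketch of what such a battery might look like - assuming hoslem.test() from the ResourceSelection package for the Hosmer-Lemeshow test, plus a simple 0.5-cutoff error rate; the model and data are illustrative, not the original snippet.)

```r
# Illustrative goodness-of-fit battery for a logistic regression;
# hoslem.test() comes from the ResourceSelection package
library(ResourceSelection)

m <- glm(vs ~ mpg + cyl, data = mtcars, family = binomial)

# Hosmer-Lemeshow test: a non-significant p-value gives no evidence
# of lack of fit
hoslem.test(m$y, fitted(m), g = 10)

# Error rate: share of observations misclassified at a 0.5 cutoff
mean((fitted(m) >= 0.5) != m$y)
```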
And then of course to compare between models (i.e., model selection):
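(Again a hedged sketch, base R only; the models are illustrative.)

```r
# Illustrative model selection: information criteria plus a
# likelihood-ratio test for the nested pair
m1 <- glm(vs ~ mpg, data = mtcars, family = binomial)
m2 <- glm(vs ~ mpg + cyl, data = mtcars, family = binomial)

AIC(m1, m2)
BIC(m1, m2)
anova(m1, m2, test = "Chisq")
```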
Let me know your thoughts here or if I am way off base.
Just clarifying myself: we could have the individual tests available directly by their name, and all of them "wrapped" together within a [...]. @pdwaggoner What does [...]?
That makes sense.
Oh brilliant - cross-validation is much more useful.
@pdwaggoner you've just been sjstats'ed 😅😅 (the phenomenon of coming up with something and realising sjstats already does it)
Ha! It feels... great (?) ;)
I have drafted a function ([...]). @pdwaggoner @DominiqueMakowski, have any of you worked with the boot package before? I haven't, and I'm not sure how to use it here. In sjstats, I have used an own bootstrapping method from the [...]:

```r
# bootstrap() and resp_val() are helper functions from sjstats
cv <- data %>%
  bootstrap(n) %>%
  dplyr::mutate(
    models = purrr::map(.data$strap, ~ stats::lm(formula, data = .x)),
    predictions = purrr::map(.data$models, ~ stats::predict(.x, type = "response")),
    response = purrr::map(.data$models, ~ resp_val(.x)),
    accuracy = purrr::map2_dbl(.data$predictions, .data$response, ~ stats::cor(.x, .y, use = "pairwise.complete.obs"))
  )
```

Feel free to open a PR or commit to master...
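(For reference, a minimal sketch of how the same resampled-accuracy idea might look with the boot package - an assumption on my part, not the drafted function; formula and data are illustrative.)

```r
# boot() calls the statistic with the data and an index vector
# for each replicate
library(boot)

boot_accuracy <- function(data, indices) {
  d <- data[indices, ]  # resampled rows for this replicate
  m <- glm(vs ~ mpg, data = d, family = binomial)
  # correlation between predicted probabilities and observed response
  cor(predict(m, type = "response"), d$vs, use = "pairwise.complete.obs")
}

res <- boot(mtcars, statistic = boot_accuracy, R = 100)
mean(res$t)  # average bootstrapped accuracy
```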
I also find the native boot package quite unintuitive. Maybe it would be worth reimplementing a bootstrap method, as it would also be used in parameters (see current implementation here).
@pdwaggoner Do you have examples for these two? Although you mentioned that cross-validation might be the better approach compared to ePCP, we could still think about visualization, so I'm interested in the plots you mentioned here.
Hi @strengejacke - apologies for my delay here. Yeah, I have an MWE for each. First, for the ePCP (expected proportion correctly predicted), we are looking for a higher value, bounded between 0 and 1, where 1 = a model that perfectly predicted all actual observations. Here is some sample code using [...]:
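(From that definition, the ePCP is just the average predicted probability assigned to each observation's actual outcome; a minimal sketch, not the original sample code - model and data are placeholders.)

```r
# ePCP: mean predicted probability of the observed outcome
m <- glm(vs ~ mpg, data = mtcars, family = binomial)
p <- fitted(m)  # predicted probabilities of the outcome "1"
y <- m$y        # observed binary response

epcp <- mean(ifelse(y == 1, p, 1 - p))
epcp  # closer to 1 = the model predicts the actual outcomes better
```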
And next, for the [...]:
You can copy and paste this code to see the figures - pretty clean, efficient renderings of model fit. And of course, ROC curves are always great, simple model-fit checks. Let me know your thoughts if you have them. Hope this helps a bit!
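(For the ROC curve, one option among several is the pROC package; a hedged sketch with an illustrative model.)

```r
# ROC curve and AUC with pROC
library(pROC)

m <- glm(vs ~ mpg, data = mtcars, family = binomial)
r <- roc(m$y, fitted(m))  # observed response vs. predicted probabilities

plot(r)  # ROC curve
auc(r)   # area under the curve
```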
The epcp function looks nice, and has the beauty of a clear interpretation. I implemented a function that calculates the ePCP (05243ee), currently named [...].
```r
library(performance)

m1 <- glm(vs ~ mpg, data = mtcars, family = binomial)
m2 <- glm(vs ~ mpg + cyl, data = mtcars, family = binomial)
m3 <- glm(vs ~ mpg + cyl * hp, data = mtcars, family = binomial)

compare_performance(m1, m2, m3)
#>   name class      AIC      BIC   R2_Tjur      RMSE   LOGLOSS      EPCP
#> 1   m1   glm 23.49109 27.88830 0.6666308 0.7393217 0.2732983 0.8359198
#> 2   m2   glm 24.84308 32.17176 0.7063532 0.6810625 0.2319231 0.8554707
#> 3   m3   glm 29.53334 32.46481 0.4738108 0.8932618 0.3989584 0.7410163
```

Created on 2019-04-28 by the reprex package (v0.2.1)
btw, the referenced paper (https://www.cambridge.org/core/services/aop-cambridge-core/content/view/92B052AADD9756C8BCC00527749E029D/S1047198700008524a.pdf/postestimation_uncertainty_in_limited_dependent_variable_models.pdf) also shows a way to implement this for multinomial models; we should keep this in mind... I did not fully understand yet how to do it, but it doesn't look that complicated.
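(If the multinomial extension is, as in the binary case above, the average predicted probability assigned to each observation's observed category, a rough sketch could look like the following - my reading, not the paper's code; model and data are illustrative.)

```r
# Hedged sketch of a multinomial ePCP with nnet::multinom()
library(nnet)

m <- multinom(Species ~ Sepal.Length + Sepal.Width, data = iris, trace = FALSE)
probs <- fitted(m)  # n x k matrix of predicted category probabilities

# for each row, pick the probability of the category actually observed
idx <- cbind(seq_len(nrow(iris)), as.integer(iris$Species))
mean(probs[idx])  # multinomial ePCP
```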
As you might expect, I prefer [...]. I wonder if this could fit into a larger [...].
I only prefer [...].
I think this can be closed. I have opened separate issues for the two remaining features.
Related to #1
they say (on wikipedia 😀): [...]
I don't really have any opinion, never used any of those...