Add option to return p.values to tidy.polr #833

Merged
merged 4 commits into from Jun 2, 2020
5 changes: 5 additions & 0 deletions DESCRIPTION
@@ -285,6 +285,11 @@ Authors@R:
role = "ctb",
email = "shirokuriwaki@gmail.com",
comment = c(ORCID = "0000-0002-5687-2647")),
person(given = "Lukas",
family = "Wallrich",
role = "ctb",
email = "lukas.wallrich@gmail.com",
comment = c(ORCID = "0000-0003-2121-5177")),
person(given = "Chuliang",
family = "Xiao",
role = "ctb",
2 changes: 2 additions & 0 deletions NEWS.md
@@ -155,6 +155,8 @@ pending getting `safepredict()` going urgh)

- `tidy.lsmobj()` gained a `conf.int` argument for consistency with other tidiers.

- `tidy.polr()` now returns p-values if `p.values = TRUE` and the model contains no factors with more than two levels.

- `tidy.zoo()` now doesn't change column names that have spaces or other
special characters (previously they were converted to data.frame friendly
column names by `make.names`)
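As a rough sketch of what the `tidy.polr()` NEWS entry above describes, the snippet below calls `MASS::dropterm()` directly on a model with only numeric and binary predictors and extracts one p-value per term. It uses MASS alone rather than the tidier itself, and the object names `sig` and `p` are illustrative assumptions.

```r
library(MASS)

# Ordinal model with no multi-level factors (same formula as the roxygen example)
fit2 <- polr(factor(gear) ~ am + mpg + qsec, data = mtcars)

# dropterm() refits the model once per term and reports a
# likelihood-ratio chi-squared test for dropping that term
sig <- dropterm(fit2, test = "Chisq")

# The first row is "<none>" (the full model), so it carries no p-value
p <- sig[["Pr(Chi)"]][-1]
names(p) <- rownames(sig)[-1]
p
```

Because every term here maps to exactly one coefficient, the p-values line up one-to-one with the coefficient rows.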
33 changes: 31 additions & 2 deletions R/mass-polr-tidiers.R
@@ -4,6 +4,8 @@
#' @param x A `polr` object returned from [MASS::polr()].
#' @template param_confint
#' @template param_exponentiate
#' @param p.values Logical. Should p-values be returned,
#' based on chi-squared tests from [MASS::dropterm()]? Defaults to `FALSE`.
#' @template param_unused_dots
#'
#' @examples
@@ -16,18 +18,31 @@
#'
#' glance(fit)
#' augment(fit, type.predict = "class")
#'
#' fit2 <- polr(factor(gear) ~ am + mpg + qsec, data = mtcars)
#' tidy(fit2, p.values = TRUE)
#'
#' @evalRd return_tidy(regression = TRUE)
#'
#' @details In `broom 0.7.0` the `coefficient_type` column was renamed to
#' `coef.type`, and the contents were changed as well. Now the contents
#' are `coefficient` and `scale`, rather than `coefficient` and `zeta`.
#'
#' Calculating p-values with the `dropterm()` function is the approach
#' suggested by the MASS package author
#' (https://r.789695.n4.nabble.com/p-values-of-plor-td4668100.html). This
#' approach is computationally intensive, so p-values are returned only
#' when requested explicitly. Additionally, it only works for
#' models containing no variables with more than two categories. If this
#' condition is not met, a message is shown and `NA` is returned instead of
#' p-values.
#'
#' @aliases polr_tidiers
#' @export
#' @seealso [tidy], [MASS::polr()]
#' @family ordinal tidiers
tidy.polr <- function(x, conf.int = FALSE, conf.level = 0.95,
-exponentiate = FALSE, ...) {
+exponentiate = FALSE, p.values = FALSE, ...) {
ret <- as_tibble(coef(summary(x)), rownames = "term")
colnames(ret) <- c("term", "estimate", "std.error", "statistic")

@@ -38,8 +53,22 @@ tidy.polr <- function(x, conf.int = FALSE, conf.level = 0.95,

if (exponentiate) {
  ret <- exponentiate(ret)
}

# p-values are computed whenever requested, independent of `exponentiate`
if (p.values) {
  # Likelihood-ratio chi-squared test for dropping each term; row 1 is "<none>"
  sig <- MASS::dropterm(x, test = "Chisq")
  p <- sig[["Pr(Chi)"]][-1]
  # Map each dropped term label onto the coefficient names that contain it
  terms <- purrr::map(rownames(sig)[-1], function(label)
    ret$term[stringr::str_detect(ret$term, stringr::fixed(label))]) %>% unlist()
  if (length(p) == length(terms)) {
    ret <- dplyr::left_join(ret, tibble::tibble(term = terms, p.value = p), by = "term")
  } else {
    message(
      "p-values can presently only be returned for models that contain ",
      "no categorical variables with more than two levels"
    )
    ret$p.value <- NA
  }
}

mutate(
ret,
coef.type = if_else(term %in% names(x$zeta), "scale", "coefficient")
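To illustrate why the length check in `tidy.polr()` rejects models with multi-level factors, here is a minimal sketch using the `housing` data that ships with MASS. It substitutes base R's `grepl(..., fixed = TRUE)` for the `stringr::str_detect()` call in the diff, and the names `coef_names`, `term_labels`, and `matched` are assumptions made for this example.

```r
library(MASS)

# Infl has 3 levels and Type has 4, so each contributes several coefficients
fit <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing,
            Hess = TRUE)

# One name per row of the coefficient table (slopes plus intercepts)
coef_names <- rownames(coef(summary(fit)))

# One label per model term; the first row of dropterm() is "<none>"
term_labels <- rownames(dropterm(fit, test = "Chisq"))[-1]

# Collect the coefficient names that contain each term label
matched <- unlist(lapply(term_labels, function(label) {
  coef_names[grepl(label, coef_names, fixed = TRUE)]
}))

length(term_labels)  # 3 terms: Infl, Type, Cont
length(matched)      # 6 coefficients, so the lengths differ
```

`dropterm()` yields one test per term, while a factor with k levels yields k - 1 coefficients, so the p-values cannot be assigned row by row and the tidier falls back to `NA`.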