Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding statistic and p-values to rq class objects #404

Closed
IndrajeetPatil opened this issue Jun 25, 2018 · 6 comments
Closed

Adding statistic and p-values to rq class objects #404

IndrajeetPatil opened this issue Jun 25, 2018 · 6 comments

Comments

@IndrajeetPatil
Copy link
Contributor

The tidy.rq function currently doesn't output either the statistic (t-value) or p.value for the quantile regression, but this can be easily obtained using summary.rq(). It will be nice if this is included in the tidy output to be consistent with output from other regression models. The se = "nid" part can be absorbed in ..., so if anybody wants to use a differerent method to compute standard standard errors, they can use this argument.

# loading libraries and needed data
library(broom)
library(quantreg)
data(engel)

# specifying quantile regression model
rq_model <- quantreg::rq(
  formula = foodexp ~ income,
  tau = 0.5,
  data = engel
)

# getting tidy summary
broom::tidy(x = rq_model)
#>          term   estimate   conf.low   conf.high tau
#> 1 (Intercept) 81.4822474 47.0904023 135.1883939 0.5
#> 2      income  0.5601806  0.4803301   0.6127786 0.5

# getting summary with summary.rq
coef(summary(object = rq_model, se = "nid"))
#>                  Value  Std. Error   t value     Pr(>|t|)
#> (Intercept) 81.4822474 19.25066025  4.232699 3.322875e-05
#> income       0.5601806  0.02827721 19.810319 0.000000e+00

Created on 2018-06-25 by the reprex package (v0.2.0).

@IndrajeetPatil IndrajeetPatil changed the title Adding statistics and p-values to rq class objects Adding statistic and p-values to rq class objects Jun 26, 2018
@alexpghayes alexpghayes added this to the 0.7.0 milestone Jul 13, 2018
@ethchr
Copy link
Contributor

ethchr commented Jul 18, 2018

I'd be happy to take a crack at this.

@alexpghayes
Copy link
Collaborator

Awesome! Will follow up with some details tomorrow.

@alexpghayes
Copy link
Collaborator

You'll want to start by forking the broom repository, and then cloning your fork to your computer. Create a new branch called tidy-rq-p-values or something similar. If you are new to git, I recommend this guide by Jenny Bryan. Then you'll want to look for R/quantreg-rq-tidiers.R. Once you've found the tidy.rq() function, you'll need to modify it to include statistic and p.value columns.

After you've got everything working the way you want, you should document your work following tidyverse documentation guidelines.

Then next step will be to write some tests to ensure that your functions are working correctly. You'll want to edit the tests in tests/testthat/test-quantreg-rq.R.

Then you'll want to update NEWS.md with a quick description of the changes you've made. You should also add yourself to DESCRIPTION as a contributor.

Once the tests are passing, the final step is to commit all your work to your fork, and then open a pull request to broom.

It's also a good idea to read through the adding new tidiers vignette.

I'm happy to answer questions anytime or help you move forward if you get stuck! Happy coding! 🎉

@ethchr
Copy link
Contributor

ethchr commented Jul 20, 2018

The issue here seems to be tidy.rq()'s use of se = "rank" as the default standard error method. summary.rq() doesn't provide t-statistics and p-values from this standard error method. tidy.rq() is already programmed to provide statistic and p.value columns for any other type of standard error, for example using se = "nid" below:

library(broom)
library(quantreg)
data(engel)

# specifying quantile regression model
rq_model <- quantreg::rq(
  formula = foodexp ~ income,
  tau = 0.5,
  data = engel
)

# get the tidy summary using "nid" standard errors
broom::tidy(rq_model, se = "nid")
#>          term   estimate   std.error statistic      p.value   conf.low   conf.high tau
#> 1 (Intercept) 81.4822474 19.25066025  4.232699 3.322875e-05 43.5546428 119.4098520 0.5
#> 2      income  0.5601806  0.02827721 19.810319 0.000000e+00  0.5044689   0.6158922 0.5  

It's worth noting that summary.rq() doesn't always use "rank" as it's default se method. According to the summary.rq() documentation:

If se = NULL (the default) and covariance = FALSE, and the sample size is
less than 1001, then the "rank" method is used, otherwise the "nid" method is
used.

tidy.rq() currently doesn't discern the sample size when applying its default.

dim(engel)
#> 235   2

#summary.rq() default -sample less than 1,001
coef(summary(object = rq_model))
#>             coefficients   lower bd   upper bd
#> (Intercept)   81.4822474 53.2591515 114.011557
#> income         0.5601806  0.4870223   0.601989

#repeat data to be over 1,001 threshold
library(dplyr)
engel_long <- engel %>% slice(rep(row_number(), 5))

# re-specify model
rq_model_long <- quantreg::rq(
  formula = foodexp ~ income,
  tau = 0.5,
  data = engel_long
)

#get tidy results - larger sample
broom::tidy(rq_model_long)
#>          term   estimate   conf.low   conf.high tau
#> 1 (Intercept) 81.4822474 46.2422596 128.0846740 0.5
#> 2      income  0.5601806  0.4590202   0.6070497 0.5

# get summary.rq() results - larger sample
coef(summary(rq_model_long))
#>                  Value Std. Error   t value Pr(>|t|)
#> (Intercept) 81.4822474 8.50455514  9.581012        0
#> income       0.5601806 0.01298399 43.143937        0

I can prep a PR to make tidy.rq() match the default standard error method of summary.rq() if you want that to be consistent, or update the documentation to note the difference.

@alexpghayes
Copy link
Collaborator

I think changing the default is to match up with summary.rq() is a good idea. Documenting that the statistic and p.value columns don't show up for rank = "se" would also be good.

@github-actions
Copy link

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Mar 11, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants