The tidy.rq() function currently doesn't output either the statistic (t-value) or the p.value for a quantile regression, although both are easy to obtain with summary.rq(). It would be nice to include them in the tidy output so that it is consistent with the output for other regression models. The se = "nid" part can be absorbed into ..., so anybody who wants a different method for computing standard errors can pass it through this argument.
# loading libraries and needed data
# specifying quantile regression model
rq_model <- quantreg::rq(
  formula = foodexp ~ income,
  tau = 0.5,
  data = engel
)
# getting tidy summary
broom::tidy(x = rq_model)
#> term estimate conf.low conf.high tau
#> 1 (Intercept) 81.4822474 47.0904023 135.1883939 0.5
#> 2 income 0.5601806 0.4803301 0.6127786 0.5
# getting summary with summary.rq
#> Value Std. Error t value Pr(>|t|)
#> (Intercept) 81.4822474 19.25066025 4.232699 3.322875e-05
#> income 0.5601806 0.02827721 19.810319 0.000000e+00
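For reference, the two requested columns follow directly from the estimate and its standard error. The snippet below is a minimal base R sketch, not broom's or quantreg's code: it copies the "nid" figures from the output above and assumes the p-value comes from a t distribution with n - p residual degrees of freedom (engel has 235 observations and the model has 2 coefficients). The object name `tidy_like` is made up for illustration.

```r
# Estimates and "nid" standard errors copied from the summary output above
estimate  <- c("(Intercept)" = 81.4822474, income = 0.5601806)
std_error <- c("(Intercept)" = 19.25066025, income = 0.02827721)

# Assumption: residual degrees of freedom = n - p
# (235 observations in engel, 2 coefficients in the model)
rdf <- 235 - 2

# t statistic and two-sided p-value from the t distribution
statistic <- estimate / std_error
p_value   <- 2 * pt(abs(statistic), df = rdf, lower.tail = FALSE)

# Hypothetical tidy-style table with the two extra columns
tidy_like <- data.frame(
  term      = names(estimate),
  estimate  = estimate,
  std.error = std_error,
  statistic = statistic,
  p.value   = p_value,
  row.names = NULL
)
tidy_like
```

The statistic column reproduces the t values printed by summary.rq() above; the income p-value prints as a very small positive number rather than exactly 0 because the upper tail is computed directly.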
You'll want to start by forking the broom repository, and then cloning your fork to your computer. Create a new branch called tidy-rq-p-values or something similar. If you are new to git, I recommend this guide by Jenny Bryan. Then you'll want to look for R/quantreg-rq-tidiers.R. Once you've found the tidy.rq() function, you'll need to modify it to include statistic and p.value columns.
After you've got everything working the way you want, you should document your work following tidyverse documentation guidelines.
The next step will be to write some tests to ensure that your functions are working correctly. You'll want to edit the tests in tests/testthat/test-quantreg-rq.R.
Then you'll want to update NEWS.md with a quick description of the changes you've made. You should also add yourself to DESCRIPTION as a contributor.
Once the tests are passing, the final step is to commit all your work to your fork, and then open a pull request to broom.
The issue here seems to be tidy.rq()'s use of se = "rank" as the default standard error method. summary.rq() doesn't provide t-statistics and p-values from this standard error method. tidy.rq() is already programmed to provide statistic and p.value columns for any other type of standard error, for example using se = "nid" below:
# specifying quantile regression model
rq_model <- quantreg::rq(
  formula = foodexp ~ income,
  tau = 0.5,
  data = engel
)
# get the tidy summary using "nid" standard errors
broom::tidy(rq_model, se = "nid")
#> term estimate std.error statistic p.value conf.low conf.high tau
#> 1 (Intercept) 81.4822474 19.25066025 4.232699 3.322875e-05 43.5546428 119.4098520 0.5
#> 2 income 0.5601806 0.02827721 19.810319 0.000000e+00 0.5044689 0.6158922 0.5
It's worth noting that summary.rq() doesn't always use "rank" as its default se method. According to the summary.rq() documentation:
If se = NULL (the default) and covariance = FALSE, and the sample size is
less than 1001, then the "rank" method is used, otherwise the "nid" method is
used.
tidy.rq() currently doesn't discern the sample size when applying its default.
#> 235 2
# summary.rq() default - sample less than 1,001
coef(summary(object = rq_model))
#> coefficients lower bd upper bd
#> (Intercept) 81.4822474 53.2591515 114.011557
#> income 0.5601806 0.4870223 0.601989
# repeat data to be over 1,001 threshold
engel_long <- engel %>% slice(rep(row_number(), 5))
# re-specify model
rq_model_long <- quantreg::rq(
  formula = foodexp ~ income,
  tau = 0.5,
  data = engel_long
)
# get tidy results - larger sample
broom::tidy(rq_model_long)
#> term estimate conf.low conf.high tau
#> 1 (Intercept) 81.4822474 46.2422596 128.0846740 0.5
#> 2 income 0.5601806 0.4590202 0.6070497 0.5
# get summary.rq() results - larger sample
coef(summary(object = rq_model_long))
#> Value Std. Error t value Pr(>|t|)
#> (Intercept) 81.4822474 8.50455514 9.581012 0
#> income 0.5601806 0.01298399 43.143937 0
I can prep a PR to make tidy.rq() match the default standard error method of summary.rq() if you want that to be consistent, or update the documentation to note the difference.
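To make the first option concrete, here is a minimal sketch of the selection rule quoted from the summary.rq() documentation. `default_rq_se()` is a hypothetical helper written for illustration; it is not part of broom or quantreg, and how tidy.rq() would obtain the sample size is left open.

```r
# Hypothetical helper (not in broom or quantreg): choose the standard error
# method the way the summary.rq() documentation describes its default.
default_rq_se <- function(n, se = NULL, covariance = FALSE) {
  # An explicitly requested method always wins
  if (!is.null(se)) {
    return(se)
  }
  # Documented default: "rank" for samples under 1001 when covariance = FALSE,
  # otherwise "nid"
  if (n < 1001 && !covariance) "rank" else "nid"
}

default_rq_se(235)                 # engel: 235 rows
default_rq_se(235 * 5)             # engel_long: 1175 rows
default_rq_se(235, se = "boot")    # user-supplied method is passed through
```

With this rule, the original engel data would keep the "rank" method (no statistic or p.value columns), while the five-fold engel_long data would switch to "nid" and gain them, which matches the two summary.rq() outputs shown above.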