goals for 0.0.7 #62

IndrajeetPatil · 2018-10-01T12:40:47Z

IndrajeetPatil · 2018-10-01T17:02:16Z

@ibecav I think, for now, we can stop tweaking the vignettes because the package size is now 3.3 MB, way below the CRAN limit of 5 MB.

I think the thing that really needs to be implemented asap is tests; only 14% of the code is covered by tests, which is far from ideal.

Do you mind working on adding more tests? I will work in parallel on adding a new function for dot plots.

ibecav · 2018-10-01T17:47:13Z

Let me finish purrr_examples and then I'll move to tests, okay?


IndrajeetPatil · 2018-10-01T17:54:44Z

Sounds good. Also, note that the purrr_examples vignette changed over the weekend.

ibecav · 2018-10-01T17:58:18Z

Got it. Other than the pull request, I'm synced back up with you.


IndrajeetPatil · 2018-10-05T15:32:37Z

@ibecav In light of the discussion about the lm_effsize_ci bug, I was thinking that it would be a good idea if you could give the vignette for ggcoefstats a read. It can really benefit from your statistical expertise, as I am not well-versed in all the nuances surrounding various kinds of regression models.

(Please don't remove any examples because, right now, this is the only way to check which regression models are supported by this function. We can downsize it only if the need arises.)

ibecav · 2018-10-05T15:54:34Z

Okay. Not sure when but I'll do that next.


ibecav · 2018-10-05T21:25:16Z

I took a look today and definitely see the need. Wow that's a lot of vignette! I'll see if I can add some value.

ibecav · 2018-10-05T21:54:17Z

And, while far more simplistic, I have been trying to make use of ggplot2 myself: https://cran.r-project.org/web/packages/CGPfunctions/vignettes/Using-Plot2WayANOVA.html


IndrajeetPatil · 2018-10-06T12:44:56Z

Yeah, I agree that that vignette is pretty big, but that was deliberate on my part. I just want the user to have the impression that any regression model they can think of is supported by this function!

Thanks for pointing me to CGPfunctions. Looks pretty cool!
I especially love the ggslopegraph function. I am definitely considering including slopegraphs in ggstatsplot to show categorical data since a lot of people don't want to use pie charts, for good reasons of course.

I've been thinking about adding support for more complex factorial designs (not just 2-way, but any kind of design the user can think of) in ggstatsplot by the 1.0.0 release, something along the lines of ggbetweenstats in terms of plot design. The closest thing I have seen is the new afex_plot() function, which is mighty general in terms of what kinds of factorial designs it can display and also satisfying with regard to the plot design:
https://cran.r-project.org/web/packages/afex/vignettes/afex_plot_introduction.html

ibecav · 2018-10-08T20:26:18Z

Thanks for the tip on afex. A little disappointed that it doesn't accept many of the standard model objects.

I have been going over the vignette for ggcoefstats. Some points for discussion before I start editing:

1. This is more or less a feature request. Would you consider supporting standardized coefficients? If the predictors differ by even one or two orders of magnitude, like wt and disp in mtcars, it throws the plot off.
2. Any problem if I significantly reorder things?
3. Long term, would you consider breaking the vignette up into smaller chunks if it made sense topically?

Thanks.

set.seed(123)
# your current example
ggstatsplot::ggcoefstats(x = lm(formula = mpg ~ cyl * am, data = mtcars))

# a different example; some would say a more classic regression
ggstatsplot::ggcoefstats(x = lm(formula = mpg ~ wt * disp, data = mtcars))

# Note R squared
summary(lm(formula = mpg ~ cyl * am, data = mtcars))
#> 
#> Call:
#> lm(formula = mpg ~ cyl * am, data = mtcars)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -6.5255 -1.2820 -0.0191  1.6301  5.9745 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  30.8735     3.1882   9.684 1.95e-10 ***
#> cyl          -1.9757     0.4485  -4.405 0.000141 ***
#> am           10.1754     4.3046   2.364 0.025258 *  
#> cyl:am       -1.3051     0.7070  -1.846 0.075507 .  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 2.939 on 28 degrees of freedom
#> Multiple R-squared:  0.7852, Adjusted R-squared:  0.7621 
#> F-statistic: 34.11 on 3 and 28 DF,  p-value: 1.73e-09
# Note R squared
summary(lm(formula = mpg ~ wt * disp, data = mtcars))
#> 
#> Call:
#> lm(formula = mpg ~ wt * disp, data = mtcars)
#> 
#> Residuals:
#>    Min     1Q Median     3Q    Max 
#> -3.267 -1.677 -0.836  1.351  5.017 
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) 44.081998   3.123063  14.115 2.96e-14 ***
#> wt          -6.495680   1.313383  -4.946 3.22e-05 ***
#> disp        -0.056358   0.013239  -4.257  0.00021 ***
#> wt:disp      0.011705   0.003255   3.596  0.00123 ** 
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 2.455 on 28 degrees of freedom
#> Multiple R-squared:  0.8501, Adjusted R-squared:  0.8341 
#> F-statistic: 52.95 on 3 and 28 DF,  p-value: 1.158e-11
# standardized coefficients make the x-axis scale less problematic
# and make results clearer when things are on different scales
lsr::standardCoefs(lm(formula = mpg ~ cyl * am, data = mtcars))
#>                b       beta
#> cyl    -1.975735 -0.5854553
#> am     10.175407  0.8424555
#> cyl:am -1.305116 -0.5871095
lsr::standardCoefs(lm(formula = mpg ~ wt * disp, data = mtcars))
#>                   b      beta
#> wt      -6.49567966 -1.054555
#> disp    -0.05635816 -1.158954
#> wt:disp  0.01170542  1.296692

Created on 2018-10-08 by the reprex package (v0.2.1)

IndrajeetPatil · 2018-10-08T20:47:09Z

Yes, this is something I would like to implement for sure. Maybe even for the 0.0.7 release.
(With this approach: http://www.stat.columbia.edu/~gelman/research/published/standardizing7.pdf)

This is already implemented in this function from the dotwhisker package, but I am wondering whether I should write it myself and avoid another dependency, or just import this function:
https://github.com/fsolt/dotwhisker/blob/master/R/by_2sd.R

I was kind of hoping that the users would do this themselves. For instance, for the example you provided:

ggstatsplot::ggcoefstats(x = lm(
  formula = scale(mpg) ~ scale(wt) * scale(disp),
  data = mtcars
))
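For reference, the 2 SD rescaling mentioned above can be sketched without any extra dependency. This is only an illustration in base R, assuming the simplest case of continuous predictors; `rescale_2sd` is a hypothetical helper name, not a function from dotwhisker or ggstatsplot, and it does not attempt to reproduce what `by_2sd` does.

```r
# Hypothetical helper (not from dotwhisker or ggstatsplot): centre a numeric
# vector and divide it by two standard deviations
rescale_2sd <- function(x) {
  (x - mean(x, na.rm = TRUE)) / (2 * stats::sd(x, na.rm = TRUE))
}

# Rescale only the continuous predictors; the outcome keeps its original units
mtcars_2sd <- transform(
  mtcars,
  wt = rescale_2sd(wt),
  disp = rescale_2sd(disp)
)

# Same model as in the example above, now with predictors on a comparable scale
fit_2sd <- lm(formula = mpg ~ wt * disp, data = mtcars_2sd)
print(coef(fit_2sd))
```

After this rescaling each predictor has mean 0 and standard deviation 0.5, so the coefficients for wt and disp are on comparable scales even though the raw variables differ by orders of magnitude.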

No problem if you reorder, but please don't remove any of the existing models.
I kind of like the current "one vignette per function" approach. If you are worried about the increasing size of the vignette and the toll it's going to take on the speed of CRAN checks, we can just move this vignette to a website article and point to it in the additional vignette.

I've already started doing this:
https://indrajeetpatil.github.io/ggstatsplot/articles/additional.html

ibecav · 2018-10-08T21:29:40Z

I won't remove anything yet, just reorder. I'm interested in breaking things down more by functionality: b hat is different in concept from omega or eta. By the way, in your label boxes you're actually displaying b hat, not $\beta$ (sorry, I haven't mastered Greek letters in GitHub yet). I've never seen some of the methods Gelman mentions employed in real life; I mainly see `lm.beta` or Dani's `lsr::standardCoefs`, which produce the same results.


IndrajeetPatil · 2018-10-08T21:51:21Z

That’s true, but I kind of like using Greek beta to represent both standardized and unstandardized coefficients. In the vignette examples, I keep switching between scaled and unscaled variables and change the x label to reflect what the beta stands for. So I hope that people appreciate this aspect of the plot.


ibecav · 2018-10-08T21:57:12Z

Beta versus b hat has nothing to do with standardizing; it's more about population versus sample.

https://stats.stackexchange.com/questions/210543/what-is-the-difference-between-beta-1-and-hat-beta-1

IndrajeetPatil · 2018-10-09T00:32:00Z

Ah, I see. Sorry, I got confused between the two different issues you had raised (“B (or b) versus beta” and “beta hat versus beta”). Do you think this is misleading? I highly doubt that somebody looking at this plot would infer that the beta represents a true value and not an estimate. The same holds true for Cohen's d and Hedges' g, which are currently shown without hats although they are estimates. Maybe this is something that can be clarified in the documentation for the function in question?


ibecav · 2018-10-09T12:45:37Z

No, I don't think any practitioner will be confused. I just teach, and tend to be very careful until it sinks into their heads. Today will be slow (meetings) but I'll keep slogging along.

And to your earlier comment about having users simply use scale() as part of the formula passed to lm(): that does not produce the classic standardized coefficients I'm used to in my discipline, but, as Gelman notes, there are actually multiple ways you can standardize. I'm used to this one:

lm(formula = mpg ~ wt * disp, data = mtcars)
#> 
#> Call:
#> lm(formula = mpg ~ wt * disp, data = mtcars)
#> 
#> Coefficients:
#> (Intercept)           wt         disp      wt:disp  
#>    44.08200     -6.49568     -0.05636      0.01171
lm(formula = scale(mpg) ~ scale(wt) * scale(disp), data = mtcars)
#> 
#> Call:
#> lm(formula = scale(mpg) ~ scale(wt) * scale(disp), data = mtcars)
#> 
#> Coefficients:
#>           (Intercept)              scale(wt)            scale(disp)  
#>               -0.2026                -0.6161                -0.3845  
#> scale(wt):scale(disp)  
#>                0.2355
lm.beta::lm.beta(lm(formula = mpg ~ wt * disp, data = mtcars))
#> 
#> Call:
#> lm(formula = mpg ~ wt * disp, data = mtcars)
#> 
#> Standardized Coefficients::
#> (Intercept)          wt        disp     wt:disp 
#>    0.000000   -1.054555   -1.158954    1.296692
lsr::standardCoefs(lm(formula = mpg ~ wt * disp, data = mtcars))
#>                   b      beta
#> wt      -6.49567966 -1.054555
#> disp    -0.05635816 -1.158954
#> wt:disp  0.01170542  1.296692

Created on 2018-10-09 by the reprex package (v0.2.1)
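The beta column reported by `lm.beta` and `lsr::standardCoefs` above can also be computed by hand, which may make the difference from the `scale()` formula easier to see. A minimal sketch in base R, assuming the classic definition b * sd(x) / sd(y) applied to each column of the model matrix (so the interaction column is the raw product of wt and disp); the object names are only illustrative.

```r
# Fit the unstandardized model from the example above
fit <- lm(formula = mpg ~ wt * disp, data = mtcars)

# Predictor columns as the model sees them, without the intercept;
# the "wt:disp" column is the raw product wt * disp
X <- model.matrix(fit)[, -1, drop = FALSE]

# Outcome values used in the fit
y <- model.response(model.frame(fit))

# Classic standardized coefficient for each column: b * sd(x) / sd(y)
beta <- coef(fit)[-1] * apply(X, 2, sd) / sd(y)
print(round(beta, 6))
```

With `scale()` inside the formula, the interaction term is instead the product of two already standardized predictors. That product is not itself standardized, and the main effects are then evaluated at the mean of the other predictor, so the resulting coefficients answer a somewhat different question.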

IndrajeetPatil · 2018-11-06T17:50:47Z

@ibecav Thinking of doing a minor release by the end of this month to make some important bug fixes available to the users. Will you have time to add a few more tests, especially for the subtitle making functions that are at the heart of everything (https://indrajeetpatil.github.io/ggstatsplot/reference/index.html#section-helper-functions-statistics-subtitles-)? Their code is currently not covered by any tests.

It will be pretty straightforward, just like with the tests you had written for the effect size functions and shouldn't take a lot of time.

I've been working on a new function to display post-hoc comparisons for ggbetweenstats and that will keep me occupied until the next release.

ibecav · 2018-11-07T18:41:16Z

I'll try and get at least some done before Thanksgiving.

ibecav · 2018-11-13T14:45:04Z

I did an initial pull request for one test. Take a look and then I can work on the others.

IndrajeetPatil · 2018-11-14T01:48:21Z

Just had a closer look at all the testthat files you added and modified a few things.

For future reference, can you please structure all tests so that they follow the points listed below:

Try not to have any lints in the code. So, for example, no line should exceed the 90-character limit; better to have each argument on a separate line. The lintr bot will catch it, but better to do it on our side anyway.
Add a comment for each test and not for a full block of tests.
If you are copy-pasting tests from one file to another, make sure to change the comments (e.g., you were talking about checking the Bayes factor in a file containing tests for robust ANOVAs and Pearson's chi-squared test).
Always have a test for sample size n. This is going to be crucial when NAs are present in the dataset.
All subtitle-making functions are exported, so there is no need for a ::: call.

This will also make them easier to debug or change in case the functions being tested themselves change.
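A test following these points might look roughly like the sketch below. It assumes testthat is installed; `count_complete_cases` is a hypothetical stand-in defined here for illustration, not one of the ggstatsplot subtitle functions, so that the example runs on its own.

```r
library(testthat)

# Hypothetical helper (not part of ggstatsplot): number of rows with no
# missing values in the two columns of interest
count_complete_cases <- function(data, x, y) {
  sum(stats::complete.cases(data[, c(x, y)]))
}

test_that("sample size n excludes rows with NAs", {
  # airquality has 153 rows, and Ozone contains missing values
  n <- count_complete_cases(
    data = airquality,
    x = "Ozone",
    y = "Temp"
  )

  # checking that n is smaller than the number of rows
  expect_true(n < nrow(airquality))

  # checking the exact sample size after dropping NAs
  expect_equal(n, 116L)
})
```

Each expectation has its own comment, each argument sits on its own line, and the sample size is checked against a dataset that actually contains NAs.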

We are making good headway towards increasing the code coverage!
https://indrajeetpatil.github.io/ggstatsplot/articles/tests_and_coverage.html

ibecav · 2018-11-14T01:55:33Z

I’ll try and remember these. The export must be recent, but I’ll try and oblige. The good news is that it’s typically better safe than sorry.


IndrajeetPatil · 2018-11-20T16:15:49Z

@ibecav I have added some new tests for the ggbetweenstats helpers, which brought to my attention that all the new helpers I had added for pairwise comparisons have 0% code coverage! :(
https://github.com/IndrajeetPatil/ggstatsplot/blob/master/R/helpers_pairwise_comparison.R

If you get time, can you add tests for these as well? I am working on adding tests for ggcoefstats, which is going to take a lot of tests to have complete code coverage.

I am planning to submit 0.0.7 to CRAN after Thanksgiving, most probably on the 28th of November. So, if you add more tests by then, they will be part of this release. If not, they will go into 0.0.8.

IndrajeetPatil added this to the 0.0.7 milestone Nov 10, 2018

IndrajeetPatil closed this as completed Dec 7, 2018
