goals for 0.0.7 #62
Comments
@ibecav I think, for now, we can stop tweaking the vignettes because the package size is now 3.3 MB, way below the CRAN limit of 5 MB. I think the thing that really needs to be implemented asap is tests; only 14% of the code is covered by tests, which is far from ideal. Do you mind working on adding more tests? I will work in parallel on adding a new function for dot plots.
Let me finish purrr_examples and then I'll move to tests, okay?
Sounds good. Also, note that the purrr_examples vignette has also changed over the weekend.
Got it. Other than the pull request, I'm synced back up with you.
@ibecav In light of the discussion about the lm_effsize_ci bug, I was thinking that it'll be a good idea if you can give the vignette for ggcoefstats a read. It can really benefit from your statistical expertise as I am not really well-versed in all the nuances surrounding various kinds of regression models. (Please don't remove any examples because, right now, this is the only way to check which regression models are supported by this function. We can downsize it only if the need arises.)
Okay. Not sure when, but I'll do that next.
I took a look today and definitely see the need. Wow, that's a lot of vignette! I'll see if I can add some value.
And while far more simplistic, I have been trying to make use of ggplot2 myself:
https://cran.r-project.org/web/packages/CGPfunctions/vignettes/Using-Plot2WayANOVA.html
Yeah, I agree that that vignette is pretty big, but that was deliberate on my part. I just want the user to have the impression that any regression model they can think of is supported by this function! Thanks for pointing me to I've been thinking about adding support for more complex factorial designs (not just 2-way, but any kind of design the user can think of) in |
Thanks for the tip on afex. A little disappointed that it doesn't accept many of the standard model objects. I have been going over the vignette for ggcoefstats. Some points for discussion before I start editing:
Thanks.
set.seed(123)
# your current example
ggstatsplot::ggcoefstats(x = lm(formula = mpg ~ cyl * am, data = mtcars))
# a different example, some would say more classic regression
ggstatsplot::ggcoefstats(x = lm(formula = mpg ~ wt * disp, data = mtcars))
# Note R squared
summary(lm(formula = mpg ~ cyl * am, data = mtcars))
#>
#> Call:
#> lm(formula = mpg ~ cyl * am, data = mtcars)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -6.5255 -1.2820 -0.0191 1.6301 5.9745
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 30.8735 3.1882 9.684 1.95e-10 ***
#> cyl -1.9757 0.4485 -4.405 0.000141 ***
#> am 10.1754 4.3046 2.364 0.025258 *
#> cyl:am -1.3051 0.7070 -1.846 0.075507 .
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 2.939 on 28 degrees of freedom
#> Multiple R-squared: 0.7852, Adjusted R-squared: 0.7621
#> F-statistic: 34.11 on 3 and 28 DF, p-value: 1.73e-09
# Note R squared
summary(lm(formula = mpg ~ wt * disp, data = mtcars))
#>
#> Call:
#> lm(formula = mpg ~ wt * disp, data = mtcars)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -3.267 -1.677 -0.836 1.351 5.017
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 44.081998 3.123063 14.115 2.96e-14 ***
#> wt -6.495680 1.313383 -4.946 3.22e-05 ***
#> disp -0.056358 0.013239 -4.257 0.00021 ***
#> wt:disp 0.011705 0.003255 3.596 0.00123 **
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 2.455 on 28 degrees of freedom
#> Multiple R-squared: 0.8501, Adjusted R-squared: 0.8341
#> F-statistic: 52.95 on 3 and 28 DF, p-value: 1.158e-11
# standardized coefficients mean the x axis scale is less problematic
# and make results clearer when things are on different scales
lsr::standardCoefs(lm(formula = mpg ~ cyl * am, data = mtcars))
#> b beta
#> cyl -1.975735 -0.5854553
#> am 10.175407 0.8424555
#> cyl:am -1.305116 -0.5871095
lsr::standardCoefs(lm(formula = mpg ~ wt * disp, data = mtcars))
#> b beta
#> wt -6.49567966 -1.054555
#> disp -0.05635816 -1.158954
#> wt:disp 0.01170542 1.296692
Created on 2018-10-08 by the reprex package (v0.2.1)
1. Yes, this is something I would like to implement for sure. Maybe even for the 0.0.7 release. (With this approach: http://www.stat.columbia.edu/~gelman/research/published/standardizing7.pdf)
This is already implemented in this function from the dotwhisker package, but I am wondering if I should write it myself and avoid another dependency or just import this function: https://github.com/fsolt/dotwhisker/blob/master/R/by_2sd.R
I was kind of hoping that the users would do this themselves. For instance, for the example you provided:
ggstatsplot::ggcoefstats(x = lm(
  formula = scale(mpg) ~ scale(wt) * scale(disp),
  data = mtcars
))
2. No problem if you reorder, but please don't remove any of the existing models.
3. I kind of like the current "one vignette per function" approach. If you are worried about the increasing size of the vignette and the toll it's going to take on the speed of CRAN checks, we can just move this vignette to a website article and point to it in the additional vignette. I've already started doing this: https://indrajeetpatil.github.io/ggstatsplot/articles/additional.html
I won't remove anything yet. Just reorder.
Interested in breaking it down more by functionality; b hat is different in concept than omega or eta.
By the way, for your label boxes you're actually displaying b hat, not $$\Beta (sorry, haven't mastered Greek letters in GitHub yet).
I've never seen some of the methods Gelman mentioned employed in real life; I have mainly seen `lm.beta` or Dani's `lsr::standardCoefs`, which produce the same results.
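The two-standard-deviation rescaling from the Gelman paper linked above can be sketched roughly as follows. This is only an illustration under stated assumptions: `rescale_2sd` is a hypothetical helper written for this example, not the `by_2sd` function from dotwhisker and not anything in ggstatsplot, and the choice of model is arbitrary.

```r
# Hypothetical helper (not from ggstatsplot or dotwhisker):
# center a numeric predictor and divide it by two standard deviations
rescale_2sd <- function(x) {
  (x - mean(x, na.rm = TRUE)) / (2 * stats::sd(x, na.rm = TRUE))
}

dat <- mtcars

# rescale only the continuous predictors
dat$wt_z2 <- rescale_2sd(dat$wt)
dat$disp_z2 <- rescale_2sd(dat$disp)

# the binary predictor `am` is left on its original 0/1 scale
fit <- stats::lm(
  formula = mpg ~ wt_z2 * disp_z2 + am,
  data = dat
)

stats::coef(fit)
```

After this rescaling each continuous predictor has a standard deviation of 0.5, which is what makes its coefficient roughly comparable to that of an untransformed binary predictor.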
That’s true, but I kind of like using the Greek beta to represent both
standardized and unstandardized coefficients.
In the vignette examples, I keep switching between scaled and unscaled
variables and change the x label to reflect what the beta stands for. So I
hope that people appreciate this aspect of the plot.
beta versus b hat has nothing to do with standardizing; it is more about population versus sample: https://stats.stackexchange.com/questions/210543/what-is-the-difference-between-beta-1-and-hat-beta-1
Ah, I see.
Sorry, got confused between the two different issues you had raised (“B
(or b) versus beta” and “beta hat versus beta”).
Do you think this is misleading? I highly doubt that somebody looking at
this plot would infer that the beta represents a true value and not an
estimate. The same holds true for Cohen's d and Hedges' g, which are
currently shown without hats although they are estimates. Maybe this is
something that can be clarified in the documentation for the function in
question?
No, I don't think any practitioner will be confused. I just teach and tend to be very careful until it sinks into their heads. Today will be slow (meetings) but I'll keep slogging along. And to your earlier comment about having users simply use
lm(formula = mpg ~ wt * disp, data = mtcars)
#>
#> Call:
#> lm(formula = mpg ~ wt * disp, data = mtcars)
#>
#> Coefficients:
#> (Intercept) wt disp wt:disp
#> 44.08200 -6.49568 -0.05636 0.01171
lm(formula = scale(mpg) ~ scale(wt) * scale(disp), data = mtcars)
#>
#> Call:
#> lm(formula = scale(mpg) ~ scale(wt) * scale(disp), data = mtcars)
#>
#> Coefficients:
#> (Intercept) scale(wt) scale(disp)
#> -0.2026 -0.6161 -0.3845
#> scale(wt):scale(disp)
#> 0.2355
lm.beta::lm.beta(lm(formula = mpg ~ wt * disp, data = mtcars))
#>
#> Call:
#> lm(formula = mpg ~ wt * disp, data = mtcars)
#>
#> Standardized Coefficients::
#> (Intercept) wt disp wt:disp
#> 0.000000 -1.054555 -1.158954 1.296692
lsr::standardCoefs(lm(formula = mpg ~ wt * disp, data = mtcars))
#> b beta
#> wt -6.49567966 -1.054555
#> disp -0.05635816 -1.158954
#> wt:disp 0.01170542 1.296692
Created on 2018-10-09 by the reprex package (v0.2.1)
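One way to see why the two sets of numbers above differ is to compute both by hand in base R. The sketch below assumes that the post-hoc standardization is b * sd(x) / sd(y) applied to each column of the model matrix, including the product column `wt:disp`; that assumption is consistent with the output shown above, but it is not taken from the source of either package.

```r
fit <- stats::lm(formula = mpg ~ wt * disp, data = mtcars)

# model matrix columns without the intercept: wt, disp, and the product wt * disp
X <- stats::model.matrix(fit)[, -1, drop = FALSE]
b <- stats::coef(fit)[-1]

# post-hoc standardization: the product column is treated as a predictor
# in its own right and scaled by its own standard deviation
beta_posthoc <- b * apply(X, 2, stats::sd) / stats::sd(mtcars$mpg)

# scaling inside the formula: the interaction is the product of the two
# z-scored predictors, which is a different variable
fit_scaled <- stats::lm(
  formula = scale(mpg) ~ scale(wt) * scale(disp),
  data = mtcars
)
beta_scaled <- stats::coef(fit_scaled)[-1]

round(cbind(beta_posthoc, beta_scaled), 4)
```

The first column should line up with the `beta` column above and the second with the `scale()` fit, so the gap between them comes from how the interaction term is standardized rather than from a bug in either approach.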
@ibecav Thinking of doing a minor release by the end of this month to make some important bug fixes available to the users. Will you have time to add a few more tests, especially for the subtitle making functions that are at the heart of everything (https://indrajeetpatil.github.io/ggstatsplot/reference/index.html#section-helper-functions-statistics-subtitles-)? Their code is currently not covered by any tests. It will be pretty straightforward, just like with the tests you had written for the effect size functions and shouldn't take a lot of time. I've been working on a new function to display post-hoc comparisons for |
I'll try and get at least some done before Thanksgiving. |
I did an initial pull request for one test. Take a look and then I can work the others. |
Just had a closer look at all the testthat files you added and modified a few things. For future reference, can you please structure all tests so that they follow the points listed below:
- Try not to have any lints in the code. So, for example, no line should go above the 90-character limit; better to have each argument on a separate line. The lintr bot will catch it, but better to do it on our side anyway.
- Add a comment for each test and not for a full block of tests.
- If you are copy-pasting tests from one file to another, make sure to change the comments (e.g., you were talking about checking bayes factor in a file containing tests for robust anovas and Pearson's chi-squared test).
- Always have a test for sample size n. This is going to be crucial when NAs are present in the dataset.
- All subtitle making functions are exported, so no need to have a ::: call.
This will also make them easier to debug or change in case the functions being tested themselves change. We are making good headway towards increasing the code coverage!
https://indrajeetpatil.github.io/ggstatsplot/articles/tests_and_coverage.html
I’ll try and remember these.
The export must be recent, but I’ll try and oblige. Good news is it’s typically better safe than sorry.
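A test written along the lines of the points above might look roughly like this. It is only a sketch: `subtitle_n` is a hypothetical stand-in defined here for illustration, not one of the ggstatsplot subtitle functions, and the data frame is made up.

```r
library(testthat)

# hypothetical stand-in for a subtitle helper: returns the sample size
# after dropping rows with NA in the two columns of interest only
subtitle_n <- function(data, x, y) {
  keep <- stats::complete.cases(data[, c(x, y), drop = FALSE])
  sum(keep)
}

test_that(
  desc = "sample size is correct when NAs are present",
  code = {
    df <- data.frame(
      x = c(1, 2, NA, 4),
      y = c(2, 4, 6, 8),
      z = c(NA, NA, 1, 1)
    )

    n_obs <- subtitle_n(
      data = df,
      x = "x",
      y = "y"
    )

    # checking sample size n (NAs in `z` should not matter)
    expect_equal(
      object = n_obs,
      expected = 3L
    )
  }
)
```

Each argument sits on its own line to stay under the line-length limit, the expectation has its own comment, and the helper is called without `:::`, as it would be for an exported function.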
@ibecav I have added some new tests in for If you get time, can you add tests for these as well? I am working on adding tests for I am planning to submit |
(Goal for release date: last week of December)
To do:
- groupedstats as dependencies and import shared functions from there
- rlang rather than using short-cuts
- stats::na.omit(). Take a more fine-grained approach to remove NAs only from columns of interest.
- results.subtitle argument to all functions
- ggcoefstats to work with dataframe arguments
- ggcoefstats (like in Bayesian inference plots: e.g., https://twitter.com/tjmahr/status/1048226472710873089)
- 50% (currently at 14%: https://github.com/IndrajeetPatil/ggstatsplot/tree/master/tests)
- theme_ggstatsplot function; give user arguments option to change all aspects of the theme?
- gramr package
- k = 2 for all functions to follow APA guidelines
- ggscatterstats, ggpiestats, and ggbetweenstats (anova designs)
- ggpiestats labels can overlap; give the option to have the labels to be either "internal" (current default) or "external" to the slices
- group option for ggscatterstats to support grouped marginals (https://github.com/daattali/ggExtra/blob/master/inst/vignette_files/ggExtra_files/figure-markdown_strict/ggmarginal-grouping-1.png)
- ggplot.function argument to grouped_ variants to make modifications with ggplot2 functions to customize the plot
- ggbetweenstats
- ggdotplotstats for dot plots/charts
- 95% CI to have 95% as a subscript
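The to-do item about removing NAs only from the columns of interest could be illustrated with base R along these lines. This is a sketch on the built-in airquality data, with the column choice made up for the example; it is not how ggstatsplot handles missing values internally.

```r
# columns that a hypothetical analysis actually uses
cols_of_interest <- c("Ozone", "Temp")

# coarse approach: drops every row that has an NA in any column
coarse <- stats::na.omit(airquality)

# fine-grained approach: drops rows only if a column of interest is NA
keep <- stats::complete.cases(airquality[, cols_of_interest])
fine <- airquality[keep, ]

# compare how many observations survive each approach
c(all_columns = nrow(coarse), columns_of_interest = nrow(fine))
```

The fine-grained version can only keep the same number of rows or more, which is why the sample size reported in a subtitle depends on which approach is used.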