
Update #88

Merged: 64 commits merged into master on May 9, 2019

Conversation

DominiqueMakowski
Member

Added a new version of p_direction based on AUC. My idea is that for posteriors with very few samples, estimating the density function and deriving the pd from that could be more sensitive (and accurate?) than taking the ratio of samples.

I will add these two versions to the comparison to see how they perform. What do you think?
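For illustration, here is a rough sketch of the two ideas being compared. This is not the package's implementation; the toy posterior and the crude Riemann sum are only there to make the contrast concrete.

# Sketch only, not bayestestR's actual implementation
set.seed(123)
posterior <- rnorm(80, mean = 0.4, sd = 1)   # a posterior with few samples

# "Raw" pd: proportion of samples on the dominant side of zero
pd_raw <- max(mean(posterior > 0), mean(posterior < 0))

# Density/AUC-based pd: estimate the density, then compare the area under
# the curve on each side of zero
d <- density(posterior)
auc_pos <- sum(diff(d$x[d$x > 0]) * head(d$y[d$x > 0], -1))
auc_neg <- sum(diff(d$x[d$x < 0]) * head(d$y[d$x < 0], -1))
pd_auc <- max(auc_pos, auc_neg) / (auc_pos + auc_neg)

c(raw = pd_raw, auc = pd_auc)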

@codecov-io

Codecov Report

Merging #88 into master will decrease coverage by 0.83%.
The diff coverage is 39.28%.


@@            Coverage Diff            @@
##           master     #88      +/-   ##
=========================================
- Coverage   62.34%   61.5%   -0.84%     
=========================================
  Files          29      30       +1     
  Lines        1041    1060      +19     
=========================================
+ Hits          649     652       +3     
- Misses        392     408      +16
Impacted Files Coverage Δ
R/utils_area_under_curve.R 0% <0%> (ø)
R/p_direction.R 78% <61.11%> (-9.81%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@codecov-io

codecov-io commented May 4, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@952e464). Click here to learn what that means.
The diff coverage is 26.61%.


@@            Coverage Diff            @@
##             master      #88   +/-   ##
=========================================
  Coverage          ?   62.94%           
=========================================
  Files             ?       28           
  Lines             ?     1101           
  Branches          ?        0           
=========================================
  Hits              ?      693           
  Misses            ?      408           
  Partials          ?        0
Impacted Files Coverage Δ
R/bayesfactor_inclusion.R 0% <0%> (ø)
R/print.bayesfactor_models.R 0% <0%> (ø)
R/utils_area_under_curve.R 0% <0%> (ø)
R/bayesfactor_savagedickey.R 51.21% <0%> (ø)
R/utils_distribution.R 21.05% <21.05%> (ø)
R/bayesfactor_models.R 31.73% <26.31%> (ø)
R/utils_density.R 29.16% <29.16%> (ø)
R/p_direction.R 78.43% <63.15%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@strengejacke
Member

What about making both available? Does that make sense? I must take a closer look at the statistics

@DominiqueMakowski
Member Author

Yes, they are currently toggled by the raw=FALSE argument. However, I wonder if there is a better name, for example auc=TRUE.

@strengejacke
Member

Perhaps a method argument? With options auc and another one.

@DominiqueMakowski
Member Author

True, method is more consistent with our other functions. method="auc" and method="raw" (or "simple")?
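Just to make the idea concrete, something along these lines (a purely hypothetical sketch; the argument and option names are not final):

# Hypothetical sketch only, names are not the final API
p_direction_sketch <- function(posterior, method = c("raw", "auc")) {
  method <- match.arg(method)
  if (method == "raw") {
    # simple ratio of samples on the dominant side of zero
    max(mean(posterior > 0), mean(posterior < 0))
  } else {
    # density-based estimate, as in the AUC sketch above
    d <- density(posterior)
    pos <- sum(diff(d$x[d$x > 0]) * head(d$y[d$x > 0], -1))
    neg <- sum(diff(d$x[d$x < 0]) * head(d$y[d$x < 0], -1))
    max(pos, neg) / (pos + neg)
  }
}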

@strengejacke
Member

We did not have the cauchy, poisson and t-functions on CRAN, so no need to deprecate.

@DominiqueMakowski
Member Author

I am checking the vignette. I've simplified some code a little, especially in plots: although a bit uglier, I think it's worth it in such tutorials for beginners to keep the code as clear as possible, so they can really get what is going on at each step ☺️

I've mostly commented out the chunks, but if you agree we can remove them ;)

@DominiqueMakowski
Member Author

BTW there's an issue in the update method:

Error in update.bayesfactor_models(comparison, reference = 3) : 
  object 'x' not found

When subset is NULL, no x is generated.
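To illustrate the failure mode (a hypothetical simplification; the real update.bayesfactor_models() body differs): if x is only created inside the if (!is.null(subset)) branch, any later use of x errors when subset is NULL.

# Hypothetical simplification of the bug, not the actual function body
update_sketch <- function(object, subset = NULL, reference = NULL) {
  if (!is.null(subset)) {
    x <- object[subset]
  }
  x   # errors with "object 'x' not found" when subset is NULL
}

# A guard like this avoids it:
update_sketch_fixed <- function(object, subset = NULL, reference = NULL) {
  x <- if (is.null(subset)) object else object[subset]
  x
}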

@DominiqueMakowski
Member Author

I also have issues with the BIC approx:

> bayesfactor_models(m1, m2, m3, m4, denominator = 1)
Error in mBIC - mBIC[denominator] : 
  non-numeric argument to binary operator
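One way this kind of error can arise (a hypothetical guess, not necessarily the cause here): if the BIC values are collected into a list rather than a numeric vector, the subtraction fails.

# Hypothetical illustration only, may not be the actual cause
m_a <- lm(Sepal.Length ~ 1, data = iris)
m_b <- lm(Sepal.Length ~ Species, data = iris)

mBIC <- lapply(list(m_a, m_b), BIC)              # a list ...
# mBIC - mBIC[1]                                 # ... errors: non-numeric argument to binary operator

mBIC <- vapply(list(m_a, m_b), BIC, numeric(1))  # a numeric vector works
mBIC - mBIC[1]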

@DominiqueMakowski
Member Author

There are some issues related to non-updated code that make some tests fail and prevent me from completing the vignette.

Nevertheless, it's great! A very good overview of the purpose and capabilities of these features.

@DominiqueMakowski
Member Author

The fact that it doesn't work on all R versions is quite problematic though, and it's unfortunately very hard to test and assess 😕

@mattansb
Member

mattansb commented May 6, 2019

This whole thing is super weird, because my original BFEffects::inclusionBF used the exact same code, and that has been running since June 2018 at least, so on R 3.5.0, if not 3.4.4...

@DominiqueMakowski
Member Author

Even worse, I recall that it ran without any problems at the beginning...

@mattansb
Member

mattansb commented May 6, 2019

Might have been something local to your setup? And any new installation of R would have worked?
Want to try to reinstall the previous version of R you had, and see if it works to test this hypothesis?

@DominiqueMakowski
Member Author

I'll try tomorrow at my uni's pc which has a different version, we'll see :)

@strengejacke
Member

Have you guys already discovered the debugging-feature in RStudio, or are you continuously using reprexes to debug your code? :-P

@DominiqueMakowski
Member Author

@strengejacke Do you know that I've been trying to understand debugging for R and Python for like 4 years... And yet I still print(1), print(2) and print(3) 😅

You mean using pointbreaks and stuff?

@strengejacke
Member

pointbreaks

Probably, it's called breakpoints... Not surprising it already took 4 years! :-D
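For the record, base R also ships debugging helpers that work without RStudio's breakpoints:

# Base R debugging helpers (no RStudio required)
f <- function(x) {
  browser()    # pauses execution here; inspect objects, then type 'c' to continue
  x + 1
}
# debug(f)     # makes every call to f() start in the debugger
# undebug(f)   # turns it off again
# traceback()  # after an error, prints the call stack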

@DominiqueMakowski
Member Author

sometimes like Yoda I speak 😁

@strengejacke
Member

The problem is inside the sub-function make_terms():

    if (length(random_parts) == 0) {
      return(stats::setNames(fix_trms, fix_trms))
    }

We have:

as.data.frame(BFmodels)
#>                    Model    log.BF
#> 1                Species  67.30315
#> 2 Species + Petal.Length 128.40744
#> 3 Species * Petal.Length 125.12953
#> 4                      1   0.00000

Rows are reordered, so make_terms() is called with "1" in the first instance:

  for (m in seq_len(nrow(df.model))) {
    tmp_terms <- make_terms(df.model$Modelnames[m])
    df.model[m, tmp_terms] <- TRUE
  }

Since the "1" has no term labels when coerced to fomula., all.terms is empty, and so is fix_trms.

make_term <- function(formula) {
    formula.f <- stats::as.formula(paste0('~', formula))
    all.terms <- attr(stats::terms(formula.f), "term.labels")

    fix_trms <- all.terms[!grepl("\\|", all.terms)] # no random

    random_parts <- paste0(all.terms[grepl("\\|", all.terms)]) # only random
    if (length(random_parts) == 0) {
      return(stats::setNames(fix_trms, fix_trms))
    }
...

In the next step of the for-loop, you assign TRUE to a column that doesn't exist, because tmp_terms is an empty character vector.

df.model[m, tmp_terms] <- TRUE

This works on R 3.6, but not on R < 3.6 (just tested here).

But: I don't understand the whole code. This loop

  for (m in seq_len(nrow(df.model))) {
    tmp_terms <- make_terms(df.model$Modelnames[m])
    df.model[m, tmp_terms] <- TRUE
  }

generates column names from terms, but these columns do not exist in the bayesfactor_models-object (the above data frame).

What is the purpose of that loop and the make_terms function? I think we can easily replace the code in make_terms() with much shorter code using some insight function. And then it needs to be checked what the for-loop is actually supposed to do, because those columns never exist, so .get_model_table() always ends up returning an empty or invalid data frame.

Also, df.model[is.na(df.model)] <- FALSE is probably inaccurate.
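A minimal sketch of one possible guard, reusing the loop quoted above (the fix actually pushed may look different):

# One possible guard; the actual fix in the package may differ
for (m in seq_len(nrow(df.model))) {
  tmp_terms <- make_terms(df.model$Modelnames[m])
  if (length(tmp_terms) > 0) {   # skip intercept-only models ("1")
    df.model[m, tmp_terms] <- TRUE
  }
}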

@mattansb
Member

mattansb commented May 6, 2019

Wow, good eye, @strengejacke! (Stupid intercept-only models...) I just pushed a fix for this (note that it works fine for me on R 3.5.3, and it did even before that).


.get_model_table returns the following table, which has all the information needed to compute inclusion BFs.

library(bayestestR)
mo0 <- lm(Sepal.Length ~ 1, data = iris)
mo1 <- lm(Sepal.Length ~ Species, data = iris)
mo2 <- lm(Sepal.Length ~ Species + Petal.Length, data = iris)
mo3 <- lm(Sepal.Length ~ Species * Petal.Length, data = iris)

BFmodels <- bayesfactor_models(mo1, mo2, mo3, denominator = mo0)

bayestestR:::.get_model_table(BFmodels, priorOdds = NULL)
#>               Modelnames priorProbs    postProbs Species Petal.Length Species:Petal.Length
#> 1                      1       0.25 1.649232e-56   FALSE        FALSE                FALSE
#> 2                Species       0.25 2.796852e-27    TRUE        FALSE                FALSE
#> 3 Species + Petal.Length       0.25 9.636634e-01    TRUE         TRUE                FALSE
#> 4 Species * Petal.Length       0.25 3.633659e-02    TRUE         TRUE                 TRUE

Created on 2019-05-06 by the reprex package (v0.2.1)

What the internal make_term does is break down a model formula (kind of) into its terms, including random effects:

make_terms("1")
#> character(0)
make_terms("0")
#> character(0)
make_terms("A")
#> "A" 
make_terms("A + (1|ID)")
#>    "A" "1:ID" 
make_terms("A + (A|ID)")
#>    "A" "1:ID" "A:ID" 
make_terms("A + (0 + A|ID)")
#>    "A" "A:ID"

This is how the df above is built.

@strengejacke
Member

strengejacke commented May 6, 2019

What the internal make_term does is it breaks down a model formula (kind of) into its terms, including random effects:

Ah, I see! And R < 3.6 fails to create the new variables when there is an empty character vector.

And indeed, interactions like "A:ID" are not returned by any of the current insight functions.

@DominiqueMakowski
Member Author

The current version passes all checks and tests at my place ☺️

Just need to please lord travis now:

  ══ testthat results  ═══════════════════════════════════════════════════════════
  OK: 65 SKIPPED: 4 WARNINGS: 36 FAILED: 3
  1. Error: (unknown) (@test-bayesfactor.R#40) 
  2. Error: brms (@test-brms.R#14) 
  3. Error: rstanarm (@test-rstanarm.R#31)

Possibly by skipping these tests on Travis, if we don't find any other way.
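One option, if we go that route, would be testthat's built-in skip helpers (the test name and expectation below are only placeholders):

library(testthat)

test_that("brms models work", {
  skip_on_travis()     # skipped whenever the TRAVIS environment variable is set
  skip_on_cran()       # also skip on CRAN, to be safe
  expect_true(TRUE)    # placeholder for the real expectations
})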

@mattansb
Member

mattansb commented May 7, 2019

The only Travis I know is Travis Barker... I don't think he'll be much help...

@DominiqueMakowski
Member Author

Unfortunately, this Travis is the one that sets the rhythm of our lives now 😥 🥁

@DominiqueMakowski
Member Author

merging 🎉 Thanks for this long but fruitful exchange

@DominiqueMakowski merged commit bdd1a04 into master May 9, 2019