Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use 'rtdists' for post-processing of wiener diffusion models #385

Closed
wants to merge 4 commits into from

Conversation

paul-buerkner
Copy link
Owner

@singmann would you mind reviewing the PR if I use rtdists correctly?

@singmann
Copy link
Contributor

On a first look everything looks fine.

But I will run some real tests at some point next week to make sure we are not overlooking something. However, I have two deadlines next week, Monday and Wednesday. So it might take a few days.

@paul-buerkner
Copy link
Owner Author

I will have to release the new brms version in the next few days as soon as loo 2.0 is on CRAN. Do you think you can take a more thorough look today or tomorrow? That would be really nice!

@singmann
Copy link
Contributor

I took a more in depth look into it in the last days and it seems that my assumption about speed were again wrong (I probably should simply stop making assumption about speed of code). I tested the new code against the model fitted in my blog post (http://singmann.org/wiener-model-analysis-with-brms-part-i/).

For this code, your implementation based on rtdists is in fact slower than the existing RWiener implementation (around 5300 seconds for rtdists versus around 3000 seconds for RWiener).

One problem with your code is that it does not use the vectorization that is already implemented in rtdists (i.e., Vectorize is unnecessary). I then changed the implementation to make use of the vectorization abilities in rtdists. With this code, both packages require approximately the same amount of time (rtdists might still be slightly slower, but the difference is less than 5%).

I believe remembering that there are certain parameter setting for which the RNG of RWiener fails and the one of rtdists does not. However, given that my believes apparently cannot be trusted I would need to run some more tests on this. Unfortunately, I am at a workshop in the UK at the moment until Sunday can cannot do this right now. If you want, I can nevertheless send you the updated code. However, given that there seems to be little actual benefit at the moment, this might perhaps not be the most important thing right now. I would suggest to go ahead with the new version without this addition. Maybe I can still find some test cases in which there is any real better outcome for rtdists in the future.

I want to apologize for having costed you time without any real benefit. Sorry.

One more thing, I also looked at the actual outcomes of all versions (i.e., the posterior predictive distributions) and they were all pretty much equivalent.

@paul-buerkner
Copy link
Owner Author

Thank Henrik for such a detailed analayis! And no worries, it didn't took me too long and it is good to know, that as of now, both implementations have roughly the same speed. I will close this PR, but leave the branch on github. If we later (whenever you have time; we are no longer in a hurry) find enough evidence that rtdists is to be favored overall, we can open this PR again.

@singmann
Copy link
Contributor

Hi Paul,

I would like to repoen this issue and discussion. I currently have a large fitted brms diffusion model (with 177K observations from 140 participants) for which I simply cannot get posterior predictives with RWiener. I have tried to let it run for weeks (actually months) and it did not provide any results yet for 500 posterior predictive samples.

I have since changed the RNG to the one from rtdists (see: https://github.com/singmann/brms) and it seems to work a lot better:

> set.seed(1)
> system.time(
+ tmptt <- predict(fit_congruency_old_diff, summary = FALSE, 
+                  negative_rt = TRUE, nsamples = 1)
+ )
   user  system elapsed 
320.189   2.391 322.523

I have run the exact same code for the CRAN version of brms and it has not yet produced a single posterior predictive sample after several hours (compared to the 5 minutes it takes for rtdists).

Thus, it seems that for some parameter sets, RWiener is just prohibitevly slow. However, I cannot guarantee that there are not sets for which the same would be true for rtdists. Therefore, ideally there would be a way to specify which package to use for the user. Potentially also with thepossibility to pass additional arguments to the RNG function (the current development version of rtdists provides two different methods for generating data).

Please note that the implementation on my github (https://github.com/singmann/brms) is pretty hacky, but it uses the rtdists built-in vectorization which seems like a good idea.

Let me know if there is anything I can provide or if you think there is a way to let the user choose which RNG to use or to specify additional arguments to the RNG.

Best,
Henrik

@paul-buerkner
Copy link
Owner Author

I think a good code structure could be to have an argument in the wiener distribution functions to specify which package to use for internal computation, so that both are available. What remains to be done then is to have an additional argument in predict to pass additional arguments to the distribution functions.

@paul-buerkner paul-buerkner reopened this Dec 23, 2019
@singmann
Copy link
Contributor

That sounds like an ideal solution. Let me know if you want my input on anything.

@paul-buerkner
Copy link
Owner Author

I will. Thank you!

paul-buerkner added a commit that referenced this pull request Jun 11, 2020
@paul-buerkner
Copy link
Owner Author

I have updated the brms master branch to support the rtdists backend as well. Would you mind double checking the code related to rtdists?

Here are some examples:

set.seed(1234)
n <- 10
x <- seq(0.1, 1, 0.1)
alpha <- rexp(n)
tau <- 0.05
beta <- 0.5
delta <- rnorm(n)
resp <- sample(c(0, 1), n, TRUE)

dwiener(x, alpha, tau, beta, delta, resp, backend = "Rwiener")
dwiener(x, alpha, tau, beta, delta, resp, backend = "rtdists")

rwiener(n, alpha, tau, beta, delta, backend = "Rwiener")
rwiener(n, alpha, tau, beta, delta, backend = "rtdists")

One can set the preferred backend globally for the current R session:

set.seed(0)
RT <- rbind(cbind(rwiener(100, 2, .3, .5, 0), grp = "A"),
            cbind(rwiener(100, 2, .3, .5, 1), grp = "B"))
names(RT)[1] <- "rt"

fit_wiener <- brm(rt | dec(resp) ~ grp, data = RT, 
                  family = wiener())

# default anyway
options(wiener_backend = "Rwiener")
pp_check(fit_wiener)
loo(fit_wiener)

# alternative backend
options(wiener_backend = "rtdists")
pp_check(fit_wiener)
loo(fit_wiener)

I will close this PR as I directly updated master with the new feature.

@paul-buerkner paul-buerkner deleted the use-rtdists branch July 15, 2020 13:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants