LOO Comparison Function #13
@goedman I believe you already had something similar implemented in StatsModelComparisons; care to make a pull request adding it?
I'll have a look.
The comments I posted in the closed PR are better located with this issue, so I reposted them here. The compare() method I wrote for StatisticalRethinking produces something like this:
My first question: is this (based, of course, on ParetoSmooth's loo() method) what you are looking for? I used DataFrames and would need to convert to the AxisKeys.jl format and obtain the data from the Psisloo object; a good exercise for me, as I only recently looked into AxisKeys.jl. You also mention support for Stan.jl. We could indeed add StanSample.jl to the test Project.toml and install Stan's cmdstan in the CI script; that is technically simple to do and robust. The three models above (m5.1s, m5.3s, and m5.2s) investigate the association of the median age at marriage (A) and the marriage rate (M) with the divorce rate (D) in the southern US states. Here:
E.g., for m5.3s the Stan Language program is:
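(The original program didn't survive the copy; a minimal sketch in the spirit of the Statistical Rethinking m5.3 model, embedded as a Julia string for StanSample.jl, might look like the following. The priors and the generated quantities block are illustrative assumptions, not necessarily the exact program used.)

```julia
stan5_3 = "
data {
  int<lower=1> N;
  vector[N] D;    // divorce rate
  vector[N] A;    // median age at marriage
  vector[N] M;    // marriage rate
}
parameters {
  real a;
  real bA;
  real bM;
  real<lower=0> sigma;
}
model {
  vector[N] mu = a + bA * A + bM * M;
  a ~ normal(0, 0.2);
  bA ~ normal(0, 0.5);
  bM ~ normal(0, 0.5);
  sigma ~ exponential(1);
  D ~ normal(mu, sigma);
}
generated quantities {
  // pointwise log likelihoods, needed for PSIS-LOO
  vector[N] log_lik;
  for (i in 1:N)
    log_lik[i] = normal_lpdf(D[i] | a + bA * A[i] + bM * M[i], sigma);
}
";
```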
and I extract the log_lik matrix as before, e.g. for the Stan Language model above (stan5_3):
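(The extraction code was also lost in the copy; a hedged sketch, assuming StanSample.jl's stan_sample/read_samples API, whose exact signatures vary by version:)

```julia
using StanSample, DataFrames

# Run the model and collect the log_lik.* columns into a matrix.
# `data` is assumed to hold N, D, A, and M.
m5_3s = SampleModel("m5.3s", stan5_3)
rc = stan_sample(m5_3s; data)
if success(rc)
    df = read_samples(m5_3s, :dataframe)
    ll_cols = filter(c -> startswith(String(c), "log_lik"), names(df))
    # draws × observations; transpose if loo() expects observations first
    log_lik = Matrix(df[:, ll_cols])
end
```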
My second question is how strongly you would like to demonstrate Stan.jl, versus using the equivalent Turing models with Chris' additions and possibly simulating the association. We can still, at a high level, document how to use this package with Stan.
@goedman This would be perfect, thanks! The documentation here should mostly be about this package, rather than Stan or Turing; I suspect more people will be using this package with Turing, though, so I think a tutorial should use Turing models. Since it looks like you use …
Hi Chris (@itsdfish), I'm trying to test the initial version of a loo_compare function. With Stan I get:
I don't trust the ΔSE values yet. Not sure what I'm doing wrong with Turing though; would you mind taking a quick look?
returns:
and:
The final check in the Turing test script also fails.
I think the first problem can be traced back to a warning that you overlooked:
Unfortunately, Turing does not save pointwise log likelihoods when broadcasting is used in the model likelihood. Instead, they are summed across observations. You can see this problem in the size of the first dimension of …
The solution, unfortunately, is to use a for loop in the Turing model or to redefine the likelihood (e.g. …). The reason that …
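(For illustration, a minimal stand-in model showing the loop form; this is an assumption-laden sketch, not the script from this thread.)

```julia
using Turing

# Broadcasting the likelihood (y .~ Normal.(μ, σ)) collapses the pointwise
# log likelihoods; an explicit loop keeps one term per observation.
@model function m_loop(x, y)
    σ ~ Exponential(1)
    α ~ Normal(0, 1)
    β ~ Normal(0, 1)
    μ = α .+ β .* x
    for i in eachindex(y)
        y[i] ~ Normal(μ[i], σ)
    end
end
```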
If Turing is updated to the newest version, the error described in the warning should be fixed. The thread you linked is about a separate issue (…).
Thanks guys, I'm now getting:
vs. the Stan results above:
Edit: This looks pretty reasonable. I'm a bit surprised by occasionally wild jumps in the Turing estimates if I don't use …
Yep, it should be -- at the moment, it looks like you're computing the difference of the standard errors, rather than the standard error of the differences. Worth noting that the most recent PR I just merged should include additional information, like the in-sample score (lpd), that should make it easier to provide all of the information in the Statistical Rethinking comparison method. (Although it might break some of your code a bit, unfortunately; sorry about that 😅)
Standard errors should be fairly easy to compute -- it's just the regular formula for the standard error of a mean. (Take the pointwise difference in LOO scores, use …
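(A minimal sketch of that computation, assuming `pw1` and `pw2` are vectors of pointwise LOO scores, one per observation; the function name is made up for illustration:)

```julia
using Statistics

# Standard error of the difference in LOO scores between two models,
# computed from the pointwise scores.
function loo_diff_se(pw1, pw2)
    d = pw1 .- pw2                # pointwise differences in LOO scores
    n = length(d)
    se_mean  = std(d) / sqrt(n)   # SE of the mean difference
    se_total = sqrt(n) * std(d)   # same quantity on the summed (elpd) scale
    return (Δ = sum(d), se = se_total, se_mean = se_mean)
end
```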
In StatsModelComparisons.jl I used:
I think that matches your description and gives:
Do you have a preference as far as the header sequence is concerned? I was thinking:
@goedman I would follow the headers used in the … For the comparison object, I think it's better to only have the comparisons, rather than the LOO-CV scores themselves, since otherwise it encourages people to try to interpret the raw scores (which aren't meaningful).
You mean with LooCompare defined as:
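(The definition didn't copy over; a hypothetical sketch, with field names invented for illustration and not necessarily matching the package's actual struct:)

```julia
# Hypothetical container holding only the comparisons, not the raw scores.
struct LooCompare
    names::Vector{Symbol}     # models, ordered best first
    Δ_elpd::Vector{Float64}   # elpd difference relative to the best model
    Δ_se::Vector{Float64}     # standard error of each difference
end
```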
something like this:
Just a few thoughts/observations:
There are definite advantages to this setup, with a Project.toml in the test directory (after adding ParetoSmooth to that Project.toml you can activate the test environment, run the scripts, and inspect the results). But I don't think doing it this way is fully production ready yet. The scripts do run after the above message.
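(A hedged sketch of that workflow, run from the root of a ParetoSmooth checkout; the script filename is illustrative:)

```julia
using Pkg

Pkg.activate("test")          # use test/Project.toml as the environment
Pkg.develop(path=".")         # add the local ParetoSmooth to it
Pkg.instantiate()
include("test/test_loo_compare.jl")   # run a script and inspect the results
```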
Implemented!
We should provide a function that compares two models, similar to loo_compare. This should be pretty easy to build.
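(A hedged sketch of the intended usage, mirroring R's loo_compare; the variable names are invented, and it assumes ParetoSmooth's loo() accepts a pointwise log-likelihood array. The final API may differ.)

```julia
using ParetoSmooth

# Hypothetical call pattern: compute PSIS-LOO for each model, then compare.
loo_m1 = loo(log_lik_m1)
loo_m2 = loo(log_lik_m2)
comparison = loo_compare(loo_m1, loo_m2)
```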