New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Discrepancy between expected and observed coalescence time in beta coalescent for large values of alpha #2256
Comments
Thanks for the report @Sendrowski - the figure seems to have gone missing, would you mind posting again please? |
No - maybe a problem with GitHub? |
Possibly — I managed to view it with an independent device. Here the link to the image if that is of any help. |
Sorry, still not showing up for me (I got a 404 on that link) |
And here is an iCloud link. Hope it works :) |
Any thoughts here @JereKoskela ? |
I ran a C++ script simulating Beta(2-a, a) random variables for a very close to 2, and relative errors of 5 or 6 per cent in an empirical mean from 10k samples are common because the true mean is so small. I think that explains what is going on below alpha = 1.995 or so. As for the spike very close to 2, the timescale of the Schweinsberg Beta-coalescent model is unpleasant. It goes to infinity at alpha = 1 and zero at alpha = 2, so I wouldn't be surprised if numerical issues creep in near those boundaries. At the alpha = 2 boundary in particular, you're essentially sampling 0/0. The practical thing to do is just use the Hudson model, since probability of seeing a multiple merger in the the Beta-coalescent is effectively zero in that regime. |
I agree that a relative error of this magnitude might perhaps not seem so surprising (100k replicates in my code example), but it is also important to note that the simulated values are consistently larger than the theoretical ones (I am not taking absolute values). For |
You're right @Sendrowski, I missed the lack of a modulus sign, and evidently also one zero on the number of replicates. I'll do some more digging. It looks like a very odd bug; the code just calculates the timescale and multiplies it by an Exp(1) random variable. And somehow the resulting mean does not equal the timescale. |
Yes, very odd indeed... The bias also appears to drop somewhat near |
Ok, this turned out to be an insufficiently sensitive polynomial approximation for evaluating (essentially) @jeromekelleher, any comments or are you happy to merge the linked PR? I've rerun the tests in verification.py for the Beta-coalescent and all still look good. |
Thanks for sorting this out so quickly @JereKoskela! Change LGTM. @Sendrowski, are you happy with the changes? Shall we push out a bugfix release? |
Also looks good to me! Thank you for the quick fix @JereKoskela, and I look forward to meeting you in Warwick in April. As for the release, it’s totally up to you how soon you would like to push it out. |
Sorry, still can't see the image @Sendrowski . I think maybe you're uploading a tiff instead of a png or something? |
Hm it should be a png, but the graph I enclosed actually looks identical to the one I originally posted. |
Yeah, the timescale will always break close to alpha = 1 or 2. There is no way to avoid that. I suppose we could disallow values very close to the boundaries as a result. I'd like to make sure that 1.01 and 1.99 work even if they come from floating point calculations in some other script, so maybe 1.009 and 1.991? |
Sounds good! Of note perhaps that I could not observe the same kind of numerical instability for values near 1 — at least for similar distances to the boundary. |
Thanks, that's useful to know. It's possible that we can get much closer to 1 than 2, but eventually everything will be infinite for alpha too close to 1. I'll see if I can find a reasonable lower bound. |
Shall we reopen this issue, or make a new one? |
I've reopened this one. I'm hoping to get to this within a few days. |
Hey!
There appears to be an upward bias in the simulated coalescence times of the beta coalescent. This is relative to my expectation based on the time scaling provided in the documentation. The bias starts to be noticeable from
alpha=1.8
and increases substantially for values closer to 2.Relative error between observed and expected coalescent times plotted against alpha:
Code to generate figure:
Am I perhaps missing something?
The text was updated successfully, but these errors were encountered: