Problematic handling of "zero" #432

Closed
molpopgen opened this issue Mar 25, 2020 · 1 comment · Fixed by #433

molpopgen commented Mar 25, 2020

Many of the distributions of effect sizes have a nonzero density at an effect size of zero. Although the occurrence of such mutations is rare, they cause a problem for simulations with tree sequences. The default behavior when constructing a mutation is to assign neutral = True when esize == 0. When this happens, the mutation is recorded into the mutation table but is not put into the offspring genome.

The symptom of this is to (very rarely) see an exception at the end of the simulation, "bad mutation key remapping". This is triggered if all of the following occur:

  • A mutation with zero effect size arises near the end of the simulation, or in a preserved sample.
  • Simplification (correctly) retains the mutation in the tables, but the mutation is not present in any genomes and has a count of zero in pop.mcounts.
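
The mismatch described above can be illustrated with a toy model. This is only a sketch: `ToyMutation`, `add_mutation`, and `orphaned_keys` are hypothetical names invented for illustration and are not part of fwdpy11 or fwdpp. It assumes only what the issue states, namely that a zero effect size marks a mutation as neutral, and that neutral mutations are recorded in the mutation table but not placed in the offspring genome.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMutation:
    """Hypothetical stand-in for a mutation record (not the fwdpy11 type)."""
    position: float
    esize: float
    neutral: bool = field(init=False)

    def __post_init__(self):
        # Mirrors the default described in the issue: esize == 0 implies neutral.
        self.neutral = (self.esize == 0.0)


def add_mutation(mutations, table, genome, mcounts, position, esize):
    """Create a mutation, record it in the table, and maybe add it to the genome."""
    key = len(mutations)
    mutation = ToyMutation(position, esize)
    mutations.append(mutation)
    mcounts.append(0)
    table.append(key)  # every new mutation is recorded in the table
    if not mutation.neutral:
        genome.append(key)  # only non-neutral mutations enter the genome
        mcounts[key] += 1
    return key


def orphaned_keys(table, mcounts):
    """Keys present in the table whose count in the population is zero."""
    return [key for key in table if mcounts[key] == 0]


mutations, table, genome, mcounts = [], [], [], []
add_mutation(mutations, table, genome, mcounts, position=0.1, esize=-0.02)
add_mutation(mutations, table, genome, mcounts, position=0.7, esize=0.0)
print(orphaned_keys(table, mcounts))  # prints [1]
```

The second mutation is in the table but in no genome, which is the state that a final remapping step cannot reconcile.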

Such mutations do arise periodically, but they are usually harmless. They do not affect fitness or trait values, so results remain correct, and they have usually drifted out of the population before final cleanup, so the exception is not triggered. The bug is therefore rare: you need a large N to have a decent chance of seeing the final exception at all, and even then it is uncommon.

Possible fixes include:

  • Allow such "neutral" mutations to be incorporated into the smutations field of a genome. This is perhaps the most obvious solution, because the simulation would then faithfully respect the distributions of effect sizes. However, unless I can come up with a clever workaround, we would need to break backwards compatibility with binary formats.
  • Do not allow such mutations to happen, which means adding a while loop to the operator() of each Sregion type that rejects zeros. I don't really like this...
  • Do not allow such variants to be entered into the site/mutation tables. This is by far the path of least resistance/pain, and could be handled either via upstream changes to fwdpp or by adding a new function to fwdpy11. The downside of this approach is that a small part of the input density of some Sregion types never gets tracked.
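
As a rough sketch of the second and third options, the following uses hypothetical helpers (`draw_nonzero` and `record_if_selected` are illustrative names, not fwdpy11 or fwdpp functions). The first redraws until the effect size is nonzero, which is the "while loop" idea. The second simply declines to record a zero-effect variant in a toy table. The `max_tries` guard is an assumption added here so that a degenerate sampler cannot loop forever.

```python
import random


def draw_nonzero(sampler, rng, max_tries=1000):
    """Option 2 (sketch): redraw until the sampled effect size is not zero."""
    for _ in range(max_tries):
        esize = sampler(rng)
        if esize != 0.0:
            return esize
    raise RuntimeError("sampler returned zero on every attempt")


def record_if_selected(table, key, esize):
    """Option 3 (sketch): keep zero-effect variants out of the toy table."""
    if esize == 0.0:
        return False  # the variant is dropped and never tracked
    table.append(key)
    return True


rng = random.Random(42)
esize = draw_nonzero(lambda r: r.gauss(0.0, 0.1), rng)

table = []
record_if_selected(table, key=0, esize=esize)  # recorded
record_if_selected(table, key=1, esize=0.0)    # skipped
```

The trade-off matches the one noted in the list: rejection changes the sampled distribution slightly by excluding zero, while filtering leaves the sampler alone but loses track of the zero-effect draws.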

molpopgen commented

Closed by #433.

@molpopgen molpopgen added this to the 0.6.3 milestone Mar 30, 2020