Problematic handling of "zero" #432

molpopgen · 2020-03-25T17:14:47Z

Many of the distributions of effect sizes have a nonzero density at an effect size of zero. Although the occurrence of such mutations is rare, they cause a problem for simulations with tree sequences. The default behavior when constructing a mutation is to assign neutral = True when esize == 0. When this happens, the mutation is recorded into the mutation table but is not put into the offspring genome.
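The default behavior described above can be illustrated with a small toy model. This is only a sketch, not fwdpy11 or fwdpp code: the `Mutation` class, the `add_mutation` helper, and the list-based table and genome are hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Mutation:
    """Hypothetical stand-in for a mutation record (not the fwdpy11 type)."""
    position: float
    esize: float
    neutral: bool = field(init=False)

    def __post_init__(self):
        # Mirrors the default described in the issue:
        # an effect size of exactly zero marks the mutation as neutral.
        self.neutral = (self.esize == 0.0)


def add_mutation(mutation, mutation_table, genome):
    """Always record the mutation in the table, but only add its key
    to the offspring genome when it is not flagged as neutral."""
    mutation_table.append(mutation)
    key = len(mutation_table) - 1
    if not mutation.neutral:
        genome.append(key)
    return key


table, genome = [], []
add_mutation(Mutation(position=0.1, esize=-0.02), table, genome)
add_mutation(Mutation(position=0.5, esize=0.0), table, genome)
```

After these two calls the table holds both mutations, while the genome holds only the key of the first one. That mismatch is the starting point of the problem.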

The symptom is a very rare exception at the end of the simulation, "bad mutation key remapping". It is triggered if all of the following occur:

1. A mutation with zero effect size arises near the end of the simulation, or in a preserved sample.
2. Simplification (correctly) retains the mutation in the tables, but the mutation is not present in any genomes and has a count of zero in pop.mcounts.
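The inconsistency in these conditions can be sketched as follows. This is a toy illustration under stated assumptions, not the actual cleanup code: `remap_mutation_keys` is a hypothetical function, and the assumption is that final cleanup only assigns new indexes to mutations whose count is positive.

```python
def remap_mutation_keys(table_keys, mcounts):
    """Toy cleanup step (hypothetical): compact the mutation keys.

    Assumption: only mutations with a positive count are kept, so only
    they receive a new index. A key that the tables still reference but
    whose count is zero has no new index to map to.
    """
    remap = {}
    for key, count in enumerate(mcounts):
        if count > 0:
            remap[key] = len(remap)
    try:
        return [remap[key] for key in table_keys]
    except KeyError:
        raise RuntimeError("bad mutation key remapping")


# Consistent state: every key in the tables has a positive count.
consistent = remap_mutation_keys(table_keys=[0, 2], mcounts=[3, 0, 1])

# Inconsistent state: key 1 is retained in the tables but has count zero.
try:
    remap_mutation_keys(table_keys=[0, 1], mcounts=[3, 0, 1])
    failed = False
except RuntimeError:
    failed = True
```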

Such mutations do arise periodically, however. They normally do no harm: the variants don't affect fitness/trait values, so results are correct, and they don't trigger the exception during final cleanup because they have usually drifted out by then. The bug therefore shows up rarely; you need large N to have a decent chance of seeing the final exception at all, and even then it is uncommon.

Possible fixes include:

1. Allow such "neutral" mutations to be incorporated into the smutations field of a genome. This is perhaps the most obvious solution, as the simulation then faithfully respects the distributions of effect sizes. However, unless I can come up with a clever workaround, we would need to break backwards compatibility with binary formats.
2. Do not allow such mutations to happen, which means a while loop in the operator() of each Sregion type that redraws until the effect size is nonzero. I don't really like this...
3. Do not let such variants be entered into the site/mutation tables. This is by far the path of least resistance/pain, and could be handled either via upstream changes to fwdpp or by adding a new function to fwdpy11. The downside of this approach is that some small input density of some Sregion types never gets tracked.
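The second and third options listed above can be sketched in a few lines. Again, this is a hedged illustration rather than the fix that was merged: `draw_nonzero`, `record_if_nonzero`, the `max_tries` cap, and the tuple-based table are all hypothetical.

```python
def draw_nonzero(draw, max_tries=1000):
    """Redraw approach: call `draw` until it returns a nonzero effect size.

    `draw` is any zero-argument callable returning a float. The
    `max_tries` cap is an added safeguard against a sampler that only
    ever returns zero.
    """
    for _ in range(max_tries):
        esize = draw()
        if esize != 0.0:
            return esize
    raise RuntimeError("no nonzero effect size drawn")


def record_if_nonzero(position, esize, mutation_table):
    """Filtering approach: leave zero-effect variants out of the table.

    Returns the new key, or None when the variant was skipped.
    """
    if esize == 0.0:
        return None
    mutation_table.append((position, esize))
    return len(mutation_table) - 1


# A deterministic sampler that yields two zeros before a usable value.
values = iter([0.0, 0.0, 0.25])
drawn = draw_nonzero(lambda: next(values))

mutation_table = []
skipped = record_if_nonzero(0.5, 0.0, mutation_table)
kept = record_if_nonzero(0.7, -0.1, mutation_table)
```

The redraw approach changes the realized distribution of effect sizes by excluding zero, while the filtering approach leaves the sampler alone and drops the zero draws from the records, which is the trade-off noted in the list above.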


molpopgen · 2020-03-28T18:34:19Z

Closed by #432.

molpopgen added bug tree sequences labels Mar 25, 2020

molpopgen mentioned this issue Mar 26, 2020

Fix zero effect size bug #433

Merged

molpopgen linked a pull request Mar 26, 2020 that will close this issue

Fix zero effect size bug #433

Merged

molpopgen mentioned this issue Mar 28, 2020

Add test for #432 #435

Closed

molpopgen added this to the 0.6.3 milestone Mar 30, 2020

molpopgen closed this as completed Mar 30, 2020
