boost math error during EM iteration: Evaluation of function at pole -nan #48
Hi @warrenmcg, thanks for this bug report. This error comes from evaluating the digamma function at one of its poles, which is what boost reports as "Evaluation of function at pole". I can say that this behavior is never expected; it's not the case that, due to the stochastic nature of the algorithm, we sometimes expect to evaluate the digamma function at an invalid argument. What I suspect is that there is some corner case where the argument to the digamma function ends up at a pole.
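For context (my illustration, not part of the original thread): the digamma function, like the rest of the gamma family, has poles at the non-positive integers (0, -1, -2, ...), which is exactly the condition boost::math reports as "Evaluation of function at pole". The Python standard library has an analogous behavior for `lgamma`, which raises at the same points:

```python
import math

# The gamma family (gamma, lgamma, digamma) has poles at 0, -1, -2, ...
# boost::math::digamma throws "Evaluation of function at pole" there;
# Python's stdlib lgamma raises ValueError at the same arguments.
for x in (0.0, -1.0, -2.0):
    try:
        math.lgamma(x)
    except ValueError:
        print(f"lgamma pole at {x}")

# Away from the poles everything is finite:
print(math.lgamma(0.5))  # log(sqrt(pi)), about 0.5724
```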
Well, I'm glad it was easy to troubleshoot. I'm crunched for a deadline, so I won't be able to devote much time to helping put together a toy dataset for you; in a few weeks, I'll see what I can do. Thank you very much for your help!
No problem. I'll post back here if I find anything related to this issue in the meantime. Otherwise, you should be just fine running without that option.
There is now more stringent checking of the input to the digamma function, and so these issues should be resolved in the current release. Please report back (and re-open this) if you still encounter this issue. |
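The fix described above suggests a guard of roughly this shape. This is only a sketch: the names (`DIGAMMA_MIN`, `safe_digamma`) and the floor value are hypothetical, and salmon's actual C++ check may differ.

```python
import math

DIGAMMA_MIN = 1e-8  # hypothetical floor; the real threshold may differ

def digamma(x):
    # Crude numerical digamma via a central difference of lgamma;
    # stands in for boost::math::digamma in this sketch. The step is
    # shrunk near zero so the difference never straddles the pole.
    h = min(1e-6, x / 2.0)
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2.0 * h)

def safe_digamma(x):
    # "More stringent checking of the input": an expected count that
    # underflows to 0.0 (or drifts slightly negative from round-off)
    # is floored instead of being handed to digamma at a pole.
    return digamma(max(x, DIGAMMA_MIN))
```

With the guard, `safe_digamma(0.0)` returns a large negative but finite value instead of failing, which is the behavior you want mid-iteration when a transcript's weight underflows.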
Hi Rob,
I've been running another group's samples (single-end, second-strand protocol), and I have a script that iterates through each sample and runs salmon. I'm running the latest version (0.6.0) with the following arguments: `salmon quant -i salmon_index --libType SF -r <(gzip -c -d $IN_FILE) -o $OUTPUT --numBootstraps 100 --useVBOpt --useFSPD --geneMap $GENES --biasCorrect -p 59`
During the EM iteration step (soon after the 500th round, when salmon recalculates effective lengths), I get this error:
I can't tell if this is a normal, occasional occurrence with the non-deterministic algorithm or something that is never supposed to happen. These particular samples are extremely high depth (roughly 170-190M reads per sample), so that might be the cause, but I don't understand enough of how the algorithm works to troubleshoot it or to put together a toy dataset that reproduces the error. Rerunning the sample that caused the error often works.
If I can throw in a feature request here, it would be great to be able to set the seed to make the runs deterministic. Is that possible?
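On the feature request: per-thread seeding is the usual way to make a multithreaded sampler reproducible. A sketch of the idea below (hypothetical names; salmon's internals are C++ and may differ, and note that with 59 worker threads the asynchronous order of updates can still introduce nondeterminism even when every RNG stream is fixed):

```python
import random

def make_thread_rngs(master_seed, num_threads):
    # One independent, deterministic stream per worker thread:
    # re-running with the same master seed reproduces every draw,
    # regardless of how the OS schedules the threads. Seeding with a
    # "seed:thread" string keeps the streams distinct per thread.
    return [random.Random(f"{master_seed}:{t}") for t in range(num_threads)]

a = make_thread_rngs(42, 4)
b = make_thread_rngs(42, 4)
print(all(x.random() == y.random() for x, y in zip(a, b)))  # True
```

Deriving each stream from (seed, thread index) rather than sharing one global RNG avoids lock contention and keeps each thread's draws independent of how many samples the other threads have taken.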