Searching with zero errors #17

Merged
merged 6 commits into from
Jan 4, 2024

Collaborator

@eaasna eaasna commented Dec 7, 2023

The mismatch/indel score in the scoring scheme is derived from the inverse of the error rate (ceil(-1/eps) + 1), i.e. the smaller the error rate, the larger the magnitude of the negative mismatch score. This PR sets a lower bound for the mismatch score.
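To make the relationship concrete, here is a minimal sketch of the unbounded formula on its own. The function name `unboundedMismatchIndel` is hypothetical and `long` stands in for `TScore`; this is an illustration of the arithmetic, not Stellar code.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of Stellar: evaluates ceil(-1/eps) + 1,
// the mismatch/indel score before any lower bound is applied.
// `long` is an assumed stand-in for TScore.
long unboundedMismatchIndel(double eps)
{
    assert(eps > 0.0); // eps == 0 would divide by zero
    return static_cast<long>(std::ceil(-1.0 / eps)) + 1;
}
```

For eps = 0.5 this gives -1 and for eps = 0.125 it gives -7; as eps approaches 0 the penalty grows without bound, reaching roughly -10,000 at eps = 0.0001.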

Otherwise, for a very small error rate, the seed extension does not stop in time. The mechanism behind the extension not stopping is unclear; it likely has to do with this line, which only forces an update of the scoring scheme for extremely low error rates.

Downstream, the consequences of a longer seed extension are:

  1. larger diagonals (stellar::_bestExtension)
  2. larger trace matrix (stellar::_fillMatrixBestEndsRight and stellar::_fillMatrixBestEndsLeft)
  3. a lot of RAM and time (stellar::_align_banded_nw_best_ends)

- TScore mismatchIndel = (TScore)_max((TScore) ceil(-1/eps) + 1, -(TScore)length(host(infH)));
+ TScore mismatchIndel = -(TScore)1000;
+ if (eps > 0.001) // avoid division by 0
+     mismatchIndel = (TScore)_max((TScore) ceil(-1/eps) + 1, -(TScore)length(host(infH)));
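The patched logic can be sketched as a stand-alone function for experimenting outside of SeqAn. The name `boundedMismatchIndel` is hypothetical, the `hostLength` parameter is an assumed stand-in for `length(host(infH))`, and `long` stands in for `TScore`.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

// Hypothetical stand-alone version of the patched logic, not Stellar code.
// `hostLength` is assumed to play the role of length(host(infH)).
long boundedMismatchIndel(double eps, std::size_t hostLength)
{
    long mismatchIndel = -1000; // used whenever eps <= 0.001, including eps == 0
    if (eps > 0.001)            // guard also avoids division by zero
    {
        long const fromErrorRate = static_cast<long>(std::ceil(-1.0 / eps)) + 1;
        long const fromLength = -static_cast<long>(hostLength);
        // The less negative of the two candidates wins.
        mismatchIndel = std::max(fromErrorRate, fromLength);
    }
    return mismatchIndel;
}
```

With this shape, an error rate of 0 and the former 0.0001 workaround both yield -1000. Note that, as in the diff, the sequence-length clamp is only applied on the eps > 0.001 branch.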
Collaborator Author

@eaasna eaasna Dec 7, 2023

To avoid division by zero, Stellar did not allow 0% error rates, although they appear in the benchmark in the paper. The workaround was to use a very small error rate of 0.0001, but this produced a mismatch score of very large magnitude (on the order of -10,000), which led to the issues described above.

@eaasna eaasna changed the title Zero errors Searching with zero errors Dec 12, 2023
@eaasna eaasna marked this pull request as ready for review December 13, 2023 09:18
@eaasna eaasna requested a review from SGSSGene December 13, 2023 09:18
@eaasna eaasna merged commit ff4f53b into main Jan 4, 2024
5 checks passed
2 participants