align matches for correctness and readability #111
candidates for libraries to do this:
rather than attempting to "trim" low-similarity regions from the two sequences, it might be better to switch to Smith-Waterman or another local alignment, since ultimately we're concerned with all of the high-similarity regions within the two sequences wherever they are (not truly a global alignment). currently, using Needleman-Wunsch results in alignments like:
where both outlying 也 are undesirable, and we certainly want to avoid the second one since it's far outside the main high-scoring region. the optimal alignment is:
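as a rough illustration of the local-alignment idea, here is a minimal pure-Python Smith-Waterman sketch. it is not lingpy's implementation; the function name `smith_waterman`, the scoring values (`match=2`, `mismatch=-1`, `gap=-1`), and the `-` gap symbol are all assumptions chosen for the example. it returns the two aligned substrings plus the bounds of the aligned region in each input, which is the information needed to shrink a match to its high-scoring region.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Local alignment of sequences a and b (hypothetical helper).

    Returns (aligned_a, aligned_b, (start_a, end_a), (start_b, end_b)),
    where the bounds are half-open slices into the original inputs.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix; cells are clamped at 0 so that an
    # alignment can start anywhere (this is what makes it local).
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    end_a, end_b = i, j
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    aligned_a = "".join(reversed(out_a))
    aligned_b = "".join(reversed(out_b))
    return aligned_a, aligned_b, (i, end_a), (j, end_b)
```

unlike Needleman-Wunsch, nothing forces the alignment to cover the low-similarity ends of either sequence, so stray matches far outside the main region are simply left out. note that this sketch only recovers the single best-scoring region; finding every high-similarity region would need repeated passes or a different traceback strategy.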
- Imports lingpy's Smith-Waterman implementation
- Adds classes for auto-derived and custom scoring aligners

See #111

- Make alignment non-mutating operation
- Combine Smith-Waterman aligner variants into one class
- Use alignment values to update match bounds

See #111
using a matcher based on edit distance ratios results in matches that may include irrelevant material at the end; alignment is necessary to remove this material from the match.
additionally, alignment allows for padding gaps with spaces for easier comparison of the results, and can aid in colorizing the results to highlight differences (#54). it could play into the structured output (#39).
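a small sketch of how aligned output could feed the display side, assuming the aligner emits two equal-length strings with `-` as the gap symbol (an assumption; the real gap marker may differ, and `-` could collide with text that contains hyphens). `pad_gaps` and `differing_columns` are hypothetical helper names, not part of any existing API.

```python
def pad_gaps(aligned, gap="-", pad=" "):
    """Replace gap symbols with padding so both strings line up visually.

    For CJK text a full-width space (pad="\u3000") may line up better
    in monospaced output; that choice is an assumption.
    """
    return aligned.replace(gap, pad)


def differing_columns(aligned_a, aligned_b):
    """Return the column indices where the two aligned strings differ.

    These indices could drive colorized output or a structured diff.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned strings must have equal length")
    return [k for k, (x, y) in enumerate(zip(aligned_a, aligned_b)) if x != y]
```

for example, given the aligned pair `"ABCDEF"` and `"ABC-EF"`, `pad_gaps` would turn the second string into `"ABC EF"`, and `differing_columns` would report column 3 as the only difference.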