Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Long sequences run into RecursionError #11

Open
bertsky opened this issue Oct 20, 2018 · 4 comments
Open

Long sequences run into RecursionError #11

bertsky opened this issue Oct 20, 2018 · 4 comments

Comments

@bertsky
Copy link

bertsky commented Oct 20, 2018

Since backtraceFrom is implemented by recursion (instead of iteration), calling the aligner on "long" sequences (more than 1000 items) results in a RecursionError with Python defaults. Extending stack depth limit may cause other serious issues.

@hadaev8
Copy link

hadaev8 commented Sep 25, 2019

Same here

@hadaev8
Copy link

hadaev8 commented Oct 13, 2019

@bertsky may you advise similar but better package?

@bertsky
Copy link
Author

bertsky commented Oct 14, 2019

@hadaev8 I'll try. Having tested several libraries available on PyPI (when searching with align or edit distance keywords) I finally reverted to the standard difflib.SequenceMatcher (with isjunk=None, autojunk=False) – although it is not optimising general global alignment (Needleman-Wunsch) but minimal visual difference (Ratcliff-Obershelp) – for the following reasons:

Generally, you want more than just correctness:

  • robustness:
    • possibility for bailout before entering extremely costly computations (memory or time)
    • heap and stack restrictions
  • efficiency: general complexity is O(n*m) (or even cubic when weighted), but:
    • there is a large difference in the linar factor (esp. between pure Python and good C implementations),
    • optimisation for benign cases is possible and makes a huge difference for average performance
    • you don't always need to enumerate all possible alignments, only one of minimal distance (but possibly with assumptions on ordering)
  • weighting etc

@hadaev8
Copy link

hadaev8 commented Oct 25, 2019

@bertsky
Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants