--parse-by-seq bug fix + orderminhash bug fix #83

Merged
merged 4 commits into from
Aug 7, 2023

@dnbaker dnbaker commented Aug 7, 2023

  1. Fix the --parse-by-seq code.
  2. OrderMinHash bug fix from updating to sketch v0.19.1
  3. Throw an error on empty sequences.
  4. Improved handling of sequences held in RAM or cached on disk.
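
As a rough illustration of item 3, an empty-sequence check could look like the sketch below. The helper name `require_nonempty` and the choice of `std::invalid_argument` are assumptions made for this example; they are not taken from the Dashing2 code.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper (not the project's actual function): reject an empty
// sequence up front instead of passing it on to sketching or comparison.
inline void require_nonempty(const std::string &name, const std::string &seq) {
    if (seq.empty()) {
        throw std::invalid_argument("Empty sequence: '" + name + "'");
    }
}
```

Failing early this way reports the offending record by name rather than producing an undefined result later on.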

Since --parse-by-seq only needs the sequences for edit distance calculation, we can free their memory once they are no longer needed when running in --seqs-in-ram mode. That mode saves the trouble of caching the parsed sequences to disk, but requires more memory.

Lifetime management needed a bit of extra work, but it seems to be stable for both cases.
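
The in-RAM case described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `SeqStore`, `edit_distance`, and `pairwise_and_free` are hypothetical names, the distance is plain Levenshtein, and the disk-cached path is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-RAM sequence holder; names are illustrative only.
class SeqStore {
    std::vector<std::string> seqs_;
public:
    std::size_t add(std::string s) {
        seqs_.push_back(std::move(s));
        return seqs_.size() - 1;
    }
    std::size_t size() const { return seqs_.size(); }
    const std::string &get(std::size_t i) const { return seqs_.at(i); }
    // Swap with an empty string so the heap buffer is actually released.
    void release(std::size_t i) { std::string().swap(seqs_.at(i)); }
};

// Levenshtein distance using two rolling rows of the DP table.
inline std::size_t edit_distance(const std::string &a, const std::string &b) {
    std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t del = prev[j] + 1;
            const std::size_t ins = cur[j - 1] + 1;
            const std::size_t sub = prev[j - 1] + (a[i - 1] != b[j - 1] ? 1 : 0);
            cur[j] = std::min({del, ins, sub});
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}

// Compute all pairwise distances (upper triangle, row-major order) and
// release each sequence as soon as no remaining pair refers to it.
inline std::vector<std::size_t> pairwise_and_free(SeqStore &store) {
    std::vector<std::size_t> out;
    const std::size_t n = store.size();
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            out.push_back(edit_distance(store.get(i), store.get(j)));
        }
        store.release(i);  // sequence i is not needed by any later pair
    }
    return out;
}
```

The point of the sketch is the lifetime rule: a sequence may only be released after its last comparison, which is why the release sits after the inner loop rather than inside it.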

@dnbaker dnbaker merged commit 34ae37a into main Aug 7, 2023
dnbaker added a commit that referenced this pull request Mar 7, 2024
* Debug --parse-by-seq lazily spilling to disk.

* Additionally update bonsai + hll versions for orderminhash bug fix.

* Free memory if possible for --seqs-in-ram mode.

* Manually manage versions.