Adds a new feature, the sparse profile. Also includes neighbor metadata on each row of the k-mer database. This is expected to increase the fixed per-line storage cost by a factor of about 10. That constant multiplies the exponential term in the storage space needed for the final graph. This also fixes a bug in `kmerdb.fileutil.KDBReader.slurp()` where the full profile was not loading properly. We added some code that may need refactoring.

The idea behind the sparse profile is that we don't need to store connectivity for any subgraph of the full k-mer space that does not occur in the dataset at all. This will help limit file sizes for very sparse settings in k-mer sequence space.
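
To illustrate the difference between the two layouts, here is a minimal sketch. It is not the kmerdb implementation: it assumes a DNA alphabet with a 2-bit encoding, and the names `kmer_to_id`, `count_kmers`, `full_profile`, and `sparse_profile` are hypothetical helpers made up for this example.

```python
from collections import Counter

ALPHABET = "ACGT"  # assumption: plain DNA alphabet, no ambiguity codes


def kmer_to_id(kmer):
    """Map a k-mer to an integer in [0, 4**k) using base-4 digits (hypothetical helper)."""
    idx = 0
    for base in kmer:
        idx = idx * 4 + ALPHABET.index(base)
    return idx


def count_kmers(seq, k):
    """Count every k-mer that occurs in seq, keyed by its integer id."""
    return Counter(kmer_to_id(seq[i:i + k]) for i in range(len(seq) - k + 1))


def full_profile(counts, k):
    """Full profile: one entry for each of the 4**k possible k-mers, zeros included."""
    return [counts.get(i, 0) for i in range(4 ** k)]


def sparse_profile(counts):
    """Sparse profile: only (id, count) pairs for k-mers actually observed."""
    return sorted(counts.items())


counts = count_kmers("ACGTACGT", 3)
full = full_profile(counts, 3)
sparse = sparse_profile(counts)
print(len(full), len(sparse))
```

In this toy input the full profile has 64 rows while the sparse profile has 4, so the larger per-row cost from neighbor metadata is only paid for k-mers that are present in the data.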