feat: add serialization accessors to BigramFilter #330
Merged
dmtrKovalenko merged 5 commits into dmtrKovalenko:main on Apr 3, 2026
Add read-only accessors and a from_raw_parts constructor to BigramFilter, enabling external tools to serialize/deserialize the bigram index to/from disk without reaching into private fields.

New public methods:
- lookup(), dense_data(), words(), dense_count(), populated()
- skip_index() -> Option<&BigramFilter>
- from_raw_parts(lookup, dense_data, ...) -> Self
dmtrKovalenko (Owner) left a comment:
Nice project. I have a few suggestions on how to make it even better.
dmtrKovalenko approved these changes on Apr 3, 2026
Summary
Adds read-only accessors and a from_raw_parts constructor to BigramFilter, enabling external tools to serialize and deserialize the bigram index to/from disk without reaching into private fields.

New public methods:
- lookup(), dense_data(), words(), dense_count(), populated(): field accessors
- skip_index() -> Option<&BigramFilter>: sub-index reference
- from_raw_parts(lookup, dense_data, ...) -> Self: reconstruction from serialized data

This is a purely additive, non-breaking change: no existing code is modified and no new dependencies are added. All 7 methods are simple one-line accessors/constructors.
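To make the shape of this API concrete, here is a minimal sketch on a stand-in struct. It is not the fff-core source: the field types (u16, u64, usize), the boxed sub-index, and the from_raw_parts parameters after dense_data are all assumptions chosen for illustration, and rebuild is a hypothetical helper.

```rust
/// Stand-in for fff-core's BigramFilter. Field types are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct BigramFilter {
    lookup: Vec<u16>,
    dense_data: Vec<u64>,
    words: usize,
    dense_count: usize,
    populated: usize,
    skip_index: Option<Box<BigramFilter>>,
}

impl BigramFilter {
    // Read-only accessors: each one borrows or copies a private field.
    pub fn lookup(&self) -> &[u16] { &self.lookup }
    pub fn dense_data(&self) -> &[u64] { &self.dense_data }
    pub fn words(&self) -> usize { self.words }
    pub fn dense_count(&self) -> usize { self.dense_count }
    pub fn populated(&self) -> usize { self.populated }

    /// Borrow the optional sub-index without exposing the Box.
    pub fn skip_index(&self) -> Option<&BigramFilter> {
        self.skip_index.as_deref()
    }

    /// Rebuild a filter from previously extracted parts.
    /// The parameter list after `dense_data` is assumed, not taken from the PR.
    pub fn from_raw_parts(
        lookup: Vec<u16>,
        dense_data: Vec<u64>,
        words: usize,
        dense_count: usize,
        populated: usize,
        skip_index: Option<BigramFilter>,
    ) -> Self {
        Self {
            lookup,
            dense_data,
            words,
            dense_count,
            populated,
            skip_index: skip_index.map(Box::new),
        }
    }
}

/// Hypothetical helper: copy a filter using only its public surface,
/// recursing into the sub-index when one is present.
pub fn rebuild(filter: &BigramFilter) -> BigramFilter {
    BigramFilter::from_raw_parts(
        filter.lookup().to_vec(),
        filter.dense_data().to_vec(),
        filter.words(),
        filter.dense_count(),
        filter.populated(),
        filter.skip_index().map(rebuild),
    )
}
```

Because rebuild only touches public methods, an external crate could do the same thing with a serialization step between reading the parts and calling from_raw_parts.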
Use case
fff-cli is a standalone CLI built on fff-core that persists the bigram index to disk (.fff/bigram.bin) for fast startup. It needs to read the filter's raw data for serialization and reconstruct it on load. Without these accessors, the only alternative is rebuilding the bigram index from scratch on every invocation (~200ms), negating the persistent index advantage.

Test plan
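As an illustration of the persistence round trip described in the use case, and of the kind of check that could exercise it, the sketch below writes raw parts to a file and reads them back. It is not the PR's test plan or fff-cli's on-disk format: RawParts, the length-prefixed little-endian layout, and the save/load helpers are all hypothetical, and the element types are assumed.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical plain-data snapshot of what the accessors would expose.
#[derive(Debug, Clone, PartialEq)]
pub struct RawParts {
    pub lookup: Vec<u16>,
    pub dense_data: Vec<u64>,
}

/// Encode as: u64 length, u16 items, u64 length, u64 items (little-endian).
pub fn encode(parts: &RawParts) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&(parts.lookup.len() as u64).to_le_bytes());
    for v in &parts.lookup {
        out.extend_from_slice(&v.to_le_bytes());
    }
    out.extend_from_slice(&(parts.dense_data.len() as u64).to_le_bytes());
    for v in &parts.dense_data {
        out.extend_from_slice(&v.to_le_bytes());
    }
    out
}

/// Split `n` bytes off the front of the cursor, or fail if too few remain.
fn take<'a>(cursor: &mut &'a [u8], n: usize) -> Option<&'a [u8]> {
    let data: &'a [u8] = *cursor;
    if data.len() < n {
        return None;
    }
    let (head, tail) = data.split_at(n);
    *cursor = tail;
    Some(head)
}

fn read_u16(cursor: &mut &[u8]) -> Option<u16> {
    let mut buf = [0u8; 2];
    buf.copy_from_slice(take(cursor, 2)?);
    Some(u16::from_le_bytes(buf))
}

fn read_u64(cursor: &mut &[u8]) -> Option<u64> {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(take(cursor, 8)?);
    Some(u64::from_le_bytes(buf))
}

/// Decode the layout written by `encode`; truncated or trailing bytes give None.
pub fn decode(mut bytes: &[u8]) -> Option<RawParts> {
    let n = read_u64(&mut bytes)?;
    let mut lookup = Vec::new();
    for _ in 0..n {
        lookup.push(read_u16(&mut bytes)?);
    }
    let m = read_u64(&mut bytes)?;
    let mut dense_data = Vec::new();
    for _ in 0..m {
        dense_data.push(read_u64(&mut bytes)?);
    }
    if bytes.is_empty() { Some(RawParts { lookup, dense_data }) } else { None }
}

/// Write the encoded parts, creating the parent directory if needed.
pub fn save(path: &Path, parts: &RawParts) -> io::Result<()> {
    if let Some(dir) = path.parent() {
        fs::create_dir_all(dir)?;
    }
    fs::write(path, encode(parts))
}

/// Read the file back, mapping a malformed file to an InvalidData error.
pub fn load(path: &Path) -> io::Result<RawParts> {
    decode(&fs::read(path)?)
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "corrupt bigram index"))
}
```

In a real integration, the values in RawParts would come from the accessors on save and be passed to from_raw_parts on load, so a failed load could fall back to rebuilding the index from scratch.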