Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support customized key compare function #212

Merged
merged 3 commits into from
Nov 6, 2023

Conversation

xiaoyawei
Copy link
Contributor

@xiaoyawei xiaoyawei commented Sep 15, 2023

Pull Request

Related issue

Fixes #86

What does this PR do?

This PR continues the efforts in #90 to allow users to customize key comparator. The implementation approaches stays mostly the same except the following parts

Prefix iterator

heed implements the prefix iterator while lmdb does not actually have features like prefix seek in rocksdb, which means all the logics that handles prefix are implemented in the rust wrapper layer.

After taking a closer look at the API of prefix iterator in heed, looks like currently heed has the following assumptions on the key order

  1. For any key that starts with prefix, we have prefix < key unless key and prefix are identical.
  2. If prefix1 < prefix2, then for any key1 that starts with prefix1 and key2 that starts with prefix2, we have key1 < key2

These 2 assumptions together actually make the key order a lexicographic order, which I have a proof and can provide more details if needed. So on top of the Comparator trait, I also introduce a LexicographicComparator trait, and only when key order is lexicographic can users use the prefix iterator API.

Database type

#86 mentions the attempt not to change the database type. In this PR, since I introduce the LexicographicComparator, I choose to add the comparator type as a generic type in Database's definition, this brings the benefits that compile fails if a user is trying to use prefix iterator with a non-lexicographic comparator.

Why do I need a custom key comparator?

Basically I am trying to implement a persistent key-ordered multimap on a bunch of storage engines. LMDB is faster than rocksdb with relatively small data volume so I'd love to add LMDB as an option for my multimap.

PR checklist

Please check if your PR fulfills the following requirements:

  • Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
  • Have you read the contributing guidelines?
  • Have you made sure that the title is accurate and descriptive of the changes?

@Kerollmops Since you are the author of the original PR and still an active committer in heed, I feel you must the be best person to take a look at this PR. I'd appreciate any feedback!

Thank you so much for contributing to Meilisearch!

@xiaoyawei xiaoyawei force-pushed the custom_comparator branch 2 times, most recently from 778882c to c18fc6a Compare September 16, 2023 02:04
Copy link
Member

@Kerollmops Kerollmops left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for your changes, it seems great and working well. I just have some small change proposals.

heed-traits/src/lib.rs Outdated Show resolved Hide resolved
heed/src/database.rs Outdated Show resolved Hide resolved
heed/src/iterator/range.rs Outdated Show resolved Hide resolved
heed/src/env.rs Outdated Show resolved Hide resolved
heed/src/env.rs Outdated Show resolved Hide resolved
heed/src/env.rs Outdated Show resolved Hide resolved
heed/src/database.rs Outdated Show resolved Hide resolved
heed/examples/custom_comparator.rs Outdated Show resolved Hide resolved
heed-traits/src/lib.rs Outdated Show resolved Hide resolved
@xiaoyawei
Copy link
Contributor Author

@Kerollmops, thank you for your insightful feedback!

I've pushed two additional commits to this PR:

  • The first one (52c0a60) implements bug fixes that emerged from further examination of potential edge cases within this PR. New test cases have been added to confirm these fixes, which interestingly, reveal failures in the current master branch. Therefore, this commit not only enhances the PR but also rectifies existing issues in the main codebase. Details can be found within the test cases themselves.
  • The second commit (711d7c4) incorporates your review comments and suggestions.

I welcome any additional feedback or recommendations you might have. Thank you!

@Kerollmops
Copy link
Member

Kerollmops commented Nov 6, 2023

Thank you very much for your PR and fixes on that one! I am very happy to merge this PR!
I will immediately release a v0.20.0-alpha.5 for you or others to try 🎉

@Kerollmops Kerollmops merged commit e28f4fa into meilisearch:main Nov 6, 2023
8 checks passed
@Kerollmops Kerollmops added the breaking A change that is breaking the semver label Nov 6, 2023
@Kerollmops Kerollmops added this to the v0.20.0 milestone Nov 6, 2023
@xiaoyawei xiaoyawei deleted the custom_comparator branch February 16, 2024 15:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking A change that is breaking the semver
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support for custom key comparison function
2 participants