ZeroTrie types simplification #4408

Open
2 tasks done
sffc opened this issue Dec 5, 2023 · 1 comment
Labels
C-zerovec Component: Yoke, ZeroVec, DataBake
Comments


sffc commented Dec 5, 2023

Currently we have 4 types:

  • ZeroTrieSimpleAscii
    • Supports ASCII only (so that it can be const constructed)
    • Does not use the perfect hash function (better for small data)
    • Does not support span nodes (because ASCII does not need them)
    • Maximum size is 2^32
  • ZeroTriePerfectHash
    • Supports arbitrary bytes, including UTF-8
    • Uses the perfect hash function for branch nodes with at least 16 children
    • Uses span nodes
    • Maximum size is 2^32 (the cap yields a slight performance win)
  • ZeroTrieExtendedCapacity
    • Same as ZeroTriePerfectHash but there is no size limit
  • ZeroTrie
    • An enum over the three above types
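
A rough sketch of what "an enum over the three above types" implies for callers: every operation first matches on the variant at runtime. The names below (`SimpleAsciiSketch`, `ZeroTrieSketch`, `accepts_key`, and so on) are hypothetical stand-ins, not the zerotrie API, and the byte encoding of the store is not modeled.

```rust
// Hypothetical stand-ins for the three concrete trie types.
// Each one just wraps a borrowed byte store; no real trie logic is modeled.
pub struct SimpleAsciiSketch<'a>(pub &'a [u8]);
pub struct PerfectHashSketch<'a>(pub &'a [u8]);
pub struct ExtendedCapacitySketch<'a>(pub &'a [u8]);

// Hypothetical enum over the three types, mirroring the role of `ZeroTrie`.
pub enum ZeroTrieSketch<'a> {
    SimpleAscii(SimpleAsciiSketch<'a>),
    PerfectHash(PerfectHashSketch<'a>),
    ExtendedCapacity(ExtendedCapacitySketch<'a>),
}

impl<'a> ZeroTrieSketch<'a> {
    /// Whether a key is representable in this flavor.
    /// The ASCII flavor only takes ASCII keys; the others take arbitrary bytes.
    pub fn accepts_key(&self, key: &[u8]) -> bool {
        // This match is the runtime switch.
        match self {
            ZeroTrieSketch::SimpleAscii(_) => key.is_ascii(),
            ZeroTrieSketch::PerfectHash(_) | ZeroTrieSketch::ExtendedCapacity(_) => true,
        }
    }

    /// Whether this flavor has a maximum size.
    pub fn has_size_limit(&self) -> bool {
        !matches!(self, ZeroTrieSketch::ExtendedCapacity(_))
    }
}
```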

I think we should consolidate these down to two types:

  • ZeroAsciiTrie == ZeroTrieSimpleAscii
  • ZeroBytesTrie == ZeroTrieExtendedCapacity

The other two types would be removed:

  • ZeroTriePerfectHash, because the ~5% perf benefit isn't worth keeping around a separate type
  • ZeroTrie, because the type of trie should be determined by the programmer based on the nature of the data, not via a runtime switch
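
To illustrate picking the trie type from the nature of the data instead of a runtime switch, here is a minimal sketch. It assumes two hypothetical types, `AsciiTrieSketch` and `BytesTrieSketch`, that simply hold a list of keys; they are not the proposed `ZeroAsciiTrie` / `ZeroBytesTrie` and do not build a real trie. The point is only that the ASCII-only type can validate its input in a `const fn`, so the choice is fixed in the type and checked at compile time.

```rust
/// Hypothetical helper: `true` if every byte of every key is ASCII.
/// Uses `while` loops because `for` loops are not allowed in `const fn`.
pub const fn all_keys_ascii(keys: &[&[u8]]) -> bool {
    let mut i = 0;
    while i < keys.len() {
        let key = keys[i];
        let mut j = 0;
        while j < key.len() {
            if key[j] >= 0x80 {
                return false;
            }
            j += 1;
        }
        i += 1;
    }
    true
}

/// Hypothetical ASCII-only flavor: const-constructible, rejects non-ASCII keys.
pub struct AsciiTrieSketch<'a> {
    keys: &'a [&'a [u8]],
}

impl<'a> AsciiTrieSketch<'a> {
    pub const fn from_keys(keys: &'a [&'a [u8]]) -> Self {
        // In a const context this assertion fails the build, not the program.
        assert!(all_keys_ascii(keys), "AsciiTrieSketch requires ASCII-only keys");
        Self { keys }
    }

    pub const fn key_count(&self) -> usize {
        self.keys.len()
    }
}

/// Hypothetical arbitrary-bytes flavor: no restriction on key contents.
pub struct BytesTrieSketch<'a> {
    keys: &'a [&'a [u8]],
}

impl<'a> BytesTrieSketch<'a> {
    pub fn from_keys(keys: &'a [&'a [u8]]) -> Self {
        Self { keys }
    }

    pub fn key_count(&self) -> usize {
        self.keys.len()
    }
}

// The flavor is chosen by the programmer and validated at compile time.
pub const REGION_CODES: AsciiTrieSketch<'static> =
    AsciiTrieSketch::from_keys(&[b"CA", b"US"]);
```

With this shape, data known to be ASCII uses the const-constructible type, and anything else uses the bytes type; no enum match is needed at lookup time.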

OK?

sffc added the needs-approval (One or more stakeholders need to approve proposal) and C-zerovec (Component: Yoke, ZeroVec, DataBake) labels Dec 5, 2023
sffc added this to the Utilities 1.0 milestone Dec 5, 2023
sffc mentioned this issue Dec 5, 2023
4 tasks

sffc commented Dec 5, 2023

Worth noting that the byte pattern of a ZeroAsciiTrie can be consumed by a ZeroBytesTrie, which means switching from ZeroAsciiTrie to ZeroBytesTrie is forwards-compatible.
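
A minimal sketch of what that compatibility could look like in code, assuming two hypothetical wrapper types (`AsciiTrieView`, `BytesTrieView`) over the same borrowed byte store. These names and the `From` impl are illustrative only, not the zerotrie API, and the actual encoding is not modeled. The conversion is deliberately one-way: per the list above, the bytes flavor can contain span nodes and perfect-hash branch nodes that the ASCII flavor does not support.

```rust
/// Hypothetical view over a store written in the ASCII-only format.
pub struct AsciiTrieView<'a> {
    store: &'a [u8],
}

/// Hypothetical view over a store in the arbitrary-bytes format.
pub struct BytesTrieView<'a> {
    store: &'a [u8],
}

impl<'a> AsciiTrieView<'a> {
    pub const fn from_store(store: &'a [u8]) -> Self {
        Self { store }
    }
}

impl<'a> BytesTrieView<'a> {
    pub const fn as_bytes(&self) -> &'a [u8] {
        self.store
    }
}

// One-way conversion: the same buffer is reinterpreted with no copy or re-encoding.
// No reverse impl is provided in this sketch.
impl<'a> From<AsciiTrieView<'a>> for BytesTrieView<'a> {
    fn from(ascii: AsciiTrieView<'a>) -> Self {
        BytesTrieView { store: ascii.store }
    }
}
```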
