Add OffHeapLongHashSet and utility methods for enhancements #255
Pull request overview
Adds new utility helpers and a new off-heap hash set implementation to support more efficient handling of primitive values and common formatting/hashing operations across the base module.
Changes:
- Introduces `OffHeapLongHashSet` backed by a direct `ByteBuffer` for off-heap storage of primitive `long` values.
- Adds small utility methods: `XHashing.hash(int,int)` and `XArrays.hasContent/hasNoContent(...)` overloads for primitive arrays.
- Enhances `VarString` with `isEmpty()` override and `padLeft(int, ...)` for integer formatting.
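
As a rough illustration of what helpers of this kind usually look like, here is a minimal sketch. The class name `UtilitySketch`, the 31-multiplier combiner, and the `padLeft(int, int, char)` signature are assumptions made for illustration; the PR's actual implementations and signatures are not shown in this conversation.

```java
final class UtilitySketch {

    // Assumed combiner: the classic 31-multiplier mix. The formula used by
    // XHashing.hash(int,int) in the PR is not shown here.
    static int hash(int first, int second) {
        return 31 * first + second;
    }

    // "Has content" is taken to mean: non-null and at least one element.
    static boolean hasContent(long[] array) {
        return array != null && array.length > 0;
    }

    static boolean hasNoContent(long[] array) {
        return !hasContent(array);
    }

    // Left-pads the decimal form of value to at least minLength characters.
    // The sign is not treated specially in this sketch.
    static String padLeft(int value, int minLength, char pad) {
        String digits = Integer.toString(value);
        StringBuilder sb = new StringBuilder(Math.max(minLength, digits.length()));
        for (int i = digits.length(); i < minLength; i++) {
            sb.append(pad);
        }
        return sb.append(digits).toString();
    }
}
```

For example, `UtilitySketch.padLeft(42, 5, '0')` yields `"00042"`, and a value that is already longer than `minLength` is returned unpadded.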
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| base/src/main/java/org/eclipse/serializer/hashing/XHashing.java | Adds a helper to combine two hash components. |
| base/src/main/java/org/eclipse/serializer/collections/XArrays.java | Adds primitive-array “content present” convenience checks. |
| base/src/main/java/org/eclipse/serializer/collections/OffHeapLongHashSet.java | New off-heap primitive long hash set with linear probing and resizing. |
| base/src/main/java/org/eclipse/serializer/chars/VarString.java | Adds isEmpty() override and an int overload for padLeft. |
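
To make the "off-heap, linear probing, resizing" description concrete, the following is a minimal sketch of that general technique. It is not the PR's `OffHeapLongHashSet`: the class name, the 0.5 load limit, the capacity limit, and the use of `Long.hashCode` for slot selection are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;

final class OffHeapLongSetSketch {
    private static final int MAX_CAPACITY = 1 << 27; // assumed limit (1 GiB of slots)

    private ByteBuffer slots;      // direct buffer, 8 bytes per slot, zero-filled on allocation
    private int capacity;          // always a power of two
    private int size;              // number of non-zero values stored in the buffer
    private boolean containsZero;  // 0L marks an empty slot, so zero is tracked separately

    OffHeapLongSetSketch(int initialCapacity) {
        if (initialCapacity < 1 || initialCapacity > MAX_CAPACITY) {
            throw new IllegalArgumentException("capacity out of range: " + initialCapacity);
        }
        int cap = 2;
        while (cap < initialCapacity) {
            cap <<= 1; // round up to the next power of two
        }
        this.capacity = cap;
        this.slots = ByteBuffer.allocateDirect(cap << 3);
    }

    private static int slotOf(long value, int capacity) {
        return Long.hashCode(value) & (capacity - 1);
    }

    // Linear probing: walk forward from the home slot until the value or an empty slot is found.
    private static boolean insert(ByteBuffer buf, int capacity, long value) {
        int mask = capacity - 1;
        for (int i = slotOf(value, capacity); ; i = (i + 1) & mask) {
            long stored = buf.getLong(i << 3);
            if (stored == value) return false;
            if (stored == 0L) {
                buf.putLong(i << 3, value);
                return true;
            }
        }
    }

    boolean add(long value) {
        if (value == 0L) {
            boolean added = !containsZero;
            containsZero = true;
            return added;
        }
        if ((size + 1) * 2 > capacity) {
            resize(capacity * 2); // keep the load at or below 0.5 so probing always terminates
        }
        if (insert(slots, capacity, value)) {
            size++;
            return true;
        }
        return false;
    }

    boolean contains(long value) {
        if (value == 0L) return containsZero;
        int mask = capacity - 1;
        for (int i = slotOf(value, capacity); ; i = (i + 1) & mask) {
            long stored = slots.getLong(i << 3);
            if (stored == value) return true;
            if (stored == 0L) return false;
        }
    }

    private void resize(int newCapacity) {
        if (newCapacity > MAX_CAPACITY) {
            throw new IllegalStateException("capacity limit exceeded");
        }
        ByteBuffer bigger = ByteBuffer.allocateDirect(newCapacity << 3);
        for (int i = 0; i < capacity; i++) {
            long stored = slots.getLong(i << 3);
            if (stored != 0L) insert(bigger, newCapacity, stored); // rehash into the new buffer
        }
        slots = bigger;
        capacity = newCapacity;
    }

    long size() {
        return size + (containsZero ? 1 : 0);
    }
}
```

Because the values live in a direct buffer rather than a `long[]`, the table itself does not occupy the Java heap; only the small wrapper object does.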
- Track 0L separately via containsZero; add/contains/iterate now handle the value correctly instead of treating it as a duplicate sentinel (previous behavior: add(0L) inflated size on every call, contains(0L) always returned false, iterate() skipped it).
- Null-check the chain in contains() to avoid NPE on lookups whose hash slot has never been populated; bail early at the first 0L sentinel in the chain.
- Reset hashRange (and containsZero) in truncate(); leaving the old hashRange caused AIOOBE on the next add/contains after a prior enlargement.
- Replace low-bit masking hash with the bit-mixing function to reduce clustering for high-bit-heavy values.
- Allow rebuild() to shrink; optimize() now passes pow2BoundCapped (size + 1) so an exact power-of-two size no longer triggers an immediate re-enlarge on the next add.
- Widen size field from int to long to match the Sized.size() contract.
- Seed filter() with the current hashSlots.length to avoid log(n) rebuilds while copying from a large source set.
- Skip 0L sentinels during redistributeElements to stop propagating them as spurious elements on rehash.
- Clamp chainGrowthFactor to >= 1.0f so an accidental <1 value doesn't make enlargeChain attempt to shrink.
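
The clustering problem behind the hash change can be shown with a small experiment. This sketch compares plain low-bit masking against a simple xor-fold of the upper 32 bits into the lower 32 bits. The fold is only a stand-in: the commit message does not say which bit-mixing function was adopted, and all names here are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

final class SlotHashingSketch {

    // Low-bit masking: only the lowest bits of the value select the slot.
    static int lowBitSlot(long value, int capacity) {
        return (int) value & (capacity - 1);
    }

    // Assumed mixer: fold the upper half into the lower half before masking.
    static int mixedSlot(long value, int capacity) {
        long folded = value ^ (value >>> 32);
        return (int) folded & (capacity - 1);
    }

    // Counts how many distinct slots the values 1<<32, 2<<32, ..., count<<32 occupy.
    static int distinctSlots(boolean mixed, int count, int capacity) {
        Set<Integer> seen = new HashSet<>();
        for (long k = 1; k <= count; k++) {
            long value = k << 32; // differs only in the high bits
            seen.add(mixed ? mixedSlot(value, capacity) : lowBitSlot(value, capacity));
        }
        return seen.size();
    }
}
```

With a capacity of 1024, sixteen such values all land in slot 0 under low-bit masking, while the folded variant spreads them over sixteen different slots.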
…d` return status
- Restart probing after resize to avoid unreachable slots and ensure correct insertion/containment behavior.
- Modify `checkForRebuild` to return a boolean, signaling if resizing occurred.
- Adjust collision handling and improve comments for better clarity.
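
The reason probing has to restart is that a slot index computed against the old table is meaningless in the rebuilt one. A minimal sketch of the pattern follows, using an on-heap `long[]` for brevity. The class name, the parameterless `checkForRebuild()`, and its load-based trigger are assumptions; the PR's method also takes collision counts into account.

```java
final class RestartProbeSketch {
    private long[] table = new long[4]; // 0L = empty slot; zero handling is omitted here
    private int size;

    // Returns true if the table was rebuilt, so the caller must recompute its probe start.
    private boolean checkForRebuild() {
        if ((size + 1) * 2 <= table.length) {
            return false; // assumed trigger: keep the load at or below 0.5
        }
        long[] bigger = new long[table.length * 2];
        int mask = bigger.length - 1;
        for (long v : table) {
            if (v == 0L) continue; // skip empty markers
            int i = Long.hashCode(v) & mask;
            while (bigger[i] != 0L) {
                i = (i + 1) & mask;
            }
            bigger[i] = v;
        }
        table = bigger;
        return true;
    }

    boolean add(long value) {
        if (value == 0L) {
            throw new IllegalArgumentException("0L is reserved as the empty marker in this sketch");
        }
        int mask = table.length - 1;
        int slot = Long.hashCode(value) & mask;
        while (true) {
            long stored = table[slot];
            if (stored == value) return false;
            if (stored == 0L) {
                if (checkForRebuild()) {
                    // The table was replaced: the old slot index is stale, so probe again.
                    mask = table.length - 1;
                    slot = Long.hashCode(value) & mask;
                    continue;
                }
                table[slot] = value;
                size++;
                return true;
            }
            slot = (slot + 1) & mask;
        }
    }

    boolean contains(long value) {
        int mask = table.length - 1;
        for (int slot = Long.hashCode(value) & mask; table[slot] != 0L; slot = (slot + 1) & mask) {
            if (table[slot] == value) return true;
        }
        return false;
    }

    int size() {
        return size;
    }
}
```

Without the restart, the value would be written at an index chosen for the smaller table, where later lookups that start from the new home slot might never reach it.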
…f-heap memory release.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.
…dling
- Add `MAX_CAPACITY` to constrain the maximum number of slots.
- Throw `IllegalArgumentException` or `CapacityExceededException` when exceeding capacity limits.
- Introduce `ensureOpen` to validate the set's state before operations and throw `IllegalStateException` if closed.
- Update collision handling logic in `checkForRebuild` for clearer and stricter thresholds.
- Improve resizing safety checks and error messaging in `enlarge` and `resize`.
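
The guard pattern described in this commit can be sketched as follows. The class name, the value of `MAX_CAPACITY`, and the `nextCapacity` helper are assumptions for illustration, and `IllegalStateException` stands in for the project's `CapacityExceededException`, whose package and constructor are not shown here.

```java
import java.nio.ByteBuffer;

final class GuardedBufferSketch implements AutoCloseable {
    static final int MAX_CAPACITY = 1 << 27; // assumed limit: 2^27 slots of 8 bytes each

    private ByteBuffer slots; // null once closed
    private final int capacity;

    GuardedBufferSketch(int capacity) {
        // Validate before allocating so an out-of-range request never touches native memory.
        if (capacity < 1 || capacity > MAX_CAPACITY) {
            throw new IllegalArgumentException("capacity must be in [1, " + MAX_CAPACITY + "]: " + capacity);
        }
        this.capacity = capacity;
        this.slots = ByteBuffer.allocateDirect(capacity * Long.BYTES);
    }

    // Every public operation calls this first.
    private void ensureOpen() {
        if (slots == null) {
            throw new IllegalStateException("set is closed");
        }
    }

    int capacity() {
        ensureOpen();
        return capacity;
    }

    // Computes the capacity for a doubling step, refusing to exceed the limit.
    // Stand-in exception type: the PR throws CapacityExceededException here.
    static int nextCapacity(int current) {
        if (current > MAX_CAPACITY / 2) {
            throw new IllegalStateException("cannot grow beyond " + MAX_CAPACITY + " slots");
        }
        return current * 2;
    }

    @Override
    public void close() {
        // Drops the reference only; when the native memory is actually freed is up to the JVM.
        slots = null;
    }
}
```

Checking the limit before multiplying also avoids the `int` overflow that `current * 2` would otherwise hit near the top of the range.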
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
…invariants and prevent resizing issues.
…LongHashSet`
- Add validation to prevent capacity exceeding `MAX_CAPACITY` during power-of-two rounding.
- Revise probing loop to ensure termination based on table capacity rather than collision limits.
- Improve comments for clarity on collision handling and probing behavior.
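
Both points can be illustrated in a few lines. This is a sketch under assumptions: the names `pow2Capacity` and `findSlot`, the value of `MAX_CAPACITY`, and the on-heap `long[]` table are hypothetical and chosen only to keep the example short.

```java
final class CapacityRoundingSketch {
    static final int MAX_CAPACITY = 1 << 27; // assumed limit; must itself be a power of two

    // Rounds up to the next power of two, rejecting requests the limit cannot satisfy.
    static int pow2Capacity(int requested) {
        if (requested < 1 || requested > MAX_CAPACITY) {
            throw new IllegalArgumentException("requested capacity out of range: " + requested);
        }
        int cap = 1;
        while (cap < requested) {
            cap <<= 1; // cannot overflow because requested <= MAX_CAPACITY
        }
        return cap;
    }

    // Probes at most table.length slots, so the loop ends even when the table is completely full.
    // Returns the slot holding value, the first empty slot (0L), or -1 if neither exists.
    static int findSlot(long[] table, long value) {
        int mask = table.length - 1; // table.length is assumed to be a power of two
        int slot = Long.hashCode(value) & mask;
        for (int probes = 0; probes < table.length; probes++) {
            long stored = table[slot];
            if (stored == value || stored == 0L) {
                return slot;
            }
            slot = (slot + 1) & mask;
        }
        return -1;
    }
}
```

Bounding the loop by the table size rather than by a collision threshold means a full table yields a definite "not found" result instead of an endless scan.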
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
- Add `ensureOpen` checks in `size`, `capacity`, and `currentLoad` methods to throw `IllegalStateException` if the set is closed.
Summary
This pull request introduces the following changes:
- `OffHeapLongHashSet` for efficient storage and retrieval of long values.
- `isEmpty` override and `padLeft` utility method for better usability.
- `Set_long`