Add MaxHeap #5076

Closed
wants to merge 1 commit into from
162 changes: 162 additions & 0 deletions contracts/utils/structs/MaxHeap.sol
@@ -0,0 +1,162 @@
// SPDX-License-Identifier: MIT

pragma solidity ^0.8.20;

/**
* TODO:
* - optimizations
* - changeset
* - add tests
* - add docs
* - add base impl link here
*
* @dev Library for managing a max heap data structure.
*
* A max heap is a complete binary tree where each node has a value greater than or equal to its children.
* The root node contains the maximum value in the heap.
*
* This library provides functions to insert and update elements, to remove the maximum element (pop), and to
* read the maximum element without removing it (peek).
*
* The operations have the following costs, where n is the number of elements:
*
* - Insertion: O(log n)
* - Deletion of maximum element: O(log n)
* - Retrieval of maximum element (peek): O(1)
* - Update of an element: O(log n)
*
* The max heap is implemented using two mappings:
* - `tree`: Maps the position in the heap to the item ID.
* - `items`: Maps the item ID to its corresponding `Node` struct, which contains the value and heap index.
*
* Example usage:
*
* ```solidity
* contract Example {
* using MaxHeap for MaxHeap.MaxHeap;
*
* MaxHeap.MaxHeap private heap;
*
* function addItem(uint256 itemId, uint256 value) public {
* heap.insert(itemId, value);
* }
*
* function removeMax() public returns (uint256, uint256) {
* return heap.pop();
* }
*
* function getMax() public view returns (uint256, uint256) {
* return heap.peek();
* }
* }
* ```
*/
library MaxHeap {
/**
* @dev The position doesn't have a _parent as it's the root.
*/
error InvalidPositionZero();

struct Node {
uint256 value;
uint256 heapIndex;
}
Comment on lines +60 to +63
Collaborator

Maybe we can do something like:

  • assume the heapIndex fits into something smaller. I don't expect any onchain structure to have more than type(uint32).max elements. Maybe even type(uint16).max is enough (about 65k elements).
  • limit the value to uint224

so that Node fits into a single slot.
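
For illustration only, a minimal sketch of the packed layout described in this comment. The library name `PackedNodeSketch` and the helper `toNode` are hypothetical and not part of this PR; the widths follow the uint224/uint32 option, which adds up to exactly 256 bits.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Illustrative only: hypothetical names, not part of this PR.
library PackedNodeSketch {
    // 224 + 32 = 256 bits, so both fields share one storage slot.
    struct Node {
        uint224 value; // capped at type(uint224).max
        uint32 heapIndex; // supports at most type(uint32).max positions
    }

    // Builds a node, reverting instead of silently truncating either field.
    function toNode(uint256 value, uint256 heapIndex) internal pure returns (Node memory) {
        require(value <= type(uint224).max, "value overflows uint224");
        require(heapIndex <= type(uint32).max, "index overflows uint32");
        return Node({value: uint224(value), heapIndex: uint32(heapIndex)});
    }
}
```

With this layout, writing `value` and `heapIndex` together touches one slot instead of two, at the cost of the range checks above.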


struct MaxHeap {
Collaborator

Random thought: maybe we want a Heap, instead of a MaxHeap, with a function pointer for the comparator.
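
A rough sketch of what a comparator-driven heap could look like, assuming an internal function pointer is passed on each call. All names here (`ComparatorSketch`, `lt`, `gt`, `siftDown`) are hypothetical, and the sketch works on a plain in-memory array of values rather than on this PR's `tree`/`items` mappings.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Illustrative only: hypothetical names, not part of this PR.
library ComparatorSketch {
    function lt(uint256 a, uint256 b) internal pure returns (bool) {
        return a < b;
    }

    function gt(uint256 a, uint256 b) internal pure returns (bool) {
        return a > b;
    }

    // Restores heap order below `pos`. `comp(a, b)` returns true when `a` must sit above `b`:
    // passing `gt` yields a max heap, passing `lt` a min heap.
    function siftDown(
        uint256[] memory data,
        uint256 pos,
        function(uint256, uint256) pure returns (bool) comp
    ) internal pure {
        while (true) {
            uint256 best = pos;
            uint256 left = 2 * pos + 1;
            uint256 right = left + 1;
            if (left < data.length && comp(data[left], data[best])) best = left;
            if (right < data.length && comp(data[right], data[best])) best = right;
            if (best == pos) return; // heap order already holds at `pos`
            (data[pos], data[best]) = (data[best], data[pos]);
            pos = best;
        }
    }
}
```

The same `siftDown` body then serves both orderings; only the comparator argument changes.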

mapping(uint256 => uint256) tree;
mapping(uint256 => Node) items;
Comment on lines +66 to +67
Collaborator

I expect one of these mappings (maybe both) can be implemented using an array. That would result in better data locality when we move to Verkle trees.

Collaborator

Also, using an array with an element type smaller than uint256 (for tree) would compress the data, since several elements would share a slot.
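
As a sketch of the array idea, assuming item IDs fit in uint32 (an assumption this PR does not make), `tree` could become a dynamic array whose length replaces `size`. The names `ArrayTreeSketch`, `Tree`, `append`, and `removeLast` are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Illustrative only: hypothetical names, and item IDs are assumed to fit in uint32.
library ArrayTreeSketch {
    struct Tree {
        // Eight uint32 entries share one 32-byte storage slot; `ids.length` replaces `size`.
        uint32[] ids;
    }

    // Appends an item ID at the next free heap position and returns that position.
    function append(Tree storage self, uint32 itemId) internal returns (uint256 pos) {
        pos = self.ids.length;
        self.ids.push(itemId);
    }

    // Removes the last heap position and returns the item ID it held.
    // Reverts with an arithmetic panic when the array is empty.
    function removeLast(Tree storage self) internal returns (uint32 itemId) {
        itemId = self.ids[self.ids.length - 1];
        self.ids.pop();
    }
}
```

Neighbouring heap positions then live in the same or adjacent slots, which is the locality benefit mentioned above.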

uint256 size;
}

function _parent(uint256 pos) private pure returns (uint256) {
if (pos == 0) revert InvalidPositionZero();
return (pos - 1) / 2;
}

function _swap(MaxHeap storage heap, uint256 fpos, uint256 spos) private {
(heap.tree[fpos], heap.tree[spos]) = (heap.tree[spos], heap.tree[fpos]);
(heap.items[heap.tree[fpos]].heapIndex, heap.items[heap.tree[spos]].heapIndex) = (fpos, spos);
Comment on lines +77 to +78
Collaborator

Suggested change
(heap.tree[fpos], heap.tree[spos]) = (heap.tree[spos], heap.tree[fpos]);
(heap.items[heap.tree[fpos]].heapIndex, heap.items[heap.tree[spos]].heapIndex) = (fpos, spos);
(
heap.tree[fpos], heap.items[heap.tree[fpos]].heapIndex,
heap.tree[spos], heap.items[heap.tree[spos]].heapIndex
) = (
heap.tree[spos], spos,
heap.tree[fpos], fpos
);

}

function heapify(MaxHeap storage heap, uint256 pos) internal {
if (pos >= (heap.size / 2) && pos <= heap.size) return;
Collaborator

@Amxx Amxx Jun 12, 2024

I think pos <= heap.size should be pos < heap.size.
Also, pos >= heap.size should be an error if this is a user call, but that would require changes to the recursive calls.
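
One possible shape for that split, as a sketch only: a checked entry point that reverts on out-of-range positions, and a private sift-down loop that needs no bounds error of its own. The error `PositionOutOfBounds` and the library name are hypothetical, and for brevity the sketch stores values directly in a storage array instead of using this PR's `tree`/`items` mappings.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Illustrative only: hypothetical names, values stored directly in the array.
library HeapifyBoundsSketch {
    error PositionOutOfBounds(uint256 pos, uint256 size);

    // Entry point for callers: an out-of-range position is an error.
    function heapify(uint256[] storage tree, uint256 pos) internal {
        if (pos >= tree.length) revert PositionOutOfBounds(pos, tree.length);
        _siftDown(tree, pos);
    }

    // Iterative sift-down; `pos` is known to be in range when this is reached.
    function _siftDown(uint256[] storage tree, uint256 pos) private {
        uint256 size = tree.length;
        while (true) {
            uint256 left = 2 * pos + 1;
            if (left >= size) return; // `pos` is a leaf
            uint256 right = left + 1;
            uint256 largest = tree[left] > tree[pos] ? left : pos;
            if (right < size && tree[right] > tree[largest]) largest = right;
            if (largest == pos) return; // heap order already holds
            (tree[pos], tree[largest]) = (tree[largest], tree[pos]);
            pos = largest;
        }
    }
}
```

Using a loop instead of recursion also sidesteps the question of how the recursive calls should treat the bounds check.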


uint256 left = 2 * pos + 1;
uint256 right = left + 1;

uint256 leftValue = left < heap.size ? heap.items[heap.tree[left]].value : 0;
uint256 rightValue = right < heap.size ? heap.items[heap.tree[right]].value : 0;
uint256 posValue = heap.items[heap.tree[pos]].value;

if (posValue < leftValue || posValue < rightValue) {
if (leftValue > rightValue) {
_swap(heap, pos, left);
heapify(heap, left);
} else {
_swap(heap, pos, right);
heapify(heap, right);
}
}
}

function insert(MaxHeap storage heap, uint256 itemId, uint256 value) internal {
Collaborator

My understanding is that itemId must be unique, so that all items are clearly identified ... but it is then useless to the user. Maybe it should be automatically generated, so that the user doesn't have to worry about it and can just push a value.

heap.tree[heap.size] = itemId;
heap.items[itemId] = Node({value: value, heapIndex: heap.size});

uint256 current = heap.size;

// Sift up. `_parent` reverts at position 0, so the parent is only computed while `current != 0`.
while (current != 0) {
uint256 parentOfCurrent = _parent(current);
if (heap.items[heap.tree[current]].value <= heap.items[heap.tree[parentOfCurrent]].value) break;
_swap(heap, current, parentOfCurrent);
current = parentOfCurrent;
}
heap.size++;
}

function update(MaxHeap storage heap, uint256 itemId, uint256 newValue) internal {
// Check that itemId exists in heap
// TODO: update return with revert?
if (heap.items[itemId].heapIndex >= heap.size) return;

uint256 position = heap.items[itemId].heapIndex;
uint256 oldValue = heap.items[itemId].value;

heap.items[itemId].value = newValue;

if (newValue > oldValue) {
while (
position != 0 && heap.items[heap.tree[position]].value > heap.items[heap.tree[_parent(position)]].value
) {
uint256 parentOfPosition = _parent(position);
_swap(heap, position, parentOfPosition);
position = parentOfPosition;
}
} else heapify(heap, position);
}

function pop(MaxHeap storage heap) internal returns (uint256, uint256) {
// TODO: should it revert if empty?

uint256 popped = heap.tree[0];
uint256 returnValue = heap.items[popped].value;

delete heap.items[popped];

heap.tree[0] = heap.tree[--heap.size];

heap.items[heap.tree[0]].heapIndex = 0;

delete heap.tree[heap.size];

heapify(heap, 0);

return (popped, returnValue);
}

function peek(MaxHeap storage heap) internal view returns (uint256, uint256) {
// TODO: should it revert if empty?
return (heap.tree[0], heap.items[heap.tree[0]].value);
}
}