From 000b8db576df7c2592ddaf2bd754e70a3a8b0aad Mon Sep 17 00:00:00 2001 From: mauve Date: Tue, 22 Nov 2022 22:15:45 -0500 Subject: [PATCH 01/20] Added initial prolly tree ADL Spec --- .../advanced-data-layouts/prollytree/spec.md | 467 ++++++++++++++++++ 1 file changed, 467 insertions(+) create mode 100644 specs/advanced-data-layouts/prollytree/spec.md diff --git a/specs/advanced-data-layouts/prollytree/spec.md b/specs/advanced-data-layouts/prollytree/spec.md new file mode 100644 index 00000000..156f980c --- /dev/null +++ b/specs/advanced-data-layouts/prollytree/spec.md @@ -0,0 +1,467 @@ +# IPLD Prolly Trees Specification + +## Introduction + +Many applications have been using IPLD to represent large datasets in use cases such as blockchains or decentralized databases. +One side effect is that querying large amounts of data has become a more common need which often gets solved with centralized database indexes. +IPLD Prolly Trees are an important step in this direction in that they provide an interface similar to a database's B+ tree. +Some of the immediate benefits of using Prolly Trees over regular B+ Trees are that Prolly Trees are self-balancing and can be deterministically merged with other trees, which is important to reduce the amount of restructuring necessary for collaborative index creation. +In this document we will build off of prior art in [Probabilistic B-Trees](https://github.com/attic-labs/noms/blob/master/doc/intro.md#prolly-trees-probabilistic-b-trees) to define a specification for how they can be represented in IPLD, how they can be constructed, how they can be queried, and how multiple trees can be merged together. + +## References + +[DoltDB Architecture](https://github.com/dolthub/docs/tree/gitbook-publish/content/architecture) +The best deep dive into how the Dolt storage engine works is a series of blog posts by Aaron Son. 
+[How Dolt Stores Table Data](https://www.dolthub.com/blog/2020-04-01-how-dolt-stores-table-data/) + +[The Dolt Commit Graph and Structural Sharing](https://www.dolthub.com/blog/2020-05-13-dolt-commit-graph-and-structural-sharing/) + +[Efficient Diff on Prolly Trees](https://www.dolthub.com/blog/2020-06-16-efficient-diff-on-prolly-trees/) + +[Cell-level Three-way Merge in Dolt](https://www.dolthub.com/blog/2020-07-15-three-way-merge/) + +[Dolt Implementation Notes — Push And Pull On a Merkle DAG](https://www.dolthub.com/blog/2020-09-09-push-pull-on-a-merkle-dag/) + +[mikeal/IPSQL: InterPlanetary SQL](https://github.com/mikeal/IPSQL) +[mikeal/prolly-trees: Hash consistent search trees.](https://github.com/mikeal/prolly-trees) +[mikeal/matrika: Next Generation Decentralized Database](https://github.com/mikeal/matrika) +[mikeal/ipfs-sqlite: SQL on IPFS](https://github.com/mikeal/ipfs-sqlite) + +[Merkle Search Trees: Efficient State-Based CRDTs in Open Networks (Scientific Article)](https://hal.inria.fr/hal-02303490) + - [simulation](https://gitlab.inria.fr/aauvolat/mst_exp/) + +## Summary/Overview + +Prolly trees leverage content addressability to create ordered search indexes similar to the common B+ tree structure used for databases, but with the added ability to deterministically merge trees together. + +At the highest level, Prolly Trees act as a key-value store with the ability to iterate over key ranges which are alphanumerically sorted. This property of sorted iteration is important for creating database indexes for a variety of use cases like full text search, sorting of large datasets, and arbitrary search queries. + +### Search Tree + +The basic structure is that of an ordered search tree: The contained keys are organised such that they can be found (inserted, updated, ...) efficiently by value. 
+ +*TODO Insert sketch of search tree* + +To efficiently find keys, the tree is traversed top-to-bottom and the non-leaf nodes help navigating/comparing the values efficiently. An intermediate node contains several ordered key-address pairs, which link to further nodes (intermediate or leaf) on the next lower level. + +Levels go from `0` representing Leaf Nodes, and go up for each level in the tree. The root of the tree will have the highest level in the tree and can give an estimate of it's overall size. + +Leaf nodes contain the actual Key-Value pairs for the tree which can be iterated over as part of the overall tree iteration. + + +### Chunking + +Chunking is the strategy of determining chunk boundaries: Given a list of key-value pairs, it 'decides' which are still inside node A and which already go to the next node B on the same level. +It depends on the hash/address/node of the items and a 'chunking factor' for tuning the shape of the tree. +This chunking factor determines the average size of nodes/chunks that a node on a higher level contains. +This average size in turn controls the shape of the tree (broad/narrow). +The shape in turn defines the performance of operations on the tree. + +### Note on Multiple and Single Values and Sets + +The described tree can represent a data structure with multiple, single or no values per key. However, given that IPLD Maps (which a Prolly Tree loosely maps to) only allow one value for a key, implementations should merge duplicate keys into a single value. This is also important to have consistent ordering of key value pairs. + +### Re-Calculate + +In order to modify the tree (inserting, updating, or removing one or multiple values) a part of the tree needs to be re-calculated. This section describes an internal procedure that is not exposed to the caller. +The tree is (partially) re-calculated bottom-up. 
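The chunking and bottom-up calculation described so far can be illustrated with a small Python sketch. It is only a sketch under stated assumptions, not the spec's algorithm: the boundary rule (low bits of a SHA-256 digest of the key), the minimum of two entries per node, and the names `is_boundary`, `chunk`, `address` and `build_levels` are all hypothetical, nodes are plain lists rather than IPLD blocks, and addresses are hex digests rather than CIDs.

```python
import hashlib

def is_boundary(key: bytes, chunking_factor: int) -> bool:
    """Assumed rule: the low `chunking_factor` bits of SHA-256(key) are all zero."""
    digest = int.from_bytes(hashlib.sha256(key).digest(), "big")
    return digest & ((1 << chunking_factor) - 1) == 0

def chunk(entries, chunking_factor, min_size=2):
    """Split sorted (key, payload) entries into nodes; a boundary key closes its node.

    Requiring at least `min_size` entries before a boundary is honoured
    guarantees that every level has fewer nodes than entries.
    """
    nodes, current = [], []
    for key, payload in entries:
        current.append((key, payload))
        if len(current) >= min_size and is_boundary(key, chunking_factor):
            nodes.append(current)
            current = []
    if current:
        nodes.append(current)
    return nodes

def address(node) -> str:
    """Stand-in for a CID: a hash over the node's length-prefixed contents."""
    h = hashlib.sha256()
    for key, payload in node:
        data = payload if isinstance(payload, bytes) else payload.encode()
        h.update(len(key).to_bytes(4, "big") + key)
        h.update(len(data).to_bytes(4, "big") + data)
    return h.hexdigest()

def build_levels(pairs, chunking_factor=2):
    """Return all levels: levels[0] holds the leaf nodes, levels[-1] the single root."""
    entries = sorted(pairs)
    levels = []
    while True:
        nodes = chunk(entries, chunking_factor)
        levels.append(nodes)
        if len(nodes) <= 1:
            return levels
        # Each parent entry is (first key of the child, address of the child).
        entries = [(node[0][0], address(node)) for node in nodes]
```

Because boundaries depend only on the keys, building from the same pairs in any insertion order yields the same nodes, which is the property that makes trees comparable and mergeable.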
+ +Starting at the inserted/modified leaf or node referencing a now deleted/removed leaf, successively walk up the tree. If a removed item was the only item in its node: Remove the node from its parent and check if the hash/address/CID of a new leaf/node splits the parent chunk (split the parent node if yes). + Then, check if a removed leaf/node was a 'splitting node' and the nodes need to be merged (only the last node in a chunk can be a splitting node). If the splitting criterion holds for the last item of the node and the current node is succeeded by another node on the same level: merge the current node with the succeeding node. Continue walking up the path. If at root and it is being split: Create a new root that links to the freshly split nodes. Return the new root CID. + +![](https://i.imgur.com/xE0id0V.png) +![](https://i.imgur.com/aYqzXZI.png) +![](https://i.imgur.com/nuvInkH.png) +![](https://i.imgur.com/zACS8gS.png) + +Note that the boundary algorithm does not depend on the node CID: the node CID is only obtained after the boundary is generated and the node is saved in the blockstore. Maybe in the future we can combine the boundary algorithm and the CID. + +### Put/Remove/Update + +First search whether the tree already contains the key. +If the key is not contained, create a new leaf for the value, insert the key-value (key-leaf) pair in the correct node and re-calculate the tree. +If the key is already stored in the tree and the value is the same, the tree is left unchanged. +If the key exists and the values differ: Either leave the tree or insert the new value and update the tree. +If removing the key, remove the `key` and the `value` from the list. After making any modifications to the tree, run the [Re-Calculate](#Re-Calculate) algorithm. 
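The put/remove rules above can be illustrated on a single sorted list of key-value pairs, before any re-calculation takes place. This is a minimal Python sketch rather than the spec's procedure: the function names `put` and `remove` and the returned `changed` flag are assumptions made for the example. A real implementation would operate on the leaf node found by the search and would run the re-calculation step whenever `changed` is true.

```python
from bisect import bisect_left

def put(entries, key: bytes, value: bytes):
    """Insert or update `key` in a sorted list of (key, value) pairs.

    Returns (new_entries, changed); `changed` is False when the same
    key and value are already present, so the tree can be left as-is.
    """
    keys = [k for k, _ in entries]
    i = bisect_left(keys, key)
    if i < len(entries) and entries[i][0] == key:
        if entries[i][1] == value:
            return entries, False
        # Same key, different value: replace the value in place.
        return entries[:i] + [(key, value)] + entries[i + 1:], True
    # New key: insert at the position that keeps the list sorted.
    return entries[:i] + [(key, value)] + entries[i:], True

def remove(entries, key: bytes):
    """Remove `key` if present; returns (new_entries, changed)."""
    keys = [k for k, _ in entries]
    i = bisect_left(keys, key)
    if i < len(entries) and entries[i][0] == key:
        return entries[:i] + entries[i + 1:], True
    return entries, False
```

Returning a new list instead of mutating the input mirrors the way content-addressed nodes are never modified in place: a change produces a new node with a new address.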
+ +## Structure + +[IPLD Schema](https://github.com/kenlabs/ptree-bs/blob/main/pkg/prolly/tree/schema.ipldsch) + +``` +type ProllyNode struct { + config &ChunkConfig + level Int + keys [Bytes] + links nullable [&ProllyNode] + values nullable [Any] +} representation tuple + +type WeibullThresholdConfig struct { + K Float + L Float +} representation tuple + +type RollingHashConfig struct { + rollingHashWindow Int +} representation tuple + +type PrefixThresholdConfig struct { + chunkingFactor Int +} + +type ChunkStrategy enum { + | PrefixThreshold + | WeibullThreshold + | RollingHash +} representation string + +type ChunkConfig struct { + chunkStrategy ChunkStrategy + minChunkSize Int + maxChunkSize Int + prefix nullable PrefixThresholdConfig + weibull nullable WeibullThresholdConfig + rollingHash nullable RollingHashConfig +} +``` + +### `ProllyNode.keys` + +Raw keys (as supplied by users) for leaf nodes. Key-value pairs are sorted by byte value with the "larger" keys being at the end. Keys are compared starting at the first byte and continuing to the end. This means that keys that are just a prefix come before keys that are prefix + 1 byte. + +### `ProllyNode.values` + +Raw values for leaf nodes. For branch nodes, it is null. Values can point to arbitrary IPLD nodes and it is up to applications to generate and process them. + +### `ProllyNode.links` + +Null for leaf nodes. For branch nodes, it holds the CIDs of the child nodes. The index in the array corresponds to the index in the `keys` array for the corresponding key. Keys are used for searching, and links are used for traversing down towards the leaves. + +### `ProllyNode.level` + +0 for leaf nodes, incremented by 1 for each parent level above them. + +### `ProllyNode.config` + +Link to the info about the chunking strategy for how the prolly tree is built. Trees must always be mutated with the given strategy. 
Storing the config in prolly nodes means that any parent inside a tree can be passed around as a new root, and it makes it easy to check whether two trees are compatible for being "merged" together or used for comparisons. + +### `ChunkStrategy` + +The enum for the types of chunking strategies that are part of the spec. This is set to be a string so that the number of strategies can grow over time. + +### `ChunkConfig` + +Chunk config for the prolly tree. It includes some global settings, the chosen splitter method, and the configuration specific to that splitter. + +### `ChunkConfig.minChunkSize` + +The minimum size a chunk should be before considering the boundary function. + +TODO: Should this be in number of keys or bytes of keys+values or block size? Maybe we can get away with number of keys? + +### `ChunkConfig.maxChunkSize` + +The maximum size a chunk can reach before it needs to be split regardless of the chunk boundaries. + +### `ChunkConfig.chunkStrategy` + +The string representing the type of strategy to use. One of `PrefixThreshold`, `WeibullThreshold`, or `RollingHash`. + +### `PrefixThresholdConfig` + +Config for the `PrefixThreshold` chunking strategy. + +This is the original strategy that was described in the Merkle Search Tree paper and has a `chunkingFactor` which represents the number of bits in the key which need to be `0` in order to set another boundary. + +Higher values make boundaries rarer and therefore result in wider nodes. + +### `WeibullThresholdConfig` + +Config for the `WeibullThreshold` chunking strategy. + +Makes chunk boundary decisions on the hash of the key of a []byte pair and tries to create chunks that have an average number of []byte pairs, rather than an average number of Bytes. However, because the target number of []byte pairs is computed directly from the chunk size and count, the practical difference in the distribution of chunk sizes is minimal. It uses a dynamic threshold modeled on a Weibull distribution (https://en.wikipedia.org/wiki/Weibull_distribution). 
As the size of the current chunk increases, it becomes easier to pass the threshold, reducing the likelihood of forming very large or very small chunks. + +The `K` value represents the Shape Parameter, and `L` represents `λ` - the scaling factor. + +### `RollingHashConfig` + + rollingHashSplitter is a nodeSplitter that makes chunk boundary decisions using a rolling value hasher that processes Item pairs in a byte-wise fashion. + +rollingHashSplitter uses a dynamic hash pattern designed to constrain the chunk size distribution by reducing the likelihood of forming very large or very small chunks. As the size of the current chunk grows, rollingHashSplitter changes the target pattern to make it easier to match. The result is a chunk size distribution that is closer to a binomial distribution, rather than geometric. + +## Algorithm in detail + +Left to the implementation is how to generate and load data from IPLD. +Whenever a Link is resolved to a ProllyNode or ChunkConfig, this is done via the IPLD LinkSystem, which is an implementation detail outside of the scope of this document. + +When serializing data into a CID+Block, one should use the codec and multihash that are either used by the root CID of the tree you are modifying, or specified explicitly during tree creation. Blocks should also be saved to the IPLD LinkSystem and how this is done is also outside the scope of this document. We will assume that there is a `getNode(cid)` API which loads IPLD Nodes from a CID, and a `saveNode(node) => {cid, bytes}` which will save a node and get back a CID and the byte contents used for generating the CID. Note that `saveNode` should use the same encoding and hash algorithm as the root of the tree. + +### IsLeaf(ProllyNode) : Boolean + +1. get the `level` of the `ProllyNode` +2. if the `level` is `0` return `true` +3. else return `false` + +### CursorIsValid(Cursor) : Boolean + +1. get the `length` of `cursor.node.keys` +2. if `length` is `0`, return `false` +3.
if `cursor.index` is less than 0, return `false` +4. if `cursor.index` is greater than or equal to `length`, return `false`; otherwise return `true` + +### KeyIndex(ProllyNode, item) : Integer + +Use a [binary search](https://en.wikipedia.org/wiki/Binary_search_algorithm) or equivalent to find the index in the `keys` array of the key which is "closest" to but not "larger than" the `item`, i.e. the largest key that is less than or equal to the `item`. Return the index. + +### CursorAtItem(ProllyNode, item): Cursor + +Get the closest cursor to a byte prefix + +1. Define a `Cursor` struct with a `index Integer`, `node ProllyNode`, and `parent Cursor` +2. Set `cursor.node` to the `ProllyNode` +3. Set `cursor.index` to `KeyIndex(cursor.node, item)` +4. Start a loop + 1. if `IsLeaf(cursor.node)` is `true`, break the loop + 2. get the `link` from `CursorGetLink(cursor)` + 3. resolve the ProllyNode at the `link` to `newNode` + 4. Set `parent` to `cursor` + 5. set `cursor` to a new Cursor struct + 6. set `cursor.parent` to `parent` + 7. set `cursor.node` to `newNode` + 8. set `cursor.index` to `KeyIndex(cursor.node, item)` +5. return the `cursor` + +### AdvanceCursor(Cursor) : Cursor + +1. Get the `length` of the `cursor.node.keys` +2. if `cursor.index` is less than `length - 1`, increment `cursor.index` and return the `cursor` +3. If `cursor.parent` is `null` + 1. set `cursor.index` to `length` + 2. return the `cursor` +3. Invoke `AdvanceCursor(cursor.parent)` +4. If `CursorIsValid(cursor.parent)` is `false` + 5. set `cursor.index` to `length` + 6. return the `cursor` +6. Get the `link` from `CursorGetLink(cursor.parent)` +7. Check that `link` is not `null`, throw an error if it is +8. Get the `node` ProllyNode from `link` +9. Set `cursor.node` to `node` +10. Set `cursor.index` to `0` and return the `cursor` + +### CursorGetKey(Cursor) : key? + +1. If `CursorIsValid(Cursor)` is `false`, return `null` (or throw an error) +2. Get the `key` from `cursor.node.keys` at `index` +3. return the `key` + +### CursorGetValue(Cursor) : value? + +1.
If `IsLeaf(cursor.node)` is `false`, return `null` (or throw an error) +2. If `CursorIsValid(Cursor)` is `false`, return `null` (or throw an error) +3. Get the `value` from `cursor.node.values` at `cursor.index` +4. return the `value` + +### CursorGetLink(Cursor) : &ProllyNode + +1. If `IsLeaf(cursor.node)` is `true`, return `null` (or throw an error) +2. If `CursorIsValid(Cursor)` is `false`, return `null` (or throw an error) +3. Get the `link` from `cursor.node.links` at `cursor.index` +4. return the `link` + +### Get(ProllyNode, key): value + +1. Get a `cursor` from `CursorAtItem(ProllyNode, key)` +2. If `CursorIsValid(cursor)` is `false`, return an error (key not found) +3. Get `currentKey` from `CursorGetKey(cursor)` +4. If `currentKey` is not bytewise equal to `key`, return an error (key not found) +5. Get `value` from `CursorGetValue(cursor)` +6. Return `value` + +### weibullCDF(x, K Float, L Float) : Float + +Note: CDF (Cumulative Distribution Function) of the [Weibull probability distribution](https://en.wikipedia.org/wiki/Weibull_distribution) + - return `1 - exp(-pow(x/L, K))` + +### WeibullThreshold(ProllyNode, key, value) : Boolean + +TODO: cleanup + +WeibullThreshold returns true if we should split at |hash| for a given record inserted into a chunk of size |Size|, where the record's size is |thisSize|. |Size| is the size of the chunk after the record is inserted, so includes |thisSize| in it. + +WeibullThreshold attempts to form chunks whose sizes match the Weibull distribution. + +The logic is as follows: given that we haven't split on any of the records up to |Size - thisSize|, the probability that we should split on this record is (CDF(end) - CDF(start)) / (1 - CDF(start)), or, the percentage of the remaining portion of the CDF that this record actually covers. We split if |hash|, treated as a uniform random number between [0,1), is less than this percentage. + +0. Get the `{bytes, cid}` from `saveNode(value)` +1.
`hash = hash(key + saveNode(value).cid)` *TODO: hash needs to be well-defined* +2. `itemSize = len(key) + len(value)` +3. Set `prevItemsSize` to the sum of the length of all the node's previous items (excluding the current one) +4. `start = weibullCDF(prevItemsSize, K, L)` +5. `end = weibullCDF(prevItemsSize + itemSize, K, L)` +6. `p = float(hash)/maxUInt32` +7. `d = 1 - start` +8. If `d <= 0`: return `true` (this should realistically never occur?) +9. `target = (end - start)/d` +10. return `p < target` + +### HashByteArray([]byte, offset) : Boolean + +*TODO: Description* +*TODO: current implementation applies the min/maxChunkSize to offset. Do we want this?* +- For each `byte`: + 1. Increment `offset` by 1 + 2. Feed `byte` to rolling hash *TODO Details for hashing, salting* + 3. Set `hash` to hashSum of rolling hash + 4. `pattern = (1<<(15 - (offset >> 10))) - 1` + 5. If `hash&pattern == pattern` return `true` +- Return `false` + +### RollingHash(ProllyNode, index) : Boolean + +*TODO: Rolling hash window* +1. Set `offset` to the number of bytes of the elements of previous keys and values +2. If `HashByteArray(key, offset) == true`: return `true` +3. Add number of bytes of key to `offset` +4. If `HashByteArray(value, offset) == true`: return `true` +5. return `false` + +### ShouldCreateBoundry(ProllyNode, key) : Boolean + +If `true`, a new ProllyNode should be created for all subsequent keys. + +1. Get the `ChunkConfig` by resolving the link in `node.config` +2. Get the `ChunkStrategy` from the `ChunkConfig` +3. Get the `length` from the `node.keys` +4. If `length` is less than `ChunkConfig.minChunkSize` return `false` +5. If `length` is equal to `ChunkConfig.maxChunkSize` return `true` +6. If the `ChunkStrategy` is `WeibullThreshold` + - return WeibullThreshold(key) +7. If the `ChunkStrategy` is `RollingHash` + - return RollingHash(ProllyNode, index) +8.
Return an error (unsupported chunk strategy) + +### RebalanceTree(Cursor) : ProllyNode + +The `cursor` should be pointing to the node in a tree that rebalancing should start at. The root of the `cursor` will be updated to match the changes. + +The returned `ProllyNode` is for the new root of the tree. + +TODO: Account for `cursor.parent` being null +TODO: Advance a cursor instead of iterating through `cursor.node.keys` +TODO: Account for `cursor.node.keys` being empty (remove from parent) + +- Loop + - If `ShouldCreateBoundry(cursor.node, CursorGetKey(cursor))` is `false` + - Call `AdvanceCursor(cursor)` + - continue to next loop cycle + - Create a new `ProllyNode` `node` by cloning the current `cursor.node` + - Remove all keys, values, links from `cursor.node` after `cursor.index` + - Remove all keys, values, links from `node` before and including `cursor.index` + - Generate a new `CID` for `cursor.node` + - Set the `cursor.parent.links` at `cursor.parent.index` to the new `CID` + - Generate a new `CID` for the new `node` + - Shift all items in `cursor.parent.keys` after `cursor.parent.index` up by `1` + - Shift all items in `cursor.parent.links` after `cursor.parent.index` up by `1` + - Increment `cursor.parent.index` by `1` + - Set the `cursor.parent.links` at `cursor.parent.index` to the new `CID` + - Set the `cursor.parent.keys` at `cursor.parent.index` to `node.keys[0]` + - Set `cursor.node` to `node` + - Set `cursor.index` to `0` +- Return `RebalanceTree(cursor.parent)` + +### Put(ProllyNode, key, value) : ProllyNode + +- Get `cursor` from `CursorAtItem(ProllyNode, key)` +- If `CursorIsValid(cursor)` is `true` + - check if the key is at `CursorGetKey(cursor)` + - If it is + - Set the value in the `cursor.node.values` at `cursor.index` to `value` + - If it isn't + - add the `key` to `cursor.node.keys` after the `cursor.index`, shifting subsequent items down + - add the `value` to `cursor.node.values` after the `cursor.index`, shifting subsequent items down +- TODO: If false, how is this reached?
It means that there were no keys even "close" to `key`? +- return `RebalanceTree(cursor)` + +### Delete(ProllyNode, key, value) : ProllyNode + +- Get `cursor` from `CursorAtItem(ProllyNode, key)` +- If `CursorIsValid(cursor)` is `false` + - Return the `ProllyNode` +- check if the key is at `CursorGetKey(cursor)` +- If it is + - Remove the key in `cursor.node.keys` at `cursor.index` + - Remove the value in `cursor.node.values` at `cursor.index` +- return `RebalanceTree(cursor)` + +### Search(ProllyNode, prefix) : Iterator + +- Get `cursor` from `CursorAtItem(ProllyNode, prefix)` +- Create an Iterator (language dependent) +- On each pull from the iterator + - If `CursorIsValid(cursor)` is `false` + - Close the iterator and return + - Get the `key` from `CursorGetKey(cursor)` + - If `key` does not start with `prefix` + - Close the iterator and return + - Get the `value` from `CursorGetValue(cursor)` + - Yield the `key` and `value` from the iterator + - `AdvanceCursor(cursor)` + +### Diff(ProllyNode base, ProllyNode new) : Iterator< Diff > + +This function provides a means to retrieve all items in a tree `new` that are missing from another tree `base` and returns them as a list of key-value pairs. + +(Basically follows: [Efficient Diff on Prolly-Trees - Dolt](https://www.dolthub.com/blog/2020-06-16-efficient-diff-on-prolly-trees/)) + +0. If the CID at `base.config` isn't the same as `new.config`: Raise Error/Abort (Invalid config) +0. (Check hashes/addresses/CIDs of both trees - done if identical) +1. Point cursors `cursor_new` and `cursor_base` to leftmost items of the respective trees `new` and `base`. +2. Loop: + 1. If `CursorGetKey(cursor_base)` is smaller than `CursorGetKey(cursor_new)`: + - Advance the base cursor. (We ignore keys in the base which are missing from the new tree.): `AdvanceCursor(cursor_base)` + 2. If `CursorGetKey(cursor_new)` is smaller than `CursorGetKey(cursor_base)`: + - Record the key as missing from the base tree.
(Add key and value to the list that is to be returned.) + - Advance new cursor: `AdvanceCursor(cursor_new)` + 3. If the keys (`CursorGetKey(cursor_new)`, `CursorGetKey(cursor_base)`) are equal: + 1. If the values in the trees differ for the same key: + - Record as different. (Add key and differing values to the list that is to be returned and mark as differing.) + - Advance both cursors: `AdvanceCursor(cursor_new)`, `AdvanceCursor(cursor_base)` + 2. Skip common elements (and potential parents) as far as possible. (See below.) +3. If `cursor_base` reaches the end in its node and `cursor_new` is not at the end: + - Record all remaining items as missing. (Add all remaining key-value pairs to the list that is to be returned.) + +Skipping common elements: +1. Until both cursors point to different values repeat: + 1. Advance parents: + 1. If the parents are equal, set the cursors to the first element of these parents. + 2. Advance both cursors: `AdvanceCursor(cursor_new)`, `AdvanceCursor(cursor_base)` + +### Merge(ProllyNode, ProllyNode) : ProllyNode + +This procedure takes two trees and returns a tree that contains all elements from both. +It builds on Diff and (bulk) insert. +It uses the output of the diff and adds the key-value pairs that are contained within the right tree but not in the left tree to the left tree. + +(Note: As a heuristic the higher tree can be chosen as the left tree. The expected number of added elements should be smaller that way.) + + 1. Initialize the tree to be returned with the left tree: `merged = left` + 2. Invoke the diff on the trees: `Diff(left, right)` + 3. Iterate over the resulting key-value pairs and insert them into the merged tree: `merged = Put(merged, key, value)` + 1. If merge conflicts occur (if `Diff()` returns differing key-value pairs): See below + 4. Return `merged`. + +#### Note on Merge Conflicts + +When merging trees it is possible that the trees have different values for the same key.
In this case, as the algorithm is ignorant of keys and values, it can not be known how to solve the conflict. +The following approaches are possible for the implementation: + - Ask the caller for a handler function that resolves the conflict + - Throw an error, indicating the conflicting keys and values + - Ignore both + - Assume the `right` tree contains the newer values and simply use the key-value pair of the right tree. + +## Configuration (and Defaults) + +TODO: *Discussion of suitable values and implication for configuration could go here, this is something we are actively running experiments for* + From b28edfedb00b1bb5abd6f83f24f4c2a7599325bb Mon Sep 17 00:00:00 2001 From: RangerMauve Date: Fri, 9 Dec 2022 11:05:21 -0500 Subject: [PATCH 02/20] Update specs/advanced-data-layouts/prollytree/spec.md Co-authored-by: Volker Mische --- specs/advanced-data-layouts/prollytree/spec.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specs/advanced-data-layouts/prollytree/spec.md b/specs/advanced-data-layouts/prollytree/spec.md index 156f980c..2004ff8d 100644 --- a/specs/advanced-data-layouts/prollytree/spec.md +++ b/specs/advanced-data-layouts/prollytree/spec.md @@ -253,7 +253,7 @@ Get the closest cursor to a byte prefix ### CursorGetKey(Cursor) : key? 1. If `CursorIsValid(Cursor)` is `false`, return `null` (or throw an error) -2. Get the `key` from `cursor.node.keys` at `index` +2. Get the `key` from `cursor.node.keys` at `cursor.index` 3. return the `key` ### CursorGetValue(Cursor) : value? 
From 5dc4dcf53bdf2ee99981c427b9ecba7f8b140955 Mon Sep 17 00:00:00 2001 From: RangerMauve Date: Fri, 9 Dec 2022 11:05:35 -0500 Subject: [PATCH 03/20] Update specs/advanced-data-layouts/prollytree/spec.md Co-authored-by: Volker Mische --- specs/advanced-data-layouts/prollytree/spec.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specs/advanced-data-layouts/prollytree/spec.md b/specs/advanced-data-layouts/prollytree/spec.md index 2004ff8d..714e2b62 100644 --- a/specs/advanced-data-layouts/prollytree/spec.md +++ b/specs/advanced-data-layouts/prollytree/spec.md @@ -35,7 +35,7 @@ https://www.dolthub.com/blog/2020-05-13-dolt-commit-graph-and-structural-sharing Prolly trees leverage content addresibility to create ordered search indexes similar to the common B+ tree structure used for databases, but with the added ability to determenistically merge trees together. -At the highest level, Prolly Trees act as a key value store whith the ability to iterate over key ranges which are alphanumerically sorted. This property of sorted iteration is important for creating database indexes for a variety of use cases like full text search, sorting of large datasets, and arbitrary search queries. +At the highest level, Prolly Trees act as a key value store whith the ability to iterate over key ranges which are lexicographically sorted. This property of sorted iteration is important for creating database indexes for a variety of use cases like full text search, sorting of large datasets, and arbitrary search queries. 
### Search Tree From 7e2b74982b76ec3ec34538174cd210a34a73e280 Mon Sep 17 00:00:00 2001 From: RangerMauve Date: Fri, 9 Dec 2022 11:15:21 -0500 Subject: [PATCH 04/20] Update specs/advanced-data-layouts/prollytree/spec.md Co-authored-by: Volker Mische --- specs/advanced-data-layouts/prollytree/spec.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specs/advanced-data-layouts/prollytree/spec.md b/specs/advanced-data-layouts/prollytree/spec.md index 714e2b62..970269d9 100644 --- a/specs/advanced-data-layouts/prollytree/spec.md +++ b/specs/advanced-data-layouts/prollytree/spec.md @@ -67,7 +67,7 @@ The described tree can represent a data structure with multiple, single or no va In order to modify the tree (inserting, updating, or removing one or multiple values) a part of the tree needs to be re-calculated. This section describes an internal procedure that is not exposed to the caller. The tree is (partially) re-calculated bottom-up. -Starting at the inserted/modified leaf or node referencing a now deleted/removed leaf, successively walk up the tree.If a removed item was the only item in its node: Remove the node from its parent and Check if the hash/address/CID of a new leaf/node splits the parent chunk (split the parent node if yes). +Starting at the inserted/modified leaf or node referencing a now deleted/removed leaf, successively walk up the tree. If a removed item was the only item in its node: Remove the node from its parent and Check if the hash/address/CID of a new leaf/node splits the parent chunk (split the parent node if yes). Then, check if a removed leaf/node was a 'splitting node' and the nodes need to be merged (only the last node in a chunk can be a splitting node). If splitting criterion holds for the last item of the node and there is the current node is succeeded by another node on the same level: merge the currend node with the succeeding node. Continue walking up the path. 
If at root and it is being split: Create a new root that links to the freshly split nodes. Return the new root CID. ![](https://i.imgur.com/xE0id0V.png) From d237a0493243a38a2bd6108d7ef17f8a3c8bf5a8 Mon Sep 17 00:00:00 2001 From: RangerMauve Date: Fri, 9 Dec 2022 11:15:34 -0500 Subject: [PATCH 05/20] Update specs/advanced-data-layouts/prollytree/spec.md Co-authored-by: Volker Mische --- specs/advanced-data-layouts/prollytree/spec.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specs/advanced-data-layouts/prollytree/spec.md b/specs/advanced-data-layouts/prollytree/spec.md index 970269d9..b5c3e2ab 100644 --- a/specs/advanced-data-layouts/prollytree/spec.md +++ b/specs/advanced-data-layouts/prollytree/spec.md @@ -75,7 +75,7 @@ Starting at the inserted/modified leaf or node referencing a now deleted/removed ![](https://i.imgur.com/nuvInkH.png) ![](https://i.imgur.com/zACS8gS.png) -Pay attention to the fact that the boundary algorithm is not related with the node cid, i.e node cid is got after the boundary is generated and the node is saved in blockstore. Maybe in the future we can combine the boundary algorithm and cid. +Pay attention to the fact that the boundary algorithm is not related to the node CID, i.e the boundary is generated *before* the node CID is generated. Maybe in the future we can combine the boundary algorithm and CID. 
### Put/Remove/Update From edb87069f3cb090f5a605014165b184d2d581f25 Mon Sep 17 00:00:00 2001 From: RangerMauve Date: Fri, 9 Dec 2022 11:34:46 -0500 Subject: [PATCH 06/20] Update specs/advanced-data-layouts/prollytree/spec.md Co-authored-by: Volker Mische --- specs/advanced-data-layouts/prollytree/spec.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/specs/advanced-data-layouts/prollytree/spec.md b/specs/advanced-data-layouts/prollytree/spec.md index b5c3e2ab..dcbc7e0a 100644 --- a/specs/advanced-data-layouts/prollytree/spec.md +++ b/specs/advanced-data-layouts/prollytree/spec.md @@ -242,8 +242,8 @@ Get the closest cursor to a byte prefix 2. return the `cursor` 3. Invoke `AdvanceCursor(cursor.parent)` 4. If `CursorIsValid(cursor.parent)` is `false` - 5. set `cursor.index` to `length` - 6. return the `cursor` + 1. set `cursor.index` to `length` + 2. return the `cursor` 6. Get the `link` from `CursorGetLink(cursor.parent)` 7. Checgreater or k that `link` is not `null`, throw an error if it is 8. Get the `node` ProllyNode from `link` From 0d186ec6ba8e8c129dba25ca8b86246fd55d6837 Mon Sep 17 00:00:00 2001 From: mauve Date: Tue, 13 Dec 2022 04:15:28 -0500 Subject: [PATCH 07/20] Clean up prolly tree spec based on PR reviews --- .../advanced-data-layouts/prollytree/spec.md | 409 +++++++++--------- 1 file changed, 208 insertions(+), 201 deletions(-) diff --git a/specs/advanced-data-layouts/prollytree/spec.md b/specs/advanced-data-layouts/prollytree/spec.md index dcbc7e0a..ad32c72d 100644 --- a/specs/advanced-data-layouts/prollytree/spec.md +++ b/specs/advanced-data-layouts/prollytree/spec.md @@ -41,8 +41,6 @@ At the highest level, Prolly Trees act as a key value store whith the ability to The basic structure is that of an ordered search tree: The contained keys are organised such that they can be found (inserted, updated, ...) efficiently by value. 
-*TODO Insert sketch of search tree*
-
 To efficiently find keys, the tree is traversed top-to-bottom and the non-leaf nodes help navigating/comparing the values efficiently. An intermediate node contains several ordered key-address pairs, which link to further nodes (intermediate or leaf) on the next lower level.

 Levels go from `0` representing Leaf Nodes, and go up for each level in the tree. The root of the tree will have the highest level in the tree and can give an estimate of it's overall size.
@@ -56,185 +54,200 @@ Chunking is the strategy of determining chunk boundaries: Given a list of key-va
 It depends on the hash/address/node of the items and a 'chunking factor' for tuning the shape of the tree.
 This chunking factor determines the average size of nodes/chunks that a node on a higher level contains.
 This average size in turn controls the shape of the tree (broad/narrow).
-The shape in turn defines the performance of operations on the tree.
+The shape in turn defines the performance of operations on the tree with the following tradeoffs:
+ - larger blocks with more cache invalidations
+ - or smaller blocks with less cache invalidations and more lookups

 ### Note on Multiple and Single Values and Sets

 The described tree can represent a data structure with multiple, single or no values per key. However, given that IPLD Maps (which a Prolly Tree loosely maps to) only allow one value for a key, implementations should merge duplicate keys into a single value. This is also important to have consistent ordering of key value pairs.

-### Re-Calculate
+### Rebalance

-In order to modify the tree (inserting, updating, or removing one or multiple values) a part of the tree needs to be re-calculated. This section describes an internal procedure that is not exposed to the caller.
-The tree is (partially) re-calculated bottom-up.
+In order to modify the tree (inserting, updating, or removing one or multiple values) a part of the tree needs to be rebalanced.
This section describes an internal procedure that is not exposed to the caller.
+The tree is (partially) rebalance bottom-up.
 Starting at the inserted/modified leaf or node referencing a now deleted/removed leaf, successively walk up the tree. If a removed item was the only item in its node: Remove the node from its parent and Check if the hash/address/CID of a new leaf/node splits the parent chunk (split the parent node if yes).
- Then, check if a removed leaf/node was a 'splitting node' and the nodes need to be merged (only the last node in a chunk can be a splitting node). If splitting criterion holds for the last item of the node and there is the current node is succeeded by another node on the same level: merge the currend node with the succeeding node. Continue walking up the path. If at root and it is being split: Create a new root that links to the freshly split nodes. Return the new root CID.
-
-![](https://i.imgur.com/xE0id0V.png)
-![](https://i.imgur.com/aYqzXZI.png)
-![](https://i.imgur.com/nuvInkH.png)
-![](https://i.imgur.com/zACS8gS.png)
+Then, check if a removed leaf/node was a 'splitting node' and the nodes need to be merged (only the last node in a chunk can be a splitting node). If splitting criterion holds for the last item of the node and there is the current node is succeeded by another node on the same level: merge the currend node with the succeeding node. Continue walking up the path. If at root and it is being split: Create a new root that links to the freshly split nodes. Return the new root CID.

-Pay attention to the fact that the boundary algorithm is not related to the node CID, i.e the boundary is generated *before* the node CID is generated. Maybe in the future we can combine the boundary algorithm and CID.
+Pay attention to the fact that the boundary algorithm is not related to the node CID, i.e the boundary is generated *before* the node CID is generated.

 ### Put/Remove/Update

 First search if the tree already contains the value.
-If the key is not contained, create a new leaf for the value, insert the key-value (key-leaf) pair in the correct node and re-calculate the tree.
+If the key is not contained, create a new leaf for the value, insert the key-value (key-leaf) pair in the correct node and rebalance the tree.
 If the key is already stored in the tree and the value is the same, the tree is left unchanged. If the key exists and the values differ: Either leave the tree or insert the new value and update the tree.
-If removing the key, remove the `key` and the `value` from the list. After making any modifications to the tree, run the algoritm (Re-Calculate)[#Re-Calculate].
+If removing the key, remove the `key` and the `value` from the list. After making any modifications to the tree, run the (Rebalance)[#Rebalance] algorithm.

 ## Structure

-IPLD Schema(https://github.com/kenlabs/ptree-bs/blob/main/pkg/prolly/tree/schema.ipldsch)
-
 ```
-type ProllyNode struct {
- config &ChunkConfig
- level Int
- keys [Bytes]
- links nullable [&ProllyNode]
- values nullable [Any]
+type ProllyTree struct {
+ config &ProllyTreeConfig
+ root &TreeNode
 } representation tuple

-type WeibullThresholdConfig struct {
- K Float
- L Float
+type TreeeNode struct {
+ # Is leaf when level is 0
+ level Int
+ keys [Bytes]
+ # If a leaf, contains entry valies
+ # If an intermediate node, contains Links to further TreeNodes
+ values [Any]
 } representation tuple

-type RollingHashConfig struct {
- rollingHashWindow Int
-} representation tuple
-
-type PrefixThresholdConfig struct {
+type HashThresholdConfig struct {
  chunkingFactor Int
-}
+ hashFunction Int
+} representation tuple

-type ChunkStrategy enum {
- | PrefixThreshold
- | WeibullThreshold
- | RollingHash
-} representation string
+type ChunkingStrategy union {
+ | HashThresholdConfig "hashThreshold"
+} representation keyed

-type ChunkConfig struct {
- chunkStrategy ChunkStrategy
+type ProllyTreeConfig struct {
  minChunkSize Int
  maxChunkSize Int
- prefix nullable
PrefixChunkConfig
- weilbull nullable WeibullThresholdConfig
- rollingHash nullable RollingHashConfig
-}
+ strategy ChunkingStrategy
+} representation tuple
 ```

-### `ProllyNode.keys`
+### `ProllyTree`

-Raw keys(keys/values input from users) for leaf node. Key-value pairs are sorted by byte value with the "larger" keys being at the end. Values are comared at the first byte, and going down to the end. This means that keys that are just a prefix come before keys that are prefix + 1 byte.
-
-### `ProllyNode.values`
-
-raw values for leaf nodes. For branch nodes, it's null. Values can point to arbitrary IPLD nodes and it is up to applications to generate and process them.
+A prolly tree is identified by a node which links to the `root` `TreeNode` for the actual key-value pairs,
+and a `config` link to the ProllyTreeConfig which has infomration about the chunking and encoding information in order to quickly compare if two trees are using the same configuration and may be merged together as well as making it easier to write.

-### `ProllyNode.links`
+### `TreeNode`

-null for leaf nodes. For branch nodes, it's the CID of the child node. The index in the array corresponds to the index in the `keys` array for the corresponding key. Keys are used for searching, and links are used for traversing down towards the leaves.
+This is the "Tree" part of Prolly Trees and is made to be general purpose.
+We can potentially expec to use the TreeNode structure in subsequent specs with different types of trees.

-### `ProllyNode.level`
+### `TreeeNode.keys`

-0 for leaf nodes, and add 1 for parent levels (and incrementing as more parents are added)
+Raw keys(keys/values input from users) for leaf node. Key-value pairs are sorted by byte value with the "larger" keys being at the end. Values are comared at the first byte, and going down to the end. This means that keys that are just a prefix come before keys that are prefix + 1 byte.
-### `ProllyNode.config`
+### `TreeNode.values`

-Link to the info about the chunking strategy for how the prolly tree is built. Trees must always be mutated with the given strategy. Storing the config in prolly nodes means that any parent inside a tree can be passed around as a new root, and it makes it easy to compare if two trees are compatible for being "merged" together or used for comparisons.
+Values corresponding to keys.
+For leaf nodes these will be Links pointing to additional
+Values can point to arbitrary IPLD nodes and it is up to applications to generate and process them.

-### `ChunkStrategy`
+### `TreeNode.level`

-The enum for the types of chunking strategies that are part of the spec. This is set to be a string so that the number of strategies can grow over time.
+0 for leaf nodes, and add 1 for parent levels (and incrementing as more parents are added)

-### `ChunkConfig`
+### `ProllyTreeConfig`

-Chunk Config for prolly tree, it includes some global setting, the splitter method you choose and specific configs about the splitter
+The configuration for how the prolly tree should be assembled with encodings and chunking strategies.

-### `ChunkConfig.minChunkSize`
+### `ProllyTreeConfig.minChunkSize`

 The minimum size a chunk should be before considering the boundry function.
+The size is calculated from the size of the `bytes` when encoding a prolly tree node.

-TODO: Should this be in number of keys or bytes of keys+values or block size? Maybe we can get away with number of keys?
-
-### `ChunkConfig.maxChunkSize`
+### `ProllyTreeConfig.maxChunkSize`

 The maximum size a chunk could be before it needs to be split regardless of the chunk boundries
+The size is calculated from the size of the `bytes` when encoding a prolly tree node.

-### `ChunkConfig.chunkStrategy`
+### `ProllyTreeConfig.strategy`

-The string representing the type of strategy to use. Either `RollingHash` or `KeySplitter`.
-### `PrefixThresholdConfig`
+The `ChunkingStrategy` to use for forming the prolly tree.

-Config for the `PrefixThreshold` chunking strategy.
+### `ChunkingStrategy`

-This is the original strategy that was described in the Merkle Search Tree paper and has a `chunkingFactor` which represents the number of bits in the key which need to be `0` in order to set another boundry.
+The strategy to use for chunking the leaves of the tree (and intermediate nodes).

-Lower values result in wider nodes.
+This is a Map with different keys representing different strategies.

-### `WeibullThresholdConfig`
+This spec currently supports the `byteThreshold` strategy, however we are intentionally leaving room for further chunking strategies which use more advanced algorithms such as the Weibull Distribution strategy used in Dolt.

-Config for the `WeiBullThreshold` chunking strategy.
+### `ByteThresholdConfig`

-Makes chunk boundary decisions on the hash of the key of a []byte pair and tries to create chunks that have an average number of []byte pairs, rather than an average number of Bytes. However, because the target number of []byte pairs is computed directly from the chunk size and count, the practical difference in the distribution of chunk sizes is minimal. It uses a dynamic threshold modeled on a weibull distribution (https://en.wikipedia.org/wiki/Weibull_distribution). As the size of the current trunk increases, it becomes easier to pass the threshold, reducing the likelihood of forming very large or very small chunks.
+This is the strategy that was described in the [Peer to Peer Ordered Search Indexes](https://0fps.net/2020/12/19/peer-to-peer-ordered-search-indexes/) paper.

-The `K` value represents the Shape Parameter, and `L` represents `λ` - the scaling factor.
+It works by hashing a key+value pair, reading the last 4 bytes as a 32 bit unsigned integer, and checking how many bits are 0's relative to the chunking factor.
-### `RollingHashConfig`
+The `chunkingFactor` must be less than the maximum value of a 32 bit unsigned integer.
+It is used to calculate a "chunking threshold" using the forumala `Math.floor(4294967295 / chunkingFactor)`.
+It is reccommended to use powers of two to make it easier to relate to how many bits should be `0` in the chunking threshold.

- rollingHashSplitter is a nodeSplitter that makes chunk boundary decisions using a rolling value hasher that processes Item pairs in a byte-wise fashion.
+The larger the chunking factor, the less likely it is that a given keypair will result in a chunking boundry, and thus will lead to TreeNodes with more entries within them.

-rollingHashSplitter uses a dynamic hash pattern designed to constrain the chunk Size distribution by reducing the likelihood of forming very large or very small chunks. As the Size of the current chunk grows, rollingHashSplitter changes the target pattern to make it easier to match. The result is a chunk Size distribution that is closer to a binomial distribution, rather than geometric.
+It also contains a `hashFunction` field which points at a hash function in the [multicodec table](https://github.com/multiformats/multicodec/blob/master/table.csv) to use for hashing.
+You should ensure that you are using a cryptographic hash function in order to make it more difficult to create collissions and to ensure you have an even distribution of values.

 ## Algorithm in detail

 Left to the implementation is how to generate and load data from IPLD.
-Whenever a Link is resolved to a ProllyNode or ChunkConfig, this is done via the IPLD LinkSystem which it an implementation detail outside of the scope of this document.
+Whenever a Link is resolved to a TreeNode or ProllyTreeConfig, this is done via the IPLD LinkSystem which it an implementation detail outside of the scope of this document.
+
+When serializing data into a CID+Block, one should use the codec and multihash that's either used by root CID of the tree you are modifying, or specificed explicitly during tree creation. Blocks should also be saved to the IPLD LinkSystem and how this is done is also outside the scope of this document.
+We will assume that there is a `getNode(cid)` API which loads IPLD Nodes from a CID, and a `saveNode(node) => {cid, bytes}` which will save a node and get back a CID and the byte contents used for generating the CID. Note that `saveNode` should use the same encoding and hash algorithm as the root of the tree based on the root `ProllyTreeConfig`
+
+This section also relies on the existance of a `Cursor` structure to keep track of state when iterating through a prolly tree.
+
+Note that this is not mandatory for implementations and is more of a guide to help structure how the tree can be traversed.
+Implementations may want to use other approaches to recursive traversal.

-When serializing data into a CID+Block, one should use the codec and multihash that's either used by root CID of the tree you are modifying, or specificed explicitly during tree creation. Blocks should also be saved to the IPLD LinkSystem and how this is done is also outside the scope of this document. We will assume that there is a `getNode(cid)` API which loads IPLD Nodes from a CID, and a `saveNode(node) => {cid, bytes}` which will save a node and get back a CID and the byte contents used for generating the CID. Note that `saveNode` should use the same encoding and hash algorithm as the root of the tree.
+Not mentioned here is a method for performing batch operations such as sequential writes.

-### IsLeaf(ProllyNode) : Boolean
+### CursorAtItem(TreeNode, prefix): Cursor
+
+Get a cursor pointing to the closest item in the tree to a given prefix.
+This is useful for performing searches for keys.
+
+1. Define a `Cursor` struct with a `index Integer`, `node TreeNode`, and`parent Cursor`
+2.
Set `cursor.node` to the `TreeNode`
+3. Set `cursor.index` to `KeyIndex(cursor.node, prefix)`
+6. Start a loop
+ 1. if `IsLeaf(cursor.node)` is `true`, break the loop
+ 2. get the `link` from `CursorGetValue(cursor)`
+ 3. resolve the TreeNode at the `link` to `newNode`
+ 4. Set `parent` to `cursor`
+ 5. set `cursor` to a new `Cursor` struct
+ 6. set `cursor.parent` to `parent`
+ 7. set `cursor.node` to `newNode`
+ 8. set `cursor.index` to `KeyIndex(cursor.node, item)`
+7. return the `cursor`

-1. get the `level` of the `ProllyNode`
+### IsLeaf(TreeNode) : Boolean
+
+Check to see if a given TreeNode is for a leaf, or if it is an intermediate node in the tree.
+
+1. get the `level` of the `TreeNode`
 2. if the `level` is `0` return `true`
 3. else return `false`

 ### CursorIsValid(Cursor) : Boolean

+Check if a given cursor is set to a valid position.
+It can sometimes be set to an invalid position if a search failed or if there are no more items to iterator over in a cursor.
+
 1. get the `length` of `cursor.node.keys`
 2. if `length` is `0`, return `false`
 3. if `cursor.index` is less than 0, return `false`
 4. if `cursor.index` is greater than or equal to `legnth`, return `false`

-### KeyIndex(ProllyNode, item) : Integer
+### CursorIsAtEnd(Cursor) : Boolean

-Use a [binary search](https://en.wikipedia.org/wiki/Binary_search_algorithm) or equivalent to find the index in the `keys` array which is "closest" to but not "larger than" the `item`. Return the index.
+Check if the given cursor is at the end of it's TreeNode.
+This is used to check if there are more items that can be traversed over.

-### CursorAtItem(ProllyNode, item): Cursor
+1. Get the `length` of `cursor.node.keys`
+2. return `cursor.index === (length - 1)`

-Get the closest cursor to a byte prefix
+### KeyIndex(TreeNode, item) : Integer

-1. Define a `Cursor` struct with a `index Integer`, `node ProllyNode`, and`parent Cursor`
-2. Set `cursor.node` to the `ProllyNode`
-3.
Set `cursor.index` to `KeyIndex(cursor.node, item)`
-6. Start a loop
- 1. if `IsLeaf(cursor.node)` is `true`, break the loop
- 2. get the `link` from `CursorGetLink(cursor)`
- 3. resolve the ProllyNode at the `link` to `newNode`
- 4. Set `parent` to `cursor`
- 5. set `cursor` to a new Cursor struct
- 6. set `cursor.parent` to `parent`
- 7. set `cursor.node` to `newNode`
- 8. set `cursor.index` to `KeyIndex(cursor.node, item)`
-7. return the `cursor`
+Use a [binary search](https://en.wikipedia.org/wiki/Binary_search_algorithm) or equivalent to find the index in the `keys` array which is "closest" to but not "larger than" the `item`. Return the index.

 ### AdvanceCursor(Cursor) : Cursor

+Advances a cursor to the next key.
+This function assumes the cursor is currently pointing at a leaf node.
+When reaching the end of the current TreeNode, the parent cursor will be incremented to go to the next branch.
+
 1. Get the `length` of the `cursor.node.keys`
 2. if `cursor.index` is less than `length - 1`, increment `cursor.index` and return the `cursor`
 3. If `cursor.parent` is `null`
@@ -244,136 +257,129 @@ Get the closest cursor to a byte prefix
 4. If `CursorIsValid(cursor.parent)` is `false`
  1. set `cursor.index` to `length`
  2. return the `cursor`
-6. Get the `link` from `CursorGetLink(cursor.parent)`
-7. Checgreater or k that `link` is not `null`, throw an error if it is
-8. Get the `node` ProllyNode from `link`
+6. Get the `link` from `CursorGetValue(cursor.parent)`
+7. Check that `link` is not `null`, throw an error if it is
+8. Get the `node` TreeNode from using `load(link)`
 9. Set `cursor.node` to `node`
 10. Set `cursor.index` to `0`
+11. return the `cursor`.

 ### CursorGetKey(Cursor) : key?

+Get the current key pointed to by a cursor.
+
 1. If `CursorIsValid(Cursor)` is `false`, return `null` (or throw an error)
 2. Get the `key` from `cursor.node.keys` at `cursor.index`
 3. return the `key`

 ### CursorGetValue(Cursor) : value?
+Get the current value pointed to by the cursor.
+
 1. If `IsLeaf(cursor.node)` is `false`, return `null` (or throw an error)
 1. If `CursorIsValid(Cursor)` is `false`, return `null` (or throw an error)
 2. Get the `value` from `cursor.node.values` at `cursor.index`
 3. return the `value`

-### CursorGetLink(Cursor) : &ProllyNode
-
-1. If `IsLeaf(cursor.node)` is `true`, return `null` (or throw an error)
-1. If `CursorIsValid(Cursor)` is `false`, return `null` (or throw an error)
-2. Get the `link` from `cursor.node.links` at `cursor.index`
-3. return the `link`
+### Get(TreeNode, key): value

-### Get(ProllyNode, key): value
+Get the value associated with a key from the tree.
+This is a public method meant for consumers of ProllyTrees.
+If the TreeNode isn't a leaf, the function will traverse into it to find the leaf which contains the key.

-1. Get a `cursor` from `CursorAtItem(ProllyNode, key)`
+1. Get a `cursor` from `CursorAtItem(TreeNode, key)`
 2. If `CursorIsValid(cursor)` is `false`, return an error (key not found)
 3. Get `currentKey` from `CursorGetKey(Cursor)`
 4. If `currentKey` is not bytewise equal to `key`, return an error (key not found)
 5. Get `value` from `CursorGetValue(Cursor)`
 6. Return `value`

-### weibullCDF(x, K Int, L Int) : int
-
-Note: CDF (Cumulative Distribution Function) of the ][Weibull probability distribution](https://en.wikipedia.org/wiki/Weibull_distribution)
-
- return `-exp(-pow(x/L),K)`
-
-### WeibullThreshold(ProllyNode, key, value) : Boolean
+### CursorAtChunkingBoundry(ProllyTreeConfig config, Cursor) : Boolean

-TODO: cleanup
+Checks if the cursor is currently pointing to a chunking boundry.
+If `true`, a new TreeNode should be created for all subsequent keys.
+This should be called after adding a key-value pair to a leaf TreeNode to determine if more items should be added.

-WeibullThreshold returns true if we should split at |hash| for a given record inserted into a chunk of Size |Size|, where the record's Size is |thisSize|.
|Size| is the Size of the chunk after the record is inserted, so includes |thisSize| in it.
+1. Get the `ChunkingStrategy` from the `config`
+2. Get the `length` of the `cursor.node` from `save(cursor.node).bytes.length`
+3. If `length` is less than `config.minChunkSize` return `false`
+5. If the `ChunkingStrategy` is not a `ByteThresholdConfig`, return an error (unsupported chunking strategy)
+6. Set `threshold` to `Math.floor(MAX_UNIT32 / config.chunkingFactor)`
+7. Get the `hash` function associated with the multicodec in `config.hashFunction`
+8. Calculate the `entryHash` from the `hash(CursorGetKey(cursor) + save(CursorGetValue(cursor)).bytes)`
+9. Read the last 4 bytes of the `entryHash` as a UInt32 `identity`
+10. Return `identity <= threshold`

-WeibullThreshold attempts to form chunks whose sizes match the weibull distribution.
+### SplitNode(TreeNode, index) : left, right

-The logic is as follows: given that we haven't split on any of the records up to |Size - thisSize|, the probability that we should split on this record is (CDF(end) - CDF(start)) / (1 - CDF(start)), or, the precentage of the remaining portion of the CDF that this record actually covers. We split is |hash|, treated as a uniform random number between [0,1), is less than this percentage.
+Split a TreeNode at a given boundry.
+All of the entries after `index` will be in a new `right` node.
+All of the entries at and before `index` will be in a new `left` node.

-0. Get the `{bytes, cid}` from `saveNode(value)`
-1. `hash = hash(key + saveNode(value).cid)` *TODO: hash needs to be well-defined*
-2. `itemSize = len(key) + len(value)`
-3. Set `prevItemsSize` to the sum of the length of all the node's previous items (excluding the current one)
-4. `start = weibullCDF(prevItemsSize, K, L)`
-5. `end = weibullCDF(prevItemsSize + itemSize, K, L)`
-6. `p = float(hash)/maxUInt32`
-7. `d = 1 - start`
-8. If `d <= 0`: return `true` (this should realistically never occour?)
-9.
`target = (end - start)/d`
-10. return `p < target`
+### SplitCursor(Cursor) : Cursor

-### HashByteArray([]byte, offset) : Boolean
+Split the current node on a cursor into two nodes at the given index.
+This will add a new child to the parent node, and create a parent node+cursor if one doesn't exist.

-*TODO: Description*
-*TODO: current implementation applies the min/maxChunkSize to offset. Do we want this?*
-- For each `byte`:
- 1. Increment `offset` by 1
- 2. Feed `byte` to rolling hash *TODO Details for hashing, salting*
- 3. Set `hash` to hashSum of rolling hash
- 4. `pattern = (1<<(15 - (offset >> 10))) - 1`
- 5. If `hash&pattern == pattern` return `true`
-- Return `false`
+- Get `left` and `right` from `SplitNode(cursor.node, cursor.index)`
+- If `cursor.parent` is null
+ - Create a new `TreeNode` `parentNode`
+ - Set `parentNode.level` to `cursor.node.level + 1`
+ - Set `parentNode.keys[0]` to `left.keys[0]`
+ - Create a new Cursor `parentCursor`
+ - Set `parentCursor.index` to `0`
+ - Set `parentCursor.node` to `parentNode`
+ - Set `cursor.parent` to `parentCursor`
+- Set the entry at `cursor.parent.node` at `cursor.parent.index` to `left.keys[0]`,`save(left).cid`
+- Insert a new entry into `cursor.parent.node` at `cursor.parent.index+1` to `right.keys[0]`,`save(right).cid`
+- Increment `cursor.parent.index` by `1`
+- Set `cursor.node` to `right`
+- Set `cursor.index` to `0`

-### RollingHash(ProllyNode, index) : Boolean
+### MergeNodes(TreeNode left, TreeNode right) : TreeNode

-*TODO: Rolling hash window*
-1. Set `offset` to the number of bytes of the elements of previous keys and values
-2. If `HashByteArray(key) == true`: return `true`
-3. Add number of bytes of key to `offset`
-4. If `HashByteArray() == true`: return `true`
-5. return `false`
+Merge two tree nodes together.
-### ShouldCreateBoundry(ProllyNode, key) : Boolean
+- If `left.level` != `right.level`
+ - Return an error (incompatible tree node levels)
+- Create a new `TreeNode` `node`
+- Set `node.keys` to `left.keys`, and concat it with `right.keys`
+- Set `node.values` to `left.values`, and concat it with `right.values`

-If `true`, a new ProllyNode should be created for all subsequent keys.
-
-1. Get the `ChunkConfig` by resolving the link in `node.ChunkConfig`
-1. Get the `ChunkStrategy` from the `ChunkConfig`
-2. Get the `length` from the `node.keys`
-3. If `length` is less than `ChunkConfig.MinChunkSize` return `true`
-3. If `length` is equal to `ChunkConfig.MaxChunkSize` return `true`
-4. If the `ChunkStrategy` is `KeySplitter`
- - return WeibullThreshold(key)
-5. If the `ChunkStrategy` is `RollingHash`
- - return RollingHash(ProllyNode, index)
-6. Return an error (unsupported chunk strategy)
-
-### RebalanceTree(Cursor) : ProllyNode
+### RebalanceTree(Cursor, ProllyTreeConfig) : TreeNode

 The `cursor` should be pointing to the node in a tree that rebalancing should start at. The root of the `cursor` will be upated to match the changes.
+This will attempt to merge sequential TreeNodes that don't end on a boundry, and split TreeNodes with boundries within them.

-The returned `ProllyNode` is for the new root of the tree.
-
-TODO: Account for `cursor.parent` being null
-TODO: Advance a cursor instead of iterating through `cursor.node.keys`
-TODO: Account for `cursor.node.keys` being empty (remove from parent)
+The returned `TreeNode` is for the new root of the tree.
+- If `cursor.node.keys` is empty
+ - if `cursor.parent` is null
+ - return `cursor.node`
+ - Remove the entry at `cursor.parent.index` in `cursor.parent.node`
+ - Return `RebalanceTree(cursor.parent)`
 - Loop
- If `ShouldCreateBoundry(cursor.node.chunkconfig, cursor.node, CursorGetKey(cursor))` is `false`
- - Call `AdvanceCursor(cursor)`
- - continue to next loop cycle
- - Create a new `ProllyNode` `node` by cloning the current `cursor.node`
- - Remove all keys, values, links from `cursor.node` after `cursor.index`
- - Remove all keys, values, links from `node` before and including `cursor.index`
- - Generate a new `CID` for `cursor.node`
- - Set the `cursor.parent.links` at `cursor.parent.index` to the new `CID`
- - Generate a new `CID` for the new `node`
- - Shift all items in `cursor.parent.keys` up by `1`
- - Shift all items in `cursor.parent.links` up by `1`
- - Increment `cursor.parent.index` by `1`
- - Set the `cursor.parent.links` at `cursor.parent.index` to the new `CID`
- - Set the `cursor.parent.keys` at `cursor.parent.index` to node.keys[0]
- - Set cursor.node to `node`
- - Set cursor.index to `0`
+ - If `CursorIsAtEnd(cursor)`
+ - If `cursor.parent` is `null` or `ShouldCreateBoundry(ProllyTreeConfig, cursor)` is `true`
+ - break the loop
+ - Else, if `cursor.parent` has another prolly node
+ - merge it with `cursor.node`
+ - break the loop
+ - If `ShouldCreateBoundry(ProllyTreeConfig, cursor)` is `false`
+ - Call `AdvanceCursor(cursor)`
+ - continue to next loop cycle
+ - Call `SplitCursor(cursor)`
 - Return `RebalanceTree(cursor.parent)`

-### Put(ProllyNode, key, value) : ProllyNode
+### Put(ProllyTree tree, key, value) : ProllyTree

-- Get `cursor` from `CursorAtItem(ProllyNode, key)`
+This is a public facing API for setting keys in the prolly tree.
+
+TODO: If `length` is equal to or greater than `config.maxChunkSize`, remove the current key to another node
+
+- Get the `config` from the `tree` using `load(tree.config)`
+- Get root `node` using `load(tree.root)`
+- Get `cursor` from `CursorAtItem(node, key)`
 - If `CursorIsValid(cursor)` is `true`
 - check if the key is at `CursorGetKey`
 - If it is
@@ -381,23 +387,27 @@ TODO: Account for `cursor.node.keys` being empty (remove from parent)
 - If it isn't
 - add the `key` to `cursor.node.keys` after the `cursor.index`, shifting subsequent items down
 - add the `values` to `cursor.node.values` after the `cursor.index`, shifting subsequent items down
+ - increment `cursor.index` by `1`
+ - Get the `length` from `save(cursor.node).bytes`
+ - If `length` > `config.maxChunkSize`
+ - call `SplitCursor(cursor)`
 - TODO: If false, how is this reached? It means that there were no keys even "close" to `key`?
 - return `RebalanceTree(cursor)`

-### Delete(ProllyNode, key, value) : ProllyNode
+### Delete(TreeNode, key, value) : TreeNode

-- Get `cursor` from `CursorAtItem(ProllyNode, key)`
+- Get `cursor` from `CursorAtItem(TreeNode, key)`
 - If `CursorIsValid(cursor)` is `false`
- - Return the `ProllyNode`
+ - Return the `TreeNode`
 - check if the key is at `CursorGetKey(cursor)`
 - If it is
 - Remove the key in `cursor.node.keys` at `cursor.index`
 - Remove the value in `cursor.node.values` at `cursor.index`
 - return `RebalanceTree(cursor)`

-### Search(ProllyNode, prefix) : Iterator
+### Search(TreeNode, prefix) : Iterator

-- Get `cursor` from `CursorAtItem(ProllyNode, key)`
+- Get `cursor` from `CursorAtItem(TreeNode, key)`
 - Create an Iterator (language dependent)
 - On each pull from the iterator
 - If `CursorIsValid(cursor)` is `false`
@@ -407,9 +417,9 @@ TODO: Account for `cursor.node.keys` being empty (remove from parent)
 - Close the iterator and return
 - Get the `value` from `CursorGetValue(cursor)`
 - Yield the `key` and `value` from the iterator
- - AdvanceCursor(ProllyNode)
+ -
AdvanceCursor(TreeNode)

-### Diff(ProllyNode base, ProllyNode new) : Iterator< Diff >
+### Diff(TreeNode base, TreeNode new) : Iterator< Diff >

 This function provides a means to retrieve all items in a tree `new` that are missing from another tree `base` and returns them as a list of key-value pairs.

@@ -438,7 +448,7 @@ Skipping common elements:
 1. If the parents are equal, set the cursers to the first element of these parents.
 2. Advance both cursors: `AdvanceCursor(cursor_new)`, `AdvanceCursor(cursor_base)`

-### Merge(ProllyNode, ProllyNode) : ProllyNode
+### Merge(TreeNode, TreeNode) : TreeNode

 This procedure takes two trees and returns a tree that contains all elements from both.
 It builds on Diff and (bulk) insert.
@@ -446,11 +456,11 @@ It uses the output of the diff and adds the key-value pairs, which are contained

 (Note: As a heuristic the higher tree can chosen as the left tree. The expected number of added elements should be smaller that way.)

- 1. Initialize the tree to be returned with the left tree: `merged = left`
- 1. Invoke the diff on the trees: `Diff(left, right)`
- 2. Iterate over the resulting key-value pairs and insert them to the left tree: `merged = Put(merged, key, value)`
- 1. If merge conflicts occur (if `Diff()` returns differing key-value pairs): See below
- 3. Return `merged`.
+1. Initialize the tree to be returned with the left tree: `merged = left`
+1. Invoke the diff on the trees: `Diff(left, right)`
+2. Iterate over the resulting key-value pairs and insert them to the left tree: `merged = Put(merged, key, value)`
+ 1. If merge conflicts occur (if `Diff()` returns differing key-value pairs): See below
+3. Return `merged`.
#### Note on Merge Conflicts @@ -458,10 +468,7 @@ When merging trees it is possible that the trees have different values for the s The following approaches are possible for the implementation: - Ask the caller for a handler function that resolves the conflict - Throw an error, indicating the conflicting keys and values - - Ignore both - - Assume the `right` tree contains the newer values and simply use the key-value pair of the right tree. ## Configuration (and Defaults) TODO: *Discussion of suitable values and implication for configuration could go here, this is something we are actively running experiments for* - From dcf262da68676cff997c517a40cfa85ac80984c5 Mon Sep 17 00:00:00 2001 From: mauve Date: Tue, 13 Dec 2022 04:26:35 -0500 Subject: [PATCH 08/20] Clean up prolly tree spec further --- .../advanced-data-layouts/prollytree/spec.md | 35 ++++++++++++------- 1 file changed, 22 insertions(+), 13 deletions(-) diff --git a/specs/advanced-data-layouts/prollytree/spec.md b/specs/advanced-data-layouts/prollytree/spec.md index ad32c72d..1d0f7771 100644 --- a/specs/advanced-data-layouts/prollytree/spec.md +++ b/specs/advanced-data-layouts/prollytree/spec.md @@ -40,14 +40,10 @@ At the highest level, Prolly Trees act as a key value store whith the ability to ### Search Tree The basic structure is that of an ordered search tree: The contained keys are organised such that they can be found (inserted, updated, ...) efficiently by value. - To efficiently find keys, the tree is traversed top-to-bottom and the non-leaf nodes help navigating/comparing the values efficiently. An intermediate node contains several ordered key-address pairs, which link to further nodes (intermediate or leaf) on the next lower level. - Levels go from `0` representing Leaf Nodes, and go up for each level in the tree. The root of the tree will have the highest level in the tree and can give an estimate of it's overall size. 
- Leaf nodes contain the actual Key-Value pairs for the tree which can be iterated over as part of the overall tree iteration. - ### Chunking Chunking is the strategy of determining chunk boundaries: Given a list of key-value pairs, it 'decides' which are still inside node A and which already go to the next node B on the same level. @@ -345,6 +341,7 @@ Merge two tree nodes together. - Create a new `TreeNode` `node` - Set `node.keys` to `left.keys`, and concat it with `right.keys` - Set `node.values` to `left.values`, and concat it with `right.values` +- Return `node` ### RebalanceTree(Cursor, ProllyTreeConfig) : TreeNode @@ -357,26 +354,24 @@ The returned `TreeNode` is for the new root of the tree. - if `cursor.parent` is null - return `cursor.node` - Remove the entry at `cursor.parent.index` in `cursor.parent.node` - - Return `RebalanceTree(cursor.parent)` + - Return `RebalanceTree(cursor.parent, ProllyTreeConfig)` - Loop - If `CursorIsAtEnd(cursor)` - If `cursor.parent` is `null` or `ShouldCreateBoundry(ProllyTreeConfig, cursor)` is `true` - break the loop - Else, if `cursor.parent` has another prolly node - - merge it with `cursor.node` + - merge it with `cursor.node` - break the loop - If `ShouldCreateBoundry(ProllyTreeConfig, cursor)` is `false` - Call `AdvanceCursor(cursor)` - continue to next loop cycle - Call `SplitCursor(cursor)` -- Return `RebalanceTree(cursor.parent)` +- Return `RebalanceTree(cursor.parent, ProllyTreeConfig)` ### Put(ProllyTree tree, key, value) : ProllyTree This is a public facing API for setting keys in the prolly tree. 
-TODO: If `length` is equal to or greater than `config.maxChunkSize`, remove the current key to another node - - Get the `config` from the `tree` using `load(tree.config)` - Get root `node` using `load(tree.root)` - Get `cursor` from `CursorAtItem(node, key)` @@ -392,10 +387,17 @@ TODO: If `length` is equal to or greater than `config.maxChunkSize`, remove the - If `length` > `config.maxChunkSize` - call `SplitCursor(cursor)` - TODO: If false, how is this reached? It means that there were no keys even "close" to `key`? -- return `RebalanceTree(cursor)` +- get a new `root` from `RebalanceTree(cursor, config)` +- Create a new `updatedTree` by duplicating `tree` +- Set `updatedTree.root` to `root` +- Return `updatedTree` -### Delete(TreeNode, key, value) : TreeNode +### Delete(ProllyTree tree, key) : ProllyTree +Removes a key from a ProllyTree if it exists + +- Get the `config` from the `tree` using `load(tree.config)` +- Get root `node` using `load(tree.root)` - Get `cursor` from `CursorAtItem(TreeNode, key)` - If `CursorIsValid(cursor)` is `false` - Return the `TreeNode` @@ -403,10 +405,17 @@ TODO: If `length` is equal to or greater than `config.maxChunkSize`, remove the - If it is - Remove the key in `cursor.node.keys` at `cursor.index` - Remove the value in `cursor.node.values` at `cursor.index` -- return `RebalanceTree(cursor)` +- get a new `root` from `RebalanceTree(cursor, config)` +- Create a new `updatedTree` by duplicating `tree` +- Set `updatedTree.root` to `save(root).cid` +- Return `updatedTree` ### Search(TreeNode, prefix) : Iterator +This is the basis for how one can search through a tree. +This may be exposed as a public method by implementors, though they may want to add additional features like an "end" instead of a prefix. +Applications should otherwise manually detect when to stop iterating based on the last item's key that was yielded. 
+ - Get `cursor` from `CursorAtItem(TreeNode, key)` - Create an Iterator (language dependent) - On each pull from the iterator @@ -417,7 +426,7 @@ TODO: If `length` is equal to or greater than `config.maxChunkSize`, remove the - Close the iterator and return - Get the `value` from `CursorGetValue(cursor)` - Yield the `key` and `value` from the iterator - - AdvanceCursor(TreeNode) + - `AdvanceCursor(TreeNode)` ### Diff(TreeNode base, TreeNode new) : Iterator< Diff > From 106d0184a41a004584846c611a86940edf096f1a Mon Sep 17 00:00:00 2001 From: mauve Date: Tue, 13 Dec 2022 04:33:18 -0500 Subject: [PATCH 09/20] Add note about codec and hash function in schema --- .../advanced-data-layouts/prollytree/spec.md | 21 +++++++++++++++---- 1 file changed, 17 insertions(+), 4 deletions(-) diff --git a/specs/advanced-data-layouts/prollytree/spec.md b/specs/advanced-data-layouts/prollytree/spec.md index 1d0f7771..7a02eac4 100644 --- a/specs/advanced-data-layouts/prollytree/spec.md +++ b/specs/advanced-data-layouts/prollytree/spec.md @@ -105,6 +105,8 @@ type ChunkingStrategy union { type ProllyTreeConfig struct { minChunkSize Int maxChunkSize Int + codec Int + hashFunction Int strategy ChunkingStrategy } representation tuple ``` @@ -144,9 +146,21 @@ The size is calculated from the size of the `bytes` when encoding a prolly tree ### `ProllyTreeConfig.maxChunkSize` -The maximum size a chunk could be before it needs to be split regardless of the chunk boundries +The maximum size a chunk could be before it needs to be split regardless of the chunk boundries. +If a node reaches this size (or larger) after an insertion, it will trigger a chunking boundry regardless of the chunking strategy used. +This is in order to avoid attacks that make chunks larger than necessary. The size is calculated from the size of the `bytes` when encoding a prolly tree node. +### `ProllyTreeConfig.codec` + +This is the multicodec ID for the codec to use when encoding the tree. 
+Generally it is reccommended to use DAG-CBOR unless you really know what you're doing. + +### `ProllyTreeConfig.hashFunction` + +This is the multicodec ID for the hash function to use for generating CIDs. +You should use whatever the default is for CIDv1 unless you really know what you're doing. + ### `ProllyTreeConfig.strategy` The `ChunkingStrategy` to use for forming the prolly tree. @@ -180,12 +194,11 @@ Left to the implementation is how to generate and load data from IPLD. Whenever a Link is resolved to a TreeNode or ProllyTreeConfig, this is done via the IPLD LinkSystem which it an implementation detail outside of the scope of this document. When serializing data into a CID+Block, one should use the codec and multihash that's either used by root CID of the tree you are modifying, or specificed explicitly during tree creation. Blocks should also be saved to the IPLD LinkSystem and how this is done is also outside the scope of this document. -We will assume that there is a `getNode(cid)` API which loads IPLD Nodes from a CID, and a `saveNode(node) => {cid, bytes}` which will save a node and get back a CID and the byte contents used for generating the CID. Note that `saveNode` should use the same encoding and hash algorithm as the root of the tree based on the root `ProllyTreeConfig` +We will assume that there is a `getNode(cid)` API which loads IPLD Nodes from a CID, and a `saveNode(node) => {cid, bytes}` which will save a node and get back a CID and the byte contents used for generating the CID. Note that `saveNode` should use the same encoding and hash algorithm as the root of the tree based on the root `ProllyTreeConfig.codec` and `ProllyTreeConfig.hashFunction`. This section also relies on the existance of a `Cursor` structure to keep track of state when iterating through a prolly tree. - Note that this is not mandatory for implementations and is more of a guide to help structure how the tree can be traversed. 
-Implementations may want to use other approaches to recursive traversal. +Implementations may want to use other approaches to recursive traversal and updating. Not mentioned here is a method for performing batch operations such as sequential writes. From 2e82bee47302c3db4759fe2aa6def25b6cf11ce1 Mon Sep 17 00:00:00 2001 From: mauve Date: Tue, 13 Dec 2022 14:05:45 -0500 Subject: [PATCH 10/20] Fix up prolly tree formating per vmx's comments --- .../advanced-data-layouts/prollytree/spec.md | 25 +++++++++---------- 1 file changed, 12 insertions(+), 13 deletions(-) diff --git a/specs/advanced-data-layouts/prollytree/spec.md b/specs/advanced-data-layouts/prollytree/spec.md index 7a02eac4..d934d78d 100644 --- a/specs/advanced-data-layouts/prollytree/spec.md +++ b/specs/advanced-data-layouts/prollytree/spec.md @@ -153,13 +153,13 @@ The size is calculated from the size of the `bytes` when encoding a prolly tree ### `ProllyTreeConfig.codec` -This is the multicodec ID for the codec to use when encoding the tree. +This is the multicodec code for the codec to use when encoding the tree. Generally it is reccommended to use DAG-CBOR unless you really know what you're doing. ### `ProllyTreeConfig.hashFunction` -This is the multicodec ID for the hash function to use for generating CIDs. -You should use whatever the default is for CIDv1 unless you really know what you're doing. +This is the multicodec code for the hash function to use for generating CIDs. +It is reccommended to use SHA2-256 for your hash function unless you know what you're doing. ### `ProllyTreeConfig.strategy` @@ -173,14 +173,14 @@ This is a Map with different keys representing different strategies. This spec currently supports the `byteThreshold` strategy, however we are intentionally leaving room for further chunking strategies which use more advanced algorithms such as the Weibull Distribution strategy used in Dolt. 
-### `ByteThresholdConfig` +### `HashThresholdConfig` This is the strategy that was described in the [Peer to Peer Ordered Search Indexes](https://0fps.net/2020/12/19/peer-to-peer-ordered-search-indexes/) paper. It works by hashing a key+value pair, reading the last 4 bytes as a 32 bit unsigned integer, and checking how many bits are 0's relative to the chunking factor. The `chunkingFactor` must be less than the maximum value of a 32 bit unsigned integer. -It is used to calculate a "chunking threshold" using the forumala `Math.floor(4294967295 / chunkingFactor)`. +It is used to calculate a "chunking threshold" using the formula `Math.floor(4294967295 / chunkingFactor)`. It is reccommended to use powers of two to make it easier to relate to how many bits should be `0` in the chunking threshold. The larger the chunking factor, the less likely it is that a given keypair will result in a chunking boundry, and thus will lead to TreeNodes with more entries within them. @@ -197,6 +197,7 @@ When serializing data into a CID+Block, one should use the codec and multihash t We will assume that there is a `getNode(cid)` API which loads IPLD Nodes from a CID, and a `saveNode(node) => {cid, bytes}` which will save a node and get back a CID and the byte contents used for generating the CID. Note that `saveNode` should use the same encoding and hash algorithm as the root of the tree based on the root `ProllyTreeConfig.codec` and `ProllyTreeConfig.hashFunction`. This section also relies on the existance of a `Cursor` structure to keep track of state when iterating through a prolly tree. +This structure should keep track of a `TreeNode` that it is currently focused on, an `index` for the entry in the node which is being focused on, and optionally a `parent` Cursor for the parent `TreeNode` being focused on. Note that this is not mandatory for implementations and is more of a guide to help structure how the tree can be traversed. 
Implementations may want to use other approaches to recursive traversal and updating. @@ -210,7 +211,7 @@ This is useful for performing searches for keys. 1. Define a `Cursor` struct with a `index Integer`, `node TreeNode`, and`parent Cursor` 2. Set `cursor.node` to the `TreeNode` 3. Set `cursor.index` to `KeyIndex(cursor.node, prefix)` -6. Start a loop +4. Start a loop 1. if `IsLeaf(cursor.node)` is `true`, break the loop 2. get the `link` from `CursorGetValue(cursor)` 3. resolve the TreeNode at the `link` to `newNode` @@ -219,7 +220,7 @@ This is useful for performing searches for keys. 6. set `cursor.parent` to `parent` 7. set `cursor.node` to `newNode` 8. set `cursor.index` to `KeyIndex(cursor.node, item)` -7. return the `cursor` +5. return the `cursor` ### IsLeaf(TreeNode) : Boolean @@ -237,11 +238,11 @@ It can sometimes be set to an invalid position if a search failed or if there ar 1. get the `length` of `cursor.node.keys` 2. if `length` is `0`, return `false` 3. if `cursor.index` is less than 0, return `false` -4. if `cursor.index` is greater than or equal to `legnth`, return `false` +4. if `cursor.index` is greater than or equal to `length`, return `false` ### CursorIsAtEnd(Cursor) : Boolean -Check if the given cursor is at the end of it's TreeNode. +Check if the given cursor is at the end of its TreeNode. This is used to check if there are more items that can be traversed over. 1. Get the `length` of `cursor.node.keys` @@ -312,7 +313,7 @@ This should be called after adding a key-value pair to a leaf TreeNode to determ 1. Get the `ChunkingStrategy` from the `config` 2. Get the `length` of the `cursor.node` from `save(cursor.node).bytes.length` 3. If `length` is less than `config.minChunkSize` return `false` -5. If the `ChunkingStrategy` is not a `ByteThresholdConfig`, return an error (unsupported chunking strategy) +5. If the `ChunkingStrategy` is not a `HashThresholdConfig`, return an error (unsupported chunking strategy) 6. 
Set `threshold` to `Math.floor(MAX_UNIT32 / config.chunkingFactor)` 7. Get the `hash` function associated with the multicodec in `config.hashFunction` 8. Calculate the `entryHash` from the `hash(CursorGetKey(cursor) + save(CursorGetValue(cursor)).bytes)` @@ -423,7 +424,7 @@ Removes a key from a ProllyTree if it exists - Set `updatedTree.root` to `save(root).cid` - Return `updatedTree` -### Search(TreeNode, prefix) : Iterator +### Search(TreeNode, start) : Iterator This is the basis for how one can search through a tree. This may be exposed as a public method by implementors, though they may want to add additional features like an "end" instead of a prefix. @@ -435,8 +436,6 @@ Applications should otherwise manually detect when to stop iterating based on th - If `CursorIsValid(cursor)` is `false` - Close the iterator and return - Get the `key` from `CursorGetKey(cursor)` - - If `key` does not start with `prefix` - - Close the iterator and return - Get the `value` from `CursorGetValue(cursor)` - Yield the `key` and `value` from the iterator - `AdvanceCursor(TreeNode)` From d6de6640feb90394febbbb179a45da4a119fd938 Mon Sep 17 00:00:00 2001 From: mauve Date: Tue, 13 Dec 2022 14:08:53 -0500 Subject: [PATCH 11/20] Fix spelling errors in prolly trees spec --- specs/advanced-data-layouts/prollytree/spec.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/specs/advanced-data-layouts/prollytree/spec.md b/specs/advanced-data-layouts/prollytree/spec.md index d934d78d..8d619bf5 100644 --- a/specs/advanced-data-layouts/prollytree/spec.md +++ b/specs/advanced-data-layouts/prollytree/spec.md @@ -5,8 +5,8 @@ Many applications have been using IPLD to represent large datasets in use cases such as blockchains or decentralized databases. One side effect is that querying large amounts of data has become a more common need which often gets solved with centralized database indexes. 
IPLD Prolly Trees are an important step in this direction in that they provide an interface similar to a Database's B+ tree. -Some of the immediate benefits of using Prolly Trees over regular B+ Trees is that Prolly Trees are self-balancing and can be determenistically merged with other trees which is important to reduce the amount of restructuring necessary for collaborativ index creation. -In this document we will build off of prior art in [Probabilistic B-Trees](https://github.com/attic-labs/noms/blob/master/doc/intro.md#prolly-trees-probabilistic-b-trees) to define a speficication on how they can be represented in IPLD, how they can be constructed, how they can be queried, and how multiple trees can be merged together. +Some of the immediate benefits of using Prolly Trees over regular B+ Trees is that Prolly Trees are self-balancing and can be deterministically merged with other trees which is important to reduce the amount of restructuring necessary for collaborativ index creation. +In this document we will build off of prior art in [Probabilistic B-Trees](https://github.com/attic-labs/noms/blob/master/doc/intro.md#prolly-trees-probabilistic-b-trees) to define a specification on how they can be represented in IPLD, how they can be constructed, how they can be queried, and how multiple trees can be merged together. ## References @@ -33,9 +33,9 @@ https://www.dolthub.com/blog/2020-05-13-dolt-commit-graph-and-structural-sharing ## Summary/Overview -Prolly trees leverage content addresibility to create ordered search indexes similar to the common B+ tree structure used for databases, but with the added ability to determenistically merge trees together. +Prolly trees leverage content addressability to create ordered search indexes similar to the common B+ tree structure used for databases, but with the added ability to determenistically merge trees together. 
-At the highest level, Prolly Trees act as a key value store whith the ability to iterate over key ranges which are lexicographically sorted. This property of sorted iteration is important for creating database indexes for a variety of use cases like full text search, sorting of large datasets, and arbitrary search queries. +At the highest level, Prolly Trees act as a key value store with the ability to iterate over key ranges which are lexicographically sorted. This property of sorted iteration is important for creating database indexes for a variety of use cases like full text search, sorting of large datasets, and arbitrary search queries. ### Search Tree @@ -64,7 +64,7 @@ In order to modify the tree (inserting, updating, or removing one or multiple va The tree is (partially) rebalance bottom-up. Starting at the inserted/modified leaf or node referencing a now deleted/removed leaf, successively walk up the tree. If a removed item was the only item in its node: Remove the node from its parent and Check if the hash/address/CID of a new leaf/node splits the parent chunk (split the parent node if yes). -Then, check if a removed leaf/node was a 'splitting node' and the nodes need to be merged (only the last node in a chunk can be a splitting node). If splitting criterion holds for the last item of the node and there is the current node is succeeded by another node on the same level: merge the currend node with the succeeding node. Continue walking up the path. If at root and it is being split: Create a new root that links to the freshly split nodes. Return the new root CID. +Then, check if a removed leaf/node was a 'splitting node' and the nodes need to be merged (only the last node in a chunk can be a splitting node). If splitting criterion holds for the last item of the node and there is the current node is succeeded by another node on the same level: merge the current node with the succeeding node. Continue walking up the path. 
If at root and it is being split: Create a new root that links to the freshly split nodes. Return the new root CID. Pay attention to the fact that the boundary algorithm is not related to the node CID, i.e the boundary is generated *before* the node CID is generated. @@ -119,11 +119,11 @@ and a `config` link to the ProllyTreeConfig which has infomration about the chun ### `TreeNode` This is the "Tree" part of Prolly Trees and is made to be general purpose. -We can potentially expec to use the TreeNode structure in subsequent specs with different types of trees. +We can potentially expect to use the TreeNode structure in subsequent specs with different types of trees. ### `TreeeNode.keys` -Raw keys(keys/values input from users) for leaf node. Key-value pairs are sorted by byte value with the "larger" keys being at the end. Values are comared at the first byte, and going down to the end. This means that keys that are just a prefix come before keys that are prefix + 1 byte. +Raw keys(keys/values input from users) for leaf node. Key-value pairs are sorted by byte value with the "larger" keys being at the end. Values are compared at the first byte, and going down to the end. This means that keys that are just a prefix come before keys that are prefix + 1 byte. ### `TreeNode.values` @@ -186,7 +186,7 @@ It is reccommended to use powers of two to make it easier to relate to how many The larger the chunking factor, the less likely it is that a given keypair will result in a chunking boundry, and thus will lead to TreeNodes with more entries within them. It also contains a `hashFunction` field which points at a hash function in the [multicodec table](https://github.com/multiformats/multicodec/blob/master/table.csv) to use for hashing. -You should ensure that you are using a cryptographic hash function in order to make it more difficult to create collissions and to ensure you have an even distribution of values. 
+You should ensure that you are using a cryptographic hash function in order to make it more difficult to create collisions and to ensure you have an even distribution of values. ## Algorithm in detail From a06bfae366314d73c685345f9f656eb3e95e7519 Mon Sep 17 00:00:00 2001 From: mauve Date: Tue, 13 Dec 2022 14:30:43 -0500 Subject: [PATCH 12/20] Acccount for max chunk size when merging treenodes on deletion --- specs/advanced-data-layouts/prollytree/spec.md | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/specs/advanced-data-layouts/prollytree/spec.md b/specs/advanced-data-layouts/prollytree/spec.md index 8d619bf5..392816ab 100644 --- a/specs/advanced-data-layouts/prollytree/spec.md +++ b/specs/advanced-data-layouts/prollytree/spec.md @@ -371,12 +371,18 @@ The returned `TreeNode` is for the new root of the tree. - Return `RebalanceTree(cursor.parent, ProllyTreeConfig)` - Loop - If `CursorIsAtEnd(cursor)` - - If `cursor.parent` is `null` or `ShouldCreateBoundry(ProllyTreeConfig, cursor)` is `true` + - If `cursor.parent` is `null` or `CursorAtChunkingBoundry(ProllyTreeConfig, cursor)` is `true` - break the loop - - Else, if `cursor.parent` has another prolly node - - merge it with `cursor.node` + - if `CursorIsAtEnd(cursor.parent)` is `true` - break the loop - - If `ShouldCreateBoundry(ProllyTreeConfig, cursor)` is `false` + - Set `right` to `load(cursor.parent.values[cursor.parent.index + 1])` + - Create a new `merged` `TreeNode` from `MergeNodes(cursor.node, right)` + - Get the `length` from `save(merged).bytes`, as well as the `cid` + - If `length` is less than or equal to `config.maxChunkSize` + - set `cursor.parent.node.values[cursor.parent.index]` to `cid` + - remove the entry in `cursor.parent.node` at `cursor.parent.index + 1` + - break the loop + - If `CursorAtChunkingBoundry(ProllyTreeConfig, cursor)` is `false` - Call `AdvanceCursor(cursor)` - continue to next loop cycle - Call `SplitCursor(cursor)` From 
e5042daf099398ea340ddd3348e0c6b66be82ecf Mon Sep 17 00:00:00 2001 From: mauve Date: Tue, 13 Dec 2022 14:41:51 -0500 Subject: [PATCH 13/20] Add index to prolltree ADL --- specs/advanced-data-layouts/prollytree/index.md | 15 +++++++++++++++ 1 file changed, 15 insertions(+) create mode 100644 specs/advanced-data-layouts/prollytree/index.md diff --git a/specs/advanced-data-layouts/prollytree/index.md b/specs/advanced-data-layouts/prollytree/index.md new file mode 100644 index 00000000..c132c808 --- /dev/null +++ b/specs/advanced-data-layouts/prollytree/index.md @@ -0,0 +1,15 @@ +--- +title: "Specs: Prolly Tree ADL" +navTitle: "Prolly Tree ADL" +--- + +Prolly Tree ADL +=============== + +The Prolly Tree ADL provides a [map](/docs/data-model/kinds/#map-kind) interface, while sharding data internally. + +Prolly trees can support large volumes of ordered key-value pairs with configurable chunking strategies for how wide tree nodes should be. +They can also be merged determenistically with each other if they have compatible chunking configurations while skipping over similar key ranges. +In general they are a useful building block for database indexes and large append only logs which wish to make use of sparse querying. 
+ +- [Prolly Tree ADL Specification](./spec/) From 183a45d9eac7ca3c28e97a5bab1c29b53820c166 Mon Sep 17 00:00:00 2001 From: RangerMauve Date: Wed, 14 Dec 2022 12:13:44 -0500 Subject: [PATCH 14/20] Update specs/advanced-data-layouts/prollytree/spec.md Co-authored-by: ch3 <72873632+che-ch3@users.noreply.github.com> --- specs/advanced-data-layouts/prollytree/spec.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specs/advanced-data-layouts/prollytree/spec.md b/specs/advanced-data-layouts/prollytree/spec.md index 392816ab..1d64d54d 100644 --- a/specs/advanced-data-layouts/prollytree/spec.md +++ b/specs/advanced-data-layouts/prollytree/spec.md @@ -84,7 +84,7 @@ type ProllyTree struct { root &TreeNode } representation tuple -type TreeeNode struct { +type TreeNode struct { # Is leaf when level is 0 level Int keys [Bytes] From a9757ffd7d748fa9a74295204532a7d55d661314 Mon Sep 17 00:00:00 2001 From: RangerMauve Date: Wed, 14 Dec 2022 12:14:00 -0500 Subject: [PATCH 15/20] Update specs/advanced-data-layouts/prollytree/spec.md Co-authored-by: ch3 <72873632+che-ch3@users.noreply.github.com> --- specs/advanced-data-layouts/prollytree/spec.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specs/advanced-data-layouts/prollytree/spec.md b/specs/advanced-data-layouts/prollytree/spec.md index 1d64d54d..b32e1743 100644 --- a/specs/advanced-data-layouts/prollytree/spec.md +++ b/specs/advanced-data-layouts/prollytree/spec.md @@ -121,7 +121,7 @@ and a `config` link to the ProllyTreeConfig which has infomration about the chun This is the "Tree" part of Prolly Trees and is made to be general purpose. We can potentially expect to use the TreeNode structure in subsequent specs with different types of trees. -### `TreeeNode.keys` +### `TreeNode.keys` Raw keys(keys/values input from users) for leaf node. Key-value pairs are sorted by byte value with the "larger" keys being at the end. Values are compared at the first byte, and going down to the end. 
This means that keys that are just a prefix come before keys that are prefix + 1 byte. From f536669aa10583178a111166e8f197a37ba9571f Mon Sep 17 00:00:00 2001 From: RangerMauve Date: Wed, 14 Dec 2022 12:14:13 -0500 Subject: [PATCH 16/20] Update specs/advanced-data-layouts/prollytree/spec.md Co-authored-by: ch3 <72873632+che-ch3@users.noreply.github.com> --- specs/advanced-data-layouts/prollytree/spec.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specs/advanced-data-layouts/prollytree/spec.md b/specs/advanced-data-layouts/prollytree/spec.md index b32e1743..8caf3bc4 100644 --- a/specs/advanced-data-layouts/prollytree/spec.md +++ b/specs/advanced-data-layouts/prollytree/spec.md @@ -304,7 +304,7 @@ If the TreeNode isn't a leaf, the function will traverse into it to find the lea 5. Get `value` from `CursorGetValue(Cursor)` 6. Return `value` -### CursorAtChunkingBoundry(ProllyTreeConfig config, Cursor) : Boolean +### CursorAtChunkingBoundary(ProllyTreeConfig config, Cursor) : Boolean Checks if the cursor is currently pointing to a chunking boundry. If `true`, a new TreeNode should be created for all subsequent keys. 
From 6ad39b47fc386b82d0679d67b67fb9b5fecdb8ad Mon Sep 17 00:00:00 2001 From: mauve Date: Wed, 14 Dec 2022 12:20:14 -0500 Subject: [PATCH 17/20] Replace tabs with spaces in spec --- .../advanced-data-layouts/prollytree/spec.md | 102 +++++++++--------- 1 file changed, 51 insertions(+), 51 deletions(-) diff --git a/specs/advanced-data-layouts/prollytree/spec.md b/specs/advanced-data-layouts/prollytree/spec.md index 8caf3bc4..a0728da5 100644 --- a/specs/advanced-data-layouts/prollytree/spec.md +++ b/specs/advanced-data-layouts/prollytree/spec.md @@ -103,11 +103,11 @@ type ChunkingStrategy union { } representation keyed type ProllyTreeConfig struct { - minChunkSize Int - maxChunkSize Int - codec Int - hashFunction Int - strategy ChunkingStrategy + minChunkSize Int + maxChunkSize Int + codec Int + hashFunction Int + strategy ChunkingStrategy } representation tuple ``` @@ -212,14 +212,14 @@ This is useful for performing searches for keys. 2. Set `cursor.node` to the `TreeNode` 3. Set `cursor.index` to `KeyIndex(cursor.node, prefix)` 4. Start a loop - 1. if `IsLeaf(cursor.node)` is `true`, break the loop - 2. get the `link` from `CursorGetValue(cursor)` - 3. resolve the TreeNode at the `link` to `newNode` - 4. Set `parent` to `cursor` - 5. set `cursor` to a new `Cursor` struct - 6. set `cursor.parent` to `parent` - 7. set `cursor.node` to `newNode` - 8. set `cursor.index` to `KeyIndex(cursor.node, item)` + 1. if `IsLeaf(cursor.node)` is `true`, break the loop + 2. get the `link` from `CursorGetValue(cursor)` + 3. resolve the TreeNode at the `link` to `newNode` + 4. Set `parent` to `cursor` + 5. set `cursor` to a new `Cursor` struct + 6. set `cursor.parent` to `parent` + 7. set `cursor.node` to `newNode` + 8. set `cursor.index` to `KeyIndex(cursor.node, item)` 5. return the `cursor` ### IsLeaf(TreeNode) : Boolean @@ -261,12 +261,12 @@ When reaching the end of the current TreeNode, the parent cursor will be increme 1. Get the `length` of the `cursor.node.keys` 2. 
if `cursor.index` is less than `length - 1`, increment `cursor.index` and return the `cursor` 3. If `cursor.parent` is `null` - 1. set `cursor.index` to `length` - 2. return the `cursor` + 1. set `cursor.index` to `length` + 2. return the `cursor` 3. Invoke `AdvanceCursor(cursor.parent)` 4. If `CursorIsValid(cursor.parent)` is `false` - 1. set `cursor.index` to `length` - 2. return the `cursor` + 1. set `cursor.index` to `length` + 2. return the `cursor` 6. Get the `link` from `CursorGetValue(cursor.parent)` 7. Check that `link` is not `null`, throw an error if it is 8. Get the `node` TreeNode from using `load(link)` @@ -333,13 +333,13 @@ This will add a new child to the parent node, and create a parent node+cursor if - Get `left` and `right` from `SplitNode(cursor.node, cursor.index)` - If `cursor.parent` is null - - Create a new `TreeNode` `parentNode` - - Set `parentNode.level` to `cursor.node.level + 1` - - Set `parentNode.keys[0]` to `left.keys[0]` - - Create a new Cursor `parentCursor` - - Set `parentCursor.index` to `0` - - Set `parentCursor.node` to `parentNode` - - Set `cursor.parent` to `parentCursor` + - Create a new `TreeNode` `parentNode` + - Set `parentNode.level` to `cursor.node.level + 1` + - Set `parentNode.keys[0]` to `left.keys[0]` + - Create a new Cursor `parentCursor` + - Set `parentCursor.index` to `0` + - Set `parentCursor.node` to `parentNode` + - Set `cursor.parent` to `parentCursor` - Set the entry at `cursor.parent.node` at `cursor.parent.index` to `left.keys[0]`,`save(left).cid` - Insert a new entry into `cursor.parent.node` at `cursor.parent.index+1` to `right.keys[0]`,`save(right).cid` - Increment `cursor.parent.index` by `1` @@ -351,7 +351,7 @@ This will add a new child to the parent node, and create a parent node+cursor if Merge two tree nodes together. 
- If `left.level` != `right.level` - - Return an error (incompatible tree node levels) + - Return an error (incompatible tree node levels) - Create a new `TreeNode` `node` - Set `node.keys` to `left.keys`, and concat it with `right.keys` - Set `node.values` to `left.values`, and concat it with `right.values` @@ -374,18 +374,18 @@ The returned `TreeNode` is for the new root of the tree. - If `cursor.parent` is `null` or `CursorAtChunkingBoundry(ProllyTreeConfig, cursor)` is `true` - break the loop - if `CursorIsAtEnd(cursor.parent)` is `true` - - break the loop + - break the loop - Set `right` to `load(cursor.parent.values[cursor.parent.index + 1])` - Create a new `merged` `TreeNode` from `MergeNodes(cursor.node, right)` - Get the `length` from `save(merged).bytes`, as well as the `cid` - If `length` is less than or equal to `config.maxChunkSize` - - set `cursor.parent.node.values[cursor.parent.index]` to `cid` - - remove the entry in `cursor.parent.node` at `cursor.parent.index + 1` - - break the loop + - set `cursor.parent.node.values[cursor.parent.index]` to `cid` + - remove the entry in `cursor.parent.node` at `cursor.parent.index + 1` + - break the loop - If `CursorAtChunkingBoundry(ProllyTreeConfig, cursor)` is `false` - - Call `AdvanceCursor(cursor)` - - continue to next loop cycle - - Call `SplitCursor(cursor)` + - Call `AdvanceCursor(cursor)` + - continue to next loop cycle + - Call `SplitCursor(cursor)` - Return `RebalanceTree(cursor.parent, ProllyTreeConfig)` ### Put(ProllyTree tree, key, value) : ProllyTree @@ -396,16 +396,16 @@ This is a public facing API for setting keys in the prolly tree. 
- Get root `node` using `load(tree.root)` - Get `cursor` from `CursorAtItem(node, key)` - If `CursorIsValid(cursor)` is `true` - - check if the key is at `CursorGetKey` - - If it is - - Set the value in the `cursor.node.values` at `cursor.index` to `value` - - If it isn't - - add the `key` to `cursor.node.keys` after the `cursor.index`, shifting subsequent items down - - add the `values` to `cursor.node.values` after the `cursor.index`, shifting subsequent items down - - increment `cursor.index` by `1` - - Get the `length` from `save(cursor.node).bytes` - - If `length` > `config.maxChunkSize` - - call `SplitCursor(cursor)` + - check if the key is at `CursorGetKey` + - If it is + - Set the value in the `cursor.node.values` at `cursor.index` to `value` + - If it isn't + - add the `key` to `cursor.node.keys` after the `cursor.index`, shifting subsequent items down + - add the `values` to `cursor.node.values` after the `cursor.index`, shifting subsequent items down + - increment `cursor.index` by `1` + - Get the `length` from `save(cursor.node).bytes` + - If `length` > `config.maxChunkSize` + - call `SplitCursor(cursor)` - TODO: If false, how is this reached? It means that there were no keys even "close" to `key`? 
- get a new `root` from `RebalanceTree(cursor, config)` - Create a new `updatedTree` by duplicating `tree` @@ -420,11 +420,11 @@ Removes a key from a ProllyTree if it exists - Get root `node` using `load(tree.root)` - Get `cursor` from `CursorAtItem(TreeNode, key)` - If `CursorIsValid(cursor)` is `false` - - Return the `TreeNode` + - Return the `TreeNode` - check if the key is at `CursorGetKey(cursor)` - If it is - - Remove the key in `cursor.node.keys` at `cursor.index` - - Remove the value in `cursor.node.values` at `cursor.index` + - Remove the key in `cursor.node.keys` at `cursor.index` + - Remove the value in `cursor.node.values` at `cursor.index` - get a new `root` from `RebalanceTree(cursor, config)` - Create a new `updatedTree` by duplicating `tree` - Set `updatedTree.root` to `save(root).cid` @@ -439,12 +439,12 @@ Applications should otherwise manually detect when to stop iterating based on th - Get `cursor` from `CursorAtItem(TreeNode, key)` - Create an Iterator (language dependent) - On each pull from the iterator - - If `CursorIsValid(cursor)` is `false` - - Close the iterator and return - - Get the `key` from `CursorGetKey(cursor)` - - Get the `value` from `CursorGetValue(cursor)` - - Yield the `key` and `value` from the iterator - - `AdvanceCursor(TreeNode)` + - If `CursorIsValid(cursor)` is `false` + - Close the iterator and return + - Get the `key` from `CursorGetKey(cursor)` + - Get the `value` from `CursorGetValue(cursor)` + - Yield the `key` and `value` from the iterator + - `AdvanceCursor(TreeNode)` ### Diff(TreeNode base, TreeNode new) : Iterator< Diff > From 93d9baffc2e9736eaef4faea7498b4f2962658c7 Mon Sep 17 00:00:00 2001 From: mauve Date: Thu, 15 Dec 2022 11:40:27 -0500 Subject: [PATCH 18/20] Replace level with isLeaf --- .../advanced-data-layouts/prollytree/spec.md | 32 +++++++------------ 1 file changed, 12 insertions(+), 20 deletions(-) diff --git a/specs/advanced-data-layouts/prollytree/spec.md 
b/specs/advanced-data-layouts/prollytree/spec.md index a0728da5..b45e2b64 100644 --- a/specs/advanced-data-layouts/prollytree/spec.md +++ b/specs/advanced-data-layouts/prollytree/spec.md @@ -40,9 +40,9 @@ At the highest level, Prolly Trees act as a key value store with the ability to ### Search Tree The basic structure is that of an ordered search tree: The contained keys are organised such that they can be found (inserted, updated, ...) efficiently by value. -To efficiently find keys, the tree is traversed top-to-bottom and the non-leaf nodes help navigating/comparing the values efficiently. An intermediate node contains several ordered key-address pairs, which link to further nodes (intermediate or leaf) on the next lower level. -Levels go from `0` representing Leaf Nodes, and go up for each level in the tree. The root of the tree will have the highest level in the tree and can give an estimate of it's overall size. -Leaf nodes contain the actual Key-Value pairs for the tree which can be iterated over as part of the overall tree iteration. +To efficiently find keys, the tree is traversed top-to-bottom and the non-leaf nodes help navigating/comparing the values efficiently. +An intermediate node contains several ordered key-address pairs, which link to further nodes (intermediate or leaf) on the next lower level. +Leaf nodes are identified using an `isLeaf` flag which tells the reader to stop trying to traverse deeper and to treat the values as actual values. ### Chunking @@ -85,8 +85,7 @@ type ProllyTree struct { } representation tuple type TreeNode struct { - # Is leaf when level is 0 - level Int + isLeaf Boolean keys [Bytes] # If a leaf, contains entry valies # If an intermediate node, contains Links to further TreeNodes @@ -131,9 +130,10 @@ Values corresponding to keys. For leaf nodes these will be Links pointing to additional Values can point to arbitrary IPLD nodes and it is up to applications to generate and process them. 
-### `TreeNode.level` +### `TreeNode.isLeaf` -0 for leaf nodes, and add 1 for parent levels (and incrementing as more parents are added) +A flag indicating whether a TreeNode is a leaf (`true`) or an intermediate node (`false`). +If `isLeaf` is `false`, all the `values` should be Links pointing to other TreeNodes. ### `ProllyTreeConfig` @@ -212,7 +212,7 @@ This is useful for performing searches for keys. 2. Set `cursor.node` to the `TreeNode` 3. Set `cursor.index` to `KeyIndex(cursor.node, prefix)` 4. Start a loop - 1. if `IsLeaf(cursor.node)` is `true`, break the loop + 1. if `cursor.node.isLeaf` is `true`, break the loop 2. get the `link` from `CursorGetValue(cursor)` 3. resolve the TreeNode at the `link` to `newNode` 4. Set `parent` to `cursor` @@ -222,14 +222,6 @@ This is useful for performing searches for keys. 8. set `cursor.index` to `KeyIndex(cursor.node, item)` 5. return the `cursor` -### IsLeaf(TreeNode) : Boolean - -Check to see if a given TreeNode is for a leaf, or if it is an intermediate node in the tree. - -1. get the `level` of the `TreeNode` -2. if the `level` is `0` return `true` -3. else return `false` - ### CursorIsValid(Cursor) : Boolean Check if a given cursor is set to a valid position. @@ -286,7 +278,7 @@ Get the current key pointed to by a cursor. Get the current value pointed to by the cursor. -1. If `IsLeaf(cursor.node)` is `false`, return `null` (or throw an error) +1. If `cursor.node.isLeaf` is `false`, return `null` (or throw an error) 1. If `CursorIsValid(Cursor)` is `false`, return `null` (or throw an error) 2. Get the `value` from `cursor.node.values` at `cursor.index` 3. 
return the `value` @@ -334,7 +326,7 @@ This will add a new child to the parent node, and create a parent node+cursor if - Get `left` and `right` from `SplitNode(cursor.node, cursor.index)` - If `cursor.parent` is null - Create a new `TreeNode` `parentNode` - - Set `parentNode.level` to `cursor.node.level + 1` + - Set `parentNode.isLeaf` to `false` - Set `parentNode.keys[0]` to `left.keys[0]` - Create a new Cursor `parentCursor` - Set `parentCursor.index` to `0` @@ -350,8 +342,8 @@ This will add a new child to the parent node, and create a parent node+cursor if Merge two tree nodes together. -- If `left.level` != `right.level` - - Return an error (incompatible tree node levels) +- If `left.isLeaf` != `right.isLeaf` + - Return an error, cannot merge leaves with intermediate nodes. - Create a new `TreeNode` `node` - Set `node.keys` to `left.keys`, and concat it with `right.keys` - Set `node.values` to `left.values`, and concat it with `right.values` From 6e6a60ee1668a66eea4c7fb754c436f5ae14ec5f Mon Sep 17 00:00:00 2001 From: mauve Date: Thu, 15 Dec 2022 13:38:19 -0500 Subject: [PATCH 19/20] Add cidVersion and hashLength to ProllyTreeConfig --- specs/advanced-data-layouts/prollytree/spec.md | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/specs/advanced-data-layouts/prollytree/spec.md b/specs/advanced-data-layouts/prollytree/spec.md index b45e2b64..27b69763 100644 --- a/specs/advanced-data-layouts/prollytree/spec.md +++ b/specs/advanced-data-layouts/prollytree/spec.md @@ -104,8 +104,10 @@ type ChunkingStrategy union { type ProllyTreeConfig struct { minChunkSize Int maxChunkSize Int + cidVersion Int codec Int hashFunction Int + nullable hashLength Int strategy ChunkingStrategy } representation tuple ``` @@ -151,6 +153,11 @@ If a node reaches this size (or larger) after an insertion, it will trigger a ch This is in order to avoid attacks that make chunks larger than necessary. 
 The size is calculated from the size of the `bytes` when encoding a prolly tree node.
 
+### `ProllyTreeConfig.cidVersion`
+
+This is the version number for the CID version that should be used for encoding links in the tree.
+It is recommended to use version `1`.
+
 ### `ProllyTreeConfig.codec`
 
 This is the multicodec code for the codec to use when encoding the tree.
@@ -161,6 +168,13 @@ Generally it is reccommended to use DAG-CBOR unless you really know what you're
 This is the multicodec code for the hash function to use for generating CIDs.
 It is reccommended to use SHA2-256 for your hash function unless you know what you're doing.
 
+### `ProllyTreeConfig.hashLength`
+
+This is the multihash length parameter which should be used for generating CIDs.
+It can be set to `null` or `-1` to use the default hash length from the hash function output.
+You should generally use the default unless you have particular needs for shortening CIDs.
+Setting this to lower values increases the chances of collisions when encoding data.
+
 ### `ProllyTreeConfig.strategy`
 
 The `ChunkingStrategy` to use for forming the prolly tree.

From 776b537e0d16dc0341f1bec13ee79bd05a0dfb9e Mon Sep 17 00:00:00 2001
From: mauve
Date: Thu, 15 Dec 2022 13:54:36 -0500
Subject: [PATCH 20/20] Update wording for hashLength config

---
 specs/advanced-data-layouts/prollytree/spec.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/specs/advanced-data-layouts/prollytree/spec.md b/specs/advanced-data-layouts/prollytree/spec.md
index 27b69763..57067afe 100644
--- a/specs/advanced-data-layouts/prollytree/spec.md
+++ b/specs/advanced-data-layouts/prollytree/spec.md
@@ -171,7 +171,8 @@ It is reccommended to use SHA2-256 for your hash function unless you know what y
 ### `ProllyTreeConfig.hashLength`
 
 This is the multihash length parameter which should be used for generating CIDs.
-It can be set to `null` or `-1` to use the default hash length from the hash function output.
+It can be set to `null` to use the default hash length from the hash function output.
+Otherwise it should be greater than `0`.
 You should generally use the default unless you have particular needs for shortening CIDs.
 Setting this to lower values increases the chances of collisions when encoding data.
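
Review note: the `hashLength` wording above can be illustrated with a minimal Python sketch. It is not part of the spec or of any existing implementation. The snake_case field names, the default values, and the `multihash_digest` helper are assumptions made for illustration, `strategy` is left out for brevity, and only SHA2-256 is handled. The sketch rejects non-positive lengths, as the updated wording requires, and truncates the digest when a shorter length is configured.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

SHA2_256 = 0x12  # multicodec code for sha2-256
DAG_CBOR = 0x71  # multicodec code for dag-cbor


@dataclass(frozen=True)
class ProllyTreeConfig:
    # Fields mirror the schema in the patch; `strategy` is omitted here.
    min_chunk_size: int
    max_chunk_size: int
    cid_version: int = 1
    codec: int = DAG_CBOR
    hash_function: int = SHA2_256
    hash_length: Optional[int] = None  # None means "use the full digest"

    def __post_init__(self) -> None:
        # Per the updated wording: null, or otherwise greater than 0.
        if self.hash_length is not None and self.hash_length <= 0:
            raise ValueError("hashLength must be null or greater than 0")


def multihash_digest(config: ProllyTreeConfig, data: bytes) -> bytes:
    """Hypothetical helper: build a multihash for `data` (sha2-256 only)."""
    if config.hash_function != SHA2_256:
        raise NotImplementedError("this sketch only handles sha2-256")
    digest = hashlib.sha256(data).digest()
    length = len(digest) if config.hash_length is None else config.hash_length
    if length > len(digest):
        raise ValueError("hashLength exceeds the hash function output size")
    # Multihash layout: <hash function code><digest length><digest bytes>.
    # Both prefix values are below 0x80, so each fits in one varint byte.
    return bytes([config.hash_function, length]) + digest[:length]
```

With the default (`hash_length=None`) the result is the usual 34-byte SHA2-256 multihash. A shorter `hash_length` keeps only the leading digest bytes, which is why lower values raise the chance of collisions.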