AMT implementation #197

austinabell · 2020-01-29T21:06:12Z

Summary of changes
Changes introduced in this pull request:

Implements the AMT IPLD structure (Implement AMT structure #188)
- Allows it to be used in Added Fetch and Load Methods #196 and other PRs/changes that use an AMT
- Will try to make the code more readable in another PR; wanted to get this in ASAP so it can be used (the code isn't janky, it just isn't very readable)
- Will open an issue to benchmark this
- Cids of nodes will not match Lotus, as they are using Blake2b256, which isn't available in rust multihash (but I will fork and PR in other changes)

In summary, this is a sharded array whose underlying nodes get cbor serialized and stored in the underlying database. The BlockStore trait here is a wrapper around our DB traits that abstracts away the need to serialize and generate a cid for every root and node.
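The wrapper idea can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Db`, `MemoryDb`, `Encode`, and `FakeCid` are hypothetical names, the encoding is a stand-in for cbor, and the "cid" is just a 64-bit hash of the bytes rather than a real Cid.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in for a real Cid (assumption): a 64-bit hash of the encoded bytes.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct FakeCid(u64);

/// Hypothetical raw key-value database trait.
pub trait Db {
    fn write(&mut self, key: Vec<u8>, value: Vec<u8>);
    fn read(&self, key: &[u8]) -> Option<Vec<u8>>;
}

/// In-memory database used only for this sketch.
#[derive(Default)]
pub struct MemoryDb(HashMap<Vec<u8>, Vec<u8>>);

impl Db for MemoryDb {
    fn write(&mut self, key: Vec<u8>, value: Vec<u8>) {
        self.0.insert(key, value);
    }
    fn read(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.0.get(key).cloned()
    }
}

/// Hypothetical serialization trait standing in for cbor encoding.
pub trait Encode: Sized {
    fn to_bytes(&self) -> Vec<u8>;
    fn from_bytes(bytes: &[u8]) -> Option<Self>;
}

impl Encode for u64 {
    fn to_bytes(&self) -> Vec<u8> {
        self.to_be_bytes().to_vec()
    }
    fn from_bytes(bytes: &[u8]) -> Option<Self> {
        if bytes.len() != 8 {
            return None;
        }
        let mut arr = [0u8; 8];
        arr.copy_from_slice(bytes);
        Some(u64::from_be_bytes(arr))
    }
}

/// Wrapper over `Db`: callers hand in typed values and get a cid back,
/// so serialization and cid generation happen in one place.
pub trait BlockStore: Db {
    fn put<T: Encode>(&mut self, value: &T) -> FakeCid {
        let bytes = value.to_bytes();
        let mut hasher = DefaultHasher::new();
        bytes.hash(&mut hasher);
        let cid = FakeCid(hasher.finish());
        self.write(cid.0.to_be_bytes().to_vec(), bytes);
        cid
    }

    fn get<T: Encode>(&self, cid: &FakeCid) -> Option<T> {
        let bytes = self.read(&cid.0.to_be_bytes())?;
        T::from_bytes(&bytes)
    }
}

/// Every database automatically gets the typed interface.
impl<D: Db> BlockStore for D {}
```

With this shape, a caller would write `let cid = db.put(&value);` and later `db.get::<T>(&cid)`, without touching bytes or hashing directly.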

A node always has 8 items, but can be either a leaf node, where all items are values, or a link node, where each item can be empty, a cid to a node (which can be pulled from the blockstore when needed), or a cached node (heap allocated). So that not every change in the AMT has to be persisted to the DB, the AMT should only be flushed (removing cached nodes) when the Cid or the AMT needs to be persisted.
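To make the node shapes and the flush step concrete, here is a small sketch under stated assumptions. It is not the implementation in this PR: `FakeCid` is a plain counter instead of a content hash, `Store` keeps whole nodes in a map instead of serialized bytes, and all names are hypothetical.

```rust
use std::collections::HashMap;

/// Every node holds exactly 8 slots, as described above.
pub const WIDTH: usize = 8;

/// Stand-in for a real Cid (assumption: a counter, not a content hash).
pub type FakeCid = u64;

/// One slot of a link node: empty, a cid pointing into the store,
/// or a heap-allocated child that has not been persisted yet.
#[derive(Debug, PartialEq)]
pub enum Link<V> {
    Empty,
    Cid(FakeCid),
    Cached(Box<Node<V>>),
}

impl<V> Default for Link<V> {
    fn default() -> Self {
        Link::Empty
    }
}

/// A node is either a leaf holding values or a link node holding children.
#[derive(Debug, PartialEq)]
pub enum Node<V> {
    Leaf([Option<V>; WIDTH]),
    Link([Link<V>; WIDTH]),
}

/// Hypothetical in-memory store that hands out sequential fake cids.
pub struct Store<V> {
    next: FakeCid,
    blocks: HashMap<FakeCid, Node<V>>,
}

impl<V> Store<V> {
    pub fn new() -> Self {
        Store { next: 0, blocks: HashMap::new() }
    }

    pub fn put(&mut self, node: Node<V>) -> FakeCid {
        let id = self.next;
        self.next += 1;
        self.blocks.insert(id, node);
        id
    }

    pub fn len(&self) -> usize {
        self.blocks.len()
    }
}

impl<V> Node<V> {
    /// Persist every cached child (depth first) and replace it with its cid.
    /// Leaf nodes and already-persisted links are left untouched.
    pub fn flush(&mut self, store: &mut Store<V>) {
        if let Node::Link(links) = self {
            for link in links.iter_mut() {
                match std::mem::replace(link, Link::Empty) {
                    Link::Cached(mut child) => {
                        child.flush(store);
                        *link = Link::Cid(store.put(*child));
                    }
                    other => *link = other,
                }
            }
        }
    }
}
```

Between flushes, writes only touch `Cached` children in memory, which is why changes do not need to hit the database until a flush is requested.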

Reference issue to close (if applicable)

Closes #188

Other information and links

dutterbutter

See initial comments, will have to re-review as there is a lot in this PR that I need more time understanding.

ipld/amt/src/amt.rs

ipld/amt/tests/amt_tests.rs

austinabell · 2020-01-30T22:19:50Z

By the way, I hacked together a generic bound AMT variant I just pushed here: 5b2fbe4

The benefit is type safety, because the AMT would be bound to this type across all reads and writes. The disadvantage is that the type would have to be explicitly defined when used and would have to be an owned value. In this case, the root node's values would be of that type and only serialized when stored in the database. A further benefit is that the values would not need to be heap allocated (a very small benefit), but there would be a big difference between the link node and leaf node enum variants depending on the type, which would definitely be a negative.

Don't look too deeply into that commit; I'm just bringing it up in case you guys think it is a good idea. Another option is to bind the AMT to a type but only constrain the getters and setters to that type, which keeps everything else the same (this is the option I'm thinking of changing to, since I don't like switching the implementation until benchmarks exist).

austinabell · 2020-01-30T23:47:03Z

So to update, I did just add a commit to constrain the AMT values to one type: ca43a60

The only function signature that could change in the future is the set function, where the value would be moved in instead of passed by reference and cloned. I kept it as is because I didn't want the bulk set function (the only usage as of now) to have to clone everything.
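The two `set` signatures being weighed can be sketched like this. It is only an illustration of the trade-off: `TypedAmt`, `set_owned`, and `batch_set` are hypothetical names, and a `BTreeMap` stands in for the real node tree.

```rust
use std::collections::BTreeMap;

/// Hypothetical AMT bound to a single value type `V`.
/// A BTreeMap stands in for the real sharded node structure.
pub struct TypedAmt<V> {
    values: BTreeMap<u64, V>,
}

impl<V: Clone> TypedAmt<V> {
    pub fn new() -> Self {
        TypedAmt { values: BTreeMap::new() }
    }

    /// Current style: take a reference and clone into the structure.
    pub fn set(&mut self, i: u64, val: &V) {
        self.values.insert(i, val.clone());
    }

    /// Possible future style: move the value in, no clone needed.
    pub fn set_owned(&mut self, i: u64, val: V) {
        self.values.insert(i, val);
    }

    /// Bulk set over a borrowed slice, built on the by-reference `set`.
    pub fn batch_set(&mut self, vals: &[V]) {
        for (i, v) in vals.iter().enumerate() {
            self.set(i as u64, v);
        }
    }

    /// Reads are constrained to the same type as writes.
    pub fn get(&self, i: u64) -> Option<&V> {
        self.values.get(&i)
    }
}
```

Because the type parameter sits on the AMT itself, a mismatched read or write is rejected at compile time rather than failing at deserialization.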

austinabell · 2020-02-02T22:53:57Z

Ended up going with the generic and type-safe version because the other way was actually incompatible. Sorry for making changes again, but the way it was would have been janky to serialize and deserialize correctly. This also fixes #203, because I'm using a fork of multihash where I added this functionality, and I checked the cids for AMTs against ones generated through Lotus.

dutterbutter · 2020-02-04T13:46:48Z

blockchain/blocks/Cargo.toml

@@ -10,7 +10,7 @@ crypto = { path = "../../crypto" }
 message = { package = "forest_message", path = "../../vm/message" }
 clock = { path = "../../node/clock" }
 cid = { package = "forest_cid", path = "../../ipld/cid" }
-multihash = "0.9.3"
+multihash = { git = "https://github.com/austinabell/rust-multihash", rev = "56a8304b1b47697660dba7252d214be9829e137d" }


Assuming we can update these back given latest merge

ec2 · 2020-02-04T16:23:47Z

ipld/amt/src/bitmap.rs

+        let bz: Vec<u8> = serde_bytes::Deserialize::deserialize(deserializer)?;
+
+        // Get bitmap byte from serialized bytes
+        let bmap: BitMap = bz


Do you need to bind to bmap here? Can't you just return?

Rust will actually compile each identically; this is just for readability
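For reference, the two styles under discussion look roughly like this. The sketch uses a hypothetical single-byte `BitMap` and a `String` error rather than the actual serde code in `bitmap.rs`; both functions return the same result, and the named binding only exists to document the intermediate value.

```rust
/// Hypothetical one-byte bitmap, standing in for the type in bitmap.rs.
#[derive(Debug, PartialEq)]
pub struct BitMap(u8);

/// Style used in the PR: bind the result to a named, typed local first.
pub fn bitmap_bound(bz: &[u8]) -> Result<BitMap, String> {
    let bmap: BitMap = bz
        .first()
        .map(|b| BitMap(*b))
        .ok_or_else(|| "expected one bitmap byte".to_string())?;
    Ok(bmap)
}

/// Alternative from the review: return the expression directly.
pub fn bitmap_direct(bz: &[u8]) -> Result<BitMap, String> {
    bz.first()
        .map(|b| BitMap(*b))
        .ok_or_else(|| "expected one bitmap byte".to_string())
}
```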

ec2 · 2020-02-04T16:31:22Z

ipld/amt/src/root.rs

+/// Root of an AMT vector, can be serialized and keeps track of height and count
+#[derive(PartialEq, Debug)]
+pub(super) struct Root<V> {
+    pub(super) height: u32,


Just a thought. But these fields can be pub instead of pub(super) since your struct is pub(super), right? Doesn't matter though, just a thought :~)

This is more explicit in case the Root visibility changes in the future. All of this is internal so no need for that <|:)
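A small sketch of the visibility point, with a hypothetical module layout (the real crate's modules may differ, and the `count` field type is an assumption based on the doc comment). While the struct itself is `pub(super)`, plain `pub` fields would be reachable from exactly the same places; marking the fields `pub(super)` keeps them restricted even if the struct is later made `pub`.

```rust
mod amt {
    mod root {
        /// Visible only to the parent module `amt`.
        pub(super) struct Root {
            // Explicitly scoped: stays internal even if `Root` becomes `pub`.
            pub(super) height: u32,
            pub(super) count: u64,
        }
    }

    use self::root::Root;

    /// Code in `amt` can build and read a `Root`; code outside `amt` cannot
    /// name the type or its fields at all.
    pub fn describe() -> (u32, u64) {
        let r = Root { height: 1, count: 9 };
        (r.height, r.count)
    }
}
```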

austinabell added 22 commits January 22, 2020 10:19

Initialize amt crate and structs

d476dd3

Set up blockstore framework and test framework

d19fb3d

Merge branch 'master' of github.com:ChainSafe/ferret into austin/amt

44e3c4b

Implement basic setting and updated structures

2219d23

refactor and implement most of getter

fe51f44

refactor usage to not require cbor

f760fab

Refactor to rewrite with

971ce68

Set up caching framework

b685125

Implement bitmap and clean

2c76411

Implements leafs get and sets

c9cdf22

test more and lint

38fb7c2

Refactor out bitmap for future use

6d6ffe5

finish setters and expansion

eff318e

fix edge cases

defb9fa

Link traversing Cids and bitmap cleaned

2b4d534

clean and refactor

dc985d1

Implement loading amt from blockstore

346be6a

bulk inserting test

ee2c1d8

Functional leaf delete

4b56f73

Handle height changes with deletes and height adjustments

b8f019c

Clean up interface for PR

81f3d30

documentation

1a00e98

austinabell requested review from ansermino, dutterbutter and ec2 as code owners January 29, 2020 21:06

austinabell requested review from RajarupanSampanthan and GregTheGreek January 29, 2020 21:06

austinabell added IPLD Priority: 2 - High Status: Needs Review labels Jan 29, 2020

austinabell added 3 commits January 30, 2020 12:11

Refactor to associate bmap with variant for cleaner use

3c6ca70

Move bitmap serialization to type

24782fa

remove unneeded trait impl

4e2c98a

dutterbutter reviewed Jan 30, 2020

austinabell added 4 commits January 30, 2020 14:09

Update documentation based on review

57ca9eb

One last readability change (sorry)

6bd9a5d

remove mut ref for get (was mut before to possibly update cache but not done yet)

386ae94

Generic type bound alternative

5b2fbe4

dutterbutter mentioned this pull request Jan 30, 2020

Added Fetch and Load Methods #196

Merged

austinabell mentioned this pull request Jan 31, 2020

Add benchmarks to AMT/HAMT #202

Closed

austinabell added 3 commits February 2, 2020 16:37

Merge branch 'master' of github.com:ChainSafe/ferret into austin/amttyped

7fdf3a2

switch multihash version and change functions slightly

de17893

Clean up generic bounds from refactor, update cid hashing algo, add test vector

850e923

austinabell force-pushed the austin/amt branch from 07352ae to 850e923 Compare February 2, 2020 22:47

Fix headers

6b31403

austinabell added 3 commits February 3, 2020 09:27

Add cid checks for tests against Lotus

638a17a

Small naming change

d9f643d

Clean up test utils

80fa48a

dutterbutter reviewed Feb 4, 2020

dutterbutter approved these changes Feb 4, 2020

austinabell added 2 commits February 4, 2020 10:14

Update multihash from fork to released

d75e88e

Merge branch 'master' into austin/amt

7b23964

ec2 reviewed Feb 4, 2020

ec2 approved these changes Feb 4, 2020

austinabell merged commit 8b1b61b into master Feb 4, 2020

austinabell deleted the austin/amt branch February 4, 2020 16:35
