
Implement TreeBins #72

Merged (11 commits) on Apr 7, 2020

Conversation

@domenicquirl domenicquirl commented Mar 16, 2020

This PR ports an optimization from the Java code to flurry which replaces linear bins with bins containing ordered trees when a bin holds many elements (in case of many hash collisions).

TreeBins and TreeNodes are introduced as structs with all associated methods, including the red-black tree implementation and methods to read from or insert into a tree bin. Handling for tree bins is added to all methods of HashMap that are sensitive to bin content/the type of bins, including iterators. Documentation is provided for the implementation, in particular the crate-level implementation notes are updated.
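For readers unfamiliar with the layout, a minimal, hypothetical sketch of the idea. Names like `Bin` and the `Vec`/`BTreeMap` representations are illustrative only; the real implementation uses lock-free linked nodes and a hand-rolled red-black tree, not owned collections (the threshold of 8 matches Java's `TREEIFY_THRESHOLD`):

```rust
use std::collections::BTreeMap;

// Illustrative threshold; Java's ConcurrentHashMap treeifies at 8.
const TREEIFY_THRESHOLD: usize = 8;

// Hypothetical, simplified bin: the real bins are concurrent linked nodes
// guarded by epochs and locks, not a plain Vec or BTreeMap.
enum Bin<K: Ord, V> {
    Linear(Vec<(K, V)>),
    Tree(BTreeMap<K, V>),
}

impl<K: Ord, V> Bin<K, V> {
    fn insert(&mut self, key: K, value: V) {
        match self {
            Bin::Linear(items) => {
                items.push((key, value));
                // once a bin grows past the threshold, switch to an
                // ordered-tree representation for O(log n) lookups
                if items.len() > TREEIFY_THRESHOLD {
                    let tree: BTreeMap<K, V> = items.drain(..).collect();
                    *self = Bin::Tree(tree);
                }
            }
            Bin::Tree(tree) => {
                tree.insert(key, value);
            }
        }
    }
}
```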

The documentation changes also include minor modifications where I noticed things were out of date. These are small across the board; notably, I included some information on HashSets in the crate-level documentation, since they were added with #19 (which means we should probably also re-purpose #17 to address only missing methods or HashSetRef, or close it and move these to a new issue).

Since I did not write most of the methods of HashMap that I had to touch, please review those particularly carefully, especially the iterator code, which was implemented in my absence.

Closes #13.



@domenicquirl domenicquirl marked this pull request as ready for review March 16, 2020 11:28
@domenicquirl
Collaborator Author

Hm, is something up with CI? I wanted to start this as a draft to check ASAN, but CI seemed unhappy to start so I marked this as ready. Still doesn't seem to get going :/


jonhoo commented Mar 16, 2020

Hmm, could you try rebasing on master and force-pushing?

@domenicquirl
Collaborator Author

Aha! There seem to be some remaining issues, but at least I can address them now!


@domenicquirl domenicquirl force-pushed the treebin-optimization branch 2 times, most recently from 6056a07 to 6f3f51c Compare March 17, 2020 08:23
@domenicquirl
Collaborator Author

Managed to figure it out in my remaining time this morning before I had to leave. It was the re-used bins that leaked memory; I just had to account for the fact that, additionally, values are always re-used in the new table and so cannot be dropped with both the old and the new bin.

miri seems to be unavailable again, but the other checks all pass now, so this should now actually be ready for review 😊

@jonhoo (Owner) left a comment


This looks great to me! I've left a number of inline comments that should be quick to resolve. Only one is critical (a missed case in the red-black tree implementation). I have no idea whether it'll be easier to follow this review on GitHub (I think reviewable just posts a giant comment) or on Reviewable directly. Hope it's not too painful!

Reviewed 3 of 6 files at r1, 6 of 6 files at r2.
Reviewable status: all files reviewed, 85 unresolved discussions (waiting on @domenicquirl)


src/lib.rs, line 59 at r2 (raw file):

//! exactly the same [`Hash`](std::hash::Hash) value is a sure way to slow down performance of any
//! hash table. To ameliorate impact, keys are required to be [`Ord`](std::cmp::Ord), which may
//! be used by the map to help break ties.

This seems somewhat misleading to me. We don't really use it to break ties; we use it to more efficiently handle large numbers of keys that hash to the same bucket. I know this is a relatively straightforward paraphrasing of the Java docs, but I think we can do better here.


src/lib.rs, line 64 at r2 (raw file):

//! TODO: dynamic load factor
//! */
//! # Set projections

HashSet is not quite the same as what the Java code refers to as "set projection". What they are referring to is a set view of the map (which essentially gives you just the keys) as demonstrated here:

public KeySetView<K,V> keySet() {
.


src/lib.rs, line 145 at r2 (raw file):
Nit: I prefer the Java comment's present-tense form here:

So we use a secondary strategy that applies when the number of nodes in a bin exceeds a threshold.


src/lib.rs, line 151 at r2 (raw file):

//! given that N cannot exceed `(1<<64)` (before running out of addresses) this bounds search steps,
//! lock hold times, etc, to reasonable constants (roughly 100 nodes inspected per operation worst
//! case) as keys are required to be comparable ([`Ord`](std::cmp::Ord)). `BinEntry::Tree` nodes

Nit: I think the note about keys being "comparable" here is unnecessary. It is only there in the Java case because they cannot guarantee a bounded number of search steps if the key is not comparable.


src/lib.rs, line 210 at r2 (raw file):

 * Java's `Comparable` interface. */
//! `BinEntry::Tree` bins use a special form of comparison for search and related operations (which
//! is the main reason they do not use existing collections such as tree maps). The contained tree

Nit: Here too I prefer the more casual tone used in the Java docs of "we cannot use".


src/lib.rs, line 218 at r2 (raw file):

//! always possible by readers even during updates, tree traversal is not, mainly because of
//! tree-rotations that may change the root node and/or its linkages. Tree bins include a simple
//! reaad-write lock mechanism parasitic on the main bin-synchronization strategy: Structural

Nit: "reaad" -> "read"


src/map.rs, line 935 at r2 (raw file):

                            tree_node.node.key.clone(),
                            tree_node.node.value.clone(),
                            hash,

Nit: very minor preference, but can we keep the constructor argument order the same as the Java code (hash comes first)?


src/map.rs, line 995 at r2 (raw file):

                        // bin here, since it will be swapped for a Moved entry
                        // while we are still behind the bin lock.
                        if high_count != 0 {

Nit: clippy will probably complain about this not using else if.


src/map.rs, line 1003 at r2 (raw file):

                            // safety: we have just created `low` and its `next`
                            // nodes and have never shared them
                            unsafe { TreeBin::drop_tree_nodes(low, false, guard) };

I wish there was a way for us to not even allocate nodes if we determine we can re-use the whole tree. Not sure if that'd work out in practice though. Especially because it would also save us from cloning all the keys and values.


src/map.rs, line 1034 at r2 (raw file):

                    // if we did not re-use the old bin, it is now garbage,
                    // since all of its nodes have been reallocated. However,
                    // we always re-use the stored values, so we can't drop those.

I think it should be fine to just use defer_destroy here like in the linear case. Since the values are already behind Atomic, they won't be automatically dropped just because the containing node is dropped.


src/map.rs, line 1227 at r2 (raw file):

    where
        K: Borrow<Q>,
        Q: ?Sized + Hash + Ord,

I think you'll want to change this whole impl block to be K: Ord rather than K: Eq. Note also that we now require that the borrowed form of K implements Ord the same way that Q does.
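The shape of the bounds being suggested can be sketched against std's `BTreeMap` for illustration (the function name is hypothetical; flurry's actual methods additionally involve hashing and guards):

```rust
use std::borrow::Borrow;
use std::collections::BTreeMap;

// Illustration of the suggested bounds: lookups accept any Q that K can be
// borrowed as, and the borrowed form of K must implement Ord consistently
// with Q, mirroring the K: Ord requirement on the impl block.
fn lookup<'a, K, Q, V>(map: &'a BTreeMap<K, V>, key: &Q) -> Option<&'a V>
where
    K: Borrow<Q> + Ord,
    Q: ?Sized + Ord,
{
    map.get(key)
}
```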


src/map.rs, line 1373 at r2 (raw file):

}

impl<K, V, S> HashMap<K, V, S>

See the note above that should make splitting this impl block unnecessary.


src/map.rs, line 1414 at r2 (raw file):

            }

            // slow path -- bin is non-empty

Not sure why this comment went away?


src/map.rs, line 1493 at r2 (raw file):

                    // unlink it from the map to prevent others from entering it
                    tab.store_bin(idx, Shared::null());
                    // next, walk the nodes of the bin and free the nodes and their values as we go

I'm not sure I see why we have to walk the nodes of the tree while still holding the lock? Once we store NULL, we must be the owners.


src/map.rs, line 1494 at r2 (raw file):

                    tab.store_bin(idx, Shared::null());
                    // next, walk the nodes of the bin and free the nodes and their values as we go
                    // note that we do not free the head node yet, since we're holding the lock it contains

This comment is now a little misleading, since we're not freeing anything in this loop, only counting!


src/map.rs, line 1616 at r2 (raw file):

                // fast path -- bin is empty so stick us at the front
                let node = Owned::new(BinEntry::Node(Node {
                    key: key.clone(),

I wonder if we can be a little smarter here. Move the key, and then move the key back out of changed.new if the cas_bin fails.


src/map.rs, line 1780 at r2 (raw file):

                    bin_count = 2;
                    let p = tree_bin.find_or_put_tree_val(hash, key, value, guard);
                    if !p.is_null() {

Can we perhaps invert this condition and break to avoid indenting the whole body?


src/map.rs, line 1837 at r2 (raw file):

            // reach this point if the bin changed while obtaining the lock.
            // However, our code doesn't (it uses continue) and `bin_count`
            // _cannot_ be 0 at this point.

Nit: should we assert that?


src/map.rs, line 1841 at r2 (raw file):

                self.treeify_bin(t, bini, guard);
            }
            guard.flush();

In the old code, the flush here is placed after the call to add_count. Not sure if that makes a meaningful difference?


src/map.rs, line 1847 at r2 (raw file):

            break;
        }
        // increment count, since we only get here if we did not return an old (updated) value

Nit: should we assert that old_val.is_none()?


src/map.rs, line 2084 at r2 (raw file):

                    if !root.is_null() {
                        new_val = {
                            let p = TreeNode::find_tree_node(root, hash, key, guard);

Any particular reason why this isn't a method on TreeBin?


src/map.rs, line 2135 at r2 (raw file):

                                    // the beginning of the search, the value cannot be dropped until the
                                    // next epoch, which won't arrive until after we drop our guard.
                                    Some(unsafe { n.value.load(Ordering::SeqCst, guard).deref() })

Nit: We could probably avoid the load here if we just used Owned::into_shared above. Alternatively, this can at least be downgraded to a Relaxed load.


src/map.rs, line 2139 at r2 (raw file):

                                    removed_node = true;
                                    // remove the BinEntry::TreeNode containing the removed key value pair from the bucket
                                    // safety: `p` is marked for garbage collection below if the bin gets untreeified

Not sure I follow this safety argument.


src/map.rs, line 2161 at r2 (raw file):

                                            guard.defer_destroy(bin);
                                            guard.defer_destroy(p);
                                        }

BTW I wonder if we should start doing all defer_destroys after dropping the head lock (this doesn't just apply to tree bins). May not make much of a difference, I'm not sure, but something worth thinking about. In particular, it would shorten the critical sections for inserts, which should in turn allow for more concurrency.


src/map.rs, line 2178 at r2 (raw file):

            // However, our code doesn't (it uses continue) and making this
            // break conditional confuses the compiler about how many times we
            // use the `remapping_function`.

Nit: should we assert that bin_count != 0?


src/map.rs, line 2251 at r2 (raw file):

    }

    /// Replaces node value with `new_value`, conditional upon match of `observed_value`.

"conditional upon match" is really hard to read compared to "if it is equal to". We should fix the variable names, but retain the sentence structure.


src/map.rs, line 2304 at r2 (raw file):

            let is_remove = new_value.is_none();
            let mut old_val: Option<(&'g K, Shared<'g, V>)> = None;

Nit: these can both go outside the loop I believe.


src/map.rs, line 2402 at r2 (raw file):

                    let root = tree_bin.root.load(Ordering::SeqCst, guard);
                    if !root.is_null() {

Can we perhaps invert this condition and break to avoid indenting the whole body?


src/map.rs, line 2404 at r2 (raw file):

                    if !root.is_null() {
                        let p = TreeNode::find_tree_node(root, hash, key, guard);
                        if !p.is_null() {

Same here.


src/map.rs, line 2415 at r2 (raw file):

                            let pv = n.value.load(Ordering::SeqCst, guard);

                            // just remove the node if the value is the one we expected at method call

This comment is a little misleading since we only remove if new_node.is_none().


src/map.rs, line 2571 at r2 (raw file):

        } else {
            let bin = tab.bin(index, guard);
            if !bin.is_null() {

Can we invert this and return early to avoid indenting everything?


src/map.rs, line 2578 at r2 (raw file):

                    let lock = node.lock.lock();
                    // check if `bin` is still the head
                    if tab.bin(index, guard) == bin {

Same here — let's invert and early return (that also matches the pattern we've used in replace_node and put of using continue if current_head != bin).


src/map.rs, line 2581 at r2 (raw file):

                        let mut e = bin;
                        let mut head = Shared::null();
                        let mut last_tree_node = Shared::null();

Nit: any particular reason not to use tail here to match the Java code and for duality with head?


src/map.rs, line 2598 at r2 (raw file):

                            let new_tree_node = TreeNode::new(
                                e_deref.key.clone(),
                                e_deref.value.clone(), // this uses a load with Ordering::Relaxed, but write access is synchronized through the bin lock

Nit: this seems important enough that I'd place it on its own line above the call to TreeNode::new with a // NOTE:


src/map.rs, line 2624 at r2 (raw file):

                        }
                        tab.store_bin(index, Owned::new(BinEntry::Tree(TreeBin::new(head, guard))));
                        // make sure the old bin entries get dropped

We can do this after dropping the head lock.


src/map.rs, line 2653 at r2 (raw file):

                    }
                    drop(lock);
                }

I'd like an else here with unreachable! to ensure we never try to treeify a tree for example.


src/map.rs, line 2661 at r2 (raw file):

    /// _not_ clean up old TreeNodes, as they may still be reachable.
    fn untreeify<'g>(
        node: Shared<'g, BinEntry<K, V>>,

Should this argument be called bin instead of node?


src/map.rs, line 2665 at r2 (raw file):

    ) -> Shared<'g, BinEntry<K, V>> {
        let mut head = Shared::null();
        let mut last_node: Shared<'_, BinEntry<K, V>> = Shared::null();

Nit: any particular reason not to use tail like the Java code? Also, why is it necessary to give the type explicitly?


src/map.rs, line 2670 at r2 (raw file):

            // safety: we only untreeify sequences of TreeNodes which either
            //  - were just created (e.g. in transfer) and are thus valid, or
            //

Nit: extraneous whitespace


src/map.rs, line 2677 at r2 (raw file):

            //    remains a valid pointer at least until our guard is dropped.
            let q_deref = unsafe { q.deref() }.as_tree_node().unwrap();
            let new_node = Owned::new(BinEntry::Node(Node {

BTW This constructor is used a lot throughout the code. May be time we introduce a Node::new.


src/map.rs, line 2677 at r2 (raw file):

            //    remains a valid pointer at least until our guard is dropped.
            let q_deref = unsafe { q.deref() }.as_tree_node().unwrap();
            let new_node = Owned::new(BinEntry::Node(Node {

Doesn't TreeNode already contain a Node? Can't we just clone that directly?


src/node.rs, line 6 at r2 (raw file):

use parking_lot::Mutex;
use std::borrow::Borrow;
use std::thread::{current, park, Thread};

We may want to use parking_lot's thread parking infrastructure here rather than std::thread.


src/node.rs, line 141 at r2 (raw file):

            let p_deref = unsafe { Self::get_tree_node(p) };
            let p_left = p_deref.left.load(Ordering::SeqCst, guard);
            let p_right = p_deref.right.load(Ordering::SeqCst, guard);

We only need either p_left or p_right if p_hash != hash. I think we should just read them in the specific branch arm where they're needed below to speed up the common case.


src/node.rs, line 149 at r2 (raw file):

                continue;
            }
            if p_hash < hash {

Nit: should probably be an else if.


src/node.rs, line 167 at r2 (raw file):

                continue;
            }
            if p_right.is_null() {

Nit: should probably be an else if. Alternatively could write this as a match on (p_left.is_null(), p_right.is_null()).


src/node.rs, line 173 at r2 (raw file):

            // Otherwise, we compare keys to find the next child to look at.
            let dir = match p_key.borrow().cmp(&key) {

I don't think there is any reason to cast the result of cmp to an integer and then if on the integer below. Just use the Ordering directly to decide p_left vs p_right.
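A sketch of the suggested pattern, with hypothetical names, matching on `std::cmp::Ordering` directly instead of an integer `dir`:

```rust
use std::cmp::Ordering;

// Hypothetical stand-in for a tree node's child pointers.
enum Dir {
    Left,
    Right,
}

// Decide which subtree to descend into by matching the Ordering directly;
// None signals the equal case, which the caller handles separately.
fn direction<K: Ord>(p_key: &K, key: &K) -> Option<Dir> {
    match p_key.cmp(key) {
        Ordering::Greater => Some(Dir::Left), // key sorts before p_key
        Ordering::Less => Some(Dir::Right),   // key sorts after p_key
        Ordering::Equal => None,
    }
}
```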


src/node.rs, line 186 at r2 (raw file):

            // _compare_ equal (k.compareTo(p_key) == 0). In this case, both
            // children are searched. Since `Eq` and `Ord` must match, these
            // cases cannot occur here.

BTW It is really weird to me that they also have an else case here. What's that about?


src/node.rs, line 215 at r2 (raw file):

    K: Ord,
{
    pub(crate) fn new<'g>(bin: Shared<'g, BinEntry<K, V>>, guard: &'g Guard) -> Self {

It would be good to document somewhere that it is the TreeBin constructor that actually constructs the red-black tree. Elsewhere in the code when we construct TreeNodes (like in transfer and treeify), we just set next, and rely on the actual treeification to happen when the TreeBin wrapper is added.


src/node.rs, line 222 at r2 (raw file):

        // safety: when creating a new TreeBin, the nodes used are created just
        // for this bin and not shared with another thread, so they cannot get
        // invalidated.

Can we have this reflect in the method signature somehow? Like take an Owned?


src/node.rs, line 249 at r2 (raw file):

                let p_key = &p_deref.node.key;
                let p_hash = p_deref.node.hash;
                let dir = match p_hash.cmp(&hash) {

Here too I think we shouldn't go via numbers. For the Equal case, use Ordering::then.
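A sketch of what that could look like (the function name is hypothetical): compare hashes first, and let `Ordering::then` (here the lazy `then_with` variant) fall through to the key comparison only on a tie:

```rust
use std::cmp::Ordering;

// Order tree nodes by hash first, breaking ties on the key, without ever
// converting the comparison into an integer dir value.
fn tree_order<K: Ord>(p_hash: u64, p_key: &K, hash: u64, key: &K) -> Ordering {
    // then_with only evaluates the key comparison when the hashes tie
    p_hash.cmp(&hash).then_with(|| p_key.cmp(key))
}
```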


src/node.rs, line 263 at r2 (raw file):

                let xp = p;
                p = if dir <= 0 {
                    p_deref.left.load(Ordering::SeqCst, guard)

Since we are constructing the tree, I think we are the ones who set all the left and right fields, so a Relaxed load should be sufficient.


src/node.rs, line 287 at r2 (raw file):

        }

        if cfg!(debug) {

Nit: this should probably be cfg!(debug_assertions)
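For context on the nit: `cfg!(debug)` tests a custom `--cfg debug` flag that Cargo never sets by default, so the guarded checks silently never run, while `cfg!(debug_assertions)` tracks what debug profiles actually enable. A tiny sketch (function name hypothetical):

```rust
// Returns whether expensive invariant checks would run in this build.
// cfg!(debug) would be false under a default Cargo setup, because nothing
// passes --cfg debug; cfg!(debug_assertions) follows the build profile.
fn invariant_checks_enabled() -> bool {
    cfg!(debug_assertions)
}
```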


src/node.rs, line 350 at r2 (raw file):

            } else if waiting {
                park();
            }

This should probably have a call to spin_loop_hint.
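A hedged sketch of such a wait loop (names and structure hypothetical; the real lock packs reader/writer/waiter bits into a single `lock_state` and parks until a reader unparks the writer): spin with `std::hint::spin_loop()` (the successor of the deprecated `spin_loop_hint`) before falling back to parking:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;
use std::time::Duration;

// Hypothetical writer-side wait: spin briefly while readers hold the bin,
// hinting the CPU each iteration, then park with a timeout as a fallback
// (the real code parks indefinitely and relies on the last reader to unpark).
fn wait_for_readers(reader_count: &AtomicUsize) {
    let mut spins = 0u32;
    while reader_count.load(Ordering::SeqCst) != 0 {
        if spins < 100 {
            spins += 1;
            std::hint::spin_loop();
        } else {
            thread::park_timeout(Duration::from_micros(50));
        }
    }
}
```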


src/node.rs, line 419 at r2 (raw file):

                };
                if bin_deref.lock_state.fetch_add(-READER, Ordering::SeqCst) == (READER | WRITER) {
                    // check if another thread is waiting and, if so, unpark it

Nit: I'd add "we were the last reader holding up a waiting writer".


src/node.rs, line 422 at r2 (raw file):

                    let waiter = &bin_deref.waiter.load(Ordering::SeqCst, guard);
                    if !waiter.is_null() {
                        // safety: we do not destroy waiting threads

Which is probably a problem. We should make sure to destroy the old Thread handle whenever there is one when we update waiter.


src/node.rs, line 433 at r2 (raw file):

    }

    /// Unlinks the given node, which must be present before this call. This is

Nit: include a double newline before "This is" so we get nice docs.


src/node.rs, line 464 at r2 (raw file):

        let next = p_deref.node.next.load(Ordering::SeqCst, guard);
        let prev = p_deref.prev.load(Ordering::SeqCst, guard);
        // unlink traversal pointers

Nit: let's add an empty line above this comment for clarity.


src/node.rs, line 491 at r2 (raw file):

        // if we are now too small to be a `TreeBin`, we don't need to worry
        // about restructuring the tree since the bin will be untreeified
        // anyway, so we check that

A comment on what the heuristic here does would be good


src/node.rs, line 493 at r2 (raw file):

        // anyway, so we check that
        let mut root = self.root.load(Ordering::SeqCst, guard);
        if root.is_null()

I think we can assert that !root.is_null(), no?


src/node.rs, line 508 at r2 (raw file):

                    .left
                    .load(Ordering::SeqCst, guard)
                    .is_null()

What a bizarre heuristic, but I guess it is what the Java code does...


src/node.rs, line 535 at r2 (raw file):

            successor_deref
                .red
                .store(p_deref.red.load(Ordering::Relaxed), Ordering::Relaxed);

Is the reason why this store is okay to be Relaxed in the context of lock-free reads the compare_and_swap that find attempts? Same with the various other Relaxed stores below.


src/node.rs, line 540 at r2 (raw file):

            let successor_right = successor_deref.right.load(Ordering::Relaxed, guard);
            let p_parent = p_deref.parent.load(Ordering::Relaxed, guard);
            if successor == p_right {

Nit: a comment here saying that we're going to swap p and successor would be good


src/node.rs, line 569 at r2 (raw file):

                }
            }
            p_deref.left.store(Shared::null(), Ordering::Relaxed);

Nit: maybe assert successor_left.is_null()?


src/node.rs, line 610 at r2 (raw file):

            replacement = p_right;
        } else {
            replacement = p;

I think this case is unreachable, since it would imply that p was the only node, which is already handled in the code before we take the root lock.


src/node.rs, line 613 at r2 (raw file):

        }

        if replacement != p {

Even though it's basic binary tree stuff, I think it'd be good to remind the reader what's going on here. We have swapped p with successor, which immediately follows it in (hash, key) order. We now want to get rid of p completely. If p now has no children (i.e., if successor had no right child (it can never have a left child -- it wouldn't be the successor of p)), then the job is trivial since p is a leaf: we just unlink it from its parent (further below). If it has one child (which must be a right child), then we can just replace p with that right child directly.


src/node.rs, line 648 at r2 (raw file):

        );

        if p == replacement {

I wonder why this happens after the rebalancing rather than next to the if replacement == p.


src/node.rs, line 651 at r2 (raw file):

            let p_parent = p_deref.parent.load(Ordering::Relaxed, guard);
            if !p_parent.is_null() {
                if p == TreeNode::get_tree_node(p_parent)

Nit: probably worth storing TreeNode::get_tree_node(p_parent) into a temporary variable here for clarity.


src/node.rs, line 668 at r2 (raw file):

                }

                p_deref.parent.store(Shared::null(), Ordering::Relaxed);

Nit: should we assert here that left and right are both is_null?


src/node.rs, line 678 at r2 (raw file):

        // that pins an epoch <= our epoch, and thus have to be released before
        // `p` is actually dropped.
        #[allow(unused_unsafe)]

Why is this needed?


src/node.rs, line 684 at r2 (raw file):

        }

        self.unlock_root();

We should hoist this unlock to above the defer_destroy.


src/node.rs, line 685 at r2 (raw file):

        self.unlock_root();
        if cfg!(debug) {

Should be cfg!(debug_assertions)


src/node.rs, line 697 at r2 (raw file):

impl<K, V> TreeBin<K, V>
where
    K: Ord,

This probably requires at least Send + Sync.


src/node.rs, line 733 at r2 (raw file):

            let p_deref = unsafe { TreeNode::get_tree_node(p) };
            let p_hash = p_deref.node.hash;
            let dir = match p_hash.cmp(&hash) {

Same here as earlier: let's avoid the use of integers for comparisons.


src/node.rs, line 818 at r2 (raw file):

        }

        if cfg!(debug) {

cfg!(debug_assertions)


src/node.rs, line 899 at r2 (raw file):

    /// TreeNode.
    #[inline]
    pub(crate) unsafe fn get_tree_node<'g>(bin: Shared<'g, BinEntry<K, V>>) -> &'g TreeNode<K, V> {

Nit: this particular method probably doesn't belong in this impl block given its comment.


src/node.rs, line 903 at r2 (raw file):

    }

    fn rotate_left<'g>(

I think this (and all the code below) is executed while holding the root lock, which means the loads can at least be Relaxed?


src/node.rs, line 918 at r2 (raw file):

        let p_deref = unsafe { Self::get_tree_node(p) };
        let right = p_deref.right.load(Ordering::SeqCst, guard);
        if !right.is_null() {

Nit: I'd invert this condition.


src/node.rs, line 964 at r2 (raw file):

        let p_deref = unsafe { Self::get_tree_node(p) };
        let left = p_deref.left.load(Ordering::SeqCst, guard);
        if !left.is_null() {

Nit: I'd invert this condition.


src/node.rs, line 1012 at r2 (raw file):

        let mut x_parent_parent: Shared<'_, BinEntry<K, V>>;
        let mut x_parent_parent_left: Shared<'_, BinEntry<K, V>>;
        let mut x_parent_parent_right: Shared<'_, BinEntry<K, V>>;

All of the unsafe { Self::get_tree_node(...) } noise makes the code below really hard to read :'( Maybe it'd be worthwhile to have a local single-letter macro for it?


src/node.rs, line 1157 at r2 (raw file):

                    .store(false, Ordering::Relaxed);
                return x;
            }

I believe this is missing this code segment from the Java code.


src/node.rs, line 1176 at r2 (raw file):

                        .red
                        .store(true, Ordering::Relaxed);
                    root = Self::rotate_left(root, x_parent, guard);

Minor: this pattern of rotating, updating x_parent, and then setting some other var to some property of x_parent if !x_parent.is_null() comes up a bunch. May be worth providing a simple macro for it.


src/node.rs, line 1189 at r2 (raw file):

                }
                if x_parent_right.is_null() {
                    x = x_parent;

continue would probably be clearer here


src/node.rs, line 1210 at r2 (raw file):

                            .red
                            .store(true, Ordering::Relaxed);
                        x = x_parent;

continue


src/node.rs, line 1292 at r2 (raw file):

                }
                if x_parent_left.is_null() {
                    x = x_parent;

continue


src/node.rs, line 1301 at r2 (raw file):

                        .load(Ordering::SeqCst, guard);

                    if (s_right.is_null()

Nit: the Java code technically compares s_left first and then s_right here, though the outcome should be the same.


src/node.rs, line 1313 at r2 (raw file):

                            .red
                            .store(true, Ordering::Relaxed);
                        x = x_parent;

continue


src/node.rs, line 1373 at r2 (raw file):

    }
    /// Checks invariants recursively for the tree of Nodes rooted at t.
    fn check_invariants<'g>(t: Shared<'g, BinEntry<K, V>>, guard: &'g Guard) -> bool {

Might be better to just stick the asserts directly in here rather than return a bool. That way we can actually include information about what specific assertion failed.
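A sketch of the assert-style version on a toy node type (hypothetical, with owned pointers instead of the real atomic, epoch-protected ones), showing how a panic can name the exact violated property:

```rust
// Toy red-black node for illustration only; the real TreeNode uses atomic
// child pointers and epoch-protected shared memory.
struct ToyNode {
    red: bool,
    left: Option<Box<ToyNode>>,
    right: Option<Box<ToyNode>>,
}

// Instead of returning false, assert each invariant so a failure reports
// exactly which property broke (here: no red node has a red child).
fn assert_invariants(node: &ToyNode) {
    if node.red {
        assert!(
            node.left.as_ref().map_or(true, |l| !l.red),
            "red node has a red left child"
        );
        assert!(
            node.right.as_ref().map_or(true, |r| !r.red),
            "red node has a red right child"
        );
    }
    if let Some(l) = &node.left {
        assert_invariants(l);
    }
    if let Some(r) = &node.right {
        assert_invariants(r);
    }
}
```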


src/raw/mod.rs, line 164 at r2 (raw file):

                        }
                        BinEntry::TreeNode(_) => break table.find(bin, hash, key, guard),
                        BinEntry::Tree(_) => break table.find(bin, hash, key, guard),

Let's instead just make the first case BinEntry::Node(_) | BinEntry::TreeNode(_) | BinEntry::Tree(_) =>.


src/raw/mod.rs, line 168 at r2 (raw file):

                }
            }
            BinEntry::TreeNode(_) => {

Hmm, I don't think this case will ever be hit directly, as TreeBin::find doesn't recurse to this method (at least I don't think).


src/raw/mod.rs, line 227 at r2 (raw file):

                        unreachable!();
                    };
                    drop(bin);

Nit: maybe add a comment here saying that the Drop impl for Tree takes care of freeing the tree nodes and values.


jonhoo commented Mar 17, 2020

#73 should fix miri not running

@domenicquirl (Collaborator, Author) left a comment


I'm working my way through this. I think I'll stick to Reviewable to be able to see where your comments are in the code. This will come in a few batches: while most of the changes are not big, each still takes a bit of time and I can't do them all in one go. Also, maybe some aren't as trivial as they seem, but see for yourself ^^

Reviewable status: all files reviewed, 79 unresolved discussions (waiting on @domenicquirl and @jonhoo)


src/lib.rs, line 64 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

HashSet is not quite the same as what the Java code refers to as "set projection". What they are referring to is a set view of the map (which essentially gives you just the keys) as demonstrated here:

public KeySetView<K,V> keySet() {
.

The Java code has the KeySetView wrapper, which is backed by a hash map. This is used for two things: you can statically create such a view, which will then be backed by a ConcurrentHashMap<K, Boolean> that always contains true for keys that are present. Or you can turn a concrete instance of a hash map into such a view if you provide an object of the correct type to use as the "dummy" value for newly added keys.

If you use one of the static methods, what you get is essentially equivalent to a hash set. While you can get the backing map via CollectionView, what do you do with a key -> true mapping? If you just use it as-is, it is functionally the same as HashMap<K, ()>, only Java doesn't have ().
There is of course a difference with the non-static variant, where you actually obtain a view for an existing map. We don't have a direct equivalent for this; the closest thing in my mind was iterating over keys. Though I concede that keys() is an iterator, while the result returned by keySet() is such a KeySetView.

Do you have a concrete suggestion for what you would change here? Are you only against the nomenclature of a set projection or is your comment arguing in favor of removing this paragraph again/moving it somewhere else? I think we should definitely mention the possibility of using HashSet somewhere in the global doc.
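The `HashMap<K, ()>` equivalence mentioned above can be made concrete with a small, hypothetical wrapper (sketched over std's `HashMap` for illustration; flurry's actual HashSet has its own concurrent API):

```rust
use std::collections::HashMap;
use std::hash::Hash;

// Hypothetical illustration of a set as a map to the unit type: the ()
// values occupy no space, which is the Rust analogue of Java's KeySetView
// backed by a ConcurrentHashMap<K, Boolean>.
struct UnitSet<K> {
    map: HashMap<K, ()>,
}

impl<K: Hash + Eq> UnitSet<K> {
    fn new() -> Self {
        UnitSet { map: HashMap::new() }
    }

    // returns true if the key was newly inserted
    fn insert(&mut self, key: K) -> bool {
        self.map.insert(key, ()).is_none()
    }

    fn contains(&self, key: &K) -> bool {
        self.map.contains_key(key)
    }
}
```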


src/map.rs, line 995 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

Nit: clippy will probably complain about this not using else if.

Interestingly, it only complained about the high_bin case, which is why they look different now. Maybe the comment is holding it off? Will change though!


src/map.rs, line 1003 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

I wish there was a way for us to not even allocate nodes if we determine we can re-use the whole tree. Not sure if that'd work out in practice though. Especially because it would also save us from cloning all the keys and values.

This would apply if the nodes in the tree either all hash to the low or all to the high bin, right? 'Cause if the tree is re-used because it's empty we have no nodes to create.

I was also annoyed that I had to create nodes and stick them behind shared pointers only to then clean them up again. But even if we could avoid allocations by performing some check first, I'm not sure if that would be worth it. I think in most non-pathological cases there will be both elements that hash to the low bin and elements that hash to the high bin in the new table. It would be faster for the worst case of identical hashes though...


src/map.rs, line 1034 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

I think it should be fine to just use defer_destroy here like in the linear case. Since the values are already behind Atomic, they won't be automatically dropped just because the containing node is dropped.

Dropping a TreeBin will by default drop all of its nodes and their values. This is because if we drop the map, then we also drop the values. However, here we want to re-use the values, so we cannot rely on this Drop impl.

Now that I'm looking at this again, I think it's the other way round. Other places in the map that remove tree bins should use logic like this one where values are not dropped, such as in replace_node if a node gets removed. I wonder why this did not produce an error in the tests; maybe this case is not hit. Having this logic everywhere would of course raise the question whether the default for dropping a TreeBin should be not to drop the node values, and have a separate, "manual" implementation in Table::drop_bins for the case where we need to drop everything. However, clear, for example, also needs the value drop. What do you think?


src/map.rs, line 1227 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

I think you'll want to change this whole impl block to be K: Ord rather than K: Eq. Note also that we now require that the borrowed form of K implements Ord the same way that Q does.

Fair enough. Regarding the matching impls of Ord, would you want to mention this somewhere? If so, what would you consider to be a good place to do so?


src/map.rs, line 1373 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

See the note above that should make splitting this impl block unnecessary.

Done.


src/map.rs, line 1414 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

Not sure why this comment went away?

Done.


src/map.rs, line 1493 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

I'm not sure I see why we have to walk the nodes of the tree while still holding the lock? Once we store NULL, we must be the owners.

I think you're right. Other threads that wait on the lock would re-check and see a different bin when the lock is released after that point. The tree bin case mirrors the regular case, which I would guess just has the lock matching the Java code's synchronized. For some reason, they store the null bin at the end of the work section instead of at the beginning. Want me to change that for both types of nodes?


src/map.rs, line 1494 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

This comment is now a little misleading, since we're not freeing anything in this loop, only counting!

Done.

Copy link
Owner

@jonhoo jonhoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, no rush! And thanks for looking into this :D

Reviewable status: all files reviewed, 79 unresolved discussions (waiting on @domenicquirl and @jonhoo)


src/lib.rs, line 64 at r2 (raw file):

Previously, domenicquirl (DQ) wrote…

The Java code has the KeySetView wrapper, which is backed by a hash map. This is used for two things: You can statically create such a view, which will then be backed by a ConcurrentHashMap<K, Boolean> that always contains true for keys that are present. Or you can turn a concrete instance of a hash map into such a view if you provide an object of the correct type to use as the "dummy" value for newly added keys.

If you use one of the static methods, what you get is essentially equivalent to a hash set. While you can get the backing map via CollectionView, what do you do with a key -> true mapping? If you just use it as-is, it is functionally the same as HashMap<K, ()>, only Java doesn't have ().
There is of course a difference with the non-static variant, where you actually obtain a view for an existing map. We don't have a direct equivalent for this; the closest thing in my mind was iterating over keys. Though I concede that keys() is an iterator and the result returned by keySet() is such a KeySetView.

Do you have a concrete suggestion for what you would change here? Are you only against the nomenclature of a set projection or is your comment arguing in favor of removing this paragraph again/moving it somewhere else? I think we should definitely mention the possibility of using HashSet somewhere in the global doc.

I was thinking mostly in terms of nomenclature. I agree that mentioning that we also have a HashSet is a good idea, but I think "projection" implies that you can take a map and "get" a set, which isn't something we currently support, neither by conversion nor as a view.


src/map.rs, line 995 at r2 (raw file):

Previously, domenicquirl (DQ) wrote…

Interestingly, it only complained about the high_bin case, which is why they look different now. Maybe the comment is holding it off? Will change though!

Yeah, it's probably the comment!


src/map.rs, line 1003 at r2 (raw file):

Previously, domenicquirl (DQ) wrote…

This would apply if the nodes in the tree either all hash to the low or all to the high bin, right? 'Cause if the tree is re-used because it's empty we have no nodes to create.

I was also annoyed that I had to create nodes and stick them behind shared pointers only to then clean them up again. But even if we could avoid allocations by performing some check first, I'm not sure if that would be worth it. I think in most non-pathological cases there will be both elements that hash to the low bin and elements that hash to the high bin in the new table. It would be faster for the worst case of identical hashes though...

I was thinking specifically of the case where they all go to the high or low bin, yes. I'm having a hard time guessing whether that will come up in practice or not. By pure chance, you'd think that when you have 6+ elements in a bin, at least one will go high and at least one will go low. But in cases where hashes are non-uniform (as you observe), that might not be the case, and this is one of the primary things the tree optimization should help against! The other case I think is during resizes, where you may continue inserting into a bin for a while after a resize has already been started, but that is probably a rarer case.

Ultimately, I'll leave this one up to you — I think we could go either way. Maybe it's worth erring on the side of less complexity.


src/map.rs, line 1034 at r2 (raw file):

Previously, domenicquirl (DQ) wrote…

Dropping a TreeBin will by default drop all of its nodes and their values. This is because if we drop the map, then we also drop the values. However, here we want to re-use the values, so we cannot rely on this Drop impl.

Now that I'm looking at this again, I think it's the other way round. Other places in the map that remove tree bins should use logic like this one where values are not dropped, such as in replace_node if a node gets removed. I wonder why this did not produce an error in the tests, maybe this case is not hit. Having this logic everywhere would of course raise the question whether the default for dropping a TreeBin should be not to drop the node values, and have a separate, "manual" implementation in Table::drop_bins for the case where we need to drop everything. However, clear for example also need the value drop. What do you think?

Ah, yes, I hadn't seen the impl Drop yet when I wrote that. I think we could go either way with this, so I trust your judgement as to which approach is more common (and thus better to be the "default"). If both types of dropping appear in multiple places, let's factor them both into functions, and then just call the appropriate function as needed.

As for the existing tests not catching this, let's try to fix that by introducing a test where it should trigger?


src/map.rs, line 1227 at r2 (raw file):

Previously, domenicquirl (DQ) wrote…

Fair enough. Regarding the matching impls of Ord, would you want to mention this somewhere? If so, what would you consider to be a good place to do so?

The existing requirement about matching Eq implementations already appears in the docs for get, and other methods that take a K: Borrow<Q> (like get_key_value), so I think this is just a matter of updating the text there.


src/map.rs, line 1493 at r2 (raw file):

Previously, domenicquirl (DQ) wrote…

I think you're right. Other threads that wait on the lock would re-check and see a different bin when the lock is released after that point. The tree bin case mirrors the regular case, which I would guess just has the lock matching the Java code's synchronized. For some reason, they store the null bin at the end of the work section instead of at the beginning. Want me to change that for both types of nodes?

Yeah, I don't understand why they do that. My instinct is to trust that they knew what they were doing, though this seems like it really should be fine for exactly the reason you outline. My inclination is to change it for both types of nodes, and then add a comment with NOTE: that we differ from the Java impl here.

Copy link
Collaborator Author

@domenicquirl domenicquirl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 78 unresolved discussions (waiting on @domenicquirl and @jonhoo)


src/lib.rs, line 64 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

I was thinking mostly in terms of nomenclature. I agree that mentioning that we also have a HashSet is a good idea, but I think "projection" implies that you can take a map and "get" a set, which isn't something we currently support, neither by conversion nor as a view.

Done.


src/map.rs, line 1003 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

I was thinking specifically of the case where they all go to the high or low bin, yes. I'm having a hard time guessing whether that will come up in practice or not. By pure chance, you'd think that when you have 6+ elements in a bin, at least one will go high and at least one will go low. But in cases where hashes are non-uniform (as you observe), that might not be the case, and this is one of the primary things the tree optimization should help against! The other case I think is during resizes, where you may continue inserting into a bin for a while after a resize has already been started, but that is probably a rarer case.

Ultimately, I'll leave this one up to you — I think we could go either way. Maybe it's worth erring on the side of less complexity.

The main thing I would consider is exactly that tree bins are partly a protective measure against highly skewed hash distributions. Ultimately though, we do not guarantee any protection, and this doesn't feel like it warrants penalizing the regular case of appropriate hash functions with the increased complexity you already mentioned. Then again, I'm by no means an expert on this, but I'll leave it as-is for now.


src/map.rs, line 1034 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

Ah, yes, I hadn't seen the impl Drop yet when I wrote that. I think we could go either way with this, so I trust your judgement as to which approach is more common (and thus better to be the "default"). If both types of dropping appear in multiple places, let's factor them both into functions, and then just call the appropriate function as needed.

As for the existing tests not catching this, let's try to fix that by introducing a test where it should trigger?

Adding this test took me... ages. I inadvertently triggered a different bug: if nodes are removed from a tree which does not get untreeified, their values are also deleted, but replace_node deletes old values across cases, so the value got dropped twice. I missed that when moving code around between the inside and outside of the match cases.

As for the different drop cases, I don't know which case will be more common across all map methods (we are still missing a lot compared to Java), for now it's fairly even. But having Drop clean up everything associated with the bin makes more sense to me intuitively than the other way around, so I'd do it that way.


src/map.rs, line 1227 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

The existing requirement about matching Eq implementations already appears in the docs for get, and other methods that take a K: Borrow<Q> (like get_key_value), so I think this is just a matter of updating the text there.

Done.


src/map.rs, line 1493 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

Yeah, I don't understand why they do that. My instinct is to trust that they knew what they were doing, though this seems like it really should be fine for exactly the reason you outline. My inclination is to change it for both types of nodes, and then add a comment with NOTE: that we differ from the Java impl here.

Done.

Copy link
Owner

@jonhoo jonhoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, I left a lot of comments. Sorry about that 😅

Reviewable status: all files reviewed, 77 unresolved discussions (waiting on @domenicquirl and @jonhoo)


src/lib.rs, line 64 at r2 (raw file):

Previously, domenicquirl (DQ) wrote…

Done.

Maybe this is something weird about reviewable, but I can't see any of your changes? Have you pushed them?


src/map.rs, line 1034 at r2 (raw file):

Previously, domenicquirl (DQ) wrote…

Adding this test took me... ages. I inadvertently triggered a different bug: if nodes are removed from a tree which does not get untreeified, their values are also deleted, but replace_node deletes old values across cases, so the value got dropped twice. I missed that when moving code around between the inside and outside of the match cases.

As for the different drop cases, I don't know which case will be more common across all map methods (we are still missing a lot compared to Java), for now it's fairly even. But having Drop clean up everything associated with the bin makes more sense to me intuitively than the other way around, so I'd do it that way.

Haha, it sounds like it was worth writing the test then!

Given that both appear in multiple places, I think we should then factor out this code (and the "real drop" code) into their own functions, and then call them from the appropriate places. This would also make it relatively simple to change the "default" behavior in Drop later.

@domenicquirl
Copy link
Collaborator Author

Yeah, that's why I said I'll do a few batches ^^
No worries though, that's what this is for after all. And I'd rather read some more comments and get this right than miss something.

I didn't push any changes yet, I still need to rebase in the fix for miri. Will push after the next set of updates so you can see what has changed. But that test thing took so long that I didn't want to get drawn into fixing more stuff in case ASAN complains or something 😅

The code for the different kind of drop is already methods, this should be good now 👍

@jonhoo
Copy link
Owner

jonhoo commented Mar 19, 2020

Let's hope that reviewable deals better with rebases than GitHub does. That's the primary reason why I'm trying it out 😅

Copy link
Collaborator Author

@domenicquirl domenicquirl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got through all of map.rs now. Will push an update after posting this and doing the rebase.

Reviewable status: all files reviewed, 70 unresolved discussions (waiting on @domenicquirl and @jonhoo)


src/map.rs, line 1616 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

I wonder if we can be a little smarter here. Move the key, and then move the key back out of changed.new if the cas_bin fails.

Done.


src/map.rs, line 1780 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

Can we perhaps invert this condition and break to avoid indenting the whole body?

Done.


src/map.rs, line 1841 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

In the old code, the flush here is placed after the call to add_count. Not sure if that makes a meaningful difference?

I placed it at this point to make sure it gets hit even if we updated an existing value, in which case the old value is returned immediately below. To be honest I don't know whether this matters, but if you prefer we could flush here only if we return, and then flush after add_count if we don't?


src/map.rs, line 1847 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

Nit: should we assert that old_val.is_none()?

This would require moving old_val out of a scope. Does that matter?


src/map.rs, line 2084 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

Any particular reason why this isn't a method on TreeBin?

There are a few cases where methods that are instance methods in the Java code ended up as static methods, mostly due to handling shared references and lifetimes. In this case, findTreeNode is originally an instance method of TreeNode. But since the method has an equivalent on the Java side, I think it's good to keep it where you'd expect (or support the correct expectation of where to find it in the Java code).


src/map.rs, line 2135 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

Nit: We could probably avoid the load here if we just used Owned::into_shared above. Alternatively, this can at least be downgraded to a Relaxed load.

Good idea! Again a case where this behaviour is shared with the regular case, so I'll update both.


src/map.rs, line 2139 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

Not sure I follow this safety argument.

The problem with remove_tree_node is the following: The method first unlinks the given node's linear traversal pointers (next and previous). Then, it checks if the bin is now too small to be a tree bin. If so, no other operations are performed and the method returns early, indicating that untreeify is necessary. Only if the bin remains a tree is the given node unlinked from the tree.

Now, the method only takes the additional tree bin lock if it actually needs to modify the tree. This means that, until the bin is untreeified and swapped out for a linear bin, other threads can freely read (not write, because we hold the bin lock) using the fast path along the tree ordering. In particular, they can still obtain references to the "removed" node. So remove_tree_node only marks p for garbage collection if it is also removed from the tree, since then no new references can be obtained. Otherwise, the caller needs to clean up p after the bin was replaced with a linear one.

So using the method is not technically "unsafe" if you don't follow this, it "only" leaks memory. I wanted to ensure that people are aware when they use the method in the future, and the best thing I could come up with was unsafe. If you have a better alternative, I'm very open to changing this.


src/map.rs, line 2161 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

BTW I wonder if we should start doing all defer_destroys after dropping the head lock (this doesn't just apply to tree bins). May not make much of a difference, I'm not sure, but something worth thinking about. In particular, it would shorten the critical sections for inserts, which should in turn allow for more concurrency.

I agree with the sentiment, but what do we do if garbage collection is different depending on branching inside the critical section (such as here)? You'd have to reconstruct what happened when you've dropped the lock and then do the appropriate gc, which will probably need references that have gone out of scope and so on. I'll keep an eye out for places where this seems doable, but I wouldn't start refactoring everything.


src/map.rs, line 2251 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

"conditional upon match" is really hard to read compared to "if it is equal to". We should fix the variable names, but retain the sentence structure.

The previous sentence was ambiguous, the extra value is checked against the previous value for that key, not the given value (which would of course be a useless feature, but still). I wrote this out in the updated code.


src/map.rs, line 2402 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

Can we perhaps invert this condition and break to avoid indenting the whole body?

Done.


src/map.rs, line 2404 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

Same here.

Done.


src/map.rs, line 2415 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

This comment is a little misleading since we only remove if new_node.is_none().

I didn't even read this before, lol. Changed both occurrences to "only replace".


src/map.rs, line 2571 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

Can we invert this and return early to avoid indenting everything?

Done.


src/map.rs, line 2578 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

Same here — let's invert and early return (that also matches the pattern we've used in replace_node and put of using continue if current_head != bin).

Done. I feel like this would be a good thing for clippy to report, at least if there are no other branches...


src/map.rs, line 2624 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

We can do this after dropping the head lock.

Done.


src/map.rs, line 2653 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

I'd like an else here with unreachable! to ensure we never try to treeify a tree for example.

Done.


src/map.rs, line 2661 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

Should this argument be called bin instead of node?

Done.


src/map.rs, line 2665 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

Nit: any particular reason not to use tail like the Java code? Also, why is it necessary to give the type explicitly?

Done. Regarding the type, I really don't know. I would have thought that the type can be inferred at least from storing new_node (which is all that is done with head, but conditionally), but the compiler complains if they are not there :/


src/map.rs, line 2677 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

BTW This constructor is used a lot throughout the code. May be time we introduce a Node::new.

A quick search gives me 5 instances in map.rs and node.rs. 3 pass an existing Atomic<V> and 2 an Atomic::from. 1 has a next pointer, 3 don't and TreeNode::new passes next as an argument, which is currently also always null, but can't be removed. So I added

Node::new(hash: u64, key: K, value: Atomic<V>) -> Self

and

Node::with_next(hash: u64, key: K, value: Atomic<V>, next: Atomic<BinEntry<K, V>> ) -> Self

with null and Mutex::new(()) as defaults.


src/map.rs, line 2677 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

Doesn't TreeNode already contain a Node? Can't we just clone that directly?

Right now Node is not Clone. I assume this is primarily due to the contained lock (parking_lot::Mutex). We could impl Clone to return a node with a new lock, though the question is whether the new node would share the next pointer. In this instance, the node receives its own next node below.


src/node.rs, line 6 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

We may want to use parking_lot's thread parking infrastructure here rather than std::thread.

Can you clarify what you mean by "italic may"? In #13 (comment), you argued against changes only for no_std. Is there another reason you think parking_lot would be better?

Domenic Quirl added 4 commits March 20, 2020 14:02
adds:
  * TreeNode and TreeBin structs
  * TreeBin::new()
  * red-black tree methods
add insertion, removal and find for TreeBins
adds the additional write lock for TreeBins that prevents concurrent tree restructuring
add conversions to and from TreeBins
have tables drop TreeBins when dropping bins
adds TreeBins to transfer, clear, put, iterators, compute_if_present, replace_node
extends crate-level documentation with TreeBins
Copy link
Owner

@jonhoo jonhoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 6 files at r3.
Reviewable status: all files reviewed, 58 unresolved discussions (waiting on @domenicquirl)


src/map.rs, line 1841 at r2 (raw file):

Previously, domenicquirl (DQ) wrote…

I placed it at this point to make sure it gets hit even if we updated an existing value, in which case the old value is returned immediately below. To be honest I don't know whether this matters, but if you prefer we could flush here only if we return, and then flush after add_count if we don't?

I forget exactly why we do a flush in the first place. I think the thinking was that if you end up doing a resize, you want to make sure any garbage we produced as a result of the resize gets collected promptly. But if so, the flush should arguably just be in transfer somewhere, and not here anyway. Maybe just remove it?


src/map.rs, line 1847 at r2 (raw file):

Previously, domenicquirl (DQ) wrote…

This would require moving old_val out a scope. Does that matter?

I think that's worth it for this assert.


src/map.rs, line 2139 at r2 (raw file):

Previously, domenicquirl (DQ) wrote…

The problem with remove_tree_node is the following: The method first unlinks the given node's linear traversal pointers (next and previous). Then, it checks if the bin is now too small to be a tree bin. If so, no other operations are performed and the method returns early, indicating that untreeify is necessary. Only if the bin remains a tree is the given node unlinked from the tree.

Now, the method only takes the additional tree bin lock if it actually needs to modify the tree. This means that, until the bin is untreeified and swapped out for a linear bin, other threads can freely read (not write, because we hold the bin lock) using the fast path along the tree ordering. In particular, they can still obtain references to the "removed" node. So remove_tree_node only marks p for garbage collection if it is also removed from the tree, since then no new references can be obtained. Otherwise, the caller needs to clean up p after the bin was replaced with a linear one.

So using the method is not technically "unsafe" if you don't follow this, it "only" leaks memory. I wanted to ensure that people are aware when they use the method in the future, and the best thing I could come up with was unsafe. If you have a better alternative, I'm very open to changing this.

Ah, sorry, my comment wasn't very clear. I understand why the code needs to do what it does to avoid leaks, I more mean that I don't follow the safety comment. As in, "p is marked for garbage collection below if the bin gets untreeified" does not tell me why the unsafe call below is safe.

I think it's perfectly reasonable for that method to be marked unsafe :)


src/map.rs, line 2161 at r2 (raw file):

Previously, domenicquirl (DQ) wrote…

I agree with the sentiment, but what do we do if garbage collection is different depending on branching inside the critical section (such as here)? You'd have to reconstruct what happened when you've dropped the lock and then do the appropriate gc, which will probably need references that have gone out of scope and so on. I'll keep an eye out for places where this seems doable, but I wouldn't start refactoring everything.

I think the idea would be to essentially keep a vec or something of garbage declared at the top of the function, push to that when we have garbage, and then call defer_destroy on all of them at the end. Or something like that.


src/map.rs, line 2578 at r2 (raw file):

Previously, domenicquirl (DQ) wrote…

Done. I feel like this would be a good thing for clippy to report, at least if there are no other branches...

Hehe, I agree with you! Maybe that's another PR idea :p It's probably a little tricky to detect whether it should be a continue, break, or return though. But perhaps it should just point out that you could transform without having to say how you should transform.


src/map.rs, line 2665 at r2 (raw file):

Previously, domenicquirl (DQ) wrote…

Done. Regarding the type, I really don't know. I would have thought that the type can be inferred at least from storing new_node (which is all that is done with head, but conditionally), but the compiler complains if they are not there :/

🤷‍♂️ The compiler works in mysterious ways.


src/map.rs, line 2677 at r2 (raw file):

Previously, domenicquirl (DQ) wrote…

A quick search gives me 5 instances in map.rs and node.rs. 3 pass an existing Atomic<V> and 2 an Atomic::from. 1 has a next pointer, 3 don't and TreeNode::new passes next as an argument, which is currently also always null, but can't be removed. So I added

Node::new(hash: u64, key: K, value: Atomic<V>) -> Self

and

Node::with_next(hash: u64, key: K, value: Atomic<V>, next: Atomic<BinEntry<K, V>> ) -> Self

with null and Mutex::new(()) as defaults.

Seems good! You could make it generic over AV: Into<Atomic<V>>, which would let the caller use either Atomic<V> or V (I think). But up to you -- it's not important.


src/map.rs, line 2677 at r2 (raw file):

Previously, domenicquirl (DQ) wrote…

Right now Node is not Clone. I assume this is primarily due to the contained lock (parking_lot::Mutex). We could impl Clone to return a node with a new lock, though the question is whether the new node would share the next pointer. In this instance, the node receives its own next node below.

Good point. I'm happy to keep it as-is.


src/map.rs, line 1434 at r3 (raw file):

                    // NOTE: The Java code stores the null bin _after_ the loop, and thus also has
                    // to hold the lock until that point. However, after the store happens new
                    // threads and threads waiting on the lock will read the new bin.

I think this comment is missing the implication: "so we can drop the lock early and do the dropping without the lock."


src/map.rs, line 1484 at r3 (raw file):

                    // NOTE: The Java code stores the null bin _after_ the loop, and thus also has
                    // to hold the lock until that point. However, after the store happens new
                    // threads and threads waiting on the lock will read the new bin.

Same note here as above.


src/map.rs, line 1771 at r3 (raw file):

                    let p = tree_bin.find_or_put_tree_val(hash, key, value, guard);
                    if p.is_null() {
                        break;

Nit: might be good to include a brief comment here along the lines of "The key did not exist so the put immediately succeeded, and we're done".


src/map.rs, line 2253 at r3 (raw file):

    /// Replaces node value with `new_value`.
    /// If an `observed_value` is provided, the replacement only happens if `observed_value` equals
    /// the value that was previously associated with the given key.

I think this should say "is currently associated". Since the replacement has not happened yet.

Also, can we place an empty line above this (as in, after the first line), just to make the docs come out a little nicer.


src/map.rs, line 2408 at r3 (raw file):

                    let root = tree_bin.root.load(Ordering::SeqCst, guard);
                    if root.is_null() {
                        break;

A brief comment for why we break here would be good.
In fact, is it even possible for us to hit this? Shouldn't the node have been untreeified before this can happen?


src/map.rs, line 2412 at r3 (raw file):

                    let p = TreeNode::find_tree_node(root, hash, key, guard);
                    if p.is_null() {
                        break;

Nit: A brief comment for why we break here would be good.


src/map.rs, line 2669 at r3 (raw file):

                }
            } else {
                unreachable!("treeifying something that is not a linear bin");

Nit: from experience, it's super useful to here print what it was, so that we have a bit more to go on if someone later reports seeing this panic.


src/map.rs, line 3233 at r3 (raw file):

                map.insert(i, i, guard);
            }
            // Ensure the bin was correctly treeified

It's probably sufficient to have one test that tests treefication using insert. Or at least we should factor it out into its own method if many tests will have this code.


src/node.rs, line 6 at r2 (raw file):

Previously, domenicquirl (DQ) wrote…

Can you clarify what you mean by "italic may"? In #13 (comment), you argued against changes only for no_std. Is there another reason you think parking_lot would be better?

The parking lot thread parking/wakeup mechanism is in general more efficient than the one in std. I'm not talking about parking_lot_core here (which would be no_std), but the optimized implementation in parking_lot (which is not no_std).


src/node.rs, line 871 at r3 (raw file):

        guard: &'g Guard,
    ) {
        guard.defer_unchecked(move || {

I don't have a good sense for the efficiency of defer_unchecked and defer_destroy. I think the latter might be more efficient though, since it doesn't have to pass along a full closure. Could we perhaps hoist the if let out, pass a Guard to drop_fields and then have it call defer_destroy internally?

Collaborator Author

@domenicquirl domenicquirl left a comment

Reviewable status: 8 of 10 files reviewed, 54 unresolved discussions (waiting on @domenicquirl and @jonhoo)


src/map.rs, line 1841 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

I forget exactly why we do a flush in the first place. I think the thinking was that if you end up doing a resize, you want to make sure any garbage we produced as a result of the resize gets collected promptly. But if so, the flush should arguably just be in transfer somewhere, and not here anyway. Maybe just remove it?

It is not in transfer at the moment, and the reasoning makes sense to me. I'll move it behind add_count for now so we get that behaviour, and then maybe a separate change can consolidate flush into transfer. This PR is already doing a lot "on the side" I feel ^^


src/map.rs, line 2139 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

Ah, sorry, my comment wasn't very clear. I understand why the code needs to do what it does to avoid leaks, I more mean that I don't follow the safety comment. As in, "p is marked for garbage collection below if the bin gets untreeified" does not tell me why the unsafe call below is safe.

I think it's perfectly reasonable for that method to be marked unsafe :)

I've expanded a bit, it just answers the safety condition of the method - the provided node and, if desired, its value, have to be manually garbage collected when they are no longer reachable in case of the early return.


src/map.rs, line 2161 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

I think the idea would be to essentially keep a vec or something of garbage declared at the top of the function, push to that when we have garbage, and then call defer_destroy on all of them at the end. Or something like that.

Okay, this also seems to me like it could/should be its own change if it moves around a good amount of code. Also, defer_destroy essentially defers a closure that turns the given Shared into an Owned and then drops it. A reference to the closure is stored in a slice internally, and the closure is only allocated on the heap if it exceeds a certain size (which I would guess doesn't happen that often for two calls?). Do you think having our own "defer" would even be that much faster if we still do that in the critical section?


src/map.rs, line 2677 at r2 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

Seems good! You could make it generic over AV: Into<Atomic<V>>, which would let the caller use either Atomic<V> or V (I think). But up to you -- it's not important.

OK, is good idea, did do!


src/map.rs, line 1434 at r3 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

I think this comment is missing the implication: "so we can drop the lock early and do the dropping without the lock.

Done.


src/map.rs, line 1484 at r3 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

Same note here as above.

Done.


src/map.rs, line 2253 at r3 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

I think this should say "is currently associated". Since the replacement has not happened yet.

Also, can we place an empty line above this (as in, after the first line), just to make the docs come out a little nicer.

Done.


src/map.rs, line 2408 at r3 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

A brief comment for why we break here would be good.
In fact, is it even possible for us to hit this? Shouldn't the node have been untreeified before this can happen?

It should, again because of the bin lock and the following re-check. But the check is in multiple places in the Java code, and remove_tree_node also checks a lot of cases that have too few nodes to actually occur. I would guess that the intention is to make the code robust against changing UNTREEIFY_THRESHOLD. There still probably shouldn't be empty bins (because when they become empty they are untreeified at the latest), but I don't exactly feel confident enough to remove all the checks.


src/map.rs, line 3233 at r3 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

It's probably sufficient to have one test that tests treefication using insert. Or at least we should factor it out into its own method if many tests will have this code.

It's not mainly about triggering treeification, that's just necessary to test tree bin removal. I'd like to keep both tests, since the behaviour between replace_node and compute_if_present differs with regards to garbage collection (see the lost usize I had to chase around to make ASAN happy again 😅 ). But I refactored the test to take a closure.

@domenicquirl
Collaborator Author

The comment about the efficiency of defer_destroy may also apply to your concern regarding the use of defer_unchecked, given that the former also creates a closure which is then handled by the latter. I'll stop here for today though, we can come back to this later.

Owner

@jonhoo jonhoo left a comment

Yeah, I think one day we'll have to make a pass where we try to optimize the garbage collection, given that that's where our current bottleneck appears to be.

Reviewed 2 of 2 files at r4.
Reviewable status: all files reviewed, 50 unresolved discussions (waiting on @domenicquirl)


src/map.rs, line 1841 at r2 (raw file):

Previously, domenicquirl (DQ) wrote…

It is not in transfer at the moment, and the reasoning makes sense to me. I'll move it behind add_count for now so we get that behaviour, and then maybe a separate change can consolidate flush into transfer. This PR is already doing a lot "on the side" I feel ^^

It is, that is true. Once we finish it all up, it might be worth at least splitting out the most obvious stand-alone changes into different commits, but that's a bunch of extra work on your plate that I understand if you want to avoid :) The biggest reason for doing it would be that it makes git blame/bisect work much better should any of this produce an issue in the future. I'll leave that as your call. We can probably still land it all as one PR — I'll just do a merge instead of a squash.


src/map.rs, line 2161 at r2 (raw file):

Previously, domenicquirl (DQ) wrote…

Okay, this also seems to me like it could/should be its own change if it moves around a good amount of code. Also, defer_destroy essentially defers a closure that turns the given Shared into an Owned and then drops it. A reference to the closure is stored in a slice internally, and the closure is only allocated on the heap if it exceeds a certain size (which I would guess doesn't happen that often for two calls?). Do you think having our own "defer" would even be that much faster if we still do that in the critical section?

Ah, if that's the case, then I agree having the closure this way is fine. I feel like it should be possible to optimize defer_destroy more (e.g., by passing around dyn Drop), but let's leave that for later.

I also think it's fine to leave the "reclaim after critical section" for a different change. Add it to the pile of "eventually TODO" :p Maybe open an issue?


src/map.rs, line 2408 at r3 (raw file):

Previously, domenicquirl (DQ) wrote…

It should, again because of the bin lock and the following re-check. But the check is in multiple places in the Java code and also remove_tree_node checks a lot of cases that have too few nodes to actually occur. I would guess that the intention is to have the code be robust against changing the UNTREEIFY_THRESHOLD. There still probably shouldn't be empty bins (cause when they get empty they get untreeified the latest), but I don't exactly feel confident to remove all checks.

Makes sense. How about we add a:

// TODO: Is this even reachable?
// The Java code has `NULL` checks for this, but in theory it should not be possible to
// encounter a tree that has no root when we have its lock. It should always have at
// least `UNTREEIFY_THRESHOLD` nodes. If it is indeed impossible we should replace
// this with an assertion instead.

Both here and to the one or two other places where we have if root.is_null().


src/map.rs, line 3233 at r3 (raw file):

Previously, domenicquirl (DQ) wrote…

It's not mainly about triggering treeification, that's just necessary to test tree bin removal. I'd like to keep both tests, since the behaviour between replace_node and compute_if_present differs with regards to garbage collection (see the lost usize I had to chase around to make ASAN happy again 😅 ). But I refactored the test to take a closure.

Oh, yeah, that makes sense. It was also a little subtle to me that compute_if_present actually ends up doing removals. I almost wonder if the name should change to reflect the fact that it might actually remove things too. Not sure how though.


src/node.rs, line 77 at r4 (raw file):

        AV: Into<Atomic<V>>,
    {
        Node {

Nit: this could just be Node::with_next(hash, key, value, Atomic::null())

Collaborator Author

@domenicquirl domenicquirl left a comment

Reviewable status: 9 of 10 files reviewed, 1 unresolved discussion (waiting on @jonhoo)


src/node.rs, line 379 at r7 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

I think what was confusing me was that we had both the bin lock, lock_state (which is also a lock), and the write_lock. Now that we're back to just the bin lock and lock_state (I think?), it makes more sense to me.

Well, the additional locking mechanism is still there in parking the waiting thread; its mutex is just no longer exposed as part of TreeBin. It's hidden inside Thread (which, in fact, again uses something like lock_state to handle tokens, in addition to a condvar/mutex). The point is that you need the additional lock, and it has a synchronization/messaging part (lock_state) and a waiting part for when you need the lock (waiter and park(), i.e. Condvar + Mutex). But with notify we would have to keep notifying until the waiter actually calls wait(), which we probably don't want to do anyway.


src/node.rs, line 387 at r8 (raw file):

Previously, jonhoo (Jon Gjengset) wrote…

Should this assert that waiter is null? If it's not null, it'd need to be safely dropped.

Sure, we can do that! This should be guaranteed by the bin lock, but I've stuck an assertion in to be sure.

Owner

@jonhoo jonhoo left a comment

Reviewed 1 of 1 files at r9.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

Owner

@jonhoo jonhoo left a comment

Excellent! I guess now all that remains is merging, and then we're good to go?

Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

@codecov

codecov bot commented Apr 7, 2020

Codecov Report

Merging #72 into master will decrease coverage by 7.68%.
The diff coverage is 72.33%.

Impacted Files Coverage Δ
src/serde_impls.rs 0.00% <0.00%> (ø)
src/set_ref.rs 0.00% <0.00%> (ø)
src/set.rs 58.73% <9.52%> (-41.27%) ⬇️
src/iter/traverser.rs 92.00% <42.85%> (-1.45%) ⬇️
src/map_ref.rs 77.35% <66.66%> (-11.54%) ⬇️
src/map.rs 85.00% <77.72%> (-0.78%) ⬇️
src/node.rs 80.07% <77.90%> (-18.66%) ⬇️
src/raw/mod.rs 87.37% <87.50%> (+0.64%) ⬆️
src/lib.rs 100.00% <100.00%> (ø)
... and 6 more

@domenicquirl
Collaborator Author

It is damn confusing to merge in an explicit drop of a node whose creation I removed ^^
Missed some errors because apparently both my IDE plugin and cargo test don't use all features? Gimme a sec.

@domenicquirl
Collaborator Author

Checks passing now! 🎉
It's only coverage that's failing. Admittedly, there aren't that many tests for the TreeBins, but the red-black tree stuff is checked by check_invariants in testing (which I don't think counts as an actual test?). The node.rs drop is probably the fault of my changes, set.rs has to be the merged-in changes though?

Would be nice to have some more tests for the more delicate parts of TreeBins, but these are also kind of hard to trigger. For example, it would be nice to have a test which always triggers a writing thread waiting for the additional lock. How do you want to proceed? I can't really tell how much of this is on me and what was merged in.

Owner

@jonhoo jonhoo left a comment

🎉 I think really what we'd want here is to run tests through loom, which could reliably trigger the various interleavings here. Without that, I think the interim test is going to be using the OneBucketHasher in a test with lots of reader and writer threads just spinning on random keys in a somewhat limited range. Could you add something like that and see if that helps coverage?

Reviewed 5 of 5 files at r10, 1 of 1 files at r11.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

@domenicquirl
Collaborator Author

Looks like it slightly helps node.rs, but doesn't really help in general :/

@jonhoo
Owner

jonhoo commented Apr 7, 2020

Looking through the coverage diff, it doesn't actually look too bad. I think one thing that's not being tested is deletes, which may cause untreefication and such. Since the set of keys is small, I think this also maybe mostly ends up testing value overwrites rather than tree restructuring (like rotations), so maybe the keyspace actually needs to be a decent amount larger (and have deletes)?

@domenicquirl
Collaborator Author

But do I read the individual file view correctly in that contended_lock was not hit by any test? That would be what we want to trigger...
I (locally) added delete threads, as well as an increased key range, which should help with that. We may run into an issue with test execution times here however:

const NUM_THREADS: usize = 20;
const NUM_REPEATS: usize = 1000;
const NUM_KEYS: usize = 1000;

is the last configuration that runs in a somewhat reasonable time. Both 1000 repeats with 10000 different keys and 10000 repeats with 1000 different keys take over a minute. Both at 10000 took a whopping 10 minutes to run.

@jonhoo
Owner

jonhoo commented Apr 7, 2020

Yeah, I feel like we really want loom here 😅
Until we do, all we can hope for is a rough approximation for the contention cases I think sadly.

less threads, but more repetitions and bigger key range
@domenicquirl
Collaborator Author

That seems to have done quite a lot still. It's only about half the impact for node.rs now (still -40% for set.rs, but I don't know which changes these are). Overall impact is down to -8.5 from -14.

I think contended_lock is still not being hit? I'm not quite familiar with the interface. If so, that would be a bit sad, but maybe we really need loom for this.

Any more comments or ideas? Otherwise I think we can finally land this!

@jonhoo
Owner

jonhoo commented Apr 7, 2020

The stuff in set.rs I think the coverage tool is just confused about, and I would ignore it.

It also isn't hitting things like a bin being removed just as someone tries to insert into it, but I think that's okay, and will require loom. Some of the coverage is also just incorrectly marked as untested, probably due to limitations in the coverage sampler (like unwrap being marked as uncovered because we don't explore the panic branch).

We also run into some limitations around all the buckets hashing to 0, which means that they all get moved together in the case of a resize, but that too I think is okay. Similarly, our comparison by hash always returns ==, but that is the most interesting case, so that's fine.

It looks like the drop code for BinEntry::Tree isn't hit, which is odd? contended_lock doesn't seem to be hit, you're right, which is really sad. Not sure how to force that beyond what we're already doing. Maybe fewer writing threads would help, since it lets the reader threads be more aggressive. Might also be we just don't hit it on CI, since it has fewer cores, and likely the many threads this test has can't all run at once. The if p.is_null() clause in find_or_put_tree_val also appears to never be hit? Same with the drop closure in defer_drop_without_values for some reason?

Overall, I'm pretty happy though. We will want loom for this regardless. Maybe try to get coverage for the drop, and try fewer writers (both insert and remove) but with the same number of readers, and then we call it good?

@jonhoo
Owner

jonhoo commented Apr 7, 2020

Looks like contended_lock still isn't hit :'(

@domenicquirl
Collaborator Author

The find_or_put_tree_val case makes sense to me, since the tree bin should be created from an existing linear bin, in which case TREEIFY_THRESHOLD elements are present initially, including a root. It's one of these cases where it's unclear whether root can even ever be null.

Similarly, I would assume that the drop method is not hit by transfer because it can always re-use the existing bin (because the high bin is empty). It should be hit by untreeify, which is covered by the other tests, so it is weird that the closure wouldn't be hit. The latest CI run on the new commit does hit the closure.

contended_lock still isn't, yeah, but I don't know how else to force that. With another slight improvement I am in favour of calling it good and waiting for loom to cover this...

@jonhoo jonhoo merged commit dcb5a1c into jonhoo:master Apr 7, 2020
@jonhoo
Owner

jonhoo commented Apr 7, 2020

Awesome! Thank you for the amazing work!

Successfully merging this pull request may close these issues.

Implement the tree-bin optimization