Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Question] How to get previous node of bucket sentinel? #957

Closed
aquacat1220 opened this issue Jun 8, 2024 · 8 comments
Closed

[Question] How to get previous node of bucket sentinel? #957

aquacat1220 opened this issue Jun 8, 2024 · 8 comments
Assignees
Labels
bug Something isn't working question Further information is requested

Comments

@aquacat1220
Copy link

In SplitOrderedList::lookup_buckets(), we should be able to skip "finding" if we have a reference to the sentinel in self.buckets.

AFAIK, self.buckets gives us &Atomic<Node>, which are references to the growable array's internal pointer to the sentinel node. At first, I thought of using these to create the Cursor object to return, but this doesn't make sense, since these references are not the next pointers of list nodes. "The Art of Multiprocessor Programming" doesn't uses cursors when looking up buckets, so doesn't seem to be experiencing this problem.

As long as we are returning a Cursor for lookup_buckets(), it seems like we should have a way to find the previous node of the sentinel. How should that be done? Am I missing something?

@aquacat1220 aquacat1220 added the question Further information is requested label Jun 8, 2024
@Lee-Janggun
Copy link
Member

Lee-Janggun commented Jun 8, 2024

You probably want to take a look at
#938 (comment)
and
#939

@aquacat1220
Copy link
Author

let atomic = self.buckets.get(index,guard);
let shared = atomic.load(SeqCst,guard);
let cursor = Cursor::new(atomic,shared);

The V in Cursor::new() is a generic parameter. For the split ordered list we instantiate it with MaybeUninit<V>.

Copied from #938.

In the code above, atomic is a reference to a pointer in self.buckets, not a reference to a next pointer of the node before shared. The Cursor API seems like it assumes that cursor.prev is a reference to a next pointer of a node in the list, and violating this assumption might lead to unexpected behaviors.

For example, if this cursor is later used to insert a new node, cursor.prev might be CASed to a pointer to a new node, changing the pointer in the growable array, which clearly isn't what we want.

@Lee-Janggun
Copy link
Member

In the code above, atomic is a reference to a pointer in self.buckets, not a reference to a next pointer of the node before shared.

atomic is a reference to a node of self.list, not to a pointer in the growable array; this is blocked at the type level, which returns a Atomic<T>, not Atomic<Segment<T>>. If that is what self.buckets.get() returned, one has an incorrect growable array implementation.

The intention of the Cursor::new() API is to literally use it like the example code I gave you. For example, see:
https://github.com/kaist-cp/cs431/blob/main/src/lockfree/list.rs#L288-L290

It simple loads the head pointer which has type Atomic<Node<K,V>>, essentially same as atomic, and creates a cursor saving those two values. There is nothing special with the head node here, and using a &'g Atomic<Node<K,V>> that references any node in the list will do, including atomic.

@aquacat1220
Copy link
Author

aquacat1220 commented Jun 8, 2024

In the code above, atomic is a reference to a pointer in self.buckets, not a reference to a next pointer of the node before shared.

atomic is a reference to a node of self.list, not to a pointer in the growable array; this is blocked at the type level, which returns a Atomic<T>, not Atomic<Segment<T>>. If that is what self.buckets.get() returned, one has an incorrect growable array implementation.

I thought atomic is not "a reference to a node of self.list", but "a reference to a Atomic<Node> that points to a node of self.list". If atomic was "a reference to a node of self.list", its type must've been &Node. GrowableArray<T> internally stores Atomic<T>s, and give access to them through get().

In the Cursor insert API, we create a new node, make the node point to cursor's curr, and CAS the cursor's prev to point to our new node. This makes sense when we have a cursor created from List::head(), since the cursor's prev is part of the list. Using atomic, returned from GrowableArray<Node>::get(), as a cursor's prev is thus going to cause a problem, since a CAS to atomic will not change self.list, but corrupt self.buckets.

@aquacat1220
Copy link
Author

aquacat1220 commented Jun 8, 2024

BTW, at least on my implementation, this doesn't make the hash table buggy, since even if lookup_buckets() returns a Cursor with an invalid prev, we never insert() a new node into the cursor without a find_XX() before it, and find_XX()s will fix the prev if they encounter a "curr node with a valid tag, but has a smaller key than the target". Our curr is pointing to the bucket sentinel, which is never removed (thus have a valid tag) and is guaranteed to have a smaller key than the target. (We calculate the bucket index to be smaller than the target node anyway.)

TL;DR: This problem won't cause the code to fail.

@Lee-Janggun
Copy link
Member

Lee-Janggun commented Jun 9, 2024

The return type for 'self.buckets.get()' for the split ordered list is '&Atomic<Node<K,MaybeUninit>', which means the returned value is a reference to the next field of some node in 'self.list'. I'm really not sure why you keep on saying using it will cause corruption of the array, which can only happen if it's an internal (children) node of the growable array, which it isn't.

@aquacat1220
Copy link
Author

This is my current understanding on GrowableArray<T>:

A GrowableArray<T> doesn't manage the value of type T, but only stores a pointer to it. Leaf Segment<T>s store Atomic<T> pointers that point to the value of type T. On get() calls, the API returns a reference to the Atomic<T> stored in leaf node segments. This is why GrowableArray<T> doesn't have 'insert()' calls: the 'get()' call already returns a reference to an element (of type Atomic<T> of its leaf segment, and users are expected to modify the pointer to point to a data of type T. (If it returned Atomic<T>, users wouldn't be able to modify the array at all.)

In my understanding, GrowableArray<Node> can't possibly return references to the next field of a node in self.list, because it is not designed to do so. GrowableArray<Node> is designed to point to Nodes, not Atomic<Node>s. If someone wanted to have self.buckets point to the next field of some node in self.list, they would've used self.buckets of type GrowableArray<Atomic<Node>>.

@Lee-Janggun
Copy link
Member

Ah ok I see your point. I agree on your points and think that the current Cursor API is a bit fragile. I think the reason it's ok for this is due to the usage of sentinel nodes that won't be tagged and stuff, but in general it should be fixed. I'll try to fix it ASAP.

@Lee-Janggun Lee-Janggun added the bug Something isn't working label Oct 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants