Reclaim inactive participant data #70

gereeter · 2016-04-29T23:21:27Z

The code that was removed in #42 was almost correct: it removed inactive participants from the list and tried to guard against multiple threads removing the same node. However, it guarded against that in exactly the backwards way: unlinked needs to be called by the first thread to remove the node, and so should run if the compare_and_swap succeeded. However, since compare_and_swap returns the previous value, it succeeds iff it returns false, not true. Thus, nodes were only unlinked by threads that weren't alone in removing the same participant. This also explains why the bug was so hard to exhibit: to get a double free, three threads had to simultaneously remove the same node.
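To make the inversion concrete, here is a minimal sketch of the guard using the standard library's `AtomicBool`. The names `cas_returning_previous` and `claim_removal` are hypothetical and are not taken from the crossbeam source; the helper only mimics the "returns the previous value" convention described above.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical helper that mimics the old `compare_and_swap` convention:
/// it returns the value that was stored *before* the operation,
/// whether or not the swap happened.
fn cas_returning_previous(flag: &AtomicBool, current: bool, new: bool) -> bool {
    match flag.compare_exchange(current, new, Ordering::AcqRel, Ordering::Acquire) {
        Ok(prev) | Err(prev) => prev,
    }
}

/// Returns `true` only for the one caller that flipped `removed` from
/// `false` to `true`, i.e. the caller that is allowed to unlink the node.
fn claim_removal(removed: &AtomicBool) -> bool {
    // A previous value of `false` means our swap succeeded.
    // Testing for `true` here would be the inverted check described above.
    !cas_returning_previous(removed, false, true)
}
```

With this shape, the first caller gets `true` and every later caller gets `false`, so at most one thread proceeds to unlink.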

I tested this with the slammer script posted in #42. While the buggy code would segfault quickly (within a minute), I left the slammer running with this patch applied for over half an hour and it never failed.

This is implemented using a variation on Tim Harris's linked list removal algorithm.

Fixes #37.

msullivan · 2016-04-30T01:02:53Z

Lurr, the compare_and_swap inversion is pretty silly. That said, I'm pretty sure I convinced myself that the algorithm was wrong even without that inversion. I'll think about it some.

msullivan · 2016-04-30T01:14:55Z

Yeah, I think that this can leave unlinked nodes in the list:
Imagine you have a list head -> A -> B -> <whatever>.
T1 is iterating through it and has next = &A->next. Then, thread A exits, and T2 comes along, sees that A has exited, unlinks A from the list, and then iterates down the rest and wraps up. The list is now head -> B -> <whatever> but T1 still has a pointer to A and so is looking at a list like: A -> B -> <whatever>.

Then thread B exits. So T1 unlinks it from A, sets the flag, and unlinks it, then exits its loop because it has reached the end of the list.

But B only got unlinked from A, which isn't on the actual list anyways. So the thread list remains head -> B -> <whatever> even though B has been unlinked. Run through a couple collections and B can get reused as something else and everything will be super sad.
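The interleaving above can be replayed deterministically in a single thread. This is only an illustrative sketch: the arena of `Node` values, the index-based `next` links, and the function names are assumptions made for the example and do not reflect the actual participant list.

```rust
/// A node in a toy singly linked list; `next` is an index into the arena.
struct Node {
    name: &'static str,
    next: Option<usize>,
}

/// Collects the names of all nodes reachable from `head`.
fn reachable(nodes: &[Node], head: Option<usize>) -> Vec<&'static str> {
    let mut out = Vec::new();
    let mut cur = head;
    while let Some(i) = cur {
        out.push(nodes[i].name);
        cur = nodes[i].next;
    }
    out
}

/// Replays the schedule from the comment and returns what the real list
/// (starting at `head`) contains afterwards.
fn replay_stale_unlink() -> Vec<&'static str> {
    let mut nodes = vec![
        Node { name: "A", next: Some(1) },
        Node { name: "B", next: Some(2) },
        Node { name: "C", next: None },
    ];
    let mut head = Some(0);
    assert_eq!(reachable(&nodes, head), ["A", "B", "C"]);

    // T1 is mid-iteration and remembers A as its predecessor.
    let t1_pred = 0;

    // T2 sees that A has exited and unlinks it from the real list.
    head = nodes[0].next;

    // B exits. T1 unlinks B, but only from its stale predecessor A.
    let b = nodes[t1_pred].next.expect("A still points at B");
    nodes[t1_pred].next = nodes[b].next;

    // B is still reachable from `head`, even though T1 believes it is gone.
    reachable(&nodes, head)
}
```

Calling `replay_stale_unlink()` shows B still on the list that starts at `head`, which is the dangling entry the comment warns about.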

gereeter · 2016-05-01T18:44:24Z

Right - just because I found one bug doesn't mean there aren't others.

I completely scrapped the old code and used essentially Tim Harris's linked list removal algorithm instead. This involved introducing a new MarkedAtomic type.
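For readers unfamiliar with the technique, the core idea of marking the low-order bit of a pointer can be sketched as follows. This is not the `MarkedAtomic` type from this PR; `MarkedWord` and its methods are hypothetical, and plain `usize` addresses stand in for real pointers so the example needs no `unsafe`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// The low-order bit is free because node addresses are at least 2-aligned.
const MARK: usize = 1;

/// Hypothetical stand-in for a marked atomic pointer.
struct MarkedWord(AtomicUsize);

impl MarkedWord {
    fn new(addr: usize) -> Self {
        assert_eq!(addr & MARK, 0, "address must leave the mark bit free");
        MarkedWord(AtomicUsize::new(addr))
    }

    /// Returns the address and whether the mark bit is set.
    fn load(&self) -> (usize, bool) {
        let raw = self.0.load(Ordering::Acquire);
        (raw & !MARK, raw & MARK != 0)
    }

    /// Logical deletion: sets the mark bit on this `next` field.
    /// Returns `true` only for the caller that actually set it.
    fn mark(&self) -> bool {
        self.0.fetch_or(MARK, Ordering::AcqRel) & MARK == 0
    }

    /// Physical unlink step: swings the field from the unmarked `current`
    /// to `new`. Fails if the field changed or has been marked meanwhile.
    fn cas_unmarked(&self, current: usize, new: usize) -> bool {
        self.0
            .compare_exchange(current, new, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
    }
}
```

Because a node's own `next` field is marked before the node is unlinked, a thread holding a stale predecessor fails its compare-and-swap on that field instead of silently unlinking from a node that is no longer on the list.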

gereeter · 2016-05-01T18:47:11Z

Oddly, though, the slammer script, even after running for at least 16 hours on my original attempt (which, as you pointed out, was still buggy), never reported a crash.

msullivan · 2016-05-01T22:19:42Z

This seems plausible to me. Am I right in thinking that the Scan for an active node part is an optimization and would work if it always just picked the next node instead? (Since the active node that gets found could of course become inactive immediately after the check.) Perhaps it would be worth the code simplification to ditch that?

Maybe we need a test script that more heavily stresses thread creation/destruction to make these bugs more visible.

arthurprs · 2017-01-16T15:24:36Z

Is there anything specific blocking this? Otherwise it's a great fix.

Vtec234 · 2017-01-16T21:51:57Z

Hey! It appears that this already adds the functionality I proposed in #102. It also has support for more things, notably arbitrarily sized markers and or_mark. OTOH, my branch has IMO better docs and a function I need - from_ptr. Ideally we could keep the best parts. If you are willing to merge it, I can write up a patch adding my functionality to this PR and then I could close mine.

gereeter · 2017-01-17T01:27:24Z

I... completely forgot about this PR. As far as I know, there isn't anything really blocking this.

@msullivan I believe that it is just an optimization, but I think I'd prefer to just stick as close to Tim Harris's described algorithm as possible.

@Vtec234 That sounds reasonable to me.

gereeter · 2017-01-20T19:45:18Z

Thanks to @Vtec234, I think the TaggedAtomic type introduced here is good for general consumption, beyond just fixing the leak of participant data.

gereeter · 2017-03-07T05:23:17Z

Ping. It'd be nice to have #37 fixed.

aturon · 2017-03-09T17:22:17Z

FYI, https://internals.rust-lang.org/t/crossbeam-request-for-help/4933 -- I'm actively looking to hand over maintenance, as I simply do not have the time to give this library attention.

aturon · 2017-03-09T17:38:51Z

@arthurprs @gereeter @msullivan in particular I'd love to bring any of you on as maintainers, if you're game.

jeehoonkang · 2017-04-07T18:06:54Z

Hi! I just wonder if we could merge atomic.rs and tagged_atomic.rs. Theoretically TaggedAtomic provides a strictly larger set of features than Atomic, and it would be burdensome to maintain two copies of the common functionality.

We can simply name it Atomic, as it will not be a breaking change.

Vtec234 · 2017-04-07T20:17:54Z

~~Can do.~~

EDIT: On second thought, I am not sure this is a good idea. Notice that e.g. load on Atomic returns Shared, while on TaggedAtomic it returns (Shared, usize). There are more instances of this. Changing load would be a breaking change. An option is to add load_with_tag and make load just ignore that tag. There is (probably negligible) overhead related to decoding the pointer, but more importantly grafting tagging functionality onto the Atomic type might be confusing. There are also problems like 'how do we deal with store'? Overwriting just the pointer while preserving the tag requires a CAS, which is costly, and ignoring the tag (maybe defaulting it to 0 when just the pointer is set) might not be what we want and would have to be explained in the docs, making it all the more confusing when you don't care about tagging. So I am in favour of slightly more maintenance burden, but cleaner API.
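The trade-off around store can be sketched with a bare `AtomicUsize` holding an address plus tag bits. The function names, the two-bit `TAG_MASK`, and the memory orderings are assumptions for illustration only; they are not the `TaggedAtomic` API from this PR.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Assumed layout: the two low-order bits hold the tag.
const TAG_MASK: usize = 0b11;

/// Splits a raw word into (address, tag).
fn decode(word: &AtomicUsize) -> (usize, usize) {
    let raw = word.load(Ordering::Acquire);
    (raw & !TAG_MASK, raw & TAG_MASK)
}

/// Cheap option: a plain store that silently resets the tag to 0.
fn store_reset_tag(word: &AtomicUsize, addr: usize) {
    word.store(addr & !TAG_MASK, Ordering::Release);
}

/// Tag-preserving option: needs a compare-and-swap loop, because the tag
/// may change between reading it and writing the new address.
fn store_keep_tag(word: &AtomicUsize, addr: usize) {
    let mut cur = word.load(Ordering::Relaxed);
    loop {
        let new = (addr & !TAG_MASK) | (cur & TAG_MASK);
        match word.compare_exchange_weak(cur, new, Ordering::AcqRel, Ordering::Relaxed) {
            Ok(_) => return,
            Err(actual) => cur = actual, // retry with the freshly observed word
        }
    }
}
```

Either behavior would have to be documented on a merged Atomic type, which is the source of the confusion mentioned above.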

arthurprs · 2017-04-07T20:43:43Z

Is it a good idea? I'm not sure all methods interact well across tag/normal use.

gereeter added 2 commits May 1, 2016 13:39

Add MarkedAtomic, a more general version of Atomic that support marking the low order bits of the pointer

7b2501b

Reclaim inactive participant data using a variation on Tim Harris's linked list removal algorithm. Fixes crossbeam-rs#37.

7f45314

gereeter force-pushed the reclaim-participants branch from 561888b to 7f45314 Compare May 1, 2016 18:41

Relax the memory ordering of cleaning up inactive participant data

e9ada8c

arthurprs mentioned this pull request Jan 16, 2017

Add MarkableAtomic #102

Closed

Vtec234 and others added 6 commits January 17, 2017 12:36

Additions to markable atomic type

df5752f

markable atomic: add and_mark and cosmetics changes

7ac52b5

Change MarkableAtomic to TaggedAtomic

2c64fe7

tagged_atomic: rename bitwise fns, fix bitwise and, add support for types not aligned to powers of 2

4277203

tagged_atomic: docs, add xor

a97d4c0

Merge pull request #1 from Vtec234/reclaim-participants

6cf0075

Additions to markable atomic type

jeehoonkang mentioned this pull request Jan 20, 2018

Use subcrates #169

Merged

jeehoonkang closed this in #169 Feb 5, 2018
