spinlock for GC #1153

MartinNowak · 2015-02-03T01:16:30Z

- less overhead than pthread_mutex
- uses a test and test-and-set algorithm with configurable backoff

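For readers unfamiliar with the algorithm named in the description, here is a minimal sketch of a test and test-and-set lock in D. It is not the SpinLock from this PR: `TtasLock`, `yieldThresh`, `hammer`, and the choice to back off with `Thread.yield()` are assumptions made for illustration only.

```d
import core.atomic : atomicLoad, atomicStore, cas, MemoryOrder;
import core.thread : Thread;

/// Hypothetical illustration type, not the lock added by this PR.
struct TtasLock
{
    /// Failed attempts before yielding to the scheduler (assumed value).
    enum yieldThresh = 16;

    private bool locked;

    void lock() shared
    {
        size_t failures;
        for (;;)
        {
            // "test": a cheap read-only poll that does not invalidate
            // the cache line while another thread holds the lock.
            // "test-and-set": only attempt the atomic swap once the
            // lock looks free.
            if (!atomicLoad!(MemoryOrder.raw)(locked) && cas(&locked, false, true))
                return;
            // Simple backoff: give up the time slice after a few failures.
            if (++failures >= yieldThresh)
                Thread.yield();
        }
    }

    void unlock() shared
    {
        // Release store publishes the writes made inside the critical section.
        atomicStore!(MemoryOrder.rel)(locked, false);
    }
}

shared TtasLock guard;
__gshared size_t counter;

/// Increments a shared counter from several threads under the lock.
size_t hammer(uint threads, uint iters)
{
    counter = 0;
    auto workers = new Thread[](threads);
    foreach (ref t; workers)
    {
        t = new Thread({
            foreach (_; 0 .. iters)
            {
                guard.lock();
                ++counter;
                guard.unlock();
            }
        });
        t.start();
    }
    foreach (t; workers)
        t.join();
    return counter;
}

void main()
{
    // No increment may be lost if the lock provides mutual exclusion.
    assert(hammer(4, 10_000) == 40_000);
}
```

The read-only poll before the `cas` is what distinguishes this from a plain test-and-set loop: waiting threads spin on a locally cached value instead of repeatedly issuing atomic writes.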
MartinNowak · 2015-02-03T01:19:39Z

master

R tree1            0.843 s,    22 MB,   68 GC  416 ms, Pauses  298 ms <    7 ms
R conmsg           0.866 s,     5 MB,  191 GC   47 ms, Pauses    7 ms <    0 ms
R huge_single      0.028 s,  1501 MB,    3 GC    1 ms, Pauses    0 ms <    0 ms
R tree2            1.223 s,     1 MB,  216 GC   90 ms, Pauses    3 ms <    0 ms
R concpu           0.112 s,     5 MB,   13 GC    6 ms, Pauses    6 ms <    4 ms
R testgc3          2.243 s,   210 MB,   15 GC  564 ms, Pauses  419 ms <   99 ms
R conalloc         0.113 s,     5 MB,   14 GC    2 ms, Pauses    2 ms <    0 ms
R conappend        0.031 s,     5 MB,    4 GC    0 ms, Pauses    0 ms <    0 ms
R words            1.249 s,   341 MB,    9 GC   31 ms, Pauses   31 ms <   16 ms
R rand_large       0.649 s,    92 MB, 3820 GC  270 ms, Pauses  112 ms <    0 ms
R dlist            2.109 s,    22 MB,   53 GC  303 ms, Pauses  186 ms <    9 ms
R rand_small       0.741 s,    12 MB, 2032 GC  394 ms, Pauses  209 ms <    0 ms
R slist            2.043 s,    22 MB,   53 GC  259 ms, Pauses  144 ms <    4 ms

spinLock

R tree1            0.786 s,    22 MB,   68 GC  407 ms, Pauses  296 ms <    7 ms
R conmsg           0.749 s,    12 MB,   33 GC   82 ms, Pauses   32 ms <    1 ms
R huge_single      0.028 s,  1501 MB,    3 GC    1 ms, Pauses    0 ms <    0 ms
R tree2            1.144 s,     1 MB,  216 GC   82 ms, Pauses    3 ms <    0 ms
R concpu           0.112 s,     5 MB,   14 GC    4 ms, Pauses    4 ms <    1 ms
R testgc3          2.219 s,   210 MB,   15 GC  572 ms, Pauses  422 ms <   99 ms
R conalloc         0.111 s,     5 MB,   12 GC    2 ms, Pauses    1 ms <    1 ms
R conappend        0.034 s,     5 MB,    5 GC    0 ms, Pauses    0 ms <    0 ms
R words            1.271 s,   341 MB,    9 GC   48 ms, Pauses   47 ms <   25 ms
R rand_large       0.676 s,    92 MB, 3820 GC  301 ms, Pauses  112 ms <    0 ms
R dlist            2.040 s,    22 MB,   53 GC  304 ms, Pauses  181 ms <    9 ms
R rand_small       0.722 s,    12 MB, 2032 GC  388 ms, Pauses  210 ms <    0 ms
R slist            1.985 s,    22 MB,   53 GC  263 ms, Pauses  142 ms <    4 ms

rainers · 2015-02-03T08:11:46Z

I was tempted to do something similar when seeing the API profiling (#1147). It seems a lot of people tell you not to try to roll your own locking methods, though.

rainers · 2015-02-03T08:16:54Z

src/core/internal/spinlock.d

+    static if (X86)
+    {
+        enum pauseThresh = 16;
+        void pause() { asm @trusted nothrow { rep; nop; } }


Most people prefer _mm_pause() here, i.e. pause as an instruction.

That's the same and DMD doesn't know pause.
http://stackoverflow.com/questions/7086220/what-does-rep-nop-mean-in-x86-assembly

I (mis?)read somewhere that pause would require less power than rep nop, but if they have the same machine code, that's obviously nonsense.

I didn't know either. I first tried to use pause (Issue 14120), then found a REP_NOP macro somewhere, which the disassembler showed me as pause.

It's a clever encoding, in that it preserves the semantics on older hardware.
Same as the HLE prefixes.
http://www.felixcloutier.com/x86/XACQUIRE:XRELEASE.html

rep; nop is also supported by older x86 chips, which is a bonus.
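To make the role of `pauseThresh` in the hunk above concrete, here is a hedged sketch of a staged backoff built around the `rep; nop` hint. Only `pauseThresh = 16` and the `rep; nop` encoding come from the diff; `cpuRelax`, `Stage`, `stageFor`, `backoff`, `yieldThresh`, and the 1 ms sleep are assumptions chosen for illustration.

```d
import core.thread : Thread;
import core.time : msecs;

enum pauseThresh = 16; // value taken from the diff hunk
enum yieldThresh = 32; // assumed for this sketch

/// Spin-wait hint. On x86, `rep; nop` assembles to the same bytes as `pause`.
void cpuRelax() @trusted nothrow @nogc
{
    version (D_InlineAsm_X86)
        asm @trusted nothrow @nogc { rep; nop; }
    else version (D_InlineAsm_X86_64)
        asm @trusted nothrow @nogc { rep; nop; }
    // Other targets: no hint in this sketch.
}

enum Stage { pause, yield, sleep }

/// Maps the number of failed attempts to a backoff stage.
Stage stageFor(size_t attempt) @safe pure nothrow @nogc
{
    if (attempt < pauseThresh)
        return Stage.pause;
    if (attempt < yieldThresh)
        return Stage.yield;
    return Stage.sleep;
}

/// Backs off more aggressively the longer the lock stays contended.
void backoff(size_t attempt)
{
    final switch (stageFor(attempt))
    {
        case Stage.pause: cpuRelax(); break;
        case Stage.yield: Thread.yield(); break;
        case Stage.sleep: Thread.sleep(1.msecs); break;
    }
}

void main()
{
    // Walk through all three stages once.
    foreach (attempt; 0 .. yieldThresh + 2)
        backoff(attempt);
}
```

Short waits stay on the CPU with the spin hint, while longer waits hand the core back to the scheduler so a preempted lock holder can make progress.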

MartinNowak · 2015-02-03T08:33:30Z

> It seems a lot of people tell you not to try to roll your own locking methods, though.

Because it has a lot of gotchas, the deadlock of the testers being one of them.
Will close this temporarily.

MartinNowak · 2015-02-03T09:37:59Z

I thought we wouldn't need a recursive lock, but maybe we do.

MartinNowak · 2015-02-03T10:52:23Z

It's the newly added runFinalizer tests in rt.lifetime. They try to produce a FinalizeError without an InvalidMemoryError. That revealed that I need to set gcx.running during runFinalizers. The irony is that even throwing a preallocated exception allocates a TraceInfo class, which then fails and triggers an InvalidMemoryError. So it's not possible to see a FinalizeError unless you override the traceHandler.
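As a rough illustration of the hook mentioned here, the sketch below disables trace collection through `core.runtime.Runtime.traceHandler` and checks whether a thrown exception still gets trace info attached. It is not the rt.lifetime test; `throwsWithoutTrace` is a hypothetical helper, and the sketch assumes that a null handler turns trace collection off.

```d
import core.runtime : Runtime;

/// Throws and catches a probe exception with trace collection disabled.
/// Returns true if no TraceInfo was attached to the exception.
bool throwsWithoutTrace()
{
    auto saved = Runtime.traceHandler;
    Runtime.traceHandler = null;               // assumed to disable tracing
    scope (exit) Runtime.traceHandler = saved; // restore the default handler

    bool traceless;
    try
        throw new Exception("probe");
    catch (Exception e)
        traceless = e.info is null;
    return traceless;
}

void main()
{
    import core.stdc.stdio : printf;
    printf("trace attached: %s\n", throwsWithoutTrace() ? "no".ptr : "yes".ptr);
}
```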

MartinNowak · 2015-02-03T21:12:50Z

Still fighting with the FinalizeError issues.
MartinNowak@f4b989a

I need to explicitly catch Errors and rethrow them whenever finalizers are run so that I can perform at least some basic cleanup. That might still leave the GC in an invalid state.
All of this only used to work by accident.

MartinNowak · 2015-02-03T21:16:14Z

How about this instead: we just kill the process on FinalizeError and print the stack trace?
I don't really want to make the GC Exception/Error safe as that costs performance and adds complexity.
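A minimal sketch of that idea, under stated assumptions: `runFinalizerOrDie` is a hypothetical wrapper, not a druntime function, and it prints only the message through C stdio rather than a full stack trace, so that the failure path itself does not allocate from the GC.

```d
import core.stdc.stdio : fprintf, stderr;
import core.stdc.stdlib : abort;

/// Hypothetical wrapper: runs a finalizer and terminates the process if
/// anything escapes, instead of unwinding through collector internals.
void runFinalizerOrDie(scope void delegate() finalizer) nothrow
{
    try
        finalizer();
    catch (Throwable t)
    {
        // C stdio only: no GC allocation while reporting the failure.
        fprintf(stderr, "finalizer failed: %.*s\n",
                cast(int) t.msg.length, t.msg.ptr);
        abort();
    }
}

void main()
{
    bool ran;
    // A well-behaved finalizer simply runs to completion.
    runFinalizerOrDie({ ran = true; });
    assert(ran);
}
```

Aborting keeps the collector free of exception-safety bookkeeping, at the cost of making any throwing finalizer fatal.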

MartinNowak · 2015-02-03T21:19:59Z

We hit a similar problem with Threads and nothrow functions before.
Issue 7018 – thrown Error from different thread should lead to program abort

rainers · 2015-02-03T21:32:15Z

> even throwing a preallocated exception allocates a TraceInfo class

Ouch. How did we miss that so far? So much for the @nogc attributes of onOutOfMemoryError and onInvalidMemoryOperationError. Could the default trace handler use C-malloced memory instead?
BTW: printing a stack trace needs the GC again, e.g. in toString. I'm not sure how this can work if the Error was raised from within the GC itself. There is no proper cleanup due to nothrow attributes anyway.

MartinNowak · 2015-02-03T21:37:34Z

Exactly, it's a mess.

> Could the default trace handler use C-malloced memory instead?

Yes, but only because finalization is done after thread_resumeAll.

rainers · 2015-02-03T21:38:22Z

> How about this instead: we just kill the process on FinalizeError and print the stack trace?

Sounds reasonable. It might need a lot of changes to the stack tracing though, as it relies on the GC quite a bit.

MartinNowak · 2015-02-03T21:41:27Z

We should probably delay the whole story until 2.068, because it seems pretty risky. It's only a tiny gain, and I'm still working on thread-local caches, which heavily reduce lock contention and make the gain even smaller.

rainers · 2015-02-03T21:48:15Z

Yeah, delaying this is probably better. Wouldn't most problems disappear if we worked towards moving finalization out of the GC lock and allowing allocation in destructors?

MartinNowak · 2015-02-03T22:02:25Z

> Wouldn't most problems disappear if we worked towards moving finalization out of the GC lock and allowing allocation in destructors?

That's definitely something I want to do. Not sure if we can move it out of the lock though.
I also thought about parallelizing finalization, but it's probably not worth the trouble.

MartinNowak · 2015-12-01T01:29:59Z

> Wouldn't most problems disappear if we worked towards moving finalization out of the GC lock and allowing allocation in destructors?

Yes, but it would require quite a few tricks to allow concurrent access to the freebits metadata.
Maybe it could be done by splitting the sweep phase into a finalize phase and a free phase, and doing the latter with the lock held during recover. In any case, it is quite some work.

MartinNowak · 2015-12-01T01:39:59Z

Reopened as #1447.

spinlock for GC

514b8b1

- less overhead than pthread_mutex
- uses test and test-and-set algorithm with configurable backoff

rainers reviewed Feb 3, 2015
MartinNowak closed this Feb 3, 2015

MartinNowak modified the milestones: 2.068, 2.067 Feb 3, 2015

MartinNowak added the GC garbage collector label Mar 7, 2015

MartinNowak modified the milestones: 2.069, 2.068 Sep 9, 2015

MartinNowak mentioned this pull request Dec 1, 2015

spinlock for GC #1447

Merged

Review comments on src/core/internal/spinlock.d: rainers (Feb 3, 2015), MartinNowak (Feb 3, 2015), rainers (Feb 3, 2015), MartinNowak (Feb 3, 2015), ibuclaw (Feb 13, 2016)
