Thanks for your pull request and interest in making D better, @somzzz! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please see CONTRIBUTING.md for more information. If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.
Bugzilla references
Your PR doesn't reference any Bugzilla issue. If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.
159895e to 9f42a62
src/gc/proxy.d
Outdated
}

alias GCInitNoThrow = void function() nothrow @nogc;
this appears to be unused
src/gc/proxy.d
Outdated
{
instanceLock.lock();
scope(exit) instanceLock.unlock();
{
unnecessary extra scope?
src/gc/proxy.d
Outdated
fprintf(stderr, "No GC was initialized, please recheck the name of the selected GC ('%.*s').\n", cast(int)config.gc.length, config.gc.ptr);
exit(1);
if (atomicLoad(*pinstance) is null)
whenever I do double-checked locking, I like to put a comment :)
// using double-checked locking
src/gc/proxy.d
Outdated
exit(1);
}

atomicStore(*pinstance, cast(shared GC)auxInstance);
Race condition? Setting instance to "not null" should be the last thing done, correct? Otherwise another thread could come in, see that instance is not null, and then continue before thread_init has finished. However, my comment assumes that thread_init should be finished before another thread can start using the GC; is this assumption correct?
Depends on the startup, there could be multiple external "C" threads calling thread_attachThis concurrently.
@MartinNowak so what's the right course of action here?
The mutual dependency between gc_init and thread_init isn't properly resolved.
if (atomicLoad(*pinstance) is null)
{
    instanceLock.lock();
    scope (exit) instanceLock.unlock();
    if (atomicLoad(*pinstance) is null)
    {
        // init
        // ...
        atomicStore(*pinstance, cast(shared GC)auxInstance);
        // thread_init <- should be moved before storing pinstance, or some other thread could start to use the GC before thread_init finishes
    }
}
The problem is that thread_init depends on the GC, but any GC usage depends on thread_init (for suspending).
Let's figure out whether we can avoid the GC for Thread.getThis; if not, then pass auxInstance directly to thread_init.
And BTW, use weaker (and much cheaper) orderings.
if (atomicLoad!(MemoryOrder.acq)(*pInstance) is null)
{
    // lock...
    if (atomicLoad!(MemoryOrder.acq)(*pInstance) is null)
    {
        // init...
        atomicStore!(MemoryOrder.rel)(*pInstance, auxInstance);
    }
}
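For readers following along, the acquire/release variant suggested above can be sketched as a complete program. This is only a sketch under assumptions, not the code from this PR: the names resource, initLock, initialized and initCalls are hypothetical, and a separate bool flag is published instead of the GC instance pointer to keep the types simple.

```d
import core.atomic : atomicLoad, atomicStore, atomicOp, MemoryOrder;
import core.sync.mutex : Mutex;
import core.thread : Thread;

// Hypothetical stand-ins; druntime's real code publishes the GC instance itself.
__gshared int resource;      // the lazily initialized state
__gshared Mutex initLock;    // guards the slow path
shared bool initialized;     // publication flag
shared int initCalls;        // counts how often the initialization ran

shared static this() { initLock = new Mutex; }

int getResource()
{
    // Fast path: this acquire load pairs with the release store below.
    if (!atomicLoad!(MemoryOrder.acq)(initialized))
    {
        initLock.lock();
        scope (exit) initLock.unlock();

        // Second check: another thread may have finished while we waited.
        if (!atomicLoad!(MemoryOrder.acq)(initialized))
        {
            resource = 42;                   // "init..."
            atomicOp!"+="(initCalls, 1);
            // Publish last so readers never observe a half-built resource.
            atomicStore!(MemoryOrder.rel)(initialized, true);
        }
    }
    return resource;
}

void main()
{
    Thread[] threads;
    foreach (i; 0 .. 8)
    {
        auto t = new Thread({ getResource(); });
        t.start();
        threads ~= t;
    }
    foreach (t; threads)
        t.join();

    assert(getResource() == 42);
    assert(atomicLoad(initCalls) == 1);
}
```

The point of the ordering is that the flag is stored only after everything it guards has been written, which is the same reason the review asks for the instance store to come last.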
@MartinNowak let's make it most conservative and optimize in a subsequent step thx
This is nastier than it looks.
thread_init calls thread_attachThis, which calls GC.disable and GC.enable.
https://github.com/dlang/druntime/blob/master/src/core/thread.d#L2138
While gc_init has not yet finished, the spinlock is still locked. Calling disable will redirect to gc_init (with the instance still null because the store hasn't been done yet) -> deadlock.
Yes, just directly pass the newly created GC instance to thread_init, so that thread_attachThis can directly access the GC interface instead of calling the global gc_disable/gc_enable functions.
That works for gc_disable/gc_enable.
However, the code then creates an object with new Thread, which will become a call to _d_newclass in lifetime.d.
https://github.com/dlang/druntime/blob/master/src/rt/lifetime.d#L93
This function calls GC.malloc. I can't pass the instance there, and the code hangs on the spinlock (as GC.malloc tries to call gc_init).
A slightly ugly workaround is to create a static tlsInstance for the thread doing the initialization and use this one, if available, in gc_enable, gc_disable and gc_malloc.
_d_newclass should forward to a function that accepts a GC instance, and then instead of calling new Thread, call the function directly (with the auxInstance of course).
src/gc/proxy.d
Outdated
import core.stdc.stdlib : exit;

fprintf(stderr, "No GC was initialized, please recheck the name of the selected GC ('%.*s').\n", cast(int)config.gc.length, config.gc.ptr);
instanceLock.unlock();
You've unlocked twice! (scope(exit) instanceLock.unlock(); above)
yes please delete
For some reason, if unlock is not called before exit(1), this test hangs: druntime/test/exceptions/src/unknown_gc.d
Uh oh, that may be a bug with D's scope(exit) implementation (anyone correct me if I'm wrong). Given this, I would suggest not using scope(exit) here and instead inserting instanceLock.unlock() at every "exit point". There should be just 2: just before calling exit and at the end of the block.
FWIW we have a nice language feature for exactly this:
synchronized (instanceLock)
{
    // code here
}
If scope(exit) doesn't work with the exit function, my guess is that this might suffer from the same bug. @somzzz can you try this out and confirm?
Wait, exit() doesn't return, so it can't exit the scope. This seems like the code is correct to me...
scope(exit) only works if the scope is exited, via stack unwinding or normal execution. Calling the exit function is neither.
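A small sketch of the two cases in which scope(exit) does run may help here. The counter unlocks and both helper functions are hypothetical and stand in for the instanceLock.unlock() call discussed in this thread.

```d
__gshared int unlocks;   // hypothetical: counts how often the guard ran

void normalReturn()
{
    scope (exit) ++unlocks;   // runs when the scope ends normally
}

void viaException()
{
    scope (exit) ++unlocks;   // also runs during stack unwinding
    throw new Exception("boom");
}

void main()
{
    normalReturn();
    assert(unlocks == 1);

    try
    {
        viaException();
    }
    catch (Exception)
    {
        // swallowed on purpose; only the guard's side effect matters here
    }
    assert(unlocks == 2);

    // By contrast, core.stdc.stdlib.exit(1) terminates the process without
    // leaving the enclosing scope, so a pending scope (exit) does not run.
    // A lock held at that point has to be released explicitly first.
}
```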
keep scope(exit) and also unlock explicitly before calling exit thx
src/gc/proxy.d
Outdated
instanceLock.lock();
scope(exit) instanceLock.unlock();
{
if (atomicLoad(*pinstance) is null)
I don't think atomicLoad is necessary once inside the lock, since the value is guaranteed not to be modified while a thread is inside the lock.
I wonder whether you have tried the following trick of using a variable as a cache, thus providing a fast hot path for the default case:
__gshared bool gcHasBeenInitialized;
if (!gcHasBeenInitialized)
{
    // do your normal locking here
    gcHasBeenInitialized = true;
}
TLS variables could be faster, but you could benchmark this for dynamic modules too.
That's what that is, I believe, only fixed to be correct.
(A TLS cache can be made thread-safe without atomics, of course.)
Would love to see a performance comparison on TLS vs atomic double-checked locking.
I'd expect that to be hard to do as a microbenchmark on x86 in a sensible fashion, since it depends a lot on non-local considerations. If there is no contention on the cache line, I'd expect double checked locking to be at least as fast (acq loads are just simple movs).
Double-checked locking should be faster; TLS reads go through an extra indirection and can be very slow if TLS is emulated. Just reading a shared variable from multiple threads doesn't cost much (except maybe for the non-inlined atomicLoad!acq overhead with dmd).
@somzzz let's conservatively leave atomicLoad here and review it in a future pass thx
Yep, precisely. On x86_64 TLS can pretty much only be faster if the indirection is cheap (i.e. static TLS model) and the other variant is burdened down by concurrent writes to other data on the same cache line (hence my above comment). I wouldn't worry about it.
src/gc/proxy.d
Outdated
thread_init();
void gc_init_nothrow() nothrow
{
try
gc_init_nothrow is going to be called often. Rather than incurring the overhead of a try/catch on every gc_init_nothrow call, this should probably go inside the double-checked lock. That way you only pay the price of a try/catch once, when you initialize the GC.
Indeed, use something similar to initOnce.
Instead of a separate flag variable, you could directly load/store the shared instance variable. Also seems like the spinLock would be fine to be used as mutex.
I think the purpose here was just to trick nothrow.
A short idiom for that is the following.
void foo() nothrow
{
    scope (failure) assert(0, "unexpected exception");
    stmt; // not yet nothrow annotated
}
Calling abort is cleaner as it prints a better error message, so you might just want to stick with the current code. In that case please add a comment that gc_init should be annotated nothrow instead.
From what I understand, try/catch and scope(failure) incur some amount of overhead. By moving this inside the lock we only have to incur it once. I still recommend doing so.
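The suggestion of keeping the exception handling off the hot path could look roughly like the sketch below. It is single-threaded on purpose (the locking is left out), and ready, slowPathRuns, initThatMayThrow and ensureInit are hypothetical names rather than anything in druntime.

```d
import core.stdc.stdlib : abort;

__gshared bool ready;        // hypothetical "already initialized" flag
__gshared int slowPathRuns;  // counts entries into the guarded slow path

// Hypothetical initializer that is not annotated nothrow.
void initThatMayThrow()
{
    ready = true;
}

void ensureInit() nothrow
{
    if (ready)
        return;              // hot path: no try/catch on this route

    ++slowPathRuns;
    try
    {
        initThatMayThrow();
    }
    catch (Exception)
    {
        abort();             // an initialization failure is unrecoverable here
    }
}

void main()
{
    foreach (i; 0 .. 1000)
        ensureInit();
    assert(ready);
    assert(slowPathRuns == 1);
}
```

Whether the try/catch actually costs anything when no exception is thrown depends on the compiler and exception-handling scheme, so this is about structure rather than a measured gain.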
Based on @MartinNowak's suggestions, initOnce looks like it could work here so long as you use a mutex and wrap the init argument in a try/catch.
initOnce is in Phobos, which is the reason I didn't suggest it earlier. We could put something like it in druntime, of course, for internal use.
dmd is fairly bad at optimizing initOnce, so doing it manually seems fine here.
src/gc/proxy.d
Outdated
@@ -68,132 +103,159 @@ extern (C)
// NOTE: Due to popular demand, this has been re-enabled. It still has
// the problems mentioned above though, so I guess we'll see.

instance.collectNoStack(); // not really a 'collect all' -- still scans
// static data area, roots, and ranges.
if (instance !is null)
I'm not familiar with who calls gc_term. Would it be better to make this synchronized like you have done in gc_init? Or is this only going to be called by a single thread?
gc_term is called in dmain2.d->rt_term after thread_joinAll.
https://github.com/dlang/druntime/blob/master/src/rt/dmain2.d#L223
Thanks for making this change. I like the idea.
src/gc/proxy.d
Outdated
instance.enable();
}

void gc_disable()
{
assert(instance !is null);
instance.disable();
Will this break code that calls GC.disable at the beginning of the program? Sometimes it's useful to do this and schedule your own calls to GC.collect; I've done this before to optimize GC performance.
Would it be better to also initialize the GC in gc_enable / gc_disable, in addition to doing it upon the first memory allocation? IOW, initialize the GC the first time it's used in some way, whether it's a memory allocation or enable/disable, etc.
If there is a use case for it, then yes - it should be initialized.
Which of these methods do you think should initialize the GC?
Definitely GC.disable as it's fairly common. Don't think anyone has explicitly called GC.enable yet. Even better if we could remember the GC.disable call and lazily initialize and disable the GC if it's used later on. It would be weird to initialize a GC just to disable it.
Don't make things too complicated though; when in doubt, just initialize the GC on any access.
Hm... just had a weird idea.
What if the instance is initialized with a GC that when first used, figures out what GC to allocate, sets the 'instance' to the new one, and calls the appropriate function on it. Then we can handle things like a bool to track the disabling without having to initialize until someone tries to call malloc, etc.
Then once the real instance is in place, there is no longer any checking for initialization.
You'd still need to atomically load the instance every time you use it, unfortunately.
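To make the idea concrete, here is a heavily simplified sketch of such a switching proxy. Everything in it is an assumption for illustration: MiniGC is a hypothetical two-method interface (the real GC interface is much larger), RealGC just forwards to the C heap, and thread-safety of the switch is ignored entirely.

```d
import core.stdc.stdlib : cmalloc = malloc, free;

// Hypothetical, cut-down interface; not druntime's actual GC interface.
interface MiniGC
{
    void disable();
    void* malloc(size_t size);
}

__gshared MiniGC instance;      // global entry point, starts out as the proxy
__gshared bool startDisabled;   // remembers a disable() issued before init

final class RealGC : MiniGC
{
    bool disabled;

    void disable() { disabled = true; }

    void* malloc(size_t size)
    {
        return cmalloc(size);   // placeholder for a real allocator
    }
}

final class ProtoGC : MiniGC
{
    // Cheap: just record the request, no real GC is created yet.
    void disable() { startDisabled = true; }

    void* malloc(size_t size)
    {
        // First real use: build the real GC, replay the cached state,
        // then swap it in so later calls bypass the proxy completely.
        auto gc = new RealGC;
        if (startDisabled)
            gc.disable();
        instance = gc;
        return gc.malloc(size);
    }
}

void main()
{
    instance = new ProtoGC;
    instance.disable();                       // still only the proxy
    assert(cast(ProtoGC) instance !is null);

    auto p = instance.malloc(16);             // triggers the switch
    assert(p !is null);

    auto gc = cast(RealGC) instance;
    assert(gc !is null && gc.disabled);       // cached disable was replayed
    free(p);
}
```

Once the switch has happened, no call path checks for initialization any more; the remaining open question in the thread is how the instance pointer itself should be loaded.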
I like this idea.
Which methods require that behavior: disable, addRoot, removeRoot, rootIter, addRange, removeRange, rangeIter? Sounds like a reasonable approach for lazy initialization.
BTW, that just requires an atomicLoad!(MemoryOrder.acq), and that's free on strongly-ordered CPUs.
I would initialize a full GC on anything that requires storing an array of things to do, because that involves allocation. Just having a global bool is OK to cache until you need the whole GC.
> that's free on strongly-ordered CPUs
Right, x86 doesn't have issues with loads of a word, so those will be fine. For the other architectures, I don't know enough about what the performance might be. I would leave that decision/discussion to those who know more :)
> I would initialize a full GC on anything that requires storing an array of things to do
Hm... I guess if you need to allocate using the C heap you can, as long as the behavior matches what the GC would actually do. You might even pass those into the GC initializer, and it would just take over ownership of the arrays.
> BTW, that just requires an atomicLoad!(MemoryOrder.acq), and that's free on strongly-ordered CPUs.
Please KISS (Keep It Simple and conServative) for now. Then we'll take a separate PR with loads and stores tightened. If we ever have trouble with those we'll be able to track that easily. Thanks!
src/gc/proxy.d
Outdated
import core.stdc.stdlib : exit;
import core.atomic : atomicLoad, atomicStore;

auto pinstance = cast(shared(GC)*) &instance;
Note that if you remove the second atomicLoad (see comment below), I believe you could just do
if (atomicLoad(instance))
and you don't need the pinstance variable.
@marler8997 I think that won't work because atomicLoad expects a ref to a shared object
Good idea, the ugly startup order has long been lingering around. I guess we want to add a test to druntime/test/ that checks that a @nogc main can be run without initializing the GC.
There is already unknown_gc, which should now require a GC allocation to pass, but you can adapt it to check that no GC is ever initialized.
src/gc/proxy.d
Outdated
// NOTE: The GC must initialize the thread library
// before its first collection.
thread_init();
What do you intend to do about Thread.getThis? It is currently a GC-allocated instance that is eagerly created in thread_init.
That may be why *pinstance had to be set just above! Seems fragile. Perhaps we can not use the GC for getThis?
You can currently retain a Thread object longer than the thread lives, e.g. there is even a Thread.join method, so there isn't a clear owner of that instance. Not too easy to avoid the GC here.
src/gc/proxy.d
Outdated
}

alias GCInitNoThrow = void function() nothrow @nogc;
make it private
(if used)
That's the main blocker to make this work https://github.com/dlang/druntime/pull/2057/files#r163907886.
This may have been lost in the noise, so I'll link to it here: Might be a viable option that eliminates some of the possible performance drawbacks.
@schveiguy I'm actually working on an alternative that explores your idea. Will make a PR soon if it turns out well.
src/gc/proxy.d
Outdated
fprintf(stderr, "No GC was initialized, please recheck the name of the selected GC ('%.*s').\n", cast(int)config.gc.length, config.gc.ptr);
instanceLock.unlock();
exit(1);
You should add an assert(0); here to note that exit(1) shouldn't return.
src/gc/proxy.d
Outdated
instance.enable();
}

void gc_disable()
{
assert(instance !is null);
instance.disable();
> Don't make things too complicated though, when in doubt just initialize the GC on any access.
let's do that now, refine later
Thanks for noticing ;-) I think with the switching proxy object no atomics are needed to access the instance pointer in the general case if pointer writes cannot appear as partial operations in other threads (which should cover most architectures). The synchronization is only needed in the switching proxy, i.e. it is a one-time operation and doesn't need to be very fast. The disable state of the GC can be saved to gc.config.disable before actually initializing the GC. BTW: how about not allocating any pool during GC initialization? That would have almost the same effect as not initializing it, except for a few tiny allocations.
I realized I omitted a discussion I've had with @somzzz. Next on her list to move the code that adds roots in So please let's get this in without more features. Accumulating
src/gc/impl/proto/gc.d
Outdated
class ProtoGC : GC
{
__gshared Array!Root roots;
__gshared Array!Range ranges;
Do those really need to be __gshared?
As said earlier, we have
@MartinNowak in wake of the upcoming removal of
That sounds nice in theory, but you're likely underestimating the complexity in that area, because GC roots are dynamically added/removed when loading/unloading shared libraries. So instead of conflating a complex with another complex, let's just go for the straightforward solution and cache roots/ranges until lazily initializing the GC.
@MartinNowak meet on slack?
After discussing, the shared lib example is quite compelling. @somzzz will implement the changes with this PR.
e451159 to b99629e
b99629e to 6d7fefd
Thanks, looks great.
Another nice thing that comes with this PR: once #1924 is implemented, applications wouldn't even need to link against any GC.
Congrats @somzzz and many thanks to all reviewers!
Yes, thanks for this important step towards a no-GC language, and thanks @schveiguy for the nice ProtoGC idea.
Thanks for the credit, I also have to mention that @rainers had the same idea (before I did), but just expressed it in a different way here: #2057 (comment).
size_t reserve(size_t size) nothrow
{
gc_init_nothrow();
return reserve(size);
What does this call?
oops! It should have been gc_reserve
void addRange(void* p, size_t sz, const TypeInfo ti = null) nothrow @nogc
{
ranges.insertBack(Range(p, p + sz, cast() ti));
This doesn't seem thread-safe; are we sure there can never be a situation where ProtoGC sticks around long enough to be racy (e.g. after improving druntime thread startup to be nogc and using statically allocated Thread instances, or when attaching to multiple C threads)?
Has this been thoroughly thought through for race conditions in general, especially considering architectures where loading the global instance doesn't automatically have acquire semantics?
It's a good point. Likely you need to take the lock. I'm wondering if adding/removing ranges or roots should simply initialize the GC at that point.
I thought the instance should be atomically loaded, see Martin's suggestion here. But that doesn't help if you need atomic array appending.
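One way to picture the "take the lock" option is the sketch below. It is an illustration under assumptions, not a proposed patch: Range is reduced to two pointers, a built-in array replaces the Array!Range container from the PR, and rangesLock is a hypothetical mutex.

```d
import core.sync.mutex : Mutex;
import core.thread : Thread;

// Hypothetical, reduced version of the range record.
struct Range
{
    void* pbot;
    void* ptop;
}

__gshared Range[] ranges;     // stand-in for the cached range list
__gshared Mutex rangesLock;   // hypothetical lock guarding the list

shared static this() { rangesLock = new Mutex; }

void addRange(void* p, size_t sz)
{
    // Serialize appends so concurrent callers cannot corrupt the list.
    rangesLock.lock();
    scope (exit) rangesLock.unlock();
    ranges ~= Range(p, p + sz);
}

void main()
{
    Thread[] threads;
    foreach (i; 0 .. 4)
    {
        auto t = new Thread({
            foreach (j; 0 .. 100)
                addRange(null, 1);
        });
        t.start();
        threads ~= t;
    }
    foreach (t; threads)
        t.join();

    assert(ranges.length == 400);
}
```

The alternative raised in the same comment, initializing the real GC as soon as a range or root is added, would avoid the extra lock at the cost of earlier initialization.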
@klickverbot thanks for the reminder. @somzzz is off to a scholarship in Singapore so she's off the project. The concern about addRange seems legit. Can you please create a bugzilla issue? Thanks!
I've created an "umbrella" issue; no time to delve into the code right now to figure out what exactly is broken.
This change causes a regression in phobos Array container.
No, the point of The problem causing the issue is that
Fixing PR: #2220
This PR removes the requirement of having the GC initialized at startup. Instead, gc_init is called when the GC is first invoked.