Conversation

denis-sh
Contributor

Issue URL: http://d.puremagic.com/issues/show_bug.cgi?id=9606

A think-less fix just to prevent nasty random failures.

@alexrp
Contributor

alexrp commented Apr 13, 2013

I'm not sure I understand what is going on here. Can you explain in more detail?

Also, GC.disable/enable would probably be sufficient if we really need to mess with the GC at all.
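For illustration, the GC.disable/enable approach suggested here might look roughly like this (a sketch only; `slots` and `slots_idx` are the fields of the std.signals mixin, and the surrounding Signal boilerplate is omitted):

```d
// Sketch: suppress GC collections (and thus finalization) while the
// slots array is being iterated.
import core.memory : GC;

void emit(Args...)(Args args)
{
    GC.disable();               // no implicit collection can run
    scope (exit) GC.enable();   // re-enable even if a slot throws
    foreach (slot; slots[0 .. slots_idx])
        if (slot !is null)
            slot(args);
}
```

Note that GC.disable only suppresses automatic collections; callbacks can still allocate, which is relevant to the objection in the reply below.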

@denis-sh
Contributor Author

std.signals tries to implement weak references (which we don't have in druntime, Issue 4151) and fails, just like std.stdio.File's attempt to implement RefCounted (fixed now).
We can't disable the GC in emit, as callbacks may rely on it. So we just ensure that the called objects will not be finalized during emit, using GC.addRange. This pull doesn't fix the issue, as connect and the other functions still have the same bug, but it greatly decreases the probability of failure.
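The idea, as I understand it, can be sketched like this (a sketch only, not the actual diff of this pull; `slots` and `slots_idx` are the fields of the std.signals mixin):

```d
// Sketch of the GC.addRange approach: while emit iterates, register the
// slots array as a root range so the receiver objects its delegates
// reference cannot be collected and finalized mid-iteration.
import core.memory : GC;

void emit(Args...)(Args args)
{
    // Pin: the GC now scans slots[0 .. slots_idx] conservatively,
    // keeping every object a slot's context pointer refers to alive.
    GC.addRange(slots.ptr, slots_idx * typeof(slots[0]).sizeof);
    scope (exit) GC.removeRange(slots.ptr);

    foreach (slot; slots[0 .. slots_idx])
        if (slot !is null)
            slot(args);
}
```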

@ghost

ghost commented Oct 24, 2013

Since a new std.signals is going to be put up for review sometime soon, perhaps we should avoid adding these tricky changes?

@DmitryOlshansky
Member

Going to close in ~ a week on the grounds of new std.signals being ready to go (pending results of voting). No point in half-fixing a broken module that has a replacement.

@JakobOvrum
Contributor

Going to close in ~ a week on the grounds of new std.signals being ready to go (pending results of voting).

The new std.signals was rejected a few days ago.

@DmitryOlshansky
Member

@JakobOvrum
Seems like I missed the final vote. I see that it's going to get another round sometime later. I still see little incentive in merging a half-fix.

@JakobOvrum
Contributor

I see that it's going to get another round sometime later. I still see little incentive in merging a half-fix.

Aye, I don't know/care enough about std.signals to evaluate the fix; I just wanted you to know about the results :)

@ghost

ghost commented Feb 15, 2014

@MartinNowak @Safety0ff @yebblies any thoughts on this?

@MartinNowak
Member

What exactly is the problem with finalizing objects during emit?
The object you're currently calling is already pinned by the call itself, and the loop skips slots that are null.

@MartinNowak
Member

If your bug report is about the race condition of emit vs. finalize/unhook, then pinning all objects through GC.addRange is an inappropriate fix.
You should rather synchronize access to slots using a mutex.
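The mutex-based alternative suggested here might look roughly like this (a sketch with simplified, hypothetical signatures, not the real std.signals mixin):

```d
// Sketch: every access to the slots array, including the unhook called
// from a receiver's dispose event, goes through one lock.
import core.sync.mutex : Mutex;

class Signal
{
    private Mutex lock;
    private void delegate()[] slots;

    this() { lock = new Mutex; }

    void connect(void delegate() dg)
    {
        lock.lock(); scope (exit) lock.unlock();
        slots ~= dg;
    }

    void unhook(void delegate() dg)   // invoked from the dispose event
    {
        lock.lock(); scope (exit) lock.unlock();
        foreach (ref s; slots)
            if (s is dg) s = null;    // clear, don't shrink, mid-emit
    }

    void emit()
    {
        lock.lock(); scope (exit) lock.unlock();
        foreach (s; slots)
            if (s !is null) s();
    }
}
```

The catch, raised later in this thread: if a collection triggered from inside emit runs a finalizer that calls unhook on the same thread, a non-recursive mutex self-deadlocks.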

@denis-sh
Contributor Author

The problem is the non-working weak-ref implementation. My proposal is to add a slow but working weak-ref in this pull. Synchronizing access to the data is another issue that I'm not going to deal with.

Pinning all objects through GC.addRange is an inappropriate fix.

Why? We need the objects pinned and a GC fence anyway; I don't see any better way to do it.

@MartinNowak
Member

Well, how about naming the exact issue first, then? What I can read in the bug report and this pull request is very vague.
Here is what I think you're trying to solve:
While iterating the slots in emit, one of the connected objects is finalized. It calls unhook from its dispose event. This causes trouble because access to slots isn't synchronized.

@denis-sh
Contributor Author

As written in the description, this is:

A think-less fix just to prevent nasty random failures.

The entire code is so incorrect that trying to discover "exact issues" is harder than to rewrite it completely. slots_idx and the slots elements may be modified from another thread at any time. The connect and disconnect functions are fast enough not to cause failures on my machine in the general use case, but emit may be very slow and thus triggers the race condition. So I propose to disallow object finalization during emit execution.

The better solution is to add a big red "it doesn't work!" message in the docs and deprecate the module, as it can't really be used anyway.

@MartinNowak
Member

The entire code is so incorrect that trying to discover "exact issues" is harder than to rewrite it completely.

This is how cargo cult code is born. Please, if there is a race condition and you want a simple fix, wrap a mutex around it.

@denis-sh
Contributor Author

The entire code is so incorrect that trying to discover "exact issues" is harder than to rewrite it completely.

This is how cargo cult code is born. Please, if there is a race condition and you want a simple fix, wrap a mutex around it.

I'm not quite sure I understand what you meant. I'll answer in case you meant that my position is silly:
Yes, let's just add mutexes and pause the GC thread sine die. I mean, don't think I didn't investigate in this direction. What we need here is a weak reference array (given that there is no fast runtime weakref support) or just a runtime weakref.

At first sight, adding a mutex in connect/disconnect/unhook/~this, together with my original changes in emit, will do the job under the (currently correct) assumption that C memory allocation is fast and can't trigger a GC collection. But I really don't want to be responsible for such a by-hand weakref implementation, so I'll just close the pull. Sorry for wasting your time.

@denis-sh denis-sh closed this Feb 19, 2014
@MartinNowak
Member

Even if you had weak ref support, if the slots are meant to be accessed by multiple threads they need to be protected by a mutex. But a mutex should be sufficient, no need for the GC addRange stuff. Let's just fix this.

@MartinNowak MartinNowak reopened this Feb 20, 2014
@denis-sh
Contributor Author

Even if you had weak ref support, if the slots are meant to be accessed by multiple threads they need to be protected by a mutex.

I don't see any "accessed by multiple threads" in this module except what weak ref covers.

But a mutex should be sufficient, no need for the GC addRange stuff. Let's just fix this.

So good luck with that. I really have no idea how to implement what you propose. As one can see, even my carefully written weak reference array implementation uses the GC.addRange/GC.removeRange stuff. It will be really interesting to see how you propose to do it, as I have spent a lot of time on it and didn't discover any better way without runtime changes.

@denis-sh denis-sh closed this Feb 21, 2014
@MartinNowak
Member

I see what you're saying. I forgot that finalizers are called when all threads are running again (after the GC mark suspension).
That makes the original code in this pull request incorrect, because GC.addRange could be called after an object referenced in the slots has already been marked as dead.

On adding weak reference support to druntime, I'm all for it, but we need to find good primitives that are portable to different GC implementations. All of this has been discussed in detail before.

@denis-sh
Contributor Author

I forgot that finalizers are called when all threads are running again (after the GC mark suspension).
That makes the original code in this pull request incorrect, because GC.addRange could be called after an object referenced in the slots has already been marked as dead.

No. The threads are running, but the GC mutex is still locked until all finalizers have been called, so GC.addRange works as a GC fence.

@MartinNowak
Member

Ah right, sorry for that. I should try to spend more time on reviews.
