Skip to content
This repository has been archived by the owner on Oct 12, 2022. It is now read-only.

Add a GC.callLocked() function to execute a delegate with the GC lock held. #213

Closed
wants to merge 1 commit into from
Closed

Conversation

alexrp
Copy link
Member

@alexrp alexrp commented May 12, 2012

No description provided.

@alexrp
Copy link
Member Author

alexrp commented May 16, 2012

BTW: I know there's a GSoC project working towards getting rid of the GC lock. Even then, I think having this in place adds value because whether the GC actually has a lock is an implementation detail. A GC without a lock can simply call the delegate straight away.

@alexrp
Copy link
Member Author

alexrp commented May 23, 2012

Ping?

@alexrp
Copy link
Member Author

alexrp commented May 29, 2012

Can we at least merge this for 2.060? It's a ridiculously simple change...

@MartinNowak
Copy link
Member

It's not ridiculously simple.
It adds a long-term maintenance responsibility,
even worse it breaks encapsulation and finally

No description given.

there is not even a rationale for this change.

@alexrp
Copy link
Member Author

alexrp commented May 29, 2012

It adds a long-term maintenance responsibility,

Even libgc has this functionality. It's well-established that it's necessary to implement weak references in conservative, locking GCs. The maintenance responsibility is hardly problematic with the amount of code this adds. Like I said, if the GC ever removes its lock entirely (quite unlikely, really), the delegate can just be called directly. No maintenance burden to be seen here.

even worse it breaks encapsulation and finally

It's a GC. It's low-level. Not everything can (or should) be encapsulated. But I'd actually argue that this does not break encapsulation; in fact, it does the exact opposite: It makes you able to do an operation safely regardless of whether the GC uses a lock or not, i.e. you don't rely on the implementation detail that the GC has a lock. All you do is say "OK, if the GC does have a lock, I've got myself covered".

there is not even a rationale for this change.

It is essential to implementing weak references without using dirty and oftentimes error-prone GC.addrOf() hacks. You have to be able to tell the GC "OK, I'm going to fetch this hidden reference from memory and make it live, so don't do anything while I do this operation".

@MartinNowak
Copy link
Member

Can you elaborate a little more about how your weak ref mechanism works.

Would it be possible to use gc_enable/gc_disable?

@alexrp
Copy link
Member Author

alexrp commented May 29, 2012

Can you elaborate a little more about how your weak ref mechanism works.

First, you allocate a block of memory wide enough to hold a pointer. You then mark this block as BlkAttr.NO_SCAN. Then, you take some reference, cast it to size_t, obscure it (one's complement or whatever works), and finally store it into the memory block you allocated earlier. The object the reference points to is now weakly referenced, and as soon as all strong references to it disappear, it can be collected if the GC runs out of memory.

Now, when you want to fetch the pointer again, you have to tell the GC "I don't want you to collect while I do this tiny operation". So what you do is, you acquire the GC lock, then fetch the reference from memory, restore it to the correct reference (again, just one's complement and you have the original reference), and finally release the GC lock. You need to lock this sequence because you don''t want the object to be collected in the middle of this sequence (e.g. just before you do the restore operation). By the time you've restored the pointer, the object is live again, and releasing the lock is safe.

Would it be possible to use gc_enable/gc_disable?

Consider what would happen if you disable the GC, and start to fetch the weak reference, and in the meantime, some other thread starts filling up the heap: What you end up with is quite likely an irrecoverable out of memory condition because, being disabled, the GC can't do anything. By using a lock, it is ensured that other threads that attempt to allocate while weak references are being fetched can't cause out of memory conditions.

@alexrp
Copy link
Member Author

alexrp commented May 29, 2012

Oh, also, fun fact: For weak references in particular, critical regions (#204) would be sufficient. ;) They tell the GC not to interrupt what the pointer fetch + pointer restore operation is doing, while ensuring out of memory conditions can't be caused as a side-effect, and even better, they don't involve locks!

But that said, critical regions are error-prone, much less flexible, and so on (as mentioned in #204's description). Also, it seems like that one's not being merged for 2.060, so this is a reasonable (and more general) solution to the weak reference problem in particular.

@MartinNowak
Copy link
Member

After descrambling the pointer you'd need to query the GC
if that object is still alive, wouldn't you?

even better, they don't involve locks

gc_enable/gc_disable could be made lock free too.

@alexrp
Copy link
Member Author

alexrp commented May 29, 2012

After descrambling the pointer you'd need to query the GC
if that object is still alive, wouldn't you?

Nope, that won't work. The reason is tricky: By the time you've restored the pointer, it may not necessarily point to the same memory it originally did, even if GC.addrOf(ptr) != null or even GC.addrOf(ptr) == ptr. The chances of this happening are low, but it is possible. Now, even if you restore the pointer atomically, you still have the issue where it could be pointing to memory now used for something else. To work around this issue, rt_attachDisposeEvent() has to be used on the stored object, so that when it is destroyed by the GC, a callback is fired that zeroes out the pointer stored in the weak reference object. That way, you can disambiguate.

It's a tricky sequence to get right...

gc_enable/gc_disable could be made lock free too.

Aren't they already? Anyway, they still have the problem I mentioned about OOM conditions.

@MartinNowak
Copy link
Member

I thought quite a lot about this and concluded that it would be a bad idea to add this functionality.

A GC without a lock can simply call the delegate straight away.

For GC.callLocked to be useful we'd need to specify what operations are locked out.
Your implementation synchronizes with the whole mark/sweep/finalize cycle and your weak ref implementation would already depend on that particular behavior, i.e. future loosening would inevitably break code thus those semantics are a long-term commitment.
Having to synchronize all threads over the whole GC cycle is directly opposed to what we strive for. Sharing that lock with additional computation is even worse. It could also impose high costs onto certain GC improvements, e.g. a separate finalizer thread.

I've sketched out an alternative WeakRef (https://gist.github.com/2852438) implementation. It uses addrOf to synchronize with the GC. That's not perfect either but at least it doesn't acquire new technical debt.

Besides I'm not sure what the state of rt_attachDisposeEvent is, see the recent discussion on the object monitor.
It would be cheaper and more future proof to store a pointer to a shared weak ref in the object itself and clean it
in the finalizer. That would avoid to allocate memory for every weak ptr.
http://channel9.msdn.com/Shows/Going+Deep/C9-Lectures-Stephan-T-Lavavej-Advanced-STL-1-of-n

@alexrp alexrp closed this Jun 10, 2012
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants