Don't register variables on the stack as global #7

Merged
merged 1 commit into from Nov 17, 2012

2 participants

@dbussink

By definition, VALUE pointers on the stack can never be a global variable.
The location goes out of scope when the function is executed and
can point at random data.

On the other hand, the error class is used as global so mark it as such.

@dbussink dbussink Don't register variables on the stack as global
By definition, VALUE pointers on the stack can never be a global variable.
The location goes out of scope when the function is executed and
can point at random data.

On the other hand, the error class is used as global so mark it as such.
159b6e0
@dbussink dbussink referenced this pull request in rubinius/rubinius Nov 11, 2012
Closed

Syntax error: h = {}; h[:a] = b = sleep 1 #2004

@dbussink

With this change I could run the sequel specs successfully with sequel_pg installed.

@jeremyevans
Owner

Thanks for looking into this and fixing the issue. I'll merge and test this later this week. I'm assuming this isn't an issue on MRI because it doesn't actually use the pointer (just checks in when GCing), while rbx actually uses the pointer for something?

@dbussink

Rubinius has a precise GC. So it needs to know about every object. Therefore it also needs to keep track of any handles to objects people use in the C api. That means that you should always point at valid memory, because Rubinius expect you to point at valid variables. During GC we use addresses marked with rb_global_variable(), because we need to make sure they are scanned and kept alive. It also makes sure that if we move objects, we update the data structure that the VALUE* points to so next time you see a valid object.

Since MRI doesn't have a precise GC, it probably uses some heuristic to determine whether a pointer actually points at gc'able memory. I guess that is what allows you to get away with it here, since that heuristic probably ends up saying that this is probably not a valid pointer anymore later on in the process.

Even then though, it's still a time bomb, since at some point perhaps the heuristic could be valid and it could gc some random other object.

@jeremyevans
Owner

That wasn't my understanding of rb_global_variable. My understanding is that rb_global_variable ensures that the VALUE pointed to will not get garbage collected (it is an alias of rb_gc_register_address). So the worst case scenario with the present code in MRI is that it will refuse to GC some GC-able object.

@dbussink

Well, what MRI does (and also Rubinius), is store the address in a global list. What then happens is during a GC cycle is that the objects stored in these addresses get marked. This makes sure they aren't garbage collected.

The problem here is when you add a memory address on the stack to this list and don't remove it when the function exits. This means that when the GC runs, it will dereference this address located on the stack. But this address isn't in scope anymore or perhaps used in a totally different stack frame, so dereferencing it can cause all kinds of problems.

Just checked the exact code in MRI and this is exactly how it behaves. What saves you in MRI is probably that rb_gc_mark_maybe checks whether the pointer looks like something allocated on the heap. In this case that check would most likely fail since it's a stack allocated address.

@jeremyevans
Owner

That makes sense. Thanks for explaining the situation to me. :)

@jeremyevans jeremyevans merged commit 159b6e0 into jeremyevans:master Nov 17, 2012
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment