thread-safety in MRI too? #15

jrochkind · 2013-07-25T03:18:54Z

It is a misconception that:

    # Because MRI never runs code in parallel, the existing
    # non-thread-safe structures should usually work fine.
    Array = ::Array
    Hash  = ::Hash

(https://github.com/headius/thread_safe/blob/master/lib/thread_safe.rb#L32)

It looks like that code is saying that on MRI the thread_safe gem provides nothing: it just sets ThreadSafe::Array to the ordinary Array and ThreadSafe::Hash to the ordinary Hash.

But this is a misconception. While it's true that MRI can't run code on more than one core simultaneously, MRI can still context switch in the middle of any operation, switching one thread out for another. This means you still need to use synchronization, exactly the same way you do in a Ruby without the GIL. (The Ruby stdlib doesn't provide Mutex, Monitor and other synchronization primitives for no reason!) See: http://www.rubyinside.com/does-the-gil-make-your-ruby-code-thread-safe-6051.html

I could really use a simple threadsafe Array and Hash in MRI too. But I think I'm not getting it here?

I wonder if the rbx variation would just work in MRI too; it looks like it may only use stdlib that's available on MRI as well?


thedarkone · 2013-07-25T16:50:54Z

> While it's true that MRI can't run code on more than one core simultaneously, MRI can still context switch in the middle of any operation, switching one thread out for another.

MRI doesn't release GIL while in C code, all of the Array and Hash methods are pure C code on MRI, therefore what you describe can't actually happen.
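To make the distinction concrete, here is a minimal sketch (not taken from the gem; the variable names are made up for illustration). A single call such as `Array#<<` is one C-implemented method on MRI, which is the case described above. A compound update such as `counter += 1` is a read followed by a write, so a thread switch can in principle fall between the two steps, and it still needs a lock on any Ruby implementation.

```ruby
# Illustration only: `list`, `counter` and `lock` are hypothetical names.
list    = []
counter = 0
lock    = Mutex.new

threads = 4.times.map do
  Thread.new do
    1_000.times do |i|
      # One C-implemented method call; on MRI the GIL is held for its duration.
      list << i

      # Read-modify-write spans several steps, so it is guarded explicitly.
      lock.synchronize { counter += 1 }
    end
  end
end
threads.each(&:join)

puts "counter: #{counter}"
```

The same reasoning applies to idioms like `hash[key] ||= value`, which is a lookup followed by an assignment rather than a single method call.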

Also, if you are looking for a thread safe Hash, I'm certain you will be better served by ThreadSafe::Cache.

sferik · 2013-10-01T11:40:44Z

> Also, if you are looking for a thread safe Hash, I'm certain you will be better served by ThreadSafe::Cache.

@thedarkone Can you say more about this? What are the benefits of using ThreadSafe::Cache over ThreadSafe::Hash?

thedarkone · 2013-10-01T13:24:48Z

@sferik TS::Hash is a fully synchronized Hash wrapper on concurrent VMs (JRuby and Rbx), see here. This means every method call involves locking, which doesn't scale terribly well under concurrent load.

TS::Cache is fully concurrent (meaning a large number of threads can modify it at the same time): all read operations are completely lock-free and non-blocking, while update/delete/insert are also lock-free or "fine-grain locked" in the worst-case scenario. It also attempts to learn from a decade of java.util.concurrent.ConcurrentHashMap experience and provides atomic "check-then-act" methods from the get-go (compute_if_absent, compute_if_present, merge_pair, replace_if_exists, etc.).
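As a rough illustration of what an atomic "check-then-act" method buys you, here is a deliberately naive sketch. `TinyCache` is a hypothetical stand-in that uses one coarse Mutex; it is not how ThreadSafe::Cache is implemented (the description above says its reads are lock-free), and it only mirrors the name `compute_if_absent` to show the semantics.

```ruby
# Hypothetical stand-in, NOT the thread_safe implementation.
class TinyCache
  def initialize
    @data = {}
    @lock = Mutex.new
  end

  def [](key)
    @lock.synchronize { @data[key] }
  end

  # The check ("is the key present?") and the act ("store the computed
  # value") happen under one lock, so the block runs at most once per key.
  def compute_if_absent(key)
    @lock.synchronize do
      @data.fetch(key) { @data[key] = yield }
    end
  end
end

cache = TinyCache.new
calls = 0

threads = 8.times.map do
  Thread.new do
    # `calls += 1` is safe here because the block runs while the lock is held.
    cache.compute_if_absent(:answer) { calls += 1; 42 }
  end
end
values = threads.map(&:value)

puts "block ran #{calls} time(s)"
```

Because a plain Mutex is not reentrant, calling `compute_if_absent` on the same `TinyCache` from inside the block would raise a ThreadError in this sketch.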

jrochkind · 2013-10-01T13:38:00Z

Huh, what, if any, are the situations where you'd want to use a TS::Hash instead of a TS::Cache? It sounds from that description like TS::Cache is just plain better.

This would be good stuff for the README; from the README people might assume TS::Hash is the way to go, and not even know about TS::Cache.

sferik · 2013-10-01T15:01:50Z

@jrochkind 👍 I was just about to suggest the same thing. I naïvely used ThreadSafe::Hash in my project. I’m about to switch to using ThreadSafe::Cache.

@thedarkone One question: the following change is raising ThreadError: deadlock; recursive locking

 def define_memoize_method(method)
   method_name = method.name.to_sym
   undef_method(method_name)
   define_method(method_name) do
-    memory.fetch(method_name) do
-      value = method.bind(self).call
-      store_memory(method_name, value)
+    memory.compute_if_absent(method_name) do
+      method.bind(self).call
     end
   end
 end

-def store_memory(name, value)
-  memory[name] = value
-end

 def memory
   @memory ||= ThreadSafe::Cache.new
 end

Is that a bug or am I doing something wrong?

headius · 2013-10-01T17:08:13Z

Here's my question... if you would never really want ThreadSafe::Hash instead of ThreadSafe::Cache, why don't they use the same impl?

Also note some Ruby 2.1 features I'm pushing for that would make a dumb synchronized Hash (and Array) a bit less necessary:

http://bugs.ruby-lang.org/issues/8556

https://bugs.ruby-lang.org/issues/8961

sferik · 2013-10-01T17:24:30Z

@headius That’s a good question.

Is there anything we can do to help these features make it into Ruby 2.1? In the long term, that’s more desirable than maintaining this library forever.

headius · 2013-10-01T18:22:17Z

Something this big in 2.1 might be tough, even if matz wasn't already opposed to adding more collection classes. I've tried to find something that matz will like, but have been unsuccessful.

My issue #8556 attempts to add a standard mechanism for creating synchronization-wrapped objects, which would make ThreadSafe::Hash mostly unnecessary (though probably faster until we do a fast SynchronizedDelegate in JRuby), so that's a step forward. And 8961 will provide a way for users to start building data structures that have synchronization built in from the start in Java style.
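For readers unfamiliar with the idea of a synchronization-wrapped object, a minimal sketch could look like the following. `SynchronizedProxy` is a hypothetical name invented for this example; it is not the API proposed in the linked issues and not what ThreadSafe::Hash does internally. It simply forwards every call to the wrapped object while holding a Monitor.

```ruby
require 'monitor'

# Hypothetical wrapper for illustration; not the API proposed in #8556.
class SynchronizedProxy < BasicObject
  def initialize(target)
    @target  = target
    @monitor = ::Monitor.new
  end

  # Every forwarded call runs under the same reentrant lock.
  # Keyword arguments are not forwarded specially in this sketch.
  def method_missing(name, *args, &block)
    @monitor.synchronize { @target.__send__(name, *args, &block) }
  end
end

shared = SynchronizedProxy.new({})

threads = 8.times.map do |i|
  Thread.new { 100.times { |j| shared[[i, j]] = true } }
end
threads.each(&:join)

puts "entries: #{shared.size}"
```

Wrapping each call individually makes single operations safe, but a sequence of calls (check a key, then write it) is still not atomic as a whole.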

thedarkone · 2013-10-01T19:29:56Z

@sferik unfortunately, check-then-act methods cannot be recursive (i.e. they cannot modify the same TS::Cache instance inside the "atomic" block), so in your case a memoized method cannot depend on other memoized methods. I might fix this in the future (a thread will be able to do recursive "atomic" calls), but you will still be risking a deadlock unless you are careful about the "lock ordering" of your calls (google lock ordering if you want to know more). As a side note: TS::Cache#fetch is not atomic and can do whatever it wants.
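The error message in the report is the one Ruby's stdlib Mutex raises when a thread tries to take a lock it already holds. The sketch below reproduces only that stdlib behaviour and contrasts it with the reentrant Monitor; it makes no claim about the locking TS::Cache uses internally.

```ruby
require 'monitor'

# A plain Mutex is not reentrant: re-locking it from the same thread raises.
mutex = Mutex.new
error = begin
  mutex.synchronize { mutex.synchronize { :never_reached } }
rescue ThreadError => e
  e
end
puts "Mutex:   #{error.class}: #{error.message}"

# A Monitor may be re-entered by the thread that already holds it.
monitor = Monitor.new
nested  = monitor.synchronize { monitor.synchronize { :ok } }
puts "Monitor: #{nested.inspect}"
```

Reentrancy removes the self-deadlock, but it does not help when two threads take two different locks in opposite order, which is the lock-ordering hazard mentioned above.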

In your use case you might not care about computing some values multiple times, so you can just use memory[method_name] ||= method.bind(self).call. Otherwise you'll have to resort to the double-checked locking idiom, which would look something like this or this.
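Since the linked examples are not reproduced here, the following is a minimal sketch of double-checked locking for memoization, under stated assumptions. `MemoStore` is a hypothetical class, not the code from the links. It stores values in a plain Hash, which is only reasonable for the unlocked fast path on MRI; on JRuby or Rubinius the storage would need to be a concurrent map such as ThreadSafe::Cache. A Monitor is used instead of a Mutex so that one memoized computation may trigger another on the same thread.

```ruby
require 'monitor'

# Hypothetical memo store for illustration only.
class MemoStore
  def initialize
    @values  = {}          # assumption: plain Hash, adequate on MRI only
    @monitor = Monitor.new # reentrant, so nested fetches do not raise
  end

  def fetch(name)
    # First check, without the lock (fast path once the value exists).
    @values.fetch(name) do
      @monitor.synchronize do
        # Second check, under the lock: another thread may have stored
        # the value while this one was waiting.
        @values.fetch(name) { @values[name] = yield }
      end
    end
  end
end

store = MemoStore.new
calls = 0

threads = 8.times.map do
  Thread.new { store.fetch(:slow) { calls += 1; sleep 0.01; :result } }
end
results = threads.map(&:value)

# A memoized value that depends on another memoized value.
nested = store.fetch(:outer) { store.fetch(:inner) { 1 } + 1 }

puts "computed #{calls} time(s), nested = #{nested}"
```

The trade-off is that all computations share one lock, so unrelated slow values are computed one at a time.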

@headius @jrochkind Ruby hashes are now insertion ordered, and this has some serious performance implications for a concurrent data structure (I even dislike this in plain old Ruby hashes). That is the main reason for TS::Cache.kind_of?(Hash) # => false

@headius as for the Ruby concurrency enhancements something like this: https://bugs.ruby-lang.org/issues/8259 beats pretty much everything 😋. BTW: I'm still waiting for JRuby to get fast enough for @foo ||= Foo.new to become dangerous (due to JVM compiler reordering tricks mainly, since x86 memory model has our backs otherwise).

headius · 2013-10-01T21:16:11Z

@thedarkone Thank you for explaining the insertion ordering issue. That's a very good point.

As for https://bugs.ruby-lang.org/issues/8259, I've commented there to wake it up and optimistically marked it for 2.1. I doubt it will get in, though, since it's pretty late in the game. If there were a patch however (hint, hint), it would have a MUCH better chance.

sferik · 2013-10-02T15:41:46Z

> I might fix this in the future (a thread will be able to do recursive "atomic" calls)…

@thedarkone: Is this something I should open an issue about? I think it would be a very nice feature.

thedarkone · 2013-10-14T08:52:45Z

> Is this something I should open an issue about? I think it would be a very nice feature.

Opened #30 (won't be able to work on this right now though..).

bf4 · 2013-11-20T20:41:41Z

Sorta related question, why does this library use autoload which is known not to be thread-safe? (Or so I understand)

In 2008 you (@headius) wrote in http://bugs.ruby-lang.org/issues/show/921 and https://www.ruby-forum.com/topic/172385

Currently autoload is not safe to use in a multi-threaded application. To put it more bluntly, it's broken.

The current logic for autoload is as follows:

1. A special object is inserted into the target constant table, used as a marker for autoloading

2. When that constant is looked up, the marker is found and triggers autoloading

3. The marker is first removed, so the constant now appears to be undefined if retrieved concurrently

4. The associated autoload resource is required, and presumably redefines the constant in question

5. The constant lookup, upon completion of autoload, looks up the constant again and either returns its new value or proceeds with normal constant resolution

The problem arises when two or more threads try to access the constant. Because autoload is stateful and unsynchronized, the second thread may encounter the constant table in any number of states:

1. It may see the autoload has not yet fired, if the first thread has encountered the marker but not yet removed it. It would then proceed along the same autoload path, requiring the same file a second time.

2. It may not find an autoload marker, and assume the constant does not exist.

3. It may see the eventual constant the autoload was intended to define.

Of these combinations, (3) is obviously the desired behavior. (1) can only happen on native-threaded implementations that do not have a global interpreter lock, since it requires concurrency during autoload's internal logic. (2) can happen on any implementation, since while the required file is processing the original autoload constant appears to be undefined.
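For readers who have not used autoload, here is a small self-contained sketch of the mechanism being discussed. `Demo` and `LazyThing` are hypothetical names, and the file is written to a temporary directory purely so the example runs on its own. It registers an autoload, has several threads reference the constant at once, and records what each thread saw. What happens under contention depends on the Ruby implementation and version, so no particular outcome is claimed for older VMs.

```ruby
require 'tmpdir'

# Demo and LazyThing are hypothetical names used only for this sketch.
Demo = Module.new

dir  = Dir.mktmpdir
path = File.join(dir, 'lazy_thing.rb')
File.write(path, "module Demo\n  class LazyThing; end\nend\n")

# Register the marker: nothing is loaded yet.
Demo.autoload(:LazyThing, path)
pending_before = Demo.autoload?(:LazyThing) # the registered path

# Several threads reference the constant concurrently; the first reference
# triggers the require of the file written above.
seen = 4.times.map { Thread.new { Demo::LazyThing } }.map(&:value)

pending_after = Demo.autoload?(:LazyThing)  # nil once the constant is loaded

puts "distinct results seen by threads: #{seen.uniq.size}"
```

Case (2) in the quoted description would show up here as a NameError raised in one of the threads while another thread is still requiring the file.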

And in 2011, @matz said https://www.ruby-forum.com/topic/3036681

'autoload will be dead, I strongly discourage the use of autoload in any standard libraries'

H/T for the quote from my blog post and the ruby rogues discussion around it :)

thedarkone · 2013-11-20T20:54:33Z

@bf4 all the autoload code in thread_safe was introduced by me. I thought it was thread safe in 2012 Ruby VMs, turns out it wasn't in Rubinius, but that has been fixed by @dbussink.

bf4 · 2013-11-20T22:04:57Z

@thedarkone So, the internet is wrong inasmuch as autoload is thread-safe in

cruby >= 2.0, diff
rbx some patch level
jruby some patch level

And you'd advise using it in all libraries that support 1.9.3 or higher? Do we know specifics? Is there a source for this? I think a lot of people don't know.

Update: I did find a few sources to support, but specifics still not clear, and it's kind of important, no?

aws confirms autoload is now safe in Ruby 2.0 by @lsiegel
jruby issue where autoload made thread-safe
Discussion 1, 2, 3


headius · 2014-02-26T17:47:21Z

I think this issue has been discussed enough. If there are things we need to fix please file separate issues.

Summary:

* We treat MRI Hash and Array as thread-safe since they are implemented in C and do not allow concurrent execution.
* autoload is safe if it's the only auto-loading mechanism used (e.g. don't check defined?(SomeConstant) because it will be defined before it is fully booted). However, we need to examine our use of autoload to see if it's actually saving us anything and to see if it is actually safe usage.

bf4 · 2014-02-26T17:57:56Z

@headius Thanks for following up. I think there's still a great need to better understand when/how to use autoload when building libraries.

fukayatsu pushed a commit to fukayatsu/twitter that referenced this issue Dec 8, 2013

Replace ThreadSafe::Hash with ThreadSafe::Cache

6e0a09c

See headius/thread_safe#15 (comment)

dkubb mentioned this issue Dec 15, 2013

Change memoized method to not accept a block dkubb/memoizable#8

Merged

dkubb added a commit to dkubb/memoizable that referenced this issue Dec 15, 2013

Add double checked locking to Memoizable::Memory#fetch

7dd129b

* This is based on the comments in headius/thread_safe#15 (comment) on how to handle this exact issue with memoization.

dkubb mentioned this issue Dec 15, 2013

Add double checked locking to Memoizable::Memory#fetch dkubb/memoizable#9

Merged

dkubb added a commit to dkubb/memoizable that referenced this issue Dec 15, 2013

Add double checked locking to Memoizable::Memory#fetch

2cd167d

* This is based on the comments in headius/thread_safe#15 (comment) on how to handle this exact issue with memoization.

headius closed this as completed Feb 26, 2014

headius added this to the 0.1.3 and earlier milestone Feb 26, 2014

bf4 mentioned this issue Oct 21, 2014

Memory Use exploded in 2.6 mikel/mail#812

Closed

bf4 mentioned this issue Jun 16, 2015

Encapsulate serialization in ActiveModel::SerializableResource rails-api/active_model_serializers#954

Merged

thedarkone mentioned this issue Aug 12, 2015

Import thread_safe gem ruby-concurrency/concurrent-ruby#386

Merged

pitr-ch mentioned this issue Aug 12, 2015

Do we embrace Kernel#autoload ? ruby-concurrency/concurrent-ruby#395

Closed

bf4 mentioned this issue Sep 17, 2015

add require statements to top of file rails-api/active_model_serializers#1171

Merged

pmorton mentioned this issue Apr 8, 2016

Autoloading and Rails recurly/recurly-client-ruby#235

Closed

pmorton mentioned this issue Apr 26, 2016

Autoloading and Rails jmespath/jmespath.rb#26

Closed

bf4 mentioned this issue Apr 30, 2017

WIP: Capybara Integration with Rails (AKA System Tests) rails/rails#26703

Merged


bf4 mentioned this issue Mar 28, 2018

Possible race condition in the Maildown::MarkdownEngine.default_html_block method zombocom/maildown#40

Closed
