Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thread-unsafe "autoload" code can cause constants to be missing #121

Open
headius opened this issue Apr 27, 2015 · 6 comments
Open

Thread-unsafe "autoload" code can cause constants to be missing #121

headius opened this issue Apr 27, 2015 · 6 comments
Assignees

Comments

@headius
Copy link

headius commented Apr 27, 2015

This may or may not be serious, depending on how thread-safe you consider RSpec itself to be, but it did lead to some red herring bugs in JRuby and Rubinius (jruby/jruby#2874 and rubinius/rubinius#2934).

This code to "autoload" a few namespaces under RSpec is not thread-safe:

  MODULES_TO_AUTOLOAD = {
    :Matchers     => "rspec/expectations",
    :Expectations => "rspec/expectations",
    :Mocks        => "rspec/mocks"
  }

  def self.const_missing(name)
    # Load rspec-expectations when RSpec::Matchers is referenced. This allows
    # people to define custom matchers (using `RSpec::Matchers.define`) before
    # rspec-core has loaded rspec-expectations (since it delays the loading of
    # it to allow users to configure a different assertion/expectation
    # framework). `autoload` can't be used since it works with ruby's built-in
    # require (e.g. for files that are available relative to a load path dir),
    # but not with rubygems' extended require.
    #
    # As of rspec 2.14.1, we no longer require `rspec/mocks` and
    # `rspec/expectations` when `rspec` is required, so we want
    # to make them available as an autoload. For more info, see:
    require MODULES_TO_AUTOLOAD.fetch(name) { return super }
    ::RSpec.const_get(name)
  end

This logic is essentially the same problem as using defined?(SomeConst) to determine whether a library has been loaded. Here's a possible sequence leading to the error above.

  • Thread A begins accessing RSpec::Matchers
  • A can't find the constant, so the const_missing triggers
  • A starts loading matchers.rb, and right away Matchers is defined (but still empty)
  • Thread B begins accessing RSpec::Matchers
  • B sees that the constant is there and proceeds to look up BuiltIn, skipping the const_missing altogether
  • B has an incomplete view of Matchers since it is still being loaded, and as a result BuiltIn (or any number of other constants and methods) is missing.

Synchronizing const_missing does not help, because once the Matchers module is defined we won't even call it. Autoload is not at fault here obviously, but neither is require logic, since no amount of locking can prevent this code from possibly seeing a partially-initialized Matchers module. RSpec needs to change how it manages these constants.

And finally, a detail I should have checked immediately...this happens in MRI too:

[] ~/projects/jruby-1.7 $ rvm ruby-2.2 do ruby blah.rb
blah.rb:11:in `block (2 levels) in <main>': uninitialized constant RSpec::Matchers::BuiltIn (NameError)
@myronmarston
Copy link
Member

Thanks, @headius. Your description of a possible sequence that triggers this is very helpful.

RSpec needs to change how it manages these constants.

I'm all ears if you've got a suggestion that satisfies the user experience constraints we're operating under. We're specifically dealing with two constraints that are practically polar opposites:

  • RSpec is highly configurable and modular. Users should be able to configure RSpec to use a different mocking library (e.g. mocha, rr, flexmock or bogus) or a different assertion/expectation library (e.g. Minitest::Assertions, Test::Unit::Assertions, Wrong, etc). When they configure an alternate, we want to avoid loading the RSpec library that they are choosing not to use.
  • The default stack (rspec-core/expectations/mocks) should "just work" with no configuration necessary. That means, for example, that a user should be able to define a custom matcher using RSpec::Matchers.define in spec/spec_helper.rb (or some other file loaded early) without forcing them to require rspec/expectations themselves.

Up to now we've satisfied these two constraints with a two-pronged approach:

  • We wait to require 'rspec/expectations' and require 'rspec/mocks' until the user is defining their first example group. (We used to wait even longer until just before running the examples, but a bug in ruby 1.9 forced us to include the mixins from those libraries before the user has a chance to including any modules). Waiting to load expectations and mocks is essential so the user has a chance to configure RSpec to use a different library if they wish.
  • We use the const_missing/const_get piece you're reporting here so that users who are using the default stack can do RSpec::Matchers.define w/o worrying about the fact that we haven't yet loaded rspec-expectations.

@headius, do you have a suggestion for a different way we could manage these constants while still satisfying these constraints?

I will say that I don't believe this is the source of the Rubinius::ToolSet::Runtime::CompileError I originally reported. That error is happening part way through our test suite run long after the constants defined by this bit of code have been loaded. Also, I think it's extremely unlikely for RSpec users to hit this bug, since it's a pretty limited situation in which users would hit this--it would only happen if they are spinning up multiple threads that are accessing RSpec constants during the loading/configuration phase of the RSpec run. Once the first example group has been defined by the user, we require the libs and it's no longer a problem -- they can access RSpec constants in multi-threaded code all they want at that point.

Given that no RSpec user has ever reported a problem with this, if we can't come up with an alternate solution, my instinct is to leave it in place in spite of the fact that it's not threadsafe. Although, we may want to document it as unsupported (e.g. "Accessing RSpec's constants from multiple threads during the loading/setup phase is not supported.")

@headius
Copy link
Author

headius commented Apr 27, 2015

Also, I think it's extremely unlikely for RSpec users to hit this bug, since it's a pretty limited situation in which users would hit this

Yes, I agree. And this is even less likely given that this is a test framework, and people probably don't access RSpec::Matchers in production code.

If you wanted to make this safe, you would want to use a pattern that lazily defines the Matchers constant so that once it is visible, it is also ready to be used. That pattern would look something like this:

module RSpec
  matchers = Module.new do
    ...body of Matchers
  end
  Matchers = matchers
end

This is obviously not typical Ruby code but your const_missing autoload is pretty atypical too.

Unfortunately this is a fundamental problem with how Ruby assigns the constant for an in-progress class definition. I have proposed that the class's name constant not be assigned until the class has closed, but people generally oppose the idea.

I also agree this may not be something that needs to be fixed, but it came up as a possible bug reproduction, crossed my desk, and I found this.

@myronmarston
Copy link
Member

If you wanted to make this safe, you would want to use a pattern that lazily defines the Matchers constant so that once it is visible, it is also ready to be used. That pattern would look something like this:

Clever :). With that in case, would we also need a mutex to guard the logic in our const_missing definition?

This is obviously not typical Ruby code but your const_missing autoload is pretty atypical too.

Yeah, I'm realizing that it could get messy since we re-open the Matchers module in some files and define some additional nested classes and modules. We'd have to somehow give those files access to the anonymous module, which could get really messy really quickly.

For now I think it's unlikely we'll do anything but I'll keep thinking about it.

@cupakromer
Copy link
Member

We'd have to somehow give those files access to the anonymous module, which could get really messy really quickly.

What about using module_exec (and friends)? I use this pattern a bit with Rails projects b/c of Rails' autoloading issues.

@myronmarston
Copy link
Member

What about using module_exec (and friends)? I use this pattern a bit with Rails projects b/c of Rails' autoloading issues.

module_exec could be used here but doesn't solve the issue I was thinking of (and perhaps didn't explain well...) and has an issue of its own.

The issue I was thinking of was this:

# lib/rspec/matchers.rb

module RSpec
  matchers = Module.new do
    # the existing matchers code would go here
  end

  require 'rspec/matchers/dsl'

  Matchers = matchers
end
# in lib/rspec/matchers/dsl.rb

# we want to do this...but we don't have a reference to the matchers module!
anonymous_matchers_module.module_exec do
  module DSL
  end
end

The problem here is that we want to wait to assign the matchers module to a constant until the definition is complete. However, in other files we define additional things in the matchers module, but we don't have a reference to the matchers module to re-open it since the matchers module is only available via a local variable accessible in the other file at that point. We can, of course do something messy like store the anonymous matchers module in a global variable or thread local or something else...but that's really messy.

The other issue is that constant definition works differently in module_exec than it does in a normal module definition. Notice what happens in this IRB session:

  rspec-core git:(example-creation-refactoring) irb
irb(main):001:0> Foo = Module.new
=> Foo
irb(main):002:0> Foo.module_exec do
irb(main):003:1*   Bar = 1
irb(main):004:1> end
=> 1
irb(main):005:0> Foo::Bar
NameError: uninitialized constant Foo::Bar
        from (irb):5
        from :0
irb(main):006:0> Bar
=> 1

We reopen the Foo module via module_exec and define a Bar constant in there, but that's defined as a top-level constant, not as Foo::Bar.

Neither of these issues is insurmountable, of course, but they would probably take fairly significant restructurings, and I'm not convinced the effort would be worth it given the unlikeliness of users hitting this issue in a test suite (as discussed above).

@headius
Copy link
Author

headius commented Jul 8, 2020

Auditing old issues that mention me...

This code still exists but could now be replaced with autoload on Ruby versions 2.6 and higher, since that now redispatches to require and will see gems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants