Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kernel#require conflict with Zeitwerk #383

Closed
fxn opened this issue Jan 1, 2022 · 4 comments · Fixed by #385
Closed

Kernel#require conflict with Zeitwerk #383

fxn opened this issue Jan 1, 2022 · 4 comments · Fixed by #385

Comments

@fxn
Copy link

fxn commented Jan 1, 2022

Opening this issue to have a conversation about the conflict and its solution.

As you know, Zeitwerk decorates Kernel#require to detect autoloaded files it manages (here).

There's an edge case where both libraries are in conflict: In order to load implicit namespaces on demand and have them behave like everything else, Zeitwerk issues an autoload call in one of the directories that implicitly define the namespace.

For example, if a Rails application has a directory

app/controllers/admin

and there is no file admin.rb defining the namespace, Zeitwerk issues

Object.autoload(:Admin, "/absolute/path/to/app/controllers/admin")

When a require call for that directory is detected, Zeitwerk hijacks it, does not invoke the original require, and does not modify $LOADED_FEATURES. It just detects the implicit namespace being autoloaded, autovivifies the Admin module, sets autoloads for that namespace, invokes callbacks, logs if enabled, etc.

Now, here's the conflict: if Bootsnap's require is the one calling Zeitwerk's (depends on load order), Bootsnap's assumes the directory is pushed to $LOADED_FEATURES. Bootsnap's loaded features cache gets out of sync, and reloads do not work because the cache does not reflect what is caching (it has an entry for the directory that never was in $LOADED_FEATURES and therefore was not removed from it).

I tested pushing the directory to $LOADED_FEATURES. That's like 4x slower in earlier Ruby versions, and in Ruby 3.1 the performance regression skyrockets to about 180x in some tests I did.

@jeremyevans commented in ruby#18452 that Bootsnap's caching logic might need to be revised.

Ideas:

  • Let the cache reflect the true contents of $LOADED_FEATURES somehow.
  • Let Bootsnap's require ignore absolute paths altogether. Zeitwerk exclusively uses absolute paths for everything.

What do you think?

@byroot
Copy link
Contributor

byroot commented Jan 1, 2022

Ack. I'm still in vacation for another week, but I'll dig into this first thing when I get back to work.

We can likely ignore absolute path entirely yes, as a matter of fact Shopify has been using config.add_autoload_paths_to_load_path = false for a couple years now, because there's no point having load path caching for Zeitwerk managed files.

I'll just have to carefully consider it, because bootsnap's LoadedFeaturesIndex is particularly tricky, but on paper it sounds logical.

@fxn
Copy link
Author

fxn commented Jan 2, 2022

Awesome @byroot!

I've written a self-contained script that shows the conflict:

require 'fileutils'
require 'tmpdir'

require 'bundler/inline'
gemfile do
  source 'https://rubygems.org'
  gem 'bootsnap', '= 1.9.3', require: false
  gem 'zeitwerk', '= 2.5.3', require: false
end

Dir.mktmpdir do |dir|
  Dir.chdir(dir)

  FileUtils.mkdir_p('lib/admin')
  $LOAD_PATH << File.expand_path('lib')

  require 'zeitwerk'
  loader = Zeitwerk::Loader.new
  loader.log!
  loader.push_dir('lib')
  loader.enable_reloading
  loader.setup

  require 'bootsnap'
  Bootsnap.setup(
    cache_dir:          dir,
    development_mode:   true,
    load_path_cache:    true,
    compile_cache_iseq: false,
    compile_cache_yaml: false
  )

  Admin          # works
  loader.reload
  Admin          # raises NameError
end

The first Admin is autoloaded, but the second one is not because there's a cache hit for /abspath/to/lib/admin here and Zeitwerk's require is not called.

If we swap the Zeitwerk and Bootsnap blocks in the script above, reloading works fine. That's the order we have when using Rails commands due to how things are defined in Rails, which explains why we did not find this incompatibility earlier.

@fxn
Copy link
Author

fxn commented Jan 2, 2022

BTW, posted that followup because I had time this morning to write that script, but please ignore everything until you get back to work. Enjoy your break man!

@fxn
Copy link
Author

fxn commented Jan 6, 2022

I have modified Rails 7 so that Zeitwerk is loaded a bit later (when the application is instantiated). This will ship with 7.0.1.

With that, require "rails" by itself does not load Zeitwerk, and bundle exec sidekiq won't load Zeitwerk before Bootsnap does its setup. That mitigates a bit the issue in practice. Even if some gem dependency loaded itself with Zeitwerk, Bootsnap's setup runs before Bundler's require, we should be good.

That mitigates the issue a bit in practice, this is going to work more like Rails 6 did, and Rails 6 did not have issues of this kind reported. However, we uncovered a legit incompatibility that is still there.

For example, if tomorrow Sidekiq itself used Zeitwerk, bundle exec sidekiq would still load Zeitwerk before Bootsnap's setup. Or, if some non-Rails project or their dependencies used both gems, we still might have to be compatible.

Your call :).

casperisfine pushed a commit that referenced this issue Jan 10, 2022
Fix: #383
Ref: fxn/zeitwerk#198

The loaded feature index goal is to prevent requiring the same "feature"
twice if the `$LOAD_PATH` was mutated. e.g:

```ruby
require "bundler" # ~/gems/bundler/lib/bundler.rb
$LOAD_PATH.unshift("some/path")
require "bundler" # some/path/bundler.rb
```

In such scenario Ruby remember that you require "bundler" already
and even though it now maps to a new path, it won't load it again.

This was causing issues with Zeitwerk if it is loaded before bootsnap (see refs),
because the cache wouldn't be cleared.

But ultimately Zeitwerk always require absolute paths, and the concern
described above simply doesn't apply to absolute paths. So we can simply
bail out early in these cases. That fixes the bug, and also means less work
and a smaller index, so win-win.
casperisfine pushed a commit that referenced this issue Jan 10, 2022
Fix: #383
Ref: fxn/zeitwerk#198

The loaded feature index goal is to prevent requiring the same "feature"
twice if the `$LOAD_PATH` was mutated. e.g:

```ruby
require "bundler" # ~/gems/bundler/lib/bundler.rb
$LOAD_PATH.unshift("some/path")
require "bundler" # some/path/bundler.rb
```

In such scenario Ruby remember that you require "bundler" already
and even though it now maps to a new path, it won't load it again.

This was causing issues with Zeitwerk if it is loaded before bootsnap (see refs),
because the cache wouldn't be cleared.

But ultimately Zeitwerk always require absolute paths, and the concern
described above simply doesn't apply to absolute paths. So we can simply
bail out early in these cases. That fixes the bug, and also means less work
and a smaller index, so win-win.
casperisfine pushed a commit that referenced this issue Jan 10, 2022
Fix: #383
Ref: fxn/zeitwerk#198

The loaded feature index goal is to prevent requiring the same "feature"
twice if the `$LOAD_PATH` was mutated. e.g:

```ruby
require "bundler" # ~/gems/bundler/lib/bundler.rb
$LOAD_PATH.unshift("some/path")
require "bundler" # some/path/bundler.rb
```

In such scenario Ruby remember that you require "bundler" already
and even though it now maps to a new path, it won't load it again.

This was causing issues with Zeitwerk if it is loaded before bootsnap (see refs),
because the cache wouldn't be cleared.

But ultimately Zeitwerk always require absolute paths, and the concern
described above simply doesn't apply to absolute paths. So we can simply
bail out early in these cases. That fixes the bug, and also means less work
and a smaller index, so win-win.
casperisfine pushed a commit that referenced this issue Jan 10, 2022
Fix: #383
Ref: fxn/zeitwerk#198

The loaded feature index goal is to prevent requiring the same "feature"
twice if the `$LOAD_PATH` was mutated. e.g:

```ruby
require "bundler" # ~/gems/bundler/lib/bundler.rb
$LOAD_PATH.unshift("some/path")
require "bundler" # some/path/bundler.rb
```

In such scenario Ruby remember that you require "bundler" already
and even though it now maps to a new path, it won't load it again.

This was causing issues with Zeitwerk if it is loaded before bootsnap (see refs),
because the cache wouldn't be cleared.

But ultimately Zeitwerk always require absolute paths, and the concern
described above simply doesn't apply to absolute paths. So we can simply
bail out early in these cases. That fixes the bug, and also means less work
and a smaller index, so win-win.
casperisfine pushed a commit that referenced this issue Jan 10, 2022
Fix: #383
Ref: fxn/zeitwerk#198

The loaded feature index goal is to prevent requiring the same "feature"
twice if the `$LOAD_PATH` was mutated. e.g:

```ruby
require "bundler" # ~/gems/bundler/lib/bundler.rb
$LOAD_PATH.unshift("some/path")
require "bundler" # some/path/bundler.rb
```

In such scenario Ruby remember that you require "bundler" already
and even though it now maps to a new path, it won't load it again.

This was causing issues with Zeitwerk if it is loaded before bootsnap (see refs),
because the cache wouldn't be cleared.

But ultimately Zeitwerk always require absolute paths, and the concern
described above simply doesn't apply to absolute paths. So we can simply
bail out early in these cases. That fixes the bug, and also means less work
and a smaller index, so win-win.
casperisfine pushed a commit that referenced this issue Jan 10, 2022
Fix: #383
Ref: fxn/zeitwerk#198

The loaded feature index goal is to prevent requiring the same "feature"
twice if the `$LOAD_PATH` was mutated. e.g:

```ruby
require "bundler" # ~/gems/bundler/lib/bundler.rb
$LOAD_PATH.unshift("some/path")
require "bundler" # some/path/bundler.rb
```

In such scenario Ruby remember that you require "bundler" already
and even though it now maps to a new path, it won't load it again.

This was causing issues with Zeitwerk if it is loaded before bootsnap (see refs),
because the cache wouldn't be cleared.

But ultimately Zeitwerk always require absolute paths, and the concern
described above simply doesn't apply to absolute paths. So we can simply
bail out early in these cases. That fixes the bug, and also means less work
and a smaller index, so win-win.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants