Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retry "Do not load file with same realpath twice when requiring" #4887

Merged
merged 2 commits into from Oct 2, 2021

Conversation

jeremyevans
Copy link
Contributor

This was previously committed and then reverted because it caused an unexpected warning in a race condition during simultaneous loading of encodings in separate threads. Use a mutex to synchronize the loading of encodings to prevent that warning.

@jeremyevans jeremyevans force-pushed the require-realpath-17885-v2 branch 3 times, most recently from 211bfc1 to a334392 Compare September 23, 2021 21:18
Instead of relying on setting an unsetting ruby_verbose, which is
not thread-safe, restructure require_internal and load_lock to
accept a warn argument for whether to warn, and add
rb_require_internal_silent to require without warnings.  Use
rb_require_internal_silent when loading encoding.

Note this does not modify ruby_debug and errinfo handling, those
remain thread-unsafe.

Also silent requires when loading transcoders.
This fixes issues with paths being loaded twice in certain cases
when symlinks are used.

It took me multiple attempts to get this working.  My original
attempt tried to convert paths to realpaths before adding them
to $LOADED_FEATURES.  Unfortunately, this doesn't work well
with the loaded feature index, which is based off load paths
and not realpaths. While I was able to get require working, I'm
fairly sure the loaded feature index was not being used as
expected, which would have significant performance implications.
Additionally, I was never able to get that approach working with
autoload when autoloading a non-realpath file. It also broke
some specs.

This takes a more conservative approach. Directly before loading the
file, if the file with the same realpath has been required, the
loading of the file is skipped. The realpaths are stored as
fstrings in a hidden hash.

When rebuilding the loaded feature index, the hash of realpaths
is also rebuilt.  I'm guessing this makes rebuilding process
slower, but I don think that is a hot path. In general, modifying
loaded features is only done when reloading, and that tends to be
in non-production environments.

Change test_require_with_loaded_features_pop test to use 30 threads
and 300 iterations, instead of 4 threads and 1000 iterations.
I saw only sporadic failures with 4/1000, but consistent failures
30/300 threads. These failures were due to the fact that the
concurrent deletions from $LOADED_FEATURES in other threads can
result in rb_ary_entry returning nil when rebuilding the loaded
features index.

To avoid concurrency issues when rebuilding the loaded features
index, the building of the index itself is left alone, and
afterwards, a separate loop is done on a copy of the loaded feature
snapshot in order to rebuild the realpaths hash.

Fixes [Bug #17885]
@jeremyevans jeremyevans changed the title Retry "Do not load file with same realpath twice when requiring" with mutex around encoding loading Retry "Do not load file with same realpath twice when requiring" Sep 23, 2021
@jeremyevans
Copy link
Contributor Author

I've updated the implementation of this to avoid the underlying race condition between encoder loading (which implicitly sets ruby_versbose to false), and transcoder loading (which doesn't). So a mutex isn't needed and the current implementation of the patch does not use a mutex.

@jeremyevans jeremyevans requested a review from nobu October 2, 2021 00:34
@@ -328,6 +337,15 @@ get_loaded_features_index(void)
features_index_add(as_str, INT2FIX(i));
}
reset_loaded_features_snapshot();

features = rb_ary_dup(vm->loaded_features_snapshot);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I just noticed today that this line is responsible for over 6GiB of allocation during boot in our app.

On 3.0.3:

Total allocated: 3.12 GB (23346856 objects)

on 3.1.0:

Total allocated: 9.55 GB (23451652 objects)


allocated memory by location
-----------------------------------
6.37 GB  bootsnap-1.10.3/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:30 (Kernel.require)
allocated memory by class
-----------------------------------
   6.72 GB  Array

I'll see if I can find a way to get rid of it, but figure I'd let you know first.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so after digging a bit more, this path should only be taken when $LOADED_FEATURES has been mutated from the ruby side, so it might be caused by Bootsnap's loaded_features_index.

I'll try to confirm this and to see if there's a way to avoid that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, actually, I don't think bootsnap mutates $LOADED_FEATURES, so I'm not sure what cause the index to be rebuilt so much. I'll have to dig more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm at a loss here, I used the following monkey patch to try to figure out what might be mutating $LOADED_FEATURES:

module LoadedObserver
  ($LOADED_FEATURES.methods - [:any?, :size, :last, :each, :each_with_object, :reverse_each, :[]]).each do |m|
    module_eval(<<~RUBY, __FILE__, __LINE__ + 1)
      def #{m}(...)
        puts "-" * 40
        p [:call, "#{m}"]
        puts caller
        super
      end
    RUBY
  end
end

$LOADED_FEATURES.singleton_class.prepend(LoadedObserver)

But I got absolutely no hits.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so I went as far as modifying Ruby, and I was wrong, this callsite isn't where the arrays are created.

However looking at the heap dumps, I can confirm that these arrays are indeed some $LOAD_PATH snapshots, I just don't get how they are created.

I suspect it might have something to do with reset_loaded_features_snapshot(), but I must confess I don't quite grasp how shared arrays work exactly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
3 participants