New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Retry "Do not load file with same realpath twice when requiring" #4887
Retry "Do not load file with same realpath twice when requiring" #4887
Conversation
211bfc1
to
a334392
Compare
Instead of relying on setting an unsetting ruby_verbose, which is not thread-safe, restructure require_internal and load_lock to accept a warn argument for whether to warn, and add rb_require_internal_silent to require without warnings. Use rb_require_internal_silent when loading encoding. Note this does not modify ruby_debug and errinfo handling, those remain thread-unsafe. Also silent requires when loading transcoders.
This fixes issues with paths being loaded twice in certain cases when symlinks are used. It took me multiple attempts to get this working. My original attempt tried to convert paths to realpaths before adding them to $LOADED_FEATURES. Unfortunately, this doesn't work well with the loaded feature index, which is based off load paths and not realpaths. While I was able to get require working, I'm fairly sure the loaded feature index was not being used as expected, which would have significant performance implications. Additionally, I was never able to get that approach working with autoload when autoloading a non-realpath file. It also broke some specs. This takes a more conservative approach. Directly before loading the file, if the file with the same realpath has been required, the loading of the file is skipped. The realpaths are stored as fstrings in a hidden hash. When rebuilding the loaded feature index, the hash of realpaths is also rebuilt. I'm guessing this makes rebuilding process slower, but I don think that is a hot path. In general, modifying loaded features is only done when reloading, and that tends to be in non-production environments. Change test_require_with_loaded_features_pop test to use 30 threads and 300 iterations, instead of 4 threads and 1000 iterations. I saw only sporadic failures with 4/1000, but consistent failures 30/300 threads. These failures were due to the fact that the concurrent deletions from $LOADED_FEATURES in other threads can result in rb_ary_entry returning nil when rebuilding the loaded features index. To avoid concurrency issues when rebuilding the loaded features index, the building of the index itself is left alone, and afterwards, a separate loop is done on a copy of the loaded feature snapshot in order to rebuild the realpaths hash. Fixes [Bug #17885]
a334392
to
972f274
Compare
I've updated the implementation of this to avoid the underlying race condition between encoder loading (which implicitly sets ruby_versbose to false), and transcoder loading (which doesn't). So a mutex isn't needed and the current implementation of the patch does not use a mutex. |
@@ -328,6 +337,15 @@ get_loaded_features_index(void) | |||
features_index_add(as_str, INT2FIX(i)); | |||
} | |||
reset_loaded_features_snapshot(); | |||
|
|||
features = rb_ary_dup(vm->loaded_features_snapshot); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I just noticed today that this line is responsible for over 6GiB of allocation during boot in our app.
On 3.0.3:
Total allocated: 3.12 GB (23346856 objects)
on 3.1.0:
Total allocated: 9.55 GB (23451652 objects)
allocated memory by location
-----------------------------------
6.37 GB bootsnap-1.10.3/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:30 (Kernel.require)
allocated memory by class
-----------------------------------
6.72 GB Array
I'll see if I can find a way to get rid of it, but figure I'd let you know first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so after digging a bit more, this path should only be taken when $LOADED_FEATURES
has been mutated from the ruby side, so it might be caused by Bootsnap's loaded_features_index
.
I'll try to confirm this and to see if there's a way to avoid that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hum, actually, I don't think bootsnap mutates $LOADED_FEATURES
, so I'm not sure what cause the index to be rebuilt so much. I'll have to dig more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm at a loss here, I used the following monkey patch to try to figure out what might be mutating $LOADED_FEATURES
:
module LoadedObserver
($LOADED_FEATURES.methods - [:any?, :size, :last, :each, :each_with_object, :reverse_each, :[]]).each do |m|
module_eval(<<~RUBY, __FILE__, __LINE__ + 1)
def #{m}(...)
puts "-" * 40
p [:call, "#{m}"]
puts caller
super
end
RUBY
end
end
$LOADED_FEATURES.singleton_class.prepend(LoadedObserver)
But I got absolutely no hits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so I went as far as modifying Ruby, and I was wrong, this callsite isn't where the arrays are created.
However looking at the heap dumps, I can confirm that these arrays are indeed some $LOAD_PATH
snapshots, I just don't get how they are created.
I suspect it might have something to do with reset_loaded_features_snapshot()
, but I must confess I don't quite grasp how shared arrays work exactly.
This was previously committed and then reverted because it caused an unexpected warning in a race condition during simultaneous loading of encodings in separate threads. Use a mutex to synchronize the loading of encodings to prevent that warning.