Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Performance] Add early checks to loaded_feature_path to make it faster #3010

Merged

Conversation

itarato
Copy link
Collaborator

@itarato itarato commented Apr 19, 2023

Benchmark shows that string interpolations in the loop is very expensive. We can improve with .start_with? / .end_with? to shape the string early.

Base version:

def lookup(s1, s2, list)
	list.find do |elem|
		s1 == "#{elem}/#{s2}" || s1 == "#{elem}/#{s2}/foo"
	end
end

Improved version:

def lookup(s1, s2, list)
	prefix = if s1.end_with?("/#{s2}")
		s1.chomp(name)
	elsif s1.end_with?("/#{s2}/foo")
		s1.chomp("/#{s2}/foo")
	else
		return false
	end

	list.include?(prefix) ? prefix : nil
end

Benchmark:

benchmark do
	lookup(
		"/home/john/folder/subfolder/subfolder/rails/active_record/foo",
		"active_record",
		[
			"/home/john/folder/subfolder/subfolder/rails09",
                        # ... lot more
			"/home/john/folder/subfolder/subfolder/rails", # <- correct one
		]
	)
end

# With `jt benchmark script`:
# Base:     1060301.2
# Improved: 5018656.3

And upon looking at the use of this function, the return type is only as useful as a boolean to indicate a find - replacing the string return to a boolean improves another bit from 5018656.3 to 5133049.9 (4.8x compare to current).

Testing this on a rails app single test execution:

jt --use jvm-ce ruby --cpusampler -S bin/rails test test/...

On master's HEAD:

Truffle::FeatureLoader.loaded_feature_path || 1133ms 100.0% || 1133ms 100.0% || ../../../../truffle/truffleruby/src/main/ruby/truffleruby/core/truffle/feature_loader.rb~201-206:6970-7220

On this branch's commit:

Truffle::FeatureLoader.feature_path_loaded? || 186ms 100.0% || 186ms 100.0% || ../../../../truffle/truffleruby/src/main/ruby/truffleruby/core/truffle/feature_loader.rb~201-215:7030-7567

Same with the parent function find_feature_or_file:

Truffle::FeatureLoader.find_feature_or_file || 4697ms   2.2% || 117ms   0.1% || ../../../../truffle/truffleruby/src/main/ruby/truffleruby/core/truffle/feature_loader.rb~79-141:2566-4768

on master and:

Truffle::FeatureLoader.find_feature_or_file || 3628ms   1.8% || 103ms   0.1% || ../../../../truffle/truffleruby/src/main/ruby/truffleruby/core/truffle/feature_loader.rb~79-141:2566-4768

from this PR's commit.

@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Apr 19, 2023
@itarato itarato self-assigned this Apr 19, 2023
@itarato itarato marked this pull request as ready for review April 19, 2023 19:31
@itarato itarato force-pushed the feature/PA-faster-loaded_feature_path branch 2 times, most recently from 4da9ca3 to 21211a3 Compare April 20, 2023 02:51
return false
end

load_path.include?(path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could store the $LOAD_PATH index in the FeatureEntry, and use that as a faster check to avoid the loop (i.e. load_path[index] == path or load_path.include?(path)). OTOH we would need to care about $LOAD_PATH changing (tracked via version), in which case we should recompute the index for the FeatureEntry.

Actually, I think something simpler and easier would be to keep a Hash of the $LOAD_PATH (recomputed by version). And then we can do load_path_hash.include?(path) which seems really nice and simple.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe something along that line can be accomplished. I'd rather ship this separately as that cached result change would still need some thinking how to position it properly. Would that work?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's definitely something that can be done separately.

@eregon
Copy link
Member

eregon commented Apr 20, 2023

This looks great. I left some comments to improve it further.

I think it would be good to measure it in a real usage scenario too, not just a microbenchmark.
For instance you could measure how long does require 'bundler/setup' takes (e.g. with p Benchmark.realtime { ... }) before and after this change.
Also could you show the difference with --cpusampler=flamegraph? Just sharing the total time for find_feature_or_file, feature_provided?, and loaded_feature_path/feature_path_loaded? before & after would be helpful, no need to share the full flamegraph.


BTW there is also Require Profiling.
I thought this could be quite useful here, but unfortunately I think it does not measure Truffle::FeatureLoader.find_feature_or_file, which is what we would want here.
metrics searching only measures Primitive.find_file and metrics require only measures actually loading a file but not the search before.
Maybe we should improve that. OTOH regular CPUSampler should already be able to measure how much Truffle::FeatureLoader.find_feature_or_file takes without anything special.

@itarato itarato force-pushed the feature/PA-faster-loaded_feature_path branch from 21211a3 to d7807c1 Compare April 20, 2023 14:10
@itarato itarato force-pushed the feature/PA-faster-loaded_feature_path branch from d7807c1 to 75acd60 Compare April 20, 2023 15:17
Copy link
Member

@eregon eregon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you write a changelog entry for this under performance? Then I think it's good to go.

@itarato
Copy link
Collaborator Author

itarato commented Apr 20, 2023

Could you write a changelog entry for this under performance? Then I think it's good to go.

Certainly. Also adding some more results to the PR description.

@itarato itarato force-pushed the feature/PA-faster-loaded_feature_path branch from 75acd60 to 67fcd2d Compare April 20, 2023 18:23
@itarato itarato requested a review from eregon April 20, 2023 18:23
@eregon eregon added the in-ci The PR is being tested in CI. Do not push new commits. label Apr 21, 2023
@graalvmbot graalvmbot merged commit 6c93b4c into oracle:master Apr 24, 2023
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in-ci The PR is being tested in CI. Do not push new commits. OCA Verified All contributors have signed the Oracle Contributor Agreement. shopify
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants