New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rubocop -a is slow #6492

Open
gurgeous opened this Issue Nov 16, 2018 · 11 comments

Comments

Projects
None yet
5 participants
@gurgeous
Copy link

gurgeous commented Nov 16, 2018

See ruby-formatter/rufo#2 and Microsoft/vscode#41194. It is not really possible to use Rubocop autocorrect in vscode unless the user manually increases the editor.formatOnSaveTimeout. I have mine set to 3s, which seems to be sufficient for my app on my beefy machine. I bet if we sped up Rubocop autocorrect, many more people would adopt it for vscode Format on Save.

Here's a simple benchmark:

$ wget https://raw.githubusercontent.com/rails/rails/master/actionpack/lib/action_controller/metal/strong_parameters.rb
$ time rubocop -a strong_parameters.rb
Inspecting 1 file
...
1 file inspected, 118 offenses detected, 114 offenses corrected

real	0m3.087s
user	0m2.853s
sys	0m0.194s

I have 43 cops disabled in my .rubocop.yml, so the vanilla config might be slower. Thanks so much for all your hard work on Rubocop. I love it and use it daily with all my projects!

@Drenmi

This comment has been minimized.

Copy link
Collaborator

Drenmi commented Nov 16, 2018

I'll try to provide some insight into why this is slow. Part of it is startup time. Parsing configuration, etc. Part is RuboCop currently works by using what we call the "correction loop". As soon as we rewrite part of a file, we have all cops re-inspect that file, since the correction may have caused violations that can be picked up by other cops.

So when your file has 114 correctable offenses, we will have all cops inspect the file 115 times. I assume the biggest problem will be on the first run. I haven't given much thought to how this could be done differently, but there might be better ways.

@bbatsov

This comment has been minimized.

Copy link
Collaborator

bbatsov commented Nov 16, 2018

Yeah, that's a great overview of the problem and we should really investigate where exactly we spend so much time and how can we optimize the performance. Past experience has taught me that when we put our focus in this direction we can achieve some epic performance boosts. :-)

@gurgeous

This comment has been minimized.

Copy link
Author

gurgeous commented Nov 16, 2018

I am completely unfamiliar with the rubocop source, but I noticed that RuboCop::Runner#do_inspection_loop runs the cops 12 times on the strong_parameters.rb test file referenced above. Is that amount of looping common?

@jonas054

This comment has been minimized.

Copy link
Collaborator

jonas054 commented Nov 17, 2018

@gurgeous I'm not sure how common it is, but I think it's expected. Some cops rewrite the code so that new offenses are generated. I've looked at what happens during auto-correction of strong_parameters.rb, and it looks like the Layout/CommentIndentation cop is especially to blame for the long running time of rubocop -a. The cop corrects one line at a time and writes that change to file. If there's a preceding comment line that originally had the same indentation as the corrected line, that line now has an offense, which will be corrected in the next run. And so on.

If the indentation cops could look at the surrounding lines and fix them too when they see that there will be new offenses in the next iteration, I think we could take big steps towards shorter auto-correction times.

@gurgeous

This comment has been minimized.

Copy link
Author

gurgeous commented Nov 17, 2018

I see, thank you for clarifying. I wonder if we should also target the second rubocop run, since that mirrors real world usage when plugged into an editor. For example:

$ wget https://raw.githubusercontent.com/rails/rails/master/actionpack/lib/action_controller/metal/strong_parameters.rb

$ time rubocop -a strong_parameters.rb
...
real	0m4.669s
$ time rubocop -a strong_parameters.rb
...
real	0m1.261s

This is on a slower machine so the numbers are a bit worse than the earlier benchmark. Why is the second pass slow? Is that mostly startup costs?

@gurgeous

This comment has been minimized.

Copy link
Author

gurgeous commented Nov 18, 2018

On the second run, half the execution time is requiring files. I wonder if we should only load cops that support autocorrect. Another optimization would be to avoid loading cops that are explicitly disabled. Also, loading Rack seems to take a large chunk of time from the HttpStatus cop.

jonas054 added a commit to jonas054/rubocop that referenced this issue Nov 18, 2018

[rubocop-hq#6492] Auto-correct chunks of comments
One common reason for `rubocop -a` to be slow is that individual corrections
create new offenses for the next iteration of the inspect/autocorrect
loop. Perhaps the worst problem is posed by `Layout/CommentIndentation` when it
fixes only the last comment line in a large multi-line inline comment. Then we
must go through the loop as many times as there are lines in that multi-line
comment.

This change improves the situation a lot, but I think it's too early to close
the issue. More improvements can probably be made before we're satisfied.
@jonas054

This comment has been minimized.

Copy link
Collaborator

jonas054 commented Nov 18, 2018

since that mirrors real world usage when plugged into an editor

Do you by that mean that the most common case is that rubocop -a doesn't make any changes to the file? That's true. But it's more difficult to solve that case, I believe.

half the execution time is requiring files

Interesting! How did you measure that?

I wonder if we should only load cops that support autocorrect.

That's a big change. We've always let all configured cops report their offenses, running with or without --auto-correct. But we could make it configurable I guess. If it turns out to be a big performance boost.

Also... please take a look at my PR. 😄 It cut the execution time in half when I ran on strong_parameters.rb.

bbatsov added a commit that referenced this issue Nov 18, 2018

[#6492] Auto-correct chunks of comments
One common reason for `rubocop -a` to be slow is that individual corrections
create new offenses for the next iteration of the inspect/autocorrect
loop. Perhaps the worst problem is posed by `Layout/CommentIndentation` when it
fixes only the last comment line in a large multi-line inline comment. Then we
must go through the loop as many times as there are lines in that multi-line
comment.

This change improves the situation a lot, but I think it's too early to close
the issue. More improvements can probably be made before we're satisfied.
@gurgeous

This comment has been minimized.

Copy link
Author

gurgeous commented Nov 18, 2018

Do you by that mean that the most common case is that rubocop -a doesn't make any changes to the file? That's true. But it's more difficult to solve that case, I believe.

Yes. When using rubocop via vscode, I might run rubocop -a dozens of times as I'm editing a particular file.

Interesting! How did you measure that?

I modified exe/rubocop:

tm = Time.now
require 'rubocop'
printf("%0.3f require\n", Time.now - tm)

require 'benchmark'

cli = RuboCop::CLI.new
result = 0

time = Benchmark.realtime do
  result = cli.run
end
printf("%0.3f cli\n", Time.now - tm)

which produces:

0.664 require
1.379 cli

I'm sure we can find ways to improve startup time if we poke around a bit...

Great work on the PR! Incredible improvement and much appreciated.

@jonas054

This comment has been minimized.

Copy link
Collaborator

jonas054 commented Dec 8, 2018

I've experimented with caching the compiled node patterns that we get from def_node_matcher. This gave a decrease in start-up time by around 15% on my system. I'm not at all sure it's worth the effort, because a) it adds complexity, and b) results depend on hardware configuration.

@ndbroadbent

This comment has been minimized.

Copy link

ndbroadbent commented Dec 15, 2018

UPDATE: 0.3.0 was released with these changes.

Hi everyone, I've just finished working on a PR for this rubocop-daemon library. I wanted to see if RuboCop could be daemonized, so I searched "rubocop daemon", and found this library! I needed to do a ton of work to make it more reliable and fix a lot of issues, but now it seems to be working really well.

The idea is that this starts RuboCop as a daemon process. A client will connect to the server and send commands, and the daemon will return the results. The default client is a Ruby script, but this also adds some latency. So I've written a bash script that connects to the TCP server via netcat, and it's way faster. This bash script is actually a drop-in replacement for rubocop, and it makes it around 20x faster (~150ms to lint/format a file, instead of ~3 seconds.)

It might be a good idea to rewrite this script in Go or Rust, but bash seems fast enough for now.

It was a bit tricky to get it working with VS Code. The vscode-ruby extension doesn't really allow you to customize the rubocop path or binary. So in the meantime I'm just overriding the binary with a symlink:

/Users/ndbroadbent/.rvm/gems/ruby-2.5.3/bin/rubocop -> /usr/local/bin/rubocop-daemon-wrapper

I've added the installation instructions to the README. (You have to manually download this script and put it in your PATH.)

I've tested this on a few different apps, and I think it should be pretty stable now. But I've only tested on my Mac, and on Ruby 2.5.3. I could use some help testing on Linux, and with different Ruby/RuboCop versions, etc. Please try it out and let me know if you run into any issues!

Benchmarks

git clone https://github.com/fohte/rubocop-daemon /tmp/rubocop-daemon
cd /tmp/rubocop-daemon
bundle install

time rubocop -a      # no daemon
...
real	0m2.321s


gem build rubocop-daemon.gemspec
gem install rubocop-daemon-0.3.0.gem

time ./bin/rubocop-daemon-wrapper -a          # first run starts the daemon
...
real	0m2.290s

time ./bin/rubocop-daemon-wrapper -a         # a bit faster on the second run
...
real	0m0.742s

time ./bin/rubocop-daemon-wrapper -a         # much faster on third run (not too sure why)
...
real	0m0.131s

From #6492 (comment)

cd /tmp
wget https://raw.githubusercontent.com/rails/rails/master/actionpack/lib/action_controller/metal/strong_parameters.rb

time rubocop -a strong_parameters.rb
...
real	0m2.468s

time rubocop -a strong_parameters.rb
...
real	0m1.963s

time ./rubocop-daemon/bin/rubocop-daemon-wrapper -a strong_parameters.rb
...
real	0m2.327s

time ./rubocop-daemon/bin/rubocop-daemon-wrapper -a strong_parameters.rb
...
real	0m0.539s
@gurgeous

This comment has been minimized.

Copy link
Author

gurgeous commented Dec 16, 2018

Oooh, this looks neat. I'm headed out of town for a few days but I'll try it when I return. This could make a big difference in my workflow.

We should also get a PR going for vscode-ruby to allow customizing the complete path to rubocop. Right now vscode-ruby let's you set a bunch of flags to tweak how rubocop is executed, instead of one string to just specify the entire command line.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment