Allow cops to invalidate results cache #7496

Merged
merged 1 commit into from
Nov 23, 2019

Conversation

maxh
Contributor

@maxh maxh commented Nov 11, 2019

Internally at Flexport, we've written a few cops to enforce Rails Engine isolation. They've helped us incrementally modularize our system, and we think they might be valuable to others too. I'm working to open source them now.

For example, GlobalModelAccessFromEngine (flexport/rubocop-flexport#5) forbids code within Rails Engines from directly accessing global models in the main app/ directory. Another example is EngineApiBoundary (flexport/rubocop-flexport#6).

(Note: we considered upstreaming these to rubocop-rails, and @koic suggested we create our own gem instead: rubocop/rubocop-rails#152 (comment))

These cops read from the filesystem, taking a holistic view of our codebase while they inspect. Unfortunately, this approach does not play nice with the RuboCop results cache. As a workaround, we created a custom lint.rb that wraps rubocop and busts the cache when needed using the --cache param. It works ok.

But it's not ideal, especially for broader use. This PR aims to fix the issue at its source by allowing cops themselves to invalidate the cache.

Example

Consider the following example:

(1) A filesystem contains these files:

app/models/my_model.rb
engines/my_engine/app/services/my_engine/my_service.rb

(2) We run rubocop engines/my_engine/app/services/my_engine/my_service.rb and find a violation -- my_service.rb contains MyModel.find(123).

(2b) During that run, a cached result is stored, keyed by: the inspected source code, the RuboCop config, command-line options, and executable version.

(3) We run the same command again. Cache is hit, same violation shown. All good.

(4) We move the model file into the engine. So now app/models/my_model.rb no longer exists and we have engines/my_engine/app/models/my_engine/my_model.rb.

(5) We run the same RuboCop command again. Because none of the cache key inputs changed, the same violation is shown again. This is incorrect.

(6) We run the same command again with --cache false and see that the violation no longer exists. This is correct.
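The stale result in steps (2b) through (5) can be illustrated with a minimal sketch. The `cache_key` helper below is hypothetical and is not RuboCop's actual cache implementation; it only assumes that the key is a digest over the four inputs listed in step (2b), plus an optional cop-supplied checksum. The version string is a placeholder.

```ruby
require 'digest'

# Hypothetical stand-in for the cache key described in step (2b).
# This is an illustration, not RuboCop's ResultCache code.
def cache_key(source:, config:, options:, version:, external_checksum: nil)
  parts = [source, config, options.join(' '), version, external_checksum.to_s]
  Digest::SHA1.hexdigest(parts.join("\0"))
end

inputs = {
  source: 'MyModel.find(123)', # contents of my_service.rb
  config: 'AllCops: {}',       # placeholder config
  options: [],                 # no command-line options
  version: '0.0.0'             # placeholder version string
}

# Moving app/models/my_model.rb changes none of the four inputs,
# so the key is identical and the stale result is reused.
stale = cache_key(**inputs) == cache_key(**inputs)

# If a cop contributes a checksum over the files it depends on,
# the key differs once the model file has moved.
before = Digest::SHA1.hexdigest('app/models/my_model.rb')
after  = Digest::SHA1.hexdigest('engines/my_engine/app/models/my_engine/my_model.rb')
busted = cache_key(**inputs, external_checksum: before) !=
         cache_key(**inputs, external_checksum: after)

puts "stale without checksum: #{stale}, busted with checksum: #{busted}"
```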

Implementation overview

This PR allows cops to define an external_dependency_checksum method that busts the cache when their external dependencies change. Cops are responsible for computing their own checksum however they deem appropriate.

Among existing cops, I believe there are no use cases for this method. For the cops we've written, there are two types of external dependencies: (1) the presence or absence of certain files and/or (2) the contents of certain files.
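As a rough sketch of dependency type (1), a cop could derive its checksum from the list of files that exist rather than from their contents. The class below is hypothetical and standalone so that it runs without RuboCop; a real cop would define `external_dependency_checksum` on its cop class, and the `app/models` glob is an assumption made for illustration.

```ruby
require 'digest'
require 'fileutils'
require 'tmpdir'

# Hypothetical cop-like class, not one of the Flexport cops.
class GlobalModelPresenceSketch
  def initialize(root)
    @root = root
  end

  # Depends only on which model files exist under app/models,
  # not on what those files contain.
  def external_dependency_checksum
    paths = Dir.glob('app/models/**/*.rb', base: @root).sort
    Digest::SHA1.hexdigest(paths.join("\n"))
  end
end

Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, 'app/models'))
  model = File.join(root, 'app/models/my_model.rb')
  File.write(model, "class MyModel; end\n")

  cop = GlobalModelPresenceSketch.new(root)
  before = cop.external_dependency_checksum

  # Move the model into the engine, as in step (4) of the example.
  engine_dir = File.join(root, 'engines/my_engine/app/models/my_engine')
  FileUtils.mkdir_p(engine_dir)
  FileUtils.mv(model, File.join(engine_dir, 'my_model.rb'))

  puts "checksum changed after move: #{before != cop.external_dependency_checksum}"
end
```

Sorting the paths keeps the checksum independent of the order in which the filesystem returns entries.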

Discussion

I recognize that this is perhaps an unusual use case for RuboCop. As an alternative, we could create our own repo with these cache-unfriendly cops and package them with a custom runner script that does cache busting inside the script. But that seems suboptimal. And I suspect other cops may make use of this feature in the future as well.

Feedback welcome! Thanks!


Before submitting the PR make sure the following are checked:

  • Wrote good commit messages.
  • Commit message starts with [Fix #issue-number] (if the related issue exists).
  • Feature branch is up-to-date with master (if not - rebase it).
  • Squashed related commits together.
  • Added tests.
  • Added an entry to the Changelog if the new code introduces user-observable changes. See changelog entry format.
  • The PR relates to only one subject with a clear title and description in grammatically correct, complete sentences.
  • Run bundle exec rake default. It executes all tests and RuboCop for itself, and generates the documentation.

@bbatsov
Collaborator

bbatsov commented Nov 14, 2019

I'm fine with the proposed solution, but I'll defer to @jonas054 to evaluate the implementation.

lib/rubocop/cop/cop.rb (outdated)
Collaborator

@jonas054 jonas054 left a comment

Everything looks very good to me. I just found a couple of small things to complain about. :)

lib/rubocop/runner.rb (outdated)
Collaborator

@jonas054 jonas054 left a comment

👍

@maxh
Contributor Author

maxh commented Nov 22, 2019

Here's a blog post that goes into more detail about the Rails Engine cops mentioned in the PR description:

https://flexport.engineering/isolating-rails-engines-with-rubocop-210feaba3164

When this is merged, I will be able to fully upstream those cops. Thanks!

@bbatsov bbatsov merged commit 5654998 into rubocop:master Nov 23, 2019
@bbatsov
Collaborator

bbatsov commented Nov 23, 2019

Thanks!

# ResultCache system when those external dependencies change,
# ie when the ResultCache should be invalidated.
def external_dependency_checksum
nil

If you run the cop on n files, this method will be called every time, recomputing the checksum n times, correct?

Maybe it would be useful for rubocop to have a "run" context that keeps state across files and is accessible from a cop.

Contributor Author

Good question -- please see #7543. I believe these results should be cached per team/config, so they don't need to be recomputed per inspected file.
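To illustrate the concern about recomputation, here is a minimal sketch of computing a checksum once and sharing it across many cop objects. It assumes that a fresh object may be created for each inspected file, and it memoizes at the class level. The class and its `computations` counter are hypothetical; this is not a description of what #7543 does.

```ruby
require 'digest'

# Hypothetical cop-like class that shares one checksum across instances.
class MemoizedChecksumSketch
  class << self
    # Number of times the digest was actually computed.
    attr_reader :computations

    # Computes the digest on the first call and reuses it afterwards.
    # Note: the first caller's paths win; later arguments are ignored.
    def shared_checksum(paths)
      @computations ||= 0
      @shared_checksum ||= begin
        @computations += 1
        Digest::SHA1.hexdigest(paths.sort.join("\n"))
      end
    end
  end

  def initialize(paths)
    @paths = paths
  end

  def external_dependency_checksum
    self.class.shared_checksum(@paths)
  end
end

# Three objects, as if three files were inspected in one run.
cops = Array.new(3) { MemoizedChecksumSketch.new(['app/models/my_model.rb']) }
cops.each(&:external_dependency_checksum)
puts "digest computed #{MemoizedChecksumSketch.computations} time(s)"
```

Class-level state like this lives for the whole process, so it only suits dependencies that do not change during a single run.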

koic added a commit that referenced this pull request Nov 28, 2019
Follow up of #7496.
@maxh maxh deleted the maxh/enable-cache-key branch December 5, 2019 22:53
koic added a commit to koic/rubocop-rails that referenced this pull request Apr 9, 2020
…updating schema.rb

Fixes rubocop#227.

This PR makes `Rails/UniqueValidationWithoutIndex` aware of updates to
db/schema.rb.

`Rails/UniqueValidationWithoutIndex` cop needs to know both model
and db/schema.rb changes to register an offense. However, with
default RuboCop, only changes to the model affect cache behavior.

This PR ensures that changes to db/schema.rb affect the cache by
overriding the following method:

```ruby
# This method should be overridden when a cop's behavior depends
# on state that lives outside of these locations:
#
#   (1) the file under inspection
#   (2) the cop's source code
#   (3) the config (eg a .rubocop.yml file)
#
# For example, some cops may want to look at other parts of
# the codebase being inspected to find violations. A cop may
# use the presence or absence of file `foo.rb` to determine
# whether a certain violation exists in `bar.rb`.
#
# Overriding this method allows the cop to indicate to RuboCop's
# ResultCache system when those external dependencies change,
# ie when the ResultCache should be invalidated.
def external_dependency_checksum
  nil
end
```

https://github.com/rubocop-hq/rubocop/blob/v0.81.0/lib/rubocop/cop/cop.rb#L222-L239

For more details, see rubocop/rubocop#7496
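A content-based override of the kind described in this commit message might look roughly like the following. This is a standalone sketch, not the rubocop-rails implementation: `schema_checksum` is a hypothetical helper, and the schema path is passed in explicitly rather than discovered from the project.

```ruby
require 'digest'
require 'tmpdir'

# Hypothetical helper: digest of the schema file's contents,
# or nil (the default "no external dependency" value) if it is missing.
def schema_checksum(schema_path)
  return nil unless File.exist?(schema_path)

  Digest::SHA1.hexdigest(File.read(schema_path))
end

Dir.mktmpdir do |root|
  schema = File.join(root, 'schema.rb')
  missing = schema_checksum(schema)

  File.write(schema, "create_table :users\n")
  before = schema_checksum(schema)

  # Editing the schema changes the checksum, which would bust the cache.
  File.write(schema, "create_table :users\nadd_index :users, :email, unique: true\n")
  after = schema_checksum(schema)

  puts "missing: #{missing.inspect}, changed after edit: #{before != after}"
end
```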