Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extract static array into constant in RedundantSelf cop #8154

Merged

Conversation

fatkodima
Copy link
Contributor

No need to allocate an array for every check.
It is also faster when converted to Set, but I preserved the logic. Will update with set, if requested.

@bbatsov bbatsov merged commit 41b5b5e into rubocop:master Jun 14, 2020
@bbatsov
Copy link
Collaborator

bbatsov commented Jun 14, 2020

Thanks!

It is also faster when converted to Set, but I preserved the logic.

I'm assuming the performance difference is small, so that's not a big deal.

@marcandre
Copy link
Contributor

marcandre commented Jun 14, 2020

I'm assuming the performance difference is small, so that's not a big deal.

I would like to understand what makes you assume this. It's not the same O. With these particular values it's a 5x slower. How much faster than 5x does it have to be?

I really wish I could convey how inappropriate Arrays are to call include? on them while Sets are designed for this. Let me know how I can help without sounding like a broken record. I hope we can settle quickly on a name and merge #8133 and rubocop/rubocop-ast#29

$ ruby test.rb
Running each test 32768 times. Test will take about 2 seconds.
using_set is faster than using_array by 5x ± 1.0
# test.rb
# frozen_string_literal: true

require 'fruity'
require 'set'

KEYWORDS = %i[alias and begin break case class def defined? do
              else elsif end ensure false for if in module
              next nil not or redo rescue retry return self
              super then true undef unless until when while
              yield __FILE__ __LINE__ __ENCODING__].freeze

KEYWORDS_SET = KEYWORDS.to_set.freeze

Fruity.compare do
  using_array { KEYWORDS.include? :x }
  using_set   { KEYWORDS_SET.include? :x }
end

Edit: Changed the value to check from a string to a symbol, factor changed from 10x to 5x. Point remains valid.

@fatkodima
Copy link
Contributor Author

fatkodima commented Jun 14, 2020

I also have made local benchmark after Bozhidar's comment, and when checking for .include? the last element, I have seenSet is 4x faster than Array. And when checked for first element, Array was faster by 10%.

After programming on C, where due to L-whatever caches and spatial locality, small arrays are much faster than any linked data structures, I still can not accept that in ruby is an opposite situation. But it is ruby 😄

@marcandre
Copy link
Contributor

marcandre commented Jun 14, 2020

Sets are not "linked data structures". They are indexed structures, a Hash table.

@fatkodima
Copy link
Contributor Author

fatkodima commented Jun 14, 2020

Collision lists are implemented as linked lists, so definitely a "bit linked". 😄
Or Set can be implemented as a tree, so definitely a linked data structure.

I mean, in C, processing small arrays sequentially is most of the time faster than calculating hash value, jumping to specific location and optionally traversing collision lists.

So it a bit surprises me at first with the opposite situation in ruby, but after a little thinking and due to the fact that ruby is an interpreted language, everything falls into place.

@marcandre
Copy link
Contributor

marcandre commented Jun 14, 2020

Collision lists are implemented as linked lists, so definitely a "bit linked". 😄

Good try, but you won't get collisions on a small Hash map, and even on big maps, collisions account for a negligible cost as you traverse them only on hash collision...

Or Set can be implemented as a tree, so definitely a linked data structure.

Another good try, but they officially aren't "Set uses Hash as storage", etc... 😄

@fatkodima
Copy link
Contributor Author

Another good try, but they officially aren't "Set uses Hash as storage", etc... 😄

Yes, I know, I was talking about some possible implementation somewhere else 😄

Btw, thanks for interesting thread :)

@MaxLap
Copy link
Contributor

MaxLap commented Jun 15, 2020

Not to beat on a dead horse, but we are talking about check for method calls to methods such as .def. Those basically never happen. It's safe to say that way over 99.9% of cases will not be found.

There is no balance to strike between the performance of finding the first element and the last element. It will "never" be the first, and never be the last. You only have to consider the case where the element is not in the Array/Set.

So yeah, gotta make sure to benchmark the right thing 😉

(Ps: Since Ruby 2.4, there are no more chaining, Hashes use open addressing with linear probing, so they will just check the next slot when there is a collision 😛 )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants