Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fixes #22] Use Sets instead of Arrays #26

Closed
wants to merge 1 commit into from

Conversation

marcandre
Copy link
Contributor

No description provided.

@bbatsov
Copy link
Contributor

bbatsov commented Jun 7, 2020

I like this suggestion. In general I've often been frustrated with the Ruby community's dislike of sets. I've always attributed this to the fact they are not part of the core language and don't have a convenient literal syntax.

@marcandre
Copy link
Contributor Author

So this breaks one cop (at parse time); I tweaked it in master but if we merge and release this (as is), this would break existing rubocop gems...
This shows two things:

  1. that even with my recent change, CI should check different version of rubocop #5 is not closed...
  2. how do we handle this? and how do we handle it in general?

@bbatsov
Copy link
Contributor

bbatsov commented Jun 7, 2020

how do we handle this? and how do we handle it in general?

Off the bat I can only think of another set of constants with the array versions derived from the set version, but that's going to be painful. Or we can use some custom adapter class that would behave more like array?

@marcandre
Copy link
Contributor Author

Answer to #2 is simple actually... semantic versionning. If we break previous Rubocop, that means a major version (of the -ast) gem. And the core rubocop should always require ~>1.x, ~>2.x, etc.

@marcandre
Copy link
Contributor Author

So I'd say we should tweak the current gemspec from >= 0.0.3 to add ~>0.0. Release 0.86.0 as the latest and greatest v0 release (which will fix the Team.new incompatibility).

Then we can merge this PR for 1.0.0rc0 and the big refactor on the main gem side.

@bbatsov
Copy link
Contributor

bbatsov commented Jun 7, 2020

Sounds like a plan. Let's do it!

@ashmaroli
Copy link

The downside to this move is increased memory consumption.
To verify, you can use the following script:

# frozen_string_literal: true

require 'set'
require 'memory_profiler'

def profile
  puts ''
  MemoryProfiler.report { yield }.pretty_print(scale_bytes: true)
  puts ''
end

profile { COMPARISON_OPERATORS = %i[== === != <= >= > <].freeze }
profile { COMPARISON_OPERATORS_SET = %i[== === != <= >= > <].to_set.freeze }

@marcandre
Copy link
Contributor Author

@ashmaroli you are correct that Sets take more memory than arrays. You need to realize these are constants. It's a one time memory-cost you can calculate in bytes. To avoid status quo bias, imagine they had been written as Sets all along and someone was proposing a PR to save those bytes in memory, at the cost of performance (and expressivity)...

@ashmaroli
Copy link

Yes, I understand that they're constants and won't matter much during runtime. Just saying that Sets are inherently heavier and when they are converted into arrays (perhaps in an extension outside the Core's control), it'll result in allocating more objects (one for the array, one each for constituent item).

Disclaimer: I've not yet checked if a frozen Set constituent of frozen strings get converted into an array of frozen strings.

@marcandre
Copy link
Contributor Author

I've not yet checked if a frozen Set constituent of frozen strings get converted into an array of frozen strings.

FYI, Set#to_a does not dup keys or anything, but implementation uses a hash, and hash will dup&freeze string keys if they aren't already, Set[unfrozen_strings...].to_a # => array of always the same frozen strings

@ashmaroli
Copy link

Thanks for letting me know ✌️

@marcandre marcandre marked this pull request as draft June 8, 2020 03:51
marcandre added a commit to marcandre/rubocop that referenced this pull request Jun 8, 2020
bbatsov pushed a commit to rubocop/rubocop that referenced this pull request Jun 8, 2020
@marcandre
Copy link
Contributor Author

marcandre commented Jun 10, 2020

FWIW, I'm quite frustrated by the state of affairs with Sets.

I'm preparing a series of feature requests for ruby-core (comments welcome).

One issue in particular is that most of our %i[...].freeze should be converted to Sets, but some are only used with join or each, so shouldn't be converted, which would lead to us having to check all the time if a CONSTANT is actually a Set or an Array, when we really shouldn't care much.

Or we can use some custom adapter class that would behave more like array?

Actually I think that might be best. I think it should be easy to define class RuboCop::Tuple < Array with a refined constructor (that would build the set and deep freeze the whole thing), include? and ===. I think that's all we need. This should be 100% compatible and efficient. I'll see how far that can go tomorrow.

@bbatsov
Copy link
Contributor

bbatsov commented Jul 6, 2020

@marcandre We can close this PR, right?

@marcandre marcandre closed this Jul 6, 2020
@marcandre
Copy link
Contributor Author

Yes :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants