Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fixes #22] Introduce FastArray, a frozen Array with fast inclusion lookup #29

Closed
wants to merge 1 commit into from

Conversation

marcandre
Copy link
Contributor

Creates an Array-like class with === and include? from Set.

module AST
# A frozen Array/Set
#
class Tuple < ::Array
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subclassing core classes is not recommended.

https://words.steveklabnik.com/beware-subclassing-ruby-core-classes has the quote

If you don’t know exactly how these classes operate at the C level, you’re gonna have a bad time.

And I remember Rails running into problems when making ActionController::Parameters no longer inherit from Hash: rails/rails#14384 and rails/rails#20868

Could we use composition instead, and still get the same benefits?

Copy link
Contributor Author

@marcandre marcandre Jun 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd claim I know almost exactly how these classes operate at the C level 😅

It's true it can be tricky to inherit from base classes. Composition also can be tricky (as your rails PR show). Here though, we're talking about a completely immutable object that acts like an array in almost all aspects, just with an optimized include? and :=== that's an alias. The idea is to use it instead of frozen arrays of strings/symbols.

FWIW, HashWithIndifferentAccess was an abomination 😆, and ActiveSupport::SafeBuffer < String and that's quite a bit trickier as it's meant to interact with other Strings and keeping track of safety is really important. Here we typically don't really care if if make an operation and we get back an Array or a Tuple, because we don't really make operations on them (or we wrap them in a Tuple anyways).

Copy link
Member

@koic koic Jun 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen the concept of Tuple in Rinda, but I'm not familiar with it 😅
https://ruby-doc.org/stdlib-2.7.1/libdoc/rinda/rdoc/Rinda/Tuple.html

Array, Hash and Set do not inherit from each other. I'm wondering if Tuple is a concept related to is_a of Array or not. The following is an example of Rinda:

% ruby -rrinda/rinda -e 'p Rinda::Tuple.ancestors'
[Rinda::Tuple, Object, Kernel, BasicObject]

Tuple may be a different concept than Array.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come we cannot use a Set?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@koic To be honest, I'm not completely satisfied with Tuple, a more descriptive name would be more like DeeplyFrozenIndexedArray. And that is_a? Array...

@bquorning That was the first thing I tried, but Set are sorely lacking and the first two points created annoying issues. The choice was to have potential incompatibilities and have to worry about which "lists" were Sets (most) or Arrays (some that didn't need indexing and used join, say).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Refinements are a fine (no pun intended) feature that might fit well here. Also have never tested the performance implications of them (but use them actively at work.) 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, FastArray, ConstArray, SetArray (as in "fixed" and as "set-like" :-) ),IndexedArray

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, is there a naming preference? Tuple / FastArray / IndexedArray / ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, arrays are supposed to be indexed by default, so I don't like IndexedArray. Seems to me that FastArray is the best out of the proposed names.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced with FastArray & rebased

@marcandre marcandre force-pushed the tuple branch 2 times, most recently from e5117a8 to 9fbfca5 Compare June 11, 2020 16:15
@marcandre marcandre changed the title [fixes #22] Introduce Tuple, a frozen Array/Set [fixes #22] Introduce Tuple, a frozen Array with fast lookup Jun 11, 2020
@marcandre marcandre force-pushed the tuple branch 2 times, most recently from 53dcf85 to b03f186 Compare June 24, 2020 01:32
@marcandre marcandre changed the title [fixes #22] Introduce Tuple, a frozen Array with fast lookup [fixes #22] Introduce FastArray, a frozen Array with fast inclusion lookup Jun 24, 2020
end
end

def FastArray(list) # rubocop:disable Naming/MethodName
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not too crazy about us defining constants outside of the RuboCop namespace. Is it necessary? As far as I can see, we only use FastArray inside the RuboCop::AST namespace.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think it's going to be better to define some refinement for use within RuboCop's code base and just use a conversion method like .to_fa everywhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The constant FastArray is defined within RuboCop, but I also added a global method FastArray, but it isn't strictly necessary, I could use FastArray[ary]. I can try with refinements too. Is to_fast_array ok? I fear to_fa would be quite cryptic

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my personal opinion, if we could get away with defining the FastArray method inside RuboCop scope, that would be optimal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my personal opinion, if we could get away with defining the FastArray method inside RuboCop scope, that would be optimal.

We could, but we'll need an explicit extend FastArray::Method or similar (not needed when inheriting from a class that already extend it, like Cop::Base). Let me try that first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Force pushed without a global function, and on the rubocop side too.

@bbatsov
Copy link
Contributor

bbatsov commented Jul 6, 2020

I'm fine with this approach, but if the others have concerns I won't push for it. We can always revisit this down the road.

@marcandre
Copy link
Contributor Author

Cool.

@bquorning did you have a chance to look at the approach without global function?

@bquorning
Copy link
Contributor

Sorry, I forgot to check before now.

I really like the explicit extend FastArray::Function. It’s not as easy to use as having the function in global namespace, but there’s less confusion about where the function really comes from now.

@koic
Copy link
Member

koic commented Jul 6, 2020

I have the following @bquorning opinion again.

It’s a tradeoff between speed, readability and understandability.

rubocop/rubocop#8133 (comment)

Performance improvements are welcome, however I love the ease and readability of Ruby itself.
I think merging this PR means applying FastArray to the entire RuboCop rubocop/rubocop#8133, and RuboCop party gem and beyond.
My wish is to be able to read and write with plain Array literal provided by Ruby. I would like to take up ease and readability of Ruby itself rather than about 5% performance improvement.

@marcandre
Copy link
Contributor Author

We can always revisit this down the road.

I just want to mention that one issue is that these constants are public. Most are arrays, some are sets, which is confusing and error prone. It's less of an issue to change them now (pre 1.0) than after.

@jonatas
Copy link
Contributor

jonatas commented Jul 27, 2020

I'm not an avid contributor but I'd like to give an idea here:

Instead of enforcing the FastArray inline, we can do it via metaprogramming just patching the class.constants no?

node_classes = RuboCop::AST.constants.grep(/Node$/).map(&RuboCop::AST.method(:const_get))
node_classes.each do |clazz|
  clazz.constants.each do |const|
    value = clazz.const_get(const)
    next unless value.grep(Symbol) != value # all should be symbols
    clazz.remove_const(const)
    clazz.const_set(const, FastArray(value))
  end 
 end

We can keep the implementation and add some good documentation in the to patch it but we don't need to have the code explicit in all the places.

@marcandre
Copy link
Contributor Author

@jonatas I like your spirit 😅

While most such constants are in RuboCop::AST::Node and such, there are other constants in the main gem's cops that would benefit from that treatment too. Still, it could be done with an opt-in, say extend OptimizeArrayConst or something.

Did you have a look at #57?

@jonatas
Copy link
Contributor

jonatas commented Jul 27, 2020

I understood your point. Very nice idea on #57, I checked now :)
I did some benchmarking locally. This looks quite challenging 🥇

@marcandre Have you shared the issue with the ruby-core team? because as far as I can see here, this looks like bring benefit to any ruby array lookup right?

maybe someone has some ideas about how to make it better for the array in general or introduce the FastArray in the core library as we have BasicObject that is lighter than Object.

@marcandre
Copy link
Contributor Author

this looks like bring benefit to any ruby array lookup right?

Not really, no. Arrays may have duplicated elements, are not meant for lookups, but are good for fast creation / insertions / accessing n-th element, low memory, etc. Sets may not have duplicated elements, and are meant for lookups and are slower for creation / insertions. For all these frozen constants, we really should be using Sets, but there are a few annoyances which I hope to get ruby-core to reconsider in https://bugs.ruby-lang.org/issues/16989

@marcandre marcandre closed this Sep 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants