Optimize getting node tokens #8450
Awesome! I'm ok to merge this, but one easy thing we really should be doing is putting the caching / searching as methods of Also: did you try sorting the tokens by positions?
# rubocop:disable Metrics/AbcSize
def tokens(node)
  @tokens ||= {}
I realize you just copy-pasted, but {}.compare_by_identity would be nicer to use.
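To illustrate the suggestion, here is a minimal sketch of how `Hash#compare_by_identity` changes key lookup. `FakeNode` is a hypothetical stand-in for an AST node, not a RuboCop class; the assumption is that two structurally equal nodes would otherwise collide in a regular hash.

```ruby
# Hypothetical stand-in for an AST node (assumption, not a RuboCop class).
# Struct instances with equal members are eql? and share a hash code.
FakeNode = Struct.new(:type, :source)

a = FakeNode.new(:send, 'foo')
b = FakeNode.new(:send, 'foo')

# Regular hash: keys are compared with #hash and #eql?.
by_value = {}
by_value[a] = :tokens_for_a
by_value[b] = :tokens_for_b # overwrites the entry for a, since a.eql?(b)

# Identity hash: keys are compared by object identity only.
by_identity = {}.compare_by_identity
by_identity[a] = :tokens_for_a
by_identity[b] = :tokens_for_b # kept separate: a and b are distinct objects

puts by_value.size    # prints 1
puts by_identity.size # prints 2
```

With an identity hash the key's own `#hash` method is not called, which may also matter when keys are large tree nodes.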
Yes, it looks like the cache is not needed in this case. I have also removed a similar cache https://github.com/rubocop-hq/rubocop/pull/8450/files#diff-938b46da0d108c71380ac9427b83dde0L47-L56 and didn't notice any difference. In my previous PRs, where I tried to reduce memory usage, I saw that that method allocated a fair amount of memory (I can't remember how much), so this is a win here as well.
Yes, will move those methods to
This will always work in
Correct
Yes, each search is It would probably be better simply to index the tokens by start and end position and do straight lookups, since we should always find tokens that start and end exactly where we are looking, right? Building the indices would be
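The indexing idea described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `IndexedToken`, `index_tokens`, and `tokens_between` are hypothetical names, tokens are assumed to be in source order, and each begin and end position is assumed to belong to exactly one token.

```ruby
# Hypothetical token type (assumption): only the positions matter here.
IndexedToken = Struct.new(:begin_pos, :end_pos, :text)

# Build two hashes mapping a begin/end position to the token's array index.
# This is a single pass over the token list.
def index_tokens(tokens)
  by_begin = {}
  by_end = {}
  tokens.each_with_index do |token, i|
    by_begin[token.begin_pos] = i
    by_end[token.end_pos] = i
  end
  [by_begin, by_end]
end

# Return the tokens that start exactly at begin_pos and end exactly at
# end_pos, using two hash lookups instead of a scan.
def tokens_between(tokens, by_begin, by_end, begin_pos, end_pos)
  first = by_begin[begin_pos]
  last = by_end[end_pos]
  return [] if first.nil? || last.nil?

  tokens[first..last]
end

tokens = [
  IndexedToken.new(0, 3, 'foo'),
  IndexedToken.new(3, 4, '('),
  IndexedToken.new(4, 5, 'a'),
  IndexedToken.new(5, 6, ')')
]
by_begin, by_end = index_tokens(tokens)
found = tokens_between(tokens, by_begin, by_end, 3, 6)
puts found.map(&:text).join(' ') # prints ( a )
```

The trade-off is the extra memory for the two hashes, which have to be built once per token list.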
Why? I would say it is a constant. Most of the time, we are not performing searches for every token, but just for some of them, like in
Right, but how many What do you think of my idea of indexing the begin and end positions of tokens?
I still doubt that this will make things faster. I will implement all 3 approaches (original, sorting, and indexing) and report the results here.
That would be great, but I wouldn't want you to feel forced; as stated previously, I'm 👍 to merge this as is.
No problem 😄 Ok, I have tested the 3 implementations and was a bit surprised to get almost identical results for them. I double-checked that I was testing 3 different versions, not the same one, so everything is ok here. The sorting version gives the most concise code and, unlike the indexing version, does not allocate extra space, so I would implement that and submit a PR into
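For comparison, the sorting approach mentioned above might look something like this sketch. It is not the code from the PR: `SortedToken` and `tokens_within` are hypothetical names, and the example assumes tokens do not overlap, so that sorting by begin position also orders the end positions.

```ruby
# Hypothetical token type (assumption): only the positions matter here.
SortedToken = Struct.new(:begin_pos, :end_pos, :text)

# Find the tokens lying between begin_pos and end_pos in a list that is
# sorted by begin position, using two binary searches instead of a scan.
def tokens_within(sorted, begin_pos, end_pos)
  # Index of the first token starting at or after begin_pos.
  first = sorted.bsearch_index { |t| t.begin_pos >= begin_pos }
  return [] if first.nil?

  # Index of the first token ending after end_pos (exclusive upper bound).
  last = sorted.bsearch_index { |t| t.end_pos > end_pos } || sorted.size
  sorted[first...last]
end

unsorted = [
  SortedToken.new(4, 5, 'a'),
  SortedToken.new(0, 3, 'foo'),
  SortedToken.new(5, 6, ')'),
  SortedToken.new(3, 4, '(')
]
sorted = unsorted.sort_by(&:begin_pos) # sort once, then reuse for every lookup

puts tokens_within(sorted, 3, 6).map(&:text).join(' ') # prints ( a )
```

`Array#bsearch_index` in find-minimum mode needs the block to go from false to true exactly once along the array, which is why the non-overlapping assumption matters.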
Cool! Glad you checked 👍
Good work!
🤣 we'll be picking a different implementation, but it won't be difficult to revert this
Yeah, I saw this, but I'm thinking of cutting a release later today, so that's going to benefit our users in the meantime.
Ran on 30k files.
Before
After
Improvement: 9-10%