Core API can be more efficient #1212
Comments
@nzakas I think this issue can be closed.
@btmills said in the PR that he was still digging into this (note the commit says "refs" instead of "fixes"), so I'm leaving it open to give him a chance to continue investigating.
But it's already merged.
Yes, and there can be other PRs that add onto those changes. Again, the "refs" in the commit message indicates that this is not intended to close the ticket.
After speeding up tokens, the new most time-consuming method was
Instead of reverse()ing, iterates parents backwards. Indexes scopes for direct random lookup rather than linear search.
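The commit description above can be illustrated with a small sketch (names here are illustrative, not ESLint's actual internals): instead of copying and `reverse()`ing the ancestor list before scanning it, walk the original array from the end by index, which avoids the per-call allocation.

```javascript
// Before: allocates a reversed copy on every call.
function findScopeSlow(ancestors, isScopeNode) {
    const reversed = ancestors.slice().reverse();
    for (const node of reversed) {
        if (isScopeNode(node)) {
            return node;
        }
    }
    return null;
}

// After: same result, no extra allocation.
function findScopeFast(ancestors, isScopeNode) {
    for (let i = ancestors.length - 1; i >= 0; i--) {
        if (isScopeNode(ancestors[i])) {
            return ancestors[i];
        }
    }
    return null;
}

const ancestors = [
    { type: "Program" },
    { type: "FunctionDeclaration" },
    { type: "BlockStatement" }
];
const isScope = (n) => n.type === "FunctionDeclaration" || n.type === "Program";

console.log(findScopeSlow(ancestors, isScope).type); // "FunctionDeclaration"
console.log(findScopeFast(ancestors, isScope).type); // "FunctionDeclaration"
```

Both versions find the innermost scope node; the second simply skips the intermediate array.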
As I mentioned at the end of #1216, I added some timing code in a local branch to track relative processing time consumed by each rule. Here are the ten worst offenders on
(Note that this is with the change in #1216.) Looks like
Nice analysis!
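Per-rule timing of the kind described above could be done along these lines (a standalone sketch under assumed names, not the actual instrumentation from that branch): wrap each rule's event handlers so cumulative time is attributed to the rule.

```javascript
// Wrap every handler of a rule so time spent inside it is accumulated
// into totals[ruleName] (in nanoseconds, as a BigInt).
function instrument(ruleName, handlers, totals) {
    const wrapped = {};
    for (const event of Object.keys(handlers)) {
        const original = handlers[event];
        wrapped[event] = function (...args) {
            const start = process.hrtime.bigint();
            try {
                return original.apply(this, args);
            } finally {
                const elapsed = process.hrtime.bigint() - start;
                totals[ruleName] = (totals[ruleName] || 0n) + elapsed;
            }
        };
    }
    return wrapped;
}

// Usage: wrap a fake rule, invoke a handler, and read the total.
const totals = {};
const rule = instrument("no-example", {
    Identifier() { for (let i = 0; i < 1e5; i++); } // busy work
}, totals);

rule.Identifier();
console.log(totals["no-example"] > 0n); // true
```

Sorting `totals` descending would produce exactly the kind of "ten worst offenders" list mentioned in the comment.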
Fix: rewrite no-spaced-func rule (refs #1212)
This is interesting. Here's my run of
Here's when I run it with master:
If I go back to the tagged 0.7.4 version, I get:
So for some reason, at least on my machine, it seems like we're going backwards in terms of performance. Any ideas?
Just a guess: new rules or features added since 0.7.4 carried performance penalties that weren't fully offset by the recent optimizations. If that theory is correct, ESLint got slower because it's doing more, but not nearly as much slower as it could have.
Good point. I'm going to review any new on-by-default rules and see what's up.
The only two new rules added into the default config as enabled are |
Okay, something weird is going on. I just tried locally reverting commit 137b4dd and then the performance is even better than 0.7.4:
I'm running Node v0.10.31 on Windows. When I try running it on Linux, master is faster than 0.7.4, so it seems like a platform-specific difference. :( That puts me in a bit of a tough spot because even though the performance is slightly better on Linux, it's quite a bit worse on Windows.
There's a possibility that V8 handles memory allocation on Linux differently than on Windows. For example, if the linked list is laid out as a sparse array on Windows, memory seek time might be offsetting the performance gains from the change (although that's doubtful). It's also possible that while you have a faster CPU, you have slower memory; in that case, a linked list is going to be quite a bit more costly than an array. Edit: Also, looking at the changes in the initial commit, I'm really not sure why there was any increase in performance at all. Changing to a doubly linked list should've made no difference, since the big-O of the code didn't change at all (except that a doubly linked list uses more memory and can suffer from seek-time latency). What would've probably made a difference is to change the output array in each of those functions into a linked list or stack. Unless
All quite possible. However, my larger concern is making a performance change that diverges across platforms. It makes it really hard to judge whether other changes are making things faster or slower. At the moment, in the interest of getting v0.8.0 out the door, I'm strongly considering rolling back the Thoughts?
I'm on Mac OS.
Is anyone else able to reproduce @nzakas's claim? If not, it's not a divergent perf change, it's a single outlier.
I'm trying to reproduce it right now on my machine (Windows). Will get back with results in a few. Also, can @btmills explain why he thinks those changes should improve performance? Maybe I'm missing something, but as I mentioned in the edit to my comment above, I'm not seeing anything that should improve performance in the getTokens changes.
I'm using Windows 7, FWIW. Anyone else have a Windows 7 machine to test on? I'll hold off on pushing out 0.8.0 at least until tomorrow to see if we can get some more data and maybe a possible solution. If not, I'll revert the change and push out 0.8.0, and we can revisit in a future release.
I found my really old laptop that is still running Windows 7. Not sure how useful this data is, but here's what I'm seeing:
So it still doesn't line up with what @nzakas is seeing.
Mine, on Windows 8.1:
Just for reference: http://jsperf.com/unshift-vs-stack. This is the improvement I was talking about in the comment above. I have the changes locally, but I will hold off until after the v0.8.0 release. I also still don't really think that the changes @btmills introduced are as good as it can get. I can see a performance improvement; I just really don't understand why I'm seeing it. Linked lists should be slower to traverse than arrays; they are better at insertion and deletion, which we are not doing at all. An array + hashtable should actually perform better, and a doubly linked skip list should be better still. I'm going to see if I can do a partial skip list implementation for getTokens and check its performance.
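The jsperf comparison linked above comes down to this: building a list front-to-back with `unshift()` is O(n²) overall, because each `unshift` shifts every existing element, while `push()` followed by a single `reverse()` is O(n). A minimal illustration:

```javascript
// O(n²): each unshift shifts all prior elements one slot.
function buildWithUnshift(n) {
    const out = [];
    for (let i = 0; i < n; i++) {
        out.unshift(i);
    }
    return out;
}

// O(n): append in order, then reverse once at the end.
function buildWithPushReverse(n) {
    const out = [];
    for (let i = 0; i < n; i++) {
        out.push(i);
    }
    return out.reverse();
}

// Both produce the same result; only the cost differs.
console.log(JSON.stringify(buildWithUnshift(5)));     // [4,3,2,1,0]
console.log(JSON.stringify(buildWithPushReverse(5))); // [4,3,2,1,0]
```

For the small arrays a single rule builds the difference is negligible, but across thousands of calls per lint run it adds up.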
I'm working on
Fix: rewrite no-unused-vars rule (refs #1212)
Okay, I'm reverting 137b4dd and going ahead with the 0.8.0 release. We can continue investigating, but in the meantime, the other changes by @btmills and @michaelficarra have resulted in about a 100-200ms speed improvement, which is pretty sweet.
I modified the token code to use a hashtable + array. I changed all skip functions (getTokenAfter, getTokenBefore, etc.) to have constant big-O, and all of the remaining functions now only do as many iterations as the number of items returned, yet I'm not seeing any performance improvement (at least on my machine). I think that at this point the code is I/O-bound, not CPU-bound. Here's the link to the branch with the code changes: https://github.com/ilyavolodin/eslint/tree/performance
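The array + hashtable layout described in that comment can be sketched roughly like this (illustrative names, not the actual branch code): tokens live in a dense array, and a map from each token's range start to its array index makes `getTokenAfter`/`getTokenBefore` constant-time, since a neighbor is just one array step away.

```javascript
// Dense token array plus a Map keyed by range start for O(1) lookup.
function createTokenStore(tokens) {
    const indexByStart = new Map();
    tokens.forEach((token, i) => indexByStart.set(token.range[0], i));

    return {
        // O(1): find the token's index, then step once in the array.
        getTokenAfter(token) {
            const i = indexByStart.get(token.range[0]);
            return i + 1 < tokens.length ? tokens[i + 1] : null;
        },
        getTokenBefore(token) {
            const i = indexByStart.get(token.range[0]);
            return i > 0 ? tokens[i - 1] : null;
        }
    };
}

// Tokens for a snippet like "var x;"
const tokens = [
    { value: "var", range: [0, 3] },
    { value: "x",   range: [4, 5] },
    { value: ";",   range: [5, 6] }
];
const store = createTokenStore(tokens);

console.log(store.getTokenAfter(tokens[0]).value);  // "x"
console.log(store.getTokenBefore(tokens[2]).value); // "x"
console.log(store.getTokenBefore(tokens[0]));       // null
```

As the comment notes, making these lookups O(1) only helps if token traversal is actually the bottleneck; if the run is dominated by reading files, the constant-factor win disappears into I/O time.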
Update: refactor tokens API (refs #1212)
I think we finished this work, so closing. |
There are slight but measurable performance gains to be had from refactoring how tokens are handled internally by the tokens API. Right now, tokens are indexed in a sparse array, and traversal is done (nearly) linearly based on the range index, with some intelligent skips where possible. PR to follow.
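The sparse-array layout described above can be pictured with a small sketch (illustrative, not ESLint's actual code): tokens are stored at their range-start offsets, so finding "the next token" from an arbitrary position means scanning forward through empty slots, which is the near-linear traversal cost this issue targets.

```javascript
const source = "var x = 1;";

// Sparse array: the index IS the token's range start.
const tokensByStart = [];
tokensByStart[0] = { value: "var" };
tokensByStart[4] = { value: "x" };
tokensByStart[6] = { value: "=" };
tokensByStart[8] = { value: "1" };
tokensByStart[9] = { value: ";" };

// Linear scan from a given offset to the next occupied slot.
function nextTokenFrom(offset) {
    for (let i = offset; i < source.length; i++) {
        if (tokensByStart[i]) {
            return tokensByStart[i];
        }
    }
    return null;
}

console.log(nextTokenFrom(1).value); // "x"
console.log(nextTokenFrom(7).value); // "1"
```

Every skipped empty slot is wasted work, which is why later comments in this thread explore dense arrays plus hash-table indexes, linked lists, and skip lists as alternatives.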