-
Notifications
You must be signed in to change notification settings - Fork 7.3k
path: Improve POSIX path.join performance #6929
Conversation
Thank you for contributing this pull request! Here are a few pointers to make sure your submission will be considered for inclusion. Commit joliss/node@e2b8922 has the following error(s):
The following commiters were not found in the CLA:
You can fix all these things without opening another issue. Please see CONTRIBUTING.md for more information |
Added subsystem and signed CLA. Should be OK now. |
var path = ''; | ||
for (var i = 0; i < arguments.length; i++) { | ||
var segment = arguments[i]; | ||
if (!util.isString(segment)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious. Have you benchmarked what it would be w/o the check?
Also, I know it's not your code, but since you're here. Can you change it to exports.join = function join() {
. Unnamed functions are such a pain to track down with performance benchmarking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious. Have you benchmarked what it would be w/o the check?
I tried it, and it doesn't make a difference (not above the noise at least).
Also, I know it's not your code, but since you're here. Can you change it to exports.join = function join() {. Unnamed functions are such a pain to track down with performance benchmarking.
All the functions in the module are anonymous, so it'd be inconsistent if I changed it only for this one. May I suggest you add names to all the functions in a separate commit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oy. Hadn't noticed that. Sure, make it a separate commit or don't bother. Either way is fine with me. :)
Here is a simple benchmark: ```js var path = require('path'); var foo = 'foo', bar = 'bar'; function myJoin (foo, bar) { // Naive implementation for comparison return foo + '/' + bar; } console.time('join'); for (var i = 0; i < 1000000; i++) { path.join(foo, bar); } console.timeEnd('join'); ``` This commit reduces the time for path.join in the above benchmark from 2 us to .45 us. It is still much slower than the naive myJoin implementation (well below .1 us), but it's progress at least. There seems to be some duplicate work between join, normalize, and normalizeArray, if people want to optimize it further. Note that this only improves the POSIX implementation. The Windows implementation uses .filter closures as well (and also regexes), so there is much room for improvement, but I'm not trying to fix it here. Thanks to @isaacs and @othiym23 for helping with this optimization.
path = normalizeArray(path.split('/').filter(function(p) { | ||
return !!p; | ||
}), !isAbsolute).join('/'); | ||
for (var i = 0; i < segments.length; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Say the first one is an empty string, that means it's an absolute path. So you could force push the first element then start the for loop at 1. Like so:
nonEmptySegments.push(segments[0]);
for (var i = 1; i < segments.length; i++) {
Then the return (isAbsolute ? '/' : '') + path;
isn't necessary at the bottom because [ '', 'a', 'b', 'c' ].join('/') == '/a/b/c'
. So it'll auto resolve whether the path is absolute or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's getting late, and my brain is having trouble parsing whether this is really 100% equivalent in this function. I don't think it makes a difference in performance, and it's harder to figure out what's going on when the first segment of nonEmptySegments
is possibly empty. Should we leave it as is?
Well, you're correct. There are a lot of performance improvements that can be done here. Honestly I'm cool with how it is now if you don't feel like following up with any of the suggestions. Ping me back when you've done everything you feel like touching and I'll sign-off. |
Merged in b9bec20 with reworded commit message. Thanks for the work. :) |
Here is a simple benchmark:
This commit reduces the time for path.join in the above benchmark from
2 us to .45 us. It is still much slower than the naive myJoin
implementation (.2 us), but it's progress at least.
There seems to be some duplicate work between join, normalize, and
normalizeArray, if people want to optimize it further.
Note that this only improves the POSIX implementation. The Windows
implementation uses .filter closures as well (and also regexes), so
there is much room for improvement, but I'm not trying to fix it here.
Thanks to @isaacs and @othiym23 for helping with this optimization.