Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: improved v35 performance #439

Closed
wants to merge 5 commits into from
Closed

Conversation

awwit
Copy link
Contributor

@awwit awwit commented May 6, 2020

For v35, I did a faster parsing of uuid value. This greatly affected the performance.
When parsing a value, it is also validated.

But the bundle size is slightly exceeded…

t’s up to you to decide if this performance increase (and uuid validation) is worth increasing build size.

(Your current solution does not validate uuid strings)

Benchmark master:

uuidv1() x 1,644,613 ops/sec ±1.21% (91 runs sampled)
uuidv1() fill existing array x 7,209,045 ops/sec ±0.82% (91 runs sampled)
uuidv4() x 432,989 ops/sec ±0.91% (92 runs sampled)
uuidv4() fill existing array x 454,731 ops/sec ±0.74% (93 runs sampled)
uuidv3() x 137,231 ops/sec ±0.78% (86 runs sampled)
uuidv5() x 135,756 ops/sec ±1.27% (85 runs sampled)

Benchmark this branch:

uuidv1() x 1,645,002 ops/sec ±1.09% (89 runs sampled)
uuidv1() fill existing array x 7,294,826 ops/sec ±0.82% (92 runs sampled)
uuidv4() x 457,979 ops/sec ±0.76% (90 runs sampled)
uuidv4() fill existing array x 466,131 ops/sec ±0.76% (91 runs sampled)
uuidv3() x 253,029 ops/sec ±1.01% (83 runs sampled)
uuidv5() x 254,426 ops/sec ±1.22% (84 runs sampled)

@awwit
Copy link
Contributor Author

awwit commented May 6, 2020

Bundlewatch result: https://ja2r7.app.goo.gl/R7EhcSrDJemj7yhk6

Please read PR description ^

Copy link
Member

@ctavan ctavan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the bundlesize of neither v1 nor v4 change I would be fine accepting the increase in bundlesize.

However I'm currently a bit undecided whether the increase in code complexity is really worth the improved performance. v3/v5 are used very rarely and we have never received any reports about perf issues with them.

I'm tempted to prefer a concise implementation over performance optimizations in this case.

On the other hand we have good test coverage and can be pretty sure that the code behaves well… 🤷‍♂️

@broofa @LinusU opinions?

src/v35.js Outdated
bytes.push(parseInt(hex, 16));
});
if (uuid.length === 36) {
for (let i = 0; i < uuid.length; ++i) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not i+=2 instead of ++i here and another ++i below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ctavan because not in all cases it is necessary to increase by 2. If we find a -, then increase only by 1.
https://github.com/uuidjs/uuid/pull/439/files#diff-4c3fa2fbfa9d4b899d8be4b79ae6907fR33

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right…

@@ -69,7 +69,7 @@ describe('v5', () => {
assert.equal(v3('hello.example.com', v3.DNS), '9125a8dc-52ee-365b-a5aa-81b0b3681cf6');
assert.equal(v3('http://example.com/hello', v3.URL), 'c6235813-3ba4-3801-ae84-e0a6ebb7d138');
assert.equal(
v3('hello', '0f5abcd1-c194-47f3-905b-2df7263a084b'),
v3('hello', '0f5abcd1-c194-47f3-905b-2df7263a084b'.toUpperCase()),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a good opportunity to break up the huge v3/v5 tests into several smaller test() cases, Ideally 1 or 2 assertions per test block.

Then I believe that the toUpperCase() test would deserve its own test case with a reasonable description like ("accepts uppercase uuid notation for namespace")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also believe we should, just for the sake of consistency, keep the v3 and v5 tests in sync and test the same stuff on both variants of the algorithm.

invalid = true;
}

assert.ok(invalid, 'v3 namespace should be invalid');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -69,7 +69,7 @@ describe('v5', () => {
assert.equal(v3('hello.example.com', v3.DNS), '9125a8dc-52ee-365b-a5aa-81b0b3681cf6');
assert.equal(v3('http://example.com/hello', v3.URL), 'c6235813-3ba4-3801-ae84-e0a6ebb7d138');
assert.equal(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move to assert.strictEqual()

src/v35.js Outdated
uuid.replace(/[a-fA-F0-9]{2}/g, function (hex) {
bytes.push(parseInt(hex, 16));
});
if (uuid.length === 36) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer early return to save one level of indentation:

if (uuid.length !== 36) {
  return bytes;
}

@@ -1,12 +1,53 @@
import bytesToUuid from './bytesToUuid.js';

function hexSymToDecNum(n) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this so much faster than parseInt(n, 16)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, faster than parseInt(uuid.substr(i, 2), 16)
You can check it yourself.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ridiculous 😂 🤷‍♂️

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noticeably faster, otherwise I would have done through parseInt %)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't get me wrong: I didn't mean to question, that your solution is considerably faster, I was just amazed by the fact…

Copy link
Member

@broofa broofa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However I'm currently a bit undecided whether the increase in code complexity is really worth the improved performance. v3/v5 are used very rarely and we have never received any reports about perf issues with them.

I'm tempted to prefer a concise implementation over performance optimizations in this case.

I'm of two minds. As @ctavan notes, this only benefits v3/v5 users who are complaining about perf... which is nobody. So does not justify the extra complexity and code size. For that reason, I'll suggest that we not take this PR.

That said, Iwe regularly get requests for the ability to parse and validate UUIDs. If/when we offer such (E.g. See my abandoned UUID class), I could definitely see this making sense, as parsing perf will be more relevant in that context.

@awwit awwit requested review from ctavan and broofa May 10, 2020 17:23
@awwit
Copy link
Contributor Author

awwit commented May 10, 2020

I completed the validation using the sample from UUID class.

This of course increased the size of the module. But not more than 1 Kb (for complete bundle). Why do you have such strict limits? )

It’s a little strange that you care so much about the size of your module’s bundle. Since it includes a little-used v35.

It would be logical to publish versions separately (v1, v4, v35): @uuid/v1, @uuid/v4, @uuid/v35.

Of course you understand that the new functionality requires more code (and larger size of bundle).

@broofa
Copy link
Member

broofa commented May 11, 2020

Sorry, I think there may have been a misunderstanding. I wasn't suggesting that we add the UUID class' code to this PR. Rather, I was simply using that as an example of where we might leverage your code if/when we decided to add support for a parse/validate feature. Historically, however, we've shied away from that feature (too small an audience, already addressed in other modules).

@awwit
Copy link
Contributor Author

awwit commented May 11, 2020

@broofa now it’s clear ... then I will try to make the fastest parsing of the UUID that will fit into the limits you set.

@ctavan
Copy link
Member

ctavan commented May 11, 2020

@awwit I believe there might still be a misinterpretation: with regards to UUID parsing we‘re really not concerned about bundle size! We‘re more concerned about the feature and API scope of this library! This library is used by millions of projects in the node.js ecosystem to generate UUIDs. If you look at how widespread UUID parser libraries on npm are you will notice that this is an almost negligible fraction compared to the usage of this library.

Hence, we are reluctant of adding little-used features in here that would probably be better off in a separate, special purpose npm module. I think we would be happy to host such an effort in the uuidjs GitHub organization, but for the time being not in this module itself.

And regarding bundlesize: since we moved to ES modules this library supports treeshaking. So people will only get those algorithms in their frontend bundle that they actually use. For Node.js bundle size is not a concern, hence no need to publish separate npm modules.

@awwit
Copy link
Contributor Author

awwit commented May 11, 2020

@ctavan good =) then which option will suit you?

In this PR, initially I did only a quick parsing of the UUID and checking for string validation by template 00000000-0000-0000-0000-000000000000. To match the exception "namespace must be uuid string or an Array of 16 byte values"

It seemed correct to me to correspond to the description of the exception. =)

@broofa
Copy link
Member

broofa commented May 12, 2020

Closing, with apologies in advance if that comes across as rude. That is not my intent. To recap the reasons for this:

  • The increased code complexity is not warranted. It doesn't offer enough benefit to the existing uuid audience (mainly because this only affects ~1% of users that use v3/v5 uuids).
  • High performance parsing will be more interesting if/when it is exposed as part of a module's public API.
  • Currently we (uuidjs org) don't have plans for such a module. If that changes, we should consider using this implementation.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants