Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Set to improve performance of union #298

Merged

Conversation

matthemsteger
Copy link
Contributor

This one might be a bit more controversial. Doing more performance analysis the next biggest bottleneck was union. Now this algorithm isn't bad and I did not want to get too into micro-optimization of this one, but Set is considerably faster on modern browsers and node.

The example I have went from 12 seconds to 9 seconds.

@coveralls
Copy link

coveralls commented Jul 4, 2020

Coverage Status

Coverage remained the same at 100.0% when pulling 1514352 on matthemsteger:features/improve-union-perf-modern into 3e141b8 on jneen:master.

@wavebeem
Copy link
Collaborator

wavebeem commented Jul 4, 2020

This one looks great! I appreciate you adding code to check for ES2015 features since this library supports ancient environments still.

Would you mind adding a test that set Set = undefined and Array.from = undefined so we can test both code paths and keep the test coverage up?

@matthemsteger
Copy link
Contributor Author

matthemsteger commented Jul 4, 2020

Hopefully this is OK. Instead of the complexity of trying to dynamically do it during tests and figure out which tests end up calling this internal function, I am just going to run the tests twice, once (presumably) with Set and once deleting Set. I imagine one could refactor this later to delete more things out of the environment if it was beneficial to use some more features.

Oh, and the reason why we can't just before delete Set and after put it back on certain tests is because at test loading chai is reading Set available, and if it stops becoming available, then chai code fails.

@wavebeem
Copy link
Collaborator

wavebeem commented Jul 5, 2020

Oh, and the reason why we can't just before delete Set and after put it back on certain tests is because at test loading chai is reading Set available, and if it stops becoming available, then chai code fails.

oh jeez, i get it, but that's pretty annoying! i guess an alternative would be to use "dependency injection" and have like,

Parsimmon.Set = typeof Set !== 'undefined' ? Set : undefined;

and then we could set/unset that value in tests so we don't have to run everything twice, but also don't mess with the global value... i'm honestly kinda impressed that nyc is still doing code coverage with mocha running twice, but i don't know a lot about how that works under the hood.

@matthemsteger
Copy link
Contributor Author

That's a good idea. Let me try that.

@matthemsteger matthemsteger force-pushed the features/improve-union-perf-modern branch from fc48041 to 08ec6fc Compare July 5, 2020 17:38
@wavebeem
Copy link
Collaborator

wavebeem commented Jul 5, 2020

Thanks, that looks great! I just had one last idea: maybe we could rename Parsimmon.Set to Parsimmon._Set so it's more clearly an implementation detail and not part of the API, just in case people stumble upon it?

globalThis.Parsimmon = require("../..");
assert.isNotOk(globalThis.Parsimmon.Set);
// go back to original state
delete require.cache[require.resolve("../..")];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to do Parsimmon.Set = undefined and Parsimmon.Set = origSet rather than busting the require cache and reloading it? Seems like it would be shorter, but I'm not sure if there's a caveat I'm missing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is basically testing https://github.com/jneen/parsimmon/pull/298/files#diff-d6843052e5b691e32d17eb1fb2508987R1402

This is testing the initial require of Parsimmon when Set is undefined. That last : undefined is only run during require.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that makes sense. I remember this was a bit of a sticking point with the Binary module inside Parsimmon. We solved that with a helper function to check instead of hard coding it at initialization (as Chai apparently does as well).

function bufferExists() {
  return typeof Buffer !== "undefined";
}

That way we can test the code path by just stubbing it temporarily:

  context("Buffer is not present.", function() {
    var buff;
    before(function() {
      buff = global.Buffer;
      global.Buffer = undefined;
    });

    after(function() {
      global.Buffer = buff;
    });

    it("Disallows construction.", function() {
      assert.throws(function() {
        Parsimmon.Binary.bitSeqObj(0xf);
      }, /buffer global/i);
    });
  });

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that will work with chai. That was my original solution (because the check for Set was in the function it was essentially lazy) .. but the problem was that chai itself reads Set when it is required, and then when you assert with deepEqual against an object, it throws.

TypeError: Cannot read property 'prototype' of undefined
      at typeDetect (node_modules/type-detect/type-detect.js:285:41)
      at extensiveDeepEqual (node_modules/deep-eql/index.js:183:22)
      at Object.deepEqual [as eql] (node_modules/deep-eql/index.js:106:10)
      at Proxy.assertEql (node_modules/chai/lib/chai/core/assertions.js:1085:11)
      at Proxy.methodWrapper (node_modules/chai/lib/chai/utils/addMethod.js:57:25)
      at Function.assert.deepEqual.assert.deepStrictEqual (node_modules/chai/lib/chai/interface/assert.js:232:56)
      at Context.<anonymous> (test/core/seq.test.js:73:12)
      at processImmediate (internal/timers.js:458:21)

Alternately, I could essentially leave the runtime code as it was (check for Set inside the function using it), and then use require.cache and require to bust chais cache instead and force it to load thinking Set is not in the environment, that should allow for something like above.

Thoughts?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I didn't quite explain that fully, but I meant that approach plus using Parsimmon._Set to store the value so we can dependency inject it inside tests, thus not mucking up Chai.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think I see where you are going with this. Basically lazily set whether Set exists to avoid having to test the initial setting that is happening now during require. We don't actually need to store a reference to Set, just whether we found it exists.

function setExists() {
  if (Parsimmon._supportsSet !== "undefined") {
    return Parsimmon._supportsSet;
  }
  var exists = typeof Set !== "undefined";
  Parsimmon._supportsSet = exists;
  return exists;
}

Both branches of code will be tested as we test these scenarios. Sound good?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think we're on the same page now. Thanks for your patience with testing on this!

@wavebeem
Copy link
Collaborator

wavebeem commented Jul 8, 2020

This looks great! Thanks for taking the time to get the testing as nice as possible, I really appreciate it. I should probably have some time tomorrow to merge this, update the changelog, and release.

@wavebeem wavebeem merged commit 9561a44 into jneen:master Jul 9, 2020
@wavebeem
Copy link
Collaborator

wavebeem commented Jul 9, 2020

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants