Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Batch option revisited #453

Closed
rubys opened this issue Jun 25, 2015 · 8 comments
Closed

Batch option revisited #453

rubys opened this issue Jun 25, 2015 · 8 comments

Comments

@rubys
Copy link
Member

rubys commented Jun 25, 2015

Issue #11 suggests phantomjs, was closed with an enigmatic comment indicating this was solved, yet the User's Guide suggests otherwise.

I'm suggesting that this be revisited given the current state of io.js in general, and jsdom in particular. Proof of concept code:

var jsdom = require("jsdom");

jsdom.env({
  url: "http://rubys.github.io/breakup/specs/html-forms/",
  features: {
    FetchExternalResources   : ["script"],
    ProcessExternalResources : ["script"],
    SkipExternalResources: false
  },
  done: function(errors, window) {
    setTimeout(function() {
      var source = window.document.children[0].innerHTML;
      window.close();
      console.log(source);
      process.exit(1);
    }, 3000);
  }
})

Notes:

  • On my machine, the above script takes around 4 minutes (yes, minutes) to run with nearly all of that time with one CPU pegged. Despite this amount of time, until I add the setTimeout call, the output corresponded to the DOM after an initial load (i.e., before respec ran). Perhaps @dominic or @tmpvar might have some idea how to speed this up?
  • I'm exploring this my indexer would benefit from being able to deal with the output of respec instead of having to re-implement the logic for determining things such as id values.
@darobin
Copy link
Member

darobin commented Jun 26, 2015

In my experience jsdom hasn't yet reached the point where it can handle this (yet). The user guide is out of date, the tool that it points to is actually used a fair bit and generally works. By default it times out at ten seconds; for a large document on an underpowered machine you might need more — if so you can set that on the command line. On my laptop it generates the document in five seconds, which isn't wonderful but isn't unusable either.

@domenic, @tmpvar: This might be an interesting benchmark with which to investigate jsdom speed? I'm used to it being slower than phantomjs, but not by this much.

@domenic
Copy link

domenic commented Jun 26, 2015

Despite this amount of time, until I add the setTimeout call, the output corresponded to the DOM after an initial load (i.e., before respec ran).

Right, done is called at window.onload time, which I believe in browsers is before ReSpec runs. (If that's not the case then please file a bug.)

This might be an interesting benchmark with which to investigate jsdom speed? I'm used to it being slower than phantomjs, but not by this much.

Yes, definitely! Summoning @Joris-van-der-Wel, our benchmarking expert.

@rubys I assume you're on the latest version of jsdom and io.js?

@rubys
Copy link
Member Author

rubys commented Jun 26, 2015

I assume you're on the latest version of jsdom and io.js?

jsdom@5.4.3
node@2.3.1

CPU: Intel(R) Core(TM) i3 CPU 550 @ 3.20GHz

@domenic
Copy link

domenic commented Jun 26, 2015

Yeah, in that case it's just embarassing. I'd bet we're hitting some O(n^2) case and this won't be too hard to fix... @Joris-van-der-Wel has been working recently on exactly that kind of thing.

@darobin
Copy link
Member

darobin commented Jun 26, 2015

On 27/06/2015 00:08 , Domenic Denicola wrote:

Right, done is called at window.onload time, which I believe in browsers
is before ReSpec runs. (If that's not the case then please file a bug.)

IIRC ReSpec should run at DOMContentLoaded (if not it's certainly a
bug); but it's doing a bunch of its own async stuff and on such
documents I'd expect it to complete after load. You can detect ReSpec
being finished when it publishes an end-all event.

Robin Berjon - http://berjon.com/ - @robinberjon

@rubys
Copy link
Member Author

rubys commented Jun 27, 2015

Yeah, in that case it's just embarassing

@domenic On the plus side, you have a small, real world test case to work with. And just so it is clear: re-writing the test case to perform better is fair game. At a minimum, it should take advantage of the end-all event that @darobin mentioned.

@Joris-van-der-Wel
Copy link

The test takes about 60 seconds for me
Some things that appear to be bottlenecks in jsdom:

  • compareDocumentPosition() (46%)
  • NamedPropertiesTracker.maybeUntrack() (43%) :: you end up here when an Element is removed from the Document or if an id or name attribute changes. This is used for named property support (e.g. window.foo.outerHTML = '<div id="foo"></div>'). It might be possible to improve this a bit, however we really need ES6 Proxy. If I disable this feature the test takes 32 seconds
  • innerHTML setter (2%, but still a full second in total)
  • getElementsByName, getElementsByClassName, childNodes and similar (4%)

@marcoscaceres
Copy link
Member

Snapshots are generated using Electron. It gets the job done nicely - even thought it's a bit huge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants