Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistency with jsdom.env #792

Closed
wants to merge 12 commits into from
Closed

Inconsistency with jsdom.env #792

wants to merge 12 commits into from

Conversation

Sebmaster
Copy link
Member

jsdom.env is normally used to be the easiest access point to parse a website. However it doesn't load external scripts by default. By supplying jsdom.env with the proper features config it is possible to load external scripts, however, some bug in jsdom.env prevents that the referenced external script is loaded and processed. If, however, a src or scripts reference is supplied with the config, the referenced external scripts on that page are also executed.

I really like the jsdom.env api and the possibility to provide a features config object is really elegant, this inconsistency is annoying however.

Note: Inconsistency is not fixed with this PR (at the moment). I just want to get some quick comment if this should be properly supporter before I search further to fix this.

@domenic
Copy link
Member

domenic commented Jun 11, 2014

I see, yeah, that sounds bad. Fixing the inconsistency would be appreciated.

@Sebmaster
Copy link
Member Author

Okay, I dug into it a bit: the external code is executed in both cases, however the done() callback doesn't wait for eventual resource loads.

I assume the added src/script "blocks" the done() callback until the onload of the script gets called, however this onload (I assume) waits until all other scripts before it are loaded.

Not sure how to progress further from here. Since the resource loader gets triggered, maybe it'd make sense to delay the done() callback until all resources are loaded? However there's no singleton resourceloader, so that proves somewhat difficult. Any pointers?

@domenic
Copy link
Member

domenic commented Jun 11, 2014

The done() callback should be called ASAP, and not wait for any scripts to execute. You should use window.onload for that.

@Sebmaster
Copy link
Member Author

Oh, I think my problem might be solved with that, however done() waits for the load of the scripts provided in the config object, so it's still inconsistent. And removing that delay would probably break backwards-compatibility.

@domenic
Copy link
Member

domenic commented Jun 11, 2014

Meh, we're pre-1.0, we can break backward compat with a simple minor version number bump ;).

@Sebmaster
Copy link
Member Author

That'd also break a lot of test though I'd imagine.

What about another idea:
We introduce 2 new properties for the config object: ready() and load().

  • ready would basically be called immediately after the HTML is parsed, but before any other resources are loaded. This allows one to modify the window/document prototypes, shim whatever is necessary, etc.
  • done() would stay like it is. It'd get called either right after ready (if no scripts/src are provided) or on the onload of the last provided script (I think that'd be just before window.onload).
  • Thirdly, load() would basically be a proxy for window.onload.

This would also not break backward compatibility.
Better idea or too confusing?

@domenic
Copy link
Member

domenic commented Jun 11, 2014

I'd rather not be so concerned about backward compatibility, so in that case I'd probably remove done(), since as you say it's just plain inconsistent.

@Sebmaster
Copy link
Member Author

I added a test for the api I'd propose.

Since adding a callback just after the HTML is parsed but before any scripts are executed isn't possible, I think ready should fire as soon as the window is created by createWindow() but before the document is appended to it (I followed the callstack a bit but I'm on mobile currently so I can't link it). This allows one to define custom APIs the website might access (like cookieEnabled) without patching jsdom itself.

There's a problem with removing done though: if an error occurs while creating the window (non-existent URL for example), ready will be called with the error. To allow one to handle those cases.

Here we now have two choices:

  • load shouldn't be called since the error is already handled. This would break the really simple api uses since one would always have to define both done and load to catch all errors and process the document further.
  • load will be called with that same error, which would make it impossible to differentiate between load and script errors.

I think both of those choices aren't desirable since env was created to allow really simple uses of the api. I'd therefore propose the following:
Go with the first solution. If one would like to go the complicated route they'll have the full power at their hands.

Additionally, leave done in, but change the semantics a bit: it's basically load, but fires on both, load as well as script errors. This'd also have the added benefit of ensuring (most of) the tests don't break if we introduce that change.

@Sebmaster Sebmaster mentioned this pull request Jun 15, 2014
@Sebmaster
Copy link
Member Author

Okay, all of the existing tests pass now with the new API.

Any specific requirements for commit squashing, nitpicks or last-minute API changes?
I'm thinking of adding 2-3 more tests to test for edge-cases in window.load behaviour. Any specific suggestions?

@domenic
Copy link
Member

domenic commented Jun 16, 2014

I feel like i kind of got lost here. Can you maybe include documentation in the pull request too, or at least outline what the three (four?) different callbacks do in this thread?

@Sebmaster
Copy link
Member Author

See PR-792. I hope this describes it clearly.

jsdom.env is normally used to be the easiest access point to parse a
website. However it doesn't load external scripts by default.
By supplying jsdom.env with the proper features config it is possible to
load external scripts, however, some bug in jsdom.env prevents that the
referenced external script is loaded an processed. If, however, a src or
scripts reference is supplied with the config, the referenced external
scripts on that page are also executed.
The create callback will be called, as soon as the window for the
document is created, but before any scripts are executed in it. This
allows one to fix the window object with custom functions and missing
jsdom properties without patching jsdom.
@Sebmaster
Copy link
Member Author

Rebased on master too now, apparently I missed a failing test before (? Jenkins didn't report anything...).

Anything missing?

@domenic
Copy link
Member

domenic commented Jun 18, 2014

OK, I took a look. I even updated the wiki to be clearer in a few ways: https://github.com/tmpvar/jsdom/wiki/PR-792 (see diffs as well).

Remaining questions/comments:

however that it is not guaranteed that the document is already processed, so don't access window.document in create.

Is the document never ready, or sometimes ready, or partially ready? This sentence as it stands gives clear-cut advice ("don't") without a good explanation.

Will create ever be called with more than one error? If not, would be good to stop making it an array.

Why does load take an errors parameter? Couldn't they just use window.onerror? Maybe it would be better to do { create, load, error } where load and error are directly passed to window.addEventListener('load', ...) and window.addEventListener('error', ...)? It would be really nice to avoid arrays of errors.

Also, I think created, loaded, done is better than create, load, done.

@Sebmaster
Copy link
Member Author

Is the document never ready, or sometimes ready, or partially ready?

It's never ready, the HTML isn't yet written in the doc when created is called.

Will create ever be called with more than one error? If not, would be good to stop making it an array.

created should already never be called with an error array. Did you notice that somewhere?

Couldn't they just use window.onerror?

I think we may have a problem with #587 here. I assumed window.onerror doesn't work?

Also, I think created, loaded, done is better than create, load, done.

Done.

@domenic
Copy link
Member

domenic commented Jun 18, 2014

It's never ready, the HTML isn't yet written in the doc when created is called.

OK cool, will update the wiki.

created should already never be called with an error array. Did you notice that somewhere?

Oh no, I just assumed from the name of the parameter (errors). I suppose I could read the code...

I think we may have a problem with #587 here. I assumed window.onerror doesn't work?

Well that's depressing. I guess it's not enough of a big deal to block this PR on, since after all the aggregated script execution errors is kind of nice.

It's still weird to me that we have something named loaded, mimicking the 'load' event's name, which gets (errors, window) as parameters. I wonder if scriptsExecuted or something would be better? Meh, not really.

@Sebmaster
Copy link
Member Author

OK cool, will update the wiki.

Done already.

Oh no, I just assumed from the name of the parameter (errors). I suppose I could read the code...

Sorry, corrected there.

I wonder if scriptsExecuted or something would be better?

Since window.onload is executed after scripts have executed too, I think a similiar naming is an advantage, but you're the maintainer, your choice.

@domenic
Copy link
Member

domenic commented Jun 18, 2014

Fact-check this paragraph I just updated?

If the error argument is non-null, it will contain whatever loading error caused the window creation to fail; in that case window will be null. Note that unlike loaded or done, script errors will not show up here, since when created is called the document's scripts haven't been evaluated yet.

@Sebmaster
Copy link
Member Author

in that case window will be null.

undefined, but yeah, that's correct.

@@ -243,7 +262,13 @@ function processHTML(config) {
var errors = [];

if (!window || !window.document) {
return callback(new Error('JSDOM: a window object could not be created.'));
if (config.created) {
config.created(new Error('JSDOM: a window object could not be created.'));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, I wonder when this could happen. Not a great error message.

@@ -338,7 +374,7 @@ function getConfigFromArguments(args, callback) {
});
}

if (!config.done) {
if (!config.done && !config.created && !config.loaded) {
throw new Error('Must pass a "done" option or a callback to jsdom.env.');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to update this error message.

@domenic
Copy link
Member

domenic commented Jun 18, 2014

Looking at the code, it looks like done will sometimes be called with a single error (for load errors), and sometimes with an array of errors. I don't believe this was the case before, and we certainly shouldn't continue to do so if it was. It should always call back with an array.

@Sebmaster
Copy link
Member Author

https://github.com/tmpvar/jsdom/pull/792/files#diff-3256400745dfee073b87fb7a45d0aaedR228

Haven't changed it, just guarded with an if statement.

Will change to use an array instead. Actually, the load and script errors have a different structure, which makes them distinguishable. Do we want to mimick the script error structure? Or just change to array and be done with it?

@domenic
Copy link
Member

domenic commented Jun 18, 2014

Hmm, ick. I think we should change to array though. If I were a consumer and wanted to distinguish, I'd just use created + loaded.

created: function (err, window) {
t.ifError(err);

t.notEqual(window.a, 'b');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use double quotes in new tests.

@@ -365,3 +365,35 @@ exports["with scripts and content retrieved from URLs"] = function (t) {
});
});
};


exports["should call callbacks correctly"] = function (t) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this whole window.a thing is unnecessarily confusing. Maybe:

  • created: window.isCreated = true; assert others false
  • script: window.isCreatedBeforeScriptRuns = window.isCreated + window.scriptHasRun = true
  • loaded: assert both are true
  • done: assert both are true

@domenic
Copy link
Member

domenic commented Jun 18, 2014

OK gotta head to a meetup for the next few hours but will likely merge + release this later tonight.

@domenic
Copy link
Member

domenic commented Jun 19, 2014

Merged into the 1.0 branch :O

Take a look at what's going on over there. I figure it's finally time we settle on a stable API.

@Sebmaster Sebmaster mentioned this pull request Jun 19, 2014
@domenic domenic closed this Jun 29, 2014
@Sebmaster Sebmaster deleted the env-external-scripts branch July 1, 2014 00:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants