tagName and nodeName getters overridden when parsed with jsdom.jsdom, causing various problems with XML DOM #393

Closed
jbeard4 opened this issue Jan 31, 2012 · 5 comments

Comments

@jbeard4

jbeard4 commented Jan 31, 2012

There are various problems when parsing XML using jsdom.jsdom. Here is a reduced test case:

var jsdom  = require("jsdom");

var doc = jsdom.jsdom('<foo><bar/></foo>',jsdom.dom.level3.core);

console.log(doc.documentElement);   //this will be undefined
var f = doc.getElementsByTagName('foo')[0];   //getElementsByTagName still works
console.log(f.tagName);    //this will be "FOO" instead of "foo"

Basically, doc.documentElement will be undefined, and the tagName and nodeName getters will always return the uppercase form.

I traced the code, and found that the problem is most likely in jsdom/browser/index.js, lines 550 - 561, where tagName and nodeName get overridden:

https://github.com/tmpvar/jsdom/blob/master/lib/jsdom/browser/index.js#L550

This will override the getters defined in the jsdom.dom.level1.core module, which normalize case only for HTML documents and return the name unchanged (here, lowercase) for non-HTML documents.

Deleting lines 550 - 561 seems to fix both of these problems: document.documentElement is set to element foo, and f.tagName is "foo" (lowercase).

@jbeard4
Author

jbeard4 commented Jan 31, 2012

Also, I think I found some dead code:

https://github.com/tmpvar/jsdom/blob/master/lib/jsdom.js#L58

It looks like browser.HTMLDocument will always be populated in browserAugmentation:

https://github.com/tmpvar/jsdom/blob/master/lib/jsdom/browser/index.js#L325

This means that the else branch ("new browser.Document(options);") will never be called:

https://github.com/tmpvar/jsdom/blob/master/lib/jsdom.js#L60

This means that an HTMLDocument will always be initialized, and a Document will never be initialized, even if XML is passed in. I'm not sure what implications this has.

jbeard4 added a commit to jbeard4/jsdom that referenced this issue Feb 1, 2012
…n, as they override getters defined in dom level1, which normalize case only when working with HTML (as opposed to XML). This solves issue jsdom#393.
@znerol

znerol commented Feb 6, 2012

I have the following problem:

var jsdom = require("jsdom");
var dom = jsdom.level(3, 'core');
var doc, elem;

doc = new dom.Document();
elem = doc.createElement('xml');
doc.appendChild(elem);
console.log(elem.nodeName);

jsdom.jsdom("<html><head></head><body></body></html>");

// After HTML parsing, the same fragment as above returns upper case nodeName
doc = new dom.Document();
elem = doc.createElement('xml');
doc.appendChild(elem);
console.log(elem.nodeName);

Output:

xml
XML

Is this related to the case given in this issue?

@jbeard4
Author

jbeard4 commented Feb 6, 2012

Yes, I think so, because when you call jsdom.jsdom, it calls browserAugmentation, which overrides the getters for tagName and nodeName, such that they always return uppercase.

@tmpvar
Member

tmpvar commented Feb 17, 2012

Agreed, we need to properly handle xml. Until now, there have been few users of the xml portion of jsdom. I sense a change a brewin'

@domenic
Member

domenic commented Oct 5, 2012

If you can get some tests in for this, it seems like a good thing to merge in.

Sebmaster pushed a commit that referenced this issue Dec 2, 2015
Fixes #393. Fixes #651. Fixes #415 (wasn't quite applicable). Fixes #1276.