MSHTML: XHTML completely broken #3542

nvaccessAuto opened this Issue Sep 23, 2013 · 4 comments



Reported by jteh on 2013-09-23 00:52
XHTML documents break very badly with the MSHTML engine, including in IE. The cause is that nodeName returns lower case names in XHTML (as per the spec), whereas our code expects upper case, which is what normal HTML returns.

The simplest way to fix this is probably to upper case the node name before we use it in both the Python and C++ MSHTML code. (We could lower case everything, but that would mean a much larger diff.) For searching, I guess we'll need to search for both upper and lower case versions.
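As a rough illustration of the approach described above, here is a minimal Python sketch. The function names normalize_node_name and search_tag_variants are hypothetical and are not taken from the NVDA source; the sketch only assumes that the node name arrives as a plain string (or None) and that searches match tag names case-sensitively.

```python
def normalize_node_name(node_name):
    """Return the node name in upper case, or None if there is no name.

    Hypothetical helper: HTML documents report "DIV", XHTML documents
    report "div"; after this call both read "DIV".
    """
    if not node_name:
        return None
    return node_name.upper()


def search_tag_variants(tag_names):
    """Return every tag name in both upper and lower case.

    Hypothetical helper for searches that compare tag names
    case-sensitively and so must try both spellings.
    """
    variants = []
    for name in tag_names:
        for candidate in (name.upper(), name.lower()):
            if candidate not in variants:
                variants.append(candidate)
    return variants
```

With something like this, existing comparisons against upper case names could stay as they are, and only the search code would need the extra lower case variants.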

Note that in the MSHTML vbuf backend, we currently test for the math tag in lower case, as it always seems to appear like this, at least with MathPlayer installed. This will of course need to be changed if the node name is converted to upper case.
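To illustrate the math tag point, a small Python sketch follows. The real check lives in the C++ vbuf backend; the names MATH_TAG and is_math_node here are hypothetical and only show how the comparison changes once the node name is upper cased.

```python
# Hypothetical constant: the tag name as it reads after upper casing.
MATH_TAG = "MATH"


def is_math_node(node_name):
    """Return True if the node name is the math tag, in any case.

    A comparison against the lower case literal "math" would stop
    matching once node names are forced to upper case, so the name is
    upper cased here and compared against the upper case constant.
    """
    return bool(node_name) and node_name.upper() == MATH_TAG
```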

Comment 1 by Michael Curran <mick@... on 2013-10-02 08:20
In [afd0995]:

MSHTML: support xhtml documents. Specially force nodeName fetched from HTML nodes to uppercase as it seems that case can change depending on whether it's HTML or XHTML. Re #3542

Comment 2 by Michael Curran <mick@... on 2013-10-02 08:35
In [746c47c]:

Merge branch 't3542' into next. Incubates #3542

Added labels: incubating

Comment 3 by Michael Curran <mick@... on 2013-10-15 09:48
In [bcd419f]:

Merge branch 't3542'. Fixes #3542

Removed labels: incubating
State: closed

Comment 4 by mdcurran on 2013-10-15 09:48
Milestone changed from None to 2013.3

nvaccessAuto added this to the 2013.3 milestone on Nov 10, 2015
