fix: Only use HTML rules if mimeType matches #338
Merged
karfau added a commit to karfau/xmldom that referenced this pull request (Feb 15, 2022)
since we can not rely on it being present in all supported runtimes. Even though the interface is the same as `Object.assign`, it behaves slightly differently from the one provided by browsers. This was extracted from xmldom#338 to support development in xmldom#367
65c8368 to 04c9f26
9072016 to 69ddd52
I'm very eager to test this when merged and a pre-release becomes available.
@weiwu-zhang Thank you for the feedback.
973a6da to 52acd24
- always set `Document.type` and `Document.contentType`
- `Document.createElement` now applies proper HTML casing and (X)HTML namespacing

https://dom.spec.whatwg.org/#dom-domimplementation-createhtmldocument
https://dom.spec.whatwg.org/#dom-document-createelement
when `mimeType` is `text/html`. The `mimeType` can now optionally be passed to the `DOMHandler` constructor. Documented `DOMHandler` constructor and all properties.

- For XML documents the XHTML and SVG mime types are preserved as expected.
- `Document.documentURI` is no longer initialized with the undocumented `Locator.systemId` value.
- Deprecate `DOMParserOptions.domBuilder` since state would be preserved between calls to `DOMParser.parseFromString`, which can have unexpected side effects, especially since we are now using the `DOMHandler` to manage the mimeType and defaultNamespace.
to be able to copy from options provided to `DOMParser`
Instead of accessing `this.options` in `DOMParser.parseFromString`, the default values are now applied in the constructor. Since the locator passed to `options` is no longer being modified, the type of the option was changed to boolean. There is no change in behavior in this commit, since truthy and falsy values are accepted as well.
Instead of accessing `this.options` in `DOMParser.parseFromString`, the default values are now applied in the constructor.
Instead of accessing `this.options` in `DOMParser.parseFromString`, use `this.errorHandler`.
which points to a class instead of an instance and is only meant for internal testing. BREAKING CHANGE: If you used to configure `DOMParserOptions.domBuilder`, you might be able to configure the `domHandler` instead, but this should be avoided since it is only there for testing purposes.
All options are now taken care of by the constructor and are available as individual properties. Most are marked as `readonly`, some are `private`. BREAKING CHANGE: If you used `DOMParser.options` after creating an instance, you can still read the individual properties from the instance, but there is no longer a way to mutate them, so you need to pass the required options when constructing the parser.
in HTML docs or namespaces
86cab47 to ceff927
and drop warning for boolean attributes in HTML
BREAKING CHANGE: The following methods no longer allow a (non spec compliant) boolean argument to toggle "HTML rules":

- `XMLSerializer.serializeToString`
- `Node.toString`
- `Document.toString`
b1babe9 to 48f49be
In the living specs for parsing XML and HTML that this library is trying to implement, there is a distinction between the types of documents being parsed: XML documents and HTML documents.
Quite a few rules differ between XML and HTML documents when parsing, constructing, and serializing them.
So far xmldom was always "detecting" whether "the HTML rules should be applied" by looking at the current namespace. So from the first time the HTML default namespace (`http://www.w3.org/1999/xhtml`) was found, every node was treated as being part of an HTML document. This misconception is the root cause of quite a few reported bugs.

BREAKING CHANGE: HTML rules are no longer applied just because of the namespace, but require the `mimeType` argument passed to `DOMParser.parseFromString(source, mimeType)` to match `'text/html'`. Doing so implies all rules for handling casing for tag and attribute names when parsing, creating nodes, and searching nodes.

BREAKING CHANGE: Correct the return type of `DOMParser.parseFromString` to `Document | undefined`. In case of parsing errors it was always possible that "the returned `Document`" has not been created. If you are using TypeScript you now need to handle those cases.

BREAKING CHANGE: The instance property `DOMParser.options` is no longer available; instead use the individual `readonly` property per option (`assign`, `domHandler`, `errorHandler`, `normalizeLineEndings`, `locator`, `xmlns`). Those also provide the default value if the option was not passed. The `locator` option is now just a boolean (default remains `true`).

BREAKING CHANGE: The following methods no longer allow a (non spec compliant) boolean argument to toggle "HTML rules":

- `XMLSerializer.serializeToString`
- `Node.toString`
- `Document.toString`
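
The `mimeType` rule described above can be pictured with a minimal sketch. This is not xmldom's implementation: `documentTypeFor` and `normalizeTagName` are hypothetical helper names, and the lowercasing step is an assumption modeled on how the DOM spec's `createElement` treats HTML documents.

```javascript
// Hypothetical sketch, not xmldom's actual code.
// The document type depends on the mimeType alone, never on the namespace
// of the nodes that happen to appear in the source.
const HTML_MIME_TYPE = 'text/html';

function documentTypeFor(mimeType) {
  return mimeType === HTML_MIME_TYPE ? 'html' : 'xml';
}

// Assumption: under HTML rules, tag names are ASCII-lowercased;
// under XML rules the casing is preserved as written.
function normalizeTagName(tagName, mimeType) {
  if (documentTypeFor(mimeType) !== 'html') {
    return tagName;
  }
  return tagName.replace(/[A-Z]/g, (c) => c.toLowerCase());
}

console.log(normalizeTagName('DIV', 'text/html')); // prints "div"
console.log(normalizeTagName('DIV', 'application/xhtml+xml')); // prints "DIV"
```

Note that an XHTML source still counts as an XML document in this sketch, even though its elements live in the HTML namespace.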
The following interfaces have been implemented:

- `DOMImplementation` now implements all methods defined in the DOM spec, but not all of the behavior is implemented (see docstring):
  - `createDocument` creates an "XML Document" (prototype: `Document`, property `type` is `'xml'`)
  - `createHTMLDocument` creates an "HTML Document" (type/prototype: `Document`, property `type` is `'html'`).
    - when `false` is passed, no child nodes are created
- `Document` now has two new readonly properties as specified in the DOM spec:
  - `contentType`, which is the mime-type that was used to create the document
  - `type`, which is either the string literal `'xml'` or `'html'`
- `MIME_TYPE` (`/lib/conventions.js`):
  - `hasDefaultHTMLNamespace` tests if the provided string is one of the mime types that imply the default HTML namespace: `text/html` or `application/xhtml+xml`
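
As a rough illustration of how `contentType`, `type`, and `hasDefaultHTMLNamespace` relate, here is a self-contained sketch. It does not reproduce `/lib/conventions.js`: the key names inside `MIME_TYPE`, the `describeDocument` helper, and the `null` default namespace for other mime types are assumptions made for this example.

```javascript
// Hypothetical sketch, not the contents of /lib/conventions.js.
const HTML_NAMESPACE = 'http://www.w3.org/1999/xhtml';

// Assumption: the key names below are illustrative only.
const MIME_TYPE = Object.freeze({
  HTML: 'text/html',
  XML_XHTML_APPLICATION: 'application/xhtml+xml',
});

// True for the two mime types that imply the default HTML namespace.
function hasDefaultHTMLNamespace(mimeType) {
  return (
    mimeType === MIME_TYPE.HTML ||
    mimeType === MIME_TYPE.XML_XHTML_APPLICATION
  );
}

// Hypothetical helper: derives the two readonly Document properties
// plus the default namespace from the mime type.
function describeDocument(mimeType) {
  return {
    contentType: mimeType,
    type: mimeType === MIME_TYPE.HTML ? 'html' : 'xml',
    // Assumption: no default namespace for any other mime type.
    defaultNamespace: hasDefaultHTMLNamespace(mimeType) ? HTML_NAMESPACE : null,
  };
}

console.log(describeDocument('application/xhtml+xml'));
```

The point of the sketch is that `application/xhtml+xml` gets the HTML default namespace while keeping `type` set to `'xml'`, so namespace and document type are decided independently.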