Add a new preserveDocumentType parser option #32

Merged: 1 commit, Jan 31, 2023

@rgrove (Owner) commented on Jan 30, 2023

When preserveDocumentType is true, the parsed document includes an XmlDocumentType node representing the document type declaration, if there is one. When false, any document type declaration encountered is discarded. The default is false, which matches the behavior of previous versions.

Note that the parser only includes the document type declaration in the node tree; it doesn't actually validate the document against the DTD, load external DTDs, or resolve custom entity references.
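
Because the declaration is just another node in the tree, it can be looked up by its type instead of by position, which may help when comments or other nodes precede it. The following is a minimal sketch under stated assumptions: findDoctype is a hypothetical helper (not part of @rgrove/parse-xml), and sampleDoc is a hand-written stand-in shaped like a serialized document, not real parser output.

```javascript
// Hypothetical helper, not part of @rgrove/parse-xml.
// Works on any object with a `children` array whose items have a `type` field.
function findDoctype(doc) {
  return doc.children.find((node) => node.type === 'doctype') ?? null;
}

// Hand-written stand-in for a parsed document (assumed shape, for illustration).
const sampleDoc = {
  type: 'document',
  children: [
    { type: 'comment', content: ' generated file ' },
    { type: 'doctype', name: 'root', systemId: 'root.dtd' },
    { type: 'element', name: 'root', attributes: {}, children: [] },
  ],
};

const doctype = findDoctype(sampleDoc);
console.log(doctype ? doctype.name : 'no doctype preserved');
// => root
```

Returning null when no doctype node is found covers both a document without a declaration and one parsed with the option left at its default of false.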

This option is useful if you want to preserve the document type declaration when later serializing a document back to XML. Previously, the declaration was always discarded, so parsing a document and then serializing it lost the original declaration.

const { parseXml } = require('@rgrove/parse-xml');

let xml = '<!DOCTYPE root SYSTEM "root.dtd"><root />';
let doc = parseXml(xml, { preserveDocumentType: true });

console.log(doc.children[0].toJSON());
// => { type: 'doctype', name: 'root', systemId: 'root.dtd' }

xml = '<!DOCTYPE kittens [<!ELEMENT kittens (#PCDATA)>]><kittens />';
doc = parseXml(xml, { preserveDocumentType: true });

console.log(doc.children[0].toJSON());
// => {
//   type: 'doctype',
//   name: 'kittens',
//   internalSubset: '<!ELEMENT kittens (#PCDATA)>'
// }
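
To illustrate the serialization use case, here is a rough sketch of turning a preserved doctype node back into a declaration string. serializeDoctype and quote are hypothetical helpers, not part of @rgrove/parse-xml. They accept a plain object shaped like the toJSON() output above, and the publicId field is an assumption for declarations that use a PUBLIC identifier.

```javascript
// Hypothetical helpers, not part of @rgrove/parse-xml.
// Input: an object shaped like the toJSON() output of a doctype node.
// `publicId` is assumed to be set for PUBLIC identifiers.
function serializeDoctype({ name, publicId, systemId, internalSubset }) {
  let out = `<!DOCTYPE ${name}`;

  if (publicId) {
    // XML requires a system literal after a public ID in a doctype.
    out += ` PUBLIC ${quote(publicId)} ${quote(systemId ?? '')}`;
  } else if (systemId) {
    out += ` SYSTEM ${quote(systemId)}`;
  }

  if (internalSubset) {
    out += ` [${internalSubset}]`;
  }

  return out + '>';
}

// A literal cannot contain its own delimiter, so use single quotes
// only when the value contains a double quote.
function quote(value) {
  return value.includes('"') ? `'${value}'` : `"${value}"`;
}

console.log(serializeDoctype({ type: 'doctype', name: 'root', systemId: 'root.dtd' }));
// => <!DOCTYPE root SYSTEM "root.dtd">

console.log(serializeDoctype({
  type: 'doctype',
  name: 'kittens',
  internalSubset: '<!ELEMENT kittens (#PCDATA)>',
}));
// => <!DOCTYPE kittens [<!ELEMENT kittens (#PCDATA)>]>
```

For the two inputs above, the output matches the declarations in the original XML strings. Whitespace and quoting in other documents may not round-trip exactly.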

/cc @wooorm

@rgrove linked an issue on Jan 30, 2023 that may be closed by this pull request

@wooorm left a comment

😍

@rgrove merged commit ab67ff7 into next on Jan 31, 2023
@rgrove deleted the doctype branch on January 31, 2023 05:15
Successfully merging this pull request may close these issues.

Optionally include XML declarations and doctype declarations in the DOM