v4.1.0
Added
- Added a new `includeOffsets` parser option. #25

  When `true`, the starting and ending byte offsets of each node in the input string will be made available via `start` and `end` properties on the node. The default is `false`.

  This option is useful if you want to preserve the original source text of each node when later serializing a document back to XML. Previously, the original source text was always discarded, which meant that if you parsed a document and then serialized it, the original source text would be lost.

      const { parseXml } = require('@rgrove/parse-xml');

      let xml = '<root><child /></root>';
      let doc = parseXml(xml, { includeOffsets: true });

      console.log(doc.root.toJSON());
      // => { type: 'element', name: 'root', start: 0, end: 22, ... }

      console.log(doc.root.children[0].toJSON());
      // => { type: 'element', name: 'child', start: 6, end: 15, ... }

- Added a new `preserveXmlDeclaration` parser option. #31

  When `true`, an `XmlDeclaration` node representing the XML declaration (if there is one) will be included in the parsed document. When `false`, the XML declaration will be discarded. The default is `false`, which matches the behavior of previous versions.

  This option is useful if you want to preserve the XML declaration when later serializing a document back to XML. Previously, the XML declaration was always discarded, which meant that if you parsed a document with an XML declaration and then serialized it, the original XML declaration would be lost.

      const { parseXml } = require('@rgrove/parse-xml');

      let xml = '<?xml version="1.0" encoding="UTF-8"?><root />';
      let doc = parseXml(xml, { preserveXmlDeclaration: true });

      console.log(doc.children[0].toJSON());
      // => { type: 'xmldecl', version: '1.0', encoding: 'UTF-8' }

- Added a new `preserveDocumentType` parser option. #32

  When `true`, an `XmlDocumentType` node representing a document type declaration (if there is one) will be included in the parsed document. When `false`, any document type declaration encountered will be discarded. The default is `false`, which matches the behavior of previous versions.

  Note that the parser only includes the document type declaration in the node tree; it doesn't actually validate the document against the DTD, load external DTDs, or resolve custom entity references.

  This option is useful if you want to preserve the document type declaration when later serializing a document back to XML. Previously, the document type declaration was always discarded, which meant that if you parsed a document with a document type declaration and then serialized it, the original document type declaration would be lost.

      const { parseXml } = require('@rgrove/parse-xml');

      let xml = '<!DOCTYPE root SYSTEM "root.dtd"><root />';
      let doc = parseXml(xml, { preserveDocumentType: true });

      console.log(doc.children[0].toJSON());
      // => { type: 'doctype', name: 'root', systemId: 'root.dtd' }

      xml = '<!DOCTYPE kittens [<!ELEMENT kittens (#PCDATA)>]><kittens />';
      doc = parseXml(xml, { preserveDocumentType: true });

      console.log(doc.children[0].toJSON());
      // => {
      //   type: 'doctype',
      //   name: 'kittens',
      //   internalSubset: '<!ELEMENT kittens (#PCDATA)>'
      // }

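All three options above serve the same goal: getting the original text back out when a parsed document is serialized again. The following is a minimal sketch of how that might look, not part of the library. The helper names `serializeXmlDeclaration`, `serializeDocumentType`, and `sourceTextOf` are hypothetical, they work on plain objects shaped like the `toJSON()` output in the examples above, and they handle only the fields shown there. The sketch also assumes `start` and `end` can be used directly as string indexes, which holds for ASCII input like the example; input containing multi-byte characters may need more care.

```javascript
// Hypothetical helpers; none of these are part of @rgrove/parse-xml.
// They accept plain objects shaped like the toJSON() output shown above.

// Rebuild an XML declaration from { type: 'xmldecl', version, encoding }.
function serializeXmlDeclaration(decl) {
  let out = `<?xml version="${decl.version}"`;
  if (decl.encoding) {
    out += ` encoding="${decl.encoding}"`;
  }
  return out + '?>';
}

// Rebuild a document type declaration from
// { type: 'doctype', name, systemId, internalSubset }.
// Assumes systemId contains no double quote characters.
function serializeDocumentType(doctype) {
  let out = `<!DOCTYPE ${doctype.name}`;
  if (doctype.systemId) {
    out += ` SYSTEM "${doctype.systemId}"`;
  }
  if (doctype.internalSubset) {
    out += ` [${doctype.internalSubset}]`;
  }
  return out + '>';
}

// Recover a node's original text from the input string, treating the
// start/end offsets as string indexes (an assumption; see above).
function sourceTextOf(xml, node) {
  return xml.slice(node.start, node.end);
}

const declText = serializeXmlDeclaration({
  type: 'xmldecl',
  version: '1.0',
  encoding: 'UTF-8'
});

const doctypeText = serializeDocumentType({
  type: 'doctype',
  name: 'root',
  systemId: 'root.dtd'
});

const childText = sourceTextOf('<root><child /></root>', { start: 6, end: 15 });

console.log(declText);    // <?xml version="1.0" encoding="UTF-8"?>
console.log(doctypeText); // <!DOCTYPE root SYSTEM "root.dtd">
console.log(childText);   // <child />
```

Slicing by offsets returns the node's text exactly as written, while the two serializers rebuild a normalized form, so quoting and spacing in the prolog may differ from the original input.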
Changed
- Errors thrown by the parser are now instances of a new `XmlError` class, which extends `Error`. These errors still have all the same properties as before, but now with improved type definitions. #27
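
A dedicated error class lets callers separate parse failures from other exceptions with `instanceof`. The sketch below shows the pattern using a local stand-in `XmlError` class so that it runs without the package installed. In real code the class would come from the library, assuming it is exported, which this entry doesn't state. `tryParse` and `fakeParse` are hypothetical names used only for illustration.

```javascript
// Local stand-in for the library's XmlError so this sketch runs on its own.
class XmlError extends Error {
  constructor(message) {
    super(message);
    this.name = 'XmlError';
  }
}

// Hypothetical wrapper: returns a result object for XML errors and
// rethrows anything else, so unrelated bugs aren't silently swallowed.
function tryParse(parse, xml) {
  try {
    return { ok: true, doc: parse(xml) };
  } catch (err) {
    if (err instanceof XmlError) {
      return { ok: false, message: err.message };
    }
    throw err;
  }
}

// Stand-in parser that rejects empty input; a real parser would go here.
function fakeParse(xml) {
  if (xml === '') {
    throw new XmlError('Empty input');
  }
  return { source: xml };
}

const good = tryParse(fakeParse, '<root />');
const bad = tryParse(fakeParse, '');

console.log(good.ok);     // true
console.log(bad.ok);      // false
console.log(bad.message); // Empty input
```

Because the class extends `Error`, existing `catch` blocks that check `err instanceof Error` keep working unchanged.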
Fixed
- Leading and trailing whitespace in comment content is no longer trimmed. This issue only affected parsing when the `preserveComments` parser option was enabled. #28
- Text content following a CDATA section is no longer appended to the preceding `XmlCdata` node. This issue only affected parsing when the `preserveCdata` parser option was enabled. #29