When I try to handle DTD entities in xml2, my R session (in both RGui and RStudio) consistently aborts. In RStudio (1.2.1114) a dialog window appears with the messages "R Session Aborted", "R encountered a fatal error" and "The session was terminated", and the button "Start New Session".
I enclose a session where I
- define a DTD and an XML example file.
- read the XML with the package XML, which works as expected: the field field1 contains 'Han Oostdijk Quantitative Consultancy', i.e. the resolved entity hoqc with the nested entity author defined in the DTD.
- do the same with the package xml2. I have commented out the line c1 = xml2::xml_find_all(w2,"//field1") because this is the line that aborts the R session.
- use the package xml2 but replace the XML line containing the entity hoqc with <field1>a none-entity</field1>. This works as expected and results in the string 'a none-entity'.
- use the package xml2 but replace the XML line containing the entity hoqc with <field1>&author;</field1>. This works as expected and results in the string 'Han Oostdijk'.
My conclusion: xml2 cannot handle nested entities, whereas XML has no problems with them (?)
getwd()
#> [1] "D:/data/R/default_working_directory"
dtd_info <- c(
  '<!ENTITY author "Han Oostdijk">',
  '<!ENTITY hoqc "&author; Quantitative Consultancy">',
  "<!ELEMENT records (record+)>",
  "<!ELEMENT record (field1)>",
  "<!ELEMENT field1 (#PCDATA)>"
)
xml_data <- c(
  '<?xml version="1.0" encoding="UTF-8"?>',
  '<!DOCTYPE records SYSTEM "records.dtd">',
  "<records>",
  "<record>",
  "<field1>&hoqc;</field1>",
  "</record>",
  "</records>"
)
writeLines(dtd_info, "records.dtd")
xml_data1 <- paste(xml_data, collapse = "\n")

# read with package XML
w1 <- XML::xmlParse(xml_data1, options = c(XML::DTDVALID))
print(w1)
#> <?xml version="1.0" encoding="UTF-8"?>
#> <!DOCTYPE records SYSTEM "records.dtd">
#> <records>
#> <record>
#> <field1>&hoqc;</field1>
#> </record>
#> </records>
#> 
unlist(XML::xpathApply(w1, "//field1", XML::xmlValue))
#> [1] "Han Oostdijk Quantitative Consultancy"

# read with package xml2
w2 <- xml2::read_xml(xml_data1, options = c("DTDVALID"))
print(w2)
#> {xml_document}
#> <records>
#> [1] <record>\n<field1>&hoqc;</field1>\n</record>

######## next line will cause abort of session
# c1 = xml2::xml_find_all(w2,"//field1")
######## previous line will cause abort of session

# replace '&hoqc;' by 'a none-entity'
xml_data2 <- xml_data
xml_data2[5] <- "<field1>a none-entity</field1>"
xml_data2 <- paste(xml_data2, collapse = "\n")
w3 <- xml2::read_xml(xml_data2, options = c("DTDVALID"))
print(w3)
#> {xml_document}
#> <records>
#> [1] <record>\n<field1>a none-entity</field1>\n</record>
c2 <- xml2::xml_find_all(w3, "//field1")
purrr::map_chr(c2, xml2::xml_text)
#> [1] "a none-entity"

# replace '&hoqc;' by '&author;'
xml_data3 <- xml_data
xml_data3[5] <- "<field1>&author;</field1>"
xml_data3 <- paste(xml_data3, collapse = "\n")
w4 <- xml2::read_xml(xml_data3, options = c("DTDVALID"))
print(w4)
#> {xml_document}
#> <records>
#> [1] <record>\n<field1>&author;</field1>\n</record>
c3 <- xml2::xml_find_all(w4, "//field1")
purrr::map_chr(c3, xml2::xml_text)
#> [1] "Han Oostdijk"
Created on 2018-11-26 by the reprex package (v0.2.1)
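A variation that may help narrow this down (it was not part of the session above and is untested here) is to ask libxml2 to substitute entities while parsing, by adding the "NOENT" option that xml2::read_xml documents next to "DTDVALID". The names w5, c4, xml_txt and field1_text are hypothetical, and whether this avoids the abort for the nested entity hoqc is an assumption, so it is best run in a throwaway R session.

```r
# Sketch only: assumes the "NOENT" parse option changes how xml2 handles &hoqc;.
# Run in a disposable session; the plain DTDVALID variant aborted R for me.
old_wd <- setwd(tempdir())  # write records.dtd to a temporary directory

writeLines(c(
  '<!ENTITY author "Han Oostdijk">',
  '<!ENTITY hoqc "&author; Quantitative Consultancy">',
  "<!ELEMENT records (record+)>",
  "<!ELEMENT record (field1)>",
  "<!ELEMENT field1 (#PCDATA)>"
), "records.dtd")

xml_txt <- paste(c(
  '<?xml version="1.0" encoding="UTF-8"?>',
  '<!DOCTYPE records SYSTEM "records.dtd">',
  "<records>",
  "<record>",
  "<field1>&hoqc;</field1>",
  "</record>",
  "</records>"
), collapse = "\n")

# "NOENT" requests entity substitution at parse time, so the tree should
# hold plain text instead of entity reference nodes (assumption, see above)
w5 <- xml2::read_xml(xml_txt, options = c("DTDVALID", "NOENT"))
c4 <- xml2::xml_find_all(w5, "//field1")
field1_text <- xml2::xml_text(c4)
print(field1_text)

setwd(old_wd)  # restore the original working directory
```

If this variant returns a value while the plain "DTDVALID" call still aborts, that would suggest the problem lies in how entity reference nodes are traversed after parsing rather than in reading the DTD itself.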