You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
As the documentation of xml_add_child suggests, setting .copy = FALSE moves the element to its new location. However, it does not remove the node from its original location, leaving a zombied node that cannot be removed.
R version 3.6.3 (2020-02-29)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)
xml2_1.3.1
Case 1
# Create a simple documentdoc<-xml2::as_xml_document(
"<document> <body> </body> </document>"
)
body<-xml2::xml_find_all(doc, "//body")[[1]]
# Create an element named "a" in the documenta<-xml2::xml_add_child(doc, "a")
# Move it to bodya_moved<-xml2::xml_add_child(body, a, .copy=FALSE)
# We see it is copied by referencea$node#> <pointer: 0x0000000008a32e40>a_moved$node#> <pointer: 0x0000000008a32e40># We see it now lives in two places, but we find only one as there is only one
cat(as.character(doc))
#> <?xml version="1.0" encoding="UTF-8"?>#> <document>#> <body>#> <a/></body>#> <a/>#> </document>xml2::xml_find_all(doc, "//a")
#> {xml_nodeset (1)}#> [1] <a/># This should remove 'both' of themxml2::xml_remove(a)
# But only removes the "a" at the location we wanted to move it to
cat(as.character(doc))
#> <?xml version="1.0" encoding="UTF-8"?>#> <document>#> <body>#> </body>#> <a/>#> </document>xml2::xml_find_all(doc, "//a")
#> {xml_nodeset (1)}#> [1] <a/># Removing it againxml2::xml_remove(a)
# It is now zombied, because we cannot remove it.
cat(as.character(doc))
#> <?xml version="1.0" encoding="UTF-8"?>#> <document>#> <body>#> </body>#> <a/>#> </document>xml2::xml_find_all(doc, "//a")
#> {xml_nodeset (1)}#> [1] <a/>
Case 2
Whenever namespaces are involved, xml_find_all creates an error, I guess it produces an infinite loop somewhere.
# Create a simple documentdoc<-xml2::as_xml_document(
"<document xmlns:z=\"https://example.com/\"> <z:body> </z:body> </document>"
)
body<-xml2::xml_find_all(doc, "//z:body")[[1]]
# Create an element named "a" in the documenta<-xml2::xml_add_child(doc, "z:a")
# Move it to bodya_moved<-xml2::xml_add_child(body, a, .copy=FALSE)
# We see it is copied by referencea$node#> <pointer: 0x00000000047eaea0>a_moved$node#> <pointer: 0x00000000047eaea0># We see it now lives in two places, but trying to find the first, or all, crashes
cat(as.character(doc))
#> <?xml version="1.0" encoding="UTF-8"?>#> <document xmlns:z="https://example.com/">#> <z:body>#> <z:a/></z:body>#> <z:a/>#> </document>xml2::xml_find_first(doc, "//z:a")
#> Error in xml_find_first.xml_node(doc, "//z:a"): Memory allocation failed : growing nodeset hit limit#> [2]xml2::xml_find_all(doc, "//z:a")
#> Error in xml_find_all.xml_node(doc, "//z:a"): Memory allocation failed : growing nodeset hit limit#> [2]
The text was updated successfully, but these errors were encountered:
I can reproduce the behavior you describe, but I don't really have time to look into if there is a obvious way to fix it. I would suggest not using .copy = FALSE
As the documentation of
xml_add_child
suggests, setting.copy = FALSE
moves the element to its new location. However, it does not remove the node from its original location, leaving a zombied node that cannot be removed.Case 1
Case 2
Whenever namespaces are involved,
xml_find_all
creates an error, I guess it produces an infinite loop somewhere.The text was updated successfully, but these errors were encountered: