[XML] fix writing characters with xml content. closes #894 #895

Merged
merged 1 commit into from Jan 17, 2024
4 changes: 4 additions & 0 deletions NEWS.md
@@ -4,6 +4,10 @@

* Experimental support for calculation of pivot table fields. [892](https://github.com/JanMarvin/openxlsx2/pull/892)

## Fixes

* Character strings with XML content were not written correctly: `a <br/> b` was converted to something neither we nor spreadsheet software was able to decipher. [895](https://github.com/JanMarvin/openxlsx2/pull/895)


***************************************************************************

3 changes: 3 additions & 0 deletions R/class-workbook-wrappers.R
@@ -122,6 +122,9 @@ wb_save <- function(wb, file = NULL, overwrite = TRUE, path = NULL) {
#' If the data frame contains this string, the output will be broken.
#' Many base classes are covered, though not all and far from all third-party classes.
#' When data of an unknown class is written, it is handled with `as.character()`.
#' It is not possible to write character nodes beginning with `<r>` or `<r/>`. Both
#' are reserved for internal functions. If you need these, you have to wrap
#' the input string in `fmt_txt()`.
#' @family workbook wrappers
#' @family worksheet content functions
#' @return A `wbWorkbook`, invisibly.
3 changes: 3 additions & 0 deletions man/wb_add_data.Rd

10 changes: 7 additions & 3 deletions src/strings_xml.cpp
@@ -112,12 +112,16 @@ std::string txt_to_xml(

pugi::xml_node is_node = doc.append_child(type.c_str());

- pugi::xml_document txt_node;
- pugi::xml_parse_result result = txt_node.load_string(text.c_str(), pugi::parse_default | pugi::parse_ws_pcdata | pugi::parse_escapes);
+ // txt input beginning with "<r>" or "<r/>" is assumed to be a fmt_txt string
+ if (text.rfind("<r>", 0) == 0 || text.rfind("<r/>", 0) == 0) {
+
+ pugi::xml_document txt_node;
+ pugi::xml_parse_result result = txt_node.load_string(text.c_str(), pugi::parse_default | pugi::parse_ws_pcdata | pugi::parse_escapes);
+ if (!result) Rcpp::stop("Could not parse xml in txt_to_xml()");

- if (result) {
for (auto is_n : txt_node.children())
is_node.append_copy(is_n);

} else {
// text to export
pugi::xml_node t_node = is_node.append_child("t");
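
The condition in the hunk above uses `std::string::rfind(prefix, 0) == 0` as a "starts with" test. A minimal standalone sketch of that idiom follows; `starts_with` and `looks_like_fmt_txt` are hypothetical helper names for illustration and are not part of openxlsx2.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of openxlsx2): true when `text` starts with `prefix`.
// rfind(prefix, 0) searches backwards from index 0, so the only position it can
// report is 0; if the prefix is not there, it returns std::string::npos.
static bool starts_with(const std::string& text, const std::string& prefix) {
  return text.rfind(prefix, 0) == 0;
}

// Mirrors the condition in the hunk: only the literal prefixes "<r>" and "<r/>"
// mark the input as a fmt_txt string. Other tags that merely begin with "<r"
// (for example "<red>") do not qualify.
static bool looks_like_fmt_txt(const std::string& text) {
  return starts_with(text, "<r>") || starts_with(text, "<r/>");
}
```

Because the check is a plain prefix comparison, leading whitespace or any other character before `<r>` sends the input down the escaped-text branch.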
25 changes: 25 additions & 0 deletions tests/testthat/test-strings_xml.R
@@ -116,4 +116,29 @@ test_that("strings_xml", {
got <- wb$worksheets[[1]]$sheet_data$cc$is
expect_equal(exp, got)

exp <- "<is><t>foo &lt;em&gt;bar&lt;/em&gt;</t></is>"
got <- txt_to_is('foo <em>bar</em>')
expect_equal(exp, got)

# exception to the rule: it is not possible to write characters starting with "<r/>" or "<r>"
exp <- "<is><r>foo</r></is>"
got <- txt_to_is('<r>foo</r>')
expect_equal(exp, got)

exp <- "<is><r><rPr/><t>&lt;r&gt;foo&lt;/r&gt;</t></r></is>"
got <- txt_to_is(fmt_txt('<r>foo</r>'))
expect_equal(exp, got)

exp <- "<is><t>&lt;red&gt;foo&lt;/red&gt;</t></is>"
got <- txt_to_is('<red>foo</red>')
expect_equal(exp, got)

exp <- "<is><t>foo&lt;/r&gt;</t></is>"
got <- txt_to_is('foo</r>')
expect_equal(exp, got)

exp <- "<is><t xml:space=\"preserve\"> &lt;r&gt;foo&lt;/r&gt;</t></is>"
got <- txt_to_is(' <r>foo</r>')
expect_equal(exp, got)

})
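
The expectations in these tests can be reproduced with a small standalone sketch that does not depend on pugixml or Rcpp. Everything here is an assumption made for illustration: `escape_pcdata` and `txt_to_is_sketch` are hypothetical names, the escaping is done by hand rather than by the XML library, the `<r>` branch passes the input through without parsing or validating it, and the `xml:space="preserve"` rule (leading or trailing whitespace) is inferred only from the last test case.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: escape the characters that must not appear raw in text content.
static std::string escape_pcdata(const std::string& text) {
  std::string out;
  out.reserve(text.size());
  for (char c : text) {
    switch (c) {
      case '&': out += "&amp;"; break;
      case '<': out += "&lt;";  break;
      case '>': out += "&gt;";  break;
      default:  out += c;
    }
  }
  return out;
}

// Hypothetical stand-in for the behaviour the tests describe, not the package's code.
static std::string txt_to_is_sketch(const std::string& text) {
  // Inputs starting with "<r>" or "<r/>" are treated as pre-built rich text.
  // Assumption: the sketch copies them verbatim instead of parsing them.
  if (text.rfind("<r>", 0) == 0 || text.rfind("<r/>", 0) == 0) {
    return "<is>" + text + "</is>";
  }
  // Assumption: whitespace at either edge triggers xml:space="preserve".
  const bool edge_ws = !text.empty() &&
    (std::isspace(static_cast<unsigned char>(text.front())) ||
     std::isspace(static_cast<unsigned char>(text.back())));
  const std::string open = edge_ws ? "<t xml:space=\"preserve\">" : "<t>";
  return "<is>" + open + escape_pcdata(text) + "</t></is>";
}
```

The `fmt_txt()` case from the tests is not covered by this sketch, since it depends on how the package builds the `<r><rPr/><t>…</t></r>` wrapper.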