Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

.escape_xml() performance tweak #127

Closed
TimTaylor opened this issue Aug 8, 2023 · 2 comments
Closed

.escape_xml() performance tweak #127

TimTaylor opened this issue Aug 8, 2023 · 2 comments

Comments

@TimTaylor
Copy link

TimTaylor commented Aug 8, 2023

If stringi::stri_replace_all_fixed() cycles through patterns and replacements in the order given then you could change

x_utf8 <- stringi::stri_enc_toutf8(x)
x_no_amp <- stringi::stri_replace_all_fixed(str = x_utf8, pattern = c("&"), replacement = c("&amp;"), vectorize_all = FALSE)
stringi::stri_replace_all_fixed(str = x_no_amp, pattern = c("\"", "<", ">", "'"), replacement = c("&quot;", "&lt;", "&gt;", "&apos;"), vectorize_all = FALSE)

to

x_utf8 <- stringi::stri_enc_toutf8(x)
stringi::stri_replace_all_fixed(str = x_utf8, pattern = c("&","\"", "<", ">", "'"), replacement = c("&amp;", "&quot;", "&lt;", "&gt;", "&apos;"), vectorize_all = FALSE)

and eek out a little more performance (For me 10% on top of change you propose in #126). Looking at the stringi documentation and code this seems to be the case.

Hope this makes sense.

@chainsawriot
Copy link
Collaborator

Thanks for the suggestion, @TimTaylor . It seems that it can really prevent some wasteful steps.

.escape_xml <- function(x) {
    stringi::stri_replace_all_fixed(str = stringi::stri_enc_toutf8(x), pattern = c("&", "\"", "<", ">", "'"), replacement = c("&amp;", "&quot;", "&lt;", "&gt;", "&apos;"), vectorize_all = FALSE)
}

Without #126

system.time(write_ods(nycflights13::flights))
##   user  system elapsed 
## 162.434   3.269 165.810 

Again versus current HEAD

system.time(write_ods(nycflights13::flights))
##   user  system elapsed 
## 203.636   4.029 207.797 

chainsawriot added a commit that referenced this issue Aug 8, 2023
chainsawriot added a commit that referenced this issue Aug 8, 2023
* Implement #127

* Update benchmark
@chainsawriot
Copy link
Collaborator

added 3833747 thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants