Closed
Description
In my HTML code I have &nbsp; (i.e. a non-breaking space). If I try to get the text via xml_text(..., trim = TRUE),
it returns the non-breaking space instead of an empty string.
Is this a feature? IMHO the expected behavior would be to return an empty string.
Minimal example:
library(xml2)
# UTF-8 bytes of U+00A0 (non-breaking space)
space <- rawToChar(as.raw(c(0xc2, 0xa0)))
doc <- read_xml(paste0('<td style="text-align:left;">', space, '</td>'))
xml_text(doc, trim = TRUE) == "" # FALSE
charToRaw(xml_text(doc, trim = TRUE)) # [1] c2 a0
Workaround:
stringi::stri_trim_both(xml_text(doc, trim = TRUE))
or stringr::str_trim(xml_text(doc, trim = TRUE))
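
A possible base-R alternative, for anyone who would rather not pull in stringi or stringr: a minimal sketch, assuming R >= 3.6.0 (where trimws() has the `whitespace` argument) and a UTF-8 string as returned by xml_text(). The names `nbsp` and `txt` are only illustrative.

```r
library(xml2)

# Same document as in the minimal example: a cell containing only U+00A0.
nbsp <- "\u00a0"
doc <- read_xml(paste0('<td style="text-align:left;">', nbsp, '</td>'))

# "[\\h\\v]" is the PCRE class for horizontal and vertical whitespace,
# which includes the non-breaking space; trimws() strips it from both ends.
txt <- trimws(xml_text(doc), whitespace = "[\\h\\v]")
```

With this, `txt` should come back as an empty string, so a check like `txt == ""` behaves the way the report expects from trim = TRUE.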