Feature request: sep argument for xml_text #152
Comments
+1 This would be great. As a quick fix you could use my solution presented in tidyverse/rvest#175. Using this on your example results in:
This feels out of scope for xml2 to me. The job of xml2 is to give you low-level access to the XML tree, and this doesn't feel like that to me.
It makes sense that this is out of scope for the xml2 package, but... Where is the high-level user-friendly xml parsing package? Would this feature request be welcomed for
Yes, it would be more suitable for rvest, but it seems quite special purpose to me, and doing the operation by hand is quite easy and looks natural in a pipe:

library(xml2)   # read_xml(), xml_find_all(), xml_text()
library(rvest)
library(purrr)
x <- read_xml("<root>
<a><b>1</b> <b>2</b></a>
<a><b>3</b></a>
</root>")
# For each <a>, collect the text of its <b> children and join it with ", "
x %>%
  xml_find_all("a") %>%
  map_chr(. %>% xml_find_all("b") %>% xml_text() %>% paste(collapse = ", "))
xml_text extracts text from a node and all of its child nodes. That is very convenient. It would be even more convenient if there were a way to insert a separator between the text retrieved from sibling or child nodes.
Currently
produces
# [1] "this is item onehere comes item two"
It would be really nice if xml_text had a sep argument, so we could say (e.g.)
rather than
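Until something like this exists, the requested behaviour can be approximated with a small wrapper. The sketch below is only an illustration: `xml_text_sep` is a hypothetical helper name, not part of xml2 or rvest, and the sample document is made up to reproduce the output quoted above. It assumes the separator should go between every text node found under each input node, including whitespace-only ones.

```r
library(xml2)

# Hypothetical helper (not an xml2 function): for each node in a nodeset,
# collect all descendant text nodes and join them with `sep`.
xml_text_sep <- function(nodes, sep = " ") {
  vapply(
    seq_along(nodes),
    function(i) {
      texts <- xml_text(xml_find_all(nodes[[i]], ".//text()"))
      paste(texts, collapse = sep)
    },
    character(1)
  )
}

# Made-up input, chosen to match the output quoted in the request
doc <- read_xml("<ul><li>this is item one</li><li>here comes item two</li></ul>")
lists <- xml_find_all(doc, "/ul")

plain  <- xml_text(lists)                  # text of all children run together
joined <- xml_text_sep(lists, sep = ", ")  # same text, with a separator
```

Here `plain` should be the run-together string from the request and `joined` the same text with ", " between the two items. If the document is pretty-printed, whitespace-only text nodes would be joined in as well, so filtering out empty strings before `paste()` may be needed.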