man/xml_find_all.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/xml_find.R
\name{xml_find_all}
\alias{xml_find_all}
\alias{xml_find_chr}
\alias{xml_find_lgl}
\alias{xml_find_num}
\alias{xml_find_one}
\title{Find nodes that match an XPath expression.}
\usage{
xml_find_all(x, xpath, ns = xml_ns(x))

xml_find_one(x, xpath, ns = xml_ns(x))

xml_find_num(x, xpath, ns = xml_ns(x))

xml_find_chr(x, xpath, ns = xml_ns(x))

xml_find_lgl(x, xpath, ns = xml_ns(x))
}
\arguments{
\item{x}{A document, node, or node set.}

\item{xpath}{A string containing an XPath (1.0) expression.}

\item{ns}{Optionally, a named vector giving prefix-URL pairs, as produced
by \code{\link{xml_ns}}. If provided, all names will be explicitly
qualified with the ns prefix, i.e. if the element \code{bar} is defined
in namespace \code{foo}, it will be called \code{foo:bar} (and
similarly for attributes). Default namespaces must be given an explicit
name.}
}
\value{
\code{xml_find_all} always returns a nodeset: if there are no matches
  the nodeset will be empty. The result will always be unique; repeated
  nodes are automatically de-duplicated.

  \code{xml_find_one} returns a node if applied to a node, and a nodeset
  if applied to a nodeset. The output is \emph{always} the same size as
  the input. If there are no matches, \code{xml_find_one} will throw an
  error; if there are multiple matches, it will use the first with a warning.

  \code{xml_find_num}, \code{xml_find_chr}, \code{xml_find_lgl} return
  numeric, character and logical results respectively.
}
\description{
XPath is like regular expressions for trees - it's worth learning if
you're trying to extract nodes from arbitrary locations in a document.
Use \code{xml_find_all} to find all matches - if there's no match you'll
get an empty result. Use \code{xml_find_one} to find a specific match -
if there's no match you'll get an error.
}
\examples{
x <- read_xml("<foo><bar><baz/></bar><baz/></foo>")
xml_find_all(x, ".//baz")
xml_path(xml_find_all(x, ".//baz"))

# Note the difference between .// and //
# //  finds anywhere in the document (ignoring the current node)
# .// finds anywhere beneath the current node
(bar <- xml_find_all(x, ".//bar"))
xml_find_all(bar, ".//baz")
xml_find_all(bar, "//baz")

# Find all vs find one -----------------------------------------------------
x <- read_xml("<body>
  <p>Some <b>text</b>.</p>
  <p>Some <b>other</b> <b>text</b>.</p>
  <p>No bold here!</p>
</body>")
para <- xml_find_all(x, ".//p")

# If you apply xml_find_all to a nodeset, it finds all matches,
# de-duplicates them, and returns them as a single list. This means you
# never know how many results you'll get
xml_find_all(para, ".//b")

# xml_find_one only returns one match per input node. If there are 0
# matches it will return a missing node; if there is more than one it picks
# the first with a warning
xml_find_one(para, ".//b")
xml_text(xml_find_one(para, ".//b"))
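
# Scalar results -----------------------------------------------------------
# A minimal sketch of xml_find_num, xml_find_chr and xml_find_lgl: they
# evaluate xpath expressions that return a number, a string or a boolean
# rather than nodes. The document y and the use of the XPath 1.0 functions
# count(), string() and boolean() are illustrative choices
y <- read_xml("<body><p>Some <b>text</b>.</p><p>No bold here!</p></body>")
xml_find_num(y, "count(.//b)")
xml_find_chr(y, "string(.//b)")
xml_find_lgl(y, "boolean(.//b)")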

# Namespaces ---------------------------------------------------------------
# If the document uses namespaces, you'll need to use xml_ns to form
# a unique mapping between each full namespace URL and a short prefix
x <- read_xml('
 <root xmlns:f = "http://foo.com" xmlns:g = "http://bar.com">
   <f:doc><g:baz /></f:doc>
   <f:doc><g:baz /></f:doc>
 </root>
')
xml_find_all(x, ".//f:doc")
xml_find_all(x, ".//f:doc", xml_ns(x))
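
# Default namespaces have no prefix in the document, so they must be given
# an explicit name in the xpath. This sketch assumes xml_ns() assigns
# generated prefixes such as d1; inspect the output of xml_ns() to confirm
# the actual names before relying on them
z <- read_xml('<root xmlns="http://foo.com"><doc/></root>')
xml_ns(z)
xml_find_all(z, ".//d1:doc")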
}
\seealso{
\code{\link{xml_ns_strip}} to remove the default namespaces
}