Refactor XmlSeeker to support more types of return values #55
Character returns also seem to be working fine; I edited the above comment accordingly.
}
return out;
}
case XPATH_NUMBER: { return Rcpp::NumericVector(1, result_->floatval); }
}
case XPATH_NUMBER: { return Rcpp::NumericVector(1, result_->floatval); }
case XPATH_BOOLEAN: { return Rcpp::LogicalVector(1, result_->boolval); }
case XPATH_STRING: { return Rcpp::CharacterVector(1, reinterpret_cast<const char*>(result_->stringval)); }
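For readers without the full diff, the dispatch in this hunk can be sketched in isolation. This is only an illustration, not the code in the PR: `ResultType`, `MockXPathResult`, `SearchValue`, and `to_value` are hypothetical stand-ins for libxml2's XPath result object and Rcpp's `RObject`, so the example compiles without either library.

```cpp
#include <stdexcept>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-ins for the XPath result object and its type tag,
// so the sketch builds without libxml2 or Rcpp.
enum class ResultType { NodeSet, Number, Boolean, String };

struct MockXPathResult {
  ResultType type;
  std::vector<int> nodes;  // stand-in for a set of node pointers
  double floatval = 0.0;
  bool boolval = false;
  std::string stringval;
};

// A single return type that can hold any supported result kind,
// playing the role that RObject plays in the diff above.
using SearchValue = std::variant<std::vector<int>, double, bool, std::string>;

// Map each result type to the matching alternative of SearchValue.
SearchValue to_value(const MockXPathResult& r) {
  switch (r.type) {
    case ResultType::NodeSet: return r.nodes;
    case ResultType::Number:  return r.floatval;
    case ResultType::Boolean: return r.boolval;
    case ResultType::String:  return r.stringval;
  }
  throw std::runtime_error("unsupported XPath result type");
}
```

The point of the design is that one search routine can serve node, numeric, logical, and character queries, with the caller deciding afterwards which kind it expects.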
I prefer type-stable functions, so maybe the R interface should be:
@hadley I think this is ready for review; I added the type-specific functions as well as tests for them. The only thing remaining would be tests for the missing functionality, if you think we need them. I was getting segfaults when I tried to use the
I think it's because that object should actually be a A slightly better fix is
Hmmm, I think there's something wrong in the logic for
x <- read_xml("<body>
<p>Some <b>text</b>.</p>
<p>Some <b>other</b> <b>text</b>.</p>
<p>No bold text</p>
</body>")
para <- xml_find_all(x, ".//p")
xml_find_one(para, ".//b")
# Error: expecting an external pointer
I'm not sure about the logic in there - should it be checking for
Re:
0a12a80 adds tests for all the methods on xml_missing objects and contains the fix for the xml_find_one error you mentioned. Also added support for xml_missing to nodes_duplicated.
Ok, this looks great :) Two last things:
#' # Find all vs find one -----------------------------------------------------
#' x <- read_xml("<body>
#' <p>Some <b>text</b>.</p>
#' <p>Some <b>other</b> <b>text</b>.</p>
#' <p>No bold here!</p>
#' </body>")
#' para <- xml_find_all(x, ".//p")
#'
#' # If you apply xml_find_all to a nodeset, it finds all matches,
#' # de-duplicates them, and returns as a single list. This means you
#' # never know how many results you'll get
#' xml_find_all(para, ".//b")
#'
#' # xml_find_one only returns one match per input node. If there are 0
#' # matches it will return a missing node; if there are more than one it picks
#' # the first with a warning
#' xml_find_one(para, ".//b")
#' xml_text(xml_find_one(para, ".//b"))
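The difference described in the roxygen comments above can be modeled without any XML at all. The following is a rough sketch under stated assumptions, not xml2's implementation: `Matches`, `find_all`, and `find_one` are hypothetical names, node ids are plain integers, `std::nullopt` stands in for the missing node, and a counter stands in for the warning.

```cpp
#include <optional>
#include <unordered_set>
#include <vector>

// Hypothetical model: matches[i] holds the ids of the nodes that the XPath
// expression matches when evaluated from input node i.
using Matches = std::vector<std::vector<int>>;

// "Find all": pool every match, drop duplicates, keep first-seen order.
// The result length is unrelated to the number of input nodes.
std::vector<int> find_all(const Matches& matches) {
  std::vector<int> out;
  std::unordered_set<int> seen;
  for (const auto& per_node : matches) {
    for (int id : per_node) {
      if (seen.insert(id).second) out.push_back(id);
    }
  }
  return out;
}

// "Find one": exactly one slot per input node. An empty match list yields
// std::nullopt; more than one match keeps the first and counts a warning.
std::vector<std::optional<int>> find_one(const Matches& matches,
                                         int* n_warnings = nullptr) {
  std::vector<std::optional<int>> out;
  out.reserve(matches.size());
  for (const auto& per_node : matches) {
    if (per_node.empty()) {
      out.push_back(std::nullopt);
    } else {
      if (per_node.size() > 1 && n_warnings != nullptr) ++*n_warnings;
      out.push_back(per_node.front());
    }
  }
  return out;
}
```

With three input paragraphs that have one, two, and zero bold children, `find_all` in this model returns three ids while `find_one` returns three slots, the last of them empty, which is why the second form is the one that lines up with its input.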
I missed committing the new test file (whoops) in the previous round, but it is now added, along with the documentation and notes in NEWS.
Refactor XmlSeeker to support more types of return values
Thanks Jim!
jimhester commented Nov 5, 2015
This refactors the XmlSeeker class to return RObjects rather than strictly XPtrNode or Lists. It provides a general search() method that directly returns the result RObject, and an exported xpath_search() function which provides a superset of the behavior of the previous node_find_all() and node_find_one() functions. It also allows one to return a user-defined subset of the results (1:max found), rather than just 1 or all.

The existing behavior and user-facing API are preserved; however, you can also use xpath_search() to return booleans, numbers, characters, e.g.

I figured we should discuss the R API for this, so I did not flesh that out at all or add tests, but it is a functional first pass.
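The "1:max found" idea in the description can be illustrated with a small sketch in which "one" and "all" are just special cases of a general result limit. The names `take_results`, `take_one`, and `take_all` are hypothetical and chosen for this example; the PR's actual signatures are not shown in this thread.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: keep at most max_results entries, in their original order.
template <typename T>
std::vector<T> take_results(const std::vector<T>& found, std::size_t max_results) {
  const std::size_t n = std::min(found.size(), max_results);
  return std::vector<T>(found.begin(),
                        found.begin() + static_cast<std::ptrdiff_t>(n));
}

// The two older behaviours expressed as special cases of the general one.
template <typename T>
std::vector<T> take_one(const std::vector<T>& found) {
  return take_results(found, 1);
}

template <typename T>
std::vector<T> take_all(const std::vector<T>& found) {
  return take_results(found, std::numeric_limits<std::size_t>::max());
}
```

Asking for more results than were found simply returns everything that was found, so callers do not need to know the match count in advance.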