Feature request: attach namespace to an xml document #28

Closed
jennybc opened this Issue Apr 29, 2015 · 4 comments

jennybc commented Apr 29, 2015

This is something you mentioned yourself here:

#24 (comment)

From where I sit, this would be a very handy thing.

Related question: could the result of xml_ns() be attached to a new xml document by default? Or just available by default? Or could that behaviour be requestable?

Consider this example from here:

x <- read_xml('
<catalog xmlns="http://www.edankert.com/examples/">
  <cd>
    <artist>Sufjan Stevens</artist>
    <title>Illinois</title>
    <src>http://www.sufjan.com/</src>
  </cd>
  <cd>
    <artist>Stoat</artist>
    <title>Future come and get me</title>
    <src>http://www.stoatmusic.com/</src>
  </cd>
  <cd>
    <artist>The White Stripes</artist>
    <title>Get behind me satan</title>
    <src>http://www.whitestripes.com/</src>
  </cd>
</catalog>')

Due to the use of a default namespace, this does not work:

> xml_find_all(x, "//cd")
{xml_nodeset (0)}

So I inspect the namespace.

> xml_ns(x)
d1 <-> http://www.edankert.com/examples/

OK, now I have hope that I can use the d1 prefix to refer to the default namespace. I'm usually happy to accept a default prefix like this.

> xml_find_all(x, "//d1:cd")
xmlXPathEval: evaluation failed
{xml_nodeset (0)}
Warning message:
In node_find_all(x$node, x$doc, xpath = xpath, nsMap = ns) :
  Undefined namespace prefix [1219]

Sad. But this does work:

> xml_find_all(x, "//d1:cd", xml_ns(x))
{xml_nodeset (3)}
[1] <cd>\n              <artist>Sufjan Stevens</artist>\n              <title>Illi ...
[2] <cd>\n              <artist>Stoat</artist>\n              <title>Future come a ...
[3] <cd>\n              <artist>The White Stripes</artist>\n              <title>G ...

Could any of that be made easier?

hadley commented Apr 29, 2015

I think I could add xml_ns<-, and then do an auto assign on document load.

jennybc commented May 18, 2016

If I can put in a good word for this issue while you guys are working on xml2 ... I would cry actual tears of joy if this namespace thing happened 🙏.

hadley commented May 18, 2016

Don't worry, it'll happen 😄

jimhester commented May 18, 2016

You can always just search for the node name directly if you don't want to specify the default namespace.

xml_find_all(x, "//*[name()='cd']")
#> {xml_nodeset (3)}
#> [1] <cd>\n    <artist>Sufjan Stevens</artist>\n    <title>Illinois</titl ...
#> [2] <cd>\n    <artist>Stoat</artist>\n    <title>Future come and get me< ...
#> [3] <cd>\n    <artist>The White Stripes</artist>\n    <title>Get behind  ...

But I just sent #89, which implements the attached namespace when the document is created.
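
For anyone landing here later, a minimal sketch of how the workarounds in this thread can be combined, assuming an xml2 version that provides xml_ns_rename(). The `ex` prefix and the `find_all()` helper are made up for illustration and are not part of xml2; the document is a shortened version of the catalog example above.

```r
library(xml2)

x <- read_xml('
<catalog xmlns="http://www.edankert.com/examples/">
  <cd>
    <artist>Stoat</artist>
  </cd>
  <cd>
    <artist>The White Stripes</artist>
  </cd>
</catalog>')

# Swap the auto-generated d1 prefix for a more memorable one.
# "ex" is an arbitrary choice for this sketch.
ns <- xml_ns_rename(xml_ns(x), d1 = "ex")

# Hypothetical helper (not an xml2 function): it closes over the document
# and its namespace map so the map does not have to be passed on every call.
find_all <- function(xpath) xml_find_all(x, xpath, ns)

cds <- find_all("//ex:cd")
artists <- xml_text(find_all("//ex:artist"))
```

Passing the namespace map explicitly, as here, should behave the same whether or not the installed xml2 attaches the namespace on load.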
