Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP
Browse files

give a helpful message if doc has default namespace and nothing matched

  • Loading branch information...
commit 5cb1977bb3127443415a9952b0afad5be9fb54b7 1 parent 521275a
@npostavs npostavs authored
Showing with 22 additions and 3 deletions.
  1. +4 −0 src/xml.c
  2. +13 −3 src/xml_edit.c
  3. +4 −0 src/xml_select.c
  4. +1 −0  src/xmlstar.h
View
4 src/xml.c
@@ -351,6 +351,10 @@ static void bad_ns_opt(const char *msg)
exit(EXIT_BAD_ARGS);
}
+const char* DEFAULT_NS_FAIL_MESSAGE =
+ "None of the XPaths matched; to match a node in the default namespace\n"
+ "use '_' as the prefix (see section 5.1 in the manual).\n"
+ "For instance, use /_:node instead of /node\n";
const xmlChar *default_ns = NULL;
#define MAX_NS_ARGS 256
View
16 src/xml_edit.c
@@ -442,10 +442,12 @@ edMove(xmlDocPtr doc, xmlNodeSetPtr nodes, xmlNodePtr to)
/**
* Loop through array of operations and perform them
+ * @return number of successful operations
*/
-static void
+static int
edProcess(xmlDocPtr doc, const XmlEdAction* ops, int ops_count)
{
+ int op_applied = 0;
int k;
xmlXPathContextPtr ctxt = xmlXPathNewContext(doc);
/* NOTE: later registrations override earlier ones */
@@ -488,6 +490,8 @@ edProcess(xmlDocPtr doc, const XmlEdAction* ops, int ops_count)
if (!res || res->type != XPATH_NODESET || !res->nodesetval) continue;
nodes = res->nodesetval;
+ op_applied += (nodes->nodeNr > 0);
+
switch (ops[k].op)
{
case XML_ED_DELETE:
@@ -532,6 +536,8 @@ edProcess(xmlDocPtr doc, const XmlEdAction* ops, int ops_count)
xmlDeregisterNodeDefault(NULL);
xmlXPathFreeContext(ctxt);
+
+ return op_applied;
}
/**
@@ -541,6 +547,7 @@ static void
edOutput(const char* filename, const XmlEdAction* ops, int ops_count,
const edOptions* g_ops)
{
+ int op_applied;
xmlDocPtr doc;
int save_options =
#if LIBXML_VERSION >= 20708
@@ -561,7 +568,7 @@ edOutput(const char* filename, const XmlEdAction* ops, int ops_count,
exit(EXIT_BAD_FILE);
}
- edProcess(doc, ops, ops_count);
+ op_applied = edProcess(doc, ops, ops_count);
/* avoid getting ASCII CRs in UTF-16/UCS-(2,4) text */
if ((xmlStrcasestr(doc->encoding, BAD_CAST "UTF") == 0
@@ -578,8 +585,11 @@ edOutput(const char* filename, const XmlEdAction* ops, int ops_count,
save = xmlSaveToFilename(g_ops->inplace? filename : "-", NULL, save_options);
xmlSaveDoc(save, doc);
xmlSaveClose(save);
-
xmlFreeDoc(doc);
+
+ if (!op_applied && ops_count > 0 && default_ns && !globalOptions.quiet) {
+ fprintf(stderr, DEFAULT_NS_FAIL_MESSAGE);
+ }
}
/**
View
4 src/xml_select.c
@@ -739,6 +739,10 @@ selMain(int argc, char **argv)
if (i == argc)
do_file("-", style_tree, xml_options, &ops, &xsltOps, &status);
+ if (status != EXIT_SUCCESS && default_ns && !ops.quiet) {
+ fprintf(stderr, DEFAULT_NS_FAIL_MESSAGE);
+ }
+
/*
* Shutdown libxml
*/
View
1  src/xmlstar.h
@@ -45,6 +45,7 @@ void registerXstarVariable(xmlXPathContextPtr ctxt,
const char* name, xmlXPathObjectPtr value);
void registerXstarNs(xmlXPathContextPtr ctxt);
+const char* DEFAULT_NS_FAIL_MESSAGE;
extern const xmlChar *default_ns;
int parseNSArr(xmlChar** ns_arr, int* plen, int argc, char **argv);

11 comments on commit 5cb1977

@mogsie

This change causes lots of false "helpful messages" to appear when xmlstarlet is used to test for the simple presence of a node. Using -q is not an option since it causes the output of the sel to be suppressed too. This message should be hidden behind a --verbose flag or something.

@npostavs

This message should be hidden behind a --verbose flag or something.

That would defeat the purpose: if you knew to pass the --verbose flag, you would probably already know to use the correct namespace. A more correct implementation would look at the XPath expression to see if it actually lacked a namespace prefix, but I don't want to start parsing XPath; I'll probably just remove the message for sel.

@npostavs

A more correct implementation would look at the XPath expression to see if it actually lacked a namespace prefix

@mogsie: Do you think doing approximating this with something like strchr(xpath, ':') would reduce the false positive hits enough?

@mogsie

May well be. But then at some random point in time, some guy will have an xpath with a colon in some selector which will spew out these error messages. What about checking if there are any namespaces defined? You can't use any namespaces in the xpath unless you declare them with -N right? So if there are no namespaces defined and the document uses namespaces; then the message is printed?

@npostavs

an xpath with a colon in some selector which will spew out these error messages.

Actually, I meant the opposite: for documents with a default namespace, using an XPath without a colon would give the message (if nothing matched). I think this would remove all false positives except for attributes searches (since default namespace doesn't apply to them), as in xml sel -t -v //@some-attrib

You can't use any namespaces in the xpath unless you declare them with -N right?

Actually, you can use the namespace prefixes defined in the top level node of the document, with the default namespace as "_". I should probably document this properly...

@mogsie

Wow, that's both strange and spooky, actually. I had no idea you could use namespace bindings of the document, let alone reference whatever namespace is the default namespaces of the root element... To me that sounds like pitfall that (if it were well known) would lead to coupling between the namespace prefixes used rather than the namespaces themselves.

I suspect there are loads of users of xmlstarlet who have come to depend on this (some without even knowing about the dependency: they hack away at it until they get something that works) so it's not easily removed without causing massive breakage. In that case I would like a way to opt out of this... --omit-default-namespace-bindings or something.

But then yes, I agree. If the document has a namespace, then not using a namespace is slightly strange; it would catch a lot of these bogus error messages.

@npostavs

To me that sounds like pitfall that (if it were well known) would lead to coupling between the namespace prefixes used rather than the namespaces themselves.
I would like a way to opt out of this... --omit-default-namespace-bindings or something.

Ah, I see you are True Believer in XML Namespaces. So far I have heard only from people who are confused by namespaces, and/or want things like --ignore-namespaces.

Note that the prefixes for various (E)XSLT functions were already automatic, would you want to opt out of that too? With the same switch or a different switch?

Also note that -N does override any automatic bindings (or if it doesn't that's a bug).

@mogsie

Indeed, I am a True Believer of XML Namespaces. Or at least a True Understander. (E)XSLT function bindings are less problematic since using those doesn't tie my xmlstarlet command invocation to a document. The problem is just that: that the scripts I write today become tied to documents using a specific namespace prefix for different namespaces.

Imagine I'm writing a script and I need to do some XML editing and I use my handy xmlstarlet to get the job done. The documents I see look like this:

<bar:foo xmlns:bar="urn:bar">This</bar:foo>

If I write

xmlstarlet ed -u /bar:foo -v That

then I'm all good. But this command line fails in the face of this (semantically equivalent) document:

<foo:foo xmlns:foo="urn:bar">This</foo:foo>

or this (also pretty much identical) one:

<foo xmlns="urn:bar">This</foo>

The only way to ensure that my script is capable of understanding all future documents that I throw at it is to manually declare the namespace prefix in the xmlstarlet command line:

xmlstarlet ed -N bar=urn:bar -u /bar:foo -v That
@npostavs

Okay, I've tried to avoid unwanted cases of the message in 18589bf. Let me know if you still encounter any, I may yet decide to remove it.

Also, dc94219 adds a --no-doc-namespace option (and the inverse --doc-namespace) to opt out of the magic namespace stuff. It also turns off the message, since the message doesn't really apply in that case.

@darose

So I upgraded our server to Ubuntu 13.10 and accidentally picked up a newer version of xmlstarlet that contains this new "feature". Trying to figure out how to work around and/or disable it.

We have scripts that we rely on heavily which use xmlstarlet to interface with Amazon's API using XML. But the scripts are now generating lots of spurious "error" messages for us on XML parsing that's completely valid.

So, for example, this call:

$ aws --xml describe-instances --filter private-dns-name=<ip of our "job controller" machine> | less

Generates XML that starts like this:

<?xml version="1.0" encoding="UTF-8"?>
<DescribeInstancesResponse xmlns="http://ec2.amazonaws.com/doc/2012-10-01/">
...

So as a result, this works fine:

$ aws --xml describe-instances --filter private-dns-name=<ip of our "job controller" machine>| xmlstarlet sel -N ec2="http://ec2.amazonaws.com/doc/2012-10-01/" -t -m ec2:DescribeInstancesResponse/ec2:reservationSet/ec2:item/ec2:instancesSet/ec2:item/ec2:tagSet/ec2:item[ec2:key=\"Name\"] -v ec2:value; echo                                                            
Job controller

And this works fine too:

$ aws --xml describe-instances --filter private-dns-name=<ip of our "job controller" machine> | xmlstarlet sel -t -m _:DescribeInstancesResponse/_:reservationSet/_:item/_:instancesSet/_:item/_:tagSet/_:item[_:key=\"Name\"] -v _:value; echo
Job controller

But these xpath queries, which both result in the completely legitimate case of no nodes being selected, both generate the error message:

$ aws --xml describe-instances --filter private-dns-name=<ip of our "job controller" machine> | xmlstarlet sel -N ec2="http://ec2.amazonaws.com/doc/2012-10-01/" -t -m ec2:DescribeInstancesResponse/ec2:reservationSet/ec2:item/ec2:instancesSet/ec2:item/ec2:tagSet/ec2:item[ec2:key=\"Namee\"] -v ec2:value; echo                                                           
None of the XPaths matched; to match a node in the default namespace
use '_' as the prefix (see section 5.1 in the manual).
For instance, use /_:node instead of /node

$ aws --xml describe-instances --filter private-dns-name=<ip of our "job controller" machine>| xmlstarlet sel -t -m _:DescribeInstancesResponse/_:reservationSet/_:item/_:instancesSet/_:item/_:tagSet/_:item[_:key=\"Namee\"] -v _:value; echo                                                                                                           
None of the XPaths matched; to match a node in the default namespace
use '_' as the prefix (see section 5.1 in the manual).
For instance, use /_:node instead of /node

How can I eliminate / work around those error messages? They're really quite useless in this situation - just pointless noise.

I see you added a new --no-doc-namespace option, but that doesn't appear to be included in the release currently shipping with Ubuntu 13.10. (v1.5.0)

@npostavs

Yes, I have decided to get rid of this message completely in 1.6.0, which I'm just releasing right now.

Meanwhile, you can work around it by adding -n or --nl to the end of the select template. This will unconditionally output a newline so that the statement will never "fail to match" (also replacing the call to echo).

... | xmlstarlet sel -t -m the/xpath/ -v _:value -b -n

Please sign in to comment.
Something went wrong with that request. Please try again.