New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/xml: brittle support for matching a namespace by identifier or url #12624

Open
pkieltyka opened this Issue Sep 15, 2015 · 4 comments

Comments

Projects
None yet
4 participants
@pkieltyka
Copy link

pkieltyka commented Sep 15, 2015

The issue is that I believe a struct tag's namespace should be matchable by the xmlns identifier or url.

To shed some light on the issue, consider a RSS feed parser thats deals with namespaces from a variety of definitions. I could expect a few different kinds of xmlns definitions for the same type of structure. ie. consider mRSS feeds in the wild that use the "media" namespace, you will find:

  1. Xmlns wasn't defined, but the namespace was used (ie. for mRSS with media namespace)
  2. Xmlns was defined as xmlns:media="http://search.yahoo.com/mrss/"
  3. Xmlns was defined as xmlns:media="http://search.yahoo.com/mrss"

I noticed that encoding/xml would track the xmlns' in a map to the url, and would match the struct tags to the url. The issue of course here is with 2 and 3, where the difference between a "/" would throw off the parser.

I wrote a fix (including tests) using Go 1.5.1's encoding/xml code: pkieltyka/xml@7ad1fab

Consider a partial parser for the media rss module:

type Media struct {
  Title Title `xml:"media title"`
  Description Description `xml:"media description"`
  Thumbnails []Thumbnail `xml:"media thumbnail"`
  Contents []Content `xml:"media content"`
  MediaGroups []Group `xml:"media group"`
}

Notice the using the namespace prefix in the struct tag instead of the ns url. But, if xmlns:media="URL" was defined in the original document, the parser would expect to match it by the URL, but IMO, it should check both the prefix and url of the namespace. I'm reporting this issue and will submit the fix separately, thanks for the consideration.

@pkieltyka

This comment has been minimized.

Copy link
Author

pkieltyka commented Sep 15, 2015

@gopherbot

This comment has been minimized.

Copy link

gopherbot commented Sep 15, 2015

CL https://golang.org/cl/14601 mentions this issue.

@rakyll

This comment has been minimized.

Copy link
Member

rakyll commented Sep 16, 2015

We need a consortium of ideas in general around the namespace support for the xml package before considering this change.

See #11496, #11496 and #6800.

1.5 cycle unnecessary broke the existing behavior of the package for many cases and the changes that have gone through to address the namespacing bugs had to be reverted, see #11841.

@iWdGo

This comment has been minimized.

Copy link
Contributor

iWdGo commented Apr 9, 2018

This proposal is not in line with the namespace XML standard (https://www.w3.org/TR/xml-names/#NSNameComparison) which explicitely states that URI are treated as strings and must be exactly identical, i.e. without escaping or any other manipulation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment