Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/xml: brittle support for matching a namespace by identifier or url #12624

Open
pkieltyka opened this issue Sep 15, 2015 · 4 comments
Open
Milestone

Comments

@pkieltyka
Copy link

@pkieltyka pkieltyka commented Sep 15, 2015

The issue is that I believe a struct tag's namespace should be matchable by the xmlns identifier or url.

To shed some light on the issue, consider a RSS feed parser thats deals with namespaces from a variety of definitions. I could expect a few different kinds of xmlns definitions for the same type of structure. ie. consider mRSS feeds in the wild that use the "media" namespace, you will find:

  1. Xmlns wasn't defined, but the namespace was used (ie. for mRSS with media namespace)
  2. Xmlns was defined as xmlns:media="http://search.yahoo.com/mrss/"
  3. Xmlns was defined as xmlns:media="http://search.yahoo.com/mrss"

I noticed that encoding/xml would track the xmlns' in a map to the url, and would match the struct tags to the url. The issue of course here is with 2 and 3, where the difference between a "/" would throw off the parser.

I wrote a fix (including tests) using Go 1.5.1's encoding/xml code: pkieltyka/xml@7ad1fab

Consider a partial parser for the media rss module:

type Media struct {
  Title Title `xml:"media title"`
  Description Description `xml:"media description"`
  Thumbnails []Thumbnail `xml:"media thumbnail"`
  Contents []Content `xml:"media content"`
  MediaGroups []Group `xml:"media group"`
}

Notice the using the namespace prefix in the struct tag instead of the ns url. But, if xmlns:media="URL" was defined in the original document, the parser would expect to match it by the URL, but IMO, it should check both the prefix and url of the namespace. I'm reporting this issue and will submit the fix separately, thanks for the consideration.

@pkieltyka
Copy link
Author

@pkieltyka pkieltyka commented Sep 15, 2015

@gopherbot
Copy link

@gopherbot gopherbot commented Sep 15, 2015

CL https://golang.org/cl/14601 mentions this issue.

@rakyll
Copy link
Member

@rakyll rakyll commented Sep 16, 2015

We need a consortium of ideas in general around the namespace support for the xml package before considering this change.

See #11496, #11496 and #6800.

1.5 cycle unnecessary broke the existing behavior of the package for many cases and the changes that have gone through to address the namespacing bugs had to be reverted, see #11841.

@iWdGo
Copy link
Contributor

@iWdGo iWdGo commented Apr 9, 2018

This proposal is not in line with the namespace XML standard (https://www.w3.org/TR/xml-names/#NSNameComparison) which explicitely states that URI are treated as strings and must be exactly identical, i.e. without escaping or any other manipulation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants
You can’t perform that action at this time.