Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/xml: brittle support for matching a namespace by identifier or url #12624

Open
pkieltyka opened this issue Sep 15, 2015 · 5 comments
Open
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@pkieltyka
Copy link

The issue is that I believe a struct tag's namespace should be matchable by the xmlns identifier or url.

To shed some light on the issue, consider a RSS feed parser thats deals with namespaces from a variety of definitions. I could expect a few different kinds of xmlns definitions for the same type of structure. ie. consider mRSS feeds in the wild that use the "media" namespace, you will find:

  1. Xmlns wasn't defined, but the namespace was used (ie. for mRSS with media namespace)
  2. Xmlns was defined as xmlns:media="http://search.yahoo.com/mrss/"
  3. Xmlns was defined as xmlns:media="http://search.yahoo.com/mrss"

I noticed that encoding/xml would track the xmlns' in a map to the url, and would match the struct tags to the url. The issue of course here is with 2 and 3, where the difference between a "/" would throw off the parser.

I wrote a fix (including tests) using Go 1.5.1's encoding/xml code: pkieltyka/xml@7ad1fab

Consider a partial parser for the media rss module:

type Media struct {
  Title Title `xml:"media title"`
  Description Description `xml:"media description"`
  Thumbnails []Thumbnail `xml:"media thumbnail"`
  Contents []Content `xml:"media content"`
  MediaGroups []Group `xml:"media group"`
}

Notice the using the namespace prefix in the struct tag instead of the ns url. But, if xmlns:media="URL" was defined in the original document, the parser would expect to match it by the URL, but IMO, it should check both the prefix and url of the namespace. I'm reporting this issue and will submit the fix separately, thanks for the consideration.

@pkieltyka
Copy link
Author

@gopherbot
Copy link
Contributor

CL https://golang.org/cl/14601 mentions this issue.

@rakyll
Copy link
Contributor

rakyll commented Sep 16, 2015

We need a consortium of ideas in general around the namespace support for the xml package before considering this change.

See #11496, #11496 and #6800.

1.5 cycle unnecessary broke the existing behavior of the package for many cases and the changes that have gone through to address the namespacing bugs had to be reverted, see #11841.

@iwdgo
Copy link
Contributor

iwdgo commented Apr 9, 2018

This proposal is not in line with the namespace XML standard (https://www.w3.org/TR/xml-names/#NSNameComparison) which explicitely states that URI are treated as strings and must be exactly identical, i.e. without escaping or any other manipulation.

@iwdgo
Copy link
Contributor

iwdgo commented Jul 4, 2024

https://go.dev/play/p/zBgGuTzbMoe?v=gotip contains the test provided.

@seankhliao seankhliao added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jul 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

5 participants