encoding/xml: support QName values / expose namespace bindings #12406

Open
pdw-mb opened this Issue Aug 30, 2015 · 6 comments

pdw-mb commented Aug 30, 2015

It's not uncommon for XML to contain QNames as element and attribute values, e.g.

  <my-document xmlns:foo="http://..." >
    <my-element>foo:bar</my-element>
  </my-document>

In order to correctly unmarshal the value, you need to know the namespace bindings in effect for my-element, but Decoder doesn't appear to expose this information. A simple addition to encoding/xml of:

  func (d *Decoder) NamespaceBindings() map[string]string {
    return d.ns
  }

allows unmarshalers to access the necessary information. For example, I can now write:

  type QName struct {
    Namespace string
    Local     string
  }

  func (qname *QName) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
    var s string
    if err := d.DecodeElement(&s, &start); err != nil {
      return err
    }
    // Split "prefix:local"; a value without a colon has an empty prefix.
    prefix := ""
    if i := strings.Index(s, ":"); i >= 0 {
      prefix = s[:i]
      qname.Local = s[i+1:]
    } else {
      qname.Local = s
    }
    // NamespaceBindings is the method proposed above, not part of encoding/xml.
    var ok bool
    qname.Namespace, ok = d.NamespaceBindings()[prefix]
    if !ok {
      return errors.New("unbound namespace prefix: " + prefix)
    }
    return nil
  }

Arguably, something like the above, and a corresponding attribute unmarshaller could be provided on the standard xml.Name.
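For comparison, here is a rough sketch of what resolving such a value looks like with the current API, without the proposed method. It walks the document with Decoder.Token and keeps its own stack of xmlns declarations. The QName type, the lookup and resolveQNames helpers, and the example.com URI are all made up for illustration, and it assumes an unprefixed value falls back to the default namespace if one is declared.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// QName is a resolved qualified name (hypothetical type for this sketch).
type QName struct {
	Namespace string
	Local     string
}

// lookup searches the scope stack from the innermost element outwards.
func lookup(scopes []map[string]string, prefix string) (string, bool) {
	for i := len(scopes) - 1; i >= 0; i-- {
		if ns, ok := scopes[i][prefix]; ok {
			return ns, true
		}
	}
	return "", false
}

// resolveQNames resolves the text of every element whose local name is
// target against the xmlns declarations in scope at that point.
func resolveQNames(doc, target string) ([]QName, error) {
	d := xml.NewDecoder(strings.NewReader(doc))
	var scopes []map[string]string // one map of declarations per open element
	var open []string              // local names of the open elements
	var out []QName
	for {
		tok, err := d.Token()
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			decl := map[string]string{}
			for _, a := range t.Attr {
				switch {
				case a.Name.Space == "xmlns": // xmlns:prefix="uri"
					decl[a.Name.Local] = a.Value
				case a.Name.Space == "" && a.Name.Local == "xmlns": // xmlns="uri"
					decl[""] = a.Value
				}
			}
			scopes = append(scopes, decl)
			open = append(open, t.Name.Local)
		case xml.EndElement:
			scopes = scopes[:len(scopes)-1]
			open = open[:len(open)-1]
		case xml.CharData:
			text := strings.TrimSpace(string(t))
			if text == "" || len(open) == 0 || open[len(open)-1] != target {
				continue
			}
			prefix, local := "", text
			if i := strings.Index(text, ":"); i >= 0 {
				prefix, local = text[:i], text[i+1:]
			}
			ns, ok := lookup(scopes, prefix)
			if !ok && prefix != "" {
				return nil, fmt.Errorf("unbound namespace prefix: %q", prefix)
			}
			out = append(out, QName{Namespace: ns, Local: local})
		}
	}
}

func main() {
	// The namespace URI is a placeholder, not the one from the real document.
	doc := `<my-document xmlns:foo="http://example.com/foo">
  <my-element>foo:bar</my-element>
</my-document>`
	names, err := resolveQNames(doc, "my-element")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", names)
}
```

This works, but it means giving up Unmarshal and struct tags for any document that contains QName values, which is the cost the proposed accessor is meant to avoid.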

More discussion of this issue here:

https://groups.google.com/forum/#!searchin/golang-nuts/QName/golang-nuts/DexmVLQOJxk/whBaKK9ntHsJ

go version go1.5 darwin/amd64

md5 commented Sep 11, 2015

This could help with an issue I reported at hooklift/gowsdl#37

In that case, the gowsdl library is trying to parse SOAP envelopes that have variable body content, but the VirtualBox web service is putting the xmlns:vbox="http://www.virtualbox.org" declaration on the <SOAP-ENV:Envelope>. When the innerxml of the <SOAP-ENV:Body> is parsed, the xmlns:vbox="http://www.virtualbox.org" mapping is unavailable to the newly created xml.Decoder inside the second call to xml.Unmarshal.

Having access to the namespaces from the outer xml.Decoder and being able to pass them to a new xml.Decoder would be one way to deal with this issue. 👍
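To illustrate the idea, one possible workaround today (only a sketch, not what gowsdl does) is to re-declare the outer bindings on a synthetic wrapper element before handing the inner XML to a new decoder. The rewrap and firstChildName helpers, the wrapper element name, and the vbox:ping body are hypothetical, and the bindings map is assumed to have been collected by the caller from the outer document.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"sort"
)

// rewrap surrounds inner with a synthetic <wrapper> element that re-declares
// the given prefix bindings, so that a fresh Decoder can resolve prefixes
// declared on an ancestor in the original document.
func rewrap(inner []byte, bindings map[string]string) ([]byte, error) {
	prefixes := make([]string, 0, len(bindings))
	for p := range bindings {
		prefixes = append(prefixes, p)
	}
	sort.Strings(prefixes) // deterministic attribute order

	var buf bytes.Buffer
	buf.WriteString("<wrapper")
	for _, p := range prefixes {
		if p == "" {
			buf.WriteString(` xmlns="`)
		} else {
			buf.WriteString(` xmlns:` + p + `="`)
		}
		if err := xml.EscapeText(&buf, []byte(bindings[p])); err != nil {
			return nil, err
		}
		buf.WriteString(`"`)
	}
	buf.WriteString(">")
	buf.Write(inner)
	buf.WriteString("</wrapper>")
	return buf.Bytes(), nil
}

// firstChildName returns the name of the first element below the root.
func firstChildName(doc []byte) (xml.Name, error) {
	d := xml.NewDecoder(bytes.NewReader(doc))
	seen := 0
	for {
		tok, err := d.Token()
		if err != nil {
			return xml.Name{}, err
		}
		if start, ok := tok.(xml.StartElement); ok {
			seen++
			if seen == 2 { // the second start tag is the root's first child
				return start.Name, nil
			}
		}
	}
}

func main() {
	inner := []byte(`<vbox:ping/>`) // hypothetical body content
	bindings := map[string]string{"vbox": "http://www.virtualbox.org"}

	doc, err := rewrap(inner, bindings)
	if err != nil {
		panic(err)
	}
	name, err := firstChildName(doc)
	if err != nil {
		panic(err)
	}
	fmt.Println(name.Space, name.Local)
}
```

The awkward part is still collecting the bindings from the outer decoder in the first place, which is where an accessor like the one proposed above would help.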

Contributor

rsc commented Oct 23, 2015

Thanks for the note. I think we may try one more time to get namespaces right in xml. And then we're going to give up and say "what we've got is what we've got."

rsc added this to the Go1.6 milestone Oct 23, 2015

rsc added the Thinking label Oct 23, 2015

md5 commented Oct 23, 2015

Thanks @rsc.

In the gowsdl case, I ended up implementing xml.Unmarshaler to allow the tool to process the whole XML file in a single pass: hooklift/gowsdl#43

pdw-mb commented Oct 24, 2015

Thanks @rsc. I think what's there is pretty close. QName values in XML documents are inherently problematic because you need access to the current namespace bindings in order to understand them, but they are fairly widely used.

After filing this issue, I've realised that the fix for attributes is more problematic, as UnmarshalXMLAttr doesn't currently get passed the Decoder object. Addressing this would require a breaking change to the API, rather than just the addition of a method.

rsc changed the title from "encoding/xml: Support QName values / expose namespace bindings" to "encoding/xml: support QName values / expose namespace bindings" Nov 5, 2015

Contributor

rsc commented Nov 25, 2015

Blocked on #13400.

rsc modified the milestones: Go1.7, Go1.6 Nov 25, 2015

pdw-mb commented Feb 16, 2016

I've implemented the proposed change in a fork that can be found here: https://code.blinkace.com/go/xml

The relevant changes are:

  • Add Decoder.NamespaceBindings to allow Unmarshalers to get access to current bindings
  • Alter UnmarshalXMLAttr to include the Decoder as a parameter (breaking change)
  • Add Encoder.GetPrefix to allow marshalers to insert prefixes needed for QName and other values that require them.

I've also implemented a QName package which is a Marshaler / Unmarshaler. This might be better merged with XMLName.

rsc modified the milestones: Go1.8, Go1.7 May 18, 2016

rsc modified the milestones: Go1.9Early, Go1.8 Oct 26, 2016

bradfitz modified the milestones: Go1.9Early, Go1.10Early May 3, 2017

bradfitz removed this from the Go1.10Early milestone Jun 14, 2017

bradfitz modified the milestones: Go1.10, Go1.10Early Jun 14, 2017

rsc modified the milestones: Go1.10, Go1.11 Nov 22, 2017

ianlancetaylor removed the Blocked label May 17, 2018

bradfitz modified the milestones: Go1.11, Go1.12 May 18, 2018

ianlancetaylor added this to the Go1.12 milestone Jun 1, 2018

ianlancetaylor modified the milestones: Go1.12, Go1.13 Dec 12, 2018
