proposal: encoding/xml: support *string for innerxml in Unmarshal #26512

Open
damianoneill opened this issue Jul 20, 2018 · 5 comments

Comments

@damianoneill
commented Jul 20, 2018

I would like the contributors to consider extending xml.Unmarshal to support unmarshalling innerxml into a *string.

Currently, func Unmarshal(data []byte, v interface{}) error decodes an element's inner XML into an innerxml field of type []byte or string; decoding into a *string leaves the pointer nil, as seen below.

https://play.golang.org/p/oaGu0rKYNgi
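
For readers who cannot open the playground link, here is a minimal sketch of the behaviour described above. It is not the code from that link; the type names withString and withPointer and the helper decodeBoth are made up for illustration. It decodes the same document into a string field and into a *string field so the two results can be compared.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// withString uses a plain string for the innerxml field.
type withString struct {
	XMLName xml.Name `xml:"a"`
	Inner   string   `xml:",innerxml"`
}

// withPointer uses a *string, the form this proposal asks Unmarshal to support.
type withPointer struct {
	XMLName xml.Name `xml:"a"`
	Inner   *string  `xml:",innerxml"`
}

// decodeBoth (hypothetical helper) unmarshals one document into both shapes.
func decodeBoth(doc string) (withString, withPointer, error) {
	var s withString
	var p withPointer
	if err := xml.Unmarshal([]byte(doc), &s); err != nil {
		return s, p, err
	}
	err := xml.Unmarshal([]byte(doc), &p)
	return s, p, err
}

func main() {
	s, p, err := decodeBoth(`<a><b/></a>`)
	if err != nil {
		fmt.Println("unmarshal error:", err)
		return
	}
	// The string field receives the raw inner XML.
	fmt.Printf("string field: %q\n", s.Inner)
	// As reported in this issue, the *string field is left untouched.
	fmt.Println("pointer field is nil:", p.Inner == nil)
}
```

No error is returned in the pointer case, which is why the nil result is easy to miss.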

The motivation is to support use cases where both empty and 'not set' semantics are required. This is described well in this article: https://willnorris.com/2014/05/go-rest-apis-and-pointers#pointers

Thanks in advance,
Damian.

@bcmills changed the title from "Extend xml.Unmarshal to support unmarshalling innerxml as a *string" to "proposal: encoding/xml: support *string for innerxml in Unmarshal" on Jul 23, 2018

@gopherbot added this to the Proposal milestone Jul 23, 2018

@gopherbot added the Proposal label Jul 23, 2018

@bcmills

Member

commented Jul 23, 2018

> use cases where both empty and 'not set' semantics are required

Do you have some example real-world XML APIs that require this distinction?
Do other XML libraries (e.g. in other popular languages) support it?

It's not obvious to me whether this sort of distinction is idiomatic in XML in general: if it is, we should consider supporting it, but we shouldn't add more decisions for Go users to make if those features aren't portable to other XML parsers anyway.

@damianoneill

Author

commented Jul 30, 2018

Hi @bcmills, sorry for the delay. You can refer to this RFC: https://tools.ietf.org/html/rfc6241#page-20. It describes a valid document for the use case above.

See below for valid documents:

<filter type="subtree">
  <top xmlns="http://example.com/schema/1.2/config">
    <users/>
  </top>
</filter>

The above shows a document whose filter includes a containment, in this case beginning with the element top. Note that there are many valid sub-elements; no rule can be inferred other than that this is innerxml.

<filter type="subtree">   </filter>

The above case is an empty filter with no containment; note the whitespace.

Ideally I could decode this into a struct:

type Filter struct {
	XMLName     xml.Name `xml:"filter,omitempty" json:"filter,omitempty"`
	Type        string   `xml:"type,attr"  json:"type,omitempty"`
	Containment *string  `xml:",innerxml" json:",omitempty"`
}

A nil check on the string pointer would allow me to infer that no Containment was set.
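
For comparison, here is a rough sketch of how the two documents above might be told apart with the API as it exists today, using a plain string field. The lowercase filter type and the hasContainment helper are hypothetical, and treating whitespace-only content as "no containment" is an assumption made for this sketch, not something the RFC excerpt or this thread settles.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// filter mirrors the Filter struct above, but with a plain string field.
type filter struct {
	XMLName     xml.Name `xml:"filter"`
	Type        string   `xml:"type,attr"`
	Containment string   `xml:",innerxml"`
}

// hasContainment (hypothetical helper) reports whether the filter holds
// anything other than whitespace. Ignoring whitespace is an assumption.
func hasContainment(doc string) (bool, error) {
	var f filter
	if err := xml.Unmarshal([]byte(doc), &f); err != nil {
		return false, err
	}
	return strings.TrimSpace(f.Containment) != "", nil
}

func main() {
	docs := []string{
		`<filter type="subtree"><top xmlns="http://example.com/schema/1.2/config"><users/></top></filter>`,
		`<filter type="subtree">   </filter>`,
	}
	for _, doc := range docs {
		ok, err := hasContainment(doc)
		if err != nil {
			fmt.Println("unmarshal error:", err)
			continue
		}
		fmt.Printf("has containment: %v\n", ok)
	}
}
```

This check cannot express "the field was never set", which is the distinction the *string form is meant to carry.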

Does this make sense?

@bcmills

Member

commented Jul 30, 2018

I understand the syntactic difference. My question is, are there real-world APIs for which this is also a semantic difference? The Go encoding/xml package in general only preserves semantically-meaningful information: for example, it trims and ignores whitespace from boolean and integer attributes.

Without the pointer-to-string, you could still check Containment != "" (see https://play.golang.org/p/9k1yossNk9C). So are there real-world APIs for which the difference between

<filter type="subtree" />

and

<filter type="subtree"></filter>

is important?
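
To make the question concrete, the sketch below decodes both spellings into a string innerxml field. It is not the code behind the playground link above; the filter type and the innerXML helper are names invented for this example.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// filter captures the raw inner XML of a <filter> element as a string.
type filter struct {
	XMLName     xml.Name `xml:"filter"`
	Type        string   `xml:"type,attr"`
	Containment string   `xml:",innerxml"`
}

// innerXML (hypothetical helper) returns the decoded Containment value.
func innerXML(doc string) (string, error) {
	var f filter
	if err := xml.Unmarshal([]byte(doc), &f); err != nil {
		return "", err
	}
	return f.Containment, nil
}

func main() {
	docs := []string{
		`<filter type="subtree" />`,
		`<filter type="subtree"></filter>`,
	}
	for _, doc := range docs {
		got, err := innerXML(doc)
		if err != nil {
			fmt.Println("unmarshal error:", err)
			continue
		}
		// Both spellings yield an empty Containment string.
		fmt.Printf("%-34s -> %q\n", doc, got)
	}
}
```

With a string field the two forms are indistinguishable after decoding, so Containment != "" treats them the same way.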

@rsc

Contributor

commented Aug 6, 2018

Yes, I too am confused about when Containment != "" is the wrong check. I'm not even sure what it would mean to have a nil vs empty string. Are you trying to distinguish <filter></filter> and <filter/> as Bryan suggested? Or to distinguish <filter></filter> and <filter> </filter> (with spaces)? Either way I don't understand why that would be semantically meaningful.

@rsc

Contributor

commented Oct 17, 2018

On hold for XML sweep.

@rsc added the Proposal-Hold label Oct 17, 2018
