
proposal: encoding/xml: Add EmptyElement token type to support self-closing elements. #69273

@nemith


Proposal Details

Background

The current implementation has no concept of self-closing elements. If an empty value (such as an empty string) is provided, a start element followed by an end element is emitted, like <foo></foo>.
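To make the current behavior concrete, here is a minimal sketch using only the existing encoding/xml API. The Config type and its field names are hypothetical and exist only for this illustration.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Config is a hypothetical struct used only for this illustration.
type Config struct {
	XMLName xml.Name `xml:"config"`
	Running string   `xml:"running"`
}

// marshalEmpty encodes a Config whose only field is an empty string.
func marshalEmpty() (string, error) {
	out, err := xml.Marshal(Config{})
	return string(out), err
}

func main() {
	s, err := marshalEmpty()
	if err != nil {
		panic(err)
	}
	// The empty string is written as a start/end pair, not as <running/>.
	fmt.Println(s) // <config><running></running></config>
}
```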

Unfortunately, there are XML implementations out there that depend on self-closing elements. These are usually closed source and are probably hand-parsing certain elements. These implementations are broken, but it is what we have.

In my case, Juniper Networks has an XML API to their routers. It is closed source and expects certain elements to denote configuration targets such as the running config or the startup config. These probably should not be elements at all, but the API only accepts <running/> or <startup/> as valid config targets. Sending <running></running> results in a parsing error.

There may be other reasons to support self-closing elements, but I am not aware of them.

There have been a number of requests to support self-closing elements in the encoding/xml package.

And on external repos

Proposal

This expands on some of the discussions in #21399 by @SamWhited, @Merovius and others that there needs to be a way to encode self-closing elements with Encoder.EncodeToken() (at the very least).

This would add a new token type called EmptyElement (I am not in love with the name and we can bikeshed it if needed), which would essentially be a clone of StartElement.

// EmptyElement represents a self-closing element (e.g. <my-element/>). This
// element type is only used during encoding and is never emitted when decoding.
type EmptyElement struct {
	Name Name
	Attr []Attr
}

This element would be encoded by Encoder.EncodeToken() and would emit a self-closing tag for the given Name and optional attributes.

enc := xml.NewEncoder(w)
err := enc.EncodeToken(xml.EmptyElement{Name: xml.Name{Local: "foo"}})

would emit <foo/> to the writer (once the encoder is flushed, as with any other token).

Unlike StartElement, no error would be reported if a matching EndElement never follows, because the element is already self-closing.

This new token type would only be used for encoding. To remain fully backwards compatible, decoding would continue to use the same logic and would only produce StartElement and EndElement tokens, even when a self-closing tag is found. The documentation will state this.
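The decoding behavior that would stay unchanged can be checked with the existing Decoder. In this sketch (the helper name tokenKinds is made up for illustration), <foo/> and <foo></foo> yield the same token sequence.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// tokenKinds decodes doc and returns a short label for each token read.
func tokenKinds(doc string) ([]string, error) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	var kinds []string
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return kinds, nil
		}
		if err != nil {
			return nil, err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			kinds = append(kinds, "start:"+t.Name.Local)
		case xml.EndElement:
			kinds = append(kinds, "end:"+t.Name.Local)
		default:
			kinds = append(kinds, fmt.Sprintf("%T", t))
		}
	}
}

func main() {
	for _, doc := range []string{"<foo/>", "<foo></foo>"} {
		kinds, err := tokenKinds(doc)
		if err != nil {
			panic(err)
		}
		// Both inputs print [start:foo end:foo].
		fmt.Println(doc, kinds)
	}
}
```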

Given this new type, custom types implementing xml.Marshaler can be written to emit self-closing elements. In the future, a struct tag that makes this easier when encoding structs could be investigated, but that is probably beyond the scope of this issue.
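As a rough sketch of where this would plug in, here is a custom xml.Marshaler written against today's API. The Target and Request types are hypothetical. The comment marks the one call that the proposed EmptyElement (which does not exist in encoding/xml today) would replace.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Target is a hypothetical marker type for a config target such as <running/>.
type Target struct{}

// MarshalXML writes the element using only the tokens that exist today.
// With the proposed xml.EmptyElement, the two EncodeToken calls below
// would presumably collapse into one:
//
//	e.EncodeToken(xml.EmptyElement{Name: start.Name, Attr: start.Attr})
func (Target) MarshalXML(e *xml.Encoder, start xml.StartElement) error {
	if err := e.EncodeToken(start); err != nil {
		return err
	}
	return e.EncodeToken(start.End())
}

// Request is a hypothetical wrapper that holds one Target field.
type Request struct {
	XMLName xml.Name `xml:"target"`
	Running Target   `xml:"running"`
}

// marshalRequest encodes an empty Request with xml.Marshal.
func marshalRequest() (string, error) {
	out, err := xml.Marshal(Request{})
	return string(out), err
}

func main() {
	s, err := marshalRequest()
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // <target><running></running></target>
}
```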

Alternatives

Automatically convert "empty" elements to self-closing (i.e: <foo></foo> -> <foo/>)

This was my original suggestion back in 2013 in #6896. As was pointed out, this would be a breaking change to the encoder's output and is not feasible within the existing package.

allowempty struct tag or similar as proposed in #21399

This was implemented in https://go-review.googlesource.com/c/go/+/59830 and inspired this change. However, there is no way to use it with the lower-level Encoder.EncodeElement() or Encoder.EncodeToken() methods.

This proposal would lay the groundwork for basic support, and something like a struct tag could then be layered on top later if needed.

Extend StartElement struct

StartElement could be extended with a SelfClose bool field to mark it as self-closing.

i.e:

type StartElement struct {
	Name      Name
	Attr      []Attr
	SelfClose bool // ADDED
}

This would allow Encoder.EncodeToken() to produce the right tag. However, it may be confusing when used with Encoder.EncodeElement(), which specifically requires a StartElement token to be passed in along with a value. Today, if a nil value is passed to Encoder.EncodeElement(), no XML element is produced.

One could imagine revising this so that a nil value with SelfClose set produces a self-closing tag. You would also want to return an error if a non-nil (or non-empty?) value is passed while SelfClose is true.

However, I believe this makes for a more confusing API given the current EncodeElement() function signature as well as the MarshalXML() method on the Marshaler interface.
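The nil-value behavior of EncodeElement() mentioned above can be observed with the existing API. This is a small sketch, and the helper name encodeElement is made up for illustration.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
)

// encodeElement encodes v under a <running> start element and returns the output.
func encodeElement(v any) (string, error) {
	var buf bytes.Buffer
	enc := xml.NewEncoder(&buf)
	start := xml.StartElement{Name: xml.Name{Local: "running"}}
	if err := enc.EncodeElement(v, start); err != nil {
		return "", err
	}
	if err := enc.Flush(); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// A nil value writes nothing at all; an empty string writes a start/end pair.
	for _, v := range []any{nil, ""} {
		s, err := encodeElement(v)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%q\n", s)
	}
}
```

Neither case yields <running/>, which is why a SelfClose flag would need new rules for how EncodeElement() interprets its value argument.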

A RawXML token as proposed in #26756

A RawXML token, as proposed in #26756, would let users compose and emit their own self-closing tags. This could replace or augment this proposal and would be an acceptable outcome.

Wait for a revised encoding/xml sweep and possibly an overhaul (encoding/xml/v2?)

There are a number of other shortcomings in the xml package. It may be worth holding off on any changes to the existing API and moving to a new package instead. That package could replace the one in the standard library, or perhaps be created outside of the stdlib, with the existing encoding/xml placed on an official freeze similar to packages like net/smtp, since the demand for XML isn't as big as for JSON and other encodings.
