encoding/xml: Encoder encodes double quotes incorrectly. #12400

Closed
dragonfax opened this Issue Aug 30, 2015 · 5 comments

dragonfax commented Aug 30, 2015

xml.Encoder encodes a double quote in a text node into the entity &#34;. Technically this is a valid entity. But the standard is to encode a double quote into &quot; instead. This is the standard when not using a DTD. https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references#Predefined_entities_in_XML

I only noticed this because it sends riak-cs haywire when used with the aws-sdk-go client (an AWS S3 client written in Go).

Here is an example.
https://gist.github.com/dragonfax/ca3ee45a0acf97820f58
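
For readers without access to the gist, the reported behavior can be reproduced with a minimal sketch like the one below. The `note` type and the `marshalNote` helper are hypothetical names made up for illustration; only `xml.Marshal` and the `,chardata` struct tag come from encoding/xml.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// note is a hypothetical element type; its Text field is written
// as the element's character data.
type note struct {
	XMLName xml.Name `xml:"note"`
	Text    string   `xml:",chardata"`
}

// marshalNote returns the XML that encoding/xml produces for the given text.
func marshalNote(text string) (string, error) {
	out, err := xml.Marshal(note{Text: text})
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	out, err := marshalNote(`She said "hi" & it's fine`)
	if err != nil {
		panic(err)
	}
	// Prints: <note>She said &#34;hi&#34; &amp; it&#39;s fine</note>
	fmt.Println(out)
}
```

The ampersand is written as the named entity &amp;, while both quote characters come out as numeric character references.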

mikioh changed the title from "xml.Encoder encodes double quotes incorrectly." to "encoding/xml: Encoder encodes double quotes incorrectly." on Aug 30, 2015

ianlancetaylor added this to the Unplanned milestone on Aug 30, 2015

nodirt commented Oct 14, 2015

The same applies to the apostrophe ('), which is encoded as &#39; rather than &apos;: https://play.golang.org/p/tIu_6xahLG
I'd like to work on this.

nodirt commented Oct 14, 2015

Except, this may be considered backwards incompatible. @adg?

nodirt commented Oct 14, 2015

Apparently, this behavior is intended because &#34; and &#39; are shorter than &quot; and &apos;, respectively:

go/src/encoding/xml/xml.go

Lines 1833 to 1834 in bf21643

esc_quot = []byte("&#34;") // shorter than "&quot;"
esc_apos = []byte("&#39;") // shorter than "&apos;"

@adg (or another core gopher) to make the final decision (I can't close bugs)
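
Because the numeric forms are hard-coded in those constants, a caller whose consumer insists on the named entities would have to rewrite the output itself. The following is only a sketch of one possible workaround, not something proposed in this thread; `namedQuotes`, `marshalWithNamedQuotes`, and the `note` type are hypothetical names. It assumes the document has no CDATA sections, comments, or raw inner XML containing the literal text &#34; or &#39;, since a plain string replacement would rewrite those too.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// namedQuotes swaps the numeric references emitted by encoding/xml
// for the equivalent predefined named entities.
var namedQuotes = strings.NewReplacer(
	"&#34;", "&quot;",
	"&#39;", "&apos;",
)

// note is a hypothetical element type used only for this example.
type note struct {
	XMLName xml.Name `xml:"note"`
	Text    string   `xml:",chardata"`
}

// marshalWithNamedQuotes marshals v and then post-processes the result.
// Assumption: the output holds no CDATA, comments, or raw inner XML.
func marshalWithNamedQuotes(v interface{}) (string, error) {
	out, err := xml.Marshal(v)
	if err != nil {
		return "", err
	}
	return namedQuotes.Replace(string(out)), nil
}

func main() {
	out, err := marshalWithNamedQuotes(note{Text: `a "b" 'c'`})
	if err != nil {
		panic(err)
	}
	// Prints: <note>a &quot;b&quot; &apos;c&apos;</note>
	fmt.Println(out)
}
```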

adg commented Oct 15, 2015

I don't see anywhere in the spec that says &quot; must be used and not &#34;. In fact, all I could find was this sentence which suggests that it's fine to use the numeric entity instead:

Entity and character references may both be used to escape the left angle bracket, ampersand, and other delimiters. A set of general entities (amp, lt, gt, apos, quot) is specified for this purpose. Numeric character references may also be used; they are expanded immediately when recognized and must be treated as character data, so the numeric character references "&#60;" and "&#38;" may be used to escape < and & when they occur in character data.

I'd be inclined not to change the existing behavior because—as @nodirt says—this may break (or at least unexpectedly change) existing programs.
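
The equivalence the spec describes can be checked from the decoding side. This is a small sketch rather than a conformance test; `decodeText` and the `<v>` element are hypothetical names chosen for the example. A conforming parser, including Go's own, should yield the same one-character string for the named entity and for the decimal and hexadecimal references.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// decodeText unmarshals a single element and returns its character data.
func decodeText(doc string) (string, error) {
	var v struct {
		Text string `xml:",chardata"`
	}
	if err := xml.Unmarshal([]byte(doc), &v); err != nil {
		return "", err
	}
	return v.Text, nil
}

func main() {
	docs := []string{
		`<v>&quot;</v>`, // predefined named entity
		`<v>&#34;</v>`,  // decimal character reference
		`<v>&#x22;</v>`, // hexadecimal character reference
	}
	for _, doc := range docs {
		s, err := decodeText(doc)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s decodes to %q\n", doc, s)
	}
}
```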

rakyll commented Oct 15, 2015

&quot; is just a predefined &#34;. Given the fact that &#34; is intentional, there is no reason we should fix this bug if there are no other major practical reasons.

rakyll closed this on Oct 15, 2015

golang locked and limited conversation to collaborators on Oct 17, 2016
