encoding/xml: bad UTF-8 coercion #5880

Closed
rsc opened this Issue Jul 13, 2013 · 1 comment

rsc commented Jul 13, 2013

The program below asks encoding/xml to marshal the byte "string"
{192,168,0,1}. That's C0 A8 00 01 in hex. C0 A8 has the right upper bits to be a valid
2-byte UTF-8 sequence, but it is not one: it is an overlong encoding, because the encoded
value (0x28) is small enough that it must be encoded as a single byte. encoding/xml appears
to pass the bytes through unaltered (or else does the check badly, which would be worse).
It does replace the 00 with U+FFFD. It should replace other invalid UTF-8 sequences too.

http://play.golang.org/p/xFXcF-2Hpb

package main

import (
    "encoding/hex"
    "encoding/xml"
    "fmt"
    "unicode/utf8"
)

type T []byte

func main() {
    data, _ := xml.Marshal(T{192,168,0,1})
    fmt.Println(hex.Dump(data))
    fmt.Println(utf8.Valid(data)) // should always be true for result of xml.Marshal!
}

adg commented Jul 30, 2013

Comment 1:

This issue was closed by revision 789e1c3.

Status changed to Fixed.

rsc added the fixed label Jul 30, 2013

golang locked and limited conversation to collaborators Jun 24, 2016
