encoding/xml: character encoding detection should be case insensitive #12417

Closed
cGuille opened this Issue Aug 31, 2015 · 3 comments

cGuille commented Aug 31, 2015

Hello,

  • What version of Go are you using (go version)? go version go1.3.3 linux/amd64
  • What operating system and processor architecture are you using? Debian Wheezy (Linux 3.16.0-44-generic x86_64)
  • What did you do? I ran an app which parsed a third party XML file.
  • What did you expect to see? The parsing should work and the data should be extracted from the file.
  • What did you see instead? An error: "xml: encoding "Utf-8" declared but Decoder.CharsetReader is nil".
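
For readers hitting the same error, one possible workaround is to set `Decoder.CharsetReader` yourself. The sketch below is an illustration rather than the fix adopted for this issue: `lenientCharsetReader` and `decodeTitle` are hypothetical names, and the sample document is invented. It accepts any capitalization of "utf-8" and passes the input through unchanged.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// lenientCharsetReader is a hypothetical helper: it treats any
// capitalization of "utf-8" as UTF-8 and rejects every other charset.
func lenientCharsetReader(charset string, input io.Reader) (io.Reader, error) {
	if strings.EqualFold(charset, "utf-8") {
		// The bytes are already UTF-8, so no conversion is needed.
		return input, nil
	}
	return nil, fmt.Errorf("unsupported charset: %q", charset)
}

// decodeTitle parses doc and returns the character data of its root element.
func decodeTitle(doc string) (string, error) {
	var v struct {
		Title string `xml:",chardata"`
	}
	dec := xml.NewDecoder(strings.NewReader(doc))
	dec.CharsetReader = lenientCharsetReader
	if err := dec.Decode(&v); err != nil {
		return "", err
	}
	return v.Title, nil
}

func main() {
	// Invented sample document with the mixed-case encoding name.
	doc := `<?xml version="1.0" encoding="Utf-8"?><title>hello</title>`
	title, err := decodeTitle(doc)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(title)
}
```

On Go versions that already match the encoding name case-insensitively, the callback is simply never invoked for "Utf-8", so the workaround should be harmless there.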

As discussed in the Google group, I ran into a problem while parsing an XML document that declares its encoding as Utf-8, which matches neither UTF-8 nor utf-8, the two spellings the encoding/xml library expects.

This document specifies that "XML processors should match character encoding names in a case-insensitive way".

I plan to propose a patch doing just that, as soon as I install the golang dev tools (it is kind of my first meeting with Go).

About the implementation, would you just write a simple comparison such as strings.ToUpper(enc) == "UTF-8", or would you prefer using something like the charset lookup function?
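
To make the comparison question concrete, here is a minimal sketch of two ways to test the encoding name. `isUTF8Upper` follows the `strings.ToUpper` idea from the question; `isUTF8Fold` uses `strings.EqualFold`, an alternative the question does not mention, which avoids allocating an upper-cased copy. Both function names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// isUTF8Upper upper-cases the name before comparing, as proposed above.
func isUTF8Upper(enc string) bool {
	return strings.ToUpper(enc) == "UTF-8"
}

// isUTF8Fold compares under Unicode case folding without building a new string.
func isUTF8Fold(enc string) bool {
	return strings.EqualFold(enc, "utf-8")
}

func main() {
	for _, enc := range []string{"UTF-8", "utf-8", "Utf-8", "ISO-8859-1"} {
		fmt.Printf("%-10s upper=%t fold=%t\n", enc, isUTF8Upper(enc), isUTF8Fold(enc))
	}
}
```

Both helpers agree on these inputs; the choice between them is mostly a matter of style and allocation cost.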

Thanks.

dullgiulio commented Aug 31, 2015

Could you check whether https://go-review.googlesource.com/14084 does what you want?

@ianlancetaylor ianlancetaylor added this to the Go1.6 milestone Aug 31, 2015

cGuille commented Aug 31, 2015

I confirm it does! I spent an entire hour building everything, from Go to my app, in a clean Docker container. :D
When I check out the commit before yours, I reproduce my issue; when I check out your commit, it works fine.

@ianlancetaylor ianlancetaylor changed the title encoding/xml Character encoding detection should be case insensitive encoding/xml: Character encoding detection should be case insensitive Sep 3, 2015

gopherbot commented Oct 23, 2015

CL https://golang.org/cl/14084 mentions this issue.

@rsc rsc changed the title encoding/xml: Character encoding detection should be case insensitive encoding/xml: character encoding detection should be case insensitive Nov 5, 2015

@rsc rsc closed this in 0b55be1 Nov 25, 2015

@golang golang locked and limited conversation to collaborators Nov 27, 2016
