encoding/xml: line endings in data get replaced #24426

tehsphinx opened this Issue Mar 16, 2018 · 3 comments



tehsphinx commented Mar 16, 2018

What version of Go are you using (go version)?

go version go1.9.3 darwin/amd64

Does this issue reproduce with the latest release?

I created a copy of the xml package and applied the code changes of
which is supposed to fix issue
but the issue persists.

What did you do?

I get some xml from another service with data that contains line endings.
After parsing the data all line endings are standardized to \n.

I found this on the subject, but I wonder whether that is even supposed to touch line endings inside CDATA. If it is, just kindly let me know and I can move on to finding a workaround.

Here is a reproducible sample with and without CDATA escaping:

What did you expect to see?

My data with \r\n line endings intact.

What did you see instead?

My data with \n line endings.
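
The behavior reported above can be illustrated with a minimal sketch. This is not the original sample from the issue; the `payload` type, the `decodeData` helper, and the element names are hypothetical and chosen only for illustration. It decodes the same text once as plain character data and once inside a CDATA section:

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// payload is a hypothetical document shape used only for this illustration.
type payload struct {
	Data string `xml:"data"`
}

// decodeData unmarshals doc and returns the text of its <data> element.
func decodeData(doc string) (string, error) {
	var p payload
	if err := xml.Unmarshal([]byte(doc), &p); err != nil {
		return "", err
	}
	return p.Data, nil
}

func main() {
	docs := []string{
		// Plain character data containing a CRLF.
		"<payload><data>line1\r\nline2</data></payload>",
		// The same text wrapped in a CDATA section.
		"<payload><data><![CDATA[line1\r\nline2]]></data></payload>",
	}
	for _, doc := range docs {
		got, err := decodeData(doc)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		// %q makes the line-ending characters visible.
		fmt.Printf("%q\n", got)
	}
}
```

In both cases the decoded string should contain `\n` where the input had `\r\n`, which matches the report.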

andybons added this to the Unplanned milestone Mar 16, 2018



andybons commented Mar 16, 2018

Hm. The section on CDATA makes no note about normalizing line endings. However, CDATA sections are unparsed entities, and the line endings section seems to apply only to parsed entities.

This could go either way. Do you know how other parsers handle it?

@ianlancetaylor @rsc?



tehsphinx commented Mar 17, 2018

Did some more research on the topic:

On Stack Overflow, somebody argues that since line-ending normalization has to be done before parsing, the XML parser does not yet know whether a line ending is part of a CDATA section: stackoverflow

MSDN library states:

XML processors treat the character sequence Carriage Return-Line Feed (CRLF) like single CR or LF characters. All are reported as a single LF character. Applications can save documents using the appropriate line-ending convention.

So I guess the Go XML parser is correctly implemented, and I should use base64 encoding to get line endings across.
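
A minimal sketch of the base64 workaround mentioned above, assuming the sending service can be changed to encode the payload before embedding it. The `message` type and the `encodeMessage` and `decodeMessage` helpers are hypothetical names, not part of `encoding/xml`:

```go
package main

import (
	"encoding/base64"
	"encoding/xml"
	"fmt"
)

// message is a hypothetical envelope; Data carries base64 text, not raw bytes.
type message struct {
	Data string `xml:"data"`
}

// encodeMessage base64-encodes raw and wraps it in an XML document.
func encodeMessage(raw []byte) ([]byte, error) {
	return xml.Marshal(message{Data: base64.StdEncoding.EncodeToString(raw)})
}

// decodeMessage parses the XML document and base64-decodes the payload.
func decodeMessage(doc []byte) ([]byte, error) {
	var m message
	if err := xml.Unmarshal(doc, &m); err != nil {
		return nil, err
	}
	return base64.StdEncoding.DecodeString(m.Data)
}

func main() {
	raw := []byte("line1\r\nline2")

	doc, err := encodeMessage(raw)
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}

	out, err := decodeMessage(doc)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	// The CRLF survives because the XML layer only ever sees base64 characters.
	fmt.Printf("%q\n", out) // prints "line1\r\nline2"
}
```

The trade-off is that the payload is no longer human-readable in the XML and grows by roughly a third, so this fits best when exact bytes matter more than readability.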



andybons commented Mar 18, 2018

OK. Closing for now. Let us know if you have any other concerns and feel free to re-open if you like.

andybons closed this Mar 18, 2018

golang locked and limited conversation to collaborators Mar 18, 2019
