Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/xml: unmarshaller and whitespace #22146

Closed
md2perpe opened this Issue Oct 5, 2017 · 6 comments

Comments

Projects
None yet
6 participants
@md2perpe
Copy link

md2perpe commented Oct 5, 2017

The issue

I had a file with space padded numbers (NumberOfPoints=" 266703") but the XML unmarshaller couldn't handle the spaces.

Booleans are trimmed, so why not numbers?

Does this issue reproduce with the latest release?

The issue exists in the currently latest commit in the GitHub repository:
https://github.com/golang/go/blob/99da873/src/encoding/xml/read.go

What operating system and processor architecture are you using (go env)?

Not relevant as the issue is in architecture-independent library code.

Examples

Good to know

I first asked about this and was given a workaround on Google Groups:
https://groups.google.com/forum/#!topic/golang-nuts/RtKU05cE9PY

@mvdan mvdan changed the title XML unmarshaller and whitespace encoding/xml: unmarshaller and whitespace Oct 5, 2017

@ianlancetaylor ianlancetaylor added this to the Go1.10 milestone Oct 5, 2017

@leighmcculloch

This comment has been minimized.

Copy link
Contributor

leighmcculloch commented Oct 6, 2017

According to the git history 1a37656 bools have been trimmed since they were first added by @rsc. @rsc: What was the motivation for trimming bools? Was it convenience?

If the reasoning was convenience I think it makes sense for there to be consistency across types where whitespace has no meaning since consistency will result in less wrong assumptions, but I think if it was to be added for consistency it'd need to be added to unsigned ints and floats too.

The no error on an invalid integer isn't intuitive and I think it should be considered a bug if trimming isn't introduced.

@kostix

This comment has been minimized.

Copy link

kostix commented Oct 6, 2017

Not to weigh on any side of the scales, I'd still point out a bit from my response to @md2perpe over there on the ML:

The question of whether the string " 266703" represents a valid
integer is hardly decidable in my opinion. For instance, would the
string "\v\v\v\n\n\n\t\t266703\n\n\n\v\v\v\v\t\x20\t" represent a valid
integer as well?

That is, while " 266703" looks innocent enough, though mildly annoying as it hints at sloppy
implementation of whatever generated that document, a string containing lots of "weird"
characters considered to be whitespace by Unicode (or Go's strings / bytes packages)
is, IMO, a bit less easily decidable to represent "valid" data.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

ianlancetaylor commented Oct 6, 2017

@leighmcculloch Since we don't do our code review on Github it sometimes helps to look over there to get more background on a change. In this case, https://golang.org/cl/218050 . In this case it looks like the purpose of the CL was to accept more strings for bool values. It's not clear that anybody really considered the call to strings.TrimSpace in there.

@rsc

This comment has been minimized.

Copy link
Contributor

rsc commented Oct 23, 2017

Clearly we didn't think much of adding strings.TrimSpace in CL 218050 (which Michael Hoisie wrote and I reviewed), but it seems fine to apply strings.TrimSpace to all those numeric fields. Please make sure to update the docs and add a test. CLs welcome. Thanks.

@rsc rsc added the NeedsFix label Oct 23, 2017

@leighmcculloch

This comment has been minimized.

Copy link
Contributor

leighmcculloch commented Oct 23, 2017

If no one else is already on it I'd like to submit a CL to make the change. 🖐️

@gopherbot

This comment has been minimized.

Copy link

gopherbot commented Oct 27, 2017

Change https://golang.org/cl/73891 mentions this issue: encoding/xml: ignore whitespace in values and attrs

@gopherbot gopherbot closed this in 14bc4f5 Nov 1, 2017

@golang golang locked and limited conversation to collaborators Nov 1, 2018

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.