split() breaks no-break spaces #42731
Comments
string.split(), str.split() and unicode.split() without
>>> u"Hello\u00A0world".split()
[u'Hello', u'world']
split isn't a word-wrapping split, so I'm not sure that's
Python documentation says that it splits in "whitespace
However, I feel the need for the splitting function that
What's wrong with the following?
import sys, unicodedata
spaces = u"".join(unichr(c) for c in xrange(0,
sys.maxunicode) if unicodedata.category(unichr(c))=="Zs" and
c != 160)
foo.split(spaces) |
Maxim, you are right that \xA0 is a non-break space.
If you'd rather like to see a different set of whitespace
Closing this as "Won't fix".
Walter and MAL, did you actually try that workaround? It
doesn't work:
>>> import sys, unicodedata
>>> spaces = u"".join(unichr(c) for c in xrange(0,
sys.maxunicode) if unicodedata.category(unichr(c))=="Zs" and
c != 160)
>>> foo = u"Hello\u00A0world"
>>> foo.split(spaces)
[u'Hello\xa0world']
That's because split() takes the whole separator argument as
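To make the difference concrete, here is a minimal sketch of the behaviour shown in the session above. It is written for Python 3 (chr and range in place of unichr and xrange), which is an assumption on my part since the thread itself uses Python 2; the variable names are only illustrative.

```python
import sys
import unicodedata

# Every "Zs" (space separator) code point except U+00A0,
# built the same way as in the workaround quoted above.
spaces = "".join(
    chr(c)
    for c in range(sys.maxunicode + 1)
    if unicodedata.category(chr(c)) == "Zs" and c != 0xA0
)

foo = "Hello\u00A0world and more"

# split(sep) searches for the complete string `sep` as one delimiter.
# `spaces` never occurs as a whole inside `foo`, so nothing is split.
unsplit = foo.split(spaces)

# The same rule with a short separator: "--" is one delimiter,
# not "either of the characters '-' and '-'".
dashes = "a--b-c".split("--")

# strip(chars), by contrast, treats its argument as a set of characters.
# It removes the leading spaces but stops at the trailing U+00A0,
# because U+00A0 was left out of `spaces`.
stripped = " \u2003x\u00A0".strip(spaces)
```

So passing a string of many space characters works for strip() but not for split().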
Oops. You're right, Sjoerd.
Still, you could achieve the splitting by using a
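The comment above is cut off, so the approach it had in mind is not known. One possibility, offered only as a sketch, is a regular expression that matches whitespace other than U+00A0. The helper name split_keep_nbsp is hypothetical, and the code assumes Python 3, where \s matches Unicode whitespace for str patterns.

```python
import re

# [^\S\u00A0] reads as "not (non-whitespace or U+00A0)",
# i.e. any whitespace character except the no-break space.
_WS_EXCEPT_NBSP = re.compile(r"[^\S\u00A0]+")


def split_keep_nbsp(text):
    """Split on runs of whitespace, leaving U+00A0 inside the words.

    Hypothetical helper, not part of the standard library.
    """
    # Drop the empty strings produced by leading or trailing whitespace,
    # which mirrors how split() without arguments behaves.
    return [part for part in _WS_EXCEPT_NBSP.split(text) if part]


words = split_keep_nbsp("  Hello\u00A0world  foo\tbar\n")
```

Only code point 160 is excluded here, as in the earlier workaround; other no-break characters such as U+202F would still act as separators unless they are added to the character class.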
Seems I confused strip() with split(). I *did* try that work
If we want to fix this discrepancy, we could add methods
No. These things are application scope details and should thus
The methods always work on whitespace and that's clearly
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.