bytes, strings: Title does not treat Unicode punctuation as separators #34994

Open
amwolff opened this issue Oct 18, 2019 · 4 comments · May be fixed by #34995

@amwolff amwolff commented Oct 18, 2019

What version of Go are you using (go version)?

$ go version
go version go1.13.3 darwin/amd64

Does this issue reproduce with the latest release?

Affirmative.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN="..."
GOCACHE="/Users/amw/Library/Caches/go-build"
GOENV="/Users/amw/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="..."
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/73/mbpfcm4d1gj_9nbjpfm_0qyc0000gn/T/go-build033100524=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

The bug that prevents Title from capitalizing letters that begin words preceded by Unicode punctuation is mentioned here: https://github.com/golang/go/blob/master/src/strings/strings.go#L713

Also here: https://github.com/golang/go/blob/master/src/bytes/bytes.go#L652

Simple recipe reproducing the bug: https://play.golang.org/p/b1PyVSETmV3

Output:

Go.Go․go != Go.Go․Go

What did you expect to see?

No output (every word in the processed string should be capitalized).

What did you see instead?

The word after U+2024 ONE DOT LEADER (․) remained uncapitalized.
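
For readers who cannot open the playground link, here is a minimal sketch of what such a reproduction might look like. The input string `go.go․go` is an assumption inferred from the output above, and `titleCase` is a hypothetical wrapper, not code taken from the linked snippet.

```go
package main

import (
	"fmt"
	"strings"
)

// titleCase is a hypothetical thin wrapper around strings.Title
// (deprecated since Go 1.18, but still available).
func titleCase(s string) string {
	return strings.Title(s)
}

func main() {
	// Assumed input: three words, separated first by an ASCII '.'
	// and then by U+2024 ONE DOT LEADER.
	in := "go.go\u2024go"
	want := "Go.Go\u2024Go"

	// Print only on mismatch, so "no output" means every word was capitalized.
	if got := titleCase(in); got != want {
		fmt.Printf("%s != %s\n", got, want)
	}
}
```

The ASCII period is treated as a word boundary, while U+2024 is not, so the last word keeps its lowercase first letter.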

@gopherbot gopherbot commented Oct 18, 2019

Change https://golang.org/cl/202077 mentions this issue: bytes, strings: make Title treat Unicode punctuation as separators
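
As an illustration of the idea in that change title, the following is a rough sketch of a title-casing helper whose separator test also accepts Unicode punctuation. It is not the code from the CL: `isSeparator` and `titlePunct` are hypothetical names, and the rule shown (ASCII non-alphanumerics, Unicode spaces, and anything `unicode.IsPunct` reports) is an assumption about one way this could be done.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// isSeparator is a hypothetical word-boundary test. Letters, digits and
// underscore never separate words; any other ASCII character does; beyond
// ASCII, spaces and Unicode punctuation (assumed rule) do.
func isSeparator(r rune) bool {
	if unicode.IsLetter(r) || unicode.IsDigit(r) || r == '_' {
		return false
	}
	if r <= 0x7F {
		return true
	}
	return unicode.IsSpace(r) || unicode.IsPunct(r)
}

// titlePunct maps the first rune after each separator to title case.
func titlePunct(s string) string {
	prev := ' ' // treat the start of the string as following a separator
	return strings.Map(func(r rune) rune {
		atWordStart := isSeparator(prev)
		prev = r
		if atWordStart {
			return unicode.ToTitle(r)
		}
		return r
	}, s)
}

func main() {
	fmt.Println(titlePunct("go.go\u2024go"))
}
```

With this rule, the word after U+2024 ONE DOT LEADER is capitalized as well. Whether changing the return value of Title in this way is acceptable is a separate compatibility question.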

@smasher164 smasher164 (Member) commented Oct 19, 2019

Potential duplicate of #6801, although this is specific to permitting Unicode punctuation. I'm not sure that the return value of Title can be changed now.
/cc @bradfitz @ianlancetaylor

@ianlancetaylor ianlancetaylor (Contributor) commented Oct 19, 2019

It's not obvious to me that we can change this now.

If we do change it, does Unicode define the set of characters that break words? Is that locale dependent? CC @mpvl

@amwolff amwolff (Author) commented Oct 20, 2019
