unicode: use unicode.ToUpper on the result of unicode.ToLower returns the wrong rune if it is a Greek rune: ẞ and ϴ #56448

Closed
chanced opened this issue Oct 27, 2022 · 4 comments
Labels: FrozenDueToAge, NeedsInvestigation

Comments

@chanced

chanced commented Oct 27, 2022

What version of Go are you using (go version)?

$ go version
go version go1.19.2 darwin/arm64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/Users/chance/Library/Caches/go-build"
GOENV="/Users/chance/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/chance/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/chance/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/opt/homebrew/Cellar/go/1.19.2/libexec"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/opt/homebrew/Cellar/go/1.19.2/libexec/pkg/tool/darwin_arm64"
GOVCS=""
GOVERSION="go1.19.2"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/bc/myr1vrns0zq7nz8xb0ng7zkh0000gn/T/go-build1058298884=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

for _, r := range []rune{'ϴ', 'ẞ'} {
	l := unicode.ToLower(r)
	u := unicode.ToUpper(l)
	fmt.Printf("input: %v, lower: %v, upper: %v\n", r, l, u)
}

go.dev/play/p/ALKvDC3qbov

What did you expect to see?

I expected unicode.ToUpper applied to the result of unicode.ToLower to return the original rune for ẞ (U+1E9E) and ϴ (U+03F4).

input: 1012, lower: 952, upper: 1012
input: 7838, lower: 223, upper: 7838

What did you see instead?

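The round trip does not return the original rune: ϴ (1012) comes back as a different uppercase rune (920), and the lowercase ß (223) is returned unchanged by unicode.ToUpper.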

input: 1012, lower: 952, upper: 920
input: 7838, lower: 223, upper: 223
@hopehook changed the title from "affected/package: unicode" to "unicode: use unicode.ToUpper on the result of unicode.ToLower returns the wrong rune if it is a Greek rune: ẞ and ϴ" on Oct 27, 2022
@hopehook added the NeedsInvestigation label on Oct 27, 2022
@hopehook
Member

CC @robpike

@robpike
Contributor

robpike commented Oct 27, 2022

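There are multiple theta characters in Unicode, and I don't know what the right word is, but 1012 is U+03F4, which is not "canonical". Unicode names it GREEK CAPITAL THETA SYMBOL; it is a symbol variant, not the ordinary letter Θ (U+0398). It's in a funny state, and Go is just doing what Unicode says: U+03F4 lowercases to θ (U+03B8, 952), and that uppercases to the ordinary Θ (U+0398, 920). Lowercasing and uppercasing tend to canonicalize.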

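The ess-zed character is different but similar. It's U+1E9E and lowercases to U+00DF, but U+00DF has no single-rune uppercase mapping in Unicode (its full uppercase form is the two-letter sequence "SS"), so unicode.ToUpper returns U+00DF (223) unchanged.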

So although it may be surprising behavior, it's correct.
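
For illustration, here is a minimal sketch of the round trip discussed above, printing code points with %U so the mappings are easier to follow than the decimal values in the report. The helper name roundTrip is hypothetical, and the ordinary capital theta (U+0398) is added only as a contrasting case that does survive the round trip.

```go
package main

import (
	"fmt"
	"unicode"
)

// roundTrip lowercases r and then uppercases the result, mirroring the
// loop in the report. The name is illustrative, not part of any library.
func roundTrip(r rune) (lower, upper rune) {
	lower = unicode.ToLower(r)
	upper = unicode.ToUpper(lower)
	return lower, upper
}

func main() {
	// U+03F4 and U+1E9E are the runes from the report; U+0398 is the
	// ordinary capital theta, included here for comparison.
	for _, r := range []rune{'\u03F4', '\u1E9E', '\u0398'} {
		l, u := roundTrip(r)
		// %U prints the code point, %c the character itself.
		fmt.Printf("input: %U %c, lower: %U %c, upper: %U %c, round-trips: %t\n",
			r, r, l, l, u, u, u == r)
	}
}
```

Only the third rune should report that it round-trips, which matches the decimal output shown in the report for the first two.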

@robpike closed this as completed on Oct 27, 2022
@robpike
Contributor

robpike commented Oct 27, 2022

By the way, case changing is not reflexive (a round trip need not return the original rune), however much it may seem it should be. The rules aren't even consistent for the same code point in different languages.

@chanced
Author

chanced commented Oct 27, 2022

Thank you for the thorough explanation, and for the language itself, Rob.

I'll pass it along to the individual that raised the issue in my package.

Much obliged.

@golang golang locked and limited conversation to collaborators Oct 27, 2023