This issue was discussed in #12750, at which point @mpvl noted that:
...the implementation is based on the CLDR UCA tables. If I look at the collation elements of both the DUCET (Unicode's tables) and CLDR (the tailorings) they both show Hangul to have a higher primary collation value than Latin. So that explains why Korean is sorted later.
What is probably happening in ICU is that the script for the selected language is sorted before other scripts. The Go implementation currently does not support script reordering, though. This is a TODO, but depends on changing the implementation to using fractional weights...
ianlancetaylor changed the title from "Go collation does not work for Korean" to "test/collate: collation does not work for Korean" on Feb 14, 2017
What version of Go are you using (go version)?
go1.8beta2 linux/amd64
What operating system and processor architecture are you using (go env)?
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/anx/go"
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build492080285=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
PKG_CONFIG="pkg-config"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
What did you do?
I attempted to sort some strings according to Korean rules.
These rules say that Korean characters should be sorted before Latin characters.
What did you expect to see?
Expected output: [나는 abc]
What did you see instead?
Actual output: [abc 나는]