x/text/collate: Norwegian collation order differs from Danish #59908

Open
flwyd opened this issue May 1, 2023 · 1 comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

flwyd commented May 1, 2023

What version of Go are you using (go version)?

$ go version
go version go1.20.3 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/tstone/Library/Caches/go-build"
GOENV="/Users/tstone/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/tstone/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/tstone/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/opt/local/lib/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/opt/local/lib/go/pkg/tool/darwin_amd64"
GOVCS=""
GOVERSION="go1.20.3"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="/usr/bin/clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/tstone/devel/adif-multitool/go.mod"
GOWORK=""
CGO_CFLAGS="-O2 -g"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-O2 -g"
CGO_FFLAGS="-O2 -g"
CGO_LDFLAGS="-O2 -g"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch x86_64 -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/jl/971w22kn1_l85jzmswnpn3tw0000gn/T/go-build2430323303=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

Code on playground

The collation order for language.Norwegian sorts the letters æ, ø, and å after a, o, and a respectively, rather than as the last three letters of the alphabet (in that order). The collation for language.Danish puts those three letters at the end of the alphabet, as expected. It's my understanding that Norwegian and Danish use the same alphabetical order: the same initial 26-letter order as English, followed by these three letters, which are treated as letters in their own right rather than as base letters with diacritics. This ordering for both Norwegian and Danish is called out in the introduction to Unicode Technical Standard #10: Unicode Collation Algorithm and is also described in the "Danish and Norwegian alphabet" Wikipedia page.

What did you expect to see?

Norwegian and Danish should collate the same, with Æ, Ø, and Å at the end of the alphabet. These are U+00C6 LATIN CAPITAL LETTER AE, U+00D8 LATIN CAPITAL LETTER O WITH STROKE, U+00C5 LATIN CAPITAL LETTER A WITH RING ABOVE, and "SMALL" variants for lower case.

What did you see instead?

Norwegian (but not Danish) sorts these letters similarly to letters with diacritics in other European languages, rather than treating them as independent letters.

@gopherbot gopherbot added this to the Unreleased milestone May 1, 2023
@cagedmantis cagedmantis added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 1, 2023
@cagedmantis cagedmantis modified the milestones: Unreleased, Backlog May 1, 2023
@cagedmantis
Contributor

cc @mpvl
