x/text/language: change of behavior for language matcher #24211

Open
LayneChris opened this issue Mar 2, 2018 · 6 comments

Comments

@LayneChris
commented Mar 2, 2018

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

1.9.2

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

linux amd64

What did you do?

The language-matching behaviour of the golang.org/x/text package seems to have changed with an update a few days ago:

package main

import (
    "fmt"

    "golang.org/x/text/language"
)

func main() {
    s := []language.Tag{language.MustParse("en"), language.MustParse("fr")}
    p := []language.Tag{language.MustParse("en-US"), language.MustParse("en")}

    l := language.NewMatcher(s)
    ll, _, _ := l.Match(p...)

    fmt.Println(ll)
}

What did you expect to see?

This used to print "en" but now prints "en-u-rg-uszzzz". This doesn't make sense: I only support "en" and "fr", so why is it returning something else? Switching the order of my preferred languages gives "en".

Is there a reason for the change? I cannot see one. It defeats the purpose of a language "matcher" if it returns languages that I don't support.

If this is by design, what is the best way to get just "en"? Parent()? Base()? Something else?

What did you see instead?

"en-u-rg-uszzzz", a language I do not support.

@bradfitz bradfitz changed the title Change of behavior for language matcher x/text/language: change of behavior for language matcher Mar 2, 2018

@gopherbot gopherbot added this to the Unreleased milestone Mar 2, 2018

@bradfitz

Member

commented Mar 2, 2018

/cc @mpvl

@basgys


commented Mar 14, 2018

This issue seems to occur only when the preferred language contains regional preferences.

Example

Supported:

  • en
  • en-GB
  • de-CH
  • fr-CH
  • en-US

Tests:

  • want en: expected en, actual en
  • want fr: expected fr-CH, actual fr-CH
  • want de: expected de-CH, actual de-CH
  • want en-GB: expected en-GB, actual en-GB
  • want fr-CH: expected fr-CH, actual fr-CH
  • want de-CH: expected de-CH, actual de-CH
  • want it: expected en, actual en
  • want it-CH: expected en, actual en-u-rg-chzzzz

Regression

This regression was introduced in this commit: golang/text@6008361#diff-c716fa1ccf70cc54ac8b513b999c84eb

Here:

if w.RegionID != tt.RegionID && w.RegionID != 0 {
	if w.RegionID != 0 && tt.RegionID != 0 && tt.RegionID.Contains(w.RegionID) {
		tt.RegionID = w.RegionID
		tt.RemakeString()
	} else if r := w.RegionID.String(); len(r) == 2 {
		// TODO: also filter macro and deprecated.
		tt, _ = tt.SetTypeForKey("rg", strings.ToLower(r)+"zzzz")
	}
}

Temporary workaround

Roll back golang.org/x/text to commit 2120f96286c5897163172b6f5ac2fd0921777714.

cc @LayneChris

@nicksnyder


commented Apr 4, 2018

I was just bitten by this today.

Here is a minimal repro (does not repro on play.golang.org):

package main

import "fmt"
import "golang.org/x/text/language"

func main() {
	m := language.NewMatcher([]language.Tag{language.English})
	tag, i, conf := m.Match(language.AmericanEnglish)
	fmt.Println(tag, i, conf) // en-u-rg-uszzzz 0 Exact
}
@nicksnyder


commented Apr 4, 2018

I notice that the documentation says this:

Note that Tag that is returned by Match and MatchString may differ from any of the supported languages, as it may contain carried over settings from the user tags. This may be inconvenient when your application has some additional locale-specific data for your supported languages. Match and MatchString both return the index of the matched supported tag to simplify associating such data with the matched tag.

It seems like the intended behavior is to save the input array passed to the matcher and then use the index returned from Match.

It is annoying that I can't query the input array from the matcher itself though.

Example workaround

package main

import "fmt"
import "golang.org/x/text/language"

func main() {
	supported := []language.Tag{language.English}
	m := language.NewMatcher(supported)
	_, i, conf := m.Match(language.AmericanEnglish)
	fmt.Println(supported[i], i, conf) // en 0 Exact
}
@dnx2k


commented May 30, 2018

My temporary workaround:

langTag, _, _ := languageMatcher.Match(tags...)
langTagString := langTag.String()[0:2]
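
Note that slicing the first two bytes truncates three-letter language codes such as "fil" and also drops any script or region. If a string-level workaround is wanted anyway, a slightly safer sketch is to remove only the "-u-" extension and keep the rest of the tag. `stripUnicodeExtension` below is a hypothetical helper using only the standard library; it assumes the input is a well-formed BCP 47 tag and does not validate it. Using the index returned by Match, as described in the documentation quoted earlier in this thread, is probably still the more robust route.

```go
package main

import (
	"fmt"
	"strings"
)

// stripUnicodeExtension is a hypothetical helper that removes the
// "-u-..." extension from a BCP 47 tag string and keeps everything else.
// Assumption: the input is well formed; no validation is done.
func stripUnicodeExtension(tag string) string {
	parts := strings.Split(tag, "-")
	kept := make([]string, 0, len(parts))
	skipping := false
	for i, p := range parts {
		// A one-character subtag after the first position starts an
		// extension ("u", "t", ...) or the private-use section ("x").
		if len(p) == 1 && i > 0 {
			if strings.EqualFold(p, "x") {
				// Everything after "x" is private use: keep it as is.
				kept = append(kept, parts[i:]...)
				break
			}
			// Skip subtags only while inside the "u" extension.
			skipping = strings.EqualFold(p, "u")
		}
		if !skipping {
			kept = append(kept, p)
		}
	}
	return strings.Join(kept, "-")
}

func main() {
	fmt.Println(stripUnicodeExtension("en-u-rg-uszzzz"))      // en
	fmt.Println(stripUnicodeExtension("zh-Hant-u-rg-twzzzz")) // zh-Hant
	fmt.Println(stripUnicodeExtension("fil"))                 // fil
}
```
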
stapelberg added a commit to stapelberg/debiman that referenced this issue Aug 9, 2018
Bugfix: use returned index, tags don’t necessarily match
See golang/go#24211 for more details

This hit us for a zh_CN manpage with precisely one "en" option: the returned tag
is en-u-rg-chzzzz, not en.
@chenjie4255


commented Sep 21, 2018

I was a bit surprised when I ran into this problem in my production environment. Is this just a mistake? I really love Go, but this really frustrated me, and it has shaken my confidence in the language.

This should not be a matter of changing existing code or improving my project's unit tests; it never should be.
