Fix culture creation with undetermined lang tag #115166

Merged

Conversation

tarekgh
Member

@tarekgh tarekgh commented Apr 29, 2025

Fixes #98543

This change enables support for creating CultureInfo objects using an undetermined language tag like und-US.
The root issue is that ICU's uloc_getName returns malformed names such as _US when given und-US. This fix ensures the removed language subtag und is preserved and restored in the result.
A more comprehensive alternative—switching to uloc_toLanguageTag—was considered, but it may introduce compatibility and performance concerns. It would also require a broader audit of how we normalize culture names around ICU usage. We can revisit that approach if more issues like this one are discovered.
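The restore step described above can be sketched roughly as follows. This is a minimal illustration, not the code from CultureData.Icu.cs: the class name `UndTagSketch`, the helper `RestoreUndetermined`, and the choice to rewrite the leading separator as `-` are all assumptions made for the example.

```csharp
using System;

public static class UndTagSketch
{
    private const string Und = "und";

    // Hypothetical helper (not the actual runtime code).
    // 'original' is the name the caller passed in; 'normalized' is the
    // name produced by the ICU-based normalization, e.g. "_US".
    public static string RestoreUndetermined(string original, string normalized)
    {
        // The input counts as undetermined only if "und" is the whole
        // first subtag, i.e. it is followed by a separator or the end.
        bool originalIsUnd =
            original.StartsWith(Und, StringComparison.OrdinalIgnoreCase) &&
            (original.Length == Und.Length ||
             original[Und.Length] == '-' ||
             original[Und.Length] == '_');

        // A result that starts with a separator has lost its language subtag.
        bool languageDropped =
            normalized.Length > 0 &&
            (normalized[0] == '-' || normalized[0] == '_');

        if (originalIsUnd && languageDropped)
        {
            // Assumption: emit "und-" and keep the rest of the name unchanged.
            return Und + "-" + normalized.Substring(1);
        }

        return normalized;
    }
}
```

With these assumptions, `RestoreUndetermined("und-us", "_US")` yields `und-US`, while a name that kept its language subtag, such as `en_US`, passes through untouched.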

@Copilot Copilot AI review requested due to automatic review settings April 29, 2025 21:48
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR fixes the behavior of creating CultureInfo objects from undetermined language tags like "und-US" by restoring the removed "und" subtag.

  • Added new tests in CultureInfoCtor.cs to validate the undetermined language tag behavior.
  • Updated CultureData.Icu.cs to include the original input name in NormalizeCultureName and to restore "und" as necessary.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

Files changed:
  • src/libraries/System.Runtime/tests/System.Globalization.Tests/CultureInfo/CultureInfoCtor.cs: Added tests for verifying normalization of undetermined language tags.
  • src/libraries/System.Private.CoreLib/src/System/Globalization/CultureData.Icu.cs: Modified NormalizeCultureName to accept the original name and restore the "und" prefix when needed.
Comments suppressed due to low confidence (1)

src/libraries/System.Private.CoreLib/src/System/Globalization/CultureData.Icu.cs:50

  • Ensure that the variable 'changed' is properly declared and in scope. If it is not declared in this method or a broader scope, this line will cause a compilation error.
changed = true;

@tarekgh
Member Author

tarekgh commented Apr 29, 2025

@mcdurdin would you be interested to test the fix when we merge it?

@tarekgh tarekgh added this to the 10.0.0 milestone Apr 29, 2025
@tarekgh
Member Author

tarekgh commented Apr 29, 2025

CC @xadxura

@mcdurdin

@mcdurdin would you be interested to test the fix when we merge it?

Yes, sure, that would be great, thanks.

[InlineData("und-us", "und-US", "und-US")]
[InlineData("und-us_tradnl", "und-US", "und-US_tradnl")]
[InlineData("und-es-u-co-phoneb", "und-ES", "und-ES_phoneb")]
[InlineData("und-es-t-something", "und-ES", "und-ES")]

Is it possible to add a test for und-fonipa? That's a common tag for IPA, referenced on Wikipedia for example: https://en.wikipedia.org/wiki/International_Phonetic_Alphabet#IETF_language_tags

Member Author

Unfortunately, Windows currently doesn't allow using this name. You can try this code:

            // Requires: using System; using System.Runtime.InteropServices;
            const uint LOCALE_SNAME = 0x0000005C; // LCType for the locale name
            char[] buffer = new char[256];
            int res = GetLocaleInfoEx("und-fonipa", LOCALE_SNAME, buffer, buffer.Length);
            int errorCode = Marshal.GetLastWin32Error(); // read right after the call

            Console.WriteLine($"GetLocaleInfoEx: {res} ... Error Code: {errorCode}");

        // P/Invoke declaration; it belongs in the containing class.
        [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
        internal static extern int GetLocaleInfoEx(string lpLocaleName, uint LCType, char[] buffer, int cchData);

You will see GetLocaleInfoEx: 0 ... Error Code: 87. Error 87 (ERROR_INVALID_PARAMETER) means "The parameter is incorrect."

I would suggest logging an issue for Windows through the Windows Feedback Hub. If Windows fixes that, it will automatically work with .NET.

By the way, if you run on Linux, this should work fine with the current fix here.
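
For checking a tag through the managed API rather than through GetLocaleInfoEx directly, a small probe along these lines may help. It is only a sketch: `CultureProbe` and `TryGetCultureName` are hypothetical names, and the result for any given tag depends on the platform and on the globalization data in use.

```csharp
using System;
using System.Globalization;

public static class CultureProbe
{
    // Hypothetical helper: returns true and the normalized culture name
    // when the runtime accepts the tag, false when it throws
    // CultureNotFoundException.
    public static bool TryGetCultureName(string tag, out string name)
    {
        try
        {
            name = CultureInfo.GetCultureInfo(tag).Name;
            return true;
        }
        catch (CultureNotFoundException)
        {
            name = string.Empty;
            return false;
        }
    }
}
```

Calling `CultureProbe.TryGetCultureName` with tags such as und-US or und-fonipa on Windows and on Linux would show where each one is accepted.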

Member Author

@xadxura may help with this too?

Windows requires the script tag to be declared unless a suppress-script value is defined for that language in BCP-47. The language tag 'und' has no suppress-script defined, so the script must be declared. One could say that und-fonipa implicitly declares the script through the variant tag fonipa; however, we don't allow leftward propagation of script info. This is also why we don't infer Hans from zh-CN. Please use the tag und-Latn-fonipa, which does work with Windows:

image
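
To make the script requirement concrete, here is a rough syntactic check for whether a tag declares a script subtag. It is a sketch under simplifying assumptions: `ScriptSubtagSketch` and `HasScriptSubtag` are hypothetical names, the check looks only at the shape of the tag (a four-letter subtag after the language and any three-letter extended language subtags), and it says nothing about suppress-script values or about what Windows actually accepts.

```csharp
using System;
using System.Linq;

public static class ScriptSubtagSketch
{
    // Hypothetical helper: purely syntactic, assumes a well-formed
    // hyphen-separated BCP 47 tag.
    public static bool HasScriptSubtag(string tag)
    {
        string[] parts = tag.Split('-');
        int i = 1; // parts[0] is the primary language subtag

        // Skip up to three extended language subtags (three letters each).
        int extlangs = 0;
        while (i < parts.Length && extlangs < 3 &&
               parts[i].Length == 3 && parts[i].All(IsAsciiLetter))
        {
            i++;
            extlangs++;
        }

        // A script subtag is exactly four ASCII letters, e.g. "Latn".
        return i < parts.Length &&
               parts[i].Length == 4 &&
               parts[i].All(IsAsciiLetter);
    }

    private static bool IsAsciiLetter(char c) =>
        (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}
```

Under this check, und-fonipa and zh-CN carry no script subtag, while und-Latn-fonipa and zh-Hans-CN do.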

Member Author

@xadxura I am seeing that Windows doesn't allow und-Latn-fonipa when calling GetLocaleInfoEx. Am I missing something? Does und-Latn-fonipa need to be registered on Windows before the tag can be used?

Thanks for the updates @tarekgh and @xadxura. Duh, I knew that Windows needed Latn but just plain forgot when asking here. Sorry for the side-track. But it sounds like und-Latn-fonipa still has an open question.

Yes, custom BCP-47 tags must be registered on the system first.

@tarekgh
Member Author

tarekgh commented Apr 30, 2025

/ba-g unrelated failures

@tarekgh tarekgh merged commit 4eb1693 into dotnet:main Apr 30, 2025
137 of 142 checks passed
Development

Successfully merging this pull request may close these issues.

CultureNotFoundException thrown when a keyboard with language tag und is activated
4 participants