[Breaking change]: .NET 8 Uses UTF 8 Encoding in Non-EN Languages #34250

Closed
3 tasks done
nagilson opened this issue Feb 23, 2023 · 1 comment · Fixed by #34365
Labels: binary incompatible, breaking-change, 🏁 Release: .NET 8, doc-idea, Pri1, 📌 seQUESTered, source incompatible

Comments

@nagilson
Member

nagilson commented Feb 23, 2023

Description

As part of dotnet/sdk#29755, if DOTNET_CLI_UI_LANGUAGE or VSLANG is set, the console input and output encoding changes to UTF-8, so that the code page changes to UTF-8 as well. This lets characters from the languages selected by those environment variables render correctly.

This applies only on Windows (encoding was already fine on other platforms), and only on Windows 10 and later. The encoding also changes only if the UI culture set by the user is not some form of English.
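To make the conditions above easier to check at a glance, here is a minimal Python sketch of the decision. It is not the SDK's implementation: the function names, parameters, and the way the UI culture is passed in are all assumptions made for illustration.

```python
from typing import Mapping

# Hypothetical names; this only mirrors the conditions listed above.
LANGUAGE_OVERRIDES = ("DOTNET_CLI_UI_LANGUAGE", "VSLANG")


def is_english(ui_culture: str) -> bool:
    """True for "en" and any regional form such as "en-US" or "en-GB"."""
    return ui_culture.lower().split("-")[0] == "en"


def should_switch_to_utf8(env: Mapping[str, str], os_name: str,
                          windows_major: int, ui_culture: str) -> bool:
    """Return True only when every condition from the description holds."""
    # At least one of the two language variables must be set and non-empty.
    override_set = any(env.get(name) for name in LANGUAGE_OVERRIDES)
    return bool(
        override_set
        and os_name == "Windows"
        and windows_major >= 10
        and not is_english(ui_culture)
    )
```

For example, `should_switch_to_utf8({"DOTNET_CLI_UI_LANGUAGE": "ja"}, "Windows", 10, "ja-JP")` is true under these assumptions, while the same call with an empty environment, a non-Windows OS, or an English culture is false.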

Version

.NET 8 Preview 1

Note: We are considering this for 7.0.3xx.

Previous behavior

Characters in certain languages, including but not limited to Chinese, German, Japanese, and Russian, would sometimes display as garbled characters or as ?.
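For readers unfamiliar with why a code page mismatch produces `?` or garbled text, the following Python sketch shows the general mechanism. It does not reproduce the SDK's exact failure; the choice of code page 437 as the legacy page and the sample strings are assumptions for illustration.

```python
text_ja = "こんにちは"
text_de = "Größe"

# Case 1: the target code page has no mapping for these characters,
# so each one is replaced with "?" when encoding.
as_question_marks = text_ja.encode("cp437", errors="replace").decode("cp437")
# as_question_marks == "?????"

# Case 2: UTF-8 bytes are interpreted with a different code page,
# which yields garbled characters (mojibake) instead of the original text.
garbled = text_de.encode("utf-8").decode("cp437")

# Case 3: writer and reader agree on UTF-8, so the text round-trips intact.
round_trip = text_ja.encode("utf-8").decode("utf-8")
```

Case 3 is the situation the change aims for: both the console encoding and the code page are UTF-8.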

New behavior

Characters render correctly. Both the encoding and the code page change to UTF-8.

How this is breaking:
Versions of Windows 10 older than the November 2019 update may not fully support UTF-8. These versions may experience issues.

In addition, there is an existing bug (dotnet/sdk#30170) where the SDK leaves its encoding behind, which can affect the encoding of other commands or programs called in the same command prompt after the SDK has finished executing. Now that the SDK changes the encoding more frequently, the impact of this bug may have increased.

Finally, some legacy consoles may not support UTF-8.

Type of breaking change

  • Binary incompatible: Existing binaries may encounter a breaking change in behavior, such as failure to load or execute, and if so, require recompilation.
  • Source incompatible: When recompiled using the new SDK or component or to target the new runtime, existing source code may require source changes to compile successfully.
  • Behavioral change: Existing binaries may behave differently at run time.

Reason for change

Using the SDK CLI in other languages provided a poor experience before this change.
Example:
before:
image

after:
image

Recommended action

For those on an older version of Windows 10, upgrade to the November 2019 update or later.

Those who want to use a legacy console, or who are facing build or other issues due to the encoding change, should unset VSLANG and/or DOTNET_CLI_UI_LANGUAGE to disable this change. We expect minimal impact, because this language setting wouldn't have worked well in the first place due to the garbled characters. Anyone who wasn't already using these variables is unaffected, and only users on Windows 10+ are affected, most of whom we think are on the November 2019 update or later. The legacy scenarios are also less likely to support the previously broken languages, so it's unlikely a user would choose another language and trigger this break anyway.
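If the variables can't be removed globally, one way to opt out for a single invocation is to launch the command with a copy of the environment that omits them. The Python sketch below assumes a wrapper script is acceptable; the helper name is hypothetical, and the `dotnet build` call is shown only as a commented-out example.

```python
import os
from typing import Dict, Mapping

LANGUAGE_OVERRIDES = {"DOTNET_CLI_UI_LANGUAGE", "VSLANG"}


def env_without_language_overrides(env: Mapping[str, str]) -> Dict[str, str]:
    """Copy env, dropping the two language variables.

    Names are compared case-insensitively, since environment variable
    names are case-insensitive on Windows.
    """
    return {k: v for k, v in env.items() if k.upper() not in LANGUAGE_OVERRIDES}


clean_env = env_without_language_overrides(os.environ)

# Example use (not executed here): run a build with the cleaned environment.
# import subprocess
# subprocess.run(["dotnet", "build"], env=clean_env, check=True)
```

This leaves the parent shell's environment untouched, so other tools that rely on VSLANG keep working.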

Feature area

SDK

Affected APIs

No response


Associated WorkItem - 67117

@nagilson
Member Author

Breaking change also introduced in 7.0.3xx now. Thanks!

JaynieBai pushed a commit to dotnet/msbuild that referenced this issue May 12, 2023
…8503)

Fixes #1596

Changes Made
SetConsoleUI now calls a helper that sets the encoding to support non-English languages and checks whether an environment variable specifies a language to switch to.

Testing
Setting DOTNET_CLI_UI_LANGUAGE=ja now changes msbuild correctly:
image

Doing a complicated, multithreaded build (building MSBuild itself) shows that other threads appear to use the same UI culture:

image

See that chcp remains the same after execution:
image

(The code page was set to 65001 temporarily, then restored to the page that was active before execution.)
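The save-and-restore behavior described here can be sketched generically as follows. This is not the MSBuild code: `FakeConsole` is a hypothetical stand-in for the real console, and the starting code page 437 is an assumption. Only the pattern (remember the original page, switch to 65001, restore on exit) is the point.

```python
from contextlib import contextmanager


class FakeConsole:
    """Hypothetical stand-in for a console that has a current code page."""

    def __init__(self, code_page: int) -> None:
        self.code_page = code_page


@contextmanager
def temporary_code_page(console: FakeConsole, code_page: int = 65001):
    """Switch the code page for the duration of the block, then restore it."""
    original = console.code_page
    console.code_page = code_page
    try:
        yield console
    finally:
        # Runs even if the body raises, so the caller's code page
        # is not left behind for later commands.
        console.code_page = original


console = FakeConsole(code_page=437)
with temporary_code_page(console):
    inside = console.code_page  # 65001 while the block runs
after = console.code_page       # back to 437 afterwards
```

Restoring in a `finally` block is what avoids the "encoding left behind" problem if the work inside fails partway through.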

Notes
Much of this code is a port of this code: dotnet/sdk#29755
There are some details about the code here.

[!] In addition, it will introduce a breaking change for msbuild just like the SDK.
The break is documented here for the sdk: dotnet/docs#34250