Terminal: [console]::InputEncoding and [console]::OutputEncoding are set to the wrong UTF-8 encoding (*with* BOM) #7634
Comments
Well, it is not a PowerShell issue, but a .NET one.
I haven't looked into it, @PetSerAl - do you know where these encodings are assigned?
@mklement0 I will reference the Windows implementation. I am not sure whether it behaves the same on Linux or macOS, since it uses a platform-dependent function on Windows. [Text.Encoding]::GetEncoding(65001).GetPreamble().Length # 3
Though, at a quick glance at the .NET Core implementation, it looks like it is supposed to return BOM-less UTF-8 in this case, so maybe I was wrong and there is a PowerShell-related issue after all.
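For readers unfamiliar with the distinction being discussed, here is a minimal sketch of how the two UTF-8 variants differ in .NET. The variable names are illustrative only, and the snippet says nothing about where the console encodings are actually assigned.

```powershell
# UTF-8 obtained by code page number, as in the expression quoted above
# (reported there to have a 3-byte preamble).
$byCodePage = [Text.Encoding]::GetEncoding(65001)

# UTF-8 constructed explicitly; the Boolean argument controls whether
# the encoding advertises a BOM (preamble).
$noBom   = [Text.UTF8Encoding]::new($false)
$withBom = [Text.UTF8Encoding]::new($true)

# Preamble length in bytes for each encoding object.
$byCodePage.GetPreamble().Length
$noBom.GetPreamble().Length      # 0
$withBom.GetPreamble().Length    # 3

# The BOM bytes themselves, as hex.
($withBom.GetPreamble() | ForEach-Object { $_.ToString('X2') }) -join ' '   # EF BB BF
```

A non-empty result from GetPreamble() is what this thread means by "with BOM".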
Thanks, @PetSerAl. Indeed, given that
It is here. So on Unix platforms CoreFX looks at the environment variables "LC_ALL", "LC_MESSAGES", "LANG".
Thanks for digging deeper, @iSazonov. In the absence of said environment variables the behavior is correct, but it is broken if these environment variables are present and specify UTF-8 - and these environment variables are virtually never absent, because they are part of the current locale (culture) setting, and these days they indeed typically specify UTF-8. I've filed a CoreFx bug - see https://github.com/dotnet/corefx/issues/32004
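To see which of those variables would be consulted on a given machine, something like the following rough sketch may help. Get-LocaleCharset is a hypothetical helper, not part of PowerShell or CoreFX. It assumes that the first non-empty variable wins and that the charset is the part between "." and an optional "@", which approximates the lookup described above rather than reproducing the actual CoreFX code.

```powershell
# Hypothetical helper: approximates the locale lookup described above.
function Get-LocaleCharset {
    param(
        # Variables to consult, in priority order (assumed: first one set wins).
        [string[]] $Names = @('LC_ALL', 'LC_MESSAGES', 'LANG')
    )
    foreach ($name in $Names) {
        $value = [Environment]::GetEnvironmentVariable($name)
        if (-not [string]::IsNullOrWhiteSpace($value)) {
            # Locale strings look like language[_territory][.charset][@modifier].
            if ($value -match '\.([^@]+)') { return $Matches[1] }
            return $null   # variable is set, but names no charset
        }
    }
    return $null           # none of the variables is set
}

# Show the charset the current environment specifies, if any.
Get-LocaleCharset
```

If this prints UTF-8 (as it does for a value such as en_US.UTF-8), the session falls into the case described above as broken.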
@mklement0 So this should be resolved as external, correct?
@BrucePay: Indeed, thanks - should've made that clearer.
Note: As of v6.1.0-rc.1, the console on Windows is fundamentally not configured to use UTF-8 yet - see #7233
While [console]::InputEncoding and [console]::OutputEncoding on macOS and Linux are set to UTF-8, the specific encoding variant used is the one with a BOM, which is the wrong one (though I'm unclear on what the practical implications are, given that streams, not files, are typically involved).
This contrasts with the automatic variable $OutputEncoding, which correctly uses the BOM-less UTF-8 encoding.
Steps to reproduce (macOS and Linux)
Expected behavior
Actual behavior
That is, the 3-byte BOM is unexpectedly present in [console]::InputEncoding and [console]::OutputEncoding.
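One way to inspect this in a given session is sketched below. It is not necessarily the original repro code, and the table layout is purely illustrative; it reports the preamble length of each encoding mentioned in this issue.

```powershell
# Collect the three encodings under discussion, keyed by how they are referenced.
$encodings = [ordered]@{
    '[console]::InputEncoding'  = [console]::InputEncoding
    '[console]::OutputEncoding' = [console]::OutputEncoding
    '$OutputEncoding'           = $OutputEncoding
}

# Emit one row per encoding: a preamble length of 3 means the UTF-8 BOM
# is present, 0 means the encoding is BOM-less.
$report = foreach ($name in $encodings.Keys) {
    [pscustomobject]@{
        Source         = $name
        Encoding       = $encodings[$name].WebName
        PreambleLength = $encodings[$name].GetPreamble().Length
    }
}
$report | Format-Table -AutoSize
```

Per the report above, the two [console] rows show a length of 3 on macOS and Linux, while the $OutputEncoding row shows 0.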
Environment data