New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Umlaute brake when using curl.exe #21456
Comments
When you run an executable in PowerShell there are two ways the executable can output text
The first example is what happens when you execute something in PowerShell without capturing or redirecting the output in any way, for example PS C:\> curl.exe https://slftool.github.io/data.json The second example is what happens when the process' output is redirect by PowerShell as the data is going to be captured by PowerShell. This can occur in any of the cases like setting it to a var, pipelining data, redirection, or in your case grouping, for example PS C:\> $out = curl.exe https://slftool.github.io/data.json
PS C:\> curl.exe https://slftool.github.io/data.json | Out-String
PS C:\> (curl.exe https://slftool.github.io/data.json)
# This example no longer applies in pwsh 7.4+ but older ones do
PS C:\> curl.exe https://slftool.github.io/data.json > C:\test.txt This is important because the second scenario will have PowerShell convert the raw bytes of the process' stdout pipe using the encoding of What you need to do is ensure that [Console]::OutputEncoding = [System.Text.UTF8Encoding]::new()
curl.exe ... | Out-String If the encoding PowerShell uses does not match what curl is encoding its output as you'll get the incorrect characters back. Using your example of It's also good practice you set the $OutputEncoding = [Console]::OutputEncoding = [Console]::InputEncoding = [System.Text.UTF8Encoding]::new() |
@jborean93's helpful comment was posted while I was in the middle of composing mine, but perhaps the following provides a slightly different angle / supplemental information that may be helpful: In short:
Note that there is a way to avoid the need for the above:
|
The standard encoding for JSON files is Unicode, typically UTF8. https://www.json.org/json-en.html
When curl is writing data to stdout it should be writing verbatim binary content and not doing any character translation. |
Thank you all. |
📣 Hey @kort3x, how did we do? We would love to hear your feedback with the link below! 🗣️ 🔗 https://aka.ms/PSRepoFeedback |
[Console]::OutputEncoding = [Console]::InputEncoding = [System.Text.UTF8Encoding]::new() helps with utf8 and external binaries |
Prerequisites
Steps to reproduce
I did this
curl https://slftool.github.io/data.json
then
(curl https://slftool.github.io/data.json)
Expected behavior
For both commands Umlaute should work like in this line from the output of the first command
"stadt": ["Zweibrücken (Deutschland)", "Zwiesel (Deutschland)", "Zwickau (Deutschland)", "Zürich (Schweiz)", "Zabol (Iran)", "Zagreb (Kroatien)"]
Actual behavior
Umlaute are broken in the output of the second command, the one with parentheses
"stadt": ["Zweibr├╝cken (Deutschland)", "Zwiesel (Deutschland)", "Zwickau (Deutschland)", "Z├╝rich (Schweiz)", "Zabol (Iran)", "Zagreb (Kroatien)"],
As soon as you touch the output of curl they brake.
Environment data
Visuals
No response
The text was updated successfully, but these errors were encountered: