Enable curl to work as expected on Windows #3008

Closed
bitcrazed opened this Issue Sep 17, 2018 · 8 comments

@bitcrazed
Contributor

bitcrazed commented Sep 17, 2018

@bagder and @jay & the curl team:

Hi - I am the PM for Windows Console. Since adding curl to Windows 10 in the Windows 10 Spring 2018 Update, we've seen and received reports of issues with curl not behaving as expected on Windows, e.g. curl issues #731 and #345, and known-issue 5.5.

The Windows Console team and I are in the process of overhauling Windows Console.

For example, we recently added a Pseudo Console (ConPTY) API to Windows for the first time. This API was inspired by Linux' openpty(3) APIs and enables terminal/server applications to create and supply (via a call to CreatePseudoConsole()) the pipes via which they communicate directly with command-line apps (e.g. Cmd, PowerShell, curl.exe, etc.), using UTF-8 encoded text/VT.

We're also in the process of overhauling how the Console's internal buffers store and handle Unicode and UTF-8. In the upcoming Windows 10 Fall 2018 release, Console will be able to store and retrieve Unicode text in codepage 65001 (though it'll still struggle to display emoji, chars outside the currently selected font, etc. - that's a whole other set of issues 😜):

image

To accomplish the above, I hacked a couple of quick changes into my own fork of curl to enable VT mode in the Console when curl starts up.

However, these changes are far from complete:

  1. I am sure you'd want such changes implemented in a different way?!
  2. What I haven't yet figured out is where to add code that changes the codepage to 65001 at startup on Windows 10. Alas, without switching to codepage 65001 for UTF-8 HTTP responses, Console garbles its output:

image

Could you steer me towards the changes you'd recommend to enable curl to light-up on Windows?
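For readers following along, here is a minimal sketch of what "enable VT mode at startup" can look like. It is not the code from the fork mentioned above; the helper name `enable_vt_output` is hypothetical, and the non-Windows branch simply assumes no setup is needed. It uses the documented Win32 calls `GetConsoleMode()` and `SetConsoleMode()` with the `ENABLE_VIRTUAL_TERMINAL_PROCESSING` flag.

```c
#include <stdbool.h>

#ifdef _WIN32
#include <windows.h>
/* Older SDK headers may not define the flag. */
#ifndef ENABLE_VIRTUAL_TERMINAL_PROCESSING
#define ENABLE_VIRTUAL_TERMINAL_PROCESSING 0x0004
#endif
#endif

/* Hypothetical helper (not from curl): try to turn on VT sequence
   processing for stdout. Returns true if VT mode is active afterwards. */
static bool enable_vt_output(void)
{
#ifdef _WIN32
  HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
  DWORD mode = 0;

  if(out == INVALID_HANDLE_VALUE || out == NULL)
    return false;
  /* GetConsoleMode fails when stdout is redirected to a file or pipe. */
  if(!GetConsoleMode(out, &mode))
    return false;
  if(mode & ENABLE_VIRTUAL_TERMINAL_PROCESSING)
    return true;
  /* Consoles without VT support reject the flag; the mode stays as it was. */
  return SetConsoleMode(out, mode | ENABLE_VIRTUAL_TERMINAL_PROCESSING) != 0;
#else
  /* Assumption for this sketch: other platforms need no setup. */
  return true;
#endif
}
```

A caller would invoke `enable_vt_output()` once early in `main()` and fall back to plain output when it returns false.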

@bagder

Member

bagder commented Sep 17, 2018

Hey @bitcrazed, this looks awesome.

  1. I think that looks like a great start. If you make a PR out of that I'm sure we'll nitpick some details but the general approach should be fine.

  2. Can't it be done in those functions or in functions adjacent to configure_terminal from (1) ? (I feel I'm missing some finer point.)

@bitcrazed

Contributor

bitcrazed commented Sep 18, 2018

Thanks @badger. Great to hear - I'll submit a PR in a moment and let the fun begin :)

"Can't it be done in those functions or in functions adjacent to configure_terminal"

Good question - we could just set the CP to 65001 at startup, but I was originally thinking to try to set the CP to more closely match the CP expressed by an HTML doc, as per W3C guidelines on encoding declaration, by looking for a <meta charset="utf-8" /> declaration in the first 2K or so of any response.

Is this overkill?
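To make the idea concrete, here is a rough sketch of sniffing a charset declaration from the first 2K of a response body. The helper names (`sniff_meta_charset`, `charset_to_codepage`) and the 2048-byte limit are assumptions for illustration, not curl code. It deliberately ignores many real-world cases (HTTP `Content-Type` headers, BOMs, `>` inside attribute values), so treat it as an illustration of the approach rather than a complete detector.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define SNIFF_LIMIT 2048 /* "the first 2K or so"; exact value is an assumption */

/* Case-insensitive comparison of n bytes. */
static int ci_match(const char *buf, const char *word, size_t n)
{
  size_t i;
  for(i = 0; i < n; i++) {
    if(tolower((unsigned char)buf[i]) != tolower((unsigned char)word[i]))
      return 0;
  }
  return 1;
}

/* Hypothetical helper: look for charset=... inside a <meta ...> tag within
   the first SNIFF_LIMIT bytes. On success, stores the lowercased charset
   name in out and returns 1; otherwise returns 0. */
static int sniff_meta_charset(const char *body, size_t len,
                              char *out, size_t outlen)
{
  static const char key[] = "charset=";
  const size_t keylen = sizeof(key) - 1;
  size_t limit = len < SNIFF_LIMIT ? len : SNIFF_LIMIT;
  size_t i, n = 0;
  int in_meta = 0;

  if(outlen == 0)
    return 0;

  for(i = 0; i < limit; i++) {
    if(body[i] == '<')
      in_meta = (limit - i >= 5) && ci_match(body + i, "<meta", 5);
    else if(body[i] == '>')
      in_meta = 0;
    else if(in_meta && limit - i >= keylen &&
            ci_match(body + i, key, keylen)) {
      i += keylen;
      /* Skip an optional opening quote. */
      while(i < limit && (body[i] == '"' || body[i] == '\''))
        i++;
      /* Copy the charset token, lowercased. */
      while(i < limit && n + 1 < outlen &&
            (isalnum((unsigned char)body[i]) ||
             body[i] == '-' || body[i] == '_')) {
        out[n++] = (char)tolower((unsigned char)body[i]);
        i++;
      }
      out[n] = '\0';
      return n > 0;
    }
  }
  return 0;
}

/* Hypothetical mapping: only UTF-8 (codepage 65001) is handled here;
   0 means "unknown, leave the console codepage alone". */
static unsigned charset_to_codepage(const char *charset)
{
  return strcmp(charset, "utf-8") == 0 ? 65001u : 0u;
}
```

Even in this small form the sketch shows why the approach gets complicated quickly: every additional encoding, declaration style, and malformed document needs its own handling.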

@bagder

Member

bagder commented Sep 18, 2018

we could just set the CP to 65001 at startup, but I was originally thinking to try to set the CP to more closely match the CP expressed by an HTML doc

Ah, now I get you, thanks. Okay, I see what you're saying but that's an approach we don't do on any system so it would be a first -- and quite frankly that's a route full of traps and sorry faces. I'm not convinced that road actually leads to happiness without an awful lot of work and trial and error.

So, perhaps not "overkill" but quite complicated. But of course all other approaches will mean more shortcuts and guessing with less actual input to base the guesses on...

@bagder

Member

bagder commented Sep 18, 2018

(oh btw, my nick name is a dyslexic animal, not the correctly spelled version!)

@bitcrazed

Contributor

bitcrazed commented Sep 18, 2018

OMG! Sorry @ba-gd-er - Hadn't even noticed that - I think I need new glasses!

@bitcrazed

Contributor

bitcrazed commented Sep 27, 2018

Okay, I think we're close.

Here's what the version of curl currently shipping in Windows 10 looks like when querying http://wttr.in/seattle sans VT and UTF-8 support:
image

And here's what it looks like with VT enabled and the UTF-8 codepage (65001) selected:
image

For comparison, here's what curl for Linux looks like built from the same sources and running in Ubuntu on WSL:
image

If the user is running an older version of Windows that doesn't support VT then they'll see something very similar to the first screenshot above, which is no worse than what they'll see today anyhow.

@bagder bagder closed this in becfe12 Sep 29, 2018

@mattn

Contributor

mattn commented Nov 1, 2018

Please revert this change. Please Please Please!
This is a very ugly change. Changing the console codepage affects the console font. Also, you have probably noticed that curl doesn't restore the original font.

shoei

@bitcrazed This is NOT the right way to show Unicode. You should use the wide string APIs.
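As an illustration of the "doesn't restore" point, here is a minimal sketch of switching the console output codepage to 65001 while remembering the previous one and putting it back at exit. The helper names are hypothetical and this is not the change that was merged. Whether restoring the codepage also restores the font in every console configuration is an assumption this sketch does not verify, and it does not cover the wide string API route suggested above.

```c
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>

static UINT saved_output_cp; /* 0 means there is nothing to restore */

/* Registered with atexit(): put the original output codepage back. */
static void restore_output_cp(void)
{
  if(saved_output_cp)
    SetConsoleOutputCP(saved_output_cp);
}
#endif

/* Hypothetical helper: select the UTF-8 output codepage (65001) and
   arrange for the previous one to be restored on normal exit.
   Returns 1 if UTF-8 output is in effect afterwards, 0 otherwise. */
static int use_utf8_output_cp(void)
{
#ifdef _WIN32
  UINT current = GetConsoleOutputCP(); /* returns 0 on failure */

  if(current == CP_UTF8)
    return 1;
  if(!current || !SetConsoleOutputCP(CP_UTF8))
    return 0;
  if(!saved_output_cp) {
    saved_output_cp = current;
    atexit(restore_output_cp);
  }
  return 1;
#else
  /* Assumption for this sketch: nothing to do on other platforms. */
  return 1;
#endif
}
```

Note that an `atexit()` handler only runs on normal termination, so a process that is killed abruptly would still leave the console in codepage 65001.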

@bagder

Member

bagder commented Nov 1, 2018

I turned that comment into a proper issue. Take follow-ups there.
