Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tool: support UTF-16 command line on Windows #3784

Conversation

@MarcelRaad
Copy link
Member

@MarcelRaad MarcelRaad commented Apr 15, 2019

This makes it possible to pass parameters with characters outside of
the current locale on Windows, which is required for some tests, e.g.
the IDN tests.

Out of the box, this currently only works with the Visual Studio
project files, which default to Unicode. I'll look into building
Unicode versions with the other build tools later.

Ref: #3747

@jay
Copy link
Member

@jay jay commented Apr 16, 2019

Refer to #637 and #345 particularly this post. The problem is basically the Microsoft C runtime does not support UTF-8 locale. Lets say you have an outfile name extracted from the UTF-8 encoded URL or a separate UTF-8 encoded argument then you can't pass that to fopen. We're left with a problem where we have to somehow keep track of the encoding of strings because we can't set the locale to UTF-8.

Anyway I think this is possible to support I just feel like it's going down a rabbit hole.

@MarcelRaad
Copy link
Member Author

@MarcelRaad MarcelRaad commented Apr 16, 2019

Thanks @jay , I hadn't thought about that! Reading these issues now. I wish there were char8_t in C.

@bagder
Copy link
Member

@bagder bagder commented Oct 12, 2019

hi @MarcelRaad, is this something you still intend to finalize?

@MarcelRaad
Copy link
Member Author

@MarcelRaad MarcelRaad commented Oct 13, 2019

Sorry for the long delay - yes, I plan to resume work on this when I have a couple of days, hopefully soon.

MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request Jan 2, 2020
This will also be needed in the tool and tests.

Ref: curl#3758 (comment)
Closes curl#3784
MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request Jan 2, 2020
- use wmain instead of main when _UNICODE is defined [0]
- define argv_item_t as wchar_t * in this case
- use the curl_multibyte gear to convert the command-line arguments to
  UTF-8

This makes it possible to pass parameters with characters outside of
the current locale on Windows, which is required for some tests, e.g.
the IDN tests. Out of the box, this currently only works with the
Visual Studio project files, which default to Unicode.

[0] https://devblogs.microsoft.com/oldnewthing/?p=40643

Ref: curl#3747
Closes curl#3784
@MarcelRaad MarcelRaad force-pushed the MarcelRaad:windows_unicode_command_line_arguments branch from fea657a to 3b458c9 Jan 2, 2020
MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request Jan 2, 2020
Use them only if `_UNICODE` is defined, in which case command-line
arguments have been converted to UTF-8.

Closes curl#3784
MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request Jan 2, 2020
It only applies to non-Unicode builds now.
Also merge 5.10 into it as it's effectively a duplicate.

Closes curl#3784
MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request Jan 2, 2020
Closes curl#3784
MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request Jan 3, 2020
Use them only if `_UNICODE` is defined, in which case command-line
arguments have been converted to UTF-8.

Closes curl#3784
MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request Jan 3, 2020
It only applies to non-Unicode builds now.
Also merge 5.10 into it as it's effectively a duplicate.

Closes curl#3784
MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request Jan 3, 2020
Closes curl#3784
@MarcelRaad MarcelRaad force-pushed the MarcelRaad:windows_unicode_command_line_arguments branch from 8577e97 to cd29484 Jan 3, 2020
MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request Jan 3, 2020
This will also be needed in the tool and tests.

Ref: curl#3758 (comment)
Closes curl#3784
MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request Jan 3, 2020
- use wmain instead of main when _UNICODE is defined [0]
- define argv_item_t as wchar_t * in this case
- use the curl_multibyte gear to convert the command-line arguments to
  UTF-8

This makes it possible to pass parameters with characters outside of
the current locale on Windows, which is required for some tests, e.g.
the IDN tests. Out of the box, this currently only works with the
Visual Studio project files, which default to Unicode.

[0] https://devblogs.microsoft.com/oldnewthing/?p=40643

Ref: curl#3747
Closes curl#3784
MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request Jan 3, 2020
Use them only if `_UNICODE` is defined, in which case command-line
arguments have been converted to UTF-8.

Closes curl#3784
MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request Jan 3, 2020
It only applies to non-Unicode builds now.
Also merge 5.10 into it as it's effectively a duplicate.

Closes curl#3784
MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request Jan 3, 2020
Closes curl#3784
MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request Jan 3, 2020
These are cheap as they don't build tests.

Closes curl#3784
@MarcelRaad MarcelRaad force-pushed the MarcelRaad:windows_unicode_command_line_arguments branch from cd29484 to 26651df Jan 3, 2020
MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request Jan 3, 2020
This will also be needed in the tool and tests.

Ref: curl#3758 (comment)
Closes curl#3784
MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request Jan 3, 2020
- use wmain instead of main when _UNICODE is defined [0]
- define argv_item_t as wchar_t * in this case
- use the curl_multibyte gear to convert the command-line arguments to
  UTF-8

This makes it possible to pass parameters with characters outside of
the current locale on Windows, which is required for some tests, e.g.
the IDN tests. Out of the box, this currently only works with the
Visual Studio project files, which default to Unicode.

[0] https://devblogs.microsoft.com/oldnewthing/?p=40643

Ref: curl#3747
Closes curl#3784
MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request Jan 3, 2020
Use them only if `_UNICODE` is defined, in which case command-line
arguments have been converted to UTF-8.

Closes curl#3784
MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request Jan 3, 2020
It only applies to non-Unicode builds now.
Also merge 5.10 into it as it's effectively a duplicate.

Closes curl#3784
#if defined(USE_WIN32_LARGE_FILES) || defined(USE_WIN32_SMALL_FILES)

FILE *curlx_win32_fopen(const char *filename, const char *mode)
{

This comment has been minimized.

@jay

jay Apr 25, 2020
Member

this is another instance where we are passing something opened from libcurl possibly to an application using a different CRT

@jay
Copy link
Member

@jay jay commented Apr 25, 2020

Aside from the curlx_ issues I think should be carefully examined (I just posted some comments; admittedly I don't understand curlx that well), I don't have an objection if you are going to follow up with maintenance on this. I just don't think it's going to be this easy, but let's find out.

@MarcelRaad
Copy link
Member Author

@MarcelRaad MarcelRaad commented Apr 25, 2020

@mback2k

I would like to see the CI build failures fixed before a merge

Sure! They're new after today's rebase. Admittedly, I haven't compiled the branch locally between rebasing and pushing.

MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request Apr 26, 2020
This will also be needed in the tool and tests.

Ref: curl#3758 (comment)
Closes curl#3784
MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request Apr 26, 2020
- use `wmain` instead of `main` when `_UNICODE` is defined [0]
- define `argv_item_t` as `wchar_t *` in this case
- use the curl_multibyte gear to convert the command-line arguments to
  UTF-8

This makes it possible to pass parameters with characters outside of
the current locale on Windows, which is required for some tests, e.g.
the IDN tests. Out of the box, this currently only works with the
Visual Studio project files, which default to Unicode, and winbuild
with the `ENABLE_UNICODE` option.

[0] https://devblogs.microsoft.com/oldnewthing/?p=40643

Ref: curl#3747
Closes curl#3784
MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request Apr 26, 2020
Use them only if `_UNICODE` is defined, in which case command-line
arguments have been converted to UTF-8.

Closes curl#3784
MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request Apr 26, 2020
It only applies to non-Unicode builds now.
Also merge 5.10 into it as it's effectively a duplicate.

Closes curl#3784
@MarcelRaad MarcelRaad force-pushed the MarcelRaad:windows_unicode_command_line_arguments branch from d337aeb to 1577c33 Apr 26, 2020
@MarcelRaad MarcelRaad force-pushed the MarcelRaad:windows_unicode_command_line_arguments branch from 1577c33 to cacc3c4 May 9, 2020
MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request May 9, 2020
This will also be needed in the tool and tests.

Ref: curl#3758 (comment)
Closes curl#3784
MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request May 9, 2020
- use `wmain` instead of `main` when `_UNICODE` is defined [0]
- define `argv_item_t` as `wchar_t *` in this case
- use the curl_multibyte gear to convert the command-line arguments to
  UTF-8

This makes it possible to pass parameters with characters outside of
the current locale on Windows, which is required for some tests, e.g.
the IDN tests. Out of the box, this currently only works with the
Visual Studio project files, which default to Unicode, and winbuild
with the `ENABLE_UNICODE` option.

[0] https://devblogs.microsoft.com/oldnewthing/?p=40643

Ref: curl#3747
Closes curl#3784
MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request May 9, 2020
Use them only if `_UNICODE` is defined, in which case command-line
arguments have been converted to UTF-8.

Closes curl#3784
MarcelRaad added a commit to MarcelRaad/curl that referenced this pull request May 9, 2020
It only applies to non-Unicode builds now.
Also merge 5.10 into it as it's effectively a duplicate.

Closes curl#3784
@MarcelRaad
Copy link
Member Author

@MarcelRaad MarcelRaad commented May 9, 2020

Any idea how I can see what went wrong with the "CI / macos normal (pull_request)" job? All I see there is a big white cross and "This check failed" without any visible link or button.

Also, it seems like 148534d broke the AppVeyor tests when both OpenSSL and WinSSL are off. 36 tests are failing with "curl: (48) An unknown option was passed in to libcurl".

@mback2k
Copy link
Member

@mback2k mback2k commented May 10, 2020

Any idea how I can see what went wrong with the "CI / macos normal (pull_request)" job? All I see there is a big white cross and "This check failed" without any visible link or button.

This happens randomly with macOS builds on GitHub Actions (similar to the random failures on travis-ci before), because all CIs use MacStadium for hosting their macOS runners. If there is an issue with the runner it will show up as a failed CI run, but GitHub Action will retry running it. This is why you can see a 2nd result of "CI / macos normal (pull_request)" showing up as successful.

MarcelRaad added 4 commits Apr 13, 2019
This will also be needed in the tool and tests.

Ref: #3758 (comment)
Closes #3784
- use `wmain` instead of `main` when `_UNICODE` is defined [0]
- define `argv_item_t` as `wchar_t *` in this case
- use the curl_multibyte gear to convert the command-line arguments to
  UTF-8

This makes it possible to pass parameters with characters outside of
the current locale on Windows, which is required for some tests, e.g.
the IDN tests. Out of the box, this currently only works with the
Visual Studio project files, which default to Unicode, and winbuild
with the `ENABLE_UNICODE` option.

[0] https://devblogs.microsoft.com/oldnewthing/?p=40643

Ref: #3747
Closes #3784
Use them only if `_UNICODE` is defined, in which case command-line
arguments have been converted to UTF-8.

Closes #3784
It only applies to non-Unicode builds now.
Also merge 5.10 into it as it's effectively a duplicate.

Closes #3784
@MarcelRaad MarcelRaad force-pushed the MarcelRaad:windows_unicode_command_line_arguments branch from cacc3c4 to 045d10c May 12, 2020
MarcelRaad added a commit that referenced this pull request May 14, 2020
- use `wmain` instead of `main` when `_UNICODE` is defined [0]
- define `argv_item_t` as `wchar_t *` in this case
- use the curl_multibyte gear to convert the command-line arguments to
  UTF-8

This makes it possible to pass parameters with characters outside of
the current locale on Windows, which is required for some tests, e.g.
the IDN tests. Out of the box, this currently only works with the
Visual Studio project files, which default to Unicode, and winbuild
with the `ENABLE_UNICODE` option.

[0] https://devblogs.microsoft.com/oldnewthing/?p=40643

Ref: #3747
Closes #3784
MarcelRaad added a commit that referenced this pull request May 14, 2020
Use them only if `_UNICODE` is defined, in which case command-line
arguments have been converted to UTF-8.

Closes #3784
MarcelRaad added a commit that referenced this pull request May 14, 2020
It only applies to non-Unicode builds now.
Also merge 5.10 into it as it's effectively a duplicate.

Closes #3784
@MarcelRaad MarcelRaad deleted the MarcelRaad:windows_unicode_command_line_arguments branch May 14, 2020
mback2k added a commit to mback2k/curl that referenced this pull request May 28, 2020
Since curl_setup.h now makes use of curlx_win32_fopen for Win32
builds with USE_WIN32_LARGE_FILES or USE_WIN32_SMALL_FILES defined,
we need to include the relevant files for tests using fopen,
because the libtest sources are including curl_setup.h

Follow up to curl#3784 (ffdddb4)
mback2k added a commit that referenced this pull request May 30, 2020
Since curl_setup.h now makes use of curlx_win32_fopen for Win32
builds with USE_WIN32_LARGE_FILES or USE_WIN32_SMALL_FILES defined,
we need to include the relevant files for tests using fopen,
because the libtest sources are also including curl_setup.h

Reviewed-by: Marcel Raad
Reviewed-by: Daniel Stenberg

Follow up to #3784 (ffdddb4)
Closes #5475
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants