curl on Windows incorrectly handle IDN-urls #731
Comments
Seems to be related to #637, which is closed, so some issues remain then? I've never seen a
WinIDN conversion in libcurl is still not perfect, but it's not directly related.
See #654 (comment). I have a MinGW64-built curl and it does not crash.
@jay "Doctor, I have a pain in my leg!"
The problem is poor support for charset conversion and charset detection in libidn.
I still can't get it to crash; a Cygwin-built curl also works for me.
I'd suggest using UTF-8 in
@jay Try
I tried with LC_ALL set to C, but it can't convert the domain name, as expected. If there's a buffer overrun, I'm not seeing it.
libidn's tld_check_lz returns an error offset of the first character that it failed to process. However, that offset is not a byte offset and may not even be in the locale encoding, so we can't use it to show the user the character that failed to process.
Bug: #731
Reported-by: Karlson2k
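To illustrate why a character offset cannot be used directly as a byte offset, here is a minimal sketch that maps a code point index to a byte offset in a UTF-8 string. The helper utf8_cp_to_byte_offset is hypothetical (it is not part of libidn or curl) and it assumes the input is valid UTF-8, which, as noted above, is not guaranteed for a string in the locale encoding.

```c
#include <stddef.h>

/* Hypothetical helper, not part of libidn or curl.
   Returns the byte offset of the code point at index cp_index in the
   NUL-terminated string s, or (size_t)-1 if s has fewer code points.
   Assumes s is valid UTF-8. */
static size_t utf8_cp_to_byte_offset(const char *s, size_t cp_index)
{
  size_t i = 0;  /* current byte offset */
  size_t cp = 0; /* code points seen so far */

  while(s[i]) {
    /* any byte that is not a continuation byte (10xxxxxx) starts a
       new code point */
    if(((unsigned char)s[i] & 0xC0) != 0x80) {
      if(cp == cp_index)
        return i;
      cp++;
    }
    i++;
  }
  return (size_t)-1;
}
```

For a host such as я.рф encoded as UTF-8, each Cyrillic letter takes two bytes, so the character index and the byte offset already differ from the second character on.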
You are correct about the bad err_pos. Fixed in 3d144ab.
So the crash is removed by this?
@bagder No, the crash is caused by libidn internals.
Crash fixed by Msys2 local patch. https://github.com/Alexpux/MINGW-packages/pull/1276
Glad you were able to find and fix the crash. The remaining issue, that curl on Windows incorrectly handles IDN URLs, is caused by two problems. First, curl only works with arguments that are encoded in the current ANSI codepage. Except in the case of Cygwin, you can't set the locale to UTF-8 (or you can, but it won't work properly). See #345. Second, and related to the first: if you are using WinIDN (which expects IDN URLs to be UTF-8 encoded), curl cannot retrieve UTF-8 encoded arguments. Or it can, but that can cause other problems down the line in libcurl, since there's no way to set the locale to UTF-8. Since we don't have time to work on this right now, it's been added to KNOWN_BUGS. 9f740d3
I did this
curl -v http://яндекс.рф
Results:
MinGW64-compiled:
Cygwin-compiled:
VS14-compiled:
I expected the following
Normal http connection to site.
curl/libcurl version
MinGW64-compiled:
Cygwin-compiled:
VS14-compiled:
operating system
Windows 7 SP1 x64
MinGW, Cygwin, VS and components - latest version.
Checked: libidn support for the current locale encoding is broken. It detects an incorrect encoding from the environment variable CHARSET and from nl_langinfo(CODESET) (which need to be fixed for Win32).
Instead of fixing libidn, I suggest revamping encoding handling in libcurl and the curl tool. Using the "locale encoding" may mean something only for the curl tool; for libcurl it's useless and unsafe, as other threads may change it on the fly.
I suggest documenting that libcurl always assumes UTF-8 encoding in URLs (which changes nothing for most POSIX systems), converting the encoding in the curl tool, and using idna_to_ascii_8z() in libcurl.
The WinIDN conversion curl_win32_idn_to_ascii() already assumes UTF-8 input, but currently that's incorrect, as the curl tool doesn't convert anything.
For the curl tool I'd use the highly portable mbstowcs() with a fast custom wchar_t -> UTF-8 conversion.
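As a rough sketch of that last suggestion, the code below pairs mbstowcs() with a hand-written wchar_t -> UTF-8 encoder. The names utf8_encode, wcs_to_utf8 and locale_to_utf8 are hypothetical and not part of curl or libcurl. It assumes that mbstowcs() called with a NULL destination returns the required length (as POSIX and MSVC document), that the wide characters hold Unicode values, and that a 16-bit wchar_t carries UTF-16, so surrogate pairs are combined.

```c
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/* Encode one code point into out (room for 4 bytes). Returns the number
   of bytes written, or 0 if cp is not a valid Unicode scalar value. */
static size_t utf8_encode(unsigned long cp, unsigned char *out)
{
  if(cp < 0x80) {
    out[0] = (unsigned char)cp;
    return 1;
  }
  if(cp < 0x800) {
    out[0] = (unsigned char)(0xC0 | (cp >> 6));
    out[1] = (unsigned char)(0x80 | (cp & 0x3F));
    return 2;
  }
  if(cp >= 0xD800 && cp <= 0xDFFF)
    return 0; /* lone surrogate */
  if(cp < 0x10000) {
    out[0] = (unsigned char)(0xE0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
  }
  if(cp < 0x110000) {
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
  }
  return 0;
}

/* Convert a wide string to a malloc'ed UTF-8 string (caller frees).
   Returns NULL on allocation failure or invalid input. */
static char *wcs_to_utf8(const wchar_t *ws)
{
  size_t len = wcslen(ws);
  size_t i, o = 0;
  unsigned char *out = malloc(len * 4 + 1); /* worst case: 4 bytes each */
  if(!out)
    return NULL;
  for(i = 0; i < len; i++) {
    unsigned long cp = (unsigned long)ws[i];
    size_t n;
    /* combine a UTF-16 surrogate pair (16-bit wchar_t, e.g. Windows) */
    if(cp >= 0xD800 && cp <= 0xDBFF && i + 1 < len) {
      unsigned long low = (unsigned long)ws[i + 1];
      if(low >= 0xDC00 && low <= 0xDFFF) {
        cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
        i++;
      }
    }
    n = utf8_encode(cp, out + o);
    if(!n) {
      free(out);
      return NULL;
    }
    o += n;
  }
  out[o] = '\0';
  return (char *)out;
}

/* Convert a string in the current locale encoding to UTF-8 via wchar_t.
   Returns NULL if the input is not valid in the current locale. */
static char *locale_to_utf8(const char *s)
{
  wchar_t *ws;
  char *utf8;
  size_t n = mbstowcs(NULL, s, 0); /* length query, assumed supported */
  if(n == (size_t)-1)
    return NULL;
  ws = malloc((n + 1) * sizeof(wchar_t));
  if(!ws)
    return NULL;
  mbstowcs(ws, s, n + 1);
  utf8 = wcs_to_utf8(ws);
  free(ws);
  return utf8;
}
```

A tool would typically call setlocale(LC_CTYPE, "") once at startup, before calling locale_to_utf8() on each URL argument, so that mbstowcs() uses the user's encoding. Doing the conversion in the tool keeps the library itself independent of the process-wide locale, which is the thread-safety concern raised above.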