Description
I did this
curl -v http://яндекс.рф
Results:
MinGW64-compiled:
* STATE: INIT => CONNECT handle 0x712f30; line 1108 (connection #-5000)
* Rebuilt URL to: http://яндекс.рф/
* Input domain encoded as `cp1251'
Segmentation fault
Cygwin-compiled:
* STATE: INIT => CONNECT handle 0x6000108b0; line 1108 (connection #-5000)
* Rebuilt URL to: http://яндекс.рф/
* Input domain encoded as `ANSI_X3.4-1968'
* Failed to convert яндекс.рф to ACE; System iconv failed
* Added connection 0. The cache now contains 1 members
* getaddrinfo(3) failed for яндекс.рф:80
* Couldn't resolve host 'яндекс.рф'
* Closing connection 0
* The cache now contains 0 members
* Expire cleared
curl: (6) Couldn't resolve host 'яндекс.рф'
VS14-compiled:
* Could not resolve host: http
* Closing connection 0
curl: (6) Could not resolve host: http
I expected the following
A normal HTTP connection to the site.
curl/libcurl version
MinGW64-compiled:
curl 7.46.0 (x86_64-w64-mingw32) libcurl/7.47.2-DEV OpenSSL/1.0.2e zlib/1.2.8 libidn/1.32 libssh2/1.6.0 librtmp/2.3
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtmp rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: Debug IDN IPv6 Largefile NTLM SSL libz TLS-SRP
Cygwin-compiled:
curl 7.48.1-DEV (x86_64-unknown-cygwin) libcurl/7.48.1-DEV zlib/1.2.8 libidn/1.29
Protocols: dict file ftp gopher http imap pop3 rtsp smtp telnet tftp
Features: Debug IDN IPv6 Largefile libz UnixSockets
VS14-compiled:
curl 7.48.1-DEV (x86_64-pc-win32) libcurl/7.48.1-DEV WinSSL WinIDN
Protocols: dict file ftp ftps gopher http https imap imaps ldap pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: AsynchDNS IDN Largefile SSPI Kerberos SPNEGO NTLM SSL
operating system
Windows 7 SP1 x64
MinGW, Cygwin, VS and their components are all at the latest versions.
I checked: libidn's support for the current locale encoding is broken. It detects an incorrect encoding from the CHARSET environment variable and from nl_langinfo(CODESET) (which needs to be fixed for Win32). Instead of fixing libidn, I suggest revamping the encoding handling in libcurl and the curl tool.
A "locale encoding" may mean something for the curl tool, but for libcurl it is useless and unsafe, as other threads may change it on the fly.
I suggest documenting that libcurl always assumes UTF-8 encoding in URLs (which changes nothing for most POSIX systems), converting the encoding in the curl tool, and using idna_to_ascii_8z() in libcurl.
The WinIDN conversion function curl_win32_idn_to_ascii() already assumes UTF-8 input, but that is currently incorrect because the curl tool doesn't convert anything.
For the curl tool I'd use the highly portable mbstowcs() with a fast custom wchar_t -> UTF-8 conversion.
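A minimal sketch of what that tool-side conversion could look like, assuming the tool has already called setlocale(LC_ALL, "") so that mbstowcs() uses the user's locale. The names wcs_to_utf8() and locale_to_utf8() are hypothetical and are not existing curl functions. Because wchar_t is 16 bits wide on Windows, the sketch also joins UTF-16 surrogate pairs before encoding.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not an existing curl function): encode a
   NUL-terminated wchar_t string as UTF-8. Returns a malloc'ed string the
   caller must free, or NULL on allocation failure or invalid input. */
static char *wcs_to_utf8(const wchar_t *in)
{
  size_t len, i;
  unsigned char *out, *p;

  assert(in != NULL);
  len = wcslen(in);
  out = malloc(len * 4 + 1); /* at most 4 bytes per input unit */
  if(!out)
    return NULL;
  p = out;

  for(i = 0; i < len; i++) {
    unsigned long c = (unsigned long)in[i];

    if(c >= 0xD800 && c <= 0xDBFF) {
      /* high surrogate (16-bit wchar_t): needs a low surrogate next;
         in[i + 1] is at worst the terminating NUL, so the read is safe */
      unsigned long lo = (unsigned long)in[i + 1];
      if(lo < 0xDC00 || lo > 0xDFFF) {
        free(out);
        return NULL;
      }
      c = 0x10000 + ((c - 0xD800) << 10) + (lo - 0xDC00);
      i++;
    }
    else if(c >= 0xDC00 && c <= 0xDFFF) {
      free(out); /* lone low surrogate */
      return NULL;
    }

    if(c < 0x80)
      *p++ = (unsigned char)c;
    else if(c < 0x800) {
      *p++ = (unsigned char)(0xC0 | (c >> 6));
      *p++ = (unsigned char)(0x80 | (c & 0x3F));
    }
    else if(c < 0x10000) {
      *p++ = (unsigned char)(0xE0 | (c >> 12));
      *p++ = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
      *p++ = (unsigned char)(0x80 | (c & 0x3F));
    }
    else if(c <= 0x10FFFF) {
      *p++ = (unsigned char)(0xF0 | (c >> 18));
      *p++ = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
      *p++ = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
      *p++ = (unsigned char)(0x80 | (c & 0x3F));
    }
    else {
      free(out); /* outside the Unicode range */
      return NULL;
    }
  }
  *p = '\0';
  return (char *)out;
}

/* Hypothetical helper: convert a string in the current locale encoding
   to UTF-8 via mbstowcs(). Returns a malloc'ed string or NULL. */
static char *locale_to_utf8(const char *in)
{
  size_t n, r;
  wchar_t *w;
  char *utf8;

  assert(in != NULL);
  n = strlen(in);
  /* every wide character consumes at least one input byte */
  w = malloc((n + 1) * sizeof(wchar_t));
  if(!w)
    return NULL;
  r = mbstowcs(w, in, n + 1);
  if(r == (size_t)-1) {
    free(w); /* invalid multibyte sequence for this locale */
    return NULL;
  }
  w[r] = L'\0'; /* r <= n, so this index is in bounds */
  utf8 = wcs_to_utf8(w);
  free(w);
  return utf8;
}
```

The tool would call locale_to_utf8() on the host part of the URL before handing it to libcurl, which could then pass the UTF-8 string to idna_to_ascii_8z() or curl_win32_idn_to_ascii() without consulting the locale at all.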