
curl on Windows incorrectly handles IDN URLs #731

Closed
@Karlson2k


I did this

curl -v http://яндекс.рф
Results:
MinGW64-compiled:

* STATE: INIT => CONNECT handle 0x712f30; line 1108 (connection #-5000)
* Rebuilt URL to: http://яндекс.рф/
* Input domain encoded as `cp1251'
Segmentation fault

Cygwin-compiled:

* STATE: INIT => CONNECT handle 0x6000108b0; line 1108 (connection #-5000)
* Rebuilt URL to: http://яндекс.рф/
* Input domain encoded as `ANSI_X3.4-1968'
* Failed to convert яндекс.рф to ACE; System iconv failed
* Added connection 0. The cache now contains 1 members
* getaddrinfo(3) failed for яндекс.рф:80
* Couldn't resolve host 'яндекс.рф'
* Closing connection 0
* The cache now contains 0 members
* Expire cleared
curl: (6) Couldn't resolve host 'яндекс.рф'

VS14-compiled:

* Could not resolve host: http
* Closing connection 0
curl: (6) Could not resolve host: http

I expected the following

A normal HTTP connection to the site.

curl/libcurl version

MinGW64-compiled:

curl 7.46.0 (x86_64-w64-mingw32) libcurl/7.47.2-DEV OpenSSL/1.0.2e zlib/1.2.8 libidn/1.32 libssh2/1.6.0 librtmp/2.3
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtmp rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: Debug IDN IPv6 Largefile NTLM SSL libz TLS-SRP

Cygwin-compiled:

curl 7.48.1-DEV (x86_64-unknown-cygwin) libcurl/7.48.1-DEV zlib/1.2.8 libidn/1.29
Protocols: dict file ftp gopher http imap pop3 rtsp smtp telnet tftp
Features: Debug IDN IPv6 Largefile libz UnixSockets

VS14-compiled:

curl 7.48.1-DEV (x86_64-pc-win32) libcurl/7.48.1-DEV WinSSL WinIDN
Protocols: dict file ftp ftps gopher http https imap imaps ldap pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: AsynchDNS IDN Largefile SSPI Kerberos SPNEGO NTLM SSL

operating system

Windows 7 SP1 x64
MinGW, Cygwin, VS and components: latest versions.


I checked: libidn's support for the current locale encoding is broken. It detects an incorrect encoding from the CHARSET environment variable and from nl_langinfo(CODESET) (which would need to be fixed for Win32). Instead of fixing libidn, I suggest revamping the encoding handling in libcurl and the curl tool.
A "locale encoding" may mean something only for the curl tool; for libcurl it is useless and unsafe, because other threads may change the locale on the fly.
I suggest documenting that libcurl always assumes UTF-8 encoding in URLs (which changes nothing for most POSIX systems), converting the encoding in the curl tool, and using idna_to_ascii_8z() in libcurl.
The WinIDN conversion function curl_win32_idn_to_ascii() already assumes UTF-8 input, but that assumption is currently wrong because the curl tool does not convert anything.
For the curl tool I'd use the highly portable mbstowcs() with a fast custom wchar_t -> UTF-8 conversion.
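
To make the last suggestion concrete, here is a minimal sketch of what such a conversion could look like. It is not code from curl: the names wcs_to_utf8() and locale_to_utf8() are hypothetical, and the sketch assumes that wchar_t holds Unicode code points (UTF-16 units on Windows, UTF-32 on most POSIX systems) and that the caller has already selected the locale with setlocale().

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not a curl function): encode a NUL-terminated wide
   string as UTF-8. Returns a malloc()ed string, or NULL on invalid input
   or allocation failure. Assumes wchar_t values are Unicode. */
static char *wcs_to_utf8(const wchar_t *ws)
{
  size_t len = wcslen(ws);
  size_t i;
  /* each wide character produces at most 4 UTF-8 bytes */
  unsigned char *out = malloc(len * 4 + 1);
  unsigned char *p = out;

  if(!out)
    return NULL;

  for(i = 0; i < len; i++) {
    unsigned long cp = (unsigned long)ws[i];

    /* 16-bit wchar_t (Windows): join a UTF-16 surrogate pair */
    if(cp >= 0xD800 && cp <= 0xDBFF && i + 1 < len) {
      unsigned long low = (unsigned long)ws[i + 1];
      if(low >= 0xDC00 && low <= 0xDFFF) {
        cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
        i++;
      }
    }

    /* reject lone surrogates and values outside the Unicode range */
    if((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF) {
      free(out);
      return NULL;
    }

    if(cp < 0x80) {
      *p++ = (unsigned char)cp;
    }
    else if(cp < 0x800) {
      *p++ = (unsigned char)(0xC0 | (cp >> 6));
      *p++ = (unsigned char)(0x80 | (cp & 0x3F));
    }
    else if(cp < 0x10000) {
      *p++ = (unsigned char)(0xE0 | (cp >> 12));
      *p++ = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
      *p++ = (unsigned char)(0x80 | (cp & 0x3F));
    }
    else {
      *p++ = (unsigned char)(0xF0 | (cp >> 18));
      *p++ = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
      *p++ = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
      *p++ = (unsigned char)(0x80 | (cp & 0x3F));
    }
  }
  *p = '\0';
  return (char *)out;
}

/* Hypothetical helper: convert a string in the current locale encoding to
   UTF-8 via mbstowcs(). The caller is assumed to have called setlocale(). */
static char *locale_to_utf8(const char *s)
{
  /* a multibyte string never yields more wide chars than it has bytes */
  size_t max = strlen(s) + 1;
  wchar_t *ws = malloc(max * sizeof(wchar_t));
  char *utf8 = NULL;

  if(!ws)
    return NULL;
  if(mbstowcs(ws, s, max) != (size_t)-1)
    utf8 = wcs_to_utf8(ws);
  free(ws);
  return utf8;
}
```

Under this proposal, the curl tool would pass the host name from the command line through something like locale_to_utf8() before handing the URL to libcurl, and libcurl would then treat the URL as UTF-8 unconditionally. The caller owns the returned buffer and must free() it.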
