Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libidn with bad UTF8 input #332

Closed
jay opened this issue Jul 1, 2015 · 6 comments
Closed

libidn with bad UTF8 input #332

jay opened this issue Jul 1, 2015 · 6 comments

Comments

@jay
Copy link
Member

jay commented Jul 1, 2015

Validate the encoding before passing strings to libcurl or glibc - xnyhps’ blog

I've reviewed the recent curl security advisory regarding libidn and also I've followed some of the discussion on the wget mailing list regarding this issue, and read the reporter's comments.

I think we should do what wget is doing and sanitize check the input. To that end I've started work in branch check_utf8_before_libidn on checking the hostname for UTF-8 correctness before we pass it to libidn.

I didn't even build it yet but I want to get it out there so we have something to work from. Based on what the reporter describes I'm also concerned there could possibly be a salient issue when it comes to multi-byte code pages, like Korea in Windows. I wonder if there could be a similar issue in libidn with that. We are calling idna_to_ascii_lz to convert the hostname and so possibly we may have to consider not only UTF-8 but validating any locale codepage before passing it to libidn.

I think reading past the null byte is a libidn problem but we have to do something. Alternative or maybe do it anyway we could go dirtier and do something like this

    char *safehost;
    len = strlen(host->name);
    safehost = malloc(len + 4);
    memcpy(safehost, host->name, len);
    safehost[len] = 0;
    safehost[len + 1] = 0;
    safehost[len + 2] = 0;
    safehost[len + 3] = 0;
    rc = idna_to_ascii_lz(safehost, &ace_hostname, 0);
@ghedo
Copy link
Contributor

ghedo commented Jul 1, 2015

I think Daniel's opinion was that this should be fixed in libidn not curl (or wget, or any other project that uses libidn), but he may have changed his mind, idk. FWIW I also prepared a patch (that Daniel rejected for the reason above) that uses iconv() to validate UTF-8 input. Lemme see if I can find it, though your patch is self-contained so maybe it's better.

IMO it makes sense for applications that accept untrusted input and pass it to libidn to do UTF8 validation themselves, though it might be tricky to do properly (hence my use of iconv()).

@ghedo
Copy link
Contributor

ghedo commented Jul 1, 2015

@jay
Copy link
Member Author

jay commented Jul 1, 2015

Neat, thanks. From the advisory:

We have also decided that libcurl is not responsible for scanning for invalid unicode, making every libcurl application that is not validating the input encoding of the domain names possibly vulnerable to this issue.

I don't get the impression additional checking is going to rush through libidn. In the interim at least my opinion is we have to do something. What could it hurt to put some safeguard in until they fix? I don't have much more time to work on the branch at the moment. I figured other collaborators might want to pick it up and change it however they want. Perhaps I was presumptuous putting it on bagder/curl, I'll acquiesce if it's withdrawn.

@ghedo
Copy link
Contributor

ghedo commented Jul 8, 2015

@bagder
Copy link
Member

bagder commented Jul 19, 2015

See #340 , closed since libidn fixed it.

@bagder bagder closed this as completed Jul 19, 2015
@jay
Copy link
Member Author

jay commented Jul 22, 2015

I deleted the branch but it's still in my fork in case we need such a function in the future.
https://github.com/bagder/curl/compare/master...jay:check_utf8_before_libidn?expand=1

@lock lock bot locked as resolved and limited conversation to collaborators May 7, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

No branches or pull requests

3 participants