New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
libidn with bad UTF8 input #332
Comments
I think Daniel's opinion was that this should be fixed in libidn not curl (or wget, or any other project that uses libidn), but he may have changed his mind, idk. FWIW I also prepared a patch (that Daniel rejected for the reason above) that uses IMO it makes sense for applications that accept untrusted input and pass it to libidn to do UTF8 validation themselves, though it might be tricky to do properly (hence my use of |
Neat, thanks. From the advisory:
I don't get the impression additional checking is going to rush through libidn. In the interim at least my opinion is we have to do something. What could it hurt to put some safeguard in until they fix? I don't have much more time to work on the branch at the moment. I figured other collaborators might want to pick it up and change it however they want. Perhaps I was presumptuous putting it on bagder/curl, I'll acquiesce if it's withdrawn. |
FWIW libidn now rejects invalid UTF8 input http://git.savannah.gnu.org/cgit/libidn.git/commit/?id=2e97c2796581c27213962c77f5a8571a598f9a2e |
See #340 , closed since libidn fixed it. |
I deleted the branch but it's still in my fork in case we need such a function in the future. |
Validate the encoding before passing strings to libcurl or glibc - xnyhps’ blog
I've reviewed the recent curl security advisory regarding libidn and also I've followed some of the discussion on the wget mailing list regarding this issue, and read the reporter's comments.
I think we should do what wget is doing and
sanitizecheck the input. To that end I've started work in branch check_utf8_before_libidn on checking the hostname for UTF-8 correctness before we pass it to libidn.I didn't even build it yet but I want to get it out there so we have something to work from.
Based on what the reporter describes I'm also concerned there could possibly be a salient issue when it comes to multi-byte code pages, like Korea in Windows. I wonder if there could be a similar issue in libidn with that. We are calling idna_to_ascii_lz to convert the hostname and so possibly we may have to consider not only UTF-8 but validating any locale codepage before passing it to libidn.I think reading past the null byte is a libidn problem but we have to do something. Alternative or maybe do it anyway we could go dirtier and do something like this
The text was updated successfully, but these errors were encountered: