New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DarwinSSL + iOS / CFStringGetCString 128 char buffer too small for Asian locale #1823

Closed
bsergean opened this Issue Aug 24, 2017 · 8 comments

Comments

Projects
None yet
5 participants
@bsergean

bsergean commented Aug 24, 2017

I did this

We are getting an error downloading a page through SSL with libcurl in our iOS app. This only happen when we hit a certain page with a certain certificate, and on Asian locales (simplified Chinese, Japanese).

The error message is "SSL: invalid CA certificate subject", which come from this block of code:

    /* Check if cacert is valid. */
    CFStringRef subject = CopyCertSubject(cacert);
    if(subject) {
      char subject_cbuf[128];
      memset(subject_cbuf, 0, 128);
      if(!CFStringGetCString(subject,
                            subject_cbuf,
                            128,
                            kCFStringEncodingUTF8)) {
        CFRelease(cacert);
        failf(data, "SSL: invalid CA certificate subject");
        return CURLE_SSL_CACERT;
      }

If I bump 128 to 256 everywhere in this file, the error goes away.

  1. Does a certificate subjects needs to be less than 128 bytes in utf-8 ?
  2. Is it "good enough" to bump that value a little higher to 256. Or maybe to 128 * max number of bytes in a utf-8 char (5 ?) if we only ever expect 128 characters in a subject ?
  3. Should we make that code work with arbitrary length ?

I expected the following

I should be able to download that page through SSL.

curl/libcurl version

curl-7.54.1

operating system

iOS

@jay jay added the SSL/TLS label Aug 24, 2017

@jay

This comment has been minimized.

Member

jay commented Aug 24, 2017

That comment "Check if cacert is valid" seems kind of misleading, when I read that I thought some kind of comparison was supposed to occur there and instead it's only checking whether the subject can be converted to UTF-8.

CopyCertSubject in darwinssl.c determines what it is the subject depending on the OS version and functions available. I don't think we can say for certain how big a subject will be. Instead we can allocate the space dynamically. I'm not entirely sure I'm using CFStringGetLength correctly here but it would probably look something like this:

diff --git a/lib/vtls/darwinssl.c b/lib/vtls/darwinssl.c
index 18751ca..f511072 100644
--- a/lib/vtls/darwinssl.c
+++ b/lib/vtls/darwinssl.c
@@ -1888,18 +1888,25 @@ static int append_cert_to_array(struct Curl_easy *data,
       return CURLE_SSL_CACERT;
     }
 
-    /* Check if cacert is valid. */
     CFStringRef subject = CopyCertSubject(cacert);
     if(subject) {
-      char subject_cbuf[128];
-      memset(subject_cbuf, 0, 128);
-      if(!CFStringGetCString(subject,
-                            subject_cbuf,
-                            128,
-                            kCFStringEncodingUTF8)) {
-        CFRelease(cacert);
-        failf(data, "SSL: invalid CA certificate subject");
-        return CURLE_SSL_CACERT;
+      /* If subject is not UTF-8 then check if it can be converted */
+      if(!CFStringGetCStringPtr(subject, kCFStringEncodingUTF8)) {
+        size_t cbuf_size = ((size_t)CFStringGetLength(subject) * 4) + 1;
+        char *cbuf = calloc(cbuf_size, 1);
+        if(!cbuf) {
+          CFRelease(cacert);
+          failf(data, "SSL: couldn't allocate %zu bytes of memory", cbuf_size);
+          return CURLE_OUT_OF_MEMORY;
+        }
+        if(!CFStringGetCString(subject, cbuf, cbuf_size,
+                               kCFStringEncodingUTF8)) {
+          free(cbuf);
+          CFRelease(cacert);
+          failf(data, "SSL: invalid CA certificate subject");
+          return CURLE_SSL_CACERT;
+        }
+        free(cbuf);
       }
       CFRelease(subject);
     }

CFStringGetCStringPtr as a speedup because I read here that if there's already that encoding available in the string it can just return it.

CFStringGetLength questions because it returns CFIndex (long) so can it ever be negative? And also, doc says "Returns the number (in terms of UTF-16 code pairs) of Unicode characters in a string" but then I read this.

/cc @nickzman

@bsergean

This comment has been minimized.

bsergean commented Aug 24, 2017

UTF-8 encodes each of the 1,112,064[9] valid code points in Unicode using one to four 8-bit bytes. From https://en.wikipedia.org/wiki/UTF-8

This tells me that your upper bound evaluation of the number of bytes required for the conversion as X * 4 + 1 is correct.

Re: negative result, you could add an extra check for that (with better error string and return type than what's below). Maybe it's overkill and CFStringGetLength in practice can never return a negative number, but it's safer that way ?

CFIndex len = CFStringGetLength(subject);
if (len < 0) {
    failf(data, "SSL: CFStringGetLength negative return");
    return CURLE_XXX;
}
size_t cbuf_size = ((size_t) 4 * len) + 1;
....

Last but not least, should that same block of code be used everywhere in that file CFStringGetCString was used ? (I would say yes), to eliminate all the static 128 buffers ? It would be less efficient to dynamically allocate that memory, but it would be safer that way.

@BillPyne

This comment has been minimized.

BillPyne commented Aug 24, 2017

Edited to fix mistakes.

It is interesting to note in this particular case that @bsergean reported that the CA certificate subject is 125 bytes characters long with null-terminator (129 bytes). Note the 2x which are each 3 bytes each with UTF-8.

VeriSign Trust Network,(c) 1998 VeriSign, Inc. - For authorized use only,Class 3 Public Primary Certification Authority - G2

When the device is set to English, CFStringGetLength returns 126. However, copying to a 126 byte buffer fails in this case. The smallest buffer size I found to succeed is 127 in this particular case. It gets even more interesting though...

It is interesting that a 127 byte buffer succeeds according to CFStringGetLength since 128 bytes should be required without a null-terminator in UTF-8. Also, the documentation for CFStringGetCString states that there must be room for a null-terminator which is not the case here!

When the device is set to Chinese, CFStringGetLength returns 124! Worse yet, the smallest buffer size I found to succeed is 129 bytes for this particular string.

The Chinese locale appears to work correctly. It is the many other languages that should be failing.

Perhaps the buffer size requires space (however not used) for a BOM? This would also imply that subject is stored in UTF-32 for Chinese and UTF-16 for English. This, of course, might not be the case.

In light of the earlier mistakes and later comment, this theory doesn't fit in the slightest.

I might reach out to Apple for clarification on what is happening here.

I will probably reach out to Apple for the buffer copying "succeeding" when it shouldn't. Perhaps the issue is already known though.

In the mean time, based on these results, the proposed solution of X * 4 + 1 would work in this case more by coincidence based on the findings listed above. It appears it is safer to add +5 though, but the reason is unclear.

@bagder

This comment has been minimized.

Member

bagder commented Aug 24, 2017

The 128 limit is used on multiple places in the darwinssl.c code so if one instance of it is deemed too small, it should probably be increased everywhere.

Since this limit hasn't caused any problem before, it is probably rare with this large values so maybe that makes it OK to just double the value and keep the code simple

@BillPyne

This comment has been minimized.

BillPyne commented Aug 24, 2017

As mentioned in the link provided by @jay, and the example I listed above. It appears CFStringGetLength returns different values based on the underlying string encoding. With UTF-32 (which might be the case with Chinese), the string above would be 124 elements (what is the proper term?) with a null terminator. In UTF-16, it would be 126 without a null-terminator.

So the proposal for X*4 + 1 seems the safest unless there is a way to check the underlying string encoding of the subject?

@nickzman

This comment has been minimized.

Collaborator

nickzman commented Aug 25, 2017

The Unicode standard sometimes does terrible things in order to protect developers from even worse solutions.

The proposal for (CFStringGetLength(subject)*4)+1 is better than what we currently have, and ought to solve this problem.

bagder added a commit that referenced this issue Aug 25, 2017

darwinssl: handle long strings in TLS certs
... as the previous fixed length 128 bytes buffer was sometimes too
small.

Fixes #1823

Reported-by: Benjamin Sergeant
Assisted-by: Bill Pyne, Ray Satiro, Nick Zitzmann
@bagder

This comment has been minimized.

Member

bagder commented Aug 25, 2017

Please review my take at a function that attempts to handle all the fixed-size 128 bytes buffers in #1831. I'm sure I missed something. - I have not built all the possible combinations so some changes are not even compiled.

@bagder bagder closed this in b3b75d1 Aug 27, 2017

@bagder

This comment has been minimized.

Member

bagder commented Aug 27, 2017

Thanks!

jay added a commit to jay/curl that referenced this issue Aug 28, 2017

darwinssl: handle long strings in TLS certs (follow-up)
- Fix handling certificate subjects that are already UTF-8 encoded.

Follow-up to b3b75d1 from two days ago. Since then a copy would be
skipped if the subject was already UTF-8, possibly resulting in a NULL
deref later on.

Ref: curl#1823
Ref: curl#1831

Closes curl#1836

jay added a commit to jay/curl that referenced this issue Aug 28, 2017

darwinssl: handle long strings in TLS certs (follow-up)
- Fix handling certificate subjects that are already UTF-8 encoded.

Follow-up to b3b75d1 from two days ago. Since then a copy would be
skipped if the subject was already UTF-8, possibly resulting in a NULL
deref later on.

Ref: curl#1823
Ref: curl#1831

Closes curl#1836

jay added a commit that referenced this issue Aug 31, 2017

darwinssl: handle long strings in TLS certs (follow-up)
- Fix handling certificate subjects that are already UTF-8 encoded.

Follow-up to b3b75d1 from two days ago. Since then a copy would be
skipped if the subject was already UTF-8, possibly resulting in a NULL
deref later on.

Ref: #1823
Ref: #1831

Closes #1836

@lock lock bot locked as resolved and limited conversation to collaborators May 6, 2018

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.