Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add access to mbedtls_x509_name::next_merged #5431

Closed
mpg opened this issue Jan 13, 2022 · 24 comments · Fixed by #5954
Closed

Add access to mbedtls_x509_name::next_merged #5431

mpg opened this issue Jan 13, 2022 · 24 comments · Fixed by #5954
Assignees
Labels
bug size-s Estimated task size: small (~2d)

Comments

@mpg
Copy link
Contributor

mpg commented Jan 13, 2022

Access to most fields of mbedtls_x509_name was added back in 3.1, but without access to next_merged, applications cannot reliably walk the list. Originally reported in #5331.

@gstrauss
Copy link
Contributor

include/mbedtls/asn1.h contains

    /** Merge next item into the current one?
     *
     * This field exists for the sake of Mbed TLS's X.509 certificate parsing
     * code and may change in future versions of the library.
     */
    unsigned char MBEDTLS_PRIVATE(next_merged);

It would be easy to submit a patch to remove MBEDTLS_PRIVATE(), but the comment suggests there may be other plans for the structure. Are there? If there are indeed concrete plans to change the implementation, would you instead consider patches to replace mbedtls_snprintf() in mbedtls_x509_dn_gets()? Each call to the underlying snprintf can be many hundreds and even thousands of CPU instructions.

@mpg
Copy link
Contributor Author

mpg commented Jan 13, 2022

would you instead consider patches to replace mbedtls_snprintf() in mbedtls_x509_dn_gets()?

That's what I was trying to ask yesterday: if mbedtls_x509_dn_gets() was more efficient, would you still need to walk the mbedtls_x509_name linked list? If the answer is no, then let's keep next_merged private, and improve mbedtls_x509_dn_gets() instead so that everyone can enjoy better performance! (Also, on embedded platforms, reducing usage of snprintf() is good for code size reasons as well.)

(Or we could also do both: improve the performance of dn_gets() and make next_merged public, if it turns out it's needed anyway.)

the comment suggests there may be other plans for the structure. Are there?

I'm not aware of any concrete plans in the immediate future (though in general I've always found this "merged" thing a bit dirty and would be happy if it was refactored to something cleaner at some point). @gilles-peskine-arm according to the history you wrote that comment, do you know more?

@gstrauss
Copy link
Contributor

Yes, I still need to walk the list to create separate variables in the lighttpd environment for each component of the DN. If I were not able to walk the list, I could attempt to parse the value returned from mbedtls_x509_dn_gets().

@mpg
Copy link
Contributor Author

mpg commented Jan 13, 2022

Well as we already discussed, having a function that builds a string that you then parse seems highly inefficient, so let's wait for Gilles' opinion and see if we can make next_merged public.

Independently of that, if you'd like to contribute improvements to dn_gets() making it more efficient and snprintf()-free, please feel free to submit a PR!

@gilles-peskine-arm
Copy link
Contributor

I went through ASN.1 and X.509 parsing structures. I made fields public when I thought there it wasn't likely that the field might disappear or change in semantics in the future. I was not comfortable making mbedtls_asn1_named_data::next_merged public because it's only used for one specific bit of X.509, in a way that's not documented. It looked to me like an optimization that we might want to change in the future, and not a robust API design.

So there are no concrete plans for a change, but a willingness to change.

I'm not fundamentally opposed to making it public, but I'd much prefer to improve the interface or implementation of mbedtls_x509_dn_gets and remove next_merged.

@gstrauss
Copy link
Contributor

I agree with @gilles-peskine-arm
next_merged feels very much like an implementation detail to me, too.

The reason I am walking the mbedtls_x509_name linked list is for historic compatibility with the (excessive) SSL_CLIENT_S_DN_* variables created by Apache to break down the client Subject DN components for (possible) use by CGI consumers. However, I do not believe such use is common, and so I believe it is a huge waste. (I still try to minimize the expense.)

I would prefer not to re-parse the string generated by mbedtls_x509_dn_gets(), so to avoid next_merged, how about an additional interface that walks the mbedtls_x509_name linked list and then calls a callback function (passed by the caller) with each component of the DN? (This is a common idiom in lighttpd where the CGI environment variables are the same, but CGI needs to format them in the environment, and FastCGI needs to encode them using the FastCGI protocol, and SCGI needs to encode them using the SCGI protocol, AJP13 needs to encode them using the Apache JServe Protocol 1.3, etc.)

@gstrauss
Copy link
Contributor

And since there is a willingness to review a PR reducing the use of mbedtls_snprintf, I'll look to provide a patch next week. :)

@gstrauss
Copy link
Contributor

The following is completely untested. I tried to follow the coding conventions in the library/x509.c file. Before I continue, please let me know if this approach appears acceptable or if there are non-starters. Thank you. I aimed for behavior identical to the prior code except that if truncation will occur, I truncate before adding the next short_name, and truncate within the name->val, if necessary. Is C90 declaration in 'for' loop ok?

int mbedtls_x509_dn_gets( char *buf, size_t size, const mbedtls_x509_name *dn )
{
    size_t n = 0, klen, vlen;
    int merge = 0;
    const char *short_name;

    for ( const mbedtls_x509_name *name = dn; name != NULL; name = name->next )
    {
        if( !name->oid.p )
            continue;

        if( 0 != mbedtls_oid_get_attr_short_name( &name->oid, &short_name ) )
            short_name = "??";
        klen = strlen(short_name);

        /*(include space for at least 'short_name=' and one val char + '\0')*/
        vlen = ( name != dn ) ? 3 : 0; /*(" + " or ", " or "")*/
        if( vlen + klen + 2 + 1 > size - n )
        {
            if ( size )
                buf[n] = '\0';
            return( MBEDTLS_ERR_X509_BUFFER_TOO_SMALL );
        }
        if( vlen )
        {
            vlen = merge ? 3 : 2; /*(" + " or ", ")*/
            memcpy( buf+n, merge ? " + " : ", ", vlen );
            n += vlen;
        }
        memcpy( buf+n, short_name, klen );
        n += klen;
        buf[n++] = '=';

        vlen = name->val.len;
        if( vlen >= MBEDTLS_X509_MAX_DN_NAME_SIZE )
            vlen = MBEDTLS_X509_MAX_DN_NAME_SIZE-1;
        if( vlen > size - n )
            vlen = size - n; /*(very end of buffer to detect truncation below)*/
        for( size_t i = 0; i < vlen; ++i )
        {
            unsigned char c = name->val.p[i];
            buf[n+i] = ( c >= 32 && c < 127 ) ? c : '?';
        }
        n += vlen;
        if( n == size )
        {
            buf[n-1] = '\0';
            return( MBEDTLS_ERR_X509_BUFFER_TOO_SMALL );
        }

        merge = name->next_merged;
    }

    if ( size )
        buf[n] = '\0';
    return( (int)n );
}

@gstrauss
Copy link
Contributor

@gilles-peskine-arm wrote

I was not comfortable making mbedtls_asn1_named_data::next_merged public because it's only used for one specific bit of X.509, in a way that's not documented.

Can you give me some hints about how and when something might be encoded in a way that uses next_merged? When I walk the Subject DN, will I never encounter next_merged? Can I simply use the first mbedtls_x509_name and then call a function to return the next item in the linked list (which under the hood skips all the following nodes with next_merged set)?

@gstrauss
Copy link
Contributor

Hmmm. So the next_merged is for multi-valued RDN (Relative Distinguished Name). For my use splitting the client Subject DN into separate environment variables, one for each RDN, I think that I can ignore next_merged and treat each mbedtls_x509_name as an attribute=value to put into the environment.

@gstrauss
Copy link
Contributor

gstrauss commented Jan 13, 2022

Sorry about my misunderstanding. I did not know about multi-valued RDNs.

This issue can be closed. Hiding next_merged is not a regression.

Once I get a thumbs up or thumbs down for the reimplementation above of mbedtls_x509_dn_gets(), I'll file a separate PR with the patch.

@gilles-peskine-arm
Copy link
Contributor

Your code looks plausible at first glance. C99 declarations are ok.

I don't actually know what next_merged is for: all I know is that by reading the code, I can tell that it was solely meant for one specific purpose and it looks like a bit of a hack. I haven't researched multi-valued RDN.

@gilles-peskine-arm
Copy link
Contributor

We do need to make sure that we have enough test cases that cover the cases that next_merged are used for, and edge cases that the rewrite might plausibly affect.

@mpg
Copy link
Contributor Author

mpg commented Jan 14, 2022

At a glance, I don't see any obvious blocker with your code. Out of curiosity, I tried and it passes our existing tests. But then I looked, and we only have 4 test cases, and none of them seems to be exercising edge cases or the use of next_merged. None of them exercises the case where the output buffer is too small, which is quite crucial for security.

So, you're welcome to open a PR with a new version of dn_gets(), as Gilles suspected, the tests need to be improved before a replacement can be accepted. If you feel like contributing extended tests too, that's great! Otherwise, I'm afraid your rewrite will have to wait for us to improve testing first.

@gstrauss
Copy link
Contributor

Yes, writing more tests is, of course, a requirement for string slinging functions. I wanted to test the temperature with this mock-up before spending more effort on lots of tests. This may take me some time, as I have run out of tuits this week.

@gstrauss
Copy link
Contributor

Data point: I did a quick, non-scientific benchmark and found the mbedtls_x509_dn_gets() above compiled -O2 is approx 5x - 6x faster than the existing implementation in mbedtls library/x509.c.

@gstrauss
Copy link
Contributor

@gilles-peskine-arm wrote (#3326):

If lighttpd uses the output of mbedtls_x509_dn_gets for other purposes, and you want to preserve non-ASCII characters, can you please open a separate feature request for this, and explain your constraints (with respect to diversity of certificates to support, acceptability of invalid output, etc.)?

One use case is in web servers which support client certificate verification, where the client provides a certificate to authenticate the client with the server. When client certificate verification fails, certificate info may end up in error logs. Since encoding/escaping data sent to error logs is a good practice to do on all data, let us put the failure case to the side. When client certificate verification succeeds, the web server presumably has validated that the client certificate is signed by a CA trusted by the web server. Many web servers (lighttpd included) will set the environment variable SSL_CLIENT_S_DN (client certificate "subject" DN) (probably following historical Apache httpd behavior) and this environment variable may be used by scripts to identify the authenticated client. The TLS connection is terminated in the lighttpd server, so SSL_CLIENT_S_DN and other variables are used to transmit the client certificate info to backends such as FastCGI, SCGI, AJP13, etc.

mbedtls_x509_dn_gets() -- which can not assume the case of validated certificate signed by a trusted CA -- restricts to (non-CTL) ASCII the characters allowed in RDN values, replacing other characters with '?'. As noted in #3326, this overwrites valid non-ASCII UTF-8 with '?', making SSL_CLIENT_S_DN non-identifying for subject DNs containing non-ASCII UTF-8.

There are a host of complicated rules and conventions and RFCs for stringifying and formatting a DN. mbedtls_x509_dn_gets() coarsely simplifies things by restricting chars to ASCII, but does not perform backslash escaping of restricted chars, such as comma (','). This is not an issue for lighttpd, since lighttpd trusts the CA when client certificate verification succeeds, assuming the server admin has not misconfigured CA trusts, and assuming the CA does not sign certs containing chars requiring such escaping. lighttpd would like to support non-ASCII UTF-8 in certificates with mbedtls, so the option available to me right now is to avoid mbedtls_x509_dn_gets() and to construct the subject DN string myself. I can do this to whatever specifications I deem fit, but if I avoid next_merged, then I will not be able to support multi-valued RDNs. Multi-valued RDNs are presumably less common than non-ASCII UTF-8, so I will probably roll my own function instead of using mbedtls_x509_dn_gets(). Regarding acceptability of invalid input, if I trust the CA not to issue certificates which require any special escaping when stringifying the DN, then I do not absolutely need to validate the RDN value, though checking for embedded '\0' would still be a good sanity check before lighttpd puts the value into the environment.

Without creating a complex function similar to openssl X509_NAME_print_ex() with many flags to adjust formatting behavior, a future version of mbedtls might restrict mbedtls_x509_dn_gets() to debug and info functions under #ifndef MBEDTLS_X509_REMOVE_INFO.

Regarding this github issue about the hiding of next_merged in mbedtls 3.0.0, it might be useful to provide an (inline) accessor function is_multi_valued_rdn() which can check if next_merged is set, while hiding the implementation detail.

After this research, I have concluded that this github issue is not a blocker or in any way urgent for lighttpd use of mbedtls, so I am okay with this github issue being closed.

@gilles-peskine-arm
Copy link
Contributor

@gstrauss Thanks for the detailed explanation! I guess that apart from RDN, you're planning to access dn->val->p directly and decode it according to dn->val->tag? Which allows the application to decide how thorough it wants to be about encodings.

And for RDN, I guess we should add a function mbedtls_x509_x509_dn_get_next which returns a mbedtls_x509_name *, either a pointer to the next DN list node or NULL.

I wish we could remove next_merged from mbedtls_asn1_named_data and change mbedtls_x509_name to be a structure containing an mbedtls_asn1_named_data and a mbedtls_x509_name*, but I fear this isn't possible since we didn't make mbedtls_x509_name opaque: it's transparently interchangeable with mbedtls_asn1_named_data. It's a pity we didn't think of changing those X509 type to be structs with the ASN1 struct as a field.

lighttpd-git pushed a commit to lighttpd/lighttpd1.4 that referenced this issue Jan 19, 2022
reconstruct SSL_CLIENT_S_DN in lighttpd due to limitations of
mbedtls_x509_dn_gets().  Adds support for non-ASCII UTF-8,
but loses support for multi-valued RDNs.

x-ref:
  "Add access to mbedtls_x509_name::next_merged"
  Mbed-TLS/mbedtls#5431
@gstrauss
Copy link
Contributor

@gilles-peskine-arm it is difficult to provide a one-size fits all "convenience" interface. ;)

I guess that apart from RDN, you're planning to access dn->val->p directly and decode it according to dn->val->tag? Which allows the application to decide how thorough it wants to be about encodings.

Yes. In lighttpd's case, I replaced mbedtls_x509_dn_gets() with my own in lighttpd mod_mbedtls since I would like to support non-ASCII UTF-8, and when lighttpd is generating SSL_CLIENT_S_DN, lighttpd is doing so on a certificate signed by a CA lighttpd has been configured to trust. lighttpd 1.4.64 has been released with experimental support for mbedtls 3.0.0.

@gstrauss
Copy link
Contributor

@mpg wrote:

At a glance, I don't see any obvious blocker with your code. Out of curiosity, I tried and it passes our existing tests. But then I looked, and we only have 4 test cases, and none of them seems to be exercising edge cases or the use of next_merged. None of them exercises the case where the output buffer is too small, which is quite crucial for security.

So, you're welcome to open a PR with a new version of dn_gets(), as Gilles suspected, the tests need to be improved before a replacement can be accepted. If you feel like contributing extended tests too, that's great! Otherwise, I'm afraid your rewrite will have to wait for us to improve testing first.

Ideally, I would be able to add a line to tests/suites/test_suite_x509write.data to validate mbedtls_x509_dn_gets() with a multi-valued RDN. However, from my limited read of the code, it does not appear that mbedtls_x509_string_to_names() supports multi-valued RDNs. I think it too much of an imposition to ask me to add support to mbedtls for this esoteric extension in order to write a test to verify that the debug function mbedtls_x509_dn_gets() can stringify a multi-valued RDN.

I can look into adding some tests where the output buffer is too small, but I do not plan to add multi-valued RDN tests.
==> Will this be acceptable?

@mpg
Copy link
Contributor Author

mpg commented Jan 31, 2022

Sorry for the late reply. I think your point is entirely fair, we appreciate your time in giving us very valuable feedback and contributing improvements, and we don't want to impose. However, I'm not sure that's entirely the same question as whether changing this part of the code without testing that case (which, as you say, it relatively esoteric, but precisely for that reason, an area where mistakes seem more likely) is technically acceptable. I'll try to find some time this week to have a closer look and see if I can perhaps extend testing myself, then we'd be in a better position to accept you contribution.

@gstrauss
Copy link
Contributor

I thought of a possible way to test this without having to write tests using binary-encoded strings for input: use mbedtls_x509_string_to_names() to parse a string, and then manually set next_merged on one node. That and testing size limits in mbedtls_x509_dn_gets() both would benefit from a specific test routine for mbedtls_x509_dn_gets() in tests/suites/test_suite_x509write.function

@gstrauss
Copy link
Contributor

gstrauss commented Feb 2, 2022

I filed #5494. I expect that you will find it too large a change, but I wanted to share the code. lighttpd no longer uses these functions in common lighttpd code paths due to limitations in these mbedtls routines, such as not supporting non-ASCII UTF-8. (However, lighttpd mod_mbedtls does still use mbedtls_x509_crt_verify_info in an error condition with client certificate verification.)

For lighttpd, I deemed support for non-ASCII UTF-8 more valuable than seldom-used multi-valued RDNs. As I mentioned above, multi-valued RDNs are not fully supported in mbedtls, either, e.g. mbedtls_x509_string_to_names() is not aware of multi-valued RDNs.

@davidhorstmann-arm
Copy link
Contributor

This issue was long ago closed, but it may be worth pointing out that the X.509 spec states in Appendix M.5:

Each RDN is constructed from one or more attributes by combining the attribute type with an appropriate attribute value. A complication was added by allowing a set of attribute types and values to make up a single RDN. That capability should not be used in a PKI context.

Which as I read it implies that multi-valued RDNs are not to be used for X.509 certificates at all. This would make the next_merged field unnecessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug size-s Estimated task size: small (~2d)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants