SSL "issuer" and "server" names cannot be parsed #44165

Closed
nagle mannequin opened this issue Oct 24, 2006 · 13 comments
Labels
stdlib Python modules in the Lib dir

Comments

@nagle
Mannequin

nagle mannequin commented Oct 24, 2006

BPO 1583946
Nosy @loewis, @akuchling

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2007-09-10.21:55:03.444>
created_at = <Date 2006-10-24.18:32:27.000>
labels = ['library']
title = 'SSL "issuer" and "server" names cannot be parsed'
updated_at = <Date 2007-09-10.21:55:03.444>
user = 'https://bugs.python.org/nagle'

bugs.python.org fields:

activity = <Date 2007-09-10.21:55:03.444>
actor = 'janssen'
assignee = 'janssen'
closed = True
closed_date = <Date 2007-09-10.21:55:03.444>
closer = 'janssen'
components = ['Library (Lib)']
creation = <Date 2006-10-24.18:32:27.000>
creator = 'nagle'
dependencies = []
files = []
hgrepos = []
issue_num = 1583946
keywords = []
message_count = 13.0
messages = ['30384', '30385', '30386', '30387', '30388', '30389', '30390', '30391', '30392', '55298', '55446', '55652', '55799']
nosy_count = 4.0
nosy_names = ['loewis', 'akuchling', 'janssen', 'nagle']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = None
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue1583946'
versions = []

@nagle
Mannequin Author

nagle mannequin commented Oct 24, 2006

(Python 2.5 library)

The Python SSL object offers two methods for
obtaining the info from an SSL certificate, "server()"
and "issuer()". These return strings.

The actual values in the certificate are a series
of key/value pairs in ASN.1 binary format. But what
"server()" and "issuer()" return are single strings,
with the key/value pairs separated by "/".

However, "/" is a valid character in certificate
data. So parsing such strings is ambiguous, and
potentially exploitable.

This is more than a theoretical problem. The
issuer field of VeriSign certificates has a "/" in the
middle of a text field:

"/O=VeriSign Trust Network/OU=VeriSign,
Inc./OU=VeriSign International Server CA - Class
3/OU=www.verisign.com/CPS Incorp.by Ref. LIABILITY
LTD.(c)97 VeriSign".

Note the

"OU=Terms of use at www.verisign.com/rpa (c)00"

with a "/" in the middle of the value field. Oops.

Worse, this is potentially exploitable. By
ordering a low-level certificate with a "/" in the
right place, you can create the illusion (at least for
flawed implementations like this one) that the
certificate belongs to someone else. Just order a
certificate from GoDaddy, enter something like this in
the "Name" field

"Myphonyname/C=US/ST=California/L=San Jose/O=eBay
Inc./OU=Site Operations/CN=signin.ebay.com"

and Python code will be spoofed into thinking you're eBay.

Fortunately, browsers don't use Python code.
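
The ambiguity described above can be reproduced without any SSL code at all. The following is a minimal sketch, assuming a caller that splits the one-line form on "/"; `naive_parse` is a hypothetical helper written for this illustration, not part of any library, and the hostile name is the one from the report.

```python
# Hypothetical helper: splits the "/"-separated one-line form the way a
# careless caller might. Shown only to illustrate why this is unsafe.
def naive_parse(oneline):
    pairs = []
    for part in oneline.split("/"):
        if not part:
            continue  # skip the empty piece before the leading "/"
        key, sep, value = part.partition("=")
        pairs.append((key, value if sep else None))
    return pairs


# One single CN attribute whose value was chosen by the applicant.
hostile_cn = ("Myphonyname/C=US/ST=California/L=San Jose/O=eBay Inc."
              "/OU=Site Operations/CN=signin.ebay.com")
flattened = "/CN=" + hostile_cn

parsed = naive_parse(flattened)

# The last "CN" the naive parser sees is the attacker-chosen host name.
spoofed_cn = [value for key, value in parsed if key == "CN"][-1]
print(len(parsed), spoofed_cn)
```

The certificate holds one attribute, yet the flattened string is indistinguishable from one that holds seven, which is why no amount of care in the caller can recover the original structure.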

The actual bug is in

python/trunk/Modules/_ssl.c

at

    if ((self->server_cert = SSL_get_peer_certificate(self->ssl))) {
        X509_NAME_oneline(X509_get_subject_name(self->server_cert),
                          self->server, X509_NAME_MAXLEN);
        X509_NAME_oneline(X509_get_issuer_name(self->server_cert),
                          self->issuer, X509_NAME_MAXLEN);

The "X509_NAME_oneline" function takes an X509_NAME
structure, which is the certificate system's
representation of a list, and flattens it into a
printable string. This is a debug function, not one
for use in production code. The OpenSSL documentation for
"X509_NAME_oneline" says:

"The functions X509_NAME_oneline() and
X509_NAME_print() are legacy functions which produce a
non standard output form, they don't handle multi
character fields and have various quirks and
inconsistencies. Their use is strongly discouraged in
new applications."

What OpenSSL callers are supposed to do is call
X509_NAME_entry_count() to get the number of entries in
an X509_NAME structure, then get each entry with
X509_NAME_get_entry(). A few more calls will obtain
the name/value pair from the entry, as UTF8 strings,
which should be converted to Python UNICODE strings.
OpenSSL has all the proper support, but Python's shim
doesn't interface to it correctly.

X509_NAME_oneline() doesn't handle Unicode; it converts
non-ASCII values to "\xnn" format. Again, it's for
debug output only.

So what's needed are two new functions for Python's SSL
sockets to replace "issuer" and "server". The new
functions should return lists of Unicode strings
representing the key/value pairs. (A list is needed,
not a dictionary; two strings with the same key
are both possible and common.)
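
The point about lists versus dictionaries can be shown with the first three entries of the VeriSign issuer string quoted earlier. This is a minimal sketch; the variable names are chosen for illustration only.

```python
# Name/value pairs taken from the VeriSign issuer string quoted in the
# report (first three entries only), kept in certificate order.
issuer_pairs = [
    ("O", "VeriSign Trust Network"),
    ("OU", "VeriSign, Inc."),
    ("OU", "VeriSign International Server CA - Class 3"),
]

# A dictionary keeps only one value per key, so the first "OU" is lost.
as_dict = dict(issuer_pairs)

# A list of pairs keeps every entry and preserves the order.
all_ous = [value for key, value in issuer_pairs if key == "OU"]

print(len(issuer_pairs), len(as_dict), all_ous)
```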

The reason this now matters is that new "high
assurance" certs, the ones that tell you how much a
site can be trusted, are now being deployed, and to use
them effectively, you need that info. Support for them
is in Internet Explorer 7, so they're going to be
widespread soon. Python needs to catch up.

And, of course, this needs to be fixed as part of
Unicode support.

            John Nagle
            Animats

@nagle nagle mannequin added stdlib Python modules in the Lib dir labels Oct 24, 2006
@gpshead
Member

gpshead commented Oct 24, 2006

@nagle
Mannequin Author

nagle mannequin commented Oct 24, 2006

Logged In: YES
user_id=5571

The problem isn't in the version of OpenSSL used in Python,
which is at 0.9.8a. OpenSSL has had the necessary functions
for years. But Python isn't using them.

It's in "python/trunk/Modules/_ssl.c", as described above.

@loewis
Mannequin

loewis mannequin commented Oct 25, 2006

Logged In: YES
user_id=21627

The bug is not in the server() and issuer() methods
(which do exactly what they are meant to do); the bug is in
applications which assume that the result of these methods
can be parsed. As you point out, it cannot. The functions,
as is, don't present a security problem. If their result is
presented as-is to the user, the user can determine herself
whether she recognizes the entity referred-to in the
distinguished name.

Notice that it is certainly possible to produce an
unambiguous string representation of a distinguished name;
RFC 4514 specifies an algorithm to do so (for use within LDAP).
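
The escaping idea behind such a representation can be sketched as follows. This is an approximation of the value-escaping rules in RFC 4514, not a complete implementation: it assumes every pair is a single-valued RDN and that short attribute names are already available. `escape_value` and `format_dn` are hypothetical names used only here.

```python
def escape_value(value):
    """Escape one attribute value, approximating the RFC 4514 rules."""
    out = []
    last = len(value) - 1
    for i, ch in enumerate(value):
        if ch in '"+,;<>\\':
            out.append("\\" + ch)       # characters that are always escaped
        elif ch == "\x00":
            out.append("\\00")          # NUL is written as a hex pair
        elif i == 0 and ch in " #":
            out.append("\\" + ch)       # leading space or "#"
        elif i == last and ch == " ":
            out.append("\\" + ch)       # trailing space
        else:
            out.append(ch)
    return "".join(out)


def format_dn(pairs):
    """Join (name, value) pairs with commas, last RDN first."""
    return ",".join(f"{key}={escape_value(value)}"
                    for key, value in reversed(pairs))


pairs = [("O", "VeriSign Trust Network"), ("OU", "VeriSign, Inc.")]
print(format_dn(pairs))
```

Because the separator is a comma and any comma inside a value is escaped, a "/" in a value needs no special treatment and a value cannot pose as an extra attribute.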

Also notice that the SSL module does little to actually
support trust: there is no verification of server-side
certs, no access to extensions of a certificate, etc. So an
application and a user should *not* trust the issuer name it
received, anyway (unless there is an independent verification
that the server certificate can be trusted).

All that said: If you think you need this functionality,
please provide a patch to implement it.

@nagle
Mannequin Author

nagle mannequin commented Oct 25, 2006

Logged In: YES
user_id=5571

Actually, they don't do what they're "designed to do".
According to the Python library documentation for SSL
objects, the server method "Returns a string containing the
ASN.1 distinguished name identifying the server's
certificate. (See below for an example showing what
distinguished names look like.)" The example "below" is
missing from the documentation, so the documentation gives
us no clue of what to expect.

There are several standardized representations for ASN.1
information. See
"http://www.oss.com/asn1/tutorial/Explain.html" Most are
binary. The only standard textual form is "XER", which is an
XML representation of ASN.1 encoded information. It's
essentially the same representation used for parameters in
SOAP.

So, given the documentation and the standard, what should be
coming out is the XML representation of that data.

Here's an entire X.509 certificate in XML:

http://www.gnu.org/software/gnutls/manual/html_node/An-X_002e509-certificate.html

The "issuer" field can be seen in there. It's awfully
bulky. And making SSL dependent on the SOAP module probably
isn't desirable. But that's an ASN.1 distinguished name in
XML format, per the standard.

That's probably not what's wanted by most users, although
the ability to retrieve an entire certificate in XML format
would be useful.

However, there's another standard string encoding, which is
defined in RFC2253. This is comma-separated UTF-8 with
backslash escapes for special characters. That's reliably
parseable. There's an OpenSSL function,
"X509_NAME_print_ex", which does this formatting, but it
doesn't output to a string. That's the right mechanism if
it can be invoked in some way to yield a string. It should
be invoked with flags = XN_FLAG_RFC2253 (which includes
ASN1_STRFLGS_RFC2253), which yields a UTF8 string, which of
course should become a Python Unicode string.

Now if someone can figure out how to get a string, instead
of file output, out of OpenSSL's "X509_NAME_print_ex", we're
home.

@loewis
Mannequin

loewis mannequin commented Oct 25, 2006

Logged In: YES
user_id=21627

Notice that RFC 2253 has been superseded by RFC 4514 (see my
earlier message). However, I really see no reason to fix this:
even if the ambiguity problems were fixed, you *still* should not
use the issuer and subject names in a security-relevant context.

@akuchling
Member

Logged In: YES
user_id=11375

I've reworded the description in the documentation to say
something like this: "Returns a string describing the issuer
of the server's certificate. Useful for debugging purposes;
do not parse the content of this string because its format
can't be parsed unambiguously."

For adding new features: please submit a patch. Python's
maintainers probably don't use SSL in any sophisticated way
and therefore have no idea what shape better SSL/X.509
support would take.

@nagle
Mannequin Author

nagle mannequin commented Nov 8, 2006

Logged In: YES
user_id=5571

I've submitted a request (titled "Request: make
X509_NAME_oneline() use same formatter as
X509_NAME_print_ex()") to the OpenSSL developers to fix this
on their side. If they fix that, delimiters will be escaped
per the standard.

The OpenSSL people should also export the functionality of
getting this information as a UTF8 string, and if they do,
Python should use that call as part of Unicode support. Keep
this open pending action on the OpenSSL side. Thanks.

@akuchling
Member

The request is bug #1425 in the OpenSSL request tracker (go to openssl.org > Support for a link).

@janssen
Mannequin

janssen mannequin commented Aug 26, 2007

I believe bpo-1018 addressed this, and that it can be closed. Though
socket.ssl, and its methods "server" and "issuer", should be deprecated.

@janssen
Mannequin

janssen mannequin commented Aug 29, 2007

Actually, looking at it further, I'm not sure that it is fixed by the new
SSL code. If in fact the issuer or subject field can contain multiple
name-value pairs with the same name, the dictionary-based approach
currently used won't work. We'll need more of an alist approach, with
name-value tuples in it. I'd better look into this.

@janssen janssen mannequin assigned janssen Aug 29, 2007
@janssen
Mannequin

janssen mannequin commented Sep 5, 2007

I've changed the return value of ssl.sslsocket.getpeercert() to return the
"issuer" and "subject" names as tuples containing 2-element name-value
tuples, in the same order that they appear in the certificate. This
should complete the fulfillment of this issue. Please see the doc page on
library/ssl.rst for more information.
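
A caller might read such a structure along these lines. This is a sketch only: the nesting below (a tuple of RDNs, each a tuple of name/value pairs) follows the shape documented for getpeercert() in current Python and may not match the 2007 revision exactly. The certificate values are made up, and `name_values` is a hypothetical helper.

```python
# Made-up data in the shape documented for the "subject" entry of the
# dictionary returned by getpeercert(); no network access is needed.
cert = {
    "subject": (
        (("countryName", "US"),),
        (("organizationName", "Example Org"),),
        (("organizationalUnitName", "Unit A"),),
        (("organizationalUnitName", "Unit B/with slash"),),
        (("commonName", "www.example.org"),),
    ),
}


def name_values(name, key):
    """Return every value stored under `key`, in certificate order."""
    return [value for rdn in name for k, value in rdn if k == key]


print(name_values(cert["subject"], "organizationalUnitName"))
print(name_values(cert["subject"], "commonName"))
```

Repeated keys survive, and a "/" inside a value stays inside that value, which addresses both problems raised in the original report.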

@janssen
Mannequin

janssen mannequin commented Sep 10, 2007

Fixed in rev 58097.

@janssen janssen mannequin closed this as completed Sep 10, 2007
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022