Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Getting the symbol name of an error library or reason #19848

Open
davidben opened this issue Dec 6, 2022 · 8 comments
Open

Getting the symbol name of an error library or reason #19848

davidben opened this issue Dec 6, 2022 · 8 comments
Labels
triaged: feature The issue/pr requests/adds a feature

Comments

@davidben
Copy link
Contributor

davidben commented Dec 6, 2022

(CC @tiran and @tniessen for CPython and Node, respectively. Please correct me if I've misunderstood what your projects are trying to do here!)

Given a value like SSL_R_THING_WENT_WRONG, ERR_reason_error_string returns a string like "thing went wrong". There doesn't seem to be any way to get the symbol name of an error reason.

I've noticed that both Node and CPython care about doing this, both because their public APIs involve exposing the symbolic name of the error. Node does it by uppercasing and replacing spaces with underscores.
https://github.com/nodejs/node/blob/v19.2.0/src/crypto/crypto_util.cc#L466-L476

This inverts the transform OpenSSL applies to generate the reason strings.
https://github.com/openssl/openssl/blob/openssl-3.0.7/util/mkerr.pl#L565

However, this is not documented and some errors in OpenSSL have manually adjusted strings. SSL_R_AT_LEAST_TLS_1_0_NEEDED_IN_FIPS_MODE's error string is "at least TLS 1.0 needed in FIPS mode". But this isn't universal; SSL_R_UNABLE_TO_FIND_ECDH_PARAMETERS is "unable to find ecdh parameters", not "unable to find ECDH parameters".

CPython takes a different approach. They maintain a table mapping error code to symbol name, seeded by parsing OpenSSL's header files:
https://github.com/python/cpython/blob/v3.11.0/Tools/ssl/make_ssl_data.py
https://github.com/python/cpython/blob/v3.11.0/Modules/_ssl_data_300.h
https://github.com/python/cpython/blob/v3.11.0/Modules/_ssl.c#L6022
https://github.com/python/cpython/blob/v3.11.0/Modules/_ssl.c#L458

Neither of these seems a particularly satisfying solution. The Node solution breaks on errors like SSL_R_AT_LEAST_TLS_1_0_NEEDED_IN_FIPS_MODE, while the CPython solution requires updating CPython source every time OpenSSL introduces a new error. Given two different projects have had to work around this already, perhaps OpenSSL should just provide a function with the appropriate semantics.

Perhaps ERR_reason_symbol_name and ERR_lib_symbol_name to return values like AT_LEAST_TLS_1_0_NEEDED_IN_FIPS_MODE and SSL for SSL_R_AT_LEAST_TLS_1_0_NEEDED_IN_FIPS_MODE and ERR_LIB_SSL, respectively?

@davidben davidben added the issue: feature request The issue was opened to request a feature label Dec 6, 2022
@t8m t8m added triaged: feature The issue/pr requests/adds a feature and removed issue: feature request The issue was opened to request a feature labels Dec 6, 2022
davidben added a commit to davidben/cpython that referenced this issue Dec 6, 2022
…codes

Prior to python#25300, the
make_ssl_data.py script used various tables, exposed in _ssl, to update
the error list.

After that PR, this is no longer used. Moreover, the err_names_to_codes
map isn't used at all. Clean those up. This gets them out of the way if,
in the future, OpenSSL provides an API to do what the code here is doing
directly. (openssl/openssl#19848)
davidben added a commit to davidben/cpython that referenced this issue Dec 6, 2022
…codes

Prior to python#25300, the
make_ssl_data.py script used various tables, exposed in _ssl, to update
the error list.

After that PR, this is no longer used. Moreover, the err_names_to_codes
map isn't used at all. Clean those up. This gets them out of the way if,
in the future, OpenSSL provides an API to do what the code here is doing
directly. (openssl/openssl#19848)
zooba pushed a commit to python/cpython that referenced this issue Apr 3, 2023
…H-100063)

Prior to #25300, the
make_ssl_data.py script used various tables, exposed in _ssl, to update
the error list.

After that PR, this is no longer used. Moreover, the err_names_to_codes
map isn't used at all. Clean those up. This gets them out of the way if,
in the future, OpenSSL provides an API to do what the code here is doing
directly. (openssl/openssl#19848)
gaogaotiantian pushed a commit to gaogaotiantian/cpython that referenced this issue Apr 8, 2023
…codes (pythonGH-100063)

Prior to python#25300, the
make_ssl_data.py script used various tables, exposed in _ssl, to update
the error list.

After that PR, this is no longer used. Moreover, the err_names_to_codes
map isn't used at all. Clean those up. This gets them out of the way if,
in the future, OpenSSL provides an API to do what the code here is doing
directly. (openssl/openssl#19848)
warsaw pushed a commit to warsaw/cpython that referenced this issue Apr 11, 2023
…codes (pythonGH-100063)

Prior to python#25300, the
make_ssl_data.py script used various tables, exposed in _ssl, to update
the error list.

After that PR, this is no longer used. Moreover, the err_names_to_codes
map isn't used at all. Clean those up. This gets them out of the way if,
in the future, OpenSSL provides an API to do what the code here is doing
directly. (openssl/openssl#19848)
@ofek
Copy link

ofek commented Sep 25, 2023

Has there been any update on this?

@levitte
Copy link
Member

levitte commented Sep 26, 2023

Are those projects having similar problems with other error sets, such as errno? After all, errno comes with a number of macros as well, like EPERM, EBADF, ERFKILL, and a number of other macros that are hiding in a plethora of errno.h files.

Technically, this is possible to do, but within limits, as engines and providers get the error library number dynamically (OpenSSL's built-in providers cheat in this regard). CPython paves the way, the way they do it is exactly what I could think of top of my head.

Should this end up in libcrypto / libssl? Not sure... entirely separate library for anyone to use if they want to? That has ramifications that might not be very tasteful...

@gpshead
Copy link

gpshead commented Sep 26, 2023

Unlike SSL & TLS versions, with evolving implementations, errno hasn't changed meaningfully in decades. Untrue of OpenSSL.

If you feel that "parsing .h files" is the public API for this, could you own and provide a .h -> structured data (json?) generator plus a test suite guaranteeing its output is complete and accurate? That'd at least make it official rather than hacks other projects have come up with today and continually expect may break upon new openssl releases.

(An actual API or some generated json that perhaps mkerr.pl or similar outputs may be simpler and easier to maintain.)

@levitte
Copy link
Member

levitte commented Sep 26, 2023

We do have a "similar output", which is what CPython looks at rather than .h files.

crypto/err/openssl.txt

It's not as complete as one might wish, as it doesn't contain the library codes, but that could certainly be added.

So what I hear from you is that it would be desirable if this was a bit more complete, and exported / installed alongside the libraries. And even better if it was exported in a more modern machine digestable form.

That about right?

@davidben
Copy link
Contributor Author

davidben commented Nov 21, 2023

@levitte Are you envisioning that CPython would dynamically find these files at runtime and parse them? Or that this would be a build-time process?

If runtime, isn't this just a less reliable and more cumbersome (relies on finding a random data file) version of the ERR_reason_symbol_name API?

If it's build-time, this doesn't resolve the problem with updates described in the bug report. Since OpenSSL aims to be ABI-stable (i.e. it is possible to update OpenSSL without recompiling applications), this seems particularly bad. It means CPython cannot effectively take advantage of OpenSSL's ABI stability because it'd require a recompile to report errors correctly.

ERR_reason_symbol_name seems most straightforward and the best fit for what real-world OpenSSL applications seem to need.

@gpshead
Copy link

gpshead commented Feb 17, 2024

Rehashing what @davidben said to ping this issue... From a CPython perspective: We don't want to find, load, and parse an SSL error data file at runtime. That is complex (how do we find it? what happens when it doesn't exist? how do we know it is the correct one to go with the library we're linked against... sometimes statically? etc.) and potentially slow (we would likely do it at _ssl module import time, slowing down startup, to avoid added complexity inside the error reporting path).

The proposed ERR_reason_symbol_name and ERR_lib_symbol_name APIs would suffice for CPython.

@levitte
Copy link
Member

levitte commented Mar 6, 2024

Ok, this is doable, but has its limits:

  • error lib and reason symbols for engines and providers can't reasonably be supported (it would require an expansion of how they are declared by engines and providers).
  • error reasons for the ERR_LIB_SYS reason table can't reasonably be supported, as they map system errors (errno numbers), and it's not reasonable to think that we'd hold a catalogue of all the E macros that exist on diverse operating systems.

So essentially, the proposed ERR_reason_symbol_name() and ERR_lib_symbol_name() can reasonably only work with error libs and reasons defined by crypto/err/openssl.txt.

@davidben
Copy link
Contributor Author

davidben commented Mar 6, 2024

Returning NULL when there isn't a symbol name available, and then having the caller handle it then (probably by returning something like UNKNOWN_ERROR in their APIs), seems reasonable enough to me.

davidben added a commit to google/boringssl that referenced this issue Mar 12, 2024
CPython needs this operation. See
openssl/openssl#19848 and
https://discuss.python.org/t/error-tables-in-the-ssl-module/25431 for
details.

In principle, our functions already return the symbol names. The
differences are:

- Our library strings say "common libcrypto routines" instead of
  "CRYPTO".
- The global reason codes say "internal error" instead of
  "INTERNAL_ERROR". (We should consider changing this.)
- The library forwarding reason codes (ERR_R_BN_LIB) say the library
  string instead of "BN_LIB". (We should consider changing this.)
- errnos report strerror
- Unknown errors return "unknown error" because we've found that
  projects tend to crash when these APIs return NULL.

The new APIs consistently return the symbol name, when available. If
unavailable (ERR_LIB_SYS's errno reasons), it returns NULL because I
assume callers would rather be able to handle that case themselves.
Hopefully this will not be as common so callers can take on this one.

Change-Id: Idd9e4b1cb5a4f64513310d8066d6bf3970722c23
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/66807
Reviewed-by: Bob Beck <bbe@google.com>
Commit-Queue: David Benjamin <davidben@google.com>
gpshead added a commit to gpshead/cpython that referenced this issue Mar 27, 2024
Python needs to map OpenSSL error codes like ERR_R_INTERNAL_ERROR into strings
like "INTERNAL_ERROR". OpenSSL lacks an API for this, so CPython instead
maintains its own table.

This table is necessarily sensitive to the OpenSSL version and causes issues for
BoringSSL. Rather than maintain our own copy of this table, BoringSSL has APIs
to do the thing CPython actually wants. This patch switches CPython to use them.
To keep the patch small, it doesn't ifdef the err_codes_to_names, etc., fields,
but they are no longer necessary.

See openssl/openssl#19848 and
https://discuss.python.org/t/error-tables-in-the-ssl-module/25431 for context.

BoringSSL API addition:
https://boringssl.googlesource.com/boringssl/+/dbad745811195c00b729efd0ee0a09b7d9fce1d2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
triaged: feature The issue/pr requests/adds a feature
Projects
None yet
Development

No branches or pull requests

5 participants