Problem: the standard C character handling functions from ctype.h
(isalpha, isdigit, isxdigit, isspace, toupper, tolower, etc.) are locale
aware, but for almost all uses CPython needs locale-unaware versions of
these functions.
There are various solutions in the current source:

- there's a file Include/bytes_methods.h which provides suitable
  ISDIGIT/ISALPHA/... macros, but also undefines the standard functions.
  As it is, it can't be included in Python.h since that would break
  3rd party code that includes Python.h and also uses isdigit.
- some files have their own solution: Python/pystrtod.c defines
  its own (probably inefficient) ISDIGIT and ISSPACE macros.
- in some places the standard C functions are just used directly (and
  possibly incorrectly). A gotcha here is that one has to remember to use
  Py_CHARMASK to avoid errors on some platforms. (See bpo-3633 for an
  example.)

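To illustrate the Py_CHARMASK gotcha: where plain char is signed, a byte such as 0xE9 reaches isdigit() as a negative int, which the C standard leaves undefined. A minimal sketch follows, with a local CHARMASK stand-in modeled on Py_CHARMASK and a hypothetical helper count_leading_digits; neither name comes from the CPython source.

```c
#include <ctype.h>

/* Local stand-in modeled on CPython's Py_CHARMASK (an assumption for this
   sketch): reduce the argument to the range 0..255. */
#define CHARMASK(c) ((unsigned char)((c) & 0xff))

/* Hypothetical helper: count the digits at the start of s. */
static int count_leading_digits(const char *s)
{
    int n = 0;
    /* Writing isdigit(*s) here would be undefined behaviour whenever
       char is signed and *s is negative (any byte >= 0x80). */
    while (isdigit(CHARMASK(*s))) {
        s++;
        n++;
    }
    return n;
}
```

The mask only removes the undefined behaviour; the call still goes through the C library's ctype implementation, so it does not by itself give the locale-independent behaviour asked for here.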
It would be nice to clean all this up, and have one central, efficient,
easy-to-use set of Py_ISDIGIT/Py_ISALPHA ... locale-independent macros (or
functions) that could be used safely throughout the Python source.
I concur. I've also been bitten by forgetting Py_CHARMASK, so a single
version that took this into account (and was locale-unaware) would be
welcome.
In private mail I'd mentioned that if these are functions, they should
take int. But I now think that's incorrect, and they should take char or
unsigned char. I think the standard C functions take int because they
also allow EOF. I think the Py_ versions should allow only characters
and not allow EOF. Py_CHARMASK already enforces this, anyway, with
likely undefined results.
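A small sketch of why an unsigned char parameter is attractive: the conversion at the call site already maps every char, signed or not, into 0..255, so there is no out-of-range argument to guard against. The helper names as_table_index and is_ascii_digit are hypothetical and ASCII is assumed.

```c
/* Hypothetical helper: return the value an unsigned char parameter
   actually receives, i.e. what a 256-entry table would be indexed with. */
static int as_table_index(unsigned char c)
{
    return c;
}

/* Hypothetical locale-independent digit test (ASCII assumed). */
static int is_ascii_digit(unsigned char c)
{
    return c >= '0' && c <= '9';
}
```

With this signature there is no room left for EOF as a distinct argument: any int passed in is converted to some character value first, which matches the "characters only" interface described above.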
I'll implement this by adding a pyctype.h and pyctype.c, mimicking
<ctype.h>. I'll essentially copy and rename the methods in
bytes_methods.[ch], then change bytes_methods.h to refer to the new
versions, for backward compatibility.
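One possible shape for this, as a rough sketch only: every XCT_/xct_ name below is hypothetical and is not the content of pyctype.h, the tests are written as plain comparisons rather than whatever bytes_methods.[ch] actually does, and ASCII is assumed.

```c
/* Sketch of a pyctype.h-like header. All XCT_/xct_ names are hypothetical. */

/* Reduce any char or int argument to 0..255 (modeled on Py_CHARMASK). */
#define XCT_MASK(c) ((unsigned char)((c) & 0xff))

/* Locale-independent tests; they never consult the C locale. */
static int xct_isdigit(unsigned char c)
{
    return c >= '0' && c <= '9';
}

static int xct_isalpha(unsigned char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

static int xct_isspace(unsigned char c)
{
    /* space, plus \t \n \v \f \r, which are contiguous (9..13) in ASCII */
    return c == ' ' || (c >= '\t' && c <= '\r');
}

static int xct_toupper(unsigned char c)
{
    return (c >= 'a' && c <= 'z') ? c - ('a' - 'A') : c;
}

/* Public macros apply the mask so that callers cannot forget it. */
#define XCT_ISDIGIT(c) xct_isdigit(XCT_MASK(c))
#define XCT_ISALPHA(c) xct_isalpha(XCT_MASK(c))
#define XCT_ISSPACE(c) xct_isspace(XCT_MASK(c))
#define XCT_TOUPPER(c) xct_toupper(XCT_MASK(c))

/* Compatibility aliases of the kind bytes_methods.h could keep, so that
   existing users of the unprefixed names continue to compile. */
#define ISDIGIT(c) XCT_ISDIGIT(c)
#define ISALPHA(c) XCT_ISALPHA(c)
#define ISSPACE(c) XCT_ISSPACE(c)
#define TOUPPER(c) XCT_TOUPPER(c)
```

Because the new names carry a prefix, a header like this would not need to undefine the standard isdigit and friends, which is the property that keeps bytes_methods.h out of Python.h today.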