Possible integer overflow in operations with addresses and sizes. #59349
In unicodeobject.c and stringlib, aligned addresses and sizes are used for optimization. Pointer-to-integer and implicit integer-to-integer conversions may overflow or underflow on platforms where sizeof(size_t) != sizeof(void *) or sizeof(size_t) != sizeof(int). The proposed patch fixes these unsafe conversions in unicodeobject.c, stringlib and some other files. There are still a few unsafe places in libffi, but neither Py_uintptr_t nor uintptr_t is available in that library. |
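To illustrate the kind of conversion at stake, here is a minimal sketch of alignment helpers that go through uintptr_t for pointers and stay in size_t for sizes. The function names are hypothetical and are not the macros from the patch; the sketch assumes <stdint.h> provides uintptr_t and that the alignment argument is a power of two.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; 'a' must be a power of two. */

/* Test pointer alignment via uintptr_t, which is wide enough to hold a
   pointer, instead of casting the pointer to size_t, long or int. */
static int is_aligned(const void *p, size_t a)
{
    return ((uintptr_t)p & (uintptr_t)(a - 1)) == 0;
}

/* Round a size down to a multiple of 'a', staying in size_t throughout. */
static size_t size_round_down(size_t n, size_t a)
{
    return n & ~(a - 1);
}

/* Round a size up to a multiple of 'a' (assumes n + a - 1 does not wrap). */
static size_t size_round_up(size_t n, size_t a)
{
    return (n + (a - 1)) & ~(a - 1);
}

/* Round a pointer down to the nearest 'a'-aligned address. */
static const void *align_down(const void *p, size_t a)
{
    return (const void *)((uintptr_t)p & ~(uintptr_t)(a - 1));
}
```

The mask is widened to uintptr_t before it is inverted, so the high bits of the address are not cleared when size_t is narrower than a pointer.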
If we're worrying about undefined behaviour, it looks like recent optimizations have *introduced* new undefined behaviour in the form of strict aliasing violations. E.g., from ascii_decode:
(here _p has type const char *). This should really be fixed; compilers are known to make optimizations based on the assumption that this sort of undefined behaviour doesn't occur. |
Doesn't the compiler have all the necessary information here? It's not like a subroutine is called. |
Me neither, but it doesn't seem safe to assume that no compiler will take advantage of this. I don't want to start guessing what compilers might or might not do; it would be much better simply to stick to valid C where possible. |
N.B. This could probably be fixed without affecting performance by using the usual union trick. (IIUC, that trick was technically still undefined behaviour for a while, but was eventually made legal by C99 + TC3.) As far as I know there aren't any instances of compilers causing problems with that construct. |
How would it work? We would have to add various unions to the |
No, you'd just need a temporary union defined in unicodeobject.c that would look something like: typedef union { unsigned long v; char s[SIZEOF_LONG]; } U; (with better choices of names). Python/dtoa.c does a similar thing to read / write the pieces of a C double using integers safely. |
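A rough sketch of how such a temporary union might be used to test a word's worth of bytes for non-ASCII characters. The names word_view, load_word and word_is_ascii are made up for this example and do not come from unicodeobject.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical union, in the spirit of the one suggested above. */
typedef union {
    unsigned long v;
    char s[sizeof(unsigned long)];
} word_view;

/* Copy the bytes into the char member, then read them back through the
   unsigned long member. No unsigned long lvalue ever refers to the
   original char buffer. */
static unsigned long load_word(const char *p)
{
    word_view u;
    memcpy(u.s, p, sizeof u.s);
    return u.v;
}

/* 0x8080...80 for any width of unsigned long. */
#define ASCII_HIGH_BITS ((unsigned long)-1 / 0xFF * 0x80)

/* Nonzero if none of the sizeof(unsigned long) bytes at p has its
   high bit set. */
static int word_is_ascii(const char *p)
{
    return (load_word(p) & ASCII_HIGH_BITS) == 0;
}
```

Compilers commonly reduce the small fixed-size copy to a single load, so this form should not need to cost more than the pointer cast, though that is worth checking on the compilers that matter.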
I'll see if I can come up with a patch, and open a new issue for it (since I've successfully derailed this issue from its original topic). |
I don't see what the undefined behavior is. Can you specify exactly the
I don't know how else you can rewrite it, without destroying completely In any case, I don't think that the original patch introduces some new |
It's C99 section 6.5, paragraph 7: "An object shall have its stored value accessed only by an lvalue expression that has one of the following types ...". It's the dereferencing of the pointer that's the problem: that's accessing a stored value of type 'char' by an lvalue expression that has type 'unsigned long'.
Ah; were the strict aliasing problems already there before the patch? I didn't check. |
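For concreteness, the pattern that paragraph 7 rules out is a read such as *(const unsigned long *)p where p points into a char buffer. Below is a sketch of a word-at-a-time ASCII scan that avoids it by copying each word into a real unsigned long object with memcpy. The function ascii_prefix_len is hypothetical and is not the code from ascii_decode.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical example: length of the leading run of ASCII bytes in
   p[0..n). Words are loaded with memcpy rather than by dereferencing a
   cast pointer, so every access uses an lvalue of the object's own type. */
static size_t ascii_prefix_len(const char *p, size_t n)
{
    const unsigned long high = (unsigned long)-1 / 0xFF * 0x80;
    size_t i = 0;

    /* Fast path: examine sizeof(unsigned long) bytes per iteration. */
    while (n - i >= sizeof(unsigned long)) {
        unsigned long w;
        memcpy(&w, p + i, sizeof w);
        if (w & high)
            break;              /* some byte in this word is non-ASCII */
        i += sizeof w;
    }

    /* Slow path: finish byte by byte to find the exact position. */
    while (i < n && ((unsigned char)p[i] & 0x80) == 0)
        i++;
    return i;
}
```

Because memcpy has no alignment requirement, this version also works when p is not aligned to a word boundary.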
Please open separate issue for this. |
Please, Mark or some other C expert, can you do a code review for the patch? |
Perhaps the three new macros should be made available in a .h file? |
Perhaps. There are similar macros in other files (Include/objimpl.h, |
Good idea. Maybe pymacros.h? These macros need to be documented (with a comment in the .h file) |
Well, here is a new patch. The five new macros have been moved to pymacros.h and are used in more files. |
Apologies; I got distracted from the main point of this issue with the strict aliasing stuff, and then it fell off the to-do list. Unassigning; Antoine or Victor, do you want to take this? |
Mark, please open a new issue for the strict aliasing problem, so that we don't lose track of it and there is a proper place for that discussion. |
Thanks for the patch! These macros will be useful. |
New changeset 99112b851b25 by Antoine Pitrou in branch 'default': |
Committed in 3.3(.1). |
Done: bpo-15992. |