_sre: avoid relying on pointer overflow #61218
Comments
Modules/_sre.c relies on pointer overflow in 5 places to check that the supplied offset does not cause wraparound when added to a base pointer; e.g.:
However, pointer wraparound is undefined behavior in C, and gcc will optimize the check (code+prefix_len < code) to (false), since prefix_len is an unsigned value. This happens with -O2, and even with -fwrapv:
nickolai@sahara:/tmp$ cat x.c
void bar();
void
foo(int *p, unsigned int x)
{
if (p + x < p)
bar();
}
nickolai@sahara:/tmp$ gcc x.c -S -o - -O2 -fwrapv
...
foo:
.LFB0:
.cfi_startproc
rep
ret
.cfi_endproc
...
nickolai@sahara:/tmp$
On a 32-bit platform with the development version of cpython, prefix_len seems to end up being an 'unsigned int', so I suspect that supplying a large prefix_len value (perhaps 0xffffffff) could lead to the subsequent loop writing garbage all over memory, or worse. I have not tried to construct a concrete input that triggers this bug, so there may be checks that make it difficult to trigger. In any case, this might be worth fixing -- the attached patch provides one proposed fix. Another option might be to add -fno-strict-overflow to the gcc flags. That may be a reasonable additional measure to keep such problems from biting Python in the future, but I would suggest doing it in addition to fixing the code, since not all compilers support a flag to disable these optimizations. |
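For illustration, here is a minimal sketch of a bounds check that never forms an out-of-range pointer. It is not the code from the attached patch: the helper name fits_in_buffer and its parameters are hypothetical, and it assumes that code <= end and that both point into (or one past the end of) the same array.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from the attached patch.
 * Returns 1 if `len` elements starting at `code` fit before `end`,
 * and 0 otherwise.
 * Assumption: code <= end, both within (or one past) the same array. */
static int
fits_in_buffer(const unsigned int *code, const unsigned int *end, size_t len)
{
    /* Under the assumption above, end - code is a well-defined,
     * non-negative ptrdiff_t, so the comparison is done on integers
     * and code + len is never computed. */
    return len <= (size_t)(end - code);
}
```

Because the comparison is between two integers, the compiler has no undefined pointer arithmetic to reason about and cannot drop the check.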
LGTM. There are other doubtful places, at lines: 658, 678, 1000, 1084, 2777, 3111. |
Lines 1000 and 1084 will be a problem only if you're near the top of the address space. This is because:
See also issue bpo-13169. If the 'unlimited' value is raised then fixing those lines will become more urgent. |
Lines 2777 and 3111 do indeed look suspect, because gcc can compile (ptr + offset < ptr) into (offset < 0):
nickolai@sahara:/tmp$ cat x.c
void bar();
void
foo(char* ptr, int offset)
{
if (ptr + offset < ptr)
bar();
}
nickolai@sahara:/tmp$ gcc x.c -S -o - -O2
...
foo:
.LFB0:
.cfi_startproc
testl %esi, %esi
js .L4
rep
ret
.p2align 4,,10
.p2align 3
.L4:
xorl %eax, %eax
jmp bar
.cfi_endproc
...
nickolai@sahara:/tmp$
Lines 658, 678, 1000, 1084 are potentially problematic -- I don't know of current compilers that will do something unexpected, but it might be worth rewriting the code to avoid undefined behavior anyway. |
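As a sketch of how a signed offset could be validated without relying on (ptr + offset < ptr), the check can be expressed on integers only. The helper name offset_in_range and its parameters are hypothetical and do not come from _sre.c; the sketch assumes ptr <= end within the same array.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from _sre.c.
 * Returns 1 if ptr + offset would land within [ptr, end], else 0.
 * Assumption: ptr <= end, both within (or one past) the same array. */
static int
offset_in_range(const char *ptr, const char *end, ptrdiff_t offset)
{
    /* Reject negative offsets explicitly instead of hoping that
     * ptr + offset wraps below ptr. */
    if (offset < 0)
        return 0;
    /* end - ptr is well defined under the assumption above, so
     * ptr + offset is never evaluated here. */
    return offset <= end - ptr;
}
```

This makes the negative-offset case an explicit test, which is what gcc reduces the pointer comparison to anyway, and adds the upper bound that the pointer comparison cannot reliably provide.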
You're checking "int offset", but what happens with "unsigned int offset"? |
For an unsigned int offset, see my original bug report: gcc eliminates the check altogether, since offset >= 0 by definition. |
Nickolai, would you like to update your patch with fixes for the other possible pointer overflows? Note that the maximum repetition number has now been increased. |
Sorry for the delay. Attached is an updated patch that should fix all of the issues mentioned in this bug report. |
Nickolai, can you please submit a contributor form? http://python.org/psf/contrib/contrib-form/ |
I just submitted the contributor form -- thanks for the reminder. |
I get an HTTP error when trying to upload another patch through Rietveld, so here's a revised patch that avoids the need for Py_uintptr_t (thanks Serhiy). |
Of course it would be nice to have tests for as many cases as possible, but I am afraid that will not be easy. The patch LGTM. |
New changeset 27162465316f by Serhiy Storchaka in branch '2.7':
New changeset 2673d207c524 by Serhiy Storchaka in branch '3.3':
New changeset f280786d0e64 by Serhiy Storchaka in branch 'default': |
Thank you, Nickolai, for the patch. |