Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

_sre: avoid relying on pointer overflow #61218

Closed
NickolaiZeldovich mannequin opened this issue Jan 22, 2013 · 14 comments
Closed

_sre: avoid relying on pointer overflow #61218

NickolaiZeldovich mannequin opened this issue Jan 22, 2013 · 14 comments
Assignees
Labels
extension-modules C modules in the Modules dir topic-regex type-security A security issue

Comments

@NickolaiZeldovich
Copy link
Mannequin

NickolaiZeldovich mannequin commented Jan 22, 2013

BPO 17016
Nosy @mdickinson, @vstinner, @ezio-melotti, @serhiy-storchaka
Files
  • pp.patch: Proposed patch for _sre.c
  • pp-2.patch
  • pp-3.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2013-04-13.20:48:30.420>
    created_at = <Date 2013-01-22.15:54:28.408>
    labels = ['type-security', 'extension-modules', 'expert-regex']
    title = '_sre: avoid relying on pointer overflow'
    updated_at = <Date 2013-10-31.15:03:20.570>
    user = 'https://bugs.python.org/NickolaiZeldovich'

    bugs.python.org fields:

    activity = <Date 2013-10-31.15:03:20.570>
    actor = 'vstinner'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2013-04-13.20:48:30.420>
    closer = 'serhiy.storchaka'
    components = ['Extension Modules', 'Regular Expressions']
    creation = <Date 2013-01-22.15:54:28.408>
    creator = 'Nickolai.Zeldovich'
    dependencies = []
    files = ['28804', '29368', '29379']
    hgrepos = []
    issue_num = 17016
    keywords = ['patch']
    message_count = 14.0
    messages = ['180403', '180411', '180443', '180473', '180484', '180485', '182231', '183891', '183959', '183960', '183965', '184184', '186778', '186840']
    nosy_count = 7.0
    nosy_names = ['mark.dickinson', 'vstinner', 'ezio.melotti', 'mrabarnett', 'python-dev', 'serhiy.storchaka', 'Nickolai.Zeldovich']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'security'
    url = 'https://bugs.python.org/issue17016'
    versions = ['Python 2.7', 'Python 3.3', 'Python 3.4']

    @NickolaiZeldovich
    Copy link
    Mannequin Author

    NickolaiZeldovich mannequin commented Jan 22, 2013

    Modules/_sre.c relies on pointer overflow in 5 places to check that the supplied offset does not cause wraparound when added to a base pointer; e.g.:

                    SRE_CODE prefix_len;
                    GET_ARG; prefix_len = arg;
                    GET_ARG;
                    /* Here comes the prefix string */
                    if (code+prefix_len < code || code+prefix_len > newcode)
                        FAIL;
    

    however, pointer wraparound is undefined behavior in C, and gcc will optimize away (code+prefix_len < code) to (true), since prefix_len is an unsigned value. This will happen with -O2 and even with -fwrapv:

    nickolai@sahara:/tmp$ cat x.c
    void bar();
    
    void
    foo(int *p, unsigned int x)
    {
      if (p + x < p)
        bar();
    }
    nickolai@sahara:/tmp$ gcc x.c -S -o - -O2 -fwrapv
    ...
    foo:
    .LFB0:
    	.cfi_startproc
    	rep
    	ret
    	.cfi_endproc
    ...
    nickolai@sahara:/tmp$ 

    On a 32-bit platform with the development version of cpython, prefix_len seems to end up being an 'unsigned int', so I suspect that supplying a large prefix_len value (perhaps 0xffffffff) could lead to the subsequent loop writing garbage all over memory, or worse (but I have not tried to construct a concrete input that triggers this bug, so maybe there are some checks that make it difficult to trigger the bug).

    In any case, this might be worth fixing -- the attached patch provides one proposed fix. Another option might be to add -fno-strict-overflow to the gcc flags, which may be a reasonable additional measure to take, to avoid such problems biting Python in the future, but I would suggest doing this in addition to fixing the code (since not all compilers support such a flag to disable certain optimizations).

    @NickolaiZeldovich NickolaiZeldovich mannequin added the type-security A security issue label Jan 22, 2013
    @serhiy-storchaka
    Copy link
    Member

    LGTM.

    There are other doubtful places, at lines: 658, 678, 1000, 1084, 2777, 3111.

    @mrabarnett
    Copy link
    Mannequin

    mrabarnett mannequin commented Jan 23, 2013

    Lines 1000 and 1084 will be a problem only if you're near the top of the address space. This is because:

    1. ctx->pattern[1] will always be <= ctx->pattern[2].

    2. A value of 65535 in ctx->pattern[2] means unlimited, even though SRE_CODE is now UCS4.

    See also issue bpo-13169.

    If the 'unlimited' value is raised then fixing those lines will become more urgent.

    @NickolaiZeldovich
    Copy link
    Mannequin Author

    NickolaiZeldovich mannequin commented Jan 23, 2013

    Lines 2777 and 3111 do indeed look suspect, because gcc can compile (ptr + offset < ptr) into (offset < 0):

    nickolai@sahara:/tmp$ cat x.c 
    void bar();
    
    void
    foo(char* ptr, int offset)
    {
      if (ptr + offset < ptr)
        bar();
    }
    nickolai@sahara:/tmp$ gcc x.c -S -o - -O2
    ...
    foo:
    .LFB0:
    	.cfi_startproc
    	testl	%esi, %esi
    	js	.L4
    	rep
    	ret
    	.p2align 4,,10
    	.p2align 3
    .L4:
    	xorl	%eax, %eax
    	jmp	bar
    	.cfi_endproc
    ...
    nickolai@sahara:/tmp$ 

    Lines 658, 678, 1000, 1084 are potentially problematic -- I don't know of current compilers that will do something unexpected, but it might be worth rewriting the code to avoid undefined behavior anyway.

    @mrabarnett
    Copy link
    Mannequin

    mrabarnett mannequin commented Jan 23, 2013

    You're checking "int offset", but what happens with "unsigned int offset"?

    @NickolaiZeldovich
    Copy link
    Mannequin Author

    NickolaiZeldovich mannequin commented Jan 23, 2013

    For an unsigned int offset, see my original bug report: gcc eliminates the check altogether, since offset >= 0 by definition.

    @serhiy-storchaka
    Copy link
    Member

    Nickolai, are you want to update your patch with fixes for other possible pointer overflows? Note, that the maximal repetition number has been increased now.

    @serhiy-storchaka serhiy-storchaka self-assigned this Feb 16, 2013
    @NickolaiZeldovich
    Copy link
    Mannequin Author

    NickolaiZeldovich mannequin commented Mar 10, 2013

    Sorry for the delay. Attached is an updated patch that should fix all of the issues mentioned in this bug report.

    @serhiy-storchaka
    Copy link
    Member

    Nickolai, can you please submit a contributor form?

    http://python.org/psf/contrib/contrib-form/
    http://python.org/psf/contrib/

    @NickolaiZeldovich
    Copy link
    Mannequin Author

    NickolaiZeldovich mannequin commented Mar 11, 2013

    I just submitted the contributor form -- thanks for the reminder.

    @NickolaiZeldovich
    Copy link
    Mannequin Author

    NickolaiZeldovich mannequin commented Mar 11, 2013

    I get an HTTP error when trying to upload another patch through Rietveld, so here's a revised patch that avoids the need for Py_uintptr_t (thanks Serhiy).

    @serhiy-storchaka
    Copy link
    Member

    Of course it would be nice to have the tests for so much cases as possible, but I am afraid that it will not be easy. The patch LGTM.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Apr 13, 2013

    New changeset 27162465316f by Serhiy Storchaka in branch '2.7':
    Issue bpo-17016: Get rid of possible pointer wraparounds and integer overflows
    http://hg.python.org/cpython/rev/27162465316f

    New changeset 2673d207c524 by Serhiy Storchaka in branch '3.3':
    Issue bpo-17016: Get rid of possible pointer wraparounds and integer overflows
    http://hg.python.org/cpython/rev/2673d207c524

    New changeset f280786d0e64 by Serhiy Storchaka in branch 'default':
    Issue bpo-17016: Get rid of possible pointer wraparounds and integer overflows
    http://hg.python.org/cpython/rev/f280786d0e64

    @serhiy-storchaka serhiy-storchaka added the extension-modules C modules in the Modules dir label Apr 13, 2013
    @serhiy-storchaka
    Copy link
    Member

    Thank you, Nickolai, for the patch.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    extension-modules C modules in the Modules dir topic-regex type-security A security issue
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants