ElementTree segmentation fault in expat_start_ns_handler #64014

Closed
YannDiorcet mannequin opened this issue Nov 27, 2013 · 9 comments
Labels
stdlib Python modules in the Lib dir type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@YannDiorcet
Mannequin

YannDiorcet mannequin commented Nov 27, 2013

BPO 19815
Nosy @freddrake, @scoder, @vstinner, @tiran, @vajrasky
Files
  • trace: Trace of the gdb backtrace
  • aa.tar.gz: An example
  • empty_uri.patch
  • fix_xml_etree_with_empty_namespace.patch: The fix is by Christian Heimes. The unit test is by Vajrasky Kok.
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2013-11-28.14:36:16.084>
    created_at = <Date 2013-11-27.17:26:48.699>
    labels = ['library', 'type-crash']
    title = 'ElementTree segmentation fault in expat_start_ns_handler'
    updated_at = <Date 2013-11-28.14:36:16.083>
    user = 'https://bugs.python.org/YannDiorcet'

    bugs.python.org fields:

    activity = <Date 2013-11-28.14:36:16.083>
    actor = 'eli.bendersky'
    assignee = 'none'
    closed = True
    closed_date = <Date 2013-11-28.14:36:16.084>
    closer = 'eli.bendersky'
    components = ['Library (Lib)']
    creation = <Date 2013-11-27.17:26:48.699>
    creator = 'Yann.Diorcet'
    dependencies = []
    files = ['32872', '32876', '32877', '32880']
    hgrepos = []
    issue_num = 19815
    keywords = ['patch']
    message_count = 9.0
    messages = ['204601', '204602', '204603', '204617', '204623', '204644', '204656', '204659', '204661']
    nosy_count = 8.0
    nosy_names = ['fdrake', 'scoder', 'vstinner', 'christian.heimes', 'eli.bendersky', 'python-dev', 'vajrasky', 'Yann.Diorcet']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue19815'
    versions = ['Python 2.7', 'Python 3.3', 'Python 3.4']

    @YannDiorcet
    Mannequin Author

    YannDiorcet mannequin commented Nov 27, 2013

    I ran into a bug in ElementTree on Python 2.7.5 (default, Nov 12 2013, 16:18:04).

    The bug seems to be here: http://hg.python.org/cpython/file/ab05e7dd2788/Modules/_elementtree.c#l2341

    uri is NULL and is not checked before being passed to strlen.

    This may be related to my expat version:
    expat.i686 2.1.0-5.fc19 @fedora
    expat.x86_64 2.1.0-5.fc19 @anaconda

    @YannDiorcet YannDiorcet mannequin added stdlib Python modules in the Lib dir type-crash A hard crash of the interpreter, possibly with a core dump labels Nov 27, 2013
    @vstinner
    Member

    Can you please provide the script wadl.py? If that's not possible, can you please write a short Python script that reproduces the crash?

    @tiran
    Member

    tiran commented Nov 27, 2013

    @vstinner
    Member

    Thanks, I'm able to reproduce the crash using aa.tar.gz.

    Python traceback on the crash:

    (gdb) py-bt
    Traceback (most recent call first):
      File "/home/haypo/prog/python/default/Lib/xml/etree/ElementTree.py", line 1235, in feed
        self._parser.feed(data)
      File "/home/haypo/prog/python/default/Lib/xml/etree/ElementTree.py", line 1304, in __next__
        self._parser.feed(data)
      File "abcd.py", line 18, in _from_stream
        for event, elem in ET.iterparse(stream, events):
      File "abcd.py", line 30, in <module>
        _from_stream("aa.wadl")

    C traceback in gdb:

    (gdb) where
    #0 0x00007ffff7258491 in __strlen_sse2_pminub () from /lib64/libc.so.6

    #1 0x00007ffff06124d8 in expat_start_ns_handler (self=0x7ffff0ad6d68, prefix=0x0, uri=0x0) at /home/haypo/prog/python/default/Modules/_elementtree.c:3041

    #2 0x00007ffff03d7fc7 in addBinding (parser=0xa8eea0, prefix=0x7ffff7f31c28, attId=0x7ffff0bbc190, uri=0xaa3720 "", bindingsPtr=0x7fffffff6b58) at /home/haypo/prog/python/default/Modules/expat/xmlparse.c:3158

    #3 0x00007ffff03d6de0 in storeAtts (parser=0xa8eea0, enc=0x7ffff06011e0 <utf8_encoding_ns>, attStr=0xaa4170 "<ns2:representation xmlns:ns2=\"http://wadl.dev.java.net/2009/02\" xmlns=\"\" element=\"org\" mediaType=\"application/xml\"/>\n", ' ' <repeats 24 times>, "<ns2:representation xmlns:ns2=\"http://wadl.dev.java.net/20"..., tagNamePtr=0x7fffffff6b10, bindingsPtr=0x7fffffff6b58) at /home/haypo/prog/python/default/Modules/expat/xmlparse.c:2820

    #4 0x00007ffff03d5b3f in doContent (parser=0xa8eea0, startTagLevel=0, enc=0x7ffff06011e0 <utf8_encoding_ns>, s=0xaa4170 "<ns2:representation xmlns:ns2=\"http://wadl.dev.java.net/2009/02\" xmlns=\"\" element=\"org\" mediaType=\"application/xml\"/>\n", ' ' <repeats 24 times>, "<ns2:representation xmlns:ns2=\"http://wadl.dev.java.net/20"..., end=0xaa42fa '\313' <repeats 199 times>, <incomplete sequence \313>..., nextPtr=0xa8eed0, haveMore=1 '\001') at /home/haypo/prog/python/default/Modules/expat/xmlparse.c:2464

    #5 0x00007ffff03d4b7e in contentProcessor (parser=0xa8eea0, start=0xaa3fd8 "<application xmlns=\"http://wadl.dev.java.net/2009/02\">\n <grammars>\n <include href=\"application.wadl/xsd0.xsd\">\n", ' ' <repeats 12 times>, "<doc title=\"Generated\" xml:lang=\"en\"/>\n </include>\n </gra"..., end=0xaa42fa '\313' <repeats 199 times>, <incomplete sequence \313>..., endPtr=0xa8eed0) at /home/haypo/prog/python/default/Modules/expat/xmlparse.c:2105

    #6 0x00007ffff03d9d54 in doProlog (parser=0xa8eea0, enc=0x7ffff06011e0 <utf8_encoding_ns>, s=0xaa3fd8 "<application xmlns=\"http://wadl.dev.java.net/2009/02\">\n <grammars>\n <include href=\"application.wadl/xsd0.xsd\">\n", ' ' <repeats 12 times>, "<doc title=\"Generated\" xml:lang=\"en\"/>\n </include>\n </gra"..., end=0xaa42fa '\313' <repeats 199 times>, <incomplete sequence \313>..., tok=29, next=0xaa3fd8 "<application xmlns=\"http://wadl.dev.java.net/2009/02\">\n <grammars>\n <include href=\"application.wadl/xsd0.xsd\">\n", ' ' <repeats 12 times>, "<doc title=\"Generated\" xml:lang=\"en\"/>\n </include>\n </gra"..., nextPtr=0xa8eed0, haveMore=1 '\001') at /home/haypo/prog/python/default/Modules/expat/xmlparse.c:4016

    #7 0x00007ffff03d9213 in prologProcessor (parser=0xa8eea0, s=0xaa3fa0 "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n<application xmlns=\"http://wadl.dev.java.net/2009/02\">\n <grammars>\n <include href=\"application.wadl/xsd0.xsd\">\n", ' ' <repeats 12 times>, "<doc title="..., end=0xaa42fa '\313' <repeats 199 times>, <incomplete sequence \313>..., nextPtr=0xa8eed0) at /home/haypo/prog/python/default/Modules/expat/xmlparse.c:3739

    #8 0x00007ffff03d8cdf in prologInitProcessor (parser=0xa8eea0, s=0xaa3fa0 "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n<application xmlns=\"http://wadl.dev.java.net/2009/02\">\n <grammars>\n <include href=\"application.wadl/xsd0.xsd\">\n", ' ' <repeats 12 times>, "<doc title="..., end=0xaa42fa '\313' <repeats 199 times>, <incomplete sequence \313>..., nextPtr=0xa8eed0) at /home/haypo/prog/python/default/Modules/expat/xmlparse.c:3556

    #9 0x00007ffff03d3e6a in XML_ParseBuffer (parser=0xa8eea0, len=858, isFinal=0) at /home/haypo/prog/python/default/Modules/expat/xmlparse.c:1651

    #10 0x00007ffff03d3d30 in XML_Parse (parser=0xa8eea0, s=0xaa3330 "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n<application xmlns=\"http://wadl.dev.java.net/2009/02\">\n <grammars>\n <include href=\"application.wadl/xsd0.xsd\">\n", ' ' <repeats 12 times>, "<doc title="..., len=858, isFinal=0) at /home/haypo/prog/python/default/Modules/expat/xmlparse.c:1617

    #11 0x00007ffff0614356 in expat_parse (self=0x7ffff0ad6d68, data=0xaa3330 "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n<application xmlns=\"http://wadl.dev.java.net/2009/02\">\n <grammars>\n <include href=\"application.wadl/xsd0.xsd\">\n", ' ' <repeats 12 times>, "<doc title="..., data_len=858, final=0) at /home/haypo/prog/python/default/Modules/_elementtree.c:3351

    #12 0x00007ffff061470c in xmlparser_feed (self=0x7ffff0ad6d68, arg=b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n<application xmlns="http://wadl.dev.java.net/2009/02">\n <grammars>\n <include href="application.wadl/xsd0.xsd">\n <doc title="Generated" xml:lang="en"/>\n </include>\n </grammars>\n <resources base="sdfdsfdsf">\n <resource>\n <resource path="dfdfsddsf">\n <method id="usersdfsfdsdf" name="PUT">\n <request>\n <ns2:representation xmlns:ns2="http://wadl.dev.java.net/2009/02" xmlns="" element="org" mediaType="application/xml"/>\n <ns2:representation xmlns:ns2="http://wadl.dev.java.net/2009/02" xmlns="" element="org" mediaType="application/json"/>\n </request>\n </method>\n </resource>\n </resource>\n </resources>\n</application>\n') at /home/haypo/prog/python/default/Modules/_elementtree.c:3423

    #13 0x00000000005ad4fe in call_function (pp_stack=0x7fffffff7128, oparg=1) at Python/ceval.c:4212

    #14 0x00000000005a5d29 in PyEval_EvalFrameEx (f=Frame 0xa38c18, for file /home/haypo/prog/python/default/Lib/xml/etree/ElementTree.py, line 1235, in feed (self=<XMLPullParser(_events_queue=[('start-ns', ('', 'http://wadl.dev.java.net/2009/02')), ('start', <xml.etree.ElementTree.Element at remote 0x7ffff0bb3858>), ('start', <xml.etree.ElementTree.Element at remote 0x7ffff088b458>), ('start', <xml.etree.ElementTree.Element at remote 0x7ffff081bb58>), ('start', <xml.etree.ElementTree.Element at remote 0x7ffff081bc58>), ('start', <xml.etree.ElementTree.Element at remote 0x7ffff081bd58>), ('start', <xml.etree.ElementTree.Element at remote 0x7ffff081be58>), ('start', <xml.etree.ElementTree.Element at remote 0x7ffff081bed8>), ('start', <xml.etree.ElementTree.Element at remote 0x7ffff081bf58>), ('start', <xml.etree.ElementTree.Element at remote 0x7ffff08200d8>), ('start-ns', ('ns2', 'http://wadl.dev.java.net/2009/02'))], _parser=<xml.etree.ElementTree.XMLParser at remote 0x7ffff0ad6d68>, _index=0) at remote 0x7ffff0bce468>, data=b'<?xml version="1.0" encoding="UTF...(truncated), throwflag=0) at Python/ceval.c:2826
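
Frame #2 shows expat's addBinding holding uri="" for the xmlns="" declaration, while frame #1 shows expat_start_ns_handler being called with prefix=0x0 and uri=0x0. A minimal sketch of a reproducer follows, assuming a trimmed version of the document visible in the frames above is enough to reach the same handler. The names WADL_SNIPPET and namespace_events are made up for illustration; this is not the reporter's wadl.py or the attached aa.tar.gz.

```python
import io
import xml.etree.ElementTree as ET

# Trimmed from the document visible in the gdb frames; the nested element
# resets the default namespace with xmlns="" (hypothetical reduction).
WADL_SNIPPET = (
    '<application xmlns="http://wadl.dev.java.net/2009/02">'
    '<ns2:representation xmlns:ns2="http://wadl.dev.java.net/2009/02" '
    'xmlns="" element="org" mediaType="application/xml"/>'
    '</application>'
)


def namespace_events(xml_text):
    """Return the (event, payload) pairs iterparse reports for namespaces."""
    stream = io.BytesIO(xml_text.encode("utf-8"))
    # Request both namespace events, the combination discussed in this issue.
    return list(ET.iterparse(stream, events=("start-ns", "end-ns")))


if __name__ == "__main__":
    for event, payload in namespace_events(WADL_SNIPPET):
        print(event, payload)
```

On an interpreter without the fix, this input is expected to follow the crashing path shown in the backtrace; on a fixed interpreter it should simply list the namespace events, including one for the empty default namespace.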

    @tiran
    Member

    tiran commented Nov 27, 2013

    The patch removes the cause of the segfault, but I'm not sure if that's the right way. I'm adding Eli and Stefan to the ticket.

    @vajrasky
    Mannequin

    vajrasky mannequin commented Nov 28, 2013

    Here is the patch (by Christian Heimes) with unit test (by me).

    Apparently the namespace handlers (start-ns and end-ns) have a problem with an empty namespace. But both events (start-ns and end-ns) must be requested together to trigger the problem; requesting the start-ns event alone does not trigger it.
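
The observation about event combinations can be checked with a small script. This is a sketch only: MINIMAL and seen_actions are illustrative names, and the input is a reduced document with an empty default namespace, not necessarily the exact case used in the attached patch.

```python
import io
import xml.etree.ElementTree as ET

# Smallest document with an empty default-namespace declaration (assumed
# to be sufficient to exercise the namespace handlers).
MINIMAL = b"<root xmlns=''/>"


def seen_actions(xml_bytes, events):
    """Collect only the event names iterparse yields for a given selection."""
    return [action for action, _ in ET.iterparse(io.BytesIO(xml_bytes), events)]


if __name__ == "__main__":
    # Try each selection separately, then both namespace events together.
    for events in (("start-ns",), ("end-ns",), ("start-ns", "end-ns")):
        print(events, seen_actions(MINIMAL, events))
```

On a fixed interpreter all three selections should complete normally. Per the comment above, only the combined selection is expected to be a problem on an unpatched build.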

    @python-dev
    Mannequin

    python-dev mannequin commented Nov 28, 2013

    New changeset 395a266bcb5a by Eli Bendersky in branch '2.7':
    Issue bpo-19815: Fix segfault when parsing empty namespace declaration.
    http://hg.python.org/cpython/rev/395a266bcb5a

    @python-dev
    Mannequin

    python-dev mannequin commented Nov 28, 2013

    New changeset 68f1e5262a7a by Eli Bendersky in branch '3.3':
    Issue bpo-19815: Fix segfault when parsing empty namespace declaration.
    http://hg.python.org/cpython/rev/68f1e5262a7a

    New changeset 2b2925c08a6c by Eli Bendersky in branch 'default':
    Issue bpo-19815: Fix segfault when parsing empty namespace declaration.
    http://hg.python.org/cpython/rev/2b2925c08a6c

    @elibendersky
    Mannequin

    elibendersky mannequin commented Nov 28, 2013

    Thanks for the report & patches. Fixed in all active branches.

    @elibendersky elibendersky mannequin closed this as completed Nov 28, 2013
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022