Destructor of ElementTree.Element is recursive #73057

Closed
serhiy-storchaka opened this issue Dec 4, 2016 · 9 comments
Labels: extension-modules (C modules in the Modules dir), type-crash (A hard crash of the interpreter, possibly with a core dump)

Comments

@serhiy-storchaka
Member

BPO 28871
Nosy @scoder, @vstinner, @serhiy-storchaka
PRs
  • [Do Not Merge] Convert Misc/NEWS so that it is managed by towncrier #552
Files
  • etree-trashcan.patch
  • bug.py

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2017-01-13.07:31:38.829>
    created_at = <Date 2016-12-04.23:03:24.258>
    labels = ['extension-modules', 'type-crash']
    title = 'Destructor of ElementTree.Element is recursive'
    updated_at = <Date 2017-03-31.16:36:23.864>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2017-03-31.16:36:23.864>
    actor = 'dstufft'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2017-01-13.07:31:38.829>
    closer = 'serhiy.storchaka'
    components = ['Extension Modules']
    creation = <Date 2016-12-04.23:03:24.258>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['45900', '46058']
    hgrepos = []
    issue_num = 28871
    keywords = ['patch']
    message_count = 9.0
    messages = ['282376', '283211', '283739', '284145', '284146', '284155', '284203', '285365', '285367']
    nosy_count = 5.0
    nosy_names = ['scoder', 'vstinner', 'eli.bendersky', 'python-dev', 'serhiy.storchaka']
    pr_nums = ['552']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue28871'
    versions = ['Python 2.7']

    @serhiy-storchaka
    Member Author

    The Element class in the xml.etree.ElementTree module is a collection: it can contain other Elements. But unlike common Python collections (list, dict, etc.) and pure-Python classes, the C implementation of Element doesn't support unlimited nesting depth, because its destructor is recursive. As a result, destroying a very deep Element tree can cause a stack overflow. Example:

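    import xml.etree.cElementTree as ElementTree

    # Build a chain of 200000 nested elements; x keeps the root alive.
    y = x = ElementTree.Element('x')
    for i in range(200000):
        y = ElementTree.SubElement(y, 'x')

    # Dropping the last reference to the root triggers the recursive destructor.
    del x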
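To check whether a given interpreter is affected without taking down the current session, the reproduction above can be run in a child process. This is only a minimal sketch, assuming Python 3 (where the accelerated implementation is picked up through `xml.etree.ElementTree`); the names `REPRO` and `run_repro` are illustrative and not part of the original report.

```python
import subprocess
import sys

# The reproduction from the report, parameterised by nesting depth.
# Assumption: Python 3, so xml.etree.ElementTree is imported instead of cElementTree.
REPRO = """
import xml.etree.ElementTree as ElementTree
y = x = ElementTree.Element('x')
for i in range({depth}):
    y = ElementTree.SubElement(y, 'x')
del x
"""


def run_repro(depth):
    """Run the reproduction in a child interpreter and return its exit status."""
    proc = subprocess.run([sys.executable, "-c", REPRO.format(depth=depth)])
    return proc.returncode
```

Calling `run_repro(200000)` returns 0 if the child interpreter survives the deallocation. On POSIX systems a negative value means the child was terminated by a signal, which is what a stack overflow in the destructor would be expected to produce.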

    @serhiy-storchaka added the 3.7 (EOL), extension-modules, and type-crash labels on Dec 4, 2016
    @serhiy-storchaka
    Member Author

    The proposed patch fixes the crash.

    @python-dev
    Mannequin

    python-dev mannequin commented Dec 21, 2016

    New changeset 957091874ea0 by Serhiy Storchaka in branch '3.5':
    Issue bpo-28871: Fixed a crash when deallocate deep ElementTree.
    https://hg.python.org/cpython/rev/957091874ea0

    New changeset 78bf34b6a713 by Serhiy Storchaka in branch '2.7':
    Issue bpo-28871: Fixed a crash when deallocate deep ElementTree.
    https://hg.python.org/cpython/rev/78bf34b6a713

    @vstinner
    Member

    The bm_xml_etree.py benchmark started to crash on Python 2.7 because of changeset 78bf34b6a713.

    Python 2.7 @ 78bf34b6a713: bug.py crashes
    Python 2.7 @ 32cc37a89b58: no crash

    Full script: https://github.com/python/performance/blob/master/performance/benchmarks/bm_xml_etree.py

    @vstinner vstinner reopened this Dec 28, 2016
    @vstinner
    Member

    haypo@selma$ gdb -args ./python ~/bug.py
    (gdb) run
    python: Objects/object.c:2453: _PyTrash_thread_deposit_object: Assertion `PyObject_IS_GC(op)' failed.

    Program received signal SIGABRT, Aborted.

    (gdb) py-bt
    Traceback (most recent call first):
      File "/home/haypo/bug.py", line 130, in bench_parse
        root1 = etree.parse(xml_file).getroot()
      File "/home/haypo/bug.py", line 171, in bench_etree
        bench_func(etree, file_path, xml_data, xml_root)
      File "/home/haypo/bug.py", line 197, in <module>
        bench_etree(1, ET, bench_func)

    (gdb) where
    #0 0x00007ffff711892f in raise () from /lib64/libc.so.6
    #1 0x00007ffff711a52a in abort () from /lib64/libc.so.6
    #2 0x00007ffff7110e37 in __assert_fail_base () from /lib64/libc.so.6
    #3 0x00007ffff7110ee2 in __assert_fail () from /lib64/libc.so.6
    #4 0x0000000000463abe in _PyTrash_thread_deposit_object (op=<Element at remote 0x7fffec9152e0>) at Objects/object.c:2453
    #5 0x00007fffecd29b17 in element_dealloc (self=0x7fffec9152e0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:566
    #6 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec9152e0>) at Objects/object.c:2262
    #7 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec915280) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
    #8 0x00007fffecd29abc in element_dealloc (self=0x7fffec915280) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
    #9 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec915280>) at Objects/object.c:2262
    #10 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec915220) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
    #11 0x00007fffecd29abc in element_dealloc (self=0x7fffec915220) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
    #12 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec915220>) at Objects/object.c:2262
    #13 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec9151c0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
    #14 0x00007fffecd29abc in element_dealloc (self=0x7fffec9151c0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
    #15 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec9151c0>) at Objects/object.c:2262
    #16 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec915160) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
    #17 0x00007fffecd29abc in element_dealloc (self=0x7fffec915160) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
    #18 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec915160>) at Objects/object.c:2262
    #19 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec915100) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
    #20 0x00007fffecd29abc in element_dealloc (self=0x7fffec915100) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
    #21 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec915100>) at Objects/object.c:2262
    #22 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec9150a0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
    #23 0x00007fffecd29abc in element_dealloc (self=0x7fffec9150a0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
    #24 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec9150a0>) at Objects/object.c:2262
    #25 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec915040) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
    #26 0x00007fffecd29abc in element_dealloc (self=0x7fffec915040) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
    #27 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec915040>) at Objects/object.c:2262
    #28 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec90ffa0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
    #29 0x00007fffecd29abc in element_dealloc (self=0x7fffec90ffa0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
    #30 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec90ffa0>) at Objects/object.c:2262
    #31 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec90ff40) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
    #32 0x00007fffecd29abc in element_dealloc (self=0x7fffec90ff40) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
    #33 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec90ff40>) at Objects/object.c:2262
    #34 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec90fee0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
    #35 0x00007fffecd29abc in element_dealloc (self=0x7fffec90fee0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
    #36 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec90fee0>) at Objects/object.c:2262
    #37 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec90fe80) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
    #38 0x00007fffecd29abc in element_dealloc (self=0x7fffec90fe80) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
    #39 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec90fe80>) at Objects/object.c:2262
    #40 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec90fe20) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
    #41 0x00007fffecd29abc in element_dealloc (self=0x7fffec90fe20) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
    #42 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec90fe20>) at Objects/object.c:2262
    #43 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec90fdc0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
    #44 0x00007fffecd29abc in element_dealloc (self=0x7fffec90fdc0) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
    #45 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec90fdc0>) at Objects/object.c:2262
    #46 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec90fd60) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
    #47 0x00007fffecd29abc in element_dealloc (self=0x7fffec90fd60) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:561
    #48 0x000000000046346a in _Py_Dealloc (op=<Element at remote 0x7fffec90fd60>) at Objects/object.c:2262
    #49 0x00007fffecd29058 in element_dealloc_extra (self=0x7fffec90fd00) at /home/haypo/prog/python/2.7/Modules/_elementtree.c:301
    (...)

    @serhiy-storchaka
    Member Author

    Ah, I tested only with a non-debug build, in which asserts are disabled! In 2.7, Element doesn't support garbage collection, so the trashcan mechanism (Py_TRASHCAN_SAFE_BEGIN/Py_TRASHCAN_SAFE_END) can't be applied.
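For readers unfamiliar with the trashcan idea, the following toy model in pure Python may help. It is only a sketch of the general technique (stop recursing past a depth limit, park the object on a list, and resume from the outermost call), not CPython's actual implementation. `Node`, `ToyTrashcan`, and `make_chain` are hypothetical names, and the limit of 50 is chosen to echo CPython's `PyTrash_UNWIND_LEVEL`; treat the exact value as an assumption.

```python
class Node:
    """A container holding at most one child, enough to form a deep chain."""

    def __init__(self, child=None):
        self.child = child


class ToyTrashcan:
    """Toy model: bound the recursion depth of a nested teardown."""

    def __init__(self, limit=50):
        self.limit = limit      # assumed to mirror PyTrash_UNWIND_LEVEL
        self.deferred = []      # objects whose teardown was postponed
        self.freed = 0          # how many nodes were torn down
        self.max_depth = 0      # deepest recursion actually reached

    def destroy(self, node):
        """Outermost entry point: keeps draining the deferred list."""
        self.deferred.append(node)
        while self.deferred:
            self._dealloc(self.deferred.pop(), depth=1)

    def _dealloc(self, node, depth):
        if depth > self.limit:
            # Too deep: park the object instead of recursing further.
            self.deferred.append(node)
            return
        self.max_depth = max(self.max_depth, depth)
        child, node.child = node.child, None   # detach before "freeing"
        self.freed += 1
        if child is not None:
            self._dealloc(child, depth + 1)


def make_chain(length):
    """Build a chain of `length` nodes iteratively."""
    head = None
    for _ in range(length):
        head = Node(head)
    return head
```

With this model, `ToyTrashcan().destroy(make_chain(200000))` tears down every node while never recursing more than 50 levels. In the C version the deferred list is maintained by the interpreter's trashcan code, which asserts that the object supports garbage collection, and that is why the assertion in the backtrace above fails for the 2.7 Element type.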

    I see three alternatives:

    1. Just revert the changes and let deep ElementTree trees keep crashing.

    2. Add support for garbage collection. This would increase the size of an empty Element by 1.5 times. This looks less appropriate than the first option, since it harms working code.

    3. Try to implement a different mechanism, either by using an external list object as a stack or by using another field to build a linked list.
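The "external list as a stack" idea in option 3 can be illustrated at the Python level. The sketch below is a user-side workaround, not the C-level change the comment contemplates: it detaches children iteratively so that no element with a deep subtree is ever freed in one step. It assumes Python 3's `xml.etree.ElementTree`; `build_deep_tree` and `dismantle` are hypothetical helper names.

```python
import xml.etree.ElementTree as ElementTree


def build_deep_tree(depth):
    """Build a root element with a chain of `depth` nested descendants."""
    root = node = ElementTree.Element('x')
    for _ in range(depth):
        node = ElementTree.SubElement(node, 'x')
    return root


def dismantle(root):
    """Detach every element from its parent using an explicit stack.

    Returns the number of elements visited. Note that Element.clear()
    also resets attributes, text, and tail, so this is only suitable
    for a tree that is about to be discarded.
    """
    stack = [root]
    visited = 0
    while stack:
        node = stack.pop()
        stack.extend(list(node))  # the stack now keeps the children alive
        node.clear()              # drop the node's own references to them
        visited += 1
    return visited
```

After `dismantle(root)` returns, every element is childless, so `del root` frees a single object rather than a deep chain. The cost is an extra pass over the tree and a temporary list that, for a pathological chain, holds only one element at a time.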

    I'll revert the patch (except for the tests fix) and will try to implement a different mechanism.

    @serhiy-storchaka serhiy-storchaka removed the 3.7 (EOL) end of life label Dec 28, 2016
    @vstinner
    Member

    This issue seems theoretical to me, whereas the breakage of benchmarks is
    very concrete for me. So I suggest reverting the change in Python 2.7.

    (2) looks like the right design and it was implemented in Python 3 (no?).

    I don't think that it's worth backporting the change to Python 2. You
    are the first one to report the issue and the backport is risky.

    @serhiy-storchaka
    Member Author

    Changes were reverted by 78bf34b6a713.

    It is far from easy to implement an alternative mechanism (without increasing the size of Element objects). It would duplicate low-level garbage-collection code. I think it is better to leave everything as is in 2.7. Yet another argument for moving to Python 3.

    @vstinner
    Member

    Yet another argument for moving to Python 3.

    Yep, thanks ;-)

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022