New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

disable reference clearing for _ParserSchemaValidationContext class #266

Closed
wants to merge 1 commit into
base: master
from

Conversation

Projects
None yet
2 participants
@juliengreard
Copy link

juliengreard commented May 30, 2018

I catchted a segfault using lxml 4.2.1:

#0  0x00007ffff49f02e2 in __pyx_f_4lxml_5etree_30_ParserSchemaValidationContext_disconnect (__pyx_v_self=0x0) at src/lxml/etree.c:206054
#1  0x00007ffff48e2bd9 in __pyx_pf_4lxml_5etree_14_ParserContext_2__dealloc__ (__pyx_v_self=0x7fffb8670a00) at src/lxml/etree.c:107567
#2  0x00007ffff48e2b22 in __pyx_pw_4lxml_5etree_14_ParserContext_3__dealloc__ (__pyx_v_self=<lxml.etree._ParserContext at remote 0x7fffb8670a00>) at src/lxml/etree.c:107482
#3  0x00007ffff4a073a9 in __pyx_tp_dealloc_4lxml_5etree__ParserContext (o=<lxml.etree._ParserContext at remote 0x7fffb8670a00>) at src/lxml/etree.c:218544
#4  0x000000000048dc5d in _Py_Dealloc (op=<lxml.etree._ParserContext at remote 0x7fffb8670a00>) at ../Objects/object.c:2262
#5  0x00007ffff4a07d51 in __pyx_tp_dealloc_4lxml_5etree__BaseParser (o=<lxml.etree.XMLParser at remote 0x7fffb87a73a0>) at src/lxml/etree.c:218676
#6  0x000000000048dc5d in _Py_Dealloc (op=<lxml.etree.XMLParser at remote 0x7fffb87a73a0>) at ../Objects/object.c:2262
#7  0x00007ffff49fdca0 in __pyx_tp_clear_4lxml_5etree__Document (o=<lxml.etree._Document at remote 0x7fffb8657418>) at src/lxml/etree.c:212358
#8  0x0000000000571e63 in delete_garbage (collectable=0x7fff9ffe3640, old=0x8c5a40 <generations+96>) at ../Modules/gcmodule.c:820
#9  0x00000000005723bb in collect (generation=2) at ../Modules/gcmodule.c:984

It occurs when calling the disconnect() method whose attributes don't exist anymore.

My modification prevent the clearing of those attributes (I believe).

According to the following comment in parser.pxi :

# If the parser was not closed correctly (e.g. interrupted iterparse()),
# and the schema validator wasn't freed and cleaned up yet, the libxml2 SAX
# validator plug might still be in place, which will make xmlFreeParserCtxt()
# crash when trying to xmlFree() a static SAX handler.
# Thus, make sure we disconnect the handler interceptor here at the latest.

It seems that the memory clearing of this object is already taken into account, so there won't be a memory leak.

FYI : this is my first pull request, please tell me if I did something wrong ;-) also I don't know much about Cython & LXML project , so my modification may be wrong

@scoder scoder added the fixed label Jun 3, 2018

@scoder scoder closed this in 69e5de4 Jun 3, 2018

scoder added a commit that referenced this pull request Jun 3, 2018

Work around Cython bug that calls the parent class tp_clear() functio…
…n even if the parent is declared as @no_gc_clear.

See #266.
@scoder

This comment has been minimized.

Copy link
Member

scoder commented Jun 3, 2018

Thanks for the patch. I took another deep look myself, and I think the decorator needs to go to the parser context class instead. Could you please check that my fix works for you? Just install Cython and build lxml from a source checkout (or master branch download).

@juliengreard

This comment has been minimized.

Copy link

juliengreard commented Jun 19, 2018

Hi. It seems that it's working.
Thanks ! Will the fix be integrated in the next release?

@scoder

This comment has been minimized.

Copy link
Member

scoder commented Jun 23, 2018

Released in lxml 4.2.2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment