Browse files

Clean all occurrences of _private on GC when libxml-ruby is loaded.

libxml-ruby uses the global libxml2 callback xmlDeregisterNodeDefault.

This callback looks in the _private field for every deleted libxml2
node. If the _private field is set, it is treated as a Ruby VALUE ptr.
The callback NULLs the Ruby object's data pointer and mark & free fn

When Nokogiri wraps a libxml2 document it puts a nokogiriTuple ptr
in the _private field. We can't let the libxml-ruby callback be
invoked on a libxml2 document before NULLing the document _private

When Nokogiri wraps other libxml2 nodes it puts VALUE ptrs in the
_private field. This makes the libxml-ruby callback generally safe for
these node types.

There is a caveat - it is unsafe to access VALUE ptrs during GC sweep.

This commit tests whether the xmlDeregisterNodeDefault callback is set
during document cleanup. If set, we traverse the document and clean up
all _private fields.

The previous solution was to unset xmlDeregisterNodeDefault callback.
However, this doesn't work in multithreaded environments.
  • Loading branch information...
1 parent a098ddf commit 45e51d74112e8639e004d99197189d95461829a3 @ender672 ender672 committed with flavorjones Apr 25, 2013
Showing with 27 additions and 4 deletions.
  1. +27 −4 ext/nokogiri/xml_document.c
@@ -17,24 +17,47 @@ static int dealloc_node_i(xmlNodePtr key, xmlNodePtr node, xmlDocPtr doc)
+static void remove_private(xmlNodePtr node)
+ xmlNodePtr child;
+ for (child = node->children; child; child = child->next)
+ remove_private(child);
+ if ((node->type == XML_ELEMENT_NODE ||
+ node->type == XML_XINCLUDE_START ||
+ node->type == XML_XINCLUDE_END) &&
+ node->properties) {
+ for (child = (xmlNodePtr)node->properties; child; child = child->next)
+ remove_private(child);
+ }
+ node->_private = NULL;
static void dealloc(xmlDocPtr doc)
- xmlDeregisterNodeFunc func;
st_table *node_hash;
- func = xmlDeregisterNodeDefault(NULL);
node_hash = DOC_UNLINKED_NODE_HASH(doc);
st_foreach(node_hash, dealloc_node_i, (st_data_t)doc);
- doc->_private = NULL;
+ /* When both Nokogiri and libxml-ruby are loaded, make sure that all nodes
+ * have their _private pointers cleared. This is to avoid libxml-ruby's
+ * xmlDeregisterNode callback from accessing VALUE pointers from ruby's GC
+ * free context, which can result in segfaults.
+ */
+ if (xmlDeregisterNodeDefaultValue)
+ remove_private((xmlNodePtr)doc);
- xmlDeregisterNodeDefault(func);

0 comments on commit 45e51d7

Please sign in to comment.