New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make libxml-ruby play nice with other consumers #118

Closed
wants to merge 2 commits into
base: master
from

Conversation

Projects
None yet
4 participants
@jacobbednarz

jacobbednarz commented May 18, 2016

libxml-ruby registers a global callback with libxml for node
deletions. If a node's _private field is not NULL, libxml-ruby
reaches through the pointer and clobbers a few fields. This interacts
badly with other consumers of the underlying libxml library, such as
Nokogiri. This fixes the behaviour by maintaining a hashtable of node
pointers which we own and not allowing writing through any others.

We have been running this in production for close to 48 hours and so far
it's stopped all occurrences of segmentation faults where previously we
would be receiving ~25 per hour. We've been monitoring for memory leaks
but have not yet seen anything.

Make libxml-ruby play nice with other consumers
`libxml-ruby` registers a global callback with libxml for node
deletions. If a node's `_private` field is not NULL, `libxml-ruby`
reaches through the pointer and clobbers a few fields. This interacts
badly with other consumers of the underlying libxml library, such as
Nokogiri.

Maintain a hashtable of node pointers which we own. Do not write through
any others.
@@ -45,13 +45,18 @@ static void rxml_node_deregisterNode(xmlNodePtr xnode)
/* Has the node been wrapped and exposed to Ruby? */
if (xnode->_private)
{
/* Does it belong to another libxml user? */
if (!OWNED(xnode))

This comment has been minimized.

@johnsyweb

johnsyweb May 18, 2016

If I'm reading this correctly, we could replace the condition above with if (OWNED(xnode)) and reduce the number of checks.

This comment has been minimized.

@jacobbednarz

jacobbednarz May 18, 2016

Readability wise, I prefer it returning early instead of wrapping the lot in a big if block. Makes it easier to follow IMO.

This comment has been minimized.

@johnsyweb

johnsyweb May 19, 2016

Agreed.

To clarify my thinking...

First I would refactor to a single-condition and remove a level of nesting:

if (xnode->_private && OWNED(xnode)) {
  // do stuff

Then I would refactor to return early to remove the block:

if (!xnode->_private || !OWNED(xnode))
  return;

But I think (!xnode->_private || !OWNED(xnode)) is a tautology (or at least !OWNED(xnode) || !xnode->_private is)... so I would remove the redundancy:

if (!OWNED(xnode))
  return;

Either way. Nice work on tracking this down and fixing the bug!

This comment has been minimized.

@cfis

cfis May 19, 2016

Member

Nice work!

So, I agree, isn't this line now incorrect and needs to be remove?

if (xnode->_private)

This comment has been minimized.

@jacobbednarz

jacobbednarz May 19, 2016

It's entirely possible. I was not 100% sure if we wanted to also be cleaning up regardless of the xnode->_private check being performed and solely rely on the OWNED(xnode) check. I was not sure what other conditions may be introduced if we removed this.

This comment has been minimized.

@cfis

cfis May 20, 2016

Member

Cool (learned something new SGTM).

@jacobbednarz - Any chance you can update the patch to not touch private. Also, any reason for the macros versus just a function call?

This comment has been minimized.

@jacobbednarz

jacobbednarz May 20, 2016

Will take a look today and see what I come up with. @abrasive wrote the original patch (I'm just pushing it upstream and maintaining it internally) so he's best to answer re: macros vs. function calls.

This comment has been minimized.

@abrasive

abrasive May 20, 2016

Contributor

They're macros because the nodes are three different C types (node/document/dtd), but they all have the _private field. Without the _private check, the check functions can take void* instead.

This comment has been minimized.

@cfis

cfis May 20, 2016

Member

Make sense...So @abrasive do you mind updating the patch? If it looks good, then I'll apply it and cut a new release.

Thanks again for solving this - its really great!

This comment has been minimized.

@abrasive

abrasive May 20, 2016

Contributor

Yep, I'll bring you a fresh patch in a few days. Thanks for getting back on this so promptly!

@cfis

This comment has been minimized.

Member

cfis commented May 19, 2016

Nice detective work. So nokogiri also sets the private field in newer versions? Is that the cause of the issue?

@jacobbednarz

This comment has been minimized.

jacobbednarz commented May 19, 2016

I feel that is the case since there are a few places which Nokogiri is setting node->_private to NULL which could be leading us down this path.

I've also just found this code comment and this commit which shows me Nokogiri tried to protect (?) against some of this behaviour.

@cfis

This comment has been minimized.

Member

cfis commented May 19, 2016

Ok, that makes sense.

Move NULL check into `rxml_owned_p`
Removes a level of nesting and takes the NULL check into `rxml_owned_p`
instead of leaving it inside of `rxml_node_deregisterNode`.
@jacobbednarz

This comment has been minimized.

jacobbednarz commented May 25, 2016

Closing in favour of refactor in #119 and the refactor in there.

@jacobbednarz jacobbednarz deleted the jacobbednarz:play-nice-with-other-consumers branch May 25, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment