Downgrade Node, Context and Object document references to Weak #42

Merged
merged 1 commit into from
Sep 24, 2018

dginev (Member) commented Sep 23, 2018

CC @triptec, it would be good to get a review if you have a bit of time on your hands.

The story here is as follows. I hit memory leak problems in one of the projects I have using rust-libxml, where I have a corpus-level iterator that traverses multiple documents, and parses them into libxml as you .next() through the iterator.

Before the big Node refactor, that code worked very elegantly, in almost constant memory, and gracefully deallocated each libxml2 Document (and its sub-objects) as the iterator proceeded to the next one. After, I observed memory leaking, as (a portion of) the allocated memory for each document remained present for the full run.

I had some suspicions and indeed - it turned out that the Rc<> wrappers ended up impossible to deallocate in my setup due to having references to a document in multiple levels of a deep data structure. It was also extremely confusing to 1) localize and 2) understand the details of how this leakage occurred. It is both silent and hard to grasp, and it isn't helping that the particular project can't run under valgrind for separate and unrelated reasons.

So, anyhow, there is an obvious way to relax our design to avoid unneeded "memory hogging", which is to downgrade the Rc<> wrappers into Weak<> wrappers, which do not enforce ownership.
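To illustrate the general mechanism, here is a minimal sketch of how a strong back-reference can keep a document alive and how a Weak one avoids that. The types `StrongDoc`, `StrongNode`, `WeakDoc` and `WeakNode` are hypothetical stand-ins invented for this example; they are not the rust-libxml types, and the exact reference layout in the crate may differ.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical design A: nodes hold a strong Rc back to their document.
#[allow(dead_code)]
struct StrongDoc {
    nodes: RefCell<Vec<StrongNode>>,
}
#[allow(dead_code)]
struct StrongNode {
    doc: Rc<StrongDoc>,
}

// Hypothetical design B: nodes hold only a Weak back-reference.
#[allow(dead_code)]
struct WeakDoc {
    nodes: RefCell<Vec<WeakNode>>,
}
#[allow(dead_code)]
struct WeakNode {
    doc: Weak<WeakDoc>,
}

/// Returns true if the document is freed once the caller drops its handle.
fn strong_design_frees() -> bool {
    let doc = Rc::new(StrongDoc { nodes: RefCell::new(Vec::new()) });
    // The document stores a node that owns the document: a reference cycle.
    doc.nodes.borrow_mut().push(StrongNode { doc: Rc::clone(&doc) });
    let probe = Rc::downgrade(&doc);
    drop(doc);
    // The node's Rc still keeps the strong count above zero.
    probe.upgrade().is_none()
}

/// Same setup, but the node only holds a Weak reference.
fn weak_design_frees() -> bool {
    let doc = Rc::new(WeakDoc { nodes: RefCell::new(Vec::new()) });
    doc.nodes.borrow_mut().push(WeakNode { doc: Rc::downgrade(&doc) });
    let probe = Rc::downgrade(&doc);
    drop(doc);
    // No strong references remain, so the document is deallocated.
    probe.upgrade().is_none()
}
```

In this sketch `strong_design_frees()` reports that the document survives the drop, while `weak_design_frees()` reports that it is released, since a Weak reference does not contribute to the strong count.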

This PR takes a stab at that, and indeed I can report my project is back to constant memory use, and is leak-free. Things I am not fully happy about:

  • I really dislike using .unwrap(), especially since it will panic the entire process if the main document Rc is no longer in scope. Thinking about how to improve this without introducing endless result types...
  • The new requirement for authors would be that a document needs to remain in scope while its nodes are used, which is what all reasonable programs should do already. If a node is transferred to a different document, it should have unlinked set, and all Rust code should check that flag to guard against directly accessing the Weak reference.
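One possible shape for the two points above, as a sketch only: an accessor that checks the unlinked flag first and then calls `Weak::upgrade`, returning an `Option` instead of panicking. The `Document` and `Node` structs, the `document()` accessor and the `owner_name` helper are all hypothetical names made up for this example and do not reflect the actual rust-libxml API.

```rust
use std::rc::{Rc, Weak};

// Hypothetical stand-in for a parsed document.
struct Document {
    name: String,
}

// Hypothetical node holding a non-owning reference to its document.
struct Node {
    document: Weak<Document>,
    // Set when the node has been detached or moved to another document.
    unlinked: bool,
}

impl Node {
    /// Returns the owning document, or None if the node is unlinked
    /// or the document has already been dropped.
    fn document(&self) -> Option<Rc<Document>> {
        if self.unlinked {
            return None;
        }
        // upgrade() yields None instead of panicking when the Rc is gone.
        self.document.upgrade()
    }
}

/// Example caller that handles a missing document without unwrap().
fn owner_name(node: &Node) -> Option<String> {
    node.document().map(|doc| doc.name.clone())
}
```

The trade-off is the one already mentioned: callers now have to handle an `Option` at each access, which is exactly the kind of extra result plumbing the PR tries to keep to a minimum.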

That's about all... I am quite happy to have solved the memory leak, so I am quite confident we need a solution in this vein ...

triptec (Collaborator) commented Sep 23, 2018

Could you add a test that would exhibit the memory leak when run without those changes? I think your changes make sense and that you should merge.

dginev (Member, Author) commented Sep 24, 2018

I'll work on the tests in the coming weeks; it's "awkward" to write leakage tests... But I agree we should have them. I also want to release the version with the fix to make it a dependency "officially", so merging here - thanks for reviewing!
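For what it's worth, one way a leakage test might be phrased without valgrind is to count live objects through `Drop` and assert that the count stays flat while iterating. The sketch below is only an assumption about how such a test could look: `Doc`, `Doc::parse`, `LIVE` and `peak_live_docs` are hypothetical names, and no libxml2 memory is involved here.

```rust
use std::rc::Rc;
use std::sync::atomic::{AtomicUsize, Ordering};

// Number of hypothetical documents currently alive.
static LIVE: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for a parsed document.
struct Doc {
    id: usize,
}

impl Doc {
    /// Pretends to parse a document and records it as alive.
    fn parse(id: usize) -> Rc<Doc> {
        LIVE.fetch_add(1, Ordering::SeqCst);
        Rc::new(Doc { id })
    }
}

impl Drop for Doc {
    fn drop(&mut self) {
        LIVE.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Walks `count` documents lazily and returns the largest number
/// that were alive at the same time.
fn peak_live_docs(count: usize) -> usize {
    let mut peak = 0;
    for doc in (0..count).map(Doc::parse) {
        peak = peak.max(LIVE.load(Ordering::SeqCst));
        let _ = doc.id; // stand-in for real work on the document
    } // `doc` is dropped here, before the next one is parsed
    peak
}
```

If something held on to an extra strong reference per document, the peak would grow with `count` instead of staying at one, which is the symptom described in this PR.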

@dginev dginev merged commit bd4b120 into master Sep 24, 2018
@dginev dginev deleted the weak-document-references branch March 28, 2019 22:46