Reuse existing fields to speed up default case in DOMHandle. #1053
|
Thanks for isolating a potential hotspot. One concern -- as I understand it, DOM implementations are not required to be thread safe. I'm wondering whether it might be best to wrap the DocumentBuilder and LSParser in a ThreadLocal. One footnote: unless it's necessary to buffer an in-memory document, other XML representations (such as StAX or a JAXP StreamSource) might be more efficient than DOM. By the way, have we asked for a contributor agreement? |
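A minimal sketch of the ThreadLocal idea raised above, assuming a default, namespace-aware `DocumentBuilderFactory`. The class `ThreadLocalDomBuilders` and its methods are hypothetical names for illustration and are not part of the client API.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper for illustration; not a class from this pull request.
final class ThreadLocalDomBuilders {
    // One DocumentBuilder per thread, created lazily on that thread's first call.
    private static final ThreadLocal<DocumentBuilder> BUILDER =
            ThreadLocal.withInitial(ThreadLocalDomBuilders::newBuilder);

    private ThreadLocalDomBuilders() {}

    private static DocumentBuilder newBuilder() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // assumption: callers want namespace-aware parsing
            return factory.newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Cannot create DocumentBuilder", e);
        }
    }

    // Returns the calling thread's builder; never share the result across threads.
    static DocumentBuilder get() {
        return BUILDER.get();
    }

    static Document parse(String xml) {
        try {
            return get().parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (SAXException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Each thread pays the builder construction cost once. In a pooled environment the builder lives as long as its thread, which is the usual trade-off of this approach.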
|
Maybe #477 may be leveraged here. This introduced an Potentially, you may add similar suppliers for the DOM factories as well. |
|
I agree, a ThreadLocal definitely makes sense. I will look into this over the next couple of days. |
|
I added a per thread caching factory for |
Creating a new DocumentBuilder and LSParser each time DOMHandle#receiveContent is called is very time-consuming. Fix this by using a per-thread cached DocumentBuilder in DOMHandle. The LSParser is created the first time receiveContent is called and then reused. Extracted CachedInstancePerThreadSupplier so that it can be reused in DocumentBuilderFactories. Fixes #1054. Signed-off-by: Wagner Michael <maffelbaffel@posteo.de>
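A per-thread caching supplier of the kind this commit describes might look roughly like the sketch below. It is not the code from the pull request: the class name `PerThreadCachingSupplier` is hypothetical, and the use of a `SoftReference` is an assumption based on the later review comment about "thread local with soft references".

```java
import java.lang.ref.SoftReference;
import java.util.Objects;
import java.util.function.Supplier;

// Hypothetical sketch; the CachedInstancePerThreadSupplier in the pull request may differ.
final class PerThreadCachingSupplier<T> implements Supplier<T> {
    private final Supplier<T> delegate;
    // A SoftReference lets the garbage collector reclaim an idle instance under memory pressure.
    private final ThreadLocal<SoftReference<T>> cache = new ThreadLocal<>();

    PerThreadCachingSupplier(Supplier<T> delegate) {
        this.delegate = Objects.requireNonNull(delegate, "delegate");
    }

    @Override
    public T get() {
        SoftReference<T> ref = cache.get();
        T instance = (ref == null) ? null : ref.get();
        if (instance == null) {
            // First call on this thread, or the cached instance was collected: create a new one.
            instance = Objects.requireNonNull(delegate.get(), "delegate returned null");
            cache.set(new SoftReference<>(instance));
        }
        return instance;
    }
}
```

The delegate would typically wrap an expensive factory call, so each thread invokes it once and later calls return the cached instance for as long as it stays reachable.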
ehennum
left a comment
This change looks good to me -- especially refactoring out CachedInstancePerThreadSupplier as a generally reusable class.
My preference would be to put the DOM factories in the same class as the other XML factories for easier comparison and maintenance in parallel, but that's just a preference.
Regarding whether LSParser should be thread local, I would think that's only necessary if constructing the LSParser was also a performance hotspot in your testing, making caching advisable. Otherwise, caching the DocumentBuilder would be enough.
If I read the history correctly, @robsman did the initial work on the thread local with soft references, so it would be good to get his review as well if he has time.
Finally, thanks for identifying and solving this problem -- we'll all get the benefit.
Caching it would also speed up queries which do not reuse a |
|
Alright, I added another commit which adds threadlocal caching for There now is a new Also, I merged the Factories into This second commit again speeds up various queries in our codebase where we did not re-use a DOMHandle. Hope that makes sense 👍 |
* Moved DocumentBuilderFactories to existing XmlFactories.
* Created class CachedInstancePerThreadFunction, similar to CachedInstancePerThreadSupplier.
* Re-using thread-local cached instances of LSParser in DOMHandle, provided by XmlFactories.
This results in faster receiveContent calls for code that does not re-use a DOMHandle but creates its own instance in each API call.
Fixes #1054.
Signed-off-by: Wagner Michael <maffelbaffel@posteo.de>
|
@maffelbaffel , thanks for the thoughtfulness invested in this pull request. A belated question occurred to me: did your performance testing indicate how much of the DOM initialization cost is consumed in the creation of the DOMImplementationLS object? In other words, if we had a threadsafe cache for the DOMImplementationLS object, would that improve the efficiency of both reading and writing of Document objects without requiring a separate cache for the LSParser? |
|
By DOM initialization cost, do you mean this line? The most time-consuming calls by far were creating a new DocumentBuilder ( |
|
I added another patched.zip which shows the current performance flamegraph (just drop the file in your browser to navigate through the flamegraph). I did not yet try writing with a DOMHandle, but we might also have the possibility to improve by caching Should I check that too? |
|
I should have been more clear -- I was wondering whether it would make sense to cache DOMImplementationLS instead of DocumentBuilder because both the LSParser and LSSerializer are created from the DOMImplementationLS object. I would expect the LSSerializer is pretty lightweight. Anyway, I'll compare and contrast the two flamegraphs. |
|
In comparing the first and second flame graphs, I didn't see a big difference in the overall profile. However, I'm not a performance engineering specialist. As usual, the engineering concern would be the balance between optimization and maintainability. If the LSParser caching doesn't make a big difference, maybe it would be better to cache one object. If we do cache one object, maybe that object should be the DOMImplementationLS. By the way, are you using the search summary provided by SearchHandle or simply using the query to retrieve documents? If the latter, DocumentManager.search() is likely to be faster because it transports documents as parts in a multipart instead of extracting them from an XML payload. +1 to your earlier observation about the DOM having a reputation for being slow. If you don't need to keep the document in memory, SAX or StAX might be faster if you need to extract values, and Reader or InputStream faster if you just need to send the bits somewhere. |
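To illustrate the StAX suggestion, here is a minimal sketch that pulls the text of named elements out of a document without building a DOM tree. The class `StaxExtractExample` and the sample element names are hypothetical, and the sketch assumes the target elements contain text only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class; not part of the client API.
final class StaxExtractExample {
    private StaxExtractExample() {}

    // Collects the text content of every element with the given local name.
    static List<String> textOf(String xml, String elementName) {
        XMLInputFactory factory = XMLInputFactory.newFactory();
        // Defensive defaults: no DTD processing and no external entities.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);

        List<String> values = new ArrayList<>();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && elementName.equals(reader.getLocalName())) {
                        // getElementText() throws if the element has child elements.
                        values.add(reader.getElementText());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return values;
    }
}
```

Because the reader streams through the input, memory use stays roughly constant regardless of document size, which is the main advantage over DOM when only a few values are needed.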
|
Yes, I can understand that the second commit in particular adds complexity for only a small performance improvement. I definitely want to get this right in a way both of us are happy with 👍 I wonder, because |
|
I think I see what you're saying. The DOM implementation is strange in that the application pays for initializing the DocumentBuilder object (which is itself unsafe) even though the DOMImplementationLS object must be threadsafe. So, if we get the DOMImplementationLS on first use and cache it in a static, we should be okay (and only pay the initialization cost of the DocumentBuilder once even though we don't cache it). While that depends on the current JDK implementation, the DOM implementation seems very stable. If it is ever revised, perhaps the revisers will take thread safety into consideration. Okay, I'm persuaded. |
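The approach agreed on here could be sketched as follows: obtain the DOMImplementationLS once, keep it in a static, and create a fresh LSParser per call. The class `SharedDomImplementation` is a hypothetical name, and the sketch relies on the same assumption made in the discussion, namely that the JDK's DOMImplementationLS object can be shared safely across threads.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;

// Hypothetical sketch; not the code that was proposed in this pull request.
final class SharedDomImplementation {
    private SharedDomImplementation() {}

    // Initialization-on-demand holder: the JVM runs create() once, on first use.
    private static final class Holder {
        static final DOMImplementationLS INSTANCE = create();
    }

    private static DOMImplementationLS create() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // The DocumentBuilder is only needed to reach the implementation, then discarded.
            DOMImplementation impl = factory.newDocumentBuilder().getDOMImplementation();
            Object ls = impl.getFeature("LS", "3.0");
            if (!(ls instanceof DOMImplementationLS)) {
                throw new IllegalStateException("DOM Load and Save 3.0 is not supported");
            }
            return (DOMImplementationLS) ls;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Cannot create DocumentBuilder", e);
        }
    }

    static Document parse(String xml) {
        DOMImplementationLS ls = Holder.INSTANCE;
        // Assumption: sharing the implementation is safe; the parser itself is created per call.
        LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        LSInput input = ls.createLSInput();
        input.setCharacterStream(new StringReader(xml));
        return parser.parse(input);
    }
}
```

This keeps a single cached object and avoids any thread-local bookkeeping, at the cost of depending on the current JDK behavior noted in the comment above.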
|
@maffelbaffel Apologies for what has obviously been a long delay in any response. Is this PR still of value to you? And if so, do you happen to have any test for verifying the performance improvement? Or could that simply be done by e.g. reading 1k rows via the old DOMHandle and via the DOMHandle in your PR? |
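A rough way to check the effect outside the client library is to time repeated parses with a fresh DocumentBuilder per document against a single reused builder. The sketch below is only a coarse timing loop under stated assumptions: the class `BuilderReuseTiming` and the sample row are hypothetical, it does not exercise DOMHandle itself, and a proper benchmark harness would give more reliable numbers.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical timing sketch; results vary with JVM warm-up and hardware.
final class BuilderReuseTiming {
    private static final byte[] XML =
            "<row><id>1</id><name>example</name></row>".getBytes(StandardCharsets.UTF_8);

    private BuilderReuseTiming() {}

    // Creates a new DocumentBuilder for every document; returns elapsed nanoseconds.
    static long timeFreshBuilders(int iterations) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                factory.newDocumentBuilder().parse(new ByteArrayInputStream(XML));
            }
            return System.nanoTime() - start;
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }

    // Reuses one DocumentBuilder for all documents; returns elapsed nanoseconds.
    static long timeReusedBuilder(int iterations) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                builder.parse(new ByteArrayInputStream(XML));
            }
            return System.nanoTime() - start;
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }

    public static void main(String[] args) {
        int iterations = 1_000; // mirrors the "1k rows" suggested above
        timeFreshBuilders(iterations); // warm-up pass
        timeReusedBuilder(iterations); // warm-up pass
        System.out.println("fresh builders (ns): " + timeFreshBuilders(iterations));
        System.out.println("reused builder (ns): " + timeReusedBuilder(iterations));
    }
}
```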
|
Closing this, as the source repository is no longer accessible. Will revisit if we hear of performance issues with DOMHandle in the future. |
DocumentBuilderFactory, DocumentBuilder and LSParser. These will be used if no other factory or resolver is specified.
While doing some performance tests in our codebase, I noticed that DOMHandle#receiveContent takes up about 50% of a request to MarkLogic. In particular, calling createLSParser and newDocumentBuilder took most of the time. This PR tries to reuse static fields of DocumentBuilderFactory, DocumentBuilder and LSParser to speed things up.
patched.zip