add adoptExternalDocument to public header #240
Would it be acceptable for you to have a combined C-API function instead that wraps both Proposal in search of a better name: That function would then also check that the tree doesn't contain any non-NULL |
Speaking of which, please take a look at the newly added See this ticket for why it was added: The author refers to the Gumbo parser as well, but I can't say whether he ended up using it. Sorry, I had completely forgotten about that part. |
@scoder ah, hadn't seen that PR. I think I'd prefer to access it directly from C/Cython rather than rope in PyCapsules. Would something like this work on your end?
cdef public api _ElementTree adoptExternalDocument(xmlDoc* c_doc, parser, bint is_owned):
if c_doc is NULL:
raise TypeError
doc = _adoptForeignDoc(c_doc, parser, is_owned)
return _elementTreeFactory(doc, None) |
Yes, perfect. Thanks! |
That's great! Confirmed that it's working in https://github.com/thatdatabaseguy/gumbo_lxml. Would you be able to cut a new version of lxml that I can require? |
Haven't got a release date yet, but I was planning to release 3.8.0 pretty soon, yes. |
Sounds good, will keep an eye out for it. |
Allows the creation of LxmlDocument structs used throughout lxml.etree from a raw libxml xmlDoc pointer in a C/Cython extension.
Example usage:
My current use case is the one mentioned in #239 (closed in favor of this PR, since it is simpler to move that extension to a separate project): I would like to use an external C HTML parser (gumbo-parser), build a libxml tree from its output (gumbo-libxml), and then run XPath queries, the cleaner, etc. on that tree using lxml.
More generally, this could open the door to using other C parsers with lxml.