lxml instead of libxml2? #10

Open
jiapei100 opened this issue Aug 6, 2016 · 5 comments

Comments

@jiapei100

jiapei100 commented Aug 6, 2016

Hi:

It looks like Python is now moving toward lxml instead of libxml2. So I wonder whether itstool could also switch from libxml2 to lxml?

Cheers

@shaunix
Contributor

shaunix commented Aug 6, 2016

Doubtful. libxml2 gives a pretty unprecedented level of access to the XML, and itstool uses this extensively, because what it's doing to XML is non-trivial. IIRC, I even had to land new features in the libxml2 Python bindings to do everything itstool does. Feel free to take a crack at it, though.

@jiapei100
Author

@shaunix Oh, Hi, shaunix...
I noticed your pull request at https://github.com/itstool/itstool/pull/3
Great topic...
It will take some time to shift from libxml2 to lxml, though. Let me try...

@heftig

heftig commented Nov 13, 2017

libxml2's Python bindings are insane because they require the program to manually free the document memory.

I recommend migrating away from either libxml2 or Python posthaste.

@concatime

@shaunix So neither xml nor lxml can replace libxml2 for itstool’s use case?

@shaunix
Contributor

shaunix commented May 13, 2021

I don't really know. An important consideration is XPath support. I did recently switch yelp-tools to lxml, using some XPath:

https://gitlab.gnome.org/GNOME/yelp-tools/-/blob/master/tools/yelp-check.in

I remember running into some XPath problems, but I don't remember the details. That code is just using XPath to extract some info, and isn't doing the kind of document rewriting that itstool does. The way lxml handles text nodes doesn't really line up with the XPath data model, so that could be an issue.

It might be doable. I'd certainly welcome not having to do manual memory management in Python anymore (ugh). But it would take me a considerable amount of development time to figure it out, and I just don't have bandwidth for that right now. If someone else wants to take a crack at it, I'll try very hard to prioritize reviewing their work.
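
For readers unfamiliar with the text-node mismatch mentioned above, here is a minimal sketch. It assumes only that lxml follows the ElementTree convention of storing text in `.text` and `.tail` attributes, so it uses the standard library's `xml.etree.ElementTree` as a stand-in for lxml. The sample markup and the `replace_tail` helper are hypothetical and are not taken from itstool or yelp-tools.

```python
import xml.etree.ElementTree as ET

# Mixed content. In the XPath data model, <p> has three child nodes:
# the text "Click ", the element <b>, and the text " to continue."
p = ET.fromstring("<p>Click <b>OK</b> to continue.</p>")
b = p[0]

# In the ElementTree model there are no text-node objects.
# Leading text is stored on the parent; text after a child element
# is stored on that child as its "tail".
print(repr(p.text))  # text before <b>
print(repr(b.text))  # text inside <b>
print(repr(b.tail))  # text after </b>, still inside <p>
print(len(p))        # only element children are counted


def replace_tail(element, new_text):
    """Hypothetical helper: rewrite the text that follows `element`.

    A rewriting tool has to know which element "owns" a given run of
    text, because the text cannot be addressed as a node of its own.
    """
    element.tail = new_text


replace_tail(b, " to proceed.")
result = ET.tostring(p, encoding="unicode")
print(result)
```

Code that rewrites text in place, as itstool does, would need this kind of bookkeeping for every run of text that follows an inline element, which may be part of why a port is not a drop-in change.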
