Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove Monkey Patch #5

Merged
merged 2 commits into from Jun 7, 2013
Merged

Remove Monkey Patch #5

merged 2 commits into from Jun 7, 2013

Conversation

brianglass
Copy link
Contributor

I have removed the monkey patches for _xpath_in_bbox and xpath_overlaps_bbox. It seems that they were not actually being used anywhere and were causing conflicts with lxml >= 3.

If I'm missing something here please let me know.

@jcushman
Copy link
Owner

Hmm hmm, thanks for passing this on. Yeah, those actually are used -- they're automagically turned into custom selectors for doing spatial searches:

https://github.com/jcushman/pdfquery/#custom-selectors

But it looks like lxml 3 switched to using the cssselect library under the hood, and now custom selectors are implemented in a different way (and hopefully a way that involves less monkeying around). Pyquery adds its custom selectors in JQueryTranslator here:

https://github.com/gawel/pyquery/blob/master/pyquery/cssselectpatch.py

So in_bbox and overlaps_bbox should be re-implemented the way those custom selectors are -- and then we have to get PyQuery using them. Looking at the pyquery source, I think the plan there is to subclass JQueryTranslator to PDFQueryTranslator with our custom selectors, and then instantiate our pyquery as PyQuery(css_translator=PDFQueryTranslator(xhtml=False))

I don't suppose there's any chance you feel like trying to do that? :) If not I'll put it on my list ... in the meantime I wouldn't mind pulling this if you could put it in a try block instead of deleting it, so people who still have the right version of lxml can still use it.

Thanks!
Jack

@brianglass
Copy link
Contributor Author

Go ahead and put it on your list. I'll see if I can find some time to look at it, but don't hold high hopes for anything in the next couple weeks. ;) I'll push out a revised PR this week.

@brianglass
Copy link
Contributor Author

jcushman, I have updated the PR as you requested. I have not had a chance to do it properly, but this is a stop-gap.

jcushman added a commit that referenced this pull request Jun 7, 2013
Handle error caused by incompatibility with lxml >= 3 (stopgap measure).
@jcushman jcushman merged commit d5901dc into jcushman:master Jun 7, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants