Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lxml missing type annotation for XPath #525

Closed
uglycoyote opened this issue Sep 8, 2016 · 16 comments
Closed

lxml missing type annotation for XPath #525

uglycoyote opened this issue Sep 8, 2016 · 16 comments

Comments

@uglycoyote
Copy link

The following small script

from lxml import etree    
NameXPath = etree.XPath("Name/text()")

gives the error

test.py:3: error: "module" has no attribute "XPath"

which appears to be because the lxml typings are incomplete. I confirmed that this source does not mention XPath

https://github.com/python/typeshed/blob/master/third_party/3/lxml/etree.pyi

but I don't know enough about mypy or the types in lxml to propose a patch myself.

I asked a question about this on Stack Overflow (http://stackoverflow.com/questions/39382937/mypy-spurious-error-module-has-no-attribute-xpath-with-etree#39382937) which brought me here.

@Naruto0
Copy link
Contributor

Naruto0 commented Sep 22, 2016

It seems that lmxl stubs are not complete. On code from lxml import html I have following error __main__.py:64: error: Module has no attribute 'html'

@gvanrossum
Copy link
Member

gvanrossum commented Sep 22, 2016 via email

@Naruto0
Copy link
Contributor

Naruto0 commented Sep 23, 2016

If @uglycoyote won't complain i call dibs on lxml typeshed.

@gvanrossum
Copy link
Member

Go for it!

@uglycoyote
Copy link
Author

Yes, naruto0, please do. I am only a casual user of lxml and don't really understand the types that it uses, hence my need for typeshed to help me. Thanks for offering. And thanks Guido for nudging this along.

@Naruto0
Copy link
Contributor

Naruto0 commented Oct 4, 2016

Bump, i'm working on it, have been busy lately.

@underyx
Copy link

underyx commented Nov 8, 2016

Here's a quick and friendly ping for @Naruto0

@gvanrossum
Copy link
Member

Assuming @Naruto0 is still busy, would you (@underyx) want to submit a PR yourself?

@underyx
Copy link

underyx commented Nov 8, 2016

Unfortunately my experience with mypy and type annotations can be summed up as 'tried it on a project, saw the lxml failures, decided to stop trialling mypy', so, just like the issue author, I don't think I'm qualified to take on the task.

Though, I am planning to look into the topic further in the coming weeks, so no idea if this'll stay the case.

@uglycoyote
Copy link
Author

uglycoyote commented Nov 8, 2016

It would be great if there was a tool which helped someone explore the types that a library uses, to help someone make a correct pyi type information file without necessarily being an export on the source.

I understand that there is an existing tool to statically analyze sources and generate a stub pyi file (stubgen) but when I tried running "stubgen.py lxml" I got a file with exactly one line

def get_include(): ...

Also it seems unlikely to me that the static approach would work very well: since the sources don't declare the types of any function or variable, how on earth would a static analyzer know what type a function returns. It seems to me that the only workable solution would be to have a dynamic analyzer which can give you information about the types from runtime so that you can use them later when not running.

e.g. I can do the following in an interactive type window:

>>> x = etree.XPath("Hello")
>>> type(x)
<class 'lxml.etree.XPath'> 

which tells me enough that I might be able to write a stub for the "etree.XPath" function, but at best I would just make a class definition for XPath which says something like

class XPath:
pass

because I still know next to nothing about the XPath class.
Maybe I could even go so far as to type

>>> [y[0] for y in inspect.getmembers(x)]
['__call__', '__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__ne__', '__new__', '__pyx_vtable__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'error_log', 'evaluate', 'path']

which tells me that an XPath object has a member called 'evaluate' and a member called path.

I know from experience that the XPath object can be used as though it is a function itself, e.g. the docs at http://lxml.de/xpathxslt.html#the-xpath-class give the example

>>> find = etree.XPath("//b")
>>> print(find(root)[0].tag) 

where find(root) is presumably equivalent to calling find.evaluate(root). But I would have no idea how to express the idea of a "object that acts like a function but is actually a class instance" in a pyi file with my limited knowledge of type annotations.

I tried looking in my site-packages/lxml directory to see if any of the source made sense to but I was completely bewildered where to start. When I do "from lxml import etree" it must be importing etree.pyd, which is binary, which is completely opaque

@gvanrossum
Copy link
Member

gvanrossum commented Nov 9, 2016 via email

@timabbott
Copy link
Contributor

I just ran into a similar issue with using lxml.html on Python 3 (zulip/zulip@ec080ae). Unless this gets cleaned up soon, we might want to just remove the lxml stubs until they're more complete.

@gvanrossum
Copy link
Member

@JelleZijlstra Do you have any insights to share here? In your opinion, do you think there's still hope to get the lxml stubs quickly to a point where most users will be happy with them, or do you think it's better to rip them out until someone wants to take this up as a serious project (perhaps distributing the lxml stubs separately using the mechanisms proposed in Ethan's PEP 561)? Or do you think the fuss might perhaps be exaggerated and we should just keep improving the lxml stubs incrementally?

JelleZijlstra added a commit that referenced this issue Oct 10, 2017
I disagree with the recommendation that users create incomplete stubs, because such stubs lead to false positive type checker errors (see for example #525). I would like to instead set a norm that all stubs should contain all public objects in the library they cover.
@JelleZijlstra
Copy link
Member

I think we should remove the stubs for now and take a stance against incomplete stubs in general (filed #1659 for that). Hopefully the lxml stubs can become a PEP 561-style stubs package and develop to completion.

gvanrossum pushed a commit that referenced this issue Oct 10, 2017
I disagree with the recommendation that users create incomplete stubs, because such stubs lead to false positive type checker errors (see for example #525). I would like to instead set a norm that all stubs should contain all public objects in the library they cover.
@gvanrossum
Copy link
Member

OK let's remove it.

JelleZijlstra added a commit to JelleZijlstra/typeshed that referenced this issue Oct 11, 2017
JelleZijlstra added a commit to JelleZijlstra/mypy that referenced this issue Nov 7, 2017
See python/typeshed#525 and python/typeshed#1664.

We decided to remove the lxml stubs because they cause problems so
frequently. However, this requires small tweaks to mypy's own usage
of lxml.

My plan is to proceed as follows:
- Merge this PR, adding type ignores and removing warn_unused_ignores
  from mypy's self-check config.
- Merge python/typeshed#1664, which should pass Travis after his PR
  is merged.
- Add "warn_unused_ignores" back to mypy's self-check config.

I confirmed that this PR passes tests both with and without the
lxml stubs being present.
JukkaL pushed a commit to python/mypy that referenced this issue Nov 7, 2017
See python/typeshed#525 and python/typeshed#1664.

We decided to remove the lxml stubs because they cause problems so
frequently. However, this requires small tweaks to mypy's own usage
of lxml.

My plan is to proceed as follows:
- Merge this PR, adding type ignores and removing warn_unused_ignores
  from mypy's self-check config.
- Merge python/typeshed#1664, which should pass Travis after his PR
  is merged.
- Add "warn_unused_ignores" back to mypy's self-check config.

I confirmed that this PR passes tests both with and without the
lxml stubs being present.
gvanrossum pushed a commit that referenced this issue Nov 7, 2017
@JelleZijlstra
Copy link
Member

We decided to solve this issue by removing the stubs.

I made a standalone project https://github.com/JelleZijlstra/lxml-stubs to provide a place to improve the lxml stubs. I probably won't be actively working on improving the stubs, but contributions are welcome.

timabbott added a commit to zulip/zulip that referenced this issue Nov 25, 2017
Mostly these can be removed because the broken LXML stubs were removed
from typeshed in python/typeshed#525.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants