New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Plugin for ctypes.Array #4869

Merged
merged 28 commits into from Nov 19, 2018

Conversation

Projects
None yet
4 participants
@dgelessus
Copy link
Contributor

dgelessus commented Apr 7, 2018

Followup to python/typeshed#1986 (specifically this discussion). This adds a mypy.plugins.ctypes module, which provides accurate type info for ctypes.Array's methods. The corresponding Typeshed stub uses Any in lots of places, because ctypes' rules for what is accepted/returned are too complex. This plugin provides callbacks for the following ctypes.Array methods and attributes:

  • __init__ ctypes.Array constructor
  • __getitem__
  • __setitem__
  • __iter__
  • value and raw (see python/typeshed#2111)

Note: ignore the following paragraphs, the issues described there have now been fixed in mypy.

This is still a work in progress. I added some debug prints showing which plugin hooks are called when, because it seems they are not always called properly. Specifically:

  • For __getitem__, get_method_hook is called as expected.
  • For __init__, get_method_hook is not called. Instead, get_function_hook is called with "Array" as the argument. This seems like a bug - shouldn't mypy figure out that Array is a class and look at its __init__ method? Even if this is intentional, the name should be "ctypes.Array" and not "Array".
  • For __setitem__, get_method_hook is not called either. Instead, get_function_hook is called with "__setitem__ of Array" as the argument. This also seems like a bug - __setitem__ is clearly a method and not a function, and "__setitem__ of Array" is not a fully qualified name.
  • For __iter__, the behavior is like with __setitem__.
  • get_method_signature_hook is never called for any of these dunder methods.

Are these bugs, or am I simply misunderstanding how the plugin hooks are supposed to behave?

Here's the test code with which I observed this behavior:

import ctypes

class MyCInt(ctypes.c_int):
    pass

intarr4 = ctypes.c_int * 4
reveal_type(intarr4)
a = intarr4(1, 2, 3, 4)
reveal_type(a)
reveal_type(a[0])
reveal_type(a[1:3])
a[0] = 42
a[1] = ctypes.c_int(42)
a[2] = MyCInt(42)
a[3] = b"bytes"  # type error
reveal_type(iter(a))
for x in a:
    reveal_type(x)

myintarr4 = MyCInt * 4
reveal_type(myintarr4)
mya = myintarr4(1, 2, 3, 4)
reveal_type(mya)
reveal_type(mya[0])
reveal_type(mya[1:3])
mya[0] = 42
mya[1] = ctypes.c_int(42)  # type error
mya[2] = MyCInt(42)
mya[3] = b"bytes"  # type error
reveal_type(iter(mya))
for myx in mya:
    reveal_type(myx)
Add basic parts of a ctypes plugin
This provides accurate return types for ctypes.Array.__getitem__
and __iter__. __init__ is partially implemented but not yet working.
@ilevkivskyi

This comment has been minimized.

Copy link
Collaborator

ilevkivskyi commented Apr 23, 2018

Hm, it looks like a forgot to press "Comment" a week ago, sorry. Trying to remember, I think my conclusion was: behavior of __init__ is intentional (modulo name), other two are bugs. Fixing these should be a separate PR landing before this one.

@ilevkivskyi

This comment has been minimized.

Copy link
Collaborator

ilevkivskyi commented Apr 23, 2018

... and thanks for working on this! :-)

@dgelessus

This comment has been minimized.

Copy link
Contributor

dgelessus commented Apr 23, 2018

Hm, it looks like a forgot to press "Comment" a week ago, sorry.

No worries :)

behavior of __init__ is intentional (modulo name), other two are bugs.

OK, that means the following bugs exist at the moment:

  • When a class is instantiated, the class name passed to get_function_hook is not fully qualified
  • For some dunder methods (__setitem__, __iter__), get_function_hook is incorrectly called instead of get_method_hook (and the method name is not fully qualified)
  • get_method_signature_hook is not called for some dunder methods (__getitem__, __setitem__, __iter__)

Is that correct? Should I put these points into a separate issue (or multiple issues)?

@ilevkivskyi

This comment has been minimized.

Copy link
Collaborator

ilevkivskyi commented Apr 24, 2018

Is that correct? Should I put these points into a separate issue (or multiple issues)?

Yes, I think one common issue for these would be OK.

@gvanrossum

This comment has been minimized.

Copy link
Member

gvanrossum commented Sep 24, 2018

This work has been in progress for a mighty long time. Is it still relevant?

Merge branch 'master' into ctypes_array_plugin
# Conflicts:
#	mypy/plugin.py
@dgelessus

This comment has been minimized.

Copy link
Contributor

dgelessus commented Sep 25, 2018

As far as I can tell, this is still blocked by #4964. I merged master into my branch and checked, and the method signature hooks aren't looked up correctly for __init__, __setitem__ and __iter__.

I'm aware of #5379, but I think that doesn't help in this case. The ctypes.Array creation hook needs to know the element type of the array that's being created, so that it can check the types of the arguments. As far as I can tell, this information is not available in a function hook: the FunctionContext only includes the argument and return types, but not the type of the callable (in this case Type[ctypes.Array[E]], where E is the array's element type).

Because of this, array_init_callback is currently implemented as a method signature hook - that way the array element type can be determined by looking at the type of self (ctx.type). However, this approach currently doesn't work, because mypy doesn't call method hooks for __init__ when a class is instantiated.

@ilevkivskyi

This comment has been minimized.

Copy link
Collaborator

ilevkivskyi commented Sep 25, 2018

Are interested in fixing #4964 first (or some parts of it)? This is an important issue and you have a good context of what is needed in real world.

@dgelessus

This comment has been minimized.

Copy link
Contributor

dgelessus commented Oct 1, 2018

@ilevkivskyi I think I have a possible fix for the remaining #4964 issues, see #5700. Based on my local testing, those fixes should be enough to make the ctypes plugin work correctly.

dgelessus added some commits Nov 9, 2018

Reimplement ctypes.Array constructor hook as discussed in #5700
Class instantiation intentionally does not call plugin hooks on
__init__, instead function hooks for the class name are called.
Since plugin hooks cannot modify the signature of a function call
(there is no "function signature hook", unlike for methods), we instead
perform manual argument type checking in a regular function hook.

@dgelessus dgelessus changed the title [WIP] Plugin for ctypes.Array (and possible bugs with the plugin hooks?) Plugin for ctypes.Array Nov 13, 2018

@ilevkivskyi
Copy link
Collaborator

ilevkivskyi left a comment

Thanks! This looks very good. I also tried this agains our internal codebases, and it doesn't generate any errors.

Here I have some comments (mostly minor).

Show resolved Hide resolved test-data/unit/check-ctypes.test Outdated
Show resolved Hide resolved test-data/unit/check-ctypes.test Outdated
@@ -384,24 +384,48 @@ class DefaultPlugin(Plugin):

def get_function_hook(self, fullname: str
) -> Optional[Callable[[FunctionContext], Type]]:
from mypy.plugins import ctypes

This comment has been minimized.

@ilevkivskyi

ilevkivskyi Nov 14, 2018

Collaborator

This is unfortunate, but it is not your fault. We already have an issue to refactor plugin modules to avoid these imports, see #5117. Could you please also add a comment (or comments?) about this also here?

Another idea is to move the ctypes plugin to plugin.py since all other stdlib plugins live there. In future we might add more stdlib plugins (e.g. for functools) so this file will grow, but for now this is probably a better solution.

This comment has been minimized.

@dgelessus

dgelessus Nov 14, 2018

Contributor

Another idea is to move the ctypes plugin to plugin.py since all other stdlib plugins live there

That would be possible, but I'm not sure if it's worth it here. The circular import issues aren't too big (the only affected parts are the argument type annotations of the hooks). And since the ctypes plugin is a bit larger than the other hooks in plugin.py, I think it makes more sense to have it in a separate module. (It's not the only stdlib plugin to have its own module - dataclasses is separated as well, and it's roughly the same size as the ctypes plugin.)

Show resolved Hide resolved mypy/plugins/ctypes.py
Show resolved Hide resolved mypy/plugins/ctypes.py Outdated
Show resolved Hide resolved mypy/plugins/ctypes.py Outdated
'Array constructor argument {} of type "{}"'
' is not convertible to the array element type "{}"'
.format(arg_num, arg_type, et),
ctx.context)

This comment has been minimized.

@ilevkivskyi

ilevkivskyi Nov 14, 2018

Collaborator

Does this also support calling Array with *args/**kwargs/etc? If yes, I would add a test for this, if no, I would add a comment explaining this.

This comment has been minimized.

@dgelessus

dgelessus Nov 14, 2018

Contributor

You're right, it doesn't work with *args at the moment, but it definitely should. **kwargs should always be an error, since Array doesn't take keyword arguments. (Yes, **{} is technically allowed, but I really hope nobody does that.)

I'm not sure how to handle this in a function hook though - I don't see a way to find out if an argument is a regular arg, *args, or **kwargs.

In any case, I've added a (currently failing) test for *args.

This comment has been minimized.

@ilevkivskyi

ilevkivskyi Nov 18, 2018

Collaborator

Yes, the logic in function hook is sub-optimal, this is probably the most problematic hook currently, several people who tried it, found the API inconvenient. You also get corresponding expressions in ctx.args[0] for every type in ctx.arg_types[0], maybe you can try checking if it is a StarExpr(...) and then somehow "unpack" type if it maps to iterable supertype?

This comment has been minimized.

@dgelessus

dgelessus Nov 19, 2018

Contributor

maybe you can try checking if it is a StarExpr(...)

This doesn't seem to work - if I print out ctx.args[0][0] for a call like intarr4(*int_values), I get NameExpr(int_values [__main__.int_values]), with no StarExpr around it.

This comment has been minimized.

@ilevkivskyi

ilevkivskyi Nov 19, 2018

Collaborator

Hm... This is unfortunate, there is a trivial fix to apply_function_plugin() to pass arg_kinds to FunctionContext, but this looks like a partial solution of a bigger problem. Could you please add a TODO comment here? We can fix it later after we decide what to do with the FunctionContext API.

Show resolved Hide resolved mypy/plugins/ctypes.py Outdated
Show resolved Hide resolved mypy/plugins/ctypes.py Outdated
class slice: pass
class str: pass
class tuple(Generic[T]): pass
class unicode: pass

This comment has been minimized.

@ilevkivskyi

ilevkivskyi Nov 14, 2018

Collaborator

Are these two fixtures needed, or you can slightly tweak one of existing fixtures to accommodate your needs? If yes, then I think it is better than adding two new fixtures.

This comment has been minimized.

@dgelessus

dgelessus Nov 14, 2018

Contributor

The closest one would be floatdict.pyi, if I add definitions for unicode and slice there, I can use it for all of the ctypes tests.

dgelessus added some commits Nov 14, 2018

@dgelessus

This comment has been minimized.

Copy link
Contributor

dgelessus commented Nov 14, 2018

mypy/plugins/ctypes.py:169: error: Argument 1 to "make_simplified_union" of "UnionType" has incompatible type "List[Instance]"; expected "List[Type]"

This seems odd - shouldn't make_simplified_union accept any Sequence[Type]? (If necessary I can work around this by explicitly annotating my list, but it seems strange that an exact list type is required here.)

@ilevkivskyi

This comment has been minimized.

Copy link
Collaborator

ilevkivskyi commented Nov 18, 2018

This seems odd - shouldn't make_simplified_union accept any Sequence[Type]? (If necessary I can work around this by explicitly annotating my list, but it seems strange that an exact list type is required here.)

You are totally allowed to update that signature to be more reasonable :-)

@ilevkivskyi
Copy link
Collaborator

ilevkivskyi left a comment

Thanks for updates! I have one more (hopefully last) round of comments here about few remaining corner cases (sorry for being pedantic here).

Show resolved Hide resolved mypy/plugins/ctypes.py Outdated
"""
# Every type can be converted from itself (obviously).
allowed_types = [tp]
if isinstance(tp, Instance):

This comment has been minimized.

@ilevkivskyi

ilevkivskyi Nov 18, 2018

Collaborator

Again, should we care about unions here?

This comment has been minimized.

@dgelessus

dgelessus Nov 19, 2018

Contributor

We should, but this isn't as straightforward as with the attribute hooks. The type returned from _autoconvertible_to_cdata is not used as a return type, but an argument type, for example in Array.__setitem__. This makes the rules for union types a little more complicated here.

For example, if arr has the type Array[Union[c_int, c_uint]], arr[i] = x is only valid if x can be converted to both c_int and c_uint, i. e. the type of x should be the intersection of _autoconvertible_to_cdata(c_int) and _autoconvertible_to_cdata(c_uint).

As far as I know, mypy doesn't support intersection types, and I can't do an easy intersection calculation using sets, since Type objects are not hashable. So I guess I'd have to write a manual intersection algorithm based on is_subtype?

This comment has been minimized.

@ilevkivskyi

ilevkivskyi Nov 19, 2018

Collaborator

Mypy types are hashable (we have a subtype cache for example that uses this), but I am not sure this is what you need. Also I am not sure we need an intersection here, it is indeed safe, but I am afraid it may cause false positives. I am fine with a less strict check (especially if it is easy to implement). IOW I don't worry about false negatives because we don't allow unions here, I worry about false positives.

This comment has been minimized.

@dgelessus

dgelessus Nov 19, 2018

Contributor

Mypy types are hashable

Ah, my mistake - I only looked at the Type base class and saw that it had no __hash__, but almost all of the concrete type classes define __hash__.

IOW I don't worry about false negatives because we don't allow unions here, I worry about false positives.

In that case I can simply add another "union-loop" thing. But tbh even if this is changed later, I don't think it will give many unexpected errors - an Array[Union[...]] type can only come from an explicit annotation, in which case the user perhaps wants this kind of strict checking.

if et is not None:
types = []
for tp in union_items(et):
if isinstance(tp, Instance) and tp.type.fullname() == 'ctypes.c_char':

This comment has been minimized.

@ilevkivskyi

ilevkivskyi Nov 18, 2018

Collaborator

I think we should double-check that Any is still valid here and elsewhere. Could you please add tests for this and update the code if necessary?

"""Callback to provide an accurate type for ctypes.Array.raw."""
et = _get_array_element_type(ctx.type)
if et is not None:
if isinstance(et, Instance) and et.type.fullname() == 'ctypes.c_char':

This comment has been minimized.

@ilevkivskyi

ilevkivskyi Nov 18, 2018

Collaborator

Should Union[c_char, Any] be still valid here? I think it should.

This comment has been minimized.

@dgelessus

dgelessus Nov 19, 2018

Contributor

Fixed. I'm wondering though, what would a union like Union[c_char, Any] be useful for? Every c_char is an Any, so why doesn't the union get simplified to Any at some point? (The make_simplified_union docstring points out that Anys in unions are not simplified, but doesn't explain why.)

'Array constructor argument {} of type "{}"'
' is not convertible to the array element type "{}"'
.format(arg_num, arg_type, et),
ctx.context)

This comment has been minimized.

@ilevkivskyi

ilevkivskyi Nov 18, 2018

Collaborator

Yes, the logic in function hook is sub-optimal, this is probably the most problematic hook currently, several people who tried it, found the API inconvenient. You also get corresponding expressions in ctx.args[0] for every type in ctx.arg_types[0], maybe you can try checking if it is a StarExpr(...) and then somehow "unpack" type if it maps to iterable supertype?

@dgelessus

This comment has been minimized.

Copy link
Contributor

dgelessus commented Nov 18, 2018

You are totally allowed to update that signature to be more reasonable :-)

Actually, after looking into it, it's not as easy to change as I thought. It turns out that make_simplified_union will sometimes reassign and/or mutate the list parameter, in which case the List[Type] type is actually correct. Refactoring this isn't trivial and I don't want to do it in an otherwise unrelated PR, so I simply added the necessary type comments in the plugin code instead.

@ilevkivskyi

This comment has been minimized.

Copy link
Collaborator

ilevkivskyi commented Nov 19, 2018

@dgelessus I left two more comments, I would propose to apply simple fixes, land this (finally after many months :-) and then fix the remaining issues in a separate PR later.

(The test that currently fails can be just skipped by adding -skip suffix to its name.)

dgelessus added some commits Nov 19, 2018

Disable the ctypes.Array(*args) test case for now
This case cannot be handled correctly yet, because the function hook
API doesn't expose the necessary information (see comment in code).
@ilevkivskyi
Copy link
Collaborator

ilevkivskyi left a comment

Great! I think this is ready now. The problem with *args can be revisited after we improve the plugin hook API.

@ilevkivskyi ilevkivskyi merged commit d4729d9 into python:master Nov 19, 2018

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
# unpacked. However, FunctionContext currently doesn't provide a way to differentiate
# between normal arguments and *args, so the iterable type is considered invalid.
# Once FunctionContext has an API for this, *args should be allowed here if the
# iterable's element type is compatible with the array element type.

This comment has been minimized.

@Mic92

Mic92 Dec 7, 2018

Is there any workaround for this? I currently have a list of integers that I want to convert into a ctypes array of bytes.

This comment has been minimized.

@ilevkivskyi

ilevkivskyi Dec 8, 2018

Collaborator

You can just put a # type: ignore on the line where error appears. We are already working on something that will allow to fix this.

This comment has been minimized.

@dgelessus

dgelessus Dec 8, 2018

Contributor

(Relevant issue: #5409 (comment))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment