Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor plugin system and special case TypedDict get and int.__pow__ #3501

Merged
merged 11 commits into from
Jun 7, 2017

Conversation

JukkaL
Copy link
Collaborator

@JukkaL JukkaL commented Jun 6, 2017

Implement a general-purpose way of extending type inference of
methods. Also special case TypedDict get and int.__pow__.

This an alternative to #2620 by @rowillia. I borrowed some test
cases from that PR. This PR has a few major differences:

  • Use the plugin system instead of full special casing.
  • Don't support d.get('x', {}) as it's not type safe. Once we
    have Typed dicts with missing keys #2632 we can add support for this idiom safely.
  • Code like f = foo.get loses the special casing for get.

Fixes #2612. Work towards #1240.

JukkaL added 10 commits June 6, 2017 15:37
Prepare for supporting more general plugins, for things other
than just functions.  The new design also makes it easier to
add support for user-defined plugins.
Implement a general-purpose way of extending type inference of
methods.
Some of the tests are adapted from #2620 by @rowillia.
Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked all of this over once, and it looks pretty good! I have a few questions but I think it's nearly done.

mypy/checker.py Outdated
def __init__(self, errors: Errors, modules: Dict[str, MypyFile], options: Options,
tree: MypyFile, path: str) -> None:
tree: MypyFile, path: str, plugin: Optional[Plugin] = None) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a use case for leaving the plugin unspecified?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm probably not. It's trivial to provide an instance of the empty Plugin as the argument in case no plugin functionality is needed. I'll make this non-optional if it doesn't break anything.

@@ -362,8 +363,10 @@ def apply_function_plugin(self,
for actual in actuals:
formal_arg_types[formal].append(arg_types[actual])
formal_arg_exprs[formal].append(args[actual])
return self.function_plugins[fullname](
formal_arg_types, formal_arg_exprs, inferred_ret_type, self.chk.named_generic_type)
callback = self.plugin.get_function_hook(fullname)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it seems odd since the caller also calls get_function_hook(). Why not pass in the callback?

mypy/plugin.py Outdated
return inferred_return_type


def int_pow_callback(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pow() function is also candidate. And float powers are also interesting, since (-3)**0.5 or (-3.0)**0.5 will return a complex (there's a bug in typeshed, it claims that float**float -> float; but presumably in x**y, x is usually not a literal, so it's not easy do do much about this except very conservatively declare that float**float -> complex, which I expect would cause other problems than it solves; maybe float**float -> Any makes sense just like int**int -> Any?).

reveal_type(a**2) # E: Revealed type is 'builtins.int'
reveal_type(a**(-0)) # E: Revealed type is 'builtins.int'
reveal_type(a**(-1)) # E: Revealed type is 'builtins.float'
reveal_type(a**(-2)) # E: Revealed type is 'builtins.float'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parentheses around the negative number are redundant; you can write a**-2. (Though it's nice to know that you handle parenthesized exponent too, since I presume users might not know the parentheses are redundant.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove most of the parentheses but keep one pair to ensure that they are accepted.

reveal_type(a**(-0)) # E: Revealed type is 'builtins.int'
reveal_type(a**(-1)) # E: Revealed type is 'builtins.float'
reveal_type(a**(-2)) # E: Revealed type is 'builtins.float'
reveal_type(a**b) # E: Revealed type is 'Any'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At some point in the future I'd like mypy to do constant propagation too, so this would become int as well. :-)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, constant propagation for at least simple things like ints, floats and strings shouldn't be hard. Then we'd have to update this test case.

D = TypedDict('D', {'x': List[int], 'y': int})
d: D
reveal_type(d.get('x', [])) # E: Revealed type is 'builtins.list[builtins.int]'
d.get('x', ['x']) # E: List item 0 has incompatible type "str"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this an error while two lines below is not?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here mypy first infers type List[int] for [...] based on type context and then checks that all items have compatible types (which they don't). Below the type of a has been inferred previously so the type context is ignored -- but this is actually fine, since the formal argument type is a union that accepts anything. This is perhaps a little unintuitive, but changing this would be hard and it's not really specific to this PR. I'm testing both cases to catch regressions.

Copy link
Member

@gvanrossum gvanrossum Jun 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, is there a issue for the general problem then? I can repro this like this:

from typing import *
T = TypeVar('T')
def f(a: Union[List[int], T]) -> T: pass
f([1])   # OK
f([''])  # E: List item 0 has incompatible type "str"
a = ['']
f(a)     # OK

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created #3506 to track this issue.

d: D
d.get() # E: No overload variant of "get" of "Mapping" matches argument types []
d.get('x', 1, 2) # E: No overload variant of "get" of "Mapping" matches argument types [builtins.str, builtins.int, builtins.int]
reveal_type(d.get('z')) # E: Revealed type is 'builtins.object*'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be None instead, since we know it's not a valid key? Or is there the possibility that extra items are present in a TypedDict? (I can't recall where we ended up with that.)

@@ -508,7 +512,7 @@ def check_call(self, callee: Type, args: List[Expression],
or (object_type is not None and self.plugin.get_method_hook(callable_name))):
ret_type = self.apply_function_plugin(
arg_types, callee.ret_type, arg_kinds, formal_to_actual,
args, len(callee.arg_types), callable_name, object_type)
args, len(callee.arg_types), callable_name, object_type, context)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add context to the list of arguments in the docstring?

@@ -1698,4 +1698,5 @@ reveal_type(a**(-0)) # E: Revealed type is 'builtins.int'
reveal_type(a**(-1)) # E: Revealed type is 'builtins.float'
reveal_type(a**(-2)) # E: Revealed type is 'builtins.float'
reveal_type(a**b) # E: Revealed type is 'Any'
reveal_type(a.__pow__(2)) # E: Revealed type is 'builtins.int'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try a.__pow__(-2) too?

E = TypedDict('E', {'d': D})
p = E(d=D(x=0, y=''))
reveal_type(p.get('d', {'x': 1, 'y': ''})) # E: Revealed type is 'TypedDict(x=builtins.int, y=builtins.str, _fallback=__main__.D)'
p.get('d', {}) # E: Expected items ['x', 'y'] but found [].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't there a use case where people write p.get('d', {}).get('x') and expect int or None?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but as mentioned in the PR description, handling it safely requires a way of specifying that some TypedDict items may be missing, and we don't have that feature yet. The return type of p.get('d', {}) should be a TypedDict that is like D but where both 'x' and 'y' may be missing. I'll add support for this in a separate PR once we can agree on the syntax (or I can add support for this as an mypy internal feature without public syntax so that missing keys are only generated through type inference).

@chadrik
Copy link
Contributor

chadrik commented Jun 7, 2017

Here are some thoughts on the user-plugin aspect of this PR. Should we open a new PR for user-facing aspect of the plugin system?

Plugin discovery options

  • A. by file path. e.g. /path/to/myplugin.py. could also extend this with a MYPY_PLUGIN_PATH
    • pro: easier to write test cases (I discovered that placing a file on the PYTHONPATH within the tests was difficult, likely by design)
    • con: can't use pip to install plugins
  • B. by dotted path: e.g. package.module
    • pro: easy for users to create pip-installable plugins
    • con: adding plugin modules and their requirements to the PYTHONPATH could interfere with type checking?
  • C: setuptools entry points. e.g.:
    setup(
        entry_points={
            'mypy.userplugin': ['my_plugin = my_module.plugin:register_plugin']
        }
    )

Other questions

  • shall we always look for an object within the module with a designated named, e.g. Plugin, or make this configurable as well? e.g. package.module.MyPlugin or /path/to/myplugin.py:MyPlugin (Note: I've already written some functionality for the latter)
  • do user plugins need to inherit from mypy.plugin.Plugin?

Plugin chainability options

  • A. aggregate all user plugins into a single uber-plugin instance.
    • each method on this aggregate plugin would cycle through its children in order until one returns a non-None result. we could then cache the mapping from feature to user-plugin instance to speed up future lookups.
    • this is compatible with the current design which passes a single Plugin instance around.
  • B. register a plugin per feature (.e.g 'typing.Mapping.get'). this allows you to replace the search with a fast dictionary lookup, as well as detect up-front at registration time when two plugins contend for the same feature.

@gvanrossum
Copy link
Member

@JukkaL if you're happy you can merge this now.

@gvanrossum
Copy link
Member

@chadrik Good thoughts. Could you open a new issue or add this to the general issue about plugins? This PR will soon be merged and I don't want discussion unrelated to the PR itself happening here.

@JukkaL
Copy link
Collaborator Author

JukkaL commented Jun 7, 2017

@chadrik Thanks for the write-up, it's very useful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants