Syntax, semantics and use cases of forward references #34

Closed
gvanrossum opened this issue Jan 7, 2015 · 8 comments

Comments

@gvanrossum
Member

The best proposal we have for forward references is:

  • If a type annotation must reference a class that is not yet defined at the time the annotation is evaluated at run time, the annotation (or a part thereof) can be enclosed in string quotes, and the type checker will resolve this.

This typically occurs when defining a container class where the class itself appears in the signature of some of its methods; the (global) name for the class is not defined until after the body of the class definition has been executed. (In some cases there may be multiple mutually recursive classes, so a shorthand for "this class" isn't really enough.)

A typical example is:

from typing import Generic, TypeVar

T = TypeVar('T')
class Node(Generic[T]):
    def add_left(self, node: 'Node[T]'):
        self.left = node

Note that the entire annotation expression 'Node[T]' is quoted, not just the class name Node (because 'Node'[T] makes no sense).

The question I'm trying to ask here is whether there is a reasonable limit to the complexity of the syntax that we must support inside string quotes. And if not, whether we may need to invent some other way to specify forward references. For example, something like this has been proposed:

T = TypeVar('T')
Node = ForwardRef('Node')
class Node(Generic[T]):
    def add_left(self, node: Node[T]):
        self.left = node

A related question is whether the __annotations__ dict should be "patched" to reference the intended class, or whether it is up to the code introspecting __annotations__ to interpret forward references (and if the latter, what kind of support typing.py should export to help).
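
To make the second option concrete, here is a minimal sketch of introspecting code that resolves the quoted annotation itself while __annotations__ stays unpatched. It uses typing.get_type_hints() from today's standard library, which postdates this discussion, and it assumes the class is defined at module level so that the name Node can be found in the function's globals.

```python
from typing import Generic, TypeVar, get_type_hints

T = TypeVar('T')


class Node(Generic[T]):
    def add_left(self, node: 'Node[T]') -> None:
        self.left = node


# The raw mapping is not patched: it still holds the string.
raw = Node.add_left.__annotations__['node']

# get_type_hints() evaluates the string in the function's module globals.
resolved = get_type_hints(Node.add_left)['node']
```

Here raw is still the string 'Node[T]', while resolved is the subscripted class, so the cost of interpretation falls on whoever reads the annotations.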

I've got a feeling that the ForwardRef() approach makes things a little easier for the runtime support, at the cost of slight verbosity. The question is how common forward references really are (apart from stubs for builtin/stdlib containers).

@gvanrossum
Member Author

Another idea: if we're okay with Node = ForwardRef... followed by class Node..., maybe we can just repeat the class statement with an empty body? I.e.

T = TypeVar('T')
class Node(Generic[T]): pass
class Node(Generic[T]):
    def add_left(self, node: Node[T]):
        self.left = node

I think I've seen similar things in other languages. Maybe the body of the forward declaration could be some special symbol, e.g. an ellipsis (...)?

@gvanrossum
Member Author

With either form of explicit forward ref (as opposed to the string literal variant), the key question for the runtime becomes whether, and how, to unify the forward ref with the real class. I think we can probably implement this by calling sys._getframe() from the metaclass __new__() method. If it finds a class with the same name and bases in the calling scope's locals (normally the globals, but in tests, for example, it could be a function's locals, and that should work too), it should somehow unify the two objects, perhaps by updating the object that's already there.

(This is really just an issue for the runtime typing.py that I am working on. For the static checker and the PEP, the issue is simply which syntax we should endorse.)
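
A rough sketch of the frame-inspection idea, for illustration only. UnifyingMeta is a hypothetical name, the metaclass is attached explicitly rather than inherited from Generic, and only non-dunder attributes are copied to keep it short. This is not the code in typing.py.

```python
import sys


class UnifyingMeta(type):
    """Hypothetical: a repeated class statement updates the earlier class."""

    def __new__(mcls, name, bases, namespace):
        # Frame 1 is the scope that executes the class statement.
        scope = sys._getframe(1).f_locals
        existing = scope.get(name)
        same_bases = (
            isinstance(existing, mcls)
            and existing.__bases__ == (bases or (object,))
        )
        if same_bases:
            # Same name and bases: copy the new body onto the old object.
            for key, value in namespace.items():
                if not key.startswith('__'):
                    setattr(existing, key, value)
            return existing
        return super().__new__(mcls, name, bases, namespace)


class Node(metaclass=UnifyingMeta):
    ...  # forward declaration


forward_declared = Node


class Node(metaclass=UnifyingMeta):
    def add_left(self, node: Node) -> None:
        self.left = node
```

After the second class statement, Node is still the object created by the forward declaration, so the unquoted annotation on add_left refers to the finished class.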

@gvanrossum
Member Author

The runtime implementation using inspection of sys._getframe().f_locals is really quite simple. 309369f

@JukkaL
Contributor

JukkaL commented Jan 8, 2015

What about forward references that target non-generic classes? We don't have a metaclass there.

At least in the mypy implementation, string escapes are used fairly often. A subtle point is that type annotations sometimes introduce cyclic module dependencies that wouldn't be there otherwise, and in these cases we may need string literal types with a module prefix (of form foo.bar.ClassName). ForwardRef could easily support this case as well:

Node = ForwardRef('node.Node')  # node.Node may not be available yet

However, the name ForwardRef would be a bit misleading, since this is not really a forward reference.

I don't know how the class statement syntax would support inter-module type references.

Here is a toy example of a cyclic dependency caused by an annotation:

# a.py
import typing
import b

class A:
    x = 1

    def f(self) -> int:
        return b.g(self)

# b.py
import typing
import a  # Only needed for the type annotation!

def g(a: 'a.A') -> int:
    return a.x

@JukkaL
Contributor

JukkaL commented Jan 8, 2015

Mypy allows using arbitrary types in string literals. Also, normal and string literal types can be combined. For example, List['A'] is a valid mypy type (of course, assuming that A is defined in the module).
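
For illustration, the same mixed form can also be resolved at run time. This sketch shows the behaviour of the standard library's typing.get_type_hints(), not mypy's checker, and the function name first is made up for the example.

```python
from typing import List, get_type_hints


class A:
    pass


def first(items: List['A']) -> 'A':
    # 'A' is quoted inside an otherwise ordinary List[...] annotation.
    return items[0]


# The nested string and the fully quoted return type are both resolved.
hints = get_type_hints(first)
```

Both the quoted argument of List and the quoted return type come back as the class A.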

@gvanrossum
Member Author

OK, scrap my ideas. Let's think harder about the syntax string literals must support. How about this:

<expr> ::= <name> <attr>* <subscription>?
<attr> ::= '.' <name>
<subscription> ::= '[' <sub_list> ']'
<sub_list> ::= <sub> | <sub_list> ',' <sub>
<sub> ::= <expr> | '[' <expr_list> ']' | '...'
<expr_list> ::= <expr> | <expr_list> ',' <expr>

This is a recursive grammar but not a super complicated one. (The list notation and ellipsis are for Callable.)
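
To give a feel for how small the grammar is, here is a sketch of a recursive-descent recognizer for it. The names tokenize, _Parser and matches_grammar are made up for this example, and names are assumed to be ASCII identifiers. It only accepts or rejects a string; it builds no tree.

```python
import re

# One token: an ellipsis, a name, or a single punctuation character.
_TOKEN = re.compile(r"\s*(\.\.\.|[A-Za-z_][A-Za-z0-9_]*|[.\[\],])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class _Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, token):
        if self.peek() != token:
            raise ValueError(f"expected {token!r}, got {self.peek()!r}")
        self.pos += 1

    def name(self):
        token = self.peek()
        if token is None or not (token[0].isalpha() or token[0] == "_"):
            raise ValueError(f"expected a name, got {token!r}")
        self.pos += 1

    def expr(self):
        # <expr> ::= <name> <attr>* <subscription>?
        self.name()
        while self.peek() == ".":
            self.pos += 1
            self.name()
        if self.peek() == "[":
            self.pos += 1
            self.sub()
            while self.peek() == ",":
                self.pos += 1
                self.sub()
            self.expect("]")

    def sub(self):
        # <sub> ::= <expr> | '[' <expr_list> ']' | '...'
        if self.peek() == "...":
            self.pos += 1
        elif self.peek() == "[":
            self.pos += 1
            self.expr()
            while self.peek() == ",":
                self.pos += 1
                self.expr()
            self.expect("]")
        else:
            self.expr()


def matches_grammar(text):
    try:
        parser = _Parser(tokenize(text))
        parser.expr()
    except ValueError:
        return False
    return parser.peek() is None  # reject trailing tokens
```

With this sketch, 'Node[T]', 'foo.bar.ClassName' and 'Callable[[int, str], bool]' are accepted, while "'Node'[T]" is rejected. Since <expr_list> cannot be empty, 'Callable[[], int]' is rejected as well.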

At runtime we can take a shortcut by just using eval() in the right scope (let's say the globals of the function whose annotation we're evaluating), perhaps after a regexp-based sanity check (just letters, digits, underscores, dots, brackets, spaces) to allay fears of security vulnerabilities.
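
A minimal sketch of that shortcut. resolve_annotation is a hypothetical helper, and the comma in the character class is my addition, since the grammar needs it for subscript lists. The check only narrows what reaches eval(); I wouldn't treat it as a real sandbox.

```python
import re

# Letters, digits, underscores, dots, brackets, commas and spaces only.
_ALLOWED = re.compile(r"[A-Za-z0-9_.\[\], ]+")


def resolve_annotation(func, param):
    """Return the annotation for `param`, evaluating it if it is a string."""
    annotation = func.__annotations__[param]
    if not isinstance(annotation, str):
        return annotation  # already a real object, nothing to do
    if _ALLOWED.fullmatch(annotation) is None:
        raise ValueError(f"refusing to evaluate {annotation!r}")
    # Evaluate in the globals of the function that carries the annotation.
    return eval(annotation, func.__globals__)


class Tree:
    def graft(self, other: 'Tree') -> None:
        self.child = other


def suspicious(x: "__import__('os')") -> None:
    pass
```

resolve_annotation(Tree.graft, 'other') returns the Tree class, while the annotation on suspicious is rejected before eval() sees it because it contains quotes and parentheses.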

I think Eugene Toder suggested that typing.py should define a helper function to normalize an annotation given some context (perhaps a function or method object).

@pludemann

How about allowing ForwardRef for the common cases, so programmers don't have to put quotes on most things, but still allowing strings for the cases it can't handle, e.g. cyclical dependencies?
Or did I miss something in Guido's "OK, scrap my ideas"?

@gvanrossum
Member Author

I meditated on this and in the end decided that the quotes are almost always better, and TOOWTDI.
