Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Annotation scopes containing nested scopes #109118

Closed
15r10nk opened this issue Sep 7, 2023 · 13 comments · Fixed by #109196
Closed

Annotation scopes containing nested scopes #109118

15r10nk opened this issue Sep 7, 2023 · 13 comments · Fixed by #109196
Assignees
Labels
3.13 new features, bugs and security fixes type-bug An unexpected behavior, bug, or error

Comments

@15r10nk
Copy link
Contributor

15r10nk commented Sep 7, 2023

Bug report

Bug description:

The following code causes a SystemError during compilation.

class name_2[*name_5, name_3: int]:
    (name_3 := name_4)

    class name_4[name_5: name_5]((name_4 for name_5 in name_0 if name_3), name_2 if name_3 else name_0):
        pass

output (Python 3.12.0rc2+):

SystemError: compiler_lookup_arg(name='name_3') with reftype=3 failed in <generic parameters of name_4>; freevars of code <genexpr>: ('name_3',)

removing the named-expression makes it worse and causes a crash

class name_2[*name_5, name_3: int]:

    class name_4[name_5: name_5]((name_4 for name_5 in name_0 if name_3), name_2 if name_3 else name_0):
        pass

output (Python 3.12.0rc2+):

Traceback (most recent call last):
  File "/home/frank/projects/pysource-codegen/bug2.py", line 2, in <module>
    class name_2[*name_5, name_3: int]:
  File "/home/frank/projects/pysource-codegen/bug2.py", line 2, in <generic parameters of name_2>
    class name_2[*name_5, name_3: int]:
  File "/home/frank/projects/pysource-codegen/bug2.py", line 4, in name_2
    class name_4[name_5: name_5]((name_4 for name_5 in name_0 if name_3), name_2 if name_3 else name_0):
  File "/home/frank/projects/pysource-codegen/bug2.py", line 4, in <generic parameters of name_4>
    class name_4[name_5: name_5]((name_4 for name_5 in name_0 if name_3), name_2 if name_3 else name_0):
                                                       ^^^^^^
NameError: name 'name_0' is not defined
Modules/gcmodule.c:113: gc_decref: Assertion "gc_get_refs(g) > 0" failed: refcount is too small
Enable tracemalloc to get the memory block allocation traceback

object address  : 0x7f4c77059a30
object refcount : 1
object type     : 0x560ea4b6cfe0
object type name: dict
object repr     : {'__module__': '__main__', '__qualname__': 'name_2', '__type_params__': (name_5, name_3)}

Fatal Python error: _PyObject_AssertFailed: _PyObject_AssertFailed
Python runtime state: finalizing (tstate=0x0000560ea4cc9ba0)

Current thread 0x00007f4c7756c280 (most recent call first):
  Garbage-collecting
  <no Python frame>

side note:
The reason for the strange looking language constructs and variable names is that I generated this code. I found it while working on my pysource-codegen library.

CPython versions tested on:

3.12

Operating systems tested on:

Linux

Linked PRs

@15r10nk 15r10nk added the type-bug An unexpected behavior, bug, or error label Sep 7, 2023
@JelleZijlstra JelleZijlstra self-assigned this Sep 7, 2023
@AlexWaygood AlexWaygood added 3.12 bugs and security fixes 3.13 new features, bugs and security fixes labels Sep 7, 2023
@JelleZijlstra
Copy link
Member

Thanks, this is very helpful!

Minimized version:

class name_2[name_3]:
    name_3 = name_4
    class name_4[name_5]((name_3 for name_5 in name_0), name_3):
        pass

The runtime crash when removing the name_3 = name_4 line is a bug in the interpreter that I will send a PR for. The compiler crash (SystemError) is a different bug that I haven't yet tracked down.

@JelleZijlstra
Copy link
Member

The first thing to figure out with the compiler crash is how the code should behave, as class scopes are weird. We have a genexp within the type parameters of a generic class nested inside another generic class.

Here's the test I wrote, demonstrating how this works if the inner class is not generic:

+    def test_gen_exp_in_nested_class(self):
+        class C[T]:
+            T = "class"
+            class Inner(make_base(T for _ in (1,)), make_base(T)):
+                pass
+        T, = C.__type_params__
+        base1, base2 = C.Inner.__bases__
+        self.assertEqual(list(base1.__arg__), [T])
+        self.assertEqual(base2.__arg__, "class")
+
+    def test_gen_exp_in_nested_generic_class(self):
+        class C[T]:
+            T = "class"
+            class Inner[U](make_base(T for _ in (1,)), make_base(T)):
+                pass
+        T, = C.__type_params__
+        base1, base2, _ = C.Inner.__bases__
+        self.assertEqual(list(base1.__arg__), [T])
+        self.assertEqual(base2.__arg__, "class")
+
+
+def make_base(arg):
+    class Base:
+        __arg__ = arg
+    return Base
+

In short, the genexp should not see the definition in the outer class body, only the one in the generic parameters of the outer class. However, the other bases should see the definition in the class body. This is how it works now if the inner class is not generic, and this is correct.

If the inner class is generic, the behavior should be analogous, but evidently there's something wrong in the bookkeeping.

JelleZijlstra added a commit to JelleZijlstra/cpython that referenced this issue Sep 9, 2023
JelleZijlstra added a commit to JelleZijlstra/cpython that referenced this issue Sep 9, 2023
…EP 695 function (pythonGH-109123).

(cherry picked from commit 17f9941)

Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
@JelleZijlstra JelleZijlstra changed the title SystemError/crash while compiling python code SystemError with generator expression within PEP 695 scope within class Sep 9, 2023
@JelleZijlstra
Copy link
Member

JelleZijlstra commented Sep 9, 2023

Thinking about it more, I recommend that we should disallow nested scopes (comprehensions, generator expressions, and lambdas) within PEP 695 scopes (bases of generic classes, annotations of generic functions, values of type aliases) within classes. The normal scoping rules would produce very weird outcomes and would I think be difficult to implement. Also, disallowing them is a safe option this late in the Python 3.12 release cycle: if we can implement it correctly, we can always allow these constructs later for 3.13.

We would disallow the following constructs:

class C:
     type T = lambda: T  # SyntaxError
     type T = [T for T in T]  # SyntaxError
     type T[U]: (T for T in U)] = int  # SyntaxError

     def f[T](x: lambda: T): ...  # SyntaxError
     class I[T](T for T in T): ...  # SyntaxError

All of these constructs would still be allowed by the interpreter outside of class scopes.

Why do I say the outcomes are weird? Consider this code:

def make_base(arg):
    class Base:
        __arg__ = arg
    return Base


class C[T]:
    T = "class"
    U = "class"
    class Inner[U](make_base((T, U)), make_base((T, U) for _ in (1,))):
        pass

There are four occurrences of T and U in the bases of Inner. What should they mean (referring to https://peps.python.org/pep-0695/#scoping-behavior)?

  • The first T is inside the annotation scope for Inner, but annotation scopes (unlike normal function scopes) see names defined in the enclosing class scope, so it should refer to the class-scoped variable "class".
  • The first U is also inside the annotation scope, so it should refer to the type parameter U for the inner class.
  • The second T is inside a generator expression that is in turn inside an annotation scope. It should not see the class scope, because genexps inside classes cannot normally see variables in that class scope (as discussed in List comprehensions now have access to the enclosing class scope #104374). Therefore, here T should refer to the type parameter for the outer generic class C.
  • The second U is also inside a genexp inside an annotation scope. I think the intuitive answer is that it should refer to the inner type parameter U. But that creates a weird result where the genexp can see the enclosing annotation scope, but not all names that are visible to that annotation scope. This is, as far as I can tell, unlike any other scope in Python, and it would require some changes to the abstractions in symtable.c to implement correctly.

Why is it OK to disallow comprehensions/lambdas/genexps in PEP 695 scopes within classes?

  • This is new syntax, so we won't be breaking anyone's code by disallowing it. We can always allow this later.
  • There are few conceivable use cases for nested scopes within annotation scopes. Annotation scopes can include the following:
    • The right-hand side of type aliases. This should be a type, which cannot include a lambda or comprehension if you follow the standard type system.
    • The bounds and constraints of TypeVars. These are also supposed to be valid as types.
    • The parameter and return annotations of generic functions. Same.
    • The bases and keyword arguments of generic classes. While these could contain any code and therefore also comprehensions or lambdas, that's going to be very rare.

I will submit a PR implementing this restriction.

@carljm
Copy link
Member

carljm commented Sep 11, 2023

I am OK with the "disallow" approach for now. It seems like the simplest option for 3.12, and is unlikely to cause serious practical problems.

Longer-term, I think it is awkward and confusing to disallow these constructs in annotation scope only within classes. Among your four bullet points illustrating the awkwardness of the scoping, the third point I don't agree with:

The second T is inside a generator expression that is in turn inside an annotation scope. It should not see the class scope, because genexps inside classes cannot normally see variables in that class scope (as discussed in #104374). Therefore, here T should refer to the type parameter for the outer generic class C.

The special case here is the annotation scope, which (unlike other nested scopes) can see the enclosing class scope. IMO, this means that it is consistent and correct (by the same transitive rule of scope visibility that you discuss in the fourth bullet point) for a comprehension (or any other nested scope) within the annotation scope to also see the enclosing class scope. Therefore, I think that T within that comprehension should refer to the class-scoped T and have the value "class". The conclusions in #104374 aren't binding here, because annotation scopes (and their special ability to see enclosing class scopes) are new and already break the previous rules about class scope visibility.

If we change this conclusion, that also eliminates the weirdness in the fourth bullet point, and I think results in a reasonable and consistent set of rules.

@JelleZijlstra
Copy link
Member

@carljm your approach would also have a confusing effect. Given this code:

x = "global"
class C:
    x = "class"
    def meth(self, y: [x for _ in (1,)]): ...

The annotation for the y parameter would be ["global"], as comprehensions in the class scope can't see the class scope.

But now consider this:

x = "global"
class C:
    x = "class"
    def meth[T](self, y: [x for _ in (1,)]): ...

Where we made meth generic. Now with your rule, the comprehension suddenly can see the class scope, and the annotation for y is ["class"]. That seems counterintuitive.

@carljm
Copy link
Member

carljm commented Sep 11, 2023

Agreed. Of course, it's also counterintuitive if making the method generic renders the previously-working comprehension a SyntaxError :/ It's hard to introduce a new intervening scope with new visibility rules, and also make everything nested inside that new scope behave the same as if it weren't there.

If the priority is to avoid adding generics changing the meaning of existing code (which does make sense), then it may be that the best (longer-term) option is to bite your fourth bullet and accept that annotation scopes inside class scopes have non-transitive visibility for class-scoped names.

@carljm
Copy link
Member

carljm commented Sep 11, 2023

PEP 649 will also change the landscape here, since it will mean that all annotations (not just those on generics) run in annotation scopes...

@JelleZijlstra
Copy link
Member

JelleZijlstra commented Sep 11, 2023

Another possibility is to make it so that comprehensions in annotation scopes in classes cannot see the annotation scope at all, so in this code:

T = "global"
class X:
    T = "class"
    def meth[T](x: [T for _ in (1,)]): pass

T would refer to global, not class or meth's type param. That avoids some of the weirdness from the other ideas, but it's unexpected that the type param T isn't visible inside the comprehension. Then again, it's consistent with how class scopes work in general.

Agree that PEP 649 makes it more important to fix this in a non-SyntaxError way for 3.13.

@carljm
Copy link
Member

carljm commented Sep 11, 2023

Another possibility is to make it so that comprehensions in annotation scopes in classes cannot see the annotation scope at all

I think if we did this, we might need to make it the general rule, not just "in classes." Otherwise it's too surprising that the visibility of T in def meth[T](x: [T for _ in [1]]): pass would depend on whether that def is inside a class, even though no class-scoped name is involved. (Unless I guess you consider the T in meth[T] to be a class-scoped name definition, if the def is in a class? But that doesn't seem intuitive.)

But we can't make that the general rule after 3.12 is released, if we are only making nested scopes in annotation scopes within classes a SyntaxError right now.

@JelleZijlstra
Copy link
Member

I think if we did this, we might need to make it the general rule, not just "in classes."

Not necessarily. Right now, this code:

T = 1
[T for _ in (1,)]

Would produce [1] in a function or module scope, but a NameError (assuming T isn't defined elsewhere) in a class scope. The behavior for annotation scopes would be analogous.

@carljm
Copy link
Member

carljm commented Sep 11, 2023

Yes, this is what I meant by "Unless I guess you consider..." in my previous comment. I am not sure those two cases are equivalent (in terms of whether it's intuitive for their context to matter), because the one involves an explicit assignment in a separate statement. But I agree that it could be argued they are.

JelleZijlstra added a commit that referenced this issue Sep 12, 2023
#109196)

Fixes #109118. Fixes #109194.

Co-authored-by: Carl Meyer <carl@oddbird.net>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Sep 12, 2023
…classes (pythonGH-109196)

Fixes pythonGH-109118. Fixes pythonGH-109194.

(cherry picked from commit b88d9e7)

Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Co-authored-by: Carl Meyer <carl@oddbird.net>
@JelleZijlstra
Copy link
Member

Reopening until 3.12 backport is in

@JelleZijlstra JelleZijlstra reopened this Sep 12, 2023
Yhg1s pushed a commit that referenced this issue Sep 12, 2023
… function (GH-109123) (#109173)

* gh-109118: Fix runtime crash when NameError happens in PEP 695 function (#109123)

(cherry picked from commit 17f9941)

* [3.12] gh-109118: Fix runtime crash when NameError happens in PEP 695 function (GH-109123).
(cherry picked from commit 17f9941)

Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Yhg1s pushed a commit that referenced this issue Sep 12, 2023
… classes (GH-109196) (#109297)

gh-109118: Disallow nested scopes within PEP 695 scopes within classes (GH-109196)

Fixes GH-109118. Fixes GH-109194.

(cherry picked from commit b88d9e7)

Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
Co-authored-by: Carl Meyer <carl@oddbird.net>
vstinner pushed a commit to vstinner/cpython that referenced this issue Sep 13, 2023
@carljm carljm closed this as completed Sep 14, 2023
@JelleZijlstra
Copy link
Member

Reopening this to consider changing some of the behavior for Python 3.13, where actual annotations will also be implemented using annotation scopes (PEP-649; cc @larryhastings).

I now think the right behavior is that if a scope is nested in an annotation scope that is itself in a class scope, it can see names defined in the annotation scope, but not names defined in the enclosing class scope. After all, that's how generic methods currently work:

T = "global T"
U = "global U"
class C:
    T = "class T"
    U = "class U"
    def meth[T](self):
        print(T, U)
C().meth()

This currently picks up the type variable for T and the global for U.

This is also the natural behavior of the current implementation, at least for lambdas. I'll send a PR first to change the behavior for lambdas, then work on comprehensions.

@JelleZijlstra JelleZijlstra reopened this Apr 18, 2024
@JelleZijlstra JelleZijlstra changed the title SystemError with generator expression within PEP 695 scope within class Annotation scopes containing nested scopes Apr 18, 2024
@JelleZijlstra JelleZijlstra removed release-blocker 3.12 bugs and security fixes labels Apr 18, 2024
JelleZijlstra added a commit to JelleZijlstra/cpython that referenced this issue Apr 18, 2024
JelleZijlstra added a commit that referenced this issue Apr 28, 2024
…hout inlining (#118160)

Co-authored-by: Carl Meyer <carl@oddbird.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.13 new features, bugs and security fixes type-bug An unexpected behavior, bug, or error
Projects
Development

Successfully merging a pull request may close this issue.

4 participants