Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve super() error message and document zero-arg super() usage inside nested functions and generator expressions. #113212

Closed
WolframAlph opened this issue Dec 16, 2023 · 9 comments
Labels
type-feature A feature request or enhancement

Comments

@WolframAlph
Copy link
Sponsor Contributor

WolframAlph commented Dec 16, 2023

Bug report

Bug description:

I tried searching for something related but I could not find exactly this case with super. Maybe I missed… so posting it here.
Following code works fine:

class A:
    def foo(self):
        return 1


class B(A):
    def foo(self):
        return [super().foo() for _ in range(10)]


print(B().foo())

But if you modify it to be like this:

class A:
    def foo(self):
        return 1


class B(A):
    def foo(self):
        return list(super().foo() for _ in range(10))


print(B().foo())

you will get following error:

TypeError: super(type, obj): obj must be an instance or subtype of type

Before #101441 , [super().foo() for _ in range(10)] would cause the same error but it was fixed implicitly by PEP709.
You can fix this yourself by specifying type and instance in super():

class A:
    def foo(self):
        return 1


class B(A):
    def foo(self):
        return list(super(B, self).foo() for _ in range(10))


print(B().foo())

If you decide that fixing this makes sense, can I work on fixing it? To me it makes sense (if you decide to leave it as it is) at least to give some kind of warning or error that in this case you should explicitly provide type and instance to super, because it is quite confusing why user gets this error as semantically this is valid and should work

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux, macOS

Linked PRs

@WolframAlph WolframAlph added the type-bug An unexpected behavior, bug, or error label Dec 16, 2023
@ericvsmith
Copy link
Member

cc @carljm

@WolframAlph
Copy link
Sponsor Contributor Author

WolframAlph commented Dec 16, 2023

After digging into source code I found out that super takes its object argument from fist local variable which is self under normal circumstances.

cpython/Objects/typeobject.c

Lines 10454 to 10472 in f26bfe4

PyObject *firstarg = _PyFrame_GetLocalsArray(cframe)[0];
// The first argument might be a cell.
if (firstarg != NULL && (_PyLocals_GetKind(co->co_localspluskinds, 0) & CO_FAST_CELL)) {
// "firstarg" is a cell here unless (very unlikely) super()
// was called from the C-API before the first MAKE_CELL op.
if (_PyInterpreterFrame_LASTI(cframe) >= 0) {
// MAKE_CELL and COPY_FREE_VARS have no quickened forms, so no need
// to use _PyOpcode_Deopt here:
assert(_PyCode_CODE(co)[0].op.code == MAKE_CELL ||
_PyCode_CODE(co)[0].op.code == COPY_FREE_VARS);
assert(PyCell_Check(firstarg));
firstarg = PyCell_GET(firstarg);
}
}
if (firstarg == NULL) {
PyErr_SetString(PyExc_RuntimeError,
"super(): arg[0] deleted");
return -1;
}

So I can trigger unpredicted behaviour with the following code:

class Base:
    def foo(self):
        return self.val


class Derived(Base):
    def __init__(self, val):
        self.val = val

    def foo(self):
        def inner(other: Derived):
            return super().foo() + other.val
        return inner


b1 = Derived(1)
b10 = Derived(10)

print(b1.foo()(b10))

Expected output is 11 but you get 20 because super looks at first local var and finds other.
if you pass in some other value. For instance:

print(d1.foo()(1))

This will throw:

TypeError: super(type, obj): obj must be an instance or subtype of type

which could be extremely confusing to user.

@carljm
Copy link
Member

carljm commented Dec 17, 2023

Hi! Thanks for the report. This report covers a lot of territory, and it will be useful to clarify a few things.

Most of the behavior you describe here has been true for the entire life of Python 3, since zero-argument super() was added to the language. Zero-argument super() has never worked inside a nested function (except "by accident" in a carefully constructed case where the nested function's first argument happens to be an instance of the right type, as you note), nor has it ever worked in a generator expression or comprehension, because these have always been implemented as nested functions. (Until Python 3.12, where comprehensions are no longer nested functions.)

So the only recent change in behavior here is in your "list comprehension" example, where code that raised a TypeError in all previous versions of Python 3 now works in Python 3.12, because the comprehension no longer implicitly creates a nested function. All of your other examples behave the same in all versions of Python 3; none of them worked in 3.11 and stopped working in 3.12.

The first and most important question: is the change in Python 3.12 a regression, and should we attempt to restore the Python 3.11 behavior? I don't think so: there is no value to the previous TypeError, and the new behavior is both more intuitive and more useful. So I consider this a beneficial side effect of PEP 709 that doesn't require any further action.

Next question: should any of the other longstanding behaviors of zero-argument super() described here be considered a bug? This is debatable. Unfortunately the documentation is no help here, as it doesn't discuss zero-arg super() in nested functions or generator expressions at all. All else equal, I do think it would be more intuitive and more useful if zero-arg super() worked inside a generator expression. But ultimately because this has never worked, there is no documentation claiming that it should work, and because supporting it would require a fairly complex and invasive implementation that would be too risky to backport to stable versions of Python, I think we must consider this a request for a new feature, not a bug. I'm personally not all that excited about the feature, because I think the complexity it would add to the implementation would outweigh the value of it.

Last question: are there any improvements to documentation or error messages warranted here? I think so: the current documentation for zero-argument super seems quite sparse to me, and doesn't really describe precisely how it infers the values for the two arguments. I think it would be useful for the documentation to clarify that zero-argument super uses the first argument of the immediately enclosing function as the "second argument" to super(), and thus it must be an instance or subtype of the enclosing class, and specifically clarify that this means zero-arg super() won't work as expected inside a nested function or generator expression.

And I also think the error message "super(type, obj): obj must be an instance or subtype of type" could be clearer. The biggest improvement IMO would be for the error to name the specific type and the type of obj; this alone would give the user a much better chance of understanding what happened and how to fix it. I also think we could consider using a different error message for zero-arg super() than for explicit-arg super(); then the former could be clearer about where the relevant values come from. E.g. "super(): first argument of enclosing function must be an instance or subtype of enclosing class definition."

@carljm carljm added type-feature A feature request or enhancement and removed type-bug An unexpected behavior, bug, or error labels Dec 17, 2023
@carljm carljm changed the title super() not working inside generators allow zero-arg super() to work inside generator expressions Dec 17, 2023
@WolframAlph
Copy link
Sponsor Contributor Author

Thanks for commenting on this. I do agree that making it work inside nested functions/generators and generator expressions is not straightforward and would require careful design, maybe even worth a separate PEP 🤔 ? Currently super will find it's first argument correctly since __class__ is placed implicitly:

cpython/Python/compile.c

Lines 1314 to 1323 in d91e43e

if (u->u_ste->ste_needs_class_closure) {
/* Cook up an implicit __class__ cell. */
Py_ssize_t res;
assert(u->u_scope_type == COMPILER_SCOPE_CLASS);
res = dict_add_o(u->u_metadata.u_cellvars, &_Py_ID(__class__));
if (res < 0) {
compiler_unit_free(u);
return ERROR;
}
}

and copied every time from closure to nested function/generator, generator expression by:
TARGET(COPY_FREE_VARS) {
frame->instr_ptr = next_instr;
next_instr += 1;
INSTRUCTION_STATS(COPY_FREE_VARS);
/* Copy closure variables to free variables */
PyCodeObject *co = _PyFrame_GetCode(frame);
assert(PyFunction_Check(frame->f_funcobj));
PyObject *closure = ((PyFunctionObject *)frame->f_funcobj)->func_closure;
assert(oparg == co->co_nfreevars);
int offset = co->co_nlocalsplus - oparg;
for (int i = 0; i < oparg; ++i) {
PyObject *o = PyTuple_GET_ITEM(closure, i);
frame->localsplus[offset + i] = Py_NewRef(o);
}
DISPATCH();
}

so maybe implementing something similar with self would make sense? But again I agree with what you said:

I'm personally not all that excited about the feature, because I think the complexity it would add to the implementation would outweigh the value of it.

I also agree with:

I think it would be useful for the documentation to clarify that zero-argument super uses the first argument of the immediately enclosing function as the "second argument" to super(), and thus it must be an instance or subtype of the enclosing class, and specifically clarify that this means zero-arg super() won't work as expected inside a nested function or generator expression.

But this is true when you explicitly define function/generator. When you have generator expression (which is converted to generator implicitly), it's first argument internally would be range(10) in my example mentioned earlier. So if I modify my example to trick super like this:

class Base:
    def foo(self):
        return 1


class Derived(Base):
    def __init__(self, val=10):
        self.it = iter(range(val))

    def __iter__(self):
        return self
    
    def __next__(self):
        return next(self.it)

    def foo(self):
        return list(super().foo() for _ in Derived(10))


b1 = Derived()
print(b1.foo())

now first argument is in fact instance of enclosing class and everything starts working. But user has no idea how generator expression is implemented under the hood and that range would be it’s first argument. So I think this would be special case.

This would be very helpful:

And I also think the error message "super(type, obj): obj must be an instance or subtype of type" could be clearer. The biggest improvement IMO would be for the error to name the specific type and the type of obj; this alone would give the user a much better chance of understanding what happened and how to fix it.

Also agree with you:

I also think we could consider using a different error message for zero-arg super() than for explicit-arg super(); then the former could be clearer about where the relevant values come from. E.g. "super(): first argument of enclosing function must be an instance or subtype of enclosing class definition.”

So the only concern is generator expression as it is not obvious to user where that “first argument” comes from. But it can be deducted from potential improved error message revealing types of super arguments, so I am not sure what is best in this case 🤷 . Maybe it would make sense to check current frame owner (while resolving super arguments), and if it’s generator and super was called without arguments to give a warning of some kind? E.g. calling super without arguments inside generator, or something like that. Giving error would not be backward compatible I guess.

@gaogaotiantian
Copy link
Member

zero-arg super() is one of the dark magics that shocked me when I realized how it works. I agree we should have better documentations for it, but should we explain the details of how it actually works? I have a couple of concerns for that:

  1. It's not easy to understand, at least for casual Python users.
  2. What we write on the documentation is kind of a promise from us - it would be more difficult to change it if we wanted something else in the future (maybe we wouldn't, I mean it's been like this for a long time).

So, instead of explain how it works in detail, would it be better to just ask users to only use it (or not use it) in certain circumstances? We might get some flexibility in the future, and it probably is easier to understand to a lot of users. How we do that, is implementation detail.

@carljm
Copy link
Member

carljm commented Dec 19, 2023

@WolframAlph

Maybe it would make sense to check current frame owner (while resolving super arguments), and if it’s generator and super was called without arguments to give a warning of some kind? E.g. calling super without arguments inside generator, or something like that. Giving error would not be backward compatible I guess.

There's no backward-compatibility problem with giving a different exception message when zero-arg super() is used inside a generator expression and throws an error. We should not throw an error in the unusual case where it happens to work today; that would be a backwards-compatibility problem. Also for runtime-overhead reasons we would not want to do the "is this a generator expression?" check unless we were already in the error path.

But on the whole it's not clear to me that accidental use of zero-arg super() inside generator expressions is common enough to warrant all this special treatment. I just haven't seen large numbers of users confused by this.

@gaogaotiantian

So, instead of explain how it works in detail, would it be better to just ask users to only use it (or not use it) in certain circumstances? We might get some flexibility in the future, and it probably is easier to understand to a lot of users. How we do that, is implementation detail.

I think there's a continuum of options here, not just a binary choice. E.g. I definitely agree that we should simply say "enclosing class definition" and not go into details of how the __class__ cell works. On the other hand, I feel that "first argument of enclosing function" might be easier to understand than just saying "only use it inside methods and not inside nested functions or generator expressions" without any explanation of why. And I don't foresee any likelihood of a change to how zero-arg super() works that would mean it no longer uses the first argument of the enclosing function.

@WolframAlph
Copy link
Sponsor Contributor Author

@carljm

We should not throw an error in the unusual case where it happens to work today; that would be a backwards-compatibility problem.

Yeah, this is what I meant.

Also for runtime-overhead reasons we would not want to do the "is this a generator expression?" check unless we were already in the error path.

Yes, this makes sense. But my point was that generator expressions are still implicit generators under the hood and “first argument of immediately enclosing function” does not apply to them. So there could be some additional info in error if current frame is generator.

But on the whole it's not clear to me that accidental use of zero-arg super() inside generator expressions is common enough to warrant all this special treatment.

It's not accidental. User assumes it would work or at least tell him something meaningful in case it fails.

I just haven't seen large numbers of users confused by this.

This is not right IMHO. You cant let something work in an unpredicted way just because there is not enough people complaining about it. It’s like having a segfault in a module, but wait - that module is not used a lot so let's leave it like that. While my example can be a bit exaggerated, it makes a point. And I don’t think there is few people who encountered this. I did myself and I did not know what was wrong, so I filled in parameters explicitly and it was fixed. This is probably how it is done by users who encounter it. They just try to fill in parameters, it starts working and they just move on.

@carljm
Copy link
Member

carljm commented Dec 19, 2023

I'm not necessarily opposed to adding extra info to the error message when zero-arg super() fails inside a genexp; I'd like to see how much complexity the PR for it would add.

I definitely think the documentation update, whatever form it takes, should include specific mention of the fact that genexps create implicit nested functions and thus zero-arg super() doesn't work inside them. I think mentioning this in the docs, plus generally improving the error message as described above, might be sufficient without additional special-casing of the error message for genexps.

Also, one more point about this:

I don't foresee any likelihood of a change to how zero-arg super() works that would mean it no longer uses the first argument of the enclosing function.

If we were to change zero-arg super to use any other mechanism for finding self, it would almost certainly break real-world code and require a deprecation process. I don't think it's practically possible that this could ever be changed silently as an internal implementation detail. So I don't think there is any real future flexibility lost by documenting it.

WolframAlph added a commit to WolframAlph/cpython that referenced this issue Dec 20, 2023
WolframAlph added a commit to WolframAlph/cpython that referenced this issue Dec 20, 2023
WolframAlph added a commit to WolframAlph/cpython that referenced this issue Dec 20, 2023
WolframAlph added a commit to WolframAlph/cpython that referenced this issue Dec 20, 2023
WolframAlph added a commit to WolframAlph/cpython that referenced this issue Dec 20, 2023
WolframAlph added a commit to WolframAlph/cpython that referenced this issue Dec 20, 2023
WolframAlph added a commit to WolframAlph/cpython that referenced this issue Dec 20, 2023
WolframAlph added a commit to WolframAlph/cpython that referenced this issue Dec 20, 2023
@WolframAlph WolframAlph changed the title allow zero-arg super() to work inside generator expressions Improve super() error message and document zero-arg super() usage inside nested functions and generator expressions. Dec 20, 2023
@WolframAlph
Copy link
Sponsor Contributor Author

Changing issue title to reflect what is going to be done. Adding special treatment for generator expressions would introduce unnecessary complexity to PR so documenting it + improving error message on super check fail sounds like a good improvement.

WolframAlph added a commit to WolframAlph/cpython that referenced this issue Dec 20, 2023
WolframAlph added a commit to WolframAlph/cpython that referenced this issue Dec 20, 2023
WolframAlph added a commit to WolframAlph/cpython that referenced this issue Dec 20, 2023
WolframAlph added a commit to WolframAlph/cpython that referenced this issue Dec 20, 2023
WolframAlph added a commit to WolframAlph/cpython that referenced this issue Dec 20, 2023
WolframAlph added a commit to WolframAlph/cpython that referenced this issue Dec 20, 2023
WolframAlph added a commit to WolframAlph/cpython that referenced this issue Dec 20, 2023
WolframAlph added a commit to WolframAlph/cpython that referenced this issue Dec 21, 2023
WolframAlph added a commit to WolframAlph/cpython that referenced this issue Dec 21, 2023
WolframAlph added a commit to WolframAlph/cpython that referenced this issue Dec 21, 2023
WolframAlph added a commit to WolframAlph/cpython that referenced this issue Dec 21, 2023
WolframAlph added a commit to WolframAlph/cpython that referenced this issue Dec 22, 2023
WolframAlph added a commit to WolframAlph/cpython that referenced this issue Dec 22, 2023
WolframAlph added a commit to WolframAlph/cpython that referenced this issue Dec 22, 2023
serhiy-storchaka pushed a commit that referenced this issue Dec 22, 2023
ryan-duve pushed a commit to ryan-duve/cpython that referenced this issue Dec 26, 2023
kulikjak pushed a commit to kulikjak/cpython that referenced this issue Jan 22, 2024
aisk pushed a commit to aisk/cpython that referenced this issue Feb 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

5 participants