Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

F841 preview changes to tuple unpacking #8884

Open
hauntsaninja opened this issue Nov 28, 2023 · 15 comments
Open

F841 preview changes to tuple unpacking #8884

hauntsaninja opened this issue Nov 28, 2023 · 15 comments
Labels
needs-decision Awaiting a decision from a maintainer rule Implementing or modifying a lint rule

Comments

@hauntsaninja
Copy link
Contributor

hauntsaninja commented Nov 28, 2023

#8489 changes F841 so that it complains about unused variables resulting from tuple unpacking.

There are a few issues:

  • It's not clear to all folks that this can be disabled by prefixing the variable with an underscore (and unlike other cases, you can't just get rid of the variable / var name is often good documentation)
  • An autofix to automatically add the underscore prefix would help! Implemented in F841: support fixing unused assignments in tuples by renaming variables #9107
  • We actually currently mark F841 as unfixable, because it often keeps the RHS around (in case of side effect), and this is usually not the fix we want. So maybe both autofix and a custom error message?
  • Maybe best option is separate error code. A good parallel is how B007 is separate from F841. This also helps if you don't find the new lint useful, which I don't really
@zanieb
Copy link
Member

zanieb commented Nov 29, 2023

Thanks for issue.

I think the first step may be to just change the violation message to suggest using a _.

It seems like a separate error code might help with your unfixable goal, but I'm a little hesitant. Like... what if you have

unused_a, unused_b = foo()

We should probably still allow a fix for this to remove both variables as they would for

unused_c = foo()

Perhaps you could try demoting the fix for F841 to unsafe?

@zanieb zanieb added good first issue Good for newcomers rule Implementing or modifying a lint rule preview Related to preview mode features labels Nov 29, 2023
@hauntsaninja
Copy link
Contributor Author

hauntsaninja commented Nov 29, 2023

Thanks, yup, custom error message mentioning underscore prefix is definitely a good step and is maybe enough to solve this issue.

Note one reason why separate error code is viable here is that flake8 doesn't complain about this tuple unpacking case.

The reason I mention the unfixable thing is that we use ruff's autofixes in pre-commit, and I would always want the (not yet written) autofix for adding an underscore in tuple unpacking, but I never want the autofix that only removes LHS, and this is probably a weak argument in favour of separate error code.

@bluetech
Copy link
Contributor

bluetech commented Dec 4, 2023

I'd like to voice my support a separate error code for unused variable in unpacking. The regular F841 is unambiguously useful, but having to add underscores for the unpacking case is just noise in most cases.

There are some cases where it might be useful, e.g. if trying to mimic Go-style error handling in Python, foo, err = func(), don't want err to go unused. So I'm not suggesting to remove the feature entirely.

Other languages have a special treatment for a _ identifier which just skips the binding, but there's no such thing in Python. Two issues with using _ anyway: it is often used as a gettext() alias, and if it's used for more than one type then mypy complains.

@zanieb
Copy link
Member

zanieb commented Dec 4, 2023

@bluetech I think the proposal here is that we'd avoid raising a violation for x, _y = func() as well as x, _ = func() which should avoid the rebinding issue.

@bluetech
Copy link
Contributor

bluetech commented Dec 4, 2023

@zanieb I may be misunderstanding your message, but as far as I can see, x, _y = func() and x, _ = func() already don't raise a violation. My main trouble is with a, unused_b = func(), the current fixes for this are:

  • a, _unused_b = func() - noise, I prefer to just keep it a, unused_b = func() in this case -- also helpful in case unused_b actually does find some use.

  • a, _ = func() - conflict with gettext, rebinding issue (in case there's more than one tuple unpacking, or multiple _ in the same unpacking). Also I think it's nice to let the reader know what they're missing, sometimes.

(To be clear, I'll completely accept an answer that my preference is too specific and not worth the complexity and I should just use a, _unused_b = func() and get over it 😀 )

@zanieb
Copy link
Member

zanieb commented Dec 5, 2023

Hm. Personally, I don't see a violation for a, unused_b = func() as any noiser than the one for unused = func() — I'm not sure why tuple unpacking deserves a special case for unused variables. Are there other ecosystems that use separate rule codes for these cases? I find prefixing unused variables from unpacked tuples in Rust as very clear and consistent with other unused variable diagnostics.

@hauntsaninja
Copy link
Contributor Author

hauntsaninja commented Dec 5, 2023

The difference is that (in codebases I work on) it's pretty common to unpack tuples and not use all the elements. There isn't dead code to be removed and so the signal to noise ratio ends up being different (especially without autofix).

The other difference is that the autofix for the old behaviour is something I don't usually want (it's too conservative and can end up leaving dead code confusingly looking like it has side effects), whereas an autofix for new behaviour is something I'd always want.

Note that ruff already has different codes for different kinds of unused variables: B007 is already a different error code.

@bluetech
Copy link
Contributor

bluetech commented Dec 5, 2023

To add my answer:

The difference is, with unused = func() the obvious fix is to just remove unused = , which is clearly better. But with a, unused_b = func(), you have to fix in one of the two ways I mentioned above, and I tried to explain why I don't like them.

In Rust _ is special non-binding identifier (underscore expression), and gettext _('my string') alias is not a thing there AFAIK, so the main Python downsides are fixed.

@TeamSpen210
Copy link
Contributor

For a separate error code, there is an edge case which may want to be treated differently - if all members in the unpacking are unused, it probably should still produce F841.

@hauntsaninja
Copy link
Contributor Author

hauntsaninja commented Dec 23, 2023

Maybe... note such an unpacking isn't safe to remove entirely because it acts as an assertion on length.

@zanieb @charliermarsh would you accept a PR moving this check to a different error code? (There's no other way of getting my ideal autofix behaviour here is there?)

@charliermarsh
Copy link
Member

@hauntsaninja -- I'm fine with moving this to a separate rule like unused-tuple-element since B007 is something of a precedent there and the fix is indeed different. Others may disagree (dunno if @zanieb will agree) but at least from my perspective it seems reasonable. I probably would want F841 to continue flagging tuples that are entirely unused (as it does on stable, IIRC).

@charliermarsh
Copy link
Member

To be clear, your "ideal autofix" is that we do offer the "prefix unused tuple elements" fix but have a way to disable the "remove the left-hand side" autofix?

@hauntsaninja
Copy link
Contributor Author

hauntsaninja commented Dec 23, 2023

Yes, that is my ideal autofix behaviour.

Context:
We're a big user of autofixing (including unsafe fixes) and the "remove the left hand side" autofix for normal unused variables is a little too conservative for us. It makes pure function calls look impure, which then becomes confusing to future readers of the code, so we had F841 marked as unfixable. Whereas the unpacking lint is always fine to autofix (and in my opinion, is only really tolerable when autofixed, given how commonly it occurs for us and the tiny value it provides).

@charliermarsh charliermarsh mentioned this issue Jan 11, 2024
13 tasks
@MichaReiser MichaReiser added this to the v0.2.0 milestone Jan 19, 2024
@dhruvmanila dhruvmanila modified the milestones: v0.2.0, v0.3.0 Feb 12, 2024
@MichaReiser MichaReiser modified the milestones: v0.3.0, v0.4 Feb 14, 2024
@MichaReiser MichaReiser added needs-decision Awaiting a decision from a maintainer and removed good first issue Good for newcomers preview Related to preview mode features labels Mar 28, 2024
@MichaReiser
Copy link
Member

I'm returning this to "needs-decision" after we've discussed this more internally. From my notes:

  • It’s unclear to me why we should have multiple rules to track unused variables. I agree that B007 sets some precedence but as far as I understand it’s more narrow in that the variable might be used outside of the loop, but it isn’t used inside of the loop.
  • I understand that the main motivation for a separate rule is to allow different fixability. I don't know if that's reason enough and could set a bad precedence. I think we should consider alternative options to address this specific concern that doesn't require splitting rules.

We concluded that we need to develop a guideline that specifies when it is okay to have multiple rules that detect the same or very similar code smell and when it is not.

@MichaReiser MichaReiser removed this from the v0.4.0 milestone Apr 4, 2024
@AlexWaygood
Copy link
Member

AlexWaygood commented Aug 12, 2024

I would support splitting this out into a separate check. In general, I think it's very useful for users for rules to be as fine-grained as possible so that they can easily select exactly which diagnostics and fixes they find to be helpful for them.

This also seems to be to be distinct enough that I think users would understand why this would be a separate check. For an assignment that doesn't involve unpacking, such as x = foo(), or x: str = foo(), it's nearly always a mistake to leave the variable unused. You'll either want to delete the line entirely, or (if the function has side effects), call the function without assigning it to a variable. But unused variables that were assigned via unpacking are really more a stylistic error than something that you can say with confidence was a "mistake". Giving unpacked variables descriptive names ((a, b, c, foo)) can be considered much more readable than unpacking a tuple as (_, _, _, foo).

  • I understand that the main motivation for a separate rule is to allow different fixability. I don't know if that's reason enough and could set a bad precedence.

I think for a lot of our users, the ability we have to provide autofixes is a key value add over other tools in the Python ecosystem. I know when I first started using Ruff, this was a far bigger attraction than the performance gains that Ruff boasted. I think changes that make it easier for people to enable autofixes on their code are pretty valuable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-decision Awaiting a decision from a maintainer rule Implementing or modifying a lint rule
Projects
None yet
Development

No branches or pull requests

8 participants