Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lint unused variable assignment / dead store (false negative unused-variable?) #5838

Open
richardebeling opened this issue Feb 25, 2022 · 14 comments
Labels
C: unused-variable Enhancement ✨ Improvement to a component High effort 🏋 Difficult solution or problem to solve Needs specification 🔐 Accepted as a potential improvement, and needs to specify edge cases, message names, etc.

Comments

@richardebeling
Copy link

richardebeling commented Feb 25, 2022

(Issue altered after the discussions up to #5838 (comment) to better reflect intention)

Problem description

Pylint will give unused-variable / W0612 when code assigns a variable that is never used, but this message will not appear if there was any usage of the variable in the past. However, in code, I'd consider it a code smell to assign an unused value to a variable, even if the variable was used before. Consider this code:

#! /usr/bin/env python3
# pylint: disable=missing-docstring

def some_func():
    var = 3
    print(var)
    var = 4
    # var is never used again. The last assignment thus could be removed.

In c++ code, clang-tidy would warn about the dead store.

Expected behavior

Initially, I had considered this to be a case for unused-variable. As @jacobtylerwalls pointed out, the variable is technically used, so maybe it should be a different message.

************* Module test
test.py:6:4: W0612: Unused variable 'var' (unused-variable)

-------------------------------------------------------------------
Your code has been rated at 6.67/10 (previous run: 10.00/10, -3.33)

Pylint version

pylint 2.13.0-dev0
astroid 2.9.3
Python 3.9.7 (default, Sep 10 2021, 14:59:43) [GCC 11.2.0]

(installed from main at 6622d10)

@richardebeling richardebeling added the Needs triage 📥 Just created, needs acknowledgment, triage, and proper labelling label Feb 25, 2022
@Pierre-Sassoulas Pierre-Sassoulas added False Negative 🦋 No message is emitted but something is wrong with the code and removed Needs triage 📥 Just created, needs acknowledgment, triage, and proper labelling labels Feb 25, 2022
@Pierre-Sassoulas
Copy link
Member

Thank you for opening the issue after checking on the latest main branch :)

@jacobtylerwalls
Copy link
Member

Interesting case to think about. What definition would we rely on for "used", then? var was used in the course of getting the max of itself and 1. What about that is not "use"? 🤔

@Pierre-Sassoulas
Copy link
Member

Hmm, I marked as false negative too fast, I think @jacobtylerwalls is right.

@richardebeling
Copy link
Author

I was thinking of "unused" as "the value you assign here is never used afterwards", inspired by how clang-tidy would lint it as a dead store in c++ code.

I thought pylint would lint such cases in general, but just learned that it doesn't. In this case, this is not really a bug report. Would it make sense to open a feature request for this?

@Pierre-Sassoulas Pierre-Sassoulas added Discussion 🤔 Proposal 📨 and removed False Negative 🦋 No message is emitted but something is wrong with the code labels Feb 25, 2022
@Pierre-Sassoulas Pierre-Sassoulas added this to the 2.14.0 milestone Feb 25, 2022
@DanielNoord
Copy link
Collaborator

I think this could be a feature request for the CodeStyle checker. Although obviously impacting performance I think such intermediate assignment are often used for clarity and readability and I don't think we should warn against them outright.

@richardebeling richardebeling changed the title False negative unused-variable on reassignment depending on old value Lint unused variable assignment / dead store (false negative unused-variable?) Feb 25, 2022
@richardebeling
Copy link
Author

richardebeling commented Feb 25, 2022

I think this could be a feature request for the CodeStyle checker. Although obviously impacting performance I think such intermediate assignment are often used for clarity and readability and I don't think we should warn against them outright.

Just to make sure there's no misunderstanding, let me clarify: My problem is not with immediate reassignment. This is fine because var is used after its last assignment:

def some_func():
    var = some_computation_that_gives_immediate_result()
    var = some_follow_up_computation_that_gives_final_result(var)
    print(var)

My problem is with assignments that are not used:

def some_func():
    var = 3
    print(var)
    var = 4
    # var now goes out of scope, so the last value assigned will never be used.

The code I initially stumbled upon just happened to combine both:

def some_func():
    var = some_computation_that_gives_immediate_result()
    var = some_follow_up_computation_that_gives_final_result(var)
    # again, value stored to var is unused

I initially thought pylint would already lint this, and the self-reassignment was a special case, but turns out it is not. I changed the issue title and text to better reflect my actual intention (hope that's fine with the maintainers).

@DanielNoord
Copy link
Collaborator

DanielNoord commented Feb 26, 2022

Ah, this is a different example and I actually think this is a nice enhancement. Although I wonder why we don't already see this.

@jacobtylerwalls Thoughts? The second example (which is now also in the OP) seems to be something we could warn against?

@jacobtylerwalls
Copy link
Member

It's an interesting idea, for sure.

Although I wonder why we don't already see this.

The variable consumers (per scope) are all or nothing, per name. There's not a concept of "name is used for part of the function/module and not used in later parts".

A really crude diff to surface test failures would be

diff --git a/pylint/checkers/variables.py b/pylint/checkers/variables.py
index 1781866a..dbadadbf 100644
--- a/pylint/checkers/variables.py
+++ b/pylint/checkers/variables.py
@@ -1434,7 +1434,9 @@ class VariablesChecker(BaseChecker):
                 # They will have already had a chance to emit used-before-assignment.
                 # We check here instead of before every single return in _check_consumer()
                 nodes_to_consume += current_consumer.consumed_uncertain[node.name]
-                current_consumer.mark_as_consumed(node.name, nodes_to_consume)
+                current_consumer.mark_as_consumed(node.name,
+                    [n for n in nodes_to_consume if n.lineno is None or n.lineno <= stmt.lineno]
+                )
             if action is VariableVisitConsumerAction.CONTINUE:
                 continue

I get 32 failures, some of which look good, but others look dicey such as:

https://github.com/PyCQA/pylint/blob/5bdd5034c556d2d4986a365a666e923bccef2a94/tests/functional/a/access/access_attr_before_def_false_positive.py#L29-L38

Is the last maxLength used or unused here? How do we decide?

@Pierre-Sassoulas
Copy link
Member

Is the last maxLength used or unused here? How do we decide?

I think in this example it's really unused, it's used line 37 but only to change the value line 38. Then the value line 38 is not returned or used in another variable. So for this example it's really a nice check.

But in the general case I'm pretty sure we can have false positive, probably when analyzing function that have side effects ?

@jacobtylerwalls
Copy link
Member

But in the general case I'm pretty sure we can have false positive, probably when analyzing function that have side effects ?

Right, I was concerned about side effects also.

So for this example it's really a nice check.

I guess I thought this example was fine. How might you suggest rewriting it? Other than just max(len(x) for x in ...), which sort of defeats the example. I guess the proposed checker would need to treat anything under a loop as "used".

@Pierre-Sassoulas
Copy link
Member

Pierre-Sassoulas commented Feb 27, 2022

How might you suggest rewriting it?

The example is fine. Let's construct some almost similar one. I'm thinking as I write here.

So I guess this should not raise anything:

     def readUntilArray(self, matches, _=None): 
         self.process_rawq() 
         maxLength = 0
         for match in matches: 
             if len(match) > maxLength:
                 maxLength = len(match)
         return maxLength 

Then we'd have to be able to tell that the code has no side_effect to be sure that maxLength is unused here ?

     def readUntilArray(self, matches, _=None): 
         self.process_rawq() 
         maxLength = 0 #[unused-variable] ?
         for match in matches:
             do_something_so_loop_isnt_useless(match)
             if len(match) > maxLength:
                 maxLength = len(match)

Here it's almost the same than before but we really use maxLength in the conditional for a variable that is used later. So it's unused but only inside the if of the for loop (?)

     def readUntilArray(self, matches, _=None): 
         self.process_rawq() 
         maxLength = 0
         other_variable = None
         for match in matches: 
             if len(match) > maxLength: # maxLength is used to calculate other_variable here
                 maxLength = len(match) # [unused-variable] ?
                 other_variable = 1
         return other_variable 

I'm pretty sure I'm missing dozen of tricky/problematic constructs though.

@DanielNoord
Copy link
Collaborator

@jacobtylerwalls Is this supposed to be on the 2.14 milestone? Or should it be removed?

@jacobtylerwalls jacobtylerwalls modified the milestones: 2.14.0, 2.15.0 May 6, 2022
@jacobtylerwalls
Copy link
Member

Yeah I don't know that it needs to be on any milestone, really. I tried to summarize what I know about it so far!

@Pierre-Sassoulas Pierre-Sassoulas added Needs specification 🔐 Accepted as a potential improvement, and needs to specify edge cases, message names, etc. Enhancement ✨ Improvement to a component labels Jul 5, 2022
@nickdrozd
Copy link
Contributor

Unused assignment can come before or after use. Hopefully they will have the same solution, but they should both be fixed:

def unused_before():
    x = 4  # bad
    x = 5
    print(x)

def unused_after():
    x = 4
    print(x)
    x = 5  # bad

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C: unused-variable Enhancement ✨ Improvement to a component High effort 🏋 Difficult solution or problem to solve Needs specification 🔐 Accepted as a potential improvement, and needs to specify edge cases, message names, etc.
Projects
None yet
Development

No branches or pull requests

5 participants