Add unused variables linter #1159

larsk21 · 2022-05-19T13:41:32Z

Add a linter for reporting unused variables in Lean code using Lean's linter API. A variable is reported as unused if it has a declaration, but no uses, except when

its name starts with an underscore
it is defined with the variable command
it appears in the signature of a constructor or structure field
it appears in the signature of a constant without a definition or an axiom
it appears in the signature of a function that is defined elsewhere (extern, implementedBy)

larsk21 · 2022-05-19T13:43:42Z

Examples of unused variables in this repo

Whether all of these should be reported as unused is open for debate.

unnecessary variables:

-- src/Lean/Elab/PreDefinition/Structural/IndPred.lean
def mkIndPredBRecOn (recFnName : Name) (recArgInfo : RecArgInfo) (value : Expr) : M Expr := do
  ...
  forallBoundedTelescope FType (some 1) fun below _ => do
    let main ← mkFreshExprSyntheticOpaqueMVar FType
     -- ^
    let below := below[0]
    let valueNew     ← replaceIndPredRecApps recFnName recArgInfo motive value
    let Farg         ← mkLambdaFVars (recArgInfo.indIndices ++ #[major, below] ++ otherArgs) valueNew
    let brecOn       := mkApp brecOn Farg
    return mkAppN brecOn otherArgs

unused pattern variables:

-- src/Lean/Data/Name.lean
def updatePrefix : Name → Name → Name
  | anonymous, newP => anonymous
            -- ^
  | str p s _, newP => Name.mkStr newP s
     -- ^
  | num p s _, newP => Name.mkNum newP s
     -- ^

unused loop variables:

-- src/Lean/Elab/BuiltinCommand.lean
private def popScopes (numScopes : Nat) : CommandElabM Unit :=
  for i in [0:numScopes] do
   -- ^
    popScope

~~unused function arguments:~~

-- src/Lean/Elab/Tactic/Simp.lean
def dsimpLocation (ctx : Simp.Context) (fvarIdToSimp : FVarIdToLemmaId := {}) (loc : Location) : TacticM Unit := do
  ...
where
  go (fvarIdsToSimp : Array FVarId) (simplifyTarget : Bool) (fvarIdToSimp : Lean.Meta.FVarIdToLemmaId) : TacticM Unit := do
                                                          -- ^
    let mvarId ← getMainGoal
    let result? ← dsimpGoal mvarId ctx (simplifyTarget := simplifyTarget) (fvarIdsToSimp := fvarIdsToSimp)
    match result? with
    | none => replaceMainGoal []
    | some mvarId => replaceMainGoal [mvarId]

variables that were meant to be used:

-- src/Lean/Data/Json/Basic.lean
def setObjVal! : Json → String → Json → Json
  | obj kvs, k, v => obj <| kvs.insert compare k v
  | j      , _, _ => panic! "Json.setObjVal!: not an object: {j}"
 -- ^                                                         ^ (missing `s!`)

-- src/Lean/Data/JsonRpc.lean
instance : ToString RequestID where
  toString
  | RequestID.str s => s!"\"s\""
               -- ^         ^ (probably missing `{}`)
  | RequestID.num n => toString n
  | RequestID.null => "null"

variables that are used in a structure constructor, but actually unnecessary:

-- src/Lean/Compiler/IR/Sorry.lean
def updateSorryDep (decls : Array Decl) : CompilerM (Array Decl) := do
  let (_, s) ← Sorry.collect decls |>.run {}
  return decls.map fun decl =>
    match decl with
    | Decl.fdecl f xs t b info =>
                       -- ^
      match s.localSorryMap.find? f with
      | some g => Decl.fdecl f xs t b { info with sorryDep? := some g }
                                     -- ^ (`sorryDep?` is the only field of the structure)
      | _ => decl
    | _ => decl

~~variables that are used as documentation in dependent arrows:~~

-- src/Lean/Elab/PreDefinition/WF/Fix.lean
private partial def processSumCasesOn (x F val : Expr) (k : (x : Expr) → (F : Expr) → (val : Expr) → TermElabM Expr) : TermElabM Expr := do
                                                          -- ^            ^            ^
  ...

variables that are used as documentation in Syntax patterns ($_ is not allowed):

-- src/Lean/Elab/Match.lean
private def getMatchGeneralizing? : Syntax → Option Bool
  | `(match (generalizing := true)  $[$motive]? $discrs,* with $alts:matchAlt*) => some true
                                    -- ^         ^              ^
  | `(match (generalizing := false) $[$motive]? $discrs,* with $alts:matchAlt*) => some false
                                    -- ^         ^              ^
  | _ => none

Kha

Very nice. Apart from the dependent arrows that we discussed before, to me all the examples are either clear wins or very reasonable to change.

src/Lean/Linter/Basic.lean

src/Lean/Linter/Util.lean

src/Lean/Linter/Basic.lean

src/util/shell.cpp

Kha · 2022-05-26T14:54:02Z

Leo pointed out the following special case to me, where at least currently there is no connection between the different h₂s in the info tree:

theorem foo (h₁ : x = 0) (h₂ : y = x) : y = 0 := by
  subst h₁
  -- We now have `h₂ : y = 0`
  exact h₂

This is probably an issue for quite a few (and potentially user-written) tactics, so I'm wondering if we should look for references on the Expr level after all. Perhaps in combination with the References module so as to avoid edge cases like

let mut x := 0  -- technically dead, but perhaps not helpful as an "unused variable" lint
if b then x := 1 else x := 2

In case we expect that there may be further issues in practice, there is also the option of merging the lint as opt-in for now so that people can test-drive it (which I assume can be done very easily with Lake's moreLean/ServerArgs) and we can further improve it without annoying too many people at the same time. In that case I'm fine with merging it as is.

src/Lean/Linter/Basic.lean

gebner · 2022-05-26T15:06:03Z

This is probably an issue for quite a few (and potentially user-written) tactics, so I'm wondering if we should look for references on the Expr level after all.

Oh I didn't realize this didn't even look at the expression. Then there's of course lots of obvious corner cases, here are a few off the top of my head that I haven't seen mentioned so far:

def f (x : Nat) := ‹_›  -- false positive
def g [Inhabited α] := 42  -- false negative
def h :=
  let inst : Inhabited Nat := ⟨42⟩ -- false positive
  (default : Nat)

larsk21 · 2022-05-27T08:48:14Z

Oh I didn't realize this didn't even look at the expression. Then there's of course lots of obvious corner cases, here are a few off the top of my head that I haven't seen mentioned so far:
def f (x : Nat) := ‹_›  -- false positive
def g [Inhabited α] := 42  -- false negative
def h :=
  let inst : Inhabited Nat := ⟨42⟩ -- false positive
  (default : Nat)

I agree that the α should be marked as an unused variable. For the other two, I'm unsure if these are false positives, since we could instead write:

def f (_ : Nat) := ‹_›
def h :=
  have : Inhabited Nat := ⟨42⟩
  (default : Nat)

The first example is similar to this test where HQ is also marked as unused. Currently the unused marker doesn't always mean that the value of the variable is not needed, but that variable is never referenced by name.

lean4/tests/lean/linterUnusedVariables.lean

Lines 8 to 11 in f8b1b21

    
           theorem implicitlyUsedVariable : P ∧ Q → Q := by 
        
             intro HPQ 
        
             have HQ : Q := by exact And.right HPQ 
        
             assumption

One problem is that this interpretation doesn't work when a variable is never referenced by name, but the name is still relevant:

theorem foo (h₁ : x = 0) (h₂ : y = x) : y = 0 := by
 subst h₁
 -- We now have `h₂ : y = 0`
 exact h₂

gebner · 2022-05-27T09:03:51Z

[...] we could instead write:

def f (_ : Nat) := ‹_›

This is a semantic change, with the previous version you could write f (x := 42), with this one you can't. Argument names are, for better or for worse, part of the public API.

def h :=
  have : Inhabited Nat := ⟨42⟩
  (default : Nat)

Usually you need let instead of have for local instances. Unfortunately let : Inhabited Nat := ⟨42⟩ doesn't work, but that's fixable.

larsk21 · 2022-05-27T09:14:32Z

Argument names are, for better or for worse, part of the public API.

I agree. As it is now, the linter reports function arguments as unused like any other variable - maybe I should change this.

Kha · 2022-05-27T09:29:06Z

Right, there are two slightly different potential lints: a) "this binder can be safely removed (it is unused)" and b) "this binder can be safely made anonymous (its name is unused)". b) is, as discussed, problematic in public signatures, and not really solvable for some tactics currently. So we should probably focus on a) using a combination of lexical and Expr-level references.

Kha · 2022-05-27T20:09:54Z

I'm wondering if we should look for references on the Expr level after all.

Easier said than done, of course. There is no direct association between the free variables from the info tree and the bound variables in the final term. However, since the above issues are mostly about tactics, it might be sufficient to go through all the terms assigned to TacticInfo mvars. We should make sure not to visit any mvar twice though to avoid quadratic overhead.

That would leave the instance binders. I don't immediately see a better solution than just whitelisting them.

Kha · 2022-06-01T11:56:55Z

Excellent work! Now I really think we should just merge it and then await feedback by users. Even for the many warnings in the stdlib, I can't yet tell if I would be annoyed enough by them that we should turn them off there, so let's just try and see I'd suggest.

If they do tend to be a bit too much in e.g. the stdlib (until we can, say, fix them automatically), we could look into suppressing them in patterns only, which seems to be the vast majority of cases.

Kha · 2022-06-01T12:00:53Z

If we do anticipate some degree of annoyance, we could also evaluate it first on other projects, such as mathlib4.

Btw, with the tactic fix in place, it should be safe to re-enable the lint for parameters, no? Because if a parameter is truly unused, that is definitely worth a warning.

src/Lean/Linter/Basic.lean

gebner · 2022-06-01T12:04:58Z

Is there something like the @[nolint unused_arguments] we have in mathlib to disable the lint? There's 84 definitions in mathlib where one of the arguments is intentionally unused.

Kha · 2022-06-01T14:02:07Z

Yes, I assume set_option linter.unusedVariables false (in) would work for that. I see that you ported the mathlib 3 infrastructure using an attribute, but using options has the distinct advantage that we can apply them at many different levels: packages (via moreLeanArgs), modules, section, declarations, and even subterms. On the other hand, the attribute sure can be more compact, which we could emulate using a no_lint command macro desugaring to the set_option code.

Kha · 2022-06-01T14:04:16Z

Then there is probably a larger conversation to be had about how the Lean and mathlib approaches to linters should be correlated/integrated/coexist, but maybe not in this PR.

gebner · 2022-06-01T14:36:08Z

I see that you ported the mathlib 3 infrastructure using an attribute

I didn't want to make any big refactorings yet. Most of the mathlib linters are 1) per declaration and 2) non-local. An attribute makes perfect sense here, since we need to run the linters with all of mathlib imported (when an option would have long been discarded).

but using options has the distinct advantage that we can apply them at many different levels:

That this is impossible is a feature, not a bug. Opting out of best practices should be annoying, and not happen accidentally. We could of course add a "no nolint on sections/modules" linter. 😄

Then there is probably a larger conversation to be had about how the Lean and mathlib approaches to linters should be correlated/integrated/coexist, but maybe not in this PR.

Indeed, it would be great if e.g. disabling a linter had the same syntax in both cases. But I don't see either of the two APIs subsume the other. Non-local checks (like checking for simp-normal form, or for typeclass diamonds) clearly don't fit into Lean concept of "lint the command syntax right here and now". Syntactical checks (like for line-length, comments, deprecated syntax, etc.) clearly don't fit into the mathlib concept of "check the full mathlib environment".

Even the unused variable linter doesn't neatly fit into either category. You need to look at the AST to flag by have x := 42; exact 1※. But on the other hand we also want to check declarations generated by metaprograms, which do not have an input syntax.

※ I just noticed that have behaves differently in Lean 4 than it did in Lean 3. In Lean 3 the have disappeared, so you could use it to declare local instances without affecting the elaborated term. But now it produces let_fun.

larsk21 · 2022-06-02T14:17:19Z

With these fine-grained options to disable linters, do we still want the unused variable pattern _x, i.e. variables starting with an underscore are ignored by the linter?

Kha · 2022-06-02T14:28:55Z

It's still useful for other kinds of variables, no? And I assume most code will (eventually) not make use of these options.

larsk21 · 2022-06-02T14:38:07Z

Sure, it can be used on every variable detected by the linter.

larsk21 · 2022-06-02T16:40:26Z

Some statistics from a stage2 build of the Lean repo:

category	number of unused variables	percentage of all unused variables
all	1337	100%
function arguments	353	26.4%
pattern variables	571	42.7%

Kha · 2022-06-02T16:56:53Z

Interesting, I would have thought it would be more pattern variables. In that case I'm fine with merging it as is, and I believe @leodemoura as well. @gebner Do you have anything in mind that should be addressed right now?

Opting out of best practices should be annoying, and not happen accidentally.

I don't feel that strongly about opting out btw. Clippy/Rust also supports opt-outs at the crate & module level. And it's still not really accidental, is it?

leodemoura · 2022-06-02T17:16:21Z

Interesting, I would have thought it would be more pattern variables. In that case I'm fine with merging it as is, and I believe @leodemoura as well. @gebner Do you have anything in mind that should be addressed right now?

Yes, I am fine with merging as is. I will be happy to address the unused variable warnings.

gebner · 2022-06-02T17:38:52Z

Do you have anything in mind that should be addressed right now?

I don't think there's anything which needs to be done right now. The PR doesn't really break anything, and set_option linter.unusedVariables false is always an option.

But we should have a plan on how to fix all the new warnings (almost every file has warnings now).

Clippy/Rust also supports opt-outs at the crate & module level. And it's still not really accidental, is it?

I'm not particularly concerned about configuring linters at the crate/package level. Linters need to be configured somewhere after all. However a section-opt-out can easily affect more declarations than originally intended, e.g. if new declarations are added to the section.

The set_option is even easier to accidentally misuse, all you need to forget are two characters and the linters are disabled for the rest of the file.

- ignore unused variables in dep arrows - avoid negated options - make syntax stack generation more performant - make ignore functions more extensible - change message severity to `warning`

Kha reviewed May 20, 2022

View reviewed changes

src/util/shell.cpp Show resolved Hide resolved

larsk21 force-pushed the linter/unused-variables branch 2 times, most recently from 3dca07e to f8b1b21 Compare May 23, 2022 16:21

larsk21 requested a review from Kha May 25, 2022 20:19

Kha reviewed May 26, 2022

View reviewed changes

src/Lean/Linter/Basic.lean Outdated Show resolved Hide resolved

larsk21 force-pushed the linter/unused-variables branch from bbe47f3 to 56d7e93 Compare May 31, 2022 14:38

Kha reviewed Jun 1, 2022

View reviewed changes

src/Lean/Linter/Basic.lean Show resolved Hide resolved

fix: run linters after elaboration

f092843

larsk21 added 10 commits June 3, 2022 11:33

fix: add additional information to Lean.Server.Reference

60a66e8

feat: add unused variables linter

765bb7d

fix: disable linters in tests

ce3f83a

fix: forward lean options to workers

4519a7c

fix: unused variables linter review comments

e92df38

- ignore unused variables in dep arrows - avoid negated options - make syntax stack generation more performant - make ignore functions more extensible - change message severity to `warning`

refactor: extend Lsp.ModuleRefs in Server.References

f56b1d3

fix: use findModuleRefs in unused variables linter

8d3e80c

fix: ignore exposed function arguments in unused variables linter

11fe65a

fix: consider tactic mvar assignments for used variables

c1fab4d

feat: add nolint options for function arguments and pattern variables

301e8fb

Kha force-pushed the linter/unused-variables branch from b1cfd2d to 301e8fb Compare June 3, 2022 11:03

Kha merged commit caa8804 into leanprover:master Jun 3, 2022

larsk21 deleted the linter/unused-variables branch June 3, 2022 11:50

Kha mentioned this pull request Dec 24, 2023

linter.unusedVariables false negative within cases #3112

Closed

1 task

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add unused variables linter #1159

Add unused variables linter #1159

larsk21 commented May 19, 2022

larsk21 commented May 19, 2022 •

edited

Loading

Kha left a comment

Kha commented May 26, 2022 •

edited

Loading

gebner commented May 26, 2022

larsk21 commented May 27, 2022 •

edited

Loading

gebner commented May 27, 2022

larsk21 commented May 27, 2022 •

edited

Loading

Kha commented May 27, 2022 •

edited

Loading

Kha commented May 27, 2022

Kha commented Jun 1, 2022

Kha commented Jun 1, 2022

gebner commented Jun 1, 2022

Kha commented Jun 1, 2022

Kha commented Jun 1, 2022 •

edited

Loading

gebner commented Jun 1, 2022

larsk21 commented Jun 2, 2022

Kha commented Jun 2, 2022

larsk21 commented Jun 2, 2022

larsk21 commented Jun 2, 2022

Kha commented Jun 2, 2022

leodemoura commented Jun 2, 2022

gebner commented Jun 2, 2022

Add unused variables linter #1159

Add unused variables linter #1159

Conversation

larsk21 commented May 19, 2022

larsk21 commented May 19, 2022 • edited Loading

Examples of unused variables in this repo

Kha left a comment

Choose a reason for hiding this comment

Kha commented May 26, 2022 • edited Loading

gebner commented May 26, 2022

larsk21 commented May 27, 2022 • edited Loading

gebner commented May 27, 2022

larsk21 commented May 27, 2022 • edited Loading

Kha commented May 27, 2022 • edited Loading

Kha commented May 27, 2022

Kha commented Jun 1, 2022

Kha commented Jun 1, 2022

gebner commented Jun 1, 2022

Kha commented Jun 1, 2022

Kha commented Jun 1, 2022 • edited Loading

gebner commented Jun 1, 2022

larsk21 commented Jun 2, 2022

Kha commented Jun 2, 2022

larsk21 commented Jun 2, 2022

larsk21 commented Jun 2, 2022

Kha commented Jun 2, 2022

leodemoura commented Jun 2, 2022

gebner commented Jun 2, 2022

larsk21 commented May 19, 2022 •

edited

Loading

Kha commented May 26, 2022 •

edited

Loading

larsk21 commented May 27, 2022 •

edited

Loading

larsk21 commented May 27, 2022 •

edited

Loading

Kha commented May 27, 2022 •

edited

Loading

Kha commented Jun 1, 2022 •

edited

Loading