Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Linter / Formatter Wishlist #7217

Open
11 of 64 tasks
BoltonBailey opened this issue Sep 17, 2023 · 5 comments
Open
11 of 64 tasks

Linter / Formatter Wishlist #7217

BoltonBailey opened this issue Sep 17, 2023 · 5 comments
Labels
enhancement New feature or request help-wanted The author needs attention to resolve issues please-adopt

Comments

@BoltonBailey
Copy link
Collaborator

BoltonBailey commented Sep 17, 2023

I am compiling a list of things we might like our linter / possible future code formatters / reviewdog to detect/correct/comment on. Feel free to add to this list, or augment items with related issues.

Linters

Style Guide Items

  • Variable conventions
  • 100 Character line limit
  • Header and Imports
    • Presence of copyright header
    • Presence of module docstring
  • Structuring Definitions and Theorems
    • : and := and infix operators
    • After stating the theorem, we indent the lines in the subsequent proof by 2 spaces.
    • If the theorem statement requires multiple lines, indent the subsequent lines by 4 spaces.
    • the by is placed on the line prior to the first tactic
    • No orphaned parentheses
    • In a class or structure definition, fields are indented 2 spaces
    • and moreover each field should have a docstring.
    • When using a constructor taking several arguments in a definition with each argument on a new line, the arguments line up
    • When providing terms of structures or instances of classes, the where syntax should be used to avoid the need for enclosing braces
  • Hypotheses left of colon
  • Use a space after binders
  • Anonymous functions
    • preferred to => (only slightly)
    • λ disallowed in favor of fun
  • Calculations
    • calc keyword should be placed on the line prior to the start of the calculation, with the calculation indented.
    • Whichever relations are involved (e.g., = or ≤) should be aligned from one line to the next.
    • The underscores _ used as placeholders for terms indicating the continuation of the calculation should be left-justified.
  • Tactic mode
    • When new goals arise as side conditions or steps, they are indented and preceded by a focusing dot · (inserted as .); the dot is not indented.
    • Often t0 <;> t1 is used to execute t0 and then t1 on all new goals. Either write the tactics in one line, or indent the following tactic.
  • Make sure the right number of newlines exist between lemmas/sections
  • Whitespace and delimiters
    • while $ is a synonym for <|, its use in mathlib is disallowed in favor of <| for consistency as well as because of the symmetry with |>.
  • Comments
    • Documentation strings for declarations are delimited with /-- -/.

Items not in the style guide which might nevertheless be nice

  • Nonterminal simp (calls to simp that do not close the goal) should be replaced with simp? output.
  • When a namespace is open, ensure it isn't redundantly used on lemma invocations in proofs.
  • Similarly, when a notation is available, make sure it is used.
  • Look at namespaces / scoped notations that could be opened to shorten parts of the code.
  • Use variables declared via variables keyword in the file when possible.
  • Check for unnecessary parentheses
  • Remove unnecessary/transitively-provided imports (This now exists in the form of Shake)
  • Perhaps alphabetize the imports (apparently this changes behavior though, so be careful)
  • Replace common ASCII with the corresponding unicode, such as <- to .
  • Collapse sequences of calls to rw / simp only into one call, when this doesn't affect proof readability.
  • Replace calls to simp/simp only with simp_rw where possible/not too verbose, so that readers can use their cursor to inspect the effect of each step.
  • Don't rw with unspecified arguments, this makes the proof harder to read.
  • Make sure new lemma names follow the naming conventions (or at least that they don't make obvious mistakes).
  • Double declarations of the same variable. i.e. don't use variable {α : Type*} at the top of the file, and then the same declaration again below.
  • Use (∀ i, α i) or ((i : X) -> α i) according to whether the α i values are Props
  • And even more compute intensive:
    • Make sure simp only calls don't use unnecessary lemmas
    • Generalize typeclasses as much as possible by attempting to replace instance arguments with weaker ones.
  • Linter for unneeded by_cases
  • Linter for unnecessarily strong TC assumptions
  • Similarly, a linter to check that types that do not need to be the same in a theorem definition aren't.

Formatters

A formatter does not exist yet, but when it does, the above issues should be revisited.


(Thanks to Floris, @MohanadAhmed, Sebastien Gouezel for contributing points)

@BoltonBailey BoltonBailey added the help-wanted The author needs attention to resolve issues label Sep 17, 2023
@MohanadAhmed
Copy link
Collaborator

MohanadAhmed commented Sep 21, 2023

Hello @BoltonBailey
Nice list.
I like the idea of a formatter that can run automatically and get rid of the small pesky annoyances during review.
I am however not optimistic about such a formatter working in Lean4. This is based on the following observation:

  • Say you have a file with 300 lines.
  • And you comment Lines 100 through. 120
  • assuming nothing breaks you will notice that until the yellow line finishes the whole file your commented out lines will not have the correct comment coloring.
    I don't know if this is just a small issue, but to me it tells me anything that depends on lean doing a full walk through a file to give feedback is not going to work.
    What do you think?

White spacing

Here is a list I collected from comments on a PR a while ago:

  1. Variable Type spacing. i.e: white space between variable and colon, white space between colon and type, no whitespace from open bracket(brace) side, no white space from close bracket (brace) side.
    • (A: Type) to (A : Type)
    • (A:Type ) to (A : Type) ... etc.
  2. Spacing between variable declarations:
    • (A : Type)(B : Type) becomes (A : Type) (B : Type)
  3. Space between definition identifier (name) and the variable declarations
    • def myMatrix(m n : Nat): Matrix (Fin m) (Fin n) Nat becomes def myMatrix (m n : Nat) : Matrix (Fin m) (Fin n) Nat
    • Note that this is similar to rule 1 the type of the definition (includes lemmmas and theorems is also separated from the color preceding it and the colon is separated from the identifier and variables preceding it)
  4. Type of definitions and theorems, if they extend beyond the line should be indented by 4 spaces.
  5. Subgoals of suffices should be focused using \.
  6. Infix operators should have whitespace from their operands.
    • B⬝A becomes B ⬝ A

@MohanadAhmed
Copy link
Collaborator

Non-Terminal Simp

The second to last item in your current list is "Non-terminal simp". My understanding is a terminal simp is a simp without only and possibly a list of lemmas. If my understanding of this term is correct, then I have a data point against this being something a linter or formatter should warn against.

See this #6052 (comment)

@digama0
Copy link
Member

digama0 commented Sep 22, 2023

@MohanadAhmed A "terminal" simp means a simp at the end of a tactic block. The example in that comment is of a terminal simp. So the linter warning would be for a simp without only, which is followed by more tactics rather than being the last tactic in a sequence. There are exceptions for this too, for example maybe simp; simp or simp; ring is okay, but the allowed exceptions could presumably be added to the linter.

@MohanadAhmed
Copy link
Collaborator

@digama0 Thanks for the correction.

I am curios what is your opinion on how feasible is it to make a formatter in Lean code that runs almost instantly (say like javascript or similar in VSCode) ?
Is it reasonable to expect Lean in the near future to instantly parse and reformat say 300 line file?

@digama0
Copy link
Member

digama0 commented Sep 25, 2023

Significant architectural changes are required to make a formatter which runs faster than lake build. See https://leanprover.zulipchat.com/#narrow/stream/270676-lean4/topic/quick.20and.20dirty.20mode for context.

@BoltonBailey BoltonBailey added the enhancement New feature or request label Oct 3, 2023
bors bot pushed a commit that referenced this issue Nov 1, 2023
It's [proposed](#7217) that a code formatter remove those automatically, so we might as well remove them as we see them.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help-wanted The author needs attention to resolve issues please-adopt
Projects
None yet
Development

No branches or pull requests

3 participants