This issue was moved to a discussion.

Incremental adoption of rules #42

Closed
jfmengels opened this issue Sep 26, 2020 · 18 comments

Labels: design-discussion, enhancement (New feature or request)

Comments

jfmengels (Owner) commented Sep 26, 2020

Abstract

I am looking for ways to make it easier to adopt static analysis/linting rules in a project when there are many errors, and fixing all of them in one go is not feasible or is too daunting.

Context

At the moment, when you add a new rule to the elm-review configuration (or simply run elm-review for the first time), you're likely to end up with a lot of new errors, which you may not want to deal with right now.

If the rule happens to provide automatic fixes, that will help a lot. But if there are a lot of errors, even that might result in a lot of changes that your colleagues (and you!) will not fancy reviewing.

Currently the best option is probably to enable the rule, but to ignore the errors for every file containing an error using ignoreErrorsForFiles, and then get your team to fix the errors gradually.

config : List Rule
config =
    -- Opt-in list of rules to enable
    [ Some.Rule.rule
        |> Rule.ignoreErrorsForFiles
            [ "src/A.elm"
            , "src/B.elm"
            , "..."
            ]
    ]

You can also fix the errors before enabling the rule, but there is then a risk of new errors appearing in the meantime.

I propose a solution for this problem, but I first want to go through the "obvious" alternative solutions that other tools seem to have chosen.

Alternative solutions

Severity levels

I believe that one way this is usually done in other static analysis tools (ESLint for instance) is to associate rules with a severity level, like "warning" and "error" (which we don't support). I believe this to be an unhelpful solution in the long run, as I describe here. I personally don't enable any rule as a "warning" when I work with ESLint.

I therefore really don't want to support severity levels, and would rather leave this problem as it is than solve it this way (unless severity levels also solve other kinds of problems).

Add suppression comments

Again, this is something that you could see in ESLint-land, though if there are too many errors, the rule will most certainly end up as a warning. It would be entirely possible to automate adding a // eslint-disable-line <rule-name> on every reported line, or even a /* eslint-disable <rule-name> */ at the top of each reported file.

elm-review does not support suppression comments. I have described over here why I don't think this is a good idea, even though their absence can make a lot of things harder.

I would add that people don't go hunting for disable comments to fix the errors. The reasoning is that it's often hard to know whether a rule was disabled for a good reason (bug in the rule, ...) or not (lazy developer, ...), plus there is no centralized location that tells you how many disabled errors there are or where to start.

Proposed solution

My proposal is this: Add new functions that temporarily suppress a rule for a list of files. It would look very much like ignoreErrorsForFiles. I will for the sake of this proposal name the function temporarilySuppressErrors, but the bikeshed is open.

config : List Rule
config =
    [ Some.Rule.rule
        |> Rule.temporarilySuppressErrors
            [ "src/A.elm"
            , "src/B.elm"
            , "..."
            ]
    ]

It looks very much like ignoreErrorsForFiles, but there would be some differences.

When we notice that no error shows up in a given file anymore (even if by coincidence), elm-review would report an error and fail, telling the user to remove the file from the list of suppressed files in their review configuration. This is a small improvement, but it would help clean up step by step. Maybe the user would get a dopamine rush and feel like fixing the rest of the issues!

If a file turns out not to exist, then the CLI would report that. Otherwise we'd never be able to clean up the list of files if a file was deleted.

At the moment, when possible, ignored files are not analyzed by rules, as long as we know that skipping them will not impact the analysis of the rest of the project. Suppressed files would need to be analyzed, because otherwise we can't report when they no longer contain errors.

The CLI would provide a flag (and editors would provide some toggle?) to disable the suppressions (i.e. re-activate the rules), making it easy to go hunt for these errors. The user could also manually remove a file from the suppression list if they prefer. Having everything in the configuration makes it easier to see what's left to fix, even though we don't see how many errors are in each file.

An idea from a colleague of mine was to run elm-review in CI twice: once the "usual" way, and if there are no errors, once again with the previously mentioned flag (but ignoring the exit code), and to feed the results to our monitoring service (we are a monitoring company after all), so we can follow the quantity of suppressed errors and prioritize hunting them down. An alternative would be to make elm-review NOT exit with an error code if only suppressed errors were reported, so that we don't have to run the tool twice. I imagine that if some people see a good reason for the tool to fail even in that case, we could add a separate flag or make the existing flag take an argument.

For this last thing to work well, we do need the output of the error to be marked as "suppressed", so I think the output will most likely look something like this:

-- ELM-REVIEW ERROR ----------------------------------------- src/Main.elm:10:11

(fix) (suppressed) NoUnused.Variables: Imported variable `span` is not used

...

and the JSON output would have an additional field "suppressed": true added to it.
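For illustration, a reported error in the JSON report might then look roughly like this (the suppressed field is the addition; the other fields are a simplified sketch of the report format, not its exact shape):

```json
{
  "rule": "NoUnused.Variables",
  "message": "Imported variable `span` is not used",
  "region": {
    "start": { "line": 10, "column": 11 },
    "end": { "line": 10, "column": 15 }
  },
  "suppressed": true
}
```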

We could have done the same thing by enabling the rules in a different review configuration, but I don't believe that to be practical for several reasons (that I could go into if asked; I'm being lazy now).

After reviewing, the CLI would report the number of suppressed errors (without showing them individually). The wording below is temporary, but it shows the idea.

# Currently
I found no errors!
# With the proposal, if there are suppressed errors
I found no errors, but there are 500 suppressed errors lying around.
# If there are some fixable errors in there, the message could be more precise
I found no errors, but there are 500 suppressed errors lying around, including 12 that provide an automatic fix. To fix them, run

    elm-review --fix --no-suppressions

I don't think we should mention suppressed errors when there are non-suppressed errors being reported. It's likely just distracting.

Granularity

As you may have noticed, you can't disable a rule for only a section of a file, as the smallest item you can customize here is a file. The issue with this is that you can start with a file with N suppressed errors, and when you finally want to fix the issues, notice that they have grown to a number bigger than N since the time of suppression.

We could ignore errors at given locations (Rule.temporarilySuppressErrors [ ( "src/A.elm", [ { start = { row = 1, column = 1 }, end = { row = 2, column = 1 } } ] ) ]), but since files change and elm-review can't compare with previous versions of the source code (I imagine that including Git in the analysis would make several things very complex), if a reported line gets moved because another line was inserted 100 lines earlier, you'd get a false positive. It's technically possible, but I think it would make things very annoying.

We could provide an API that allows for both (example below), but that does not solve the flakiness of the suppressions.

config : List Rule
config =
    [ Some.Rule.rule
        |> Rule.temporarilySuppressErrors
            [ forWholeFile "src/A.elm"
            , onlyInSomeLocations "src/B.elm"
                  [ {start={row=1, column=1},end={row=2, column=1}}
                  ]
            ]
    ]

An alternative solution - which I think would be the way to go - is to count the number of errors in each file; if the number of errors changes, we would require the count to be updated (and we'd forbid specifying 0).

config : List Rule
config =
    [ Some.Rule.rule
        |> Rule.temporarilySuppressErrors
            [ { path = "src/A.elm", nbErrors = 1 }
            , { path = "src/B.elm", nbErrors = 5 }
            ]
    ]

This would not help prevent exchanging errors (one gets introduced while another gets fixed), but it would at least help detect when one gets introduced and the codebase gets worse.

I imagine that if the count increases, we'd have to display all the suppressed errors for that file so that the user could figure out what was introduced. Not great, but better than not showing anything.
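The check itself would boil down to comparing counts. A rough sketch of what elm-review could do per { path, nbErrors } entry (the Problem type and its constructor names are made up for illustration):

```elm
-- Hypothetical outcome of checking one suppression entry.
type Problem
    = CountWentDown Int -- fewer errors than recorded: ask the user to lower the count
    | CountWentUp Int -- more errors than recorded: the codebase got worse

-- Compare the recorded count against the errors actually found in the file.
checkSuppression : Int -> List error -> Maybe Problem
checkSuppression nbErrors errorsInFile =
    let
        actual =
            List.length errorsInFile
    in
    if actual < nbErrors then
        Just (CountWentDown actual)

    else if actual > nbErrors then
        Just (CountWentUp actual)

    else
        Nothing
```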

Making this easy to use

There would not be an equivalent to ignoreErrorsForDirectories. While it makes sense to have ignoreErrorsForDirectories to disable elm-review on generated and vendored code, temporarily suppressing an entire folder would make it much harder for users to gradually fix the errors.

I have not personally felt the pain of adding the list of ignored files into the configuration, because I am good enough with editors to extract the file names from the errors and format them in the required way. But I imagine that not to be the case for most people.

I think it would be nice if the CLI helped with that. I am imagining a flag that would generate the suppression code for you. Or even modify the configuration, but that's a way more daunting task.

An example (again, just temporary names and texts)

> elm-review --generate-suppressions
  [ Some.Rule.rule
      |> Rule.temporarilySuppressErrors
          [ "src/A.elm"
          , "src/B.elm"
          ]
  , Some.Other.Rule.rule
      |> Rule.temporarilySuppressErrors
          [ "src/A.elm"
          , "src/C.elm"
          ]
  ]

The tricky part would be to provide a way that makes it as easy as possible to just copy-paste this into the file. At Humio, parts of our configuration look like

config : List Rule
config =
    [ Some.Rule.rule
        |> Rule.ignoreErrorsForFiles ignoredForSomeRule
    ]

ignoredForSomeRule : List String
ignoredForSomeRule =
    [ "src/A.elm"
    , "src/B.elm"
    ]

which doesn't lend itself to easy copy-pasting of what I suggested above.

The CLI could suggest using this flag when the number of shown errors reaches a certain threshold (20?).

Additional alternative solution: Integrate with Betterer or do the same thing

I haven't yet mentioned @phenomnomnominal's Betterer as a solution. It is a tool that aims to solve the same issue in a more generic way. I only talk about it now because I believe it tries to solve the issue in a similar manner, and it's worth comparing. That said, I have not yet been able to use it, so my understanding is likely a bit shallow...

From what I understand, it either snapshots the results from ESLint (for the ESLint integration), stores that file in the project for it to be committed, and then checks that nothing got worse than before. I have no clue how the location gets matched with the previous run.

That or it simply counts the number of errors, but I could not figure out whether it's the total number of errors or the number of errors per file 🤷‍♂️

Summary

The proposed solution

  • 👍 allows you to enable a rule gradually
  • 👍 rules that are not being suppressed in a file will be enforced in that file (not the case with warnings)
  • 👍 keeps all the errors that need to be fixed in a single place
  • 👍 would make it easier to hunt and monitor suppressed errors
  • 👍 would not introduce features with nasty side-effects, as far as I'm aware
  • 👍 helps differentiate between files that have been cleared as okay for infringing the rules and those that still need to be fixed
  • 👍 would make sure no additional errors show up
  • 👎 would not help against exchanging errors

What I'm looking for

I'm looking for general feedback on this feature. Do you think this is a good idea, a bad idea? Do you think you'd use it?

What do you think about the granularity issue? Can you think of a better solution?

It's not my main concern right now, but I could use some bikeshedding on the names here (I have not yet put any effort into them).

Do you think we should always show problems related to suppression errors (new errors showed up, some errors got fixed), or only if there are no more non-suppressed errors?

anagrius commented Sep 29, 2020

I think for granularity, you probably want to just match with a regex (see -Wconf in scalac for an example). Maybe, rather than line numbers as in the example, you could specify the full path including the function name.

Lines change a lot, function names less so.

anagrius commented Sep 29, 2020

In general I think that rule + file name is by far good enough. Otherwise, the comment solution known from other linters would be both easy to use and familiar.

phenomnomnominal commented Oct 4, 2020

I understand that you may not want to introduce a new tool, but this is literally what I designed Betterer to do.

Here's the example ESLint test: https://github.com/phenomnomnominal/betterer/blob/v4/packages/eslint/src/eslint.ts

For elm-review to work with Betterer, it would just need a JS API (probably runElmReview) that takes a set of file paths to check, and then maps each result back to a startLine, startColumn, endLine, endColumn and error message.

jsegura commented Oct 4, 2020

Actually I like the approach of ignoring files and I think that it should support regex when listing which files should be ignored. Even Modules instead of files could be an option.

TLDR: Internally we specify which modules we want to validate.


In our project we took a different approach. The frontend of our project is Elm, and the codebase is evolving at the same time as our Elm skills. So we try new ways of doing things every now and then, and we don't have a lot of consistency.

Recently we successfully tried something that we call the RequestResponse pattern. After we felt confident with the new approach, we realised that elm-review could be a tool to help us with that, but we only wanted to review specific files. We explicitly indicate which modules we want to validate:

ReviewConfig.elm:

config : List Rule
config =
    [ NoUnused.Variables.rule
    , NoUnused.Dependencies.rule
    , ...
    , RequestResponseUpdatedFields.rule
        [ "API1", "API2" ]
    ]

Context of our Rule:

type alias Context =
  { ...
  , reviewEnabled: Bool
  }

Then we use Rule.withModuleDefinitionVisitor to mark the module as "reviewable" or not depending on the module list.
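Such a visitor could look roughly like this (a sketch, not our exact code; it assumes the Context alias above, and simplifies the elm-review and elm-syntax details):

```elm
-- Assumes:
--   import Elm.Syntax.Module as Module exposing (Module)
--   import Elm.Syntax.Node as Node exposing (Node)
--   import Review.Rule as Rule
-- Flip `reviewEnabled` based on an allow-list of module names,
-- so later visitors only report errors when it is True.
moduleDefinitionVisitor : List String -> Node Module -> Context -> ( List (Rule.Error {}), Context )
moduleDefinitionVisitor allowedModules node context =
    let
        name : String
        name =
            node
                |> Node.value
                |> Module.moduleName
                |> String.join "."
    in
    ( [], { context | reviewEnabled = List.member name allowedModules } )
```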

But thinking about it, I think that having this feature in elm-review would be beneficial, because you would know that the file/module was specifically ignored:

I found no errors, but there are 500 suppressed errors lying around.

jfmengels (Owner, Author) commented Oct 4, 2020

@phenomnomnominal: I understand that you may not want to introduce a new tool, but this is literally what I designed Betterer to do.

I totally get that! And I agree that it's probably not too hard to integrate with Betterer.

That said, there are some benefits to doing this in the tool rather than through a third-party tool.

  1. Even if I'm fine with using Betterer, my users might not want to use it.
  2. Relying on a single tool makes it easier to integrate in other environments like editors or predefined GitHub Actions, without having to add a use-case for Betterer
    (We could maybe integrate Betterer under the hood inside the tool though. I'll have to think of the implications of this)
  3. Potentially, users could have multiple review configurations, and I don't think (but please correct me) that Betterer supports that.
  4. With this proposal, you have a single location to look at for everything: enabled rules, ignored files and suppressed errors (since elm-review doesn't have disable comments). This way, if you want to see whether someone disabled a rule that they shouldn't have, you only have to look at the ReviewConfig file. Also, and more to the point of this issue, if you want to tackle the suppressed errors, you only have one location to look at, rather than scouring the project for disable comments and/or looking at a Betterer file which users might be less aware of, especially if they are new to the project.
  5. Betterer makes it, I think, too easy to ignore/swallow errors if people are lazy. And I would think that it is a bit hard for the team to review and see that their colleague ignored an error that should not have been ignored. But maybe I'm wrong and the diff makes it very clear? 🤔
  6. With an in-house solution, we have more control over what we can do. For instance, as mentioned in the issue, we could run elm-review in a way that shows all the suppressed errors but doesn't fail the process if there are ONLY suppressed errors (for monitoring purposes for instance). Other tools like ESLint use warnings for this, but warnings have downsides that I think make the static analysis tool worse (from experience with ESLint at least).

I hope this makes sense and explains why I want to explore this space, which I think would be valuable for other static analysis tools out there too.

jfmengels (Owner, Author) commented Oct 4, 2020

@jsegura: Actually I like the approach of ignoring files and I think that it should support regex when listing which files should be ignored. Even Modules instead of files could be an option.

You mean you like ignoring files indefinitely over suppressing the errors in the files? I am open to the idea of having an ignoreErrorsForModules, by the way (but that's a separate issue).
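Such a hypothetical function might be used like this (the function does not exist; its name, API and the module names are purely speculative):

```elm
config : List Rule
config =
    [ Some.Rule.rule
        |> Rule.ignoreErrorsForModules
            [ "Page.Profile"
            , "Api.Internal"
            ]
    ]
```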

TLDR: Internally we specify which modules we want to validate.

For everything? That doesn't sound like a great idea to me 🤔 Is it because you're still wondering whether review rules are a good idea? If so, I think you'd be better served if you could suppress errors, because you would at least be notified of errors showing up in ignored files, and based on that you could decide whether to fully adopt or remove the rule.

For the use-case of the RequestResponse pattern, I think what you did was a fine way to approach it. We could imagine APIs to make it easier to only run on some modules/filepaths, but I believe this would be a separate issue (feel free to open one to discuss this). That said, I think it would be better to target everywhere you have a RequestResponse type instead of listing the modules, so that you keep the same benefits and guarantees in future files without having to think about editing your ReviewConfig file.

jsegura commented Oct 4, 2020

You mean you like ignoring files indefinitely over suppressing the errors in the files? I am open to the idea of having an ignoreErrorsForModules, by the way (but that's a separate issue).

Oh sorry, I meant suppressing the error. 🙇‍♂️

For everything? That doesn't sound like a great idea to me 🤔 Is it because you're still wondering whether review rules are a good idea? If so, I think you'd be better served if you could suppress errors, because you would at least be notified of errors showing up in ignored files, and based on that you could decide whether to fully adopt or remove the rule.

Only for our custom rules that target specific modules

chshersh commented Oct 5, 2020

I'm not using Elm currently, but I can share my experience of adopting the Haskell linter HLint and the Haskell static analyser Stan on a large Haskell codebase.

When running HLint initially, it produced 130K suggestions, and it was not possible to fix them all automatically. What we did is probably similar to the Betterer approach, though I hadn't heard about Betterer before.

We've implemented a snapshot file that stores the serialized output of HLint — hint name, file, source position. This snapshot file represents ignored hints, and by default, everything is ignored. Hints found in a Haskell file are filtered by the snapshot file. If the file content changes (e.g. new lines are added), all hints below this line are triggered. Our CI requires fixing all triggered hints, so when you touch a file for the first time, you need to fix all hints in it. This approach allows us to fix problems incrementally while we work in the most active files. This makes sense because if you don't touch some source file for years, it probably works even with some warnings 🙂

To summarize, the approach contains several parts:

  1. A CLI command to generate a lint-snapshot file by ignoring all current hints. This command is also used to update the snapshot file after fixes.
  2. A CLI command to run linters and static analysers incrementally on our codebase.

We have our own build tool to build our complicated project, so our linting commands support incremental linting out of the box and rerun only on changed files.

For Stan we took a similar approach, but Stan supports ignoring observations via its config natively. Each observation found in code has a unique ~15-character ID that depends on the check, the module name and the source code position. You can specify all ignored IDs in the config. So a similar approach is taken — generate all hints, ignore them and fix gradually.

Stan also allows ignoring whole directories, which is indeed helpful for auto-generated code. Our project dynamically generates 200 Haskell modules with 100K lines of code total, and ignoring them file by file is not convenient 🙂

Arkham commented Oct 9, 2020

I would love to introduce elm-review to the main NoRedInk monorepo, so I'd like to provide some context:

  • we have around 500K lines of Elm
  • add to that around 30K lines of autogenerated Elm modules
  • many different Elm applications served on separate endpoints, some of which share common modules

Due to the size of the repo it is not feasible to introduce elm-review with a single PR, so my current plan was to introduce it gradually in various areas of the codebase by writing a custom script and providing this sort of configuration:

{
  "groups": [
    {
      "name": "Teach/Classes",
      "folders": [
        "src/Page/Teach/Classes",
        "ui/tests/Page/Teach/Classes"
      ],
      "ignore": [
        {
          "path": "src/Page/Teach/Classes/Routes.elm",
          "rule": "NoUnused.Exports",
          "match": "classDetailsUrl"
        },
        {
          "path": "src/Page/Teach/Classes/Routes.elm",
          "rule": "NoUnused.Exports",
          "match": "studentDetailsUrl"
        }
      ]
    },
    {
      "name": "Curriculum",
      "folders": [
        "src/Page/Curriculum",
        "src/Page/Admin/Curriculum/CsvImportTool",
        "ui/tests/Page/Curriculum"
      ],
      "ignore": [
        {
          "path": "src/Page/Curriculum/Update.elm",
          "rule": "NoUnused.Variables",
          "match": "curriculumAPI"
        }
      ]
    }
  ]
}

Each group represents a list of folders which are meant to be analyzed as a single area of the codebase, with a list of errors that we want to ignore. My plan is to filter the output of elm-review --report=json and drop an error if it matches the path, the rule and any part of the message field. I don't like the idea of tying the error to a line of code, because it makes refactoring and moving things around more of a pain.

In this way I'd be able to add a CI step that verifies that elm-review passes for these areas of the codebase and we would be able to introduce it gradually. Due to the structure of our teams, when you are working on a feature it is very likely you are going to touch code in those folders. Most of our shared UI components are hosted at https://github.com/NoRedInk/noredink-ui so they are outside the monorepo.

I was also thinking I should be able to hack together a script to integrate this in an editor: check if the current path is part of any of the folders, find the group that folder belongs to and then run elm-review.

jfmengels (Owner, Author) commented Oct 13, 2020

Thank you for the info @chshersh!

If the file content changes (e.g. new lines are added), all hints below this line are triggered.

Hmm, so you use the brittleness of this approach as a feature rather than a problem. That's interesting. I imagine that if you do not want to fix the issue right now anyway, you simply re-run the lint-snapshot? If so, I guess it would not be too annoying, except for the fact that you will get a lot of errors popping up that are unrelated to the change you made. I guess we could show an additional mention that this reported error was initially suppressed.

You mention you suppress those errors by source position. Do you do so by the exact position from start to end, or do you only take the start position or just the lines?

our linting commands support incremental linting out of the box and rerun only on changed files.

This is a bit tricky to do with elm-review, because elm-review allows rules to look at the contents of several files before reporting errors, allowing you for instance to report when exposed functions from a file are never used in other files and could therefore stop being exposed. It's a great feature that is really useful and that I would recommend other static analysis tools to have. The downside is that changes to one module can impact the results for other modules, so you either need to re-run everything (what we currently do) or cache the results of intermediate analyses (which we may do one day) for performance reasons. Anyway, this was not directly relevant to the discussion, but I wanted to mention it.

Stan also allows ignoring whole directories

Yes, the same thing is true for elm-review currently. In case it wasn't clear in my original post, there is a difference between what I mean by ignored errors (or ignored files/directories) and suppressed errors.

Ignored files/directories mean that we do not want elm-review to report problems for these files/directories. This is meant to be used for generated source code, vendored code, or for ignoring rules in specific directories like tests, where you are allowed to do more unsafe things (like Debug.todo, which would crash the application in production code).

Suppressed errors, which I'm proposing here, would mean that these issues should be fixed but have yet to be. Currently, suppressing errors is done by ignoring files, but I find that having this distinction would help a lot with incrementally adopting a rule. It is hard to know whether a file is ignored temporarily or for good reason when the same API is used for both.

@Arkham Let me know how your setup works. I think that you could do most of the same things by using the current ignoreErrorsForFiles function. That said, you make me think that my proposal ties the review configuration to the project it reviews, so I am not sure whether it's a good idea when you have multiple projects like I remember you do at NoRedInk. Maybe a separate file would make sense, but that has other implications that make the usage not as nice.

chshersh commented Oct 17, 2020

I imagine that if you do not want to fix the issue right now anyway, you simply re-run the lint-snapshot?

Yes, that's correct! 👍

Do you do so by the exact position from start to end, or do you only take the start position or just the lines?

We do it only by the starting position: the start line and column.

dillonkearns commented Dec 20, 2020

I don't think we should mention suppressed errors when there are non-suppressed errors being reported. It's likely just distracting.

You could show "3 errors, 15 suppressed" in the output, similar to how elm-test shows output if there are skipped test cases.

dillonkearns commented Dec 20, 2020

Here are a few thoughts regarding granularity.

In my opinion, having a way to specify a specific number of errors to expect in a specific file is going to become a crutch that leads to problems in people's codebases. It will lead to not trusting the tool, because you can have a new issue crop up without any warning, so it's only giving you a false sense of security there.

My thinking is that it's more realistic to present it as rules being either on or off for a file (no in-between). Then you can focus on getting a particular file sorted out. Getting the number down and partially fixing a file seems like a bad approach to incremental adoption in my opinion.

dillonkearns commented Dec 20, 2020

The tricky part would be to provide a way that makes it as easy as possible to just copy-paste this into the file.

I think it might be a good idea to separate out temporarily ignored rules for files into a generated file. Then just use elm-review to add new entries, and remove obsolete entries. As you said above, I think it's a great idea to have an error in the case that there are obsolete entries. Then you can run a single command to remove all obsolete entries from the generated file if you use this approach.
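For illustration, the generated elm-review-ignored.json could look roughly like this (the format, field names and rule name are purely hypothetical):

```json
[
  { "path": "Main.elm", "rule": "Some.Rule", "errorCount": 2 },
  { "path": "User.elm", "rule": "Some.Rule", "errorCount": 12 }
]
```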

Adding to Ignored

$ elm-review --introduce-ignore-entries

Found 14 errors, 0 ignored problems

Would you like me to add this ignored entry?

Main.elm, InconsistentImports..., 2 errors

[Y/n]

Would you like me to add this ignored entry?

User.elm, InconsistentImports..., 12 errors

[Y/n]

elm-review-ignored.json has been updated.

Found 0 errors, 14 ignored problems in 2 ignore entries

Addressing/Removing Ignored Entries

$ elm-review

Found 0 errors, 14 ignored problems in 2 ignore entries
Run `elm-review --show-ignored` for a breakdown

$ elm-review --address-ignored
Found 0 errors, 14 ignored problems in 2 ignore entries

Here are some good places to start (sorted by least problems to most).
Fix all issues in a file and then re-run `elm-review --address-ignored`.

Main.elm, InconsistentImports..., 2 errors

User.elm, InconsistentImports..., 12 errors

<etc.>

# After fixing the InconsistentImport errors in `Main.elm`

$ elm-review --address-ignored

An ignored file/rule is now obsolete. I've updated elm-review-ignored.json.

Found 0 errors, 12 ignored problems in 1 ignore entry

Here are some good places to start (sorted by least problems to most).
Fix all issues in a file and then re-run `elm-review --address-ignored`.

User.elm, InconsistentImports..., 12 errors

Summary

I think that having this be generated has a few advantages:

  • The tooling can remove obsolete entries for the user, without any worry about accidentally removing something that's not yet obsolete. elm-review will give you an error if you make this mistake, but fixing it would be a tedious process.
  • The tool can walk through files that have errors you might want to ignore, and either add them all at once or interactively.

Also, I think having the tool give you suggestions for "low hanging fruit" by showing ignored items with the fewest problems at the top would be nice. It might be useful to have some extra CLI options for filtering out to only see ignored issues for specific rules (or hide for certain rules).

jfmengels (Owner, Author) commented Dec 20, 2020

Granularity

Thanks for the comments @dillonkearns 😊

In my opinion, having a way to specify a specific number of errors to expect in a specific file is going to become a crutch that leads to problems in people's codebases. It will lead to not trusting the tool, because you can have a new issue crop up without any warning, so it's only giving you a false sense of security there.

I am not sure what you mean by "going to become a crutch" (I might not understand the expression).

I don't see how new errors can crop up without warning. If the configuration says that there are 5 suppressed errors in file X for rule Y, and elm-review passes, then that is exactly how many you have. No more, no less.

My thinking is that it's more realistic to present it as rules being either on or off for a file (no in-between). Then you can focus on getting a particular file sorted out. Getting the number down and partially fixing a file seems like a bad approach to incremental adoption in my opinion.

I would say that with this approach (which is similar to the one you currently have to take with the ignored files), it will be easier for new errors to crop up. When you add rule Y, you notice there are 20 reported errors in file X and you ask to suppress the file. Some time later you want to tackle the errors and you notice there are now 30. The same problem happens if you start fixing them but can't fix them all in a single session.

Also, keep in mind that since elm-review has rules that span multiple files, fixing a problem in one file can potentially mean changing plenty of other files. For instance, imagine you want to force a change in the types of several functions (the function being the reported part) that are used all over the codebase. In that case, you'd like that fixing one problem in this file (changing the type of one function) doesn't let another function with the same problem creep in.

I can imagine something like the migration from elm/http v1 to v2 being an example of that, if you imagine that the codebase was using functions that were returning Tasks instead of Cmds.

Generation

I think it might be a good idea to separate out temporarily ignored rules for files into a generated file. Then just use elm-review to add new entries, and remove obsolete entries.

Yeah, I have been leaning towards this too recently. Generate it into a different file, and then either

  • make elm-review include this file as a special-case file - which means the format could be anything, a JSON file, or another Elm file similar to ReviewConfig.elm
  • make it just another regular Elm file, and have the user include and use it explicitly. Potentially elm-review could try to update the ReviewConfig itself to add that code.
module ReviewConfig exposing (config)

import Review.Rule as Rule exposing (Rule)
import RuleA
import RuleB
import SuppressedErrors exposing (suppressedErrors) -- new addition


config : List Rule
config =
    [ RuleA.rule
    , RuleB.rule
    ]
        |> Rule.suppressErrors suppressedErrors -- new addition

And the generated file:

module SuppressedErrors exposing (suppressedErrors)

suppressedErrors =
  [ { ruleName = "RuleA", path = "src/A.elm", nbErrors = 1 } -- or the variant with the locations of the errors. We could have the data be in a different format/data structure too, not sure yet.
  ]
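To make the proposed semantics concrete, here is a small sketch of what checking reported errors against the suppressed counts could look like. Every name here is hypothetical (none of this is elm-review's actual API); the point is only the rule: a count that is exceeded fails the run, and a count that is undershot marks the entry as obsolete.

```elm
module SuppressionCheck exposing (Outcome(..), check)

{-| Hypothetical sketch of the count-based check discussed above.
Keys are ( ruleName, filePath ); values are error counts.
-}

import Dict exposing (Dict)


type Outcome
    = NewErrorsIntroduced -- a count was exceeded: fail the run
    | EntriesObsolete -- a count was undershot: the file can be regenerated
    | UpToDate


check : Dict ( String, String ) Int -> Dict ( String, String ) Int -> Outcome
check suppressed reported =
    let
        allKeys =
            Dict.keys suppressed ++ Dict.keys reported

        countIn dict key =
            Dict.get key dict |> Maybe.withDefault 0
    in
    if List.any (\key -> countIn reported key > countIn suppressed key) allKeys then
        NewErrorsIntroduced

    else if List.any (\key -> countIn suppressed key > countIn reported key) allKeys then
        EntriesObsolete

    else
        UpToDate
```

Note that, as discussed above, this check only compares counts: it cannot tell an old suppressed error apart from a new error that happens to replace it.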

Adding to Ignored

My current thinking is to make the generated file all or nothing. I'll think about having an interactive mode, but I am not entirely sold on it yet. I intend to make the generated file readable and editable by the user (one reason why I'm thinking of having it be in Elm) so that they can set targets for their refactor themselves, like manually removing the suppressed errors for a file.

I think there will be two flags for the generation: one to (re)generate the entries from scratch, and one to remove obsolete entries (or reduce the number of allowed errors), which would fail if new errors have been introduced.

Also, I think having the tool give you suggestions for "low hanging fruit" by showing ignored items with the fewest problems at the top would be nice.

Instinctively, I would have put the files that need the most attention (have the most errors) at the top. But yeah, you may be right that in the idea of incrementally improving the codebase, having the low-hanging fruit fixed first will be better 👍

It might be useful to have some extra CLI options for filtering out to only see ignored issues for specific rules (or hide for certain rules).

Yeah. I think using --rules RuleA and/or something like --no-suppressions=RuleA will get you quite far. I'm all ears for things that could make it easier.

@dillonkearns

dillonkearns commented Dec 21, 2020

In my opinion, having a way to specify a specific number of errors to expect in a specific file is going to become a crutch that leads to problems in people's codebases. It will lead to not trusting the tool, because you can have a new issue crop up without any warning, so it's only giving you a false sense of security there.

I am not sure what you mean by "going to become a crutch" (I might not understand the expression).

I don't see how new errors can crop up without warning. If the configuration says that there are 5 suppressed errors in file X for rule Y, and elm-review passes, then that is exactly how many you have. No more, no less.

"Become a crutch" is just an expression that means you'll start depending on something in a way that becomes a bad habit.

I'll elaborate a little on why I think that's the case here.

Say you have 1 suppressed error in this file for an InconsistentImport rule.

import Html.Attributes
-- ^ InconsistentImport error, should be `as Attr`

Let's imagine that this rule requires a manual fix. You might go about coding and end up no longer depending on Html.Attributes in this module, but instead depending on Element.Border from elm-ui.

import Element.Border
-- ^ InconsistentImport error, should be `as Border`

We changed the code, and in the process ended up with the same number of suppressed errors. But it's a different one. That's not necessarily a problem. But it seems misleading to give the impression that you can "lock in" the specific set of errors in a file/rule as ignored, because that's in fact not what's happening.

So I see it as less misleading to just ignore a rule for a whole file, because it doesn't give the false impression that you've prevented new problems from arising in that file.

That's just my 2 cents there, reasonable people can certainly disagree there. But at least I think it's worth pointing out that it could be misleading.

@dillonkearns

dillonkearns commented Dec 21, 2020

Also, it seems to me that it will be easier to gain confidence that you've fully fixed a rule for a module and keep that fixed (rather than partially fixing several modules). That's the crutch part (bad habit).

I think it will be easier to keep track of addressing all problems for a rule in a specific module at a time, then moving on. And I think it's a good idea to start with the low-hanging fruit and keep those areas clean. When you start to mix modules with problems as "partially fixed", that seems like it could be harder to keep track of what's going on.

@jfmengels

jfmengels commented Dec 21, 2020

Ok, so what you mean is that this method does not prevent what I called "exchanging errors" in the description of the issue.

I agree that it's a problem, but since you at least didn't make the codebase worse (you exchanged one error for another of the same severity), I think it's a relatively minor problem. My take is that it will at least prevent regressions in one file, which you don't get when you ignore the whole file.

If we take @chshersh's approach, where we ignore by location, we would get far fewer error exchanges, because you're unlikely to have new errors at the exact same locations.

The implication of that is that elm-review will report unrelated errors a lot more. Even just renaming a function used in different folders would invalidate error suppressions.
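For comparison, an entry in the location-based variant might look something like this. The shape is purely illustrative (the thread doesn't settle on a format); it only shows why the entries are so fragile:

```elm
-- Hypothetical shape for a location-based suppression entry.
-- Any edit that shifts this code, even an unrelated change a few
-- lines above, alters the range and invalidates the entry.
suppressedByLocation =
    [ { ruleName = "RuleA"
      , path = "src/A.elm"
      , range =
            { start = { row = 12, column = 1 }
            , end = { row = 12, column = 24 }
            }
      }
    ]
```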

I am thinking we could provide both options to the user: ignore by count (more stable, but less likely to be fixed) or by range (more unrelated errors, but more likely to be fixed as you go). But I would really like it if we could suggest the "better" approach, and maybe that is @chshersh's, because at least the team is more likely to fix the errors as they go. And by re-generating the suppressed file, you can get back to a passing state when unrelated changes invalidate too many entries.

I am more worried about the full re-generation of the suppressed errors becoming the crutch (in both options). Just like people like to automatically add linter disable comments when they get annoyed, I'm worried that they'll default to using this too much. I'm hoping that the user's colleagues will have an easier time noticing this at review time and asking for a justification at least 🤷‍♂️

@jfmengels jfmengels transferred this issue from jfmengels/elm-review-design-discussions Mar 14, 2021
@jfmengels jfmengels added design-discussion enhancement New feature or request labels Mar 14, 2021
Repository owner locked and limited conversation to collaborators Mar 14, 2021
