Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

per-file patterns #22

Closed
wants to merge 1 commit into from
Closed

Conversation

pohly
Copy link
Contributor

@pohly pohly commented Oct 26, 2022

Including the <package>:<file> prefix as alternative text that gets matched
against makes it possible to write patterns that only match when that prefix is
as intended.

This builds on top of #21 which must be merged first.

Comment on lines 186 to 187
for i := 0; i < numTexts; i++ {
texts = append(texts, filename+":"+texts[i])
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for i := 0; i < numTexts; i++ {
texts = append(texts, filename+":"+texts[i])
for _, text := range texts {
texts = append(texts, filename+":"+text)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was unsure whether slices can be modified while iterating over them and still haven't found a definitive statement about that.

But as it seems to work and you suggested it, I'll trust that it is safe and change the code.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did miss that bit but per https://go.dev/ref/spec, "The range expression x is evaluated once before beginning the loop."

@pohly
Copy link
Contributor Author

pohly commented Dec 18, 2022

From #21 (comment):

I've also been trying to put off #22 because I don't find regex as the conducive to filename matching as glob syntax. A pattern something like this seems most readable to me:

 {*.go,!*_test.go}^fmt.Print(f|ln)$

File globing can be used to enable or disable certain rules based on the file
that is being analyzed.
@pohly
Copy link
Contributor Author

pohly commented Jan 29, 2023

I've rebased to get get rid of the commits from PR #21 .

@pohly
Copy link
Contributor Author

pohly commented Jan 29, 2023

@ashanbrown: do you want me to start the review process for the golangci-lint update now by creating a preliminary PR for it?

Before that PR can get merged, it would be good to tag forbidigo and then use that tagged release in golangci-lint. But perhaps folks over there have feedback that we should consider before a formal release.

@ashanbrown
Copy link
Owner

How would you feel about inverting the sense of the check to be inclusion and just calling the matcher files? I think the negated check becomes more confusing when the sense is inverted. I think the name files would also make it clearer what we are ignoring.

@ashanbrown
Copy link
Owner

I'm also trying to figure out if we need the package prefix at all. Why wouldn't the package always just by the top-level package? I think this is what golangci-lint assumes as well.

At https://golangci-lint.run/usage/configuration/, the example for the "skip-files" control is:

 # Which files to skip: they will be analyzed, but issues from them won't be reported.
  # Default value is empty list,
  # but there is no need to include all autogenerated files,
  # we confidently recognize autogenerated files.
  # If it's not please let us know.
  # "/" will be replaced by current OS file path separator to properly work on Windows.
  skip-files:
    - ".*\\.my\\.go$"
    - lib/bad.go

@ashanbrown
Copy link
Owner

I'd favor holding off on the golangci-lint review until we get this MR merged. I'd prefer not to cause noise in there until we are ready.

@pohly
Copy link
Contributor Author

pohly commented Jan 29, 2023

How would you feel about inverting the sense of the check to be inclusion and just calling the matcher files?

I started with that, but then explaining the field became a bit weird: the default (empty list = enabled) is the opposite of what the list means when it has entries. I can try again with that.

The example you gave with skip-files matches the current ignore - if we don't invert the meaning, then I probably should use the same name.

I'm also trying to figure out if we need the package prefix at all. Why wouldn't the package always just be the top-level package?

The same configuration might be used for multiple different packages. We do that in Kubernetes, which currently consists of a k8s.io/kubernetes module in the root and multiple different "staged" modules under staging/src. The file path would be harder to define (relative to what or absolute?) than rooting it in the package.

@ashanbrown
Copy link
Owner

ashanbrown commented Jan 30, 2023

I see the issue regarding an empty list. I had been thinking that the files value would be a string like "in/**.go,!in/bad.go" and that we could parse it using a csv parser. Were that the case, the absence of the value entirely would just say "allow anything." It seems that "!" also means something to yaml, making it not ideal (since it force us to wrap things in quotes).

So that would look something like:

{p: "bad\\.No", files: "**/*.go,!skip.go"}

As it stands we're looking at:

{p: "bad\\.No", ignore: [skip.go]}

and

{p: "bad\\.No", ignore: [skip/**/*.go,"!skip/exception.go"]}

Just feeling these things out.

Regarding the kubernetes example, would you argue the skip-files should work the same way for golangci-lint (i.e. it should include the package name)?

Once we allow a "**" glob matcher, would that match package names too? It feels unusual to have the glob matcher matching parts of package names.

@pohly
Copy link
Contributor Author

pohly commented Jan 30, 2023

ignore: [skip.go]

That won't work, currently the entire file name must be matched. So that would have to be ignore: ["**/skip.go"] (with quotes, because YAML doesn't like** without them). I don't see a good way to allow a sub-string match in the doublestar API.

A single string with comma-separation may indeed be simpler. We can also do both (split string at comma, store list directly).

Regarding the kubernetes example, would you argue the skip-files should work the same way for golangci-lint (i.e. it should include the package name)?

It would be more useful, yes.

Once we allow a "**" glob matcher, would that match package names too? It feels unusual to have the glob matcher matching parts of package names.

The package name in example.com/some/pkg/pkg.go is example.com/some/pkg, and it gets matched by ** because it gets treated just like a path, including the leading domain name.

@pohly
Copy link
Contributor Author

pohly commented Jan 30, 2023

I just got bitten (again!) by a gotcha in the globbing library: k8s.io/kubernetes/test/e2e** does not match all paths starting with k8s.io/kubernetes/test/e2e because ** is supposed to come after a slash. The correct pattern is k8s.io/kubernetes/test/e2e*/** .

We are already using regular expressions for the pattern. Does it really make sense to use globbing for the file path? Regular expressions may be better understood and thus predictable.

@pohly
Copy link
Contributor Author

pohly commented Jan 30, 2023

WIP PR using the feature in Kubernetes: kubernetes/kubernetes#109728

@pohly
Copy link
Contributor Author

pohly commented Feb 5, 2023

I just got bitten (again!) by a gotcha in the globbing library

But that could be just me. .gitignore works the same way.

@ashanbrown
Copy link
Owner

Ok, thanks for experimenting with this. Using regexp to match the behavior of skip-files in golangci-lint sounds fine to me. Sorry to backtrack on this. I'm still struggling with the idea of pretending with module name. As a developer, I'd argue I shouldn't have to know the module name of the code I'm working on and that I should be free to change the module name without effect.

@ashanbrown
Copy link
Owner

(The name skip-files sounds good to me to so it's obvious that it's the same sort of behavior as golangci-lint)

@ashanbrown
Copy link
Owner

Stepping back a moment, I'm still trying to understand why per-file matching should have a skip/ignore behavior. https://golangci-lint.run/usage/false-positives/ has a whole system for skipping false positives. Can you remind me why it isn't sufficient? I think the reason that I had suggested using globs was that it was better at describing inclusion. In particular, I had envisioned some syntax like {somedir/**/*.go}ThisMethod for scoping rules to a particular part of the tree.

@pohly
Copy link
Contributor Author

pohly commented Feb 5, 2023

As a developer, I'd argue I shouldn't have to know the module name of the code I'm working on and that I should be free to change the module name without effect.

The same could be said about moving the golangci.yml file around. But golangcli-lint normalizes paths so that they are relative to the directory of the config file before matching, so one has to change all paths inside the config when moving the config.

One can argue both ways. The problem in forbidigo is that there's no concept of "config directory", so it cannot do the same as golangci-lint. It could use "current directory", but that then prevents invoking golangci-lint in different directories (currently supported).

https://golangci-lint.run/usage/false-positives/ has a whole system for skipping false positives. Can you remind me why it isn't sufficient?

In Kubernetes, I want to forbid certain patterns in test/** and allow them elsewhere.

If I had to use the Exclude Issue by Text feature for this, then I would have to list all directories besides test, and repeat that for every pattern that is only forbidden in test/**. This isn't practical.

Instead of matching specific patterns, I could exclude all issues by forbidgo. But then I still need to list all the non-test directories.

Adding nolint directories all over the code base is a non-starter, too.

@pohly
Copy link
Contributor Author

pohly commented Feb 5, 2023

We could use relative to "config directory" (when running in golangci-lint) with "current directory" as fallback (when not). But I need to check whether the "config directory" is available.

@pohly
Copy link
Contributor Author

pohly commented Feb 5, 2023

skip-files uses regular expressions. I think we should do the same.

@pohly
Copy link
Contributor Author

pohly commented Feb 5, 2023

Suppose the path filter in Exclude Issue by Text was extended to support **,!test/**. Then each forbidigo pattern would have to be configured in two places (under "forbidigo: forbid" and "exclude-rules`), and the reference to the pattern would be through the message text - that seems cumbersome and fragile.

@ashanbrown
Copy link
Owner

ashanbrown commented Feb 5, 2023

Per https://golangci-lint.run/usage/configuration/, golangci-lint seems to have some concept of where the root is. I can't find the documentation that suggests rules are relative to the config file location though. Absent other information, I'd expect rules to be relative to the current directory in which we run the linter.

I still don't quite see why the configuration needs of this particular linter are different than any other linter. For every linter in golangci-lint, exceptions still need to be documented in every section. I think the claim is that we often have sets of rules that we want to apply to particular directories. Even for sets of rules, we'll be repeating the skip-files properties, although maybe yaml bookmarks can allow for some reason. Is the reason this linter is different than others that this particular linter can have an ever-expanding number of rules (however many you choose to define)?

I'm just brainstorming here but here are a couple of alternatives we could consider to skip-files:

  1. Allow the assignment of names or tags to forbidden patterns, so that we can ignore them in the exclude-rules section. Note that I think we could do this already by using comments as names or tags ("tag:my-tag") and then matching that text in the exclude-rule.
  2. Add a tests option that defaults to true but can be set to false for each rule. This might be convenient for lot of typical cases.

I'm not sure there's harm in adding a skip-files directive, but it still would need to be repeated per rule unlike option (1) above, where a single exclude-rule entry could match multiple tags. It just feels like we're creating our own version of a special handler that every linter needs.

@pohly
Copy link
Contributor Author

pohly commented Feb 6, 2023

Per https://golangci-lint.run/usage/configuration/, golangci-lint seems to have some concept of where the root is. I can't find the documentation that suggests rules are relative to the config file location though. Absent other information, I'd expect rules to be relative to the current directory in which we run the linter.

I also think that it is not documented, but I know from experience that it is not the current directory. That makes sense because it is possible to invoke golangci-lint anywhere in the source tree and it will then walk up to find the configuration file. That wouldn't work if the paths in that configuration file were relative to the current directory.

I think the claim is that we often have sets of rules that we want to apply to particular directories.

Correct. Usage of a certain function may be fine in some parts of a program and forbidden elsewhere. This is different from coding conventions and best practices.

Even for sets of rules, we'll be repeating the skip-files properties, although maybe yaml bookmarks can allow for some reason.

That is a valid point.

Is the reason this linter is different than others that this particular linter can have an ever-expanding number of rules (however many you choose to define)?

Yes. To me, defining the scope of the rule as part of the rule seems more logical and easier to review.

Allow the assignment of names or tags to forbidden patterns, so that we can ignore them in the exclude-rules section. Note that I think we could do this already by using comments as names or tags ("tag:my-tag") and then matching that text in the exclude-rule.

That may be viable. I'll experiment with that.

@pohly
Copy link
Contributor Author

pohly commented Feb 9, 2023

Using exclude-rules was okayish. The advantage is that the path only needs to be specified once. The disadvantage is the limited expressiveness of exclude-rules: path: it's not possible to say "exclude everywhere except here".

Here's how I solved that:

  exclude-rules:
    # exclude ineffassing linter for generated files for conversion
    - path: conversion\.go
      linters:
        - ineffassign
    # Exclude forbidigo rules for parts of the code base where they
    # don't apply. The exclude entries depend on the common message prefix
    # in the forbid section below.
    #
    # Path is everything in test except for test/integration. Even that could be covered?
    - path: ^(api|cluster|cmd|hack|pkg|plugin|staging|test/integration|third_party|vendor)/
      msg: "E2E test:"
      linters:
      - forbidigo
    # Same for test/e2e/framework
    - path: ^(api|cluster|cmd|hack|pkg|plugin|staging|test/(cmd|conformance|e2e/(apimachinery|apps|architecture|auth|autoscaling|chaosmonkey|cloud|common|dra|instrumentation|kubectl|lifecycle|network|node|perftype|reporters|scheduling|storage|testing-manifests|upgrades|windows)|e2e_kubeadm|e2e_node|fixtures|fuzz|images|instrumentation|integration|kubemark|list|soak|typecheck|utils)|third_party|vendor)/
      msg: "E2E framework:"
      linters:
      - forbidigo

But that is something that could be solved by enhancing golangci-lint - the only problem is that the only reaction I got for my proposal on Slack was a -1.

The other problem is that Kubernetes currently has to invoke golangci-lint in sub-directories (multi-module repo) and paths don't work as I thought: what I said about "relative to config" was wrong, it just looked like that because the wrapper script in Kubernetes uses --path-prefix to compensate for changing the directory. When doing that, path matching fails because it doesn't consider that prefix. I have a fix pending for review.

Let's close this PR and come back to it only if really needed, i.e. there's too much push-back against the solution above and nothing else gets accepted either.

@pohly pohly closed this Feb 9, 2023
@ashanbrown
Copy link
Owner

Thanks for testing out the comment-based approach @pohly. I'll keep an eye on golangci/golangci-lint#3571 and see how that goes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants