Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the option to instantiate the same rule more than once #6738

Open
BraisGabin opened this issue Dec 19, 2023 · 15 comments
Open

Add the option to instantiate the same rule more than once #6738

BraisGabin opened this issue Dec 19, 2023 · 15 comments
Assignees
Labels
Milestone

Comments

@BraisGabin
Copy link
Member

BraisGabin commented Dec 19, 2023

This is a feature that we want to have for detekt 2.0

Needed changes

There are multiple changes that should be made to make this possible. Here a list with all the changes that we need:

  1. RuleSets shouldn't instantiate the rules. They should provider factories to pass a confguration to create that rule.
  2. Rules shouldn't pick their own configuration from its parent configuration. We should provide their configuration directly.
  3. The Rule id can't be hardcoded. If a rule can have multiple instances the id should change between those instances.
    • State: No started
  4. The Config class should provide an API to read all the subconfigurations inside it.
  5. The core should read the Config and instantiate the rules from there.
    • State: This should be the easiest. We need to align on the new yaml key to find the rule but the change should be as easy as this:
.flatMap { (ruleSet, ruleSetConfig) ->
    ruleSetConfig.subconfigs
        .map { (configId, config) ->
            val ruleId = config.valueOrNull("rule") ?: configId
            val ruleProvider = checkNotNull(ruleSet.rules[ruleId]) {
                "Unknown rule ${ruleSet.id}:$ruleId"
            }
            ruleProvider.invoke(config)
        }
}

This is a huge change. There is code that assumes that the rules have a hardcoded id or that the there is only one instance of each. And this is a breaking change, we need to change the API for 3 main classes/interfaces: RuleSet, Rule and Config. This will impact every single third party rule out there.

Also, those are the big "chunks" that should be done, then a lot of minor changes should be made on the way to be able to make those possible.

@BraisGabin BraisGabin added this to the 2.0.0 milestone Dec 19, 2023
@BraisGabin BraisGabin self-assigned this Dec 19, 2023
@mgroth0
Copy link
Contributor

mgroth0 commented Dec 27, 2023

Hey, I just stumbled upon this issue. I think I have a use case:

style:
  ForbiddenImport:
    active: true
    ignoreAnnotated:
      - "JavaIoFileIsOk"
    imports:
      - value: "java.io.File"
        reason: 'case-sensitivity issues'
      - value: "some.unrelated.import"
        reason: "some other reason"

I have a custom annotation for suppressing the rule that forbids java.io.File. But in the configuration above, JavaIoFileIsOk will also suppress the rule for all other forbidden imports as well.

Am I correct in my understanding that there is currently no other workaround, since there can only be one instance of ForbiddenImport?

@BraisGabin
Copy link
Member Author

You are correct, that's exactly the use case. Could I ask you how do you imagine the configuration should look like to make your use case work?

@BraisGabin
Copy link
Member Author

BraisGabin commented Dec 28, 2023

I need to write down all my thoughts about Rule/Issue/Finding and all that part of the detekt API:

We need to talk about rule ids. Right now it's easy! Every Rule has an Issue. The Issue stores the id and the description of the Rule. And Rule.id is just syntax sugar for Rule.issue.id.

That works when you only have one instance per rule. But once you have multiple instance of the same rule the thing gets messy really quick.

First of all the rules need 2 ids:

  • One to identify the type of the rule (needed to link for its documentation, for example)
  • One to identify the instance of the rule (as a user I want to know which configuration of the rule raised that issue)

The first one can be hardcode. It can even be the simpleName or the fullQualifiedName of the Rule. But the second is way trickier. First of all, that id should come from Config (aka yaml). And how are we going to expose those different ids?

Issue is a nice API. It's used in two different places: the Rule and the Finding. Issue is like a "RuleInfo" class. It is an elegant way to tell the Finding who created it but without having a reference to a Rule.

So, if we need two different ids for a rule both should be inside Issue.

An issue right now is:

class Issue(
    val id: String,
    val description: String,
)

I would propouse:

class Issue(
    val id: String,
    val ruleId: String,
    val description: String,
)

Where instanceId, in general, would be the same as ruleId. It will change only when we define a different id.

So, how would this look like on the yaml?

style:
  ForbiddenJavaIoFile:
    active: true
    rule: ForbiddenImport
    ignoreAnnotated:
      - "JavaIoFileIsOk"
    imports:
      - value: "java.io.File"
        reason: 'case-sensitivity issues'
  ForbiddenImport:
    active: true
    imports:
      - value: "some.unrelated.import"
        reason: "some other reason"

A configuration like that will create this two instances like this:

listOf(
  ForbiddenImport(
    Issue(id = "ForbiddenJavaIoFile", ruleId = "ForbiddenImport", ...),
    ...
  ),
  ForbiddenImport(
    Issue(id = "ForbiddenImport", ruleId = "ForbiddenImport", ...),
    ...
  ),
)

Also I would vote to use value class for the ids to have better type safety. If we have 2 different ids it will be difficult to know which one to use.

@marschwar
Copy link
Contributor

Sounds good. I like the proposal.

@mgroth0
Copy link
Contributor

mgroth0 commented Dec 31, 2023

@BraisGabin Thanks for sharing your ideas!

There are two potential issues I see from the user's perspective:

Issue 1: It seems messy and counterintuitive for there to be one "main" rule instance (which does not need to set the rule parameter in the configuration and which gets to use the rule class name as its ID.). It seems better to me to make all instances of a rule equal rather than one instance having a slightly different syntax than the rest.

Here is my idea for how to possibly solve this issue:

style:
  active: true
  rules:
    - rule: ForbiddenImport
      id: ForbiddenJavaIoFile # optional, defaults to ForbiddenImport-1
      active: true
      ignoreAnnotated:
        - "JavaIoFileIsOk"
      imports:
        - value: "java.io.File"
          reason: 'case-sensitivity issues'
    - rule: ForbiddenImport
      # in this example id is not defined, so it defaults to something like ForbiddenImport-1 or ForbiddenImport-2 
      active: true
      imports:
        - value: "some.unrelated.import"
          reason: "some other reason"

Issue 2: Users may want some sort of inheritance of parameter values between instances.

Here I have two instances for TooManyFunctions, and I format it according to your idea rather than in the same way as my idea above:

complexity:
  active: true
  TooManyFunctions:
    active: true
    thresholdInFiles: 116
    thresholdInObjects: 22
    thresholdInClasses: 74
    thresholdInInterfaces: 51
    thresholdInEnums: 3
  TooManyPublicFunctions:
    ignorePrivate: true
    active: true
    thresholdInFiles: 50
    thresholdInObjects: 22
    thresholdInClasses: 74
    thresholdInInterfaces: 51
    thresholdInEnums: 3

As you can see, I just wanted to have a limit of 50 public functions per file, but for the rest of the scopes, I wanted to keep the numbers the same. The issue is that the numbers that stayed the same had to be written twice.

I have to admit, I don't know if there is already a built-in feature in YAML that would solve this issue. If there is not though, maybe we would want to provide our own "templates" or other inheritance mechanism in the configuration files.

@marschwar
Copy link
Contributor

I have to admit, I don't know if there is already a built-in feature in YAML that would solve this issue. If there is not though, maybe we would want to provide our own "templates" or other inheritance mechanism in the configuration files.

There is the concept of anchors that you might be able to use.

@VovaStelmashchuk
Copy link
Contributor

My proposal about implement config in case user need a the rule twice or more time. We add special char for separate rule name and the rule id.

style:
  MaxLineLength:
    maxLineLength: 100
  MaxLineLength%some_id:
    maxLineLength: 100

On parsing stage we can check the key for contains % and split the string to rule name. Then add two rule to ruleset.

  • MaxLineLenght()
  • MaxLineLenght(id = some_id)

I use a % as spec char.

Is the proposal have any sence ?

@BraisGabin
Copy link
Member Author

It does have sense. I like it! I would say that : is a good candidate MaxLineLength:some_id

style:
  MaxLineLength:
    maxLineLength: 100
  MaxLineLength:some_id:
    maxLineLength: 100

The only problem with that is that yaml uses : a lot already so I don't know if it is a good idea or not.

Other char options: ;, /, |, =, , and -.

@VovaStelmashchuk
Copy link
Contributor

I suggest a + as special char. The + is already using as splitter for alias in email naming convention. Also, the plus symbol clearly indicates an addition/extension.

Email Sub-addressing

@mgroth0
Copy link
Contributor

mgroth0 commented Jan 10, 2024

I think : would not be the best because, as you mentioned, it is part of yaml syntax. Same problem with # and &.

+ makes me think of addition so I don't prefer it.

What about @? The default java object toString method returns "someObjectClassname@hashcodenumber". So there is a known association with the thing to the left of the @ being the class name and the thing to the right of the @ being an instance identifier (a hash code is not a unique id, but it is similar)

@VovaStelmashchuk
Copy link
Contributor

I don't agree with the @ special character, this character is used in so many programs for communication as a special character, (slack, discord, github, mail...)

This will be inconvenient for people when communicating about their detekt rules. Each time you type the @ some program suggest a popup.

@mgroth0
Copy link
Contributor

mgroth0 commented Jan 10, 2024

Interesting point. Well, I did check GitHub and slack, and I only get the popup for both if I type the @ alone. If I prefix the @ with anything else in the same word, no popup.

@VovaStelmashchuk
Copy link
Contributor

Ok, I think we need other options in discussion:)

One more point about works with ids, Do the [special char] must the valid id itself?

I think it should, because otherwise we will need extra validation of yaml parsing.

@marschwar
Copy link
Contributor

Do we know if anyone is using yaml schema validation for the detekt config? I think auto-complete should still work for the general rule instance, but validation would fail. But I guess that is already the case for 3rd party rules.

We should take into account #4162, when thinking about this.

@BraisGabin
Copy link
Member Author

/ can work. We can think of the ID as a path or the name of the rule as a directory. I like the discussion to see pros and cons

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants