Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The confidence option is not very intuitive to use #7121

Open
Pierre-Sassoulas opened this issue Jul 4, 2022 · 10 comments
Open

The confidence option is not very intuitive to use #7121

Pierre-Sassoulas opened this issue Jul 4, 2022 · 10 comments
Labels
Enhancement ✨ Improvement to a component Minor 💅 Polishing pylint is always nice Needs specification 🔐 Accepted as a potential improvement, and needs to specify edge cases, message names, etc.

Comments

@Pierre-Sassoulas
Copy link
Member

Current problem

Right now you have to list all the confidence that you want to activate. So you have to know the confidences levels and use a list.

Desired solution

By giving onle one value and all the values with at least this confidence are activated. If you choose HIGH for example only HIGH is activated, but if you choose UNDEFINED, everything is activated.

Additional context

One of the point left from #746

@Pierre-Sassoulas Pierre-Sassoulas added Enhancement ✨ Improvement to a component Minor 💅 Polishing pylint is always nice Needs specification 🔐 Accepted as a potential improvement, and needs to specify edge cases, message names, etc. labels Jul 4, 2022
@DanielNoord
Copy link
Collaborator

How would we determine the order though? Is INFERENCE_FAILURE higher than INFERENCE? And is CONTROL_FLOW below or above INFERENCE?

@Pierre-Sassoulas
Copy link
Member Author

Pierre-Sassoulas commented Jul 8, 2022

I think it's already ordered from high to low in the code (%= it should be an enum):

HIGH = Confidence("HIGH", "Warning that is not based on inference result.")
CONTROL_FLOW = Confidence(
    "CONTROL_FLOW", "Warning based on assumptions about control flow."
)
INFERENCE = Confidence("INFERENCE", "Warning based on inference result.")
INFERENCE_FAILURE = Confidence(
    "INFERENCE_FAILURE", "Warning based on inference with failures."
)
UNDEFINED = Confidence("UNDEFINED", "Warning without any associated confidence level.")

@DanielNoord
Copy link
Collaborator

I think INFERENCE_FAILURE should probably be second? Or maybe I'm misinterpreting it: if we depend on control flow we might still create false positives, if we depend on an inference failure the fact that we're not understanding everything correctly is the reason for a message?

@Pierre-Sassoulas
Copy link
Member Author

I think I'm the one who ordered it this way. My reasoning is that inference failure is not as good as successful inference. But to be honest I never saw a check verify the result of the inference before adding a message (often we do not raise anything if the inference fail). So I don't know if we're even using INFERENCE_FAILURE. The way we would be using this confidence would determine the order, right now we don't so it's a little hard to be assertive.

@DanielNoord
Copy link
Collaborator

I think I'm the one who ordered it this way. My reasoning is that inference failure is not as good as successful inference. But to be honest I never saw a check verify the result of the inference before adding a message (often we do not raise anything if the inference fail). So I don't know if we're even using INFERENCE_FAILURE. The way we would be using this confidence would determine the order, right now we don't so it's a little hard to be assertive.

We are using it in some places 😉 I just saw 2/3 occurrences I think.

@Pierre-Sassoulas
Copy link
Member Author

Ok ! I'm on mobile so I can't check the codebase (😅). To be clearer : If we do not raise a message in case of inference failure and raise another message that we're more confident about instead... then inference failure is > inference. But if we're raising the same message based on inference failure anyway, then inference failure < inference. Maybe we should remove the distinction ? First case could be HIGH, second case could be INFERENCE.

@jacobtylerwalls
Copy link
Member

But if we're raising the same message based on inference failure anyway, then inference failure < inference. Maybe we should remove the distinction ? First case could be HIGH, second case could be INFERENCE.

That seems to be how it's used, see:
https://github.com/PyCQA/pylint/blob/254b260255e21f5bf7e3d10b82322a420df3cb85/pylint/checkers/base/name_checker/checker.py#L359-L371

Let's leave the distinction for now, it's not hurting anything.

I also think the current order is appropriate.

@jacobtylerwalls
Copy link
Member

We should probably block #1727 on this, so if people find #1727 unwelcome, they can run pylint with --confidence=HIGH instead of --confidence=HIGH,INFERENCE, INFERENCE_FAILURE,UNDEFINED

@jacobtylerwalls
Copy link
Member

On second thought, it doesn't need to block that issue, we can always document both options.

@Pierre-Sassoulas
Copy link
Member Author

I'm thinking making a breaking change for this in 4.0 wouldn't be very problematic (we added confidence for real not so long ago, and most message do not have confidence yet).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement ✨ Improvement to a component Minor 💅 Polishing pylint is always nice Needs specification 🔐 Accepted as a potential improvement, and needs to specify edge cases, message names, etc.
Projects
None yet
Development

No branches or pull requests

3 participants