Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[similar_names]: Configuration option to whitelist specific names #10926

Open
Centri3 opened this issue Jun 11, 2023 · 4 comments
Open

[similar_names]: Configuration option to whitelist specific names #10926

Centri3 opened this issue Jun 11, 2023 · 4 comments
Assignees
Labels
C-enhancement Category: Enhancement of lints, like adding more cases or adding help messages

Comments

@Centri3
Copy link
Member

Centri3 commented Jun 11, 2023

Description

Currently, similar_names will lint on code like

fn awa(tcx: TyCtxt<'_>) {
	let ocx = { ... };
}

Which is something I ran into on #10891. IMO, tcx and ocx are short enough that you can tell them apart. Despite this, simply setting a minimum length isn't desirable; What about lli and lll? Therefore, I think this lint should have a configuration option that is by default empty that allows specific names.

@rustbot claim

Version

No response

Additional Labels

@rustbot label +C-enhancement

@rustbot rustbot added the C-enhancement Category: Enhancement of lints, like adding more cases or adding help messages label Jun 11, 2023
@eloc3147
Copy link

eloc3147 commented Jul 12, 2023

Just ran into this as well with msb and lsb.

There is a hardcoded list of "names that are allowed to be similar". This list could be added to, but I can imagine many scenarios where using some similar domain-specific acronyms would make the most sense. I agree this should be configurable

@tgross35
Copy link
Contributor

tgross35 commented Oct 2, 2023

Maybe an easy refinement rule is to skip this check if the first letters are visually different. That would cover the txc/ocx, msb/lsb, slen/dlen cases, and I think probably a good number of other cases that exist. A list would also be nice, but I think that clippy could probably be a bit smarter here.

A slightly more complex but more general algorithm that could make sense is something like:

fn raise_similar_lint(mut a: &str, mut b: &str) -> bool {
    // comparing identifiers a and b
    
    // commonize similar looking characters, e.g. l->i, 1->i, 5->s
    // from here on out, we are comparing visual difference
    a = replace_similar(a)
    b = replace_similar(b)
    
    // If the first characters are visually different, call it good
    if a[0] != b[0] {
        return false;
    }
   
    let lev = levenshtein(a, b);

    // If the difference is large relative to the string's length, call it OK. 
    // This means that `abb` and `abc` are ok, but `abbbb` and `abbbc` are not
    lev * 4 < a.len()
}

tgross35 added a commit to tgross35/rust-clippy that referenced this issue Feb 10, 2024
A lot of cases of the "noise" cases of `similar_names` come from two
idents with a different first letter, which is easy enough to
differentiate visually but causes this lint to be raised.

Do not raise the lint in these cases, as long as the first character
does not have a lookalike.

Link: rust-lang#10926
@tgross35
Copy link
Contributor

I did part of my above suggestion in #12258. Not raising this lint if the first letter is different removes almost all of the cases I run into.

Doing something with replacing and then levenshtein is probably still more accurate yet, but this is a good start.

bors added a commit that referenced this issue Feb 10, 2024
[`similar_names`]: don't raise if the first character is different

A lot of cases of the "noise" cases of `similar_names` come from two idents with a different first letter, which is easy enough to differentiate visually but causes this lint to be raised.

Do not raise the lint in these cases, as long as the first character does not have a lookalike.

Helps with #10926 (does not fix)

This is per-commit reviewable, the first commit is just refactoring.

changelog: [`similar_names`]: don't raise if the first character is different
@demurgos
Copy link

demurgos commented Jun 3, 2024

req/res are another example of a common identifier pair when remote calls are involved. They don't differ in the first letter. Having some way to configure the list of exceptions would be helpful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Category: Enhancement of lints, like adding more cases or adding help messages
Projects
None yet
Development

No branches or pull requests

5 participants