Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggest giving recommending substitution based on naming conventions with available #64585

Open
maaku opened this issue Sep 18, 2019 · 6 comments
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-suggestion-diagnostics Area: Suggestions generated by the compiler applied by `cargo fix`. C-enhancement Category: An issue proposing an enhancement or a PR with one. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@maaku
Copy link

maaku commented Sep 18, 2019

New rustacean here. In writing simple programs to get a feel for the language, I tried testing if a string slice is empty as the conditional for a if/else statement. Something like the following:

if mystr.empty() {
    // ...
} else {
    // ...
}

Of course the correct method is .is_empty(). I would suppose that this is a common mistake for people coming from languages like C++. It would be friendly of the compiler to recognize that in this boolean context there is a .is_empty() method that is the same name but with is_ prepended, takes the provided (no) parameters, and returns the required boolean, and suggest that alternative to the user in the error output.

@varkor
Copy link
Member

varkor commented Sep 18, 2019

If #51398 was implemented, we could add empty as a synonym. However, we might also be able to improve our word-distance heuristic to catch examples such as this ("is_empty" shouldn't be very far from "empty").

@jonas-schievink jonas-schievink added A-diagnostics Area: Messages for errors, warnings, and lints A-suggestion-diagnostics Area: Suggestions generated by the compiler applied by `cargo fix`. C-enhancement Category: An issue proposing an enhancement or a PR with one. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 18, 2019
@sam09
Copy link
Contributor

sam09 commented Sep 21, 2019

@varkor I looked at src/libsyntax/util/lev_distance.rs which has a function find_best_match_for_name. The function uses Levenshtein distance to find the closest match to a given word from a list of words.

My guess is that this function is the one used to suggest alternatives to the user in case there is no method with a given name. The caveat here is that the function takes the distance by default as one-third of the word size.

In the above case empty is of length 5 and one-third of that is 2. However, empty and is_empty are at a distance of 3 and hence the compiler doesn't suggest the correct alternative.

I tried some examples to verify this,

Distance 1

if mystr.isempty() {
    // ...
} else {
    // ...
}
 
error[E0599]: no method named `isempty` found for type `&str` in the current scope
if mystr.isempty() {
  |          ^^^^^^^ help: there is a method with a similar name: `is_empty`

Distance 2

if mystr._empty() {
    // ...
} else {
    // ...
}
 
error[E0599]: no method named `_empty` found for type `&str` in the current scope
if mystr._empty() {
  |          ^^^^^^^ help: there is a method with a similar name: `is_empty`
if mystr.iempty() {
    // ...
} else {
    // ...
}
 
error[E0599]: no method named `iempty` found for type `&str` in the current scope
if mystr.iempty() {
  |          ^^^^^^^ help: there is a method with a similar name: `is_empty`
if mystr.sempty() {
    // ...
} else {
    // ...
}
 
error[E0599]: no method named `sempty` found for type `&str` in the current scope
if mystr.sempty() {
  |          ^^^^^^^ help: there is a method with a similar name: `is_empty`
if mystr.jsempty() {
    // ...
} else {
    // ...
}
error[E0599]: no method named `jsempty` found for type `&str` in the current scope
 --> a.rs:3:10
  |
3 | if mystr.jsempty() {
  |          ^^^^^^^ help: there is a method with a similar name: `is_empty`

I think distance should probably be a function of the size, smaller words might need a larger distance.

Again, I am not sure if this is the same function that is being used to find the closest name to a method

@varkor
Copy link
Member

varkor commented Sep 22, 2019

@sam09: I think this is the right function. I agree that it'd be helpful to have a higher cutoff for smaller words. You could try using a slightly different default distance function, and see if there are many changes after running x.py test src/test/ui --bless. I almost feel Levenshtein distance isn't the right metric — inserting consecutive new characters should have a lower distance than the naïve estimate. If you were interested, you could play around with some other examples, see if you can find a better estimator, and open a pull request with it?

@maaku
Copy link
Author

maaku commented Sep 24, 2019

I’m not sure that is the right approach. If you allow a larger Levenstein distance you’ll just let in an lot of totally unrelated method names. The idea is that the compiler can exploit knowledge about naming conventions to give better suggestions. So, e.g. use the current cut offs but compare against ‘is_{}’ as well, at least in a Boolean context.

@estebank estebank added the D-papercut Diagnostics: An error or lint that needs small tweaks. label Oct 5, 2019
@estebank
Copy link
Contributor

An option would be to try both levenstein distance and substring containment, as in for possible in suggestions { if possible.contains(&name) ... }.

I would say that this would need a cutoff of words shorter than 4 or 5 chars to avoid matching things like is, but I wouldn't be surprised to find that this approach would work quite well in the wild.

@estebank
Copy link
Contributor

Current output:

error[E0599]: no method named `empty` found for struct `Vec<_>` in the current scope
 --> src/main.rs:2:11
  |
2 | if vec![].empty() {
  |           ^^^^^ help: there is a method with a similar name: `is_empty`

We still don't check for an appropriate return type though, it is purely based on levenshtein distance.

@estebank estebank removed the D-papercut Diagnostics: An error or lint that needs small tweaks. label Nov 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-suggestion-diagnostics Area: Suggestions generated by the compiler applied by `cargo fix`. C-enhancement Category: An issue proposing an enhancement or a PR with one. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

5 participants