Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resolve errors for unknown binding should consider commented out code #105469

Open
estebank opened this issue Dec 8, 2022 · 8 comments
Open

Resolve errors for unknown binding should consider commented out code #105469

estebank opened this issue Dec 8, 2022 · 8 comments
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-resolve Area: Name resolution D-terse Diagnostics: An error or lint that doesn't give enough information about the problem at hand. P-low Low priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@estebank
Copy link
Contributor

estebank commented Dec 8, 2022

When refactoring code, it is not uncommon to comment out pieces of code try try alternatives. When doing so, it's easy to forget to uncomment at least one of the alternatives. Given code like

fn main() {
    let option = Some("");
    // let Some(string) = x else { return; };
    println!("{string}");
}

the compiler should emit output similar to

error[E0425]: cannot find value `string` in this scope
 --> src/main.rs:4:16
  |
3 |     // let Some(string) = x else { return; };
  |     --          ------ you might have meant to use this commented out binding
  |     |
  |     note the comment starts here
4 |     println!("{string}");
  |                ^^^^^^ not found in this scope
@estebank estebank added A-diagnostics Area: Messages for errors, warnings, and lints A-resolve Area: Name resolution P-low Low priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. D-terse Diagnostics: An error or lint that doesn't give enough information about the problem at hand. labels Dec 8, 2022
@Ezrashaw
Copy link
Contributor

@rustbot claim

I'll have a go at this but I need some mentoring instructions. This seems to requires parsing comments?

@Ezrashaw
Copy link
Contributor

@estebank I've been thinking about this and I think that perhaps the best way to do this would be as follows:

  1. In the lexer, if we see a comment, continue lexing until we see another comment (as in two // in one line) or an EOL. We then place/mark these tokens separately.
  2. In the parser, we try to parse these tokens normally, again marking AST nodes separately.
  3. In any other parts of the compiler, like during variable resolution, we could then check to see if any variables exist in our special AST nodes (just like how this suggestion normally works).

Obviously we immediately back off silently if bad syntax is encountered (think Option, not Result). I wonder about the perf impact this could have, I assume little if we back off a lot.

@estebank
Copy link
Contributor Author

@Ezrashaw I think that a best effort of keeping an AST node (or a spanned list of comments per file) for comments and a simple string search might be enough for this, as any commented out code is bound to be syntactically suspect (and whatever be add has to be cheap both in execution and maintenance burden).

@Ezrashaw
Copy link
Contributor

Ezrashaw commented Jan 11, 2023

@Ezrashaw I think that a best effort of keeping an AST node (or a spanned list of comments per file) for comments and a simple string search might be enough for this, as any commented out code is bound to be syntactically suspect (and whatever be add has to be cheap both in execution and maintenance burden).

Right, so all I'm suggesting in addition to that is that we use AST nodes (for easy scoping etc) and that instead of a finicky string search, we just try to parse a let statement (which can be expanded later). AFAIK, parsing is a very small part of the compilers runtime and just parsing a few comments would have a small effect even on that.

Edit: the maintenance burden of this would be small, we can just reuse existing machinery.

@estebank
Copy link
Contributor Author

My concern is more around the state of the code than anything else: the commented out code might not be parseable by any reasonable parser. But we could tokenize the comments to aid us, and have simplified parsing for things like let ident or ident = and nothing else.

@Ezrashaw
Copy link
Contributor

@estebank Hmm, why not just use the normal parsing? I don't really see the reason not to. If the simplified parsing fails, then we ignore silently. Why not with proper parsing too, keep in mind that it's parsing, we aren't doing type checking or anything.

@estebank
Copy link
Contributor Author

My concern is that you could have something like

fn foo(
//    bar: i32,
) {
    // qux(); let bar = 42;
...

That would require you to try multiple alternative parses of the comments, including some that only work within other items, like the arg parse. That being said, if only a handful of cases are handled I'd still be happy.

@Ezrashaw
Copy link
Contributor

@estebank Yeah, I think for now that your example probably won't be handled, although we could just do completely normal parsing. My main issue is how to decide whether a comment is code or junk (ideally in the lexer), so that we don't have to even parse it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-resolve Area: Name resolution D-terse Diagnostics: An error or lint that doesn't give enough information about the problem at hand. P-low Low priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

2 participants