Wrong error message in CFE when parsing >?..
#47060
It seems that the following failure is caused by the same issue:
void f(x, [y]) {}
main() {
int a = 1;
int b = 2;
int c = 3;
f(a<b,
c> as int);
// ^^
// [analyzer] unspecified
// [cfe] unspecified
}
Output is
As for
it should - seen in regards to the Constructor-tear-offs feature in isolation - be parsed as
as the test also says.
which is why the analyzer and/or CFE complains about 3 arguments vs 2 expected.
The second example
is similar.
but
The parser itself gives this error:
I'll say this is working as expected. Johnni, Brian, Erik --- any comments on this? (/cc @johnniwinther @bwilkerson @eernstg ) Cf. also #47045.
As far as I can see, the behavior of the parser is fine: We don't require anything specific about the nature of the recovery, and hence we can't complain because recovery introduces an extra comma.
Pragmatically, it may or may not be the best choice (isn't it just creating extra confusion to invent a comma? --- and is it likely that the real error is that the developer forgot a comma?), but it is certainly not obvious to me how the recovery could proceed if it doesn't make an attempt to reduce the big, wrong expression to a number of smaller (wrong or correct) expressions, so it makes a lot of sense to introduce that comma.
However, this would imply that a syntax error test should in general use a function declaring many optional positional parameters ( Tools like
So, @jensjoha, unless you can see a way to reduce the confusion caused by unexpected extra arguments from parser recovery, I believe that you could close this issue.
@sgrekhov, it would be great if you could create a co19 issue to adjust the test such that it will tolerate some additional arguments.
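To illustrate the suggestion above about declaring many optional positional parameters, here is a minimal sketch. The helper `f` and its parameter names are hypothetical and not taken from the co19 test. The idea is only that a wider signature accepts whatever number of arguments recovery happens to produce, so the remaining diagnostic is the syntax error itself rather than an arity error.

```dart
// Hypothetical test helper (an assumption, not the actual co19 code):
// the extra optional positional parameters absorb any arguments that
// parser recovery may invent. Returns the number of non-null arguments.
int f(x, [y, z, w]) => [x, y, z, w].where((e) => e != null).length;

void main() {
  int a = 1;
  int b = 2;
  int c = 3;
  // Two relational expressions, i.e. two arguments.
  print(f(a < b, c > a)); // prints 2
  // Three arguments, as would result if recovery inserted a comma.
  print(f(a < b, c, a)); // prints 3
}
```

A test written this way would still have to expect the syntax error reported at the offending token; the wider signature only avoids a secondary "too many positional arguments" diagnostic.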
As I'm sure you're aware, I have strong opinions when it comes to recovery. :-) In large part that's because recovery has a huge impact on the UX. The guiding principle for recovery should be that of least surprise. That is, the best recovery is to interpret the code the way the user is most likely to interpret it and/or intended it to be. Given something like the first example:
I expect that most users would interpret that to be an invocation of
In addition to being closest to what a user would expect, it has the advantage that we wouldn't need to add or remove any tokens in order to recover that way. Unfortunately, unless we make it possible for the parser to backtrack as part of recovery (which I would love to see, but don't expect), it's probably too late to parse it that way by the time the parser has figured out that it needs to recover. (I'd love to find out that one or both of those isn't true.)
I'd actually argue that that depends on the nature of the test. For the purposes of the co19 tests I suspect that you're right. But I do think it's worthwhile having tests that capture the behavior of recovery because I think that they help us improve it over time.
That's well justified, that's a topic where I'll just try to understand what's going on. ;-)
The difficulty here is that we have specified how to disambiguate terms where
The main reasons for adopting this kind of rule are that (1) we wish to avoid gobbling up too much syntactic language design space, and (2) terms like
So, following these rules, the parser has already decided that
However, the parser is free to have additional rules (that may need to change if we ever introduce an expression that starts with
This means that it might be possible for the parser to lump all the tokens of
I don't know whether it would give better results, though, or even how to measure such things...
Yes, I understand and agree with the rules for the disambiguation, but they only apply to valid code.
I like that suggestion. I think the effect of doing that would be equivalent to what I was talking about in terms of backtracking. The way I was thinking of it was that the parser should use the disambiguation rules to determine what's going on, but when an error is found (prior to some point, such as the end of the expression), then it should back up and try making the other choice in spite of the disambiguation rules to see whether that yields better results. The big problem with backtracking in the current parser is that we've generally already invoked methods on the listener that have committed the parser to a specific interpretation of the tokens. I don't know whether that would impact the approach you suggested or not.
One way to measure something like this is through a user study: ask users questions to determine whether they understand the diagnostics and whether the diagnostics seem right. Unfortunately we don't always have the resources to run user studies, so we tend to use our own perceptions to try to guess how users would feel about the product. That isn't terrible because we are also users, though we're not always typical users.
There are also some heuristics that can be applied. One of them that often helps me measure such things is the number of diagnostics being produced for a single error. Ideally there will be one diagnostic per error. For example, if I type
About backtracking: That shouldn't be necessary (for the approach that I was suggesting): At the point where the token sequence derived from
When t is not a stop/continuation token, we know that there are several values for t that will inevitably give rise to a syntax error, in particular: Every token which cannot be the first token of an expression, including
This is about parsing recovery (not about the grammar of the language), and it needs to change if we ever introduce an expression that starts with
And even if we can't run full-fledged user studies, there are lots of developers out there, and many of them are willing to speak up, which is also a great resource!
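The decision discussed above, based on the token t that follows the closing `>`, can be sketched roughly as follows. The two token sets below are assumptions chosen for illustration; the constructor-tear-offs feature specification is the authoritative source for the actual list, and the function name is hypothetical.

```dart
// Assumed token sets, for illustration only (not copied from the spec).
const stopTokens = {')', ']', '}', ';', ':', ','};
const continuationTokens = {'(', '.', '==', '!='};

/// Returns true if a `<...>` sequence followed by [next] would be read as
/// type arguments under the assumed sets, and false if `<` and `>` would
/// instead be read as relational operators.
bool treatAsTypeArguments(String next) =>
    stopTokens.contains(next) || continuationTokens.contains(next);

void main() {
  print(treatAsTypeArguments('(')); // true
  print(treatAsTypeArguments('as')); // false
  print(treatAsTypeArguments('?..')); // false under these assumed sets
}
```

Under this sketch, a recovery rule of the kind suggested here would only come into play in the `false` branch, when t also cannot begin an expression.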
The following test fails on CFE
Error message is the following
Why does the parser find 3 positional arguments here when there is only one comma?
Tested on
Dart SDK version: 2.15.0-edge.8f9113d9f1cb11400c384a4ac68fc050636cf573 (be) (Tue Aug 31 00:10:08 2021 +0000) on "linux_x64"