Ugly cql syntax error messages #1703
@nyh So did you try the same CQL statements with Cassandra? What kind of errors are you getting with that? |
The output in today's Cassandra trunk:
As you can see it includes much more useful context on each error. I wasn't necessarily expecting exactly the same error message, but the bizarre text |
Yes, we don't append the query snippet to the error message. There's some work to be done for that in in I'm not sure what the |
Just confirmed this bug (unhelpful parsing error messages) still exists in master. |
Four years later, we still have this unhelpful We also have a similar complaint in #5546. Some examples from that issue: For
where the error is missing quotes on
This is not very helpful. The error should be that If we add the quotes but write one too many L's in the ALLOW FILTERING:
the error is
which lacks any context on the error. |
Another example I saw today - an incorrect query which tried to use an "OR" keyword in a way not supported in CQL:
The Cassandra error is helpful - it tells you the problem is the "OR":
The Scylla error message is just
|
Another ugly example I saw today... Trying the command
I get the unintelligible error:
The By the way, Cassandra recognizes this syntax, but prints a message that it's no longer supported without being separately enabled:
(this issue is just about the ugly error message - the fact we don't support this feature at all is a separate question - #3882). |
Another ugly example I saw today:
is missing a third parenthesis before the "WITH", but it's hard to spot. Scylla doesn't help:
Again with that weird
|
We should move to antlr4. I tried before, but there's a problem with the lexer losing some capabilities, which we have to work around. |
My work-in-progress for reference: https://github.com/avikivity/scylladb/commits/antlr4 |
Why is antlr4 needed? If I understand correctly (but please correct me if I'm wrong), Cassandra also uses antlr3 and manages to provide good error messages. So I thought we just have a bug in the code which produces the error messages. |
I thought they switched to antlr4, but can't find evidence for it now. |
I verified in Cassandra 4.1.1's build.xml that they use antlr 3.5.2. I'm having a really hard time figuring out why we have much worse error messages than Cassandra. The error-handling code is extremely convoluted, with the worst of all worlds of object-oriented code and templates, but on the other hand it looks like it was copied from Cassandra. I know why the nice "full context" part of the message (the parentheses at the end) is missing - we didn't translate that function to C++ - but I can't figure out why the main part of the message (e.g., "mismatched input 'KE' expecting K_KEY") is missing. I'm still working on it... |
The mystery deepens, and my dislike for Antlr also deepens :-( The example I'm focusing on now is
Whereas Scylla prints
Three observations:
|
By the way, if you're curious where our string |
We have known for a long time (see issue #1703) that the quality of our CQL "syntax error" messages leaves a lot to be desired, especially when compared to Cassandra.

This patch doesn't yet bring us great error messages with great context - doing this isn't easy and it appears that Antlr3's C++ runtime isn't as good as the Java one in this regard - but this patch at least fixes **garbage** printed in some error messages.

Specifically, when the parser can deduce that a specific token is missing, it used to print:

    line 1:83 missing ')' at '<missing '

After this patch we get rid of the meaningless string '<missing ':

    line 1:83 : Missing ')'

Also, when the parser deduced that a specific token was unneeded, it used to print:

    line 1:83 extraneous input ')' expecting <invalid>

Now we got rid of this silly "<invalid>" and write just:

    line 1:83 : Unexpected ')'

Refs #1703. I haven't yet marked that issue "fixed" because I think a complete fix would also require printing the entire misparsed line and the point of the parse failure. Scylla still prints a generic "Syntax Error" in most cases now, and although the character number (83 in the above example) can help, it's much more useful to see the actual failed statement and where character 83 is.

Unfortunately some tests enshrine buggy error messages and had to be fixed. Other tests enshrined strange text for a generic unexplained error message, which used to say " : syntax error..." (note the two spaces and ellipsis) and after this patch is " : Syntax error". So these tests are changed. Another message, "no viable alternative at input", is deliberately kept unchanged by this patch so as not to break many more tests which enshrined this message.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #13731
In PR #13731, I fixed all the ugly I think the only thing remaining to do before this issue can be finally closed is to copy Cassandra's idea of printing not just the line and character number where the error occurred - but also show the full line (or snippet of the line) with that specific position emphasized. I don't think we necessarily need to use the same format that Cassandra used for this snippet - it's not particularly pretty and other compilers do it nicer. |
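To make the remaining idea concrete, here is a minimal Python sketch of turning a line and character number into a snippet with a caret under the failing position. It is only an illustration: the function name `render_error_context`, the 0-based column convention, and the sample statement are assumptions, and the layout is neither Cassandra's nor Scylla's actual format.

```python
def render_error_context(query: str, line: int, column: int, message: str) -> str:
    """Return `message` followed by the offending line and a caret marker.

    `line` is 1-based and `column` is 0-based (an assumed convention).
    """
    lines = query.splitlines() or [""]
    # Clamp both coordinates so a position reported at end-of-input still renders.
    line_index = min(max(line - 1, 0), len(lines) - 1)
    text = lines[line_index]
    column = min(max(column, 0), len(text))
    caret = " " * column + "^"
    return f"line {line}:{column} : {message}\n{text}\n{caret}"


# Hypothetical statement with a misspelled keyword ("KE" instead of "KEY").
report = render_error_context(
    "CREATE TABLE t (p int PRIMARY KE)", 1, 30, "Syntax error"
)
print(report)
```

The caret lines up only when the statement contains no tab characters; a real implementation would also need to decide how much of a very long line to show.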
How far off are we with antlr4 conversion? |
I don't think anybody ever started. Also, I don't think anyone has any idea of what kind of benefits it will give us, and specifically whether this conversion will give us better error messages. |
No particular reason, apart from that it looks to be actively developed; I saw antlr/antlr4#4237, etc. I did not look into the differences in depth. |
Another example from a user: due to a bug in Java Driver, it was generating a double negation in counter updates (essentially
The error message returned by Scylla was unhelpful for the user to find the bug:
The user said: "the syntax error log seems to be unkind" |
The full error message is:
No wonder the user thought it is "unkind". |
@avelanarius I tried
Which is also completely unhelpful (even when I know where the error is, I don't know what the message means). The error message from today's Scylla (after 57ffbcb) is:
I don't know why it only has one quote character and is missing the "EOF" token that the Cassandra version generated. We'll need to fix that. But the problem here is funny (or sad, depending on how you look at it): in CQL, "--" (two minus signs) can be used to start a comment. I don't know why or who decided it, but it is documented at https://docs.datastax.com/en/cql-oss/3.3/cql/cql_reference/cqlRefComment.html :-( So everything after the "--" was a comment, the expression indeed ended prematurely, and the "unexpected EOF" was the right message. But an "unclosed comment" message would have been much, much more helpful. |
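As a rough illustration of how a more helpful hint could be produced, the Python sketch below finds where a `--` comment starts in a statement while skipping single-quoted string literals. The function name `find_line_comment` and the sample statement are hypothetical, and the sketch deliberately ignores quoted identifiers and the other CQL comment styles.

```python
def find_line_comment(statement: str):
    """Return the index where a `--` comment starts, or None if there is none.

    Text inside single-quoted string literals is skipped; a doubled quote
    ('') inside a literal is treated as an escaped quote.
    """
    in_string = False
    i = 0
    while i < len(statement):
        ch = statement[i]
        if in_string:
            if ch == "'":
                if statement[i + 1 : i + 2] == "'":
                    i += 1  # escaped quote: stay inside the literal
                else:
                    in_string = False
        elif ch == "'":
            in_string = True
        elif statement[i : i + 2] == "--":
            return i
        i += 1
    return None


# Hypothetical counter update with an accidental double minus.
statement = "UPDATE t SET c = c --1 WHERE p = 0"
start = find_line_comment(statement)
if start is not None:
    print(f"note: text from column {start} onward is a comment: {statement[start:]!r}")
```

An error path that hits end-of-input could run such a check and append the note, so the user sees that the rest of the statement was swallowed by a comment.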
|
This also happens when you provide an invalid UUID:
cqlsh:test> CREATE TABLE id (id uuid PRIMARY KEY);
cqlsh:test> SELECT * FROM id WHERE id = 00000000-0000-0000-0000-0000-000000000000 ;
SyntaxException: line 1:56 : unexpected input...
expected one of : Actually dude, we didn't seem to be expecting anything here, or at least
I could not work out what I was expecting, like so many of us these days!
cqlsh:test>
Normally I would open a separate issue, but it seems like we are grouping all weird CQL syntax errors here? |
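For comparison, a small Python sketch of the kind of explanation a user would want for the UUID literal above: it checks the usual 8-4-4-4-12 hexadecimal shape and says which part is wrong. The helper name `explain_uuid_literal` and the message wording are made up for illustration and do not reflect how Scylla's lexer handles UUIDs.

```python
import re

UUID_GROUP_LENGTHS = (8, 4, 4, 4, 12)
HEX_GROUP = re.compile(r"[0-9a-fA-F]+")


def explain_uuid_literal(literal: str):
    """Return None if `literal` has the 8-4-4-4-12 hex shape, else a reason."""
    groups = literal.split("-")
    if len(groups) != len(UUID_GROUP_LENGTHS):
        return (
            f"expected {len(UUID_GROUP_LENGTHS)} hyphen-separated groups, "
            f"got {len(groups)}"
        )
    for position, (group, expected) in enumerate(
        zip(groups, UUID_GROUP_LENGTHS), start=1
    ):
        if not HEX_GROUP.fullmatch(group):
            return f"group {position} is not hexadecimal: {group!r}"
        if len(group) != expected:
            return f"group {position} has {len(group)} hex digits, expected {expected}"
    return None


# The literal from the cqlsh session above has one group too many.
print(explain_uuid_literal("00000000-0000-0000-0000-0000-000000000000"))
```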
And a single dash, with a space... that's likely it.
|
If the CQL syntax requires a certain keyword, but it's not there, we get a weird error. For example consider that the correct syntax is:
Note that the word "VIEW" is mandatory - there is no materialized anything-else. If you try to use the word "DOG" instead,
The error message is:
The "missing K_VIEW" message is correct (although the user should be told about the word "VIEW", not "K_VIEW"), but the bizarre missing context message that follows is ugly.
Another example: the new IS NOT NULL syntax requires NULL - cannot be anything else. So if you try "is not 3":
The result is again the ugly:
These examples use materialized views syntax, but I'm sure a non-MV syntax example can be found.
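The point about showing "VIEW" rather than "K_VIEW" can be sketched in a few lines of Python. This assumes that keyword tokens carry a `K_` prefix, as the `K_VIEW` and `K_KEY` names in this thread suggest; the function names and the position used in the example are hypothetical.

```python
def display_token(token_name: str) -> str:
    """Turn an internal token name such as K_VIEW into user-facing text."""
    if token_name.startswith("K_"):
        return token_name[2:]  # drop the internal keyword prefix
    return token_name


def missing_token_message(line: int, column: int, token_name: str) -> str:
    """Format a 'missing token' error using the user-facing keyword."""
    return f"line {line}:{column} : Missing '{display_token(token_name)}'"


# Hypothetical position; only the keyword translation matters here.
print(missing_token_message(1, 20, "K_VIEW"))
```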