[antlr4-python3-runtime-4.13.1] ANTLR runtime and generated code versions disagree: 4.13.1!=4.11.2-SNAPSHOT #4655

Open
Thomasb81 opened this issue Jul 6, 2024 · 5 comments

Comments

@Thomasb81
Contributor

Hello

This is a bug report for https://pypi.org/project/antlr4-python3-runtime/
Apparently something went wrong during the release process of 4.13.1: the generated lexer for the XPath feature was not regenerated.

This leads to a systematic runtime error whenever we use any antlr4-python3-runtime version other than 4.11.2-SNAPSHOT,
i.e.:

ANTLR runtime and generated code versions disagree: 4.13.1!=4.11.2-SNAPSHOT

# Generated from XPathLexer.g4 by ANTLR 4.11.2-SNAPSHOT

self.checkVersion("4.11.2-SNAPSHOT")

Actually, the issue still exists in the dev branch:
https://github.com/antlr/antlr4/blob/dev/runtime/Python3/src/antlr4/xpath/XPathLexer.py

The release procedure described at https://github.com/antlr/antlr4/blob/master/doc/releasing-antlr.md probably misses this update?
@parrt any thoughts?

@Thomasb81
Contributor Author

My mistake: apparently it is only an informational message. But running the tests that ship with the runtime shows it.
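
For anyone landing here with the same message, below is a minimal sketch of a check of this kind. It assumes, as the comment above suggests, that the check only compares two version strings and prints a notice rather than raising. The name `check_version` and its parameters are hypothetical and are not the runtime's actual API.

```python
def check_version(runtime_version: str, generated_version: str) -> bool:
    """Return True when the versions agree; otherwise print a notice and return False.

    Hypothetical stand-in for the check triggered by generated code.
    It never raises, which is why a mismatch is informational only.
    """
    if runtime_version == generated_version:
        return True
    print(
        "ANTLR runtime and generated code versions disagree: "
        f"{runtime_version}!={generated_version}"
    )
    return False
```

Under that assumption, parsing continues after the notice is printed; regenerating the lexer with the matching tool version would make the notice disappear.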

@kaby76
Contributor

kaby76 commented Jul 11, 2024

Actually, it's worse than this one issue. Not only do the generated files disagree in timestamps and versions; the XPathLexer.g4 grammars themselves differ slightly, and not just in target-specific code or in symbol renaming (the target-agnostic format is not followed). I also see a token range difference between the .g4 grammars.

$ find . -name XPathLexer.\*
./Cpp/runtime/src/tree/xpath/XPathLexer.cpp
./Cpp/runtime/src/tree/xpath/XPathLexer.g4
./Cpp/runtime/src/tree/xpath/XPathLexer.h
./Cpp/runtime/src/tree/xpath/XPathLexer.tokens
./CSharp/src/Tree/Xpath/XPathLexer.cs
./CSharp/src/Tree/Xpath/XPathLexer.g4
./CSharp/src/Tree/Xpath/XPathLexer.tokens
./Java/src/org/antlr/v4/runtime/tree/xpath/XPathLexer.class
./Java/src/org/antlr/v4/runtime/tree/xpath/XPathLexer.java
./Java/target/classes/org/antlr/v4/runtime/tree/xpath/XPathLexer.class
./Python3/src/antlr4/xpath/XPathLexer.g4
./Python3/src/antlr4/xpath/XPathLexer.py
07/11-06:24:04 /c/Users/Kenne/Documents/GitHub/antlr4/runtime
$ e ./Java/src/org/antlr/v4/runtime/tree/xpath/XPathLexer.java
07/11-06:24:41 /c/Users/Kenne/Documents/GitHub/antlr4/runtime
$ diff ./Cpp/runtime/src/tree/xpath/XPathLexer.g4 ./CSharp/src/Tree/Xpath/XPathLexer.g4
3c3
< tokens { TOKEN_REF, RULE_REF }
---
> tokens { TokenRef, RuleRef }
15,17c15,17
< word: TOKEN_REF
<       |       RULE_REF
<       |       STRING
---
> word: TokenRef
>       |       RuleRef
>       |       String
22,25c22,25
< ANYWHERE : '//' ;
< ROOT   : '/' ;
< WILDCARD : '*' ;
< BANG   : '!' ;
---
> Anywhere : '//' ;
> Root   : '/' ;
> Wildcard : '*' ;
> Bang   : '!' ;
29,32c29,33
<                               if (isupper(getText()[0]))
<                                 setType(TOKEN_REF);
<                               else
<                                 setType(RULE_REF);
---
>                               String text = Text;
>                               if ( Char.IsUpper(text[0]) )
>                                       Type = TokenRef;
>                               else
>                                       Type = RuleRef;
58,59c59,60
<             |   '\uFDF0'..'\uFFFF' // implicitly includes ['\u10000-'\uEFFFF]
<             ;
---
>             |   '\uFDF0'..'\uFFFD'
>             ; // ignores | ['\u10000-'\uEFFFF] ;
61c62
< STRING : '\'' .*? '\'';
---
> String : '\'' .*? '\'' ;
63c64
< //WS : [ \t\r\n]+ -> skip ;
---
> //Ws : [ \t\r\n]+ -> skip ;
07/11-06:25:10 /c/Users/Kenne/Documents/GitHub/antlr4/runtime
$ grep -i -e generated `find . -name 'XPathLexer.*'`
./Cpp/runtime/src/tree/xpath/XPathLexer.cpp:// Generated from XPathLexer.g4 by ANTLR 4.13.0
./Cpp/runtime/src/tree/xpath/XPathLexer.h:// Generated from XPathLexer.g4 by ANTLR 4.13.0
./CSharp/src/Tree/Xpath/XPathLexer.cs:// <auto-generated>
./CSharp/src/Tree/Xpath/XPathLexer.cs://     This code was generated by a tool.
./CSharp/src/Tree/Xpath/XPathLexer.cs://     the code is regenerated.
./CSharp/src/Tree/Xpath/XPathLexer.cs:// </auto-generated>
./CSharp/src/Tree/Xpath/XPathLexer.cs:// Generated from XPathLexer.g4 by ANTLR 4.11.2-SNAPSHOT
./CSharp/src/Tree/Xpath/XPathLexer.cs:[System.CodeDom.Compiler.GeneratedCode("ANTLR", "4.11.2-SNAPSHOT")]
./Python3/src/antlr4/xpath/XPathLexer.py:# Generated from XPathLexer.g4 by ANTLR 4.11.2-SNAPSHOT
07/11-06:27:00 /c/Users/Kenne/Documents/GitHub/antlr4/runtime
$
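
The `grep` above can be turned into a repeatable check. This is a rough sketch, assuming every generated file carries a header of the form `Generated from <grammar> by ANTLR <version>`, as the output above shows; the function name `generated_versions` is made up for illustration.

```python
import re
from collections import defaultdict
from pathlib import Path

# Matches header lines such as "// Generated from XPathLexer.g4 by ANTLR 4.13.0".
HEADER = re.compile(r"Generated from \S+ by ANTLR (\S+)")


def generated_versions(root):
    """Map each ANTLR tool version to the XPathLexer files generated by it."""
    versions = defaultdict(list)
    for path in sorted(Path(root).rglob("XPathLexer.*")):
        if not path.is_file():
            continue
        # errors="ignore" lets binary files (e.g. .class) be scanned without failing.
        text = path.read_text(encoding="utf-8", errors="ignore")
        match = HEADER.search(text)
        if match:
            versions[match.group(1)].append(path.as_posix())
    return dict(versions)
```

Calling `generated_versions(".")` from the `runtime` directory returns one key per tool version found. More than one key means the generated lexers are out of sync, which is the situation shown in the listing above.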

Note: For one of the ongoing rewrites of Antlr, I'd highly recommend a complete rewrite of the Antlr tree representation in order to support tree edits and querying of off-channel content, along with replacing the current XPath engine with a real one, preferably Selenium, which is the gold standard. Trash uses a port of the ancient Xerces engine, which is excellent, but only supports XPath version 2. I'm still porting Selenium to C#.

@jimidle
Collaborator

jimidle commented Jul 11, 2024

A difference in case. Not sure why there is such a difference, but I can't see how this is an issue. I've never had to write a target-agnostic frontend/compiler in the last 3 decades... er, ever.

Also, ANTLR produces parse trees. They are only useful for facilitating production of a more formal AST. That's where we do the micro optimizations. If writing a real world compiler, generate LLVM and leave it there.

This is not an official position in any way. Just my thoughts. Python for parsing? Hmmm

@kaby76
Contributor

kaby76 commented Jul 11, 2024

> A difference in case.

'\uFDF0'..'\uFFFF' is not the same range as '\uFDF0'..'\uFFFD'. It is not a matter of a difference in case.
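
To make the difference concrete, here is a small sketch that enumerates both ranges from the diff above and lists the code points only one of them accepts. The helper `code_points` is a name invented for this example.

```python
def code_points(first: str, last: str) -> set:
    """Return the set of code points in the inclusive range first..last."""
    return set(range(ord(first), ord(last) + 1))


cpp_range = code_points("\uFDF0", "\uFFFF")     # range in the Cpp grammar
csharp_range = code_points("\uFDF0", "\uFFFD")  # range in the CSharp grammar

# Code points accepted by the Cpp range but not by the CSharp range.
only_in_cpp = sorted(cpp_range - csharp_range)
print([f"U+{cp:04X}" for cp in only_in_cpp])  # ['U+FFFE', 'U+FFFF']
```

The two grammars therefore disagree on exactly two code points, U+FFFE and U+FFFF, independent of any renaming of token names.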

@jimidle
Collaborator

jimidle commented Jul 11, 2024 via email
