Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"EMPTY" is a reserved word in CSharp target #3199

Open
kaby76 opened this issue Jun 7, 2021 · 13 comments
Open

"EMPTY" is a reserved word in CSharp target #3199

kaby76 opened this issue Jun 7, 2021 · 13 comments

Comments

@kaby76
Copy link
Contributor

kaby76 commented Jun 7, 2021

As the title says, EMPTY is a reserved symbol in grammars for the C# target (and likely in the alt tool/runtime Antlr4cs). That's because it's defined in ParserRuleContext here. If you define EMPTY as a lexer symbol in your grammar, the compiler will give warnings. For further information, see this issue. The workaround is to just rename EMPTY to EMPTY_ in your grammar. (I use trrename to bulk rename such conflicts quickly.)

@KvanTTT
Copy link
Member

KvanTTT commented Jun 7, 2021

Could be added to badwords list for C#.

@kaby76
Copy link
Contributor Author

kaby76 commented Jun 7, 2021

Ha! C# is the only language without "bad words". Goodness, what's a language without of few colorful metaphors. (All the other targets have the API Cpp, Dart, Go, JavaScript, Java, etc.)

@KvanTTT
Copy link
Member

KvanTTT commented Jun 7, 2021

This is because C# allows keyword escaping. But EMPTY seems like is not the case and I guess such a set should be added to C#. It should contain EMPTY, children, _start, _stop, and other members that potentially conflict with grammar rule names. I'm curious, can this word set be generated using reflection?

C# "bad words" is described in CSharp.stg.

@kaby76
Copy link
Contributor Author

kaby76 commented Jun 8, 2021

Goodness, why wasn't this mapping added for each of the other targets? This seems like an easy change and it would have prevented hundreds, if not thousands, of renames in grammars-v4 to remove symbol conflicts that I and others manual made over many months. (I implemented and used trrename a few months ago to change whole-scale the remaining grammars--half--in grammars-v4 to complete the renaming to eliminate the symbol conflicts.) Symbol rewrite in the generated code was suggested in 2015 but nothing came of it.

_stop, _start wouldn't have to be added because they're not legal Antlr nonterminal names. But, maybe "children", and whatever else is being used as a symbol name but that's a method or field name in the generated code. And, move the actual "badWords" list to this mapping.

@KvanTTT
Copy link
Member

KvanTTT commented Jun 8, 2021

Goodness, why wasn't this mapping added for each of the other targets?

Because other targets don't support keyword escaping. Maybe Swift also supports (see docs).

Symbol rewrite in the generated code was suggested in 2015 but nothing came of it.

Yes, it's a good idea. Just add _ suffix to conflicting identifiers at the generation step.

@kaby76
Copy link
Contributor Author

kaby76 commented Jun 8, 2021

Goodness, why wasn't this mapping added for each of the other targets?

Because other targets don't support keyword escaping.

Good grief. We didn't add the code to the .stg's (and rip out "badWords" elsewhere) because it's not already added.

This is really an easy change. Will give a try.

@kaby76
Copy link
Contributor Author

kaby76 commented Jun 9, 2021

StringTemplate is delightful code! I just wish it was possible to pass the templates for code gen into the Antlr tool via command line, so I could write custom parsers, and new targets. As a prototype, I emptied out the "badWords" list for the Java target, then refactored it and added it to Java.stg. I also needed to change a few places where names need to be mapped to altered form. I tried it out on a grammar with "null" and it works beautifully. Attached is the Java.stg code.
Java.txt. It should be easy to change the other templates and take care of symbol conflicts once and for all.

@KvanTTT
Copy link
Member

KvanTTT commented Jun 9, 2021

It would be great!

Attached is the Java.stg code.

I recommend using GitHub Gist or another code-sharing service because they provide code highlighting, editing, and other useful features. Also, they are more convenient for viewers.

@KvanTTT
Copy link
Member

KvanTTT commented Nov 20, 2021

@kaby76 I'm not sure that placing of such keywords into .stg code is a very good idea:

  1. It's redundant because all dictionary values corresponds to their keys plus suffix (@ for C#) or prefix (_ for others).
  2. In theory it's possible that replaced value also conflicts with runtime keywords. Especially it's actual for C languages where _ using is widespread.

Instead I suggest replacing on ANTLR side. We can have prefix or suffix fields for every runtime. For instance, C# will have empty suffix but @ prefix. Swift will have both suffix and prefix (```). Other runtimes will have _ suffix.

I'm going to fix it in my ANTLR fork together with other tasks that I consider important especially for grammars (first of all, introducing of case insensitive option).

@kaby76
Copy link
Contributor Author

kaby76 commented Nov 20, 2021 via email

@NickStrupat
Copy link

Name collisions could be resolved by placing the new modifier on the generated lexer/parser class members which collide with the library member.

The EMPTY token member can be retrieved via <LexerName>Lexer.EMPTY and the original ParserRuleContext.EMPTY member can be retrieved by static cast of your instance to ParserRuleContext.

@KvanTTT
Copy link
Member

KvanTTT commented Jan 19, 2022

We are working on the problem of conflicting identifiers.

@NickStrupat
Copy link

NickStrupat commented Jan 19, 2022

@KvanTTT you're awesome! Wondering if this issue I found might be related or of interest:

#3503

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants