Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify output path generation; new -Xexact-output-dir option #2065

Closed
wants to merge 1 commit into from
Closed

Simplify output path generation; new -Xexact-output-dir option #2065

wants to merge 1 commit into from

Conversation

mike-lischke
Copy link
Member

This is a new version of PR #1829 from 19 April and #1905 (which lost all the changes I made during a git merge).

The twisted way to create a javac like output path breaks many use cases, including very simple ones with lexer + parser grammar in a relative subdir (where parser generation doesn't find the lexer tokens). Additionally, not all projects prefer to have their generated files in an output folder which is comprised of the given output path and the subdir given for the grammar files. The check for absolute paths as decision criterion is a weak one anyway - it makes simple usage difficult and doesn't deal with path expansion on all platforms.

If a javac like output path is required it should be easy to construct this upfront and provide it as the output path parameter, instead of implicitely creating it. This patch simplifies things without any WTF moments because generated files end up at unexpected locations. It now can deal with simple cases (no output, no grammar subdir) up to complicated cases (nested and/or absolute output dir, multiple grammar subfolders).

If more than one output folder is wanted run the tool multiple times with different output path settings and add the lib option if you need to reference previously created files.

It might look like a disadvantage that now generation of parser files ends up at a different path when run with relative grammar subfolders, but actually that was broken anyway so I don't expect anyone having used that (but instead either absolute paths or no subfolders at all).

Fixes #1087, #753, #638, tunnelvisionlabs/antlr4ts#306, tunnelvisionlabs/antlr4ts#303

@parrt
Copy link
Member

parrt commented Nov 4, 2017

@mike-lischke I'm working on integrating this one now. I will need to add an option so that I can leave everything as backward-compatible. trying to understand the Delta and the functionality from your changes and based on our discussions above. hang on...

@parrt
Copy link
Member

parrt commented Nov 4, 2017

If I may summarize the changes...

  1. Output -o directory specifier is the exact directory containing the output. Previously it would include the relative path specified on the grammar itself for the purposes of packages.

new: -o /tmp subdir/T.g4 => /tmp/subdir/T.java
old: -o /tmp subdir/T.g4 => /tmp/T.java

  1. Previously we looked for the tokens vocab file in the -lib dir or in the output dir. New: also look in the directory containing the grammar, particularly if it it is specified with a path.

Example for the output directory (4.7)

Here is the existing 4.7 functionality.

$ cd /tmp/parrt
$ tree
.
├── B.g4
└── src
    └── pkg
        └── A.g4
$ a4.7 -o /tmp/build src/pkg/A.g4
$ tree /tmp/build
/tmp/build/
└── src
    └── pkg
        ├── A.tokens
        ├── ABaseListener.java
        ├── ALexer.java
        ├── ALexer.tokens
        ├── AListener.java
        └── AParser.java

Now, let's build a grammar that sits in the current directory:

$ a4.7 -o /tmp/build B.g4
$ tree /tmp/build
/tmp/build
├── B.tokens
├── BBaseListener.java
├── BLexer.java
├── BLexer.tokens
├── BListener.java
├── BParser.java
└── src
    └── pkg
        ├── A.tokens
        ├── ABaseListener.java
        ├── ALexer.java
        ├── ALexer.tokens
        ├── AListener.java
        └── AParser.java

Finally, if we don't specify the output directory, it paid attention to the relative path specified on the input grammar:

$ a4.7 src/pkg/A.g4
$ tree
.
├── B.g4
└── src
    └── pkg
        ├── A.g4
        ├── A.tokens
        ├── ABaseListener.java
        ├── ALexer.java
        ├── ALexer.tokens
        ├── AListener.java
        └── AParser.java

Example for the output directory (4.7.1 with -Xexact-output-dir)

Now, the output directory is the exact directory where output is generated regardless of relative paths on the grammar

$ cd /tmp/parrt
$ a4.7.1 -Xexact-output-dir  -o /tmp/build src/pkg/A.g4
$ tree /tmp/build
/tmp/build
├── A.tokens
├── ABaseListener.java
├── ALexer.java
├── ALexer.tokens
├── AListener.java
└── AParser.java

If you use the package option, it still does not change where the output is generated if you use -o

$ a4.7.1 -Xexact-output-dir -package pkg -o /tmp/build src/pkg/A.g4
$ tree /tmp/build
/tmp/build
├── A.tokens
├── ABaseListener.java
├── ALexer.java
├── ALexer.tokens
├── AListener.java
└── AParser.java

4.7.1 does however add the package specification into the generated files:

$ grep package /tmp/build/A*.java
/tmp/build/ABaseListener.java:package pkg;
/tmp/build/ALexer.java:package pkg;
/tmp/build/AListener.java:package pkg;
/tmp/build/AParser.java:package pkg;

Compare this to 4.7:

$ a4.7 -package pkg -o /tmp/build src/pkg/A.g4
beast:/tmp/parrt $ tree /tmp/build
/tmp/build
└── src
    └── pkg
        ├── A.tokens
        ├── ABaseListener.java
        ├── ALexer.java
        ├── ALexer.tokens
        ├── AListener.java
        └── AParser.java

Example of where it looks for tokens vocab

In 4.7, we got an error for an obvious case that should work:

$ cd /tmp/parrt
$ tree
.
└── src
    └── pkg
        ├── L.g4
        └── P.g4
$ a4.7 -o /tmp/build src/pkg/*.g4
error(160): P.g4:2:21: cannot find tokens file /tmp/build/L.tokens
warning(125): P.g4:3:4: implicit definition of token A in parser

In 4.7.1 it looks in the directory containing the grammars as well:

$ a4.7.1 -o /tmp/build src/pkg/*.g4
$ tree /tmp/build
/tmp/build
├── L.java
├── L.tokens
├── P.java
├── P.tokens
├── PBaseListener.java
├── PListener.java
└── src
    └── pkg
        ├── L.java
        └── L.tokens

parrt added a commit to parrt/antlr4 that referenced this pull request Nov 4, 2017
…ct if you use a new option `-Xexact-output-dir`. Fixes antlr#1087, Fixes antlr#753, Fixes antlr#638.
@parrt
Copy link
Member

parrt commented Nov 4, 2017

Merged manually after adding -Xexact-output-dir

@parrt parrt closed this Nov 4, 2017
@mike-lischke mike-lischke deleted the output-path branch November 6, 2017 18:34
@parrt parrt changed the title Simplify output path generation Simplify output path generation; new -Xexact-output-dir option Dec 8, 2017
@DamianReeves
Copy link

Is there a combination of -package and -output that works properly for 4.7? I'm stuck on 4.7 and it will be weeks before I can get 4.7.1 in my organization. I bump up with this on the regular and am feeling the pain.

My scenario looks like the one described above for 4.7:

$ cd /tmp/parrt
$ tree
.
└── src
    └── pkg
        ├── L.g4
        └── P.g4
$ a4.7 -o /tmp/build src/pkg/*.g4 -package "com.myorg.myproj"
error(160): P.g4:2:21: cannot find tokens file /tmp/build/L.tokens
warning(125): P.g4:3:4: implicit definition of token A in parser

@parrt
Copy link
Member

parrt commented Jan 27, 2018

Depends on "properly" I guess. Most people have had no trouble doing what they need beforehand so maybe it's a matter or reorg of output dirs?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants