Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhanced EOL support #2685

Merged
merged 36 commits into from May 26, 2020
Merged

Conversation

MysterAitch
Copy link
Member

Fixes #2647

This PR adds an enum to represent different types of end-of-line / line-separator characters.

  • It also adds a utility method which allows you to "detect" which style of line-separator a given string uses.

  • This is incorporated into the parsing process by reading in the content of the given provider and adding this as a data field within the top-most node.

  • This is then used as part of the CSM and pretty printer where new added nodes will reflect this (defaulting to the system's eol character if it is unknown/ambiguous e.g. it is a single line or there are mixed endings).

Reading in the provider content and then creating a brand new StringProvider using this string is a bit "hacky" and I'm sure there must be a better way to do it, I just don't know what it is -- comments and suggestions welcomed!

… which line ending was found during parsing -- this is then used during the lexical preservation csmtoken insertion/diffing to ensure that the "new" newlines are of the same type as the parent....
…r one isn't set or it is a meta-value e.g. EMPTY or NONE
…t should be unescaped line separators, but the lookup returns escaped line separators; deprecated isSpaceOrTab and added isWhitespaceButNotEndOfLine as a replacement;
…pace within the grammar (e.g. non-breaking spaces and tab characters)
@matozoid
Copy link
Contributor

Random informational remark: the textblock feature of Java 12 or 13 or so is very explicit about the kind of eols it wants. Be careful not to break it :-)

@MysterAitch
Copy link
Member Author

Random informational remark: the textblock feature of Java 12 or 13 or so is very explicit about the kind of eols it wants. Be careful not to break it :-)

It would be a shame if our test suite wasn't able to highlight any regression like this ;)

But yes -- Just had a read through the JEP and found this:

https://openjdk.java.net/jeps/378#1--Line-terminators

The content may include line terminators directly, unlike the characters in a string literal. The use of \n in a text block is permitted, but not necessary or recommended. For example, the text block:

"""
line 1
line 2
line 3
"""

is equivalent to the string literal:

"line 1\nline 2\nline 3\n"

or a concatenation of string literals:

"line 1\n" +
"line 2\n" +
"line 3\n"

If a line terminator is not required at the end of the string, then the closing delimiter can be placed on the last line of content. For example, the text block:

"""
line 1
line 2
line 3"""

is equivalent to the string literal:

"line 1\nline 2\nline 3"

@MysterAitch
Copy link
Member Author

Hmm... as far as I can tell, there is no reason for the tests to fail on appveyor (consistently) but then pass on all other places...

As far as I can see, the "expected" and "actual" sections are identical, but there's no information if there are line separator differences... I'll push a revised test in a moment.

[INFO] Results:
[INFO] 
[ERROR] Failures: 
[ERROR]   CommentsInserterTest.issue412:114 expected: </* 1
1 */
/*
2
2
 */
package /* a */
/* b */
// c
// d
com.pany/*alma*/
/*k�rte*/
// l�fasz
// j�ska
.experiment;
// z
// x
/*y*/
/*w*/
/*aa*/
/*bb*/
// cc
// dd
import com.github.javaparser.JavaParser;
public class Main {
    public static void main(String[] args) throws FileNotFoundException, IOException, ParseException {
    // try (FileInputStream fisTargetFile = new FileInputStream(new File(FILENAME))) {
    // final String content = IOUtils.toString(fisTargetFile, "UTF-8");
    // System.out.println(content);
    // }
    }
}
> but was: </* 1
1 */
/*
2
2
 */
package /* a */
/* b */
// c
// d
com.pany/*alma*/
/*k�rte*/
// l�fasz
// j�ska
.experiment;
// z
// x
/*y*/
/*w*/
/*aa*/
/*bb*/
// cc
// dd
import com.github.javaparser.JavaParser;
public class Main {
    public static void main(String[] args) throws FileNotFoundException, IOException, ParseException {
    // try (FileInputStream fisTargetFile = new FileInputStream(new File(FILENAME))) {
    // final String content = IOUtils.toString(fisTargetFile, "UTF-8");
    // System.out.println(content);
    // }
    }
}
>
[INFO] 
[ERROR] Tests run: 1085, Failures: 1, Errors: 0, Skipped: 6
[INFO] 
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for javaparser-parent 3.15.23-SNAPSHOT:
[INFO] 
[INFO] javaparser-parent .................................. SUCCESS [  1.594 s]
[INFO] javaparser-core .................................... SUCCESS [ 27.814 s]
[INFO] javaparser-core-testing ............................ FAILURE [01:02 min]
[INFO] javaparser-core-testing-bdd ........................ SKIPPED
[INFO] javaparser-core-generators ......................... SKIPPED
[INFO] javaparser-core-metamodel-generator ................ SKIPPED
[INFO] javaparser-core-serialization ...................... SKIPPED
[INFO] javaparser-symbol-solver-core ...................... SKIPPED
[INFO] javaparser-symbol-solver-testing ................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  01:31 min
[INFO] Finished at: 2020-05-22T16:57:37Z
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M4:test (default-test) on project javaparser-core-testing: There are test failures.
[ERROR] 

…`input` meaning that it checks out the code using the repo's line endings not the system's line endings

this was discovered / came up as a problem in javaparser#2685 where javaparser started to respect the line endings of test resources instead of reading line-by-line and inserting the system's line endings
@MysterAitch
Copy link
Member Author

MysterAitch commented May 23, 2020

EDIT: As a reminder to self for something to investigate tomorrow, maybe the appveyor git configuration is not performing autocrlf changes, meaning that the checked out code is unexpectedly using the repo's line endings (\n iirc) as opposed to the system's line endings or something? That might explain why the issues is only being seen on appveyor.

Oh dear, I should have followed up on this passing thought for just a couple of minutes... cursory searching turned up many results about this...

I read on another thread (http://help.appveyor.com/discussions/problems/671-coreautocrlf) the default configuration for the git autocrlf setting is "input". This means AppVeyor is checking out files with LF (\n) endings on Windows. That broke some of my tests.

https://help.appveyor.com/discussions/problems/1119-allow-changing-git-autocrlf-setting

The git core.autocrlf setting defaults to true (ie, all text files get
checked out as CRLF on Windows), except on Appveyor where's set to
"input" (ie, all text files get checked out with the upstream
repository's line endings, which for us typically means LF.)

https://gitlab.freedesktop.org/mesa/mesa/commit/9e5e3a8ead5321a6abc0add8f6e59e50052f9698

Adding this to the appveyor.yml has the tests passing :)

init:
  # Setup autocrlf -- by default, appveyor uses autocrlf input
  # ... This affects tests which expect resource files to have the systems's line separator.
  - git config --global core.autocrlf true

https://ci.appveyor.com/project/ftomassetti/javaparser/builds/33067787

@matozoid
Copy link
Contributor

matozoid commented May 23, 2020

Adding this to the appveyor.yml has the tests passing :)

Yeeeeessss, but the tests passed before, and shouldn't need an AppVeyor change. Can you reason why this PR broke all those tests in the first place?

One place to look for is files with test content. Those have some kind of EOL, and tests probably depend on that, or better: expect certain EOLs to be in there. That means AppVeyor was doing the right thing - leave the EOLs alone.

Note that the AppVeyor build exists for only one thing: check EOL behaviour on Windows!

@matozoid
Copy link
Contributor

I was notified that I was mentioned here or so? I don't see anything? Or was it a notification of a new commit maybe?

@MysterAitch
Copy link
Member Author

MysterAitch commented May 23, 2020

Adding this to the appveyor.yml has the tests passing :)

Yeeeeessss, but the tests passed before, and shouldn't need an AppVeyor change. Can you reason why this PR broke all those tests in the first place?

One place to look for is files with test content. Those have some kind of EOL, and tests probably depend on that, or better: expect certain EOLs to be in there. That means AppVeyor was doing the right thing - leave the EOLs alone.

Note that the AppVeyor build exists for only one thing: check EOL behaviour on Windows!

Basically appveyor checked out the code using the repo's line endings. That meant that the files all have \n endings instead of \r\n.

Once this PR started to respect the file's ACTUAL line endings (e.g. within lexical preservation tests), it broke the tests.

Setting autocrlf to true causes appveyor to checkout the code using the host's line separator again 😄.

@MysterAitch
Copy link
Member Author

From chat:

Danny van Bruggen @matozoid 19:02
Yeah, I think you can simply register it as a pre- and post-processor. Look through the stream while preprocessing, store the eol info inside the processor, and in the post-processor you have the result object so you can store the eol info wherever you're storing it currently.

The only difficult thing is handling the JavaCC Provider implementation :-# That interface is begging for one-off errors.

@matozoid
Copy link
Contributor

Well done @MysterAitch ! This is a fundamental improvement!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Lexical Preservaton -- Injected code uses host's EOL not the source's
2 participants