[java] New expression and type grammar #1759

oowekyala · 2019-04-04T19:45:59Z

(This is testable with the jar in the wiki)

Main changes

Many nodes are now interfaces and not separate nodes in the tree
PrimaryExpression grammar is completely rewritten
[java] About operator nodes #1661 is implemented
Fixes [java] Add new node for anonymous class declaration #905
Type grammar is substantially changed
- This adds support for some Java 8 constructs our parser doesn't recognise yet (refs [java] Java8 parsing corner case with annotated array types #997, fixes [java] Parsing error on annotated inner class #1367)

There are still many things to do to get a consistent AST, eg the new ArrayTypeDims should be used in the other places it's meant for, Dimensionable should be removed, etc. But this PR is already big enough (keep in mind, about half of the line changes are tests).

All the changes to the AST structure are already described on this wiki page in the category "Done".

There's one change that was not in the preview version you've already seen: the parameters of lambda expressions are now their own node, which removes inconsistencies between the case when there's a single VariableDeclaratorId and the case where there's a full FormalParameter.

Temporary measures

I didn't fix any tests of rules, I just kept them compilable, which sometimes involved shamelessly commenting out some code
The only tests I made sure did pass are the kotlin tests of the ast package
Nodes that are not pushed anymore by the parser are not removed, just deprecated (for the rules to compile).
They're not on the visitor interface anymore but there are some deprecated methods on the adapter and AbstractJavaRule to keep implementors compilable

PrimaryExpression

JavaCC doesn't support left-recursion, so the left recursive nodes are built by manipulating the jjtree stack.

You can see in alljavacc.xml that the method extendLeft is inserted in JJTreeJavaParserState.java. This bumps the arity of the current node (the one being built, ie the one that is not on the stack yet) by one, which means that if the node ends up being closed, the node that was pushed immediately before the current one is added as a child of the current node.

The productions PrimaryPrefix and PrimarySuffix are kept but made #void.

The first parses the expressions that can only occur in start position, eg array allocation. It uses some lookaheads to avoid pushing an AmbiguousName whenever it can.
The second parses a segment that needs a left-hand side, e.g. .foo() or ::new. The extendLeft method is called systematically by all of those productions to enclose the previous node on the stack, which necessarily exists because we've already parsed a PrimaryPrefix (or another suffix).
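
To illustrate the stack manipulation described above, here is a minimal sketch. It is not the code inserted into JJTreeJavaParserState: `Node`, `NodeStack`, `openScope`, `closeScope` and `buildChain` are hypothetical names, and the only assumption is that closing a node claims everything pushed since its scope was opened, so lowering the scope's mark by one also claims the node to its left.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class ExtendLeftDemo {

    /** Hypothetical tree node: a label plus its children in source order. */
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label) {
            this.label = label;
        }

        @Override
        public String toString() {
            return children.isEmpty() ? label : label + children;
        }
    }

    /** Simplified stand-in for a JJTree-style node stack (not the PMD class). */
    static final class NodeStack {
        private final List<Node> nodes = new ArrayList<>();
        private final Deque<Integer> marks = new ArrayDeque<>();
        private int mark = 0; // index of the first child of the node being built

        void push(Node n) {
            nodes.add(n);
        }

        /** Starts building a node: everything pushed from now on is a child. */
        void openScope() {
            marks.push(mark);
            mark = nodes.size();
        }

        /** Bumps the arity of the node being built by claiming the node to its left. */
        void extendLeft() {
            if (mark == 0) {
                throw new IllegalStateException("nothing on the stack to enclose");
            }
            mark--;
        }

        /** Pops the claimed children, attaches them, then pushes the finished node. */
        void closeScope(Node n) {
            List<Node> claimed = nodes.subList(mark, nodes.size());
            n.children.addAll(claimed);
            claimed.clear();
            mark = marks.pop();
            nodes.add(n);
        }

        Node top() {
            return nodes.get(nodes.size() - 1);
        }
    }

    /** Builds the tree for `a.foo().bar` the way a prefix/suffix loop would. */
    static Node buildChain() {
        NodeStack stack = new NodeStack();
        stack.push(new Node("a"));                     // prefix: first segment

        stack.openScope();                             // suffix ".foo()"
        stack.extendLeft();                            // enclose `a`
        stack.push(new Node("Args"));
        stack.closeScope(new Node("MethodCall:foo"));

        stack.openScope();                             // suffix ".bar"
        stack.extendLeft();                            // enclose `a.foo()`
        stack.closeScope(new Node("FieldAccess:bar"));

        return stack.top();
    }

    public static void main(String[] args) {
        System.out.println(buildChain());
    }
}
```

Running `main` prints `FieldAccess:bar[MethodCall:foo[a, Args]]`: each suffix ends up as the parent of everything to its left, which is the left-recursive shape the grammar cannot express directly.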

Disambiguation

The parser needs to parse AmbiguousNames to keep lookaheads to a minimum. The disambiguation that uses only syntactic context is implemented at various points in the parser and the initialization code of the nodes

The javadoc of ASTAmbiguousName gives an overview of that with relevant links.

Since this version already needs to reclassify some ambiguous names, the machinery to actually replace a name with an unambiguous version is already in place. A later PR can add a naive disambiguation pass that uses the current symbol table in order to remove most of the ambiguous names

Operator nodes (#1661)

That's implemented using basically the same template for all the relevant nodes. See AbstractLrBinaryExpr and eg the production ShiftExpression

Optional vs `@Nullable`

One thing I decided which may be controversial is not to use java.util.Optional in the AST API. Instead I went with JSR 305's @Nullable and @Nonnull annotations, for several reasons:

The Optional type was initially added as a form of documentation, specifically for return types (eg not for fields or method parameters). Nullable annotations achieve the same goal but may be used in all places where it's useful, without having to explicitly wrap and unwrap optionals everywhere.
Using annotations makes it possible to incrementally migrate the codebase to this better documented form, without introducing API breaking changes. Many existing methods that have a nullable result won't need to be broken.
Using annotations improves Kotlin compatibility considerably. The AST API can be deemed idiomatic in both Java and Kotlin.
The Optional API is very incomplete anyway, eg in Java 8 it lacks `isEmpty`, `ifPresentOrElse`, and `or` (added in Java 11, 9 and 9 respectively), and there's still no `ifEmpty`. Many times you just want to check whether a node is present, in which case dealing with Optional instead of doing a simple null check is obnoxious. When you do want to use Optional features like map or stuff then Optional.ofNullable is easy to add.
The difference that made me switch is that it's possible to override a Nullable method with a Nonnull method, which is of course not possible with Optional. This allows more things to be abstracted. E.g. you can have an interface QualifiableExpr { @Nullable getLhs(); } and implement it in ASTArrayAccess with a @Nonnull, instead of having to return a never-empty Optional.
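
A small sketch of that last point, under stated assumptions: `Expr`, `Name`, `MethodCall`, `ArrayAccess` and `describe` are simplified hypothetical types, not the PMD nodes, and the two annotations are declared locally as stand-ins for the JSR 305 ones so that the example compiles without any dependency.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Target;
import java.util.Objects;

public class NullableOverrideDemo {

    // Local stand-ins for the JSR 305 annotations, declared here only so the
    // sketch compiles without an extra dependency.
    @Documented @Target(ElementType.METHOD) @interface Nullable { }
    @Documented @Target(ElementType.METHOD) @interface Nonnull { }

    interface Expr {
        String text();
    }

    /** The abstraction only promises "maybe a left-hand side". */
    interface QualifiableExpr extends Expr {
        @Nullable Expr getLhs();
    }

    static final class Name implements Expr {
        private final String name;
        Name(String name) { this.name = name; }
        @Override public String text() { return name; }
    }

    static final class MethodCall implements QualifiableExpr {
        private final Expr lhs; // null for an unqualified call like foo()
        private final String name;
        MethodCall(Expr lhs, String name) { this.lhs = lhs; this.name = name; }
        @Override @Nullable public Expr getLhs() { return lhs; }
        @Override public String text() { return (lhs == null ? "" : lhs.text() + ".") + name + "()"; }
    }

    /** An array access always has a left-hand side, so the override is stricter. */
    static final class ArrayAccess implements QualifiableExpr {
        private final Expr array;
        private final int index;
        ArrayAccess(Expr array, int index) {
            this.array = Objects.requireNonNull(array);
            this.index = index;
        }
        @Override @Nonnull public Expr getLhs() { return array; }
        @Override public String text() { return array.text() + "[" + index + "]"; }
    }

    /** Code written against the interface still has to null-check. */
    static String describe(QualifiableExpr e) {
        Expr lhs = e.getLhs();
        return lhs == null ? e.text() + " is unqualified" : e.text() + " is qualified by " + lhs.text();
    }

    public static void main(String[] args) {
        System.out.println(describe(new MethodCall(null, "foo")));
        System.out.println(describe(new ArrayAccess(new Name("arr"), 0)));
    }
}
```

Callers that hold an `ArrayAccess` can use `getLhs()` without a null check, while callers that only know about `QualifiableExpr` keep the check. With an `Optional<Expr>` return type, the narrower contract could not be expressed in the subtype.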

`@NoAttribute`

This is used to reduce the XPath attributes of ASTAmbiguousName to a minimum. It has many irrelevant attributes because it implements ASTExpression and ASTType (maybe that should be changed though). We can eg also use it to remove the @Image attribute from the new nodes to avoid people using them. It's pretty simple but not tested yet.

Other stuff

I made it so that the children table is never null (just use an empty array). That makes it easier to work with and also ensures that when trying to fetch a child that doesn't exist, only ArrayIndexOutOfBoundsException is thrown, and never NullPointerException
The test DSL now asserts that the text bounds of nodes are ordered by inclusion (which is easily messed up when tampering with the jjtree stack). What's currently done on master with SwitchLabeledRule breaks this invariant, so I can't extract those changes into a separate PR without bringing in some more changes from here.
I removed the SemanticCheck attribute from AmbiguousName, we can reuse the removed code to do the temporary disambiguation pass later
JavaParserVisitorAdapter now delegates by default to abstract supertypes of nodes. So eg overriding visit(ASTExpression) effectively visits all expressions.
- This is error prone to maintain manually so we should probably generate the delegation code somehow (would be nice to have it in the default interface methods so that AbstractJavaRule can have the same behaviour without duplication).
- I deprecated JavaParserVisitorReducedAdapter because it doesn't serve any purpose now. MethodLikeNode introduces a discrepancy because now LambdaExpression delegates to visit(ASTExpression) and not visit(ASTMethodLikeNode). I think it should be removed entirely, it makes LambdaExpression implement AccessNode (!). This can go with a simplification of the metrics framework. There's no reason that metrics be constrained to be computed on methods, most of them can actually be computed on any block of code...
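
The delegation to abstract supertypes can be sketched as follows. This is only an illustration of the idea using default interface methods, as suggested above; it is not the actual JavaParserVisitorAdapter code, and `Expr`, `Literal`, `MethodCall`, `Visitor` and `countExpressions` are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

public class VisitorDelegationDemo {

    /** Abstract supertype of all expression nodes in this sketch. */
    interface Expr {
        void accept(Visitor v);
    }

    static final class Literal implements Expr {
        @Override public void accept(Visitor v) { v.visit(this); }
    }

    static final class MethodCall implements Expr {
        @Override public void accept(Visitor v) { v.visit(this); }
    }

    interface Visitor {
        /** Catch-all for every expression; does nothing by default. */
        default void visit(Expr node) { }

        // Each concrete visit method delegates to the abstract supertype by default.
        default void visit(Literal node) { visit((Expr) node); }

        default void visit(MethodCall node) { visit((Expr) node); }
    }

    /** Overriding only visit(Expr) is enough to see every kind of expression. */
    static int countExpressions(List<Expr> exprs) {
        int[] count = {0};
        Visitor counter = new Visitor() {
            @Override public void visit(Expr node) { count[0]++; }
        };
        for (Expr e : exprs) {
            e.accept(counter);
        }
        return count[0];
    }

    public static void main(String[] args) {
        List<Expr> exprs = Arrays.asList(new Literal(), new MethodCall(), new Literal());
        System.out.println(countExpressions(exprs));
    }
}
```

Every concrete node type needs its own delegating method, which is why writing these by hand is error prone and generating them looks attractive.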

jsotuyod

I still have to look into the grammar itself in more detail, but this is looking really good so far

docs/pages/next_major_development.md

pmd-java/pom.xml

pmd-java/src/main/java/net/sourceforge/pmd/lang/java/ast/ASTClassOrInterfaceType.java

pmd-java/src/main/java/net/sourceforge/pmd/lang/java/ast/ASTPrimitiveType.java

pmd-java/src/main/java/net/sourceforge/pmd/lang/java/ast/AbstractJavaNode.java

pmd-java/src/main/java/net/sourceforge/pmd/lang/java/ast/AssignmentOp.java

pmd-java/src/main/java/net/sourceforge/pmd/lang/java/ast/AstImplUtil.java

.../src/main/java/net/sourceforge/pmd/lang/java/rule/errorprone/AvoidDuplicateLiteralsRule.java

...rc/main/java/net/sourceforge/pmd/lang/java/rule/performance/AppendCharacterWithCharRule.java

jsotuyod · 2019-04-08T03:20:33Z

I'd love to see test cases for the resolved issues, as well as for some of the newly added methods, for completeness.

pmd-java/src/main/java/net/sourceforge/pmd/lang/java/ast/AbstractJavaNode.java

pmd-core/src/main/java/net/sourceforge/pmd/lang/ast/AbstractNode.java

pmd-apex/src/main/java/net/sourceforge/pmd/lang/apex/ast/AbstractApexNodeBase.java

pmd-java/src/main/java/net/sourceforge/pmd/lang/java/ast/ASTAdditiveExpression.java

adangel · 2019-05-01T19:56:01Z

So far I have separated these out (for 7.0.x):

children not null in AbstractNode -> [core] refactor: Make the children array never null #1823
pmd-lang-test improvements -> pmd-lang-test: Improvement to check node position #1824
xpath no attribute annotation -> [core] Support NoAttribute for XPath #1825
abstract node changes -> [core] Remove default implementation of getXPathNodeName #1826
checkers framework (just the dependency) -> [core] Add checker-qual dependency #1827

The rebased version of this PR is in my branch pr-1759-rebased2. Unfortunately not much smaller yet.
(Edit: need to compare pr-1759-newbase...pr-1759-rebased2, otherwise the two separated changes would be included).

All of these are generic (i.e. not Java-related), although one or two changes need some adjustments as well to fix compile errors or tests.

Other things, that are generic, so should be separated and go into pmd/7.0.x:

the @Nonnull stuff / Checker Framework. It doesn't seem to have anything to do with expressions/grammar changes. Do we need it now? Or can we do this later?
- See next comment / [core] Add checker-qual dependency #1827

Then there are the actual grammar changes, which should go into the java-grammar branch:

Array changes
Support for "LeftRecursiveNode" - fixing the node positions
Dealing with operators
Annotation changes
Literal changes
Lambda Expressions. The plan is to move them in the grammar from being a child of PrimaryExpression to Expression (an expression is either a LambdaExpression or an AssignmentExpression).
Switch Expressions. Should be separated and done in an extra PR. There is some TODO in the grammar. Needs lambda expressions to be finished, to be able to remove the workaround "inSwitchLabel"
Type changes
PrimaryExpression changes

adangel · 2019-05-03T13:21:34Z

Re: checker framework

I think this is useful and we should use it in general as one of our API design goals. Let me know if you agree/disagree.
That's why I created this section in the wiki https://github.com/pmd/pmd/wiki/PMD-7.0.0-API#api-general-design-goals

For this PR, I tend to keep the changes related to it in - although it modifies pmd-core/pom.xml ... But I think, adding the checker dependency in pmd-core is the correct way.

Update: Actually, I moved adding the dependency to pmd-core out, but left everything else in.

From PR #1826

- This PR is for PMD 7
- It's extracted from #1759

Changes:

- Ensures that a node has its parent set when it is added as a child.
- Makes `getXPathNodeName()` abstract by removing the default implementation.

From PR #1824

- This PR is for PMD 7
- It's extracted from #1759

Changes:

- adds the new `assertTextRangeIsOk`, that is executed on every node in the kotlin-based tests
- this ensures, that the position of the nodes in the AST is the same as they appear in the source code
- For the new switch labeled rules, we have the following grammar

```
void SwitchStatement(): {}
{ "switch" "(" Expression() ")" SwitchBlock() }

void SwitchBlock() #void : {}
{
    "{"
    (
        SwitchLabel()
        (
            "->" SwitchLabeledRulePart() (SwitchLabeledRule())*
          | ":" (LOOKAHEAD(2) SwitchLabel() ":")* (BlockStatement())* (SwitchLabeledStatementGroup())*
        )
    )?
    "}"
}

void SwitchLabeledRulePart() #void: {checkForSwitchRules();}
{
    (
        ( Expression() ";" ) #SwitchLabeledExpression(2)
      | ( Block() ) #SwitchLabeledBlock(2)
      | ( ThrowStatement() ) #SwitchLabeledThrowStatement(2)
    )
}
```

`#SwitchLabeledBlock(2)` takes the last two nodes (SwitchLabel + Block) and adds them as SwitchLabeledBlock to SwitchStatement. JavaCC sets the first token of SwitchLabeledBlock to be the Block node, but it should be the SwitchLabel node. This is fixed in the `jjtClose` methods in the three related SwitchLabeled* nodes. On the expression grammar PR, there is a better solution (you can mark the nodes via the interface `LeftRecursiveNode`).

- So, this actually fixes a bug, that is present in PMD 6.
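
The invariant checked by the test DSL can be sketched roughly like this. The real `assertTextRangeIsOk` lives in the Kotlin test module and is not reproduced here; `Node`, `findRangeViolations` and the character offsets are hypothetical, and the check simply assumes that every child must lie inside its parent's range and start no earlier than the end of its previous sibling.

```java
import java.util.ArrayList;
import java.util.List;

public class TextRangeDemo {

    /** Hypothetical node with a text range given as [start, end) offsets. */
    static final class Node {
        final String name;
        final int start;
        final int end;
        final List<Node> children = new ArrayList<>();

        Node(String name, int start, int end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Returns the names of the nodes that break the inclusion or ordering invariant. */
    static List<String> findRangeViolations(Node root) {
        List<String> violations = new ArrayList<>();
        check(root, violations);
        return violations;
    }

    private static void check(Node parent, List<String> violations) {
        int previousEnd = parent.start;
        for (Node child : parent.children) {
            boolean enclosed = parent.start <= child.start && child.end <= parent.end;
            boolean ordered = previousEnd <= child.start;
            if (!enclosed || !ordered) {
                violations.add(child.name);
            }
            previousEnd = Math.max(previousEnd, child.end);
            check(child, violations);
        }
    }

    public static void main(String[] args) {
        // Offsets for the source text `case 1 -> { }`.
        // Broken: the parent starts at the Block's first token instead of the label's.
        Node broken = new Node("SwitchLabeledBlock", 10, 13)
            .add(new Node("SwitchLabel", 0, 6))
            .add(new Node("Block", 10, 13));
        // Fixed: the parent starts at its leftmost child.
        Node fixed = new Node("SwitchLabeledBlock", 0, 13)
            .add(new Node("SwitchLabel", 0, 6))
            .add(new Node("Block", 10, 13));
        System.out.println(findRangeViolations(broken)); // [SwitchLabel]
        System.out.println(findRangeViolations(fixed));  // []
    }
}
```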

From PR #1823

- This PR is for PMD 7.
- It's extracted from #1759

Changes:

- The children array in AbstractNode is now initialized with an empty array.
- This means, it is now never null, thus the null checks can be removed.
- The only change to the children array is, when adding a new child (`jjtAddChild`), or removing a child (`removeChildAtIndex`).

Future:

- the children array is protected. This means, sub classes could assign null to it... do we really need the field available in subclasses? I'd assume, the already available methods are enough. -> this is something for defining a general AST API (which methods should be package private only, as they are only used by the parser, which methods define the API, when do we implement iterator, etc..)
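
As a rough illustration of the never-null children array (a sketch only: `SimpleNode` and its methods are hypothetical and much simpler than AbstractNode), sharing one empty array removes the null checks and leaves `ArrayIndexOutOfBoundsException` as the only failure mode when fetching a missing child:

```java
import java.util.Arrays;

public class ChildrenArrayDemo {

    /** Hypothetical, stripped-down node; not the PMD AbstractNode. */
    static class SimpleNode {
        private static final SimpleNode[] NO_CHILDREN = new SimpleNode[0];

        // Never null: a childless node shares the empty array.
        private SimpleNode[] children = NO_CHILDREN;
        private SimpleNode parent;

        /** Stores the child at the given index, growing the array if needed. */
        void addChild(SimpleNode child, int index) {
            if (index >= children.length) {
                children = Arrays.copyOf(children, index + 1);
            }
            children[index] = child;
            child.parent = this; // the child learns its parent when it is added
        }

        int getNumChildren() {
            return children.length; // no null check required
        }

        SimpleNode getChild(int index) {
            return children[index]; // may throw ArrayIndexOutOfBoundsException, never NPE
        }

        SimpleNode getParent() {
            return parent;
        }
    }

    public static void main(String[] args) {
        SimpleNode leaf = new SimpleNode();
        System.out.println(leaf.getNumChildren());
        try {
            leaf.getChild(0);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("no such child");
        }
    }
}
```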

From PR #1827

- This PR is for PMD 7
- It's extracted from #1759

This only adds the dependency (compile time), and does not make use of it yet. I've added a section in the wiki: https://github.com/pmd/pmd/wiki/PMD-7.0.0-API

We'll need to flesh out the details over time and verify our APIs, that we have properly annotated them (if we all agree, that we use `@Nullable`).

oowekyala · 2019-05-22T14:08:36Z

@adangel I investigated the failing tests and fixed the following problems:

NumericLiteral really had a problem, the tests were not well written
The optimisation in 96ea140 wasn't tested well enough, it made the parser parse (a) * 2 as a cast, then choke on the * token.
ReportTest and similar were failing, because violation suppression with annotation depends on type resolution for annotations. Since there is no Name anymore in annotations, it didn't work anymore. I added a sketchy fix in ClassTypeResolver: d2f3968
ASTTryStatementTest was failing with a ClassCastException. That is because the grammar for concise try-with-resources was changed in this PR. Instead of pushing a raw Name, the parser now pushes an Expression. The exception is fixed in 72c931c even though the occurrence finder probably doesn't work.

I also ignored some tests:

Metrics test, because metrics are just like rules
DocumentNavigator tests: because they relied on PrimarySuffix, and anyway will probably end up being replaced with [core] NodeStream API #1622
Some tests that depend on type resolution

With those changes, the remaining failing tests are either tests of type resolution, or of the symbol table. I think we should ignore them too while we're not done with the AST. And I think we should proceed with #1566 instead of fixing the current symbol table.

adangel

So, I finally got through the changes. I think we should merge this PR and then work step by step on the open TODOs.

Here are some general notes/TODOs:

I've added the TODOs I've found in the code here: https://github.com/pmd/pmd/wiki/PMD-7.0.0-Java#todo
Are the following issues now fixed or what is missing? Maybe we can go through each one by one?
grammar: java.jjt
- TODO for later: cleanup grammar to remove checks for old java version (like check for bad diamond usage...). When do we do this? I think, it should be also done on the java-grammar branch?
- We sometimes have *List nodes, but sometimes they are #void. E.g. LambdaParameterList is not void, but AnnotationList is...
Documentation
- Example usage of AssignmentOp (AssignmentExpressions), with the removal of ASTAssignmentOperator
- Annotation() is now #void
- ASTParenthesizedExpression: provide example to illustrate how the AST is improved now

Deprecations due on PMD 6.x

constructors or methods, that are package private now. We need to mark them on the master branch first.
maybe we should simply go through all AST nodes on master and mark them accordingly for once
Classes
- ASTAdditiveExpression (constructor, getOperator())
- ASTArgumentList (constructor)
- ASTArrayInitializer (constructor)
- ASTBooleanLiteral (constructor, setTrue())
- ASTCastExpression (constructor, setIntersectionTypes/hasIntersectionType())
- ASTCatchStatement (constructor)
- ASTClassOrInterfaceType (constructor, isArray() and getArrayDepth())
- ASTConditionalAndExpression (constructor)
- ASTConditionalExpression (constructor)
- ASTEnumConstant (constructor, getQualifiedName())
- ASTEqualityExpression (constructor, getOperator())
- ASTExplicitConstructorInvocation (constructor, setIsThis(), setIsSuper())
- ASTLambdaExpression (constructor)
- ASTMarkerAnnotation (constructor, getAnnotationName())
- ASTMemberValueArrayInitializer (constructor)
- ASTMemberValuePair (constructor)
- ASTMethodDeclarator (constructor)
- ASTMethodReference (constructor)
- ASTMultiplicativeExpression (constructor, getOperator())
- ASTName (constructor)
- ASTNormalAnnotation (constructor, getAnnotationName())
- ASTNullLiteral (constructor)
- ASTPostfixExpression (constructor)
- ASTPrimitiveType (constructor)
- ASTRelationalExpression (constructor)
- ASTShiftExpression (constructor, getOperator())
- ASTSingleMemberAnnotation (constructor, getAnnotationName())
- ASTSwitchStatement (constructor)
- ASTTypeBound (getBoundTypeNodes())
- ASTTypeParameter (constructor)
- ASTUnaryExpression (constructor)

Nodes/classes that are deprecated on PMD 7, but not on PMD 6

should we remove them directly in PMD 7 and deprecate in 6?
~~we need to deprecate them at least in PMD 6~~ Since we have no replacement for them on 6.0.x, we won't deprecate them (it would only be noise to users). Instead we should keep next_major_development.md up-to-date with the planned deprecations, and maybe link to it in the minor release notes to keep users informed.
- ASTAllocationExpression
- ASTArguments
- ASTArrayDimsAndInits
- ASTAssignmentOperator
- ASTMemberSelector
- ASTMemberValuePairs
- ASTTypeArgument
- ASTUnaryExpression (method getOperator())
- ASTUnaryExpressionNotPlusMinus
- ASTVariableInitializer
- ASTWildcardBounds
- JavaParserVisitorReducedAdapter

Nodes that are moved from class to interface

How do we mark these? Should we mark these nodes in PMD 6 as deprecated as well (is this useful at all?)?
- ASTAnnotation
- ASTLiteral
- ASTMemberValue
- ASTPrimaryExpression
- ASTReferenceType
- ASTType
- ASTVariableInitializer (note: this is also deprecated in PMD 7...) Yes this is kept for compatibility with rules but is not necessary

Double File License: Some AST nodes have the license header twice. Not sure, if I introduced this during rebase... oowekyala: This is because when I copy paste node files to create new ones, IntelliJ ignores the copyright comment if it starts with a double star like a javadoc one, and inserts a new comment.
Commented out code and removed test cases - we need to revisit later and not to forget
- NameFinder
- ClassTypeResolver
- ASTAssignmentOperatorTest
- ASTFieldDeclarationTest.testGetVariableName()
- ASTLiteralTest
- ASTPrimarySuffixTest

docs/pages/next_major_development.md

pmd-java/src/main/java/net/sourceforge/pmd/lang/java/ast/JavaNode.java

pmd-java/src/main/java/net/sourceforge/pmd/lang/java/ast/TokenOps.java

adangel · 2019-05-22T19:45:06Z

pmd-java/src/main/java/net/sourceforge/pmd/lang/java/ast/TokenOps.java

+     *
+     * @throws NoSuchElementException If there's less than n tokens to the left of the anchor.
+     */
+    // test only


Do you mean, this method is public only for tests? Do we need an annotation @TestOnly?

pmd-java/src/main/java/net/sourceforge/pmd/lang/java/rule/AbstractJavaRule.java

pmd-java/src/main/java/net/sourceforge/pmd/lang/java/typeresolution/ClassTypeResolver.java

oowekyala added the in:ast About the AST structure or API, the parsing step label Apr 4, 2019

oowekyala added this to the 7.0.0 milestone Apr 4, 2019

jsotuyod reviewed Apr 8, 2019

jsotuyod reviewed Apr 12, 2019

pmd-java/src/main/java/net/sourceforge/pmd/lang/java/ast/AbstractJavaNode.java

pmd-core/src/main/java/net/sourceforge/pmd/lang/ast/AbstractNode.java

This was referenced Apr 20, 2019

[core] Allow abstract node types to be valid rulechain visits #1785

Closed

[core] Generate visitor delegation logic #1786

Open

[core] Match abstract types in XPath queries #1787

Open

[core] Wire processing stages into SourceCodeProcessor #1796

Merged

adangel reviewed Apr 29, 2019

pmd-apex/src/main/java/net/sourceforge/pmd/lang/apex/ast/AbstractApexNodeBase.java

adangel reviewed May 1, 2019

pmd-java/src/main/java/net/sourceforge/pmd/lang/java/ast/ASTAdditiveExpression.java

adangel added the is:WIP For PRs that are not fully ready, or issues that are actively being tackled label May 1, 2019

oowekyala added 7 commits May 20, 2019 20:13

Add nodes for array type dimensions

6378080

Improve Array AST

d27a754

Parse ClassOrInterfaceType recursively

2e7742b

Add TODOs

ecbeb0f

Fix some leftovers

7bed35f

Update comments

d76b6f3

Update CTResolver

70b8c38

oowekyala added 7 commits May 22, 2019 03:08

Ignore unfixable failing tests

8597d18

Fix parser choosing cast expression too easily

1df3dc5

Ignore doc nav tests

e1ac1f6

Ignore metrics until we're done with AST

3de2851

Replace some matchers

b3bb553

Test annotation syntax

748076b

Simplify numeric literal

aec8b48

oowekyala added 3 commits May 22, 2019 19:28

Improve testing utils a bit

5463f3b

No numeric promotion is performed when comparing numeric values. Some implicit assertions are added to check an invariant about literals.

Improve number literal test coverage

8cc2c23

Add type alias to shorten signatures

ee27db8

adangel reviewed May 22, 2019

oowekyala added 7 commits May 22, 2019 22:51

Doc corrections

b8a02be

Improve explicit constructor invocation parsing

08fda65

Fix comment & visibility

326b361

Cleanup tests

58c7205

Comment fixes

e3ccff3

Rename TokenOps

c5ba6c7

Remove outdated todos

d75b6de

adangel merged commit d75b6de into pmd:java-grammar May 26, 2019

oowekyala mentioned this pull request May 29, 2019

[java] About operator nodes #1661

Closed

adangel removed the is:WIP For PRs that are not fully ready, or issues that are actively being tackled label Jun 3, 2019

oowekyala deleted the grammar-expressions branch June 4, 2019 01:57

oowekyala mentioned this pull request Jul 21, 2019

[java] Refactor annotation suppression #1927

Merged

oowekyala added a commit to oowekyala/pmd that referenced this pull request Jul 31, 2019

Remove changes to MemberValuePairs

a660a37

They conflict with pmd#1759

This was referenced Apr 25, 2020

[java] RFC: new layer of abstraction atop PrimaryExpressions #497

Closed

[java] Add new node for anonymous class declaration #905

Closed

This was referenced Jan 16, 2023

[java] Breaking Java Grammar changes for PMD 7.0.0 #1019

Closed

[doc] Document usage of checker-qual @Nullable as ADR #4347

Open

PMD 7 Tracking Issue #3898

Closed
