New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add token type to LocalizedMessage #4419

Closed
MEZk opened this Issue Jun 4, 2017 · 9 comments

Comments

@MEZk
Contributor

MEZk commented Jun 4, 2017

XPath to AST mapper which will be implemented in the scope of #4369 will help us to find AST node which corresponds the XPath expression starting from the root. When the AST node is found, we need to compare the node with the AST node which causes the violation. If they are equal, the audit event need to be denied.

For example, if we have the following XPath expression to suppress a violation logged by MethodLengthCheck

//CLASS_DEF//METHOD_DEF[@Text='foo']

then the mapper will return the method definition AST node. MethodLengthCheck emits the violation for method foo. We need to log method definition AST node, because we should compare the node with the AST node given by mapper. If both nodes are equal, then audit event (violation) should be denied.

Thus, in order to suppress the violation by XPath expression we need to have the AST node which causes the violation. However, now the whole information about a violation (lineNo, columnNo, etc) is collected in LocalizedMessage. I propose to rename LocalizedMessage to Violation ann log AST everywhere. AST contains all required information (columnNo, lineNo, token type, ets).

@MEZk MEZk added the GSoC2017 label Jun 4, 2017

@MEZk MEZk changed the title from AST node which causes a violation should be a part of LocalizedMessage to AST node which causes a violation should be part of LocalizedMessage Jun 4, 2017

@MEZk

This comment has been minimized.

Show comment
Hide comment
@MEZk

MEZk Jun 14, 2017

Contributor

@romani @rnveach
Please, take a look at the issue.

Contributor

MEZk commented Jun 14, 2017

@romani @rnveach
Please, take a look at the issue.

@rnveach

This comment has been minimized.

Show comment
Hide comment
@rnveach

rnveach Jun 14, 2017

Member

Will xpath suppression support javadoc, as well as Java? If so, you would need both DetailAST and DetailNode.

I agree that the node will be required to attach to LocalizedMessage for violations. FileSets won't need to supply any nodes, so the fields will have to be optional.

@MEZk Instead of passing the AST, can we convert it to something XPath-y and store that in the message so we are not adding new fields for every type of AST we have to store in it?

Member

rnveach commented Jun 14, 2017

Will xpath suppression support javadoc, as well as Java? If so, you would need both DetailAST and DetailNode.

I agree that the node will be required to attach to LocalizedMessage for violations. FileSets won't need to supply any nodes, so the fields will have to be optional.

@MEZk Instead of passing the AST, can we convert it to something XPath-y and store that in the message so we are not adding new fields for every type of AST we have to store in it?

@MEZk

This comment has been minimized.

Show comment
Hide comment
@MEZk

MEZk Jun 17, 2017

Contributor

If so, you would need both DetailAST and DetailNode.

In the first implementation I want to work only with AST nodes in order to prove that:

  1. We can work with XPath
  2. Users will really use our new suppression model.

I agree that the node will be required to attach to LocalizedMessage for violations.

From my point of view, LocalizedMessage will become inappropriate name for the class.

Instead of passing the AST, can we convert it to something XPath-y and store that in the message so we are not adding new fields for every type of AST we have to store in it?

No, we cannot. The basic idea is:

  1. Store AST-node which causes violation. Lets call it AST1.
  2. Starting from root AST node create ElementNode (Element node is a part of Saxon XPath model) in order to apply Saxon-based tree traversal algorithms (searching in tree, iterating, etc).
  3. Parse XPath-expression. Find corresponding ElementNode (which actually stores the reference to AST nodes). In this step we find each AST-node in the AST by XPath-node. Lets call the found node AST2.
  4. We should check whether AST1 is equal to AST2. Once again, AST1 is a node which we found in the basic AST starting from root by XPath-expression. AST2 is a node which causes the violation. If AST1 is equal to AST2, we deny the AuditEvent.

Based on the sequence of actions mentioned above, we need to have the DetailAST object to work with.

Contributor

MEZk commented Jun 17, 2017

If so, you would need both DetailAST and DetailNode.

In the first implementation I want to work only with AST nodes in order to prove that:

  1. We can work with XPath
  2. Users will really use our new suppression model.

I agree that the node will be required to attach to LocalizedMessage for violations.

From my point of view, LocalizedMessage will become inappropriate name for the class.

Instead of passing the AST, can we convert it to something XPath-y and store that in the message so we are not adding new fields for every type of AST we have to store in it?

No, we cannot. The basic idea is:

  1. Store AST-node which causes violation. Lets call it AST1.
  2. Starting from root AST node create ElementNode (Element node is a part of Saxon XPath model) in order to apply Saxon-based tree traversal algorithms (searching in tree, iterating, etc).
  3. Parse XPath-expression. Find corresponding ElementNode (which actually stores the reference to AST nodes). In this step we find each AST-node in the AST by XPath-node. Lets call the found node AST2.
  4. We should check whether AST1 is equal to AST2. Once again, AST1 is a node which we found in the basic AST starting from root by XPath-expression. AST2 is a node which causes the violation. If AST1 is equal to AST2, we deny the AuditEvent.

Based on the sequence of actions mentioned above, we need to have the DetailAST object to work with.

@rnveach

This comment has been minimized.

Show comment
Hide comment
@rnveach

rnveach Jun 18, 2017

Member

@MEZk It looks like putting DetailAST directly into LocalizedMessage which is in api package might conflict with #3417 .

Attachable DetailAST might have to be declared as Object?

Member

rnveach commented Jun 18, 2017

@MEZk It looks like putting DetailAST directly into LocalizedMessage which is in api package might conflict with #3417 .

Attachable DetailAST might have to be declared as Object?

@timurt

This comment has been minimized.

Show comment
Hide comment
@timurt

timurt Jun 20, 2017

Collaborator

@rnveach @MEZk
Can't we just compare primitive variables like columnNo, lineNo and etc. If nodes are the same, they will have same values, no?

Collaborator

timurt commented Jun 20, 2017

@rnveach @MEZk
Can't we just compare primitive variables like columnNo, lineNo and etc. If nodes are the same, they will have same values, no?

@MEZk

This comment has been minimized.

Show comment
Hide comment
@MEZk

MEZk Jun 26, 2017

Contributor

@timurt
columnNo, lineNo are not enough. For example,

public class TestClass {
    void method() {
    }
}
CLASS_DEF -> CLASS_DEF [1:0]
|--MODIFIERS -> MODIFIERS [1:0]
|  `--LITERAL_PUBLIC -> public [1:0]
|--LITERAL_CLASS -> class [1:7]
|--IDENT -> TestClass [1:13]
`--OBJBLOCK -> OBJBLOCK [1:23]
    |--LCURLY -> { [1:23]
    |--METHOD_DEF -> METHOD_DEF [2:4]
    |  |--MODIFIERS -> MODIFIERS [2:4]
    |  |--TYPE -> TYPE [2:4]
    |  |  `--LITERAL_VOID -> void [2:4]
    |  |--IDENT -> method [2:9]
    |  |--LPAREN -> ( [2:15]
    |  |--PARAMETERS -> PARAMETERS [2:16]
    |  |--RPAREN -> ) [2:16]
    |  `--SLIST -> { [2:18]
    |      `--RCURLY -> } [3:4]
    `--RCURLY -> } [4:0]

As it can be seen there are 4 tokents which have the same column and raw number:

    |--METHOD_DEF -> METHOD_DEF [2:4]
    |  |--MODIFIERS -> MODIFIERS [2:4]
    |  |--TYPE -> TYPE [2:4]
    |  |  `--LITERAL_VOID -> void [2:4]

However, they have the different token types.

We may identify the node by columnNo, lineNo, tokenType.

Contributor

MEZk commented Jun 26, 2017

@timurt
columnNo, lineNo are not enough. For example,

public class TestClass {
    void method() {
    }
}
CLASS_DEF -> CLASS_DEF [1:0]
|--MODIFIERS -> MODIFIERS [1:0]
|  `--LITERAL_PUBLIC -> public [1:0]
|--LITERAL_CLASS -> class [1:7]
|--IDENT -> TestClass [1:13]
`--OBJBLOCK -> OBJBLOCK [1:23]
    |--LCURLY -> { [1:23]
    |--METHOD_DEF -> METHOD_DEF [2:4]
    |  |--MODIFIERS -> MODIFIERS [2:4]
    |  |--TYPE -> TYPE [2:4]
    |  |  `--LITERAL_VOID -> void [2:4]
    |  |--IDENT -> method [2:9]
    |  |--LPAREN -> ( [2:15]
    |  |--PARAMETERS -> PARAMETERS [2:16]
    |  |--RPAREN -> ) [2:16]
    |  `--SLIST -> { [2:18]
    |      `--RCURLY -> } [3:4]
    `--RCURLY -> } [4:0]

As it can be seen there are 4 tokents which have the same column and raw number:

    |--METHOD_DEF -> METHOD_DEF [2:4]
    |  |--MODIFIERS -> MODIFIERS [2:4]
    |  |--TYPE -> TYPE [2:4]
    |  |  `--LITERAL_VOID -> void [2:4]

However, they have the different token types.

We may identify the node by columnNo, lineNo, tokenType.

@rnveach

This comment has been minimized.

Show comment
Hide comment
@rnveach

rnveach Jun 26, 2017

Member

Can't we just compare primitive variables like columnNo, lineNo and etc.
We may identify the node by columnNo, lineNo, tokenType.

Please read #3417 (comment) .

Member

rnveach commented Jun 26, 2017

Can't we just compare primitive variables like columnNo, lineNo and etc.
We may identify the node by columnNo, lineNo, tokenType.

Please read #3417 (comment) .

@MEZk

This comment has been minimized.

Show comment
Hide comment
@MEZk

MEZk Jul 13, 2017

Contributor

@timurt
In summary, we should add token type to LocalizedMessage. LocalizedMessage object should contain token type, line, and column number. This information should be enough for AST filters to find the AST node in the tree (by line, column number, and token type). So, in the scope of this issue LocalizedMessage should be updated to have token type and a new constructor which receives token type and other arguments required to form the object. It is not expected to update checks, or other logic. All changes have to be done only in LocalizedMessage.

Contributor

MEZk commented Jul 13, 2017

@timurt
In summary, we should add token type to LocalizedMessage. LocalizedMessage object should contain token type, line, and column number. This information should be enough for AST filters to find the AST node in the tree (by line, column number, and token type). So, in the scope of this issue LocalizedMessage should be updated to have token type and a new constructor which receives token type and other arguments required to form the object. It is not expected to update checks, or other logic. All changes have to be done only in LocalizedMessage.

@MEZk MEZk changed the title from AST node which causes a violation should be part of LocalizedMessage to Add token type to LocalizedMessage Jul 13, 2017

@MEZk MEZk added the approved label Jul 13, 2017

timurt added a commit to timurt/checkstyle that referenced this issue Jul 26, 2017

@MEZk MEZk moved this from To Do to In Progress in Flexible Suppression Model Jul 26, 2017

timurt added a commit to timurt/checkstyle that referenced this issue Jul 26, 2017

romani added a commit that referenced this issue Jul 27, 2017

@romani romani added the new feature label Jul 27, 2017

@romani romani added this to the 8.2 milestone Jul 27, 2017

@romani

This comment has been minimized.

Show comment
Hide comment
@romani

romani Jul 27, 2017

Member

fix is merged

Member

romani commented Jul 27, 2017

fix is merged

@romani romani closed this Jul 27, 2017

@MEZk MEZk moved this from In Progress to Done in Flexible Suppression Model Jul 27, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment