Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug] Potential Incorrect behaviour of ActionClassifier when computing roots actions #125

Open
martinezmatias opened this issue May 25, 2020 · 12 comments
Labels

Comments

@martinezmatias
Copy link
Contributor

martinezmatias commented May 25, 2020

Summary of the problem

The ActionClassifier (probably, IMO) produces an incorrect outputs when Updates operations are the root operations.

Description of the problem:

There are two mains problems.

First, when we have two operations: 1) update on node X and 2) an Insert of Y on X (i.e., Y is a new child of X), the roots operation only shows 1 action: the update.
IMO, That is incorrect: as we discussed in a previous issue (roots of multiple updates), updates should not be grouped (in this case with Insert)

One example is:

- int xxxxxx1 = foo001(0) ;
+ int xxxxxx1 = foo002(0,1)

Roots is only one: [Update Invocation at X:1 foo001(0) to foo002(0, 1)]
All actions are 2: [Update Invocation at X:1 foo001(0) to foo002(0, 1), Insert Literal at X:1 1]

Second related problem.
In the previous case, the inserted node (literal 1) is a child of the updated node (foo001).
However, if we consider a slightly different example, the ActionClassifier's behaviour is different.

- int xxxxxx1 = foo001(0) ;
+ int xxxxxx2 = foo001(0,1);

Here, we update the var name instead of the invoked method name (and we always insert a new parameter).

The root operation there are 2:
Update LocalVariable at X:1 int xxxxxx1 = foo001(0) to int xxxxxx2 = foo001(0, 1) , Insert Literal at X:1 1 ]

(in the previous example the roots is only 1).

The difference of behaviour is caused by in this line : the classifier checks if the parent of an action was updated. In this case, the parent of the Insert was not updated (but the grandparent (xxxxxx1) was).

(I found the problem in a real diff, but it's easier to use a fake example to explain the problem)

Solution proposed

Ignore Updates in Root classification. IMH, Updates are atomic ops that only affect one single node.
That would solve the two previous issues.
(basically, the fix removes the term && !srcUpdTrees.contains(t.getParent() from this line and from this one)

@martinezmatias
Copy link
Contributor Author

martinezmatias commented Jun 3, 2020

Actually, one of the problem is when a change (e.g., Insert) also produces that a label of a parent node is updated.
For example: test_t_223454

  + ( new File(filePath, fileName), "UTF-8" );
 -             ( new File(filePath, fileName) );

There, the remove of the Literal produces an update of the ConstructorCall (which label is the signature of the executable).
In this case, It could make sense to "hide" the Delete i.e., the Delete is not considered as Root. But I am not sure if in all cases.

@martinezmatias
Copy link
Contributor Author

A different case would be if you have an update of Method's name and an insert on its body:
There, it will show only one root: the update.

@slarse
Copy link
Contributor

slarse commented Jun 2, 2021

Just for context, I'm approaching this from the perspective of applying operations from gumtree-spoon to an AST model (i.e. applying an edit script).

I agree with the idea that updates are atomic operations on labels in GumTree. But when this is translated to Spoon, it's not that simple. If I get an update operation from gumtree-spoon, the only information I get is the source node and destination node, but there's no indication of exactly what was updated as the concept of labels doesn't carry over to Spoon. Given only that information, I think it's confusing to not include update operations in root operations, because from an API perspective an update kind of looks like a root operation, if that makes sense.

A potential solution there would be to also include the role of the item that was updated. That would give sufficient information to allow for finding the updated "thing". For example, given that we update the name of a method, the update operation contains the source and destination nodes, as well as CtRole.NAME. To me, with that information it would make sense to treat update operations as orthogonal to root operations. Any idea if that would be feasible to implement?

@martinezmatias
Copy link
Contributor Author

martinezmatias commented Jun 2, 2021

Hi @slarse

but there's no indication of exactly what was updated as the concept of labels doesn't carry over to Spoon.

Agree. Only CtNamedElements have an equivalent of "label": the simpleName.
I also agree that we lost information passing from Spoon to Tree nodes.

I think it's confusing to not include update operations in root operations, because from an API perspective an update kind of looks like a root operation, if that makes sense.

If I am not wrong, the current implement considers all updates as root ops, right?

Any idea if that would be feasible to implement?

Just an idea: What about to only consider updates of CtNamedElement? This would simplify the application of partial diffs into the source tree.
(However, I dont have any idea about that to do with updates of VirtualElements i.e., update of modifiers).

@algomaster99
Copy link
Member

What about to only consider updates of CtNamedElement?

I think I have a case which will be an exemption to this. Consider an update from int x = a + b to int x = a - b. That should qualify as an update but it won't come under CtNamedElement. So I agree with @slarse that CtRole should also be passed along with src and dst nodes. CtRole.OPERATOR_KIND can used to identify the operator and update it to -.

@slarse
Copy link
Contributor

slarse commented Jun 2, 2021

I think it's confusing to not include update operations in root operations, because from an API perspective an update kind of looks like a root operation, if that makes sense.

If I am not wrong, the current implement considers all updates as root ops, right?

Yes! Which is also confusing, as we then get root operations on elements that are direct descendants of other root operation elements.

Any idea if that would be feasible to implement?

Just an idea: What about to only consider updates of CtNamedElement? This would simplify the application of partial diffs into the source tree.
(However, I dont have any idea about that to do with updates of VirtualElements i.e., update of modifiers).

That simplicity would be very neat from the use case of applying edit scripts, as @algomaster99 suggest I think we lose some amount of precision there.

As for virtual elements, I'm not sure. The fact that modifiers aren't proper nodes is just an eternal bother :(

I'll consider this. I think the only way for all of this to make sense is to, as you suggest, dislodge the concept of update operations from root operations. Perhaps the best way to move forward here is that @algomaster99 and I just try to work out a solution that works for applying edit scripts, and then we'll get back to you with some new thoughts.

@algomaster99
Copy link
Member

algomaster99 commented Jun 20, 2021

I had a go at this problem with what both of you suggested. Might be a good thing to go ahead with if we don't have a lot of roles to care about. Let me elaborate in the steps below.

  1. Removed the check in del and add trees (getRootActions) for parent node in update trees.

    - .filter(t -> !srcDelTrees.contains(t.getParent()) && !srcUpdTrees.contains(t.getParent()))
    + .filter(t -> !srcDelTrees.contains(t.getParent()))
    - .filter(t -> !dstAddTrees.contains(t.getParent()) && !dstUpdTrees.contains(t.getParent()))
    + .filter(t -> !dstAddTrees.contains(t.getParent()))

    This will ensure that insert/delete operation does not get ignored just because they are applied to a node which is a child of an updated node.

  2. Add API for returning CtRole in UpdateOperation

      public UpdateOperation(Update action) {
             super(action);
     	 destElement = (CtElement) action.getNode().getMetadata(SpoonGumTreeBuilder.SPOON_OBJECT_DEST);
    +	 if (destElement instanceof CtBinaryOperator<?>) {
    +	 	role = CtRole.OPERATOR_KIND;
    +	 } else {
    +		role = CtRole.NAME;
    +	 }
     }

    role is a private field in the UpdateOperation class.

The problem with this approach is that we will have to analyse many spoon elements for assigning the correct role. On the upside, there may not be a large number of elements that can be updated. Following this approach, we can use the CtRole to get the value from dstNode and set it on the srcNode using getValueByRole and setValueByRole respectively in diffmin. Following is a list of elements I have come across till now.

  • CtLiteral
    - System.out.println(4+8);
    + System.out.println(10000+8);
    setValueByRole(CtRole.VALUE, dstNode.getValueByRole(CtRole.VALUE))
  • CtBinaryOperator
    - System.out.println(4-4);
    + System.out.println(4+4); 
    setValueByRole(CtRole.OPERATOR_KIND, dstNode.getValueByRole(CtRole.OPERATOR_KIND))
  • CtInvocation
    - System.out.println(1);
    + System.exit(1);
    This yields three operations.
    1. Update Invocation
    2. Delete FieldRead
    3. Insert TypeAccess
      So I am not sure how to go about this because UpdateOperation does not make sense here. But if we really want to use it (for diffs like System.out.println() -> System.out.print()), we could probably update values of CtRole.TARGET and CtRole.EXECUTABLE_REF.
  • CtMethod
    - public void add() { }
    + public void multiply() { }
    setValueByRole(CtRole.NAME, dstNode.getValueByRole(CtRole.NAME)) (valid for return types too)
    I have a feeling that many entities will fall under CtRole.NAME.

Still not quite sure how to handle CtWrapper elements using roles.

@algomaster99
Copy link
Member

@slarse Did you get time to go through the above comment?

@slarse
Copy link
Contributor

slarse commented Jul 5, 2021

It seems largely reasonable. What I can add is:

  • With this model, UpdateOperation should not at all be a root operation, so getRootOperations() should not return any update operations
  • Therefore, it would probably be most convenient to add a getUpdateOperations() method

As for the update from System.out.println() to System.exit(), isn't that just the name of the called method that is updated? I.e. println() -> exit()? The deleted field read is probably a deletion of reading System.out, replaced with a type access to System. If those are the operations, it does make sense.

System.out.println(1);
System.out.exit(1); // update println -> exit
exit(1); // delete field read
System.exit(1); // insert type access

?

@algomaster99
Copy link
Member

If those are the operations, it does make sense.

It makes sense to me now too. Instead of it making sense, I think I meant it was unnecessary because Delete FieldRead and Insert TypeAccess could have combined into one update operation and we could handle it using CtRole.TARGET.

@algomaster99
Copy link
Member

If we follow this approach to update spoon nodes, we shall only be updating the value corresponding to a specific role. We would be skipping the update of the source position of the element since we no longer "replace" elements instead we just updated their so-called labels. I am not sure if that would affect diffmin. It would definitely fail ElementOriginTest, but we just need to modify it for update operations. Other than that, I don't think so it would affect the functionality as of now.

@algomaster99
Copy link
Member

@martinezmatias I needed to confirm what is the expected behavior of the classifier in the examples you have proposed above.

- int xxxxxx1 = foo001(0) ;
+ int xxxxxx1 = foo002(0,1)

Should this have an update operation (foo1 -> foo2) and insertion of literal 1 as root operations?

In more general terms, any operation on any children node of an updated node should not be ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants