Indentation and Comments inconsistent with the original code #42

Symbolk · 2018-09-14T03:33:42Z

Since the code is parsed and prettyprinted with JavaCC, though indented with JavaParser, the indentation of the original code and some comments (it's strange that some are orphan comments but some are not) are lost. When diff with git merged results using textual tools (like git diff or VSCode diff), there will be too many differences. And in some cases, comments are as important as codes.

I am trying to preserve the indentation and comments in original files, but in face of some strange problems. I want to know as authors or contributors, what's your way of thinking about implementing it? @guilhermejccavalcanti @pauloborba @gaabs

pauloborba · 2018-09-14T06:52:09Z

@leoribeiro36 this might be the explanation for some of the differences that lead to s3m FNs. I would not expect many FNs due to that, but maybe a few.

pauloborba · 2018-09-14T08:27:15Z

@Symbolk I guess the basic idea would be to keep comments and indentation info in the AST (as in Eclipse JDT), but I haven't really thought in detail about that. Could you please detail a bit more the problems you are having? Have you also seen this alternative approach? https://www.modelica.org/events/modelica2008/Proceedings/sessions/session6a3.pdf

pauloborba · 2018-09-14T08:29:21Z

@jvcoutinho could you please add tests to check how the tool should handle comments? please check @Symbolk comments above, and ask for the concrete merge scenarios in which he observed lost comments.

jvcoutinho · 2018-09-14T11:17:16Z

@Symbolk as requested above by @phmb, can you please post here the files that lead to outputs losing comments? You can compact them.

jvcoutinho · 2018-10-08T15:32:31Z

Comments problem example

pauloborba · 2018-10-08T19:02:35Z

@jvcoutinho could not understand that. both developers commented different declarations? what's the merge result?

jvcoutinho · 2018-10-09T16:11:27Z

@pauloborba Actually, by their description, one developer included a field while another developer commented an existing one.

The problem is that the comment disappeared.

jvcoutinho · 2018-10-09T16:31:29Z

The only way I could reproduce this exact scenario is when the field in base is also commented (and it seems to be the case in the example):

But that's reasonable, since only right changed the file.

jvcoutinho · 2018-10-09T16:33:11Z

Although, I'd tested more and found some inconsistencies (?), indeed.

pauloborba · 2018-10-09T18:42:13Z

yes @jvcoutinho, we need to debug that. please check when base has field A, left removes A, right adds a comment just before A. the result should contain only the comment. does the tool returns an empty class?

jvcoutinho · 2018-10-09T19:15:11Z

yes, @pauloborba

pauloborba · 2018-10-09T19:18:04Z

OK @jvcoutinho, so we have enough information now. I believe s3m stores comments in the closest declaration node, not in special CommentNodes, right @guilhermejccavalcanti?

guilhermejccavalcanti · 2018-10-09T22:13:33Z

Right, comments are stored into a separate field prefix of each node:

Also, they are not merged. If the three versions have comments, the left comment prevails due to precedence order of the merge calls and how the merged node is created. First, merge(left,base), then merge(leftBase, right), each call creates the merged node as a clone of the first parameter, which is always left, and update its body/subtree with the merge result. So, besides updating node body/subtree, would be necessary to update the prefix field, perhaps calling textual merge on it.

Symbolk · 2018-10-10T03:00:50Z

Sorry for the delay to reply.

Originally, I want to develope a merge tool based on jFSTMerge, which produces better or more reasonable merge results than git textual merge.

However, since the code is parsed, merged and then prettyprinted with AST, the original indentation and orphan comments can be missing in the merged results, which are undesirable in practical projects.

For example, in some big teams or projects (say AOSP), developers are required to wrap their newly added or edited code within two block comments, for subsequent review, blame or debug:

import android.widget.Toolbar;

/* Peter/62645 20180637  begin*/
//[HSM]
import android.view.Menu;
/* Peter/62645 20180637  end*/
import com.android.internal.annotations.GuardedBy;

(P.S. the comment format: /* Developer/ID Date begin(end) */)

Under this case, the jFSTMerge results drops the end comment while the git merge results does not (of course). Here are the 3-way contents:

Ours:

import android.widget.Toolbar;

/* Peter/62645 20180637  begin*/
//[HSM]
import android.view.Menu;
/* Peter/62645 20180637  end*/
import com.android.internal.annotations.GuardedBy;

Base:

import android.widget.Toolbar;

import com.android.internal.annotations.GuardedBy;

Theirs:

import android.widget.Toolbar;

import com.android.internal.annotations.GuardedBy;

jFSTMerge results:

import android.widget.Toolbar;

/* <DTS2014042818262 xiongshiyi/00165767 20140428 begin */
//[HSM]
import android.hsm.HwSystemManager;

import com.android.internal.annotations.GuardedBy;

Symbolk · 2018-10-10T03:01:42Z

yes, @pauloborba

What is the 3-way diff tool you are using? Seems VSCode?

jvcoutinho · 2018-10-10T19:08:27Z

@Symbolk Yes, it is VSCode.
You can split the editor 2 times and activate Zen Mode (Ctrl + K + Z by default) to get this view.

Symbolk · 2018-10-11T01:40:16Z

@Symbolk Yes, it is VSCode.
You can split the editor 2 times and activate Zen Mode (Ctrl + K + Z by default) to get this view.

Ok, thanks. Previously I thought that was an extension to provide the view like BeyondCompare 3-way merge, I am thinking about developing one like this.

pauloborba · 2018-10-11T12:05:27Z

that would be great @Symbolk. you might want to have a look at https://semanticmerge.com.

pauloborba · 2018-10-11T17:29:45Z

@jvcoutinho it seems we have similar problems with annotations too

guilhermejccavalcanti · 2018-10-11T17:41:39Z

Annotations are considered as part of declarations body, and as such are merged with textual merged. So the question is: annotations order matters? If so, the behavior is correct. Otherwise, would be necessary to have separate nodes representing annotations. Em qui, 11 de out de 2018 14:29, Paulo Borba <notifications@github.com> escreveu:

…

@jvcoutinho <https://github.com/jvcoutinho> it seems we have similar problems with annotations too — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#42 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/ADLfXUtB6P7pXY49YXcnjrS8lUrQXWAHks5uj4AQgaJpZM4Woj42> .

pauloborba · 2018-10-31T20:13:59Z

@jvcoutinho ask if ISI developers adopt an indentation tool/standard

jvcoutinho · 2019-06-17T16:44:37Z

@pauloborba this might be the next to go

pauloborba · 2019-06-18T21:10:38Z

yes @jvcoutinho, please go ahead

pauloborba · 2019-06-18T21:11:29Z

that would be great @Symbolk. you might want to have a look at https://semanticmerge.com.

have you made progress here @Symbolk?

jvcoutinho · 2019-06-28T22:14:34Z

@pauloborba @guilhermejccavalcanti I've been looking the way the creator of Spoofax, Eelco Visser, had treated parsing and pretty-printing. In his paper "Declarative Specification of Indentation Rules", he says that the grammar should have special declarations to specify the indentation. These declarations are ignored by the parser (for layout-sensitive languages like Haskell or Python, these declarations play an important role in parsing too) but relevant for the pretty-printer.

guilhermejccavalcanti · 2019-06-29T00:40:57Z

he says that the grammar should have special declarations to specify the indentation

So, the starting point is the generator package https://github.com/guilhermejccavalcanti/jFSTMerge/tree/master/src/main/java/br/ufpe/cin/generator which uses classes from the featurehouse.jar dependency. Afterwards, it would be necessary to change the classes from this package to support new annotations. Em sex, 28 de jun de 2019 às 19:14, João Victor <notifications@github.com> escreveu:

…

I've been looking the way the creator of Spoofax <http://www.metaborg.org/en/latest/index.html>, Eelco Visser <https://eelcovisser.org/>, had treated parsing and pretty-printing. In his paper "Declarative Specification of Indentation Rules", he says that the grammar should have special declarations to specify the indentation. These declarations are ignored by the parser (for layout-sensitive languages like Haskell or Python, these declarations play an important role in parsing too) but relevant for the pretty-printer. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#42>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAZN6XIAV2VGRHPYGC3RV63P42EMXANCNFSM4FVCHY3A> .

pauloborba · 2019-07-04T10:52:47Z

@jvcoutinho you could first try to solve the comments issue.

this should be simpler, and satisfactory for companies that use a code formatting standard/tool. s3m could simply merge and then call the formatting tool.

then you try to solve the indentation/white space issue.

jvcoutinho · 2019-07-09T18:39:35Z

So, the comments issue.
Currently, S3M stores comments in the FSTTerminal field prefix. It contains all the tokens since the end of the last terminal read. This implies comments depend on terminals to be considered. Also, textual merge runs over this field, leading possibly to conflicts.

There are two main problems with the current approach:

False positives;
Orphan comments.

In the scenario below, left removed method a() and right added a comment above method a() and another below a().

Comments represent no semantic "threat". I believe comments merge shouldn't lead to conflicts (S3M used to consider left version, while SemanticMerge always consider right version). Anyway, resolving false positives is the easy part.

Orphan comments, on the other hand, are harder to deal with. I propose two solutions:

Use unstructured merge to search for comments that s3m had forgotten. This would be basically a textual approach and it's more susceptible to errors.
Store comments as tree nodes (just like JDT parser does). In this approach, they have no identifiers, so we would have to build a handler for them.

What do you think, @pauloborba @guilhermejccavalcanti?

jvcoutinho · 2019-07-10T19:20:25Z

Check if it's worth to update the grammar to include an attribute comments in a FSTNonTerminal node.
Check if we should keep a comment of a removed node.
S3M doesn't ignore comments in the middle of the signature:

pauloborba · 2019-07-10T19:24:29Z

check example “comment public final…” vs “final comment static"

…

On 10 Jul 2019 21:20 +0200, João Victor ***@***.***>, wrote: 1. > Check if it's worth to update the grammar to include an attribute comments in a FSTNonTerminal node. 2. > Check if we should keep a comment of a removed node. 3. > S3M doesn't ignore comments in the middle of the signature: — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

jvcoutinho · 2019-07-10T19:29:17Z

S3M reports 2 conflicts (the prefix field and the declaration field) in different lines.

jvcoutinho · 2019-07-30T17:38:22Z

Indentação
preservar original;
se nós forem adicionados ou removidos? (não há problema)
se left/right adicionarem exatamente o mesmo método, qual escolher?
espaçamento entre métodos, contar linhas ou padronizar?

Comentários
fazer com que a chave que fecha a classe ("}") virasse um FSTTerminal para capturar os comentários órfãos como prefixo

guilhermejccavalcanti · 2019-08-01T18:14:17Z

@jvcoutinho me deparei com um possível bug no cenário em anexo, dá uma checada por favor.
Basicamente, em left e base havia Comentário seguido de Anotação. E right inverteu isso, ficou Anotação seguida de Comentário. Left não mudou nada no comentário nem na anotação, e deu conflito.

comment.zip

jvcoutinho · 2019-08-02T17:24:02Z

Bem, de fato é um bug. Vou colocar os trechos de código aqui para explicar.

LEFT/BASE

@LocalBean
@Stateless
/**
 * Classe de servido para gerenciar a entidade Apelido.
 * 
 * @author rbonifacio
 */
public class PalavraChaveServico extends ServicoGenerico<PalavraChave> {

RIGHT

/**
 * Classe de servido para gerenciar a entidade Apelido.
 * 
 * @author rbonifacio
 */
@LocalBean
@Stateless
public class PalavraChaveServico extends AbstractServicoSisdot<PalavraChave> {

MERGE

<<<<<<< MINE
=======
/**
 * Classe de servido para gerenciar a entidade Apelido.
 * 
 * @author rbonifacio
 */
>>>>>>> YOURS
@LocalBean
@Stateless
public class  PalavraChaveServico extends AbstractServicoSisdot<PalavraChave> {

Em left, tanto as anotações, o comentário e o token public foram colocados dentro do corpo de um mesmo nó (a princípio, esse nó conteria apenas os modificadores do método, o que sugere bug também na gramática). Em right, as anotações e o public foram colocados dentro do corpo de um mesmo nó. Porém, o comentário entrou como prefixo desse nó.

Assim, o conflito se deu quando o merge textual foi chamado nos prefixos. O estranho é que o prefixo de left é idêntico ao prefixo da base, então não sei ao certo o motivo do conflito. O não-estruturado não reportou.

Aí vai alguns prints:
Nó de left/base:

Nó de right:

pauloborba · 2019-08-05T16:38:38Z

@jvcoutinho please check if this is related to the case where left removes declaration but not its associated comment. add a test case for that.

also check how modifiers are represented in the tree.

jvcoutinho · 2019-08-06T01:10:43Z

@pauloborba

also check how modifiers are represented in the tree.

Exactly one Modifiers node is created for each class declaration, and it contains all of the modifiers in its content (even if the class has no modifiers, resulting in an empty string)

please check if this is related to the case where left removes declaration but not its associated comment.

the result is only the comment (and only one, no duplicates), but wrapped in a conflict:

although, just as the bug above, one of the contributions and the base are exactly the same. I believe prefix merge has a internal bug.

pauloborba · 2019-08-06T13:21:59Z

@jvcoutinho yes, it should be handled as a child of a non-terminal node, right?

guilhermejccavalcanti · 2019-08-06T17:09:07Z

I believe that the comments are not in the prefix because they are beetwen the annotations and the modifier. So, I guess this is the condition implemented on featurehouse "the comments are prefix only if there is nothing before them". Not sure if this is a grammar issue since this is based on original Java 8 grammar. We should avoid changes on grammar unless it is wrong, otherwise the tool will have behavior hard to explain.

Now, the question about "why is a conflict reported if there is only a single change?". This happens because the base version of the comment is empty, so, as there is no base, textual merge applies a two-way merge, and reports a conflict for every difference.

So, a slightly improvement would be to compare the strings representing the three versions, and when only one of them is different, the difference is merge result. This will fix this issue.
This is actually what we do for the entire files (https://github.com/guilhermejccavalcanti/jFSTMerge/blob/master/src/main/java/br/ufpe/cin/app/JFSTMerge.java#L210)

pauloborba · 2019-08-07T20:10:05Z

@jvcoutinho would we have a similar problem when merging a method body that is empty in the base?

jvcoutinho · 2019-08-09T14:42:31Z

@pauloborba No because its signature is included in the content to be merged. Base could never be empty in these cases.

jvcoutinho · 2019-08-09T15:13:33Z

Now, the question about "why is a conflict reported if there is only a single change?". This happens because the base version of the comment is empty, so, as there is no base, textual merge applies a two-way merge, and reports a conflict for every difference.

@guilhermejccavalcanti @pauloborba Fixed in the current PR (#89).

pauloborba · 2019-08-14T16:29:53Z

@jvcoutinho please add tests

pauloborba · 2019-08-14T16:43:31Z

@guilhermejccavalcanti could you please review that?

pauloborba · 2019-08-14T16:46:07Z

@pauloborba No because its signature is included in the content to be merged. Base could never be empty in these cases.

just to make sure that a similar problem doesn't happen when handling bodies in isolation.

pauloborba added this to To do in s3m Oct 31, 2018

jvcoutinho mentioned this issue Nov 21, 2018

Call textual merge over comments #73

Merged

guilhermejccavalcanti added the bug label Nov 26, 2018

pauloborba mentioned this issue Jul 4, 2019

Is it possible to replace the parser module with JavaParser? #36

Closed

jvcoutinho mentioned this issue Jul 4, 2019

A signature check is crashing the program #82

Closed

jvcoutinho mentioned this issue Jul 24, 2019

Comments & Indentation bugfix #89

Merged

jvcoutinho mentioned this issue Aug 11, 2019

Indentation using spaces and tabs are different for renaming matches #93

Closed

guilhermejccavalcanti closed this as completed in #89 Aug 28, 2019

pauloborba moved this from To do to Done in s3m Oct 11, 2019

Indentation and Comments inconsistent with the original code #42

Indentation and Comments inconsistent with the original code #42

Comments

Symbolk commented Sep 14, 2018

pauloborba commented Sep 14, 2018

pauloborba commented Sep 14, 2018

pauloborba commented Sep 14, 2018

jvcoutinho commented Sep 14, 2018

jvcoutinho commented Oct 8, 2018

pauloborba commented Oct 8, 2018

jvcoutinho commented Oct 9, 2018 • edited Loading

jvcoutinho commented Oct 9, 2018

jvcoutinho commented Oct 9, 2018 • edited Loading

pauloborba commented Oct 9, 2018

jvcoutinho commented Oct 9, 2018

pauloborba commented Oct 9, 2018

guilhermejccavalcanti commented Oct 9, 2018

Symbolk commented Oct 10, 2018

Symbolk commented Oct 10, 2018

jvcoutinho commented Oct 10, 2018

Symbolk commented Oct 11, 2018

pauloborba commented Oct 11, 2018

pauloborba commented Oct 11, 2018

guilhermejccavalcanti commented Oct 11, 2018 via email

pauloborba commented Oct 31, 2018

jvcoutinho commented Jun 17, 2019

pauloborba commented Jun 18, 2019

pauloborba commented Jun 18, 2019

jvcoutinho commented Jun 28, 2019 • edited Loading

guilhermejccavalcanti commented Jun 29, 2019 via email

pauloborba commented Jul 4, 2019

jvcoutinho commented Jul 9, 2019 • edited Loading

jvcoutinho commented Jul 10, 2019

pauloborba commented Jul 10, 2019 via email

jvcoutinho commented Jul 10, 2019

jvcoutinho commented Jul 30, 2019

guilhermejccavalcanti commented Aug 1, 2019

jvcoutinho commented Aug 2, 2019

pauloborba commented Aug 5, 2019

jvcoutinho commented Aug 6, 2019 • edited Loading

pauloborba commented Aug 6, 2019

guilhermejccavalcanti commented Aug 6, 2019 • edited Loading

pauloborba commented Aug 7, 2019

jvcoutinho commented Aug 9, 2019

jvcoutinho commented Aug 9, 2019

pauloborba commented Aug 14, 2019

pauloborba commented Aug 14, 2019

pauloborba commented Aug 14, 2019

jvcoutinho commented Oct 9, 2018 •

edited

Loading

jvcoutinho commented Oct 9, 2018 •

edited

Loading

jvcoutinho commented Jun 28, 2019 •

edited

Loading

jvcoutinho commented Jul 9, 2019 •

edited

Loading

jvcoutinho commented Aug 6, 2019 •

edited

Loading

guilhermejccavalcanti commented Aug 6, 2019 •

edited

Loading