[core] Add text operations to manipulate a document #641

gibarsin · 2017-09-21T23:56:38Z

Prologue

Hi to everyone, this is my first contribution to the project! This is the first of the many upcoming changes to the PMD project. This contribution is part of the feature Autofixable issues. Many of the changes which support this feature will be brought by my partner @MatiasComercio and me. Our mentor is @jsotuyod.
Any feedback, sugestion or problem with the following changes are welcome!

Concepts

Document

A document is a representation of a source file which can be read or updated. To represent the order of application of those manipulations, the concept of text operation is introduced.

Text Operation

A text operation can be represented as an insertion, replacement or deletion on a document, at a given offset and of a determined length.
Text operations, in a document, are ordered by their region. When any operation is to be inserted, it must satisfy that its region does not overlap with the region of the existing text operations. This list of text operations correspond to one and only one document, and they may only be applied once.

Region

A region is defined by an non-negative offset and a non-negative length, and represent a region in a document.

Summary

Text manipulations are to be applied to a document after ending the AST traversal where rule violations are found and reported.

jsotuyod · 2017-09-22T01:13:37Z

pmd-core/src/main/java/net/sourceforge/pmd/autofixes/DocumentImp.java

+    private StringBuffer stream;
+
+    public DocumentImp() {
+        stream = new StringBuffer();


do not use StringBuffer, it's a synchronized class. You should use StringBuilder instead

jsotuyod · 2017-09-22T01:16:47Z

pmd-core/src/main/java/net/sourceforge/pmd/autofixes/DocumentImp.java

+
+public class DocumentImp implements Document {
+
+    private StringBuffer stream;


is it really a good option to have the whole file contents in memory in a single buffer? This lends itself to reallocations as it's being edited, with large chunks of consecutive memory blocks being copied and moved...

This may work as an initial implementation, but I don't expect this to be final

I did it as a proof of concept to test TextOperation, surely it will not be a final implementation. Should I move it to the test package?

jsotuyod · 2017-09-22T01:19:52Z

pmd-core/src/main/java/net/sourceforge/pmd/autofixes/DocumentImp.java

+
+    @Override
+    public String getAsString() {
+        return new String(stream);


use stream.toString() instead

jsotuyod · 2017-09-22T01:41:19Z

pmd-core/src/main/java/net/sourceforge/pmd/autofixes/operations/PlaceholderTextOperation.java

+ * all the text operations of its children over the document, but this instance will not apply any
+ * operation over the document.
+ */
+public class PlaceholderTextOperation extends TextOperation {


I'm not sure I understand the reason for the existence of this class

The placeholderOperation defines a region from which all its children operations will be done. As the operations are modeled in a tree manner, if i have a list of children who represent a set of changes, I would not always want to apply a change in the root node which covers all the region after it is applied to my children.

So... this will always be the root text operation? If so...

the name should reflect this better

you shouldn't be able to freely create new instances... maybe have the document create and hold one internally?

instead of freely adding a text operation to another, you could just add operations to the document, meaning you don't need to keep the root of the tree yourself...

Just thinking out loud here

That's right, this will always be the root text operation and we shouldn't be able to freely create new instances, however we currently don't have any class in which to delegate the addition of edits to the tree. I put some thought on adding the operations to the document but that would increase coupling between the Document implementation and the way the text operations are applied, having two reasons to exist, the representation of a source file and the changes it holds.

True, yet there is an inherent coupling between them both, as the root defines a range equals to the document length. That also means, the same tree CAN'T be simply applied to a different document, but is tied to a single one. This can be on the side of the operation, but still, the one-on-one relationship and the binding between doc and operations must be enforced.

Also, the operations is setup as a tree, yet I see no logic to build this tree (just adding nodes to a parent, but not a logic on what nodes should be children of which), so I'm not sure how this would work out, or if a tree structure is even needed.

jsotuyod · 2017-09-22T01:44:41Z