Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core] Add text operations to manipulate a document #641

Closed
wants to merge 4 commits into from

Conversation

gibarsin
Copy link
Contributor

@gibarsin gibarsin commented Sep 21, 2017

Prologue

Hi to everyone, this is my first contribution to the project! This is the first of the many upcoming changes to the PMD project. This contribution is part of the feature Autofixable issues. Many of the changes which support this feature will be brought by my partner @MatiasComercio and me. Our mentor is @jsotuyod.
Any feedback, sugestion or problem with the following changes are welcome!

Concepts

Document

A document is a representation of a source file which can be read or updated. To represent the order of application of those manipulations, the concept of text operation is introduced.

Text Operation

A text operation can be represented as an insertion, replacement or deletion on a document, at a given offset and of a determined length.
Text operations, in a document, are ordered by their region. When any operation is to be inserted, it must satisfy that its region does not overlap with the region of the existing text operations. This list of text operations correspond to one and only one document, and they may only be applied once.

Region

A region is defined by an non-negative offset and a non-negative length, and represent a region in a document.

Summary

Text manipulations are to be applied to a document after ending the AST traversal where rule violations are found and reported.

@jsotuyod jsotuyod self-assigned this Sep 22, 2017
private StringBuffer stream;

public DocumentImp() {
stream = new StringBuffer();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do not use StringBuffer, it's a synchronized class. You should use StringBuilder instead


public class DocumentImp implements Document {

private StringBuffer stream;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it really a good option to have the whole file contents in memory in a single buffer? This lends itself to reallocations as it's being edited, with large chunks of consecutive memory blocks being copied and moved...

This may work as an initial implementation, but I don't expect this to be final

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did it as a proof of concept to test TextOperation, surely it will not be a final implementation. Should I move it to the test package?


@Override
public String getAsString() {
return new String(stream);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use stream.toString() instead

* all the text operations of its children over the document, but this instance will not apply any
* operation over the document.
*/
public class PlaceholderTextOperation extends TextOperation {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the reason for the existence of this class

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The placeholderOperation defines a region from which all its children operations will be done. As the operations are modeled in a tree manner, if i have a list of children who represent a set of changes, I would not always want to apply a change in the root node which covers all the region after it is applied to my children.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So... this will always be the root text operation? If so...

  1. the name should reflect this better
  2. you shouldn't be able to freely create new instances... maybe have the document create and hold one internally?
  3. instead of freely adding a text operation to another, you could just add operations to the document, meaning you don't need to keep the root of the tree yourself...

Just thinking out loud here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right, this will always be the root text operation and we shouldn't be able to freely create new instances, however we currently don't have any class in which to delegate the addition of edits to the tree. I put some thought on adding the operations to the document but that would increase coupling between the Document implementation and the way the text operations are applied, having two reasons to exist, the representation of a source file and the changes it holds.

Copy link
Member

@jsotuyod jsotuyod Sep 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, yet there is an inherent coupling between them both, as the root defines a range equals to the document length. That also means, the same tree CAN'T be simply applied to a different document, but is tied to a single one. This can be on the side of the operation, but still, the one-on-one relationship and the binding between doc and operations must be enforced.

Also, the operations is setup as a tree, yet I see no logic to build this tree (just adding nodes to a parent, but not a logic on what nodes should be children of which), so I'm not sure how this would work out, or if a tree structure is even needed.

} else if (other.getOffsetAfterEnding() <= getOffset()) {
comparison = 1;
} else {
throw new IllegalArgumentException("region overlaps with one of its potential siblings");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see no reason why this wouldn't actually happen under normal circumstances...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure in what case could I want to overlap regions. If there were any case, I still can't think of a way to put a total order relationship between them, causing that i would not know what operation to apply first.

* @return the child removed
* @throws IndexOutOfBoundsException if a child does not exist in the corresponding index
*/
public final TextOperation removeAndGetChild(final int indexOfChild) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why would we want to do this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary for the time being, I did them because they weren't time consuming, I'll remove it with the rest of those unused methods

* @param childToRemove child instance to be removed from this instance's children.
* @return true if the child was removed; false if the child's instance was not found
*/
public final boolean removeChild(final TextOperation childToRemove) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why would we want to do this?

*
* @return a list of the removed children. If there are not children, then the list will be empty
*/
public final List<TextOperation> removeAndGetChildren() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why would we want to do this?

*
* @return the list containing the children
*/
public final List<TextOperation> getShallowCopyOfChildren() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why would we want to do this?

return accumulatedDelta;
}

private TextOperation getChildAtIndex(final int indexChild) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we really need this wrapper over a field's method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not needed, it may be an overkill

@jsotuyod jsotuyod added an:enhancement An improvement on existing features / rules in:pmd-internals Affects PMD's internals labels Sep 22, 2017
} else {
final int childrenLastIndex = children.size() - 1;
if (potentialIndex < 0) { // It is not an insert operation
return -(potentialIndex + 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return ~potentialIndex?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the Collections.binarySearch method's documentation, the return value is defined as follows:

The index of the search key, if it is contained in the list; otherwise, (-(insertion point) - 1). The insertion point is defined as the point at which the key would be inserted into the list: [...]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, that's why I say it... for signed variables under two's complement, you just invert every bit...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh my god, I just realised you were using the bitwise operator, I couldn't make any sense of inverting every bit with just a substraction op. Brillant.

@jsotuyod jsotuyod added the is:WIP For PRs that are not fully ready, or issues that are actively being tackled label Sep 30, 2017
- Refactor getIndexFromChild method in TextOperation class
- Remove unused methods from TextOperation class
- Replace Stringbuilder for Stringbuffer in DocumentImp class
- Change PlaceholderTextOperation for RootTextOperation
- Add clarifying documentation
- Changed TextRootOperation to TextOperations which hold a list of text
operations for a given document and those changes may be only applied
once and for that document only.
- Reviewed return value of potentialIndex.
@gibarsin gibarsin closed this Dec 26, 2017
@gibarsin gibarsin deleted the text-manipulation branch December 26, 2017 12:49
@adangel adangel added the in:autofixes Affects the autofixes framework label Feb 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
an:enhancement An improvement on existing features / rules in:autofixes Affects the autofixes framework in:pmd-internals Affects PMD's internals is:WIP For PRs that are not fully ready, or issues that are actively being tackled
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants