New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[core] Add operations to manipulate a document #828
Conversation
I cannot seem to be able to use lambdas since 1.8 or some API methods since 1.9 because the source compiling version is 1.7, is this done on purpose? |
@gibarsin Yes, most of pmd is compiled with Java 7, except the UI module, and the Apex and Scala modules IIRC. You don't seem to be using many lambdas anyway, it should be simple enough to fix? |
@oowekyala Everything's Java 7 compliant now, thanks for the answer! |
private int currentPosition = 0; | ||
|
||
private final Path temporaryPath = Files.createTempFile("pmd-", null); | ||
private final Writer writer = new FileWriter(temporaryPath.toFile()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefer Files.newBufferedWriter
to avoid finalizers. Refers to AvoidFileStream
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't use this method because it's documented since Java 1.8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I was looking at the method without the charset parameter.
private final Writer writer = new FileWriter(temporaryPath.toFile()); | ||
|
||
public DocumentFile(final File file) throws IOException { | ||
reader = new BufferedReader(new FileReader(file)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefer Files.newBufferedReader
to avoid finalizers. Refers to AvoidFileStream
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't use this method because it's documented since Java 1.8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I was looking at the method without the charset parameter.
|
||
while (scanner.hasNextLine()) { | ||
lineToOffset.add(currentGlobalOffset); | ||
currentGlobalOffset += scanner.nextLine().length(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this offset computation is actually wrong. scanner.nextLine()
will ignore new-line characters (which may be 1 or 2 depending on the usage of LF / CR). Your tests should validate this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you recommend that I access the private field lineToOffset in a test to assert the stored values? One approach is to make it protected and create a class which extends this one which adds a getter to the field, but I do not know if it's a pleasant way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can start by having your tests not use single-line files... And you should check different new lines are honored. Your current implementation for writing lines also breaks that by forcing the usage of the system-defined new line characters.
reader.close(); | ||
writer.close(); | ||
Files.copy(temporaryPath, Paths.get(filePath), StandardCopyOption.REPLACE_EXISTING); | ||
temporaryPath.toFile().delete(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do this copy + delete rather than simply rename?
return comparison; | ||
} | ||
|
||
private boolean areInsertOperations(final RegionByLine r1, final RegionByLine r2) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
areInsertOperations
doesn't check if the operation is an insert? That's weird...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name of the method is misleading because it checks for two insert operations at the same offset in the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nothing guarantees such assumption
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An insert operation doesn't take a region of length greater than zero in the file, but it has an offset to where insert it. For both operations to be an insertion at the same offset they have to start at the same line and column and their length representation at the file to be zero (i.e. beginLine == endLine and beginColumn == endColumn). This is what the method is checking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's an implementation detail of an insert operation that:
- this class shouldn't know
- this class can't guarantee will not be true for any other type of operation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- By the class you mean the DocumentOperationNonOverlappingRegionsComparator or DocumentOperationsApplierForNonOverlappingRegions?
- It is true that you can create a replace/delete operation with the same region as an insert operation, although it would not make any sense in a real case scenario. The underlying question in this method would be: if both start at the same offset, will they bother each other? If both operations don't start at the same offset, then I have to check that one starts after the other one finishes. If both operations start at the same offset and have length zero then they can be put one after the other, because they won't overlap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- For this scenario, neither.
- If that's what your method actually checks, that's how this method should be called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a solution I can delegate the comparison of operations to the comparison of their regions by using the comparable interface implemented in a region.
|
||
public DocumentOperationsApplierForNonOverlappingRegions(final Document document) { | ||
this.document = Objects.requireNonNull(document); | ||
comparator = new DocumentOperationNonOverlappingRegionsComparator(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is no reason for this to be an instance field, could be a static final
|
||
@After | ||
public void deleteTemporaryFiles() { | ||
temporaryFile.delete(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TemporaryFolder
does this on it's own
…ine separators in the same source file && other minor fixes
The way in which the length of the line separator is being obtained now works (tested with one source file with different types of line separators) but DocumentFile knows "too much" about what group of the matchResult to query to have the line separation match by the scanner. I think a better approach would be to extend the Scanner class and add the method getLineLengthWithLineSeparator() which is currently being held in the DocumentFile class. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't bother about that right now. You seem to be constantly over-engineering your solutions; and are yet to:
- produce a minimal working sample
- actually edit an AST during analysis
- bind this to an AST edition
There will be plenty of time to refactor once anything proves to be a problem.
private final Path temporaryPath = Files.createTempFile("pmd-", null); | ||
private final Writer writer = new FileWriter(temporaryPath.toFile()); | ||
private final Path temporaryPath = Files.createTempFile("pmd-", ".tmp"); | ||
private final Writer writer = Files.newBufferedWriter(temporaryPath, StandardCharsets.UTF_8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the charset assumption may be wrong
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there anywhere I can get the charset for the file i'm editing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PMD itself takes it from the user input (CLI, Gradle, Maven, etc.), we should probably honor that. Just have both encodings be the same, and taken at the constructor, so you can hook it to the current execution configuration at the proper layer.
|
||
public DocumentFile(final File file) throws IOException { | ||
reader = new BufferedReader(new FileReader(file)); | ||
reader = Files.newBufferedReader(file.toPath(), StandardCharsets.UTF_8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the charset assumption may be wrong
this.filePath = file.getAbsolutePath(); | ||
mapLinesToOffsets(); | ||
} | ||
|
||
private void mapLinesToOffsets() throws IOException { | ||
try (Scanner scanner = new Scanner(filePath)) { | ||
try (Scanner scanner = new Scanner(Paths.get(filePath))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not have filePath
be a Path
directly?
@@ -85,7 +85,7 @@ public int compare(final DocumentOperation o1, final DocumentOperation o2) { | |||
return comparison; | |||
} | |||
|
|||
private boolean areInsertOperations(final RegionByLine r1, final RegionByLine r2) { | |||
private boolean areInsertOperationsAtTheSameOffset(final RegionByLine r1, final RegionByLine r2) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we keep not knowing (from these classes perspective) if these are insert operations. You either check the operation type, or you drop that from the name.
@@ -146,4 +163,8 @@ private void writeUntilEOF() throws IOException { | |||
writer.write(line); | |||
} | |||
} | |||
|
|||
public List<Integer> getLineToOffset() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why public? at most it should be package-private if this is test-only
I have checked all the problems you mentioned and it seems there are none left. |
Merged! Thanks for the wait. Bare in mind I moved the |
@jsotuyod Thanks for the heads up! No worries about the waiting time. |
Summary
To be able to make the fixes persist on the file on which the AST was created on, a new handler is needed that is able to manage all the operations (insert, delete and replace) and apply them in order to the original file.
Additions
Additional notes