[core] CPD: --skip-duplicate-files has no effect (7.0.0 regression) #4942

Closed
C-Otto opened this issue Apr 8, 2024 · 1 comment · Fixed by #4956
Assignees: adangel
Labels: a:bug (PMD crashes or fails to analyse a file), in:cpd (Affects the copy-paste detector)
Milestone: 7.1.0

Comments

C-Otto commented Apr 8, 2024

Affects PMD Version:

7.0.0

Description:

CPD reports duplication for identical files even though --skip-duplicate-files is enabled.
This also happens via Gradle (CPDConfiguration.setSkipDuplicates(true)).

Code Sample demonstrating the issue:

/*
 * Copyright 2019 Andreas Schmid
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package de.aaschmid.test;

import static org.junit.Assert.assertTrue;

public class Test {

    public testCpd() {
        assertTrue(true);
    }
}

Steps to reproduce:

$ cat a/Test.java
(as above)
$ cat b/Test.java
(as above)
$ diff a/Test.java b/Test.java
(no output => no difference)
$ pmd-bin-7.0.0/bin/pmd cpd --skip-duplicate-files --minimum-tokens=10 a/ b/
Found a 6 line (15 tokens) duplication in the following files: 
Starting at line 20 of /tmp/pmd/a/Test.java
Starting at line 20 of /tmp/pmd/b/Test.java

public class Test {

    public testCpd() {
        assertTrue(true);
    }
}

This is not an issue with 6.55.0:

$ ./pmd-bin-6.55.0/bin/run.sh cpd --minimum-tokens 10 --skip-duplicate-files --dir a/ --dir b/
Skipping /tmp/pmd/b/Test.java since it appears to be a duplicate file and --skip-duplicate-files is set

Running PMD through:

CLI, Gradle

C-Otto added the a:bug (PMD crashes or fails to analyse a file) label on Apr 8, 2024
C-Otto changed the title from "[core] [cpd] --skip-duplicate-files has no effect" to "[cpd] --skip-duplicate-files has no effect" on Apr 8, 2024
C-Otto changed the title from "[cpd] --skip-duplicate-files has no effect" to "[cpd] --skip-duplicate-files has no effect (7.0.0 regression)" on Apr 8, 2024
adangel added the in:cpd (Affects the copy-paste detector) label on Apr 8, 2024
Member

adangel commented Apr 8, 2024

I can confirm that this is broken now. The flag is set on CPDConfiguration but never used.

In PMD 6, the implementation is here:

if (configuration.isSkipDuplicates()) {
    // TODO refactor this thing into a separate class
    String signature = file.getName() + '_' + file.length();
    if (current.contains(signature)) {
        System.err.println("Skipping " + file.getAbsolutePath()
                + " since it appears to be a duplicate file and --skip-duplicate-files is set");
        return;
    }
    current.add(signature);
}

Note that PMD 6 only compared the simple file name and the file size, not the content of the files.
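
To make the consequence of that name-plus-size check concrete, here is a minimal standalone sketch. It is not PMD code: the class SkipDuplicatesSketch and its methods are hypothetical names chosen for illustration, and the byte-for-byte comparison is only one assumed way a stricter check could look, not necessarily what the fix does. It builds two files that share a name and a length but differ in content, and shows that a signature like the PMD 6 one treats them as duplicates.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration class, not part of PMD.
public class SkipDuplicatesSketch {

    // Same shape as the PMD 6 signature quoted above: simple file name plus size in bytes.
    static String signature(Path file) throws IOException {
        return file.getFileName().toString() + '_' + Files.size(file);
    }

    // Keeps the first file seen for each signature and drops later ones.
    static List<Path> skipBySignature(List<Path> files) throws IOException {
        Set<String> seen = new HashSet<>();
        List<Path> kept = new ArrayList<>();
        for (Path file : files) {
            if (seen.add(signature(file))) {
                kept.add(file);
            }
        }
        return kept;
    }

    // Assumed stricter alternative (not taken from PMD): compare the actual bytes.
    static boolean sameContent(Path a, Path b) throws IOException {
        return Arrays.equals(Files.readAllBytes(a), Files.readAllBytes(b));
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("cpd-skip-sketch");
        Path a = Files.createDirectories(root.resolve("a")).resolve("Test.java");
        Path b = Files.createDirectories(root.resolve("b")).resolve("Test.java");

        // Same name, same length, different content.
        Files.write(a, "class Test { int x = 1; }".getBytes(StandardCharsets.UTF_8));
        Files.write(b, "class Test { int x = 2; }".getBytes(StandardCharsets.UTF_8));

        System.out.println("signatures equal: " + signature(a).equals(signature(b)));
        System.out.println("files kept:       " + skipBySignature(Arrays.asList(a, b)).size());
        System.out.println("same content:     " + sameContent(a, b));
    }
}
```

In this sketch the two signatures are equal and only one file is kept, even though the contents differ. So a check of this kind can skip a file that is not actually a duplicate, in addition to skipping true duplicates such as the a/Test.java and b/Test.java pair from the report.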

adangel self-assigned this on Apr 12, 2024
adangel changed the title from "[cpd] --skip-duplicate-files has no effect (7.0.0 regression)" to "[core] CPD: --skip-duplicate-files has no effect (7.0.0 regression)" on Apr 12, 2024
adangel added this to the 7.1.0 milestone on Apr 12, 2024
adangel added a commit to adangel/pmd that referenced this issue Apr 12, 2024