Creator: Yegor Bugayenko
Creator: Toby Byron
I cannot use the StrictDuplicateCode check because it reports an excessive number of duplicates that are, in effect, false positives.
Specifically, each of my source code files starts with the same file header (a comment that contains a copyright notice and boilerplate relating to the GPL3 license that my code is released under).
Yes, literally speaking, StrictDuplicateCode is correctly finding duplicates here; however, standard file headers are clearly not what people want reported to them.
So, what StrictDuplicateCode needs is to automatically suppress any duplicate comments at the start of files from its reports.
Would be nice to have a property "header" for StrictDuplicateCodeCheck:
I would like to work on the issue.
Let's clarify requirements.
I assume that:
1. the license header starts on the first line of a class file and may begin with /**, /* or //
2. all comments should be ignored up to the package declaration (yes, a class may be in the default package, but we can ignore that case since it should not occur)
Are you OK with these assumptions? They would allow simple text parsing instead of TreeWalker.
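To make the two assumptions concrete, here is a minimal sketch of the text-parsing approach. It is only an illustration: `HeaderSkipper` and `firstCodeLine` are hypothetical names, not existing Checkstyle API, and the sketch assumes that comment delimiters are not followed by code on the same line.

```java
import java.util.List;

/** Hypothetical helper for illustration; not part of Checkstyle. */
public final class HeaderSkipper {

    private HeaderSkipper() {
    }

    /**
     * Returns the index of the first line that is neither blank nor part of
     * a leading comment, i.e. the number of header lines to ignore.
     * Assumes no code follows a comment delimiter on the same line.
     */
    public static int firstCodeLine(List<String> lines) {
        boolean inBlockComment = false;
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i).trim();
            if (inBlockComment) {
                // Stay inside the comment until the closing delimiter appears.
                if (line.contains("*/")) {
                    inBlockComment = false;
                }
                continue;
            }
            if (line.isEmpty() || line.startsWith("//")) {
                continue;
            }
            if (line.startsWith("/*")) {
                // Covers both /* and /** openers; the comment may close on the same line.
                inBlockComment = !line.substring(2).contains("*/");
                continue;
            }
            // First real line of code, normally the package declaration.
            return i;
        }
        // The file contains only comments and blank lines.
        return lines.size();
    }
}
```

The scan stops at the first non-comment, non-blank line rather than searching for the word "package", so it would also behave sensibly for a class in the default package. Lines before the returned index could then be excluded from the duplicate comparison.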
Question: should we add an "ignoreHeader" property to StrictDuplicateCodeCheck?
If yes, should it be set to true by default?
Hi Yuri, thanks for your desire to help, I appreciate that.
That Check has to be removed from Checkstyle completely; it is extremely buggy. The whole concept of searching for code duplication should be reviewed, reimplemented and re-tested before it is introduced to Checkstyle.
Doing a strict comparison of code is the worst implementation. The number of false positives that such tools produce makes them unusable. I am even in favour of keeping items separate with copy-paste rather than introducing dependencies between classes. I am not asking to completely abandon the idea of code duplication detection ... but it should not be done in Checkstyle for now, and should be moved to our experimental project (sevntu.checkstyle).
Please do me a favour: remove that Check completely from Checkstyle to make Checkstyle more reliable for users.
A similar problem, but in a different scope: #473
StrictDuplicateCode has been removed from Checkstyle
Is there any further explanation of the reason for removal? Is there, or will there be, a replacement? Thanks.
Reason for removal: #523 (comment)
No replacement is planned for now; maybe eventually, once the hundred or so ideas for dirty-code detection in Java are completed. I would recommend using CPD for copy-paste detection if you are crazy about that. But what I really recommend is to review your application to minimize dependencies between code; look at the report for Checkstyle itself: http://checkstyle.sourceforge.net/dsm/index.html