[core] Add integration tests against real projects #360

adangel · 2017-04-21T15:53:21Z

Ideally we would run the current PMD against a few other (open-source) projects like Spring, Solr, openjdk, ...

This should help in finding NPEs, ClassCastExceptions, and parser errors earlier.
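As a rough sketch of what "finding these earlier" could look like once PMD output from such a run has been captured: the snippet below counts crash-like lines in a block of stderr text. The pattern list and the `count_crashes` name are assumptions made for illustration, not part of PMD or of any existing tooling.

```python
import re
from collections import Counter

# Assumed patterns for what a crash looks like in captured PMD output.
CRASH_PATTERNS = {
    "NPE": re.compile(r"java\.lang\.NullPointerException"),
    "ClassCastException": re.compile(r"java\.lang\.ClassCastException"),
    "ParseException": re.compile(r"\bParseException\b"),
}

def count_crashes(stderr_text):
    """Count lines of captured stderr that match each crash pattern."""
    counts = Counter()
    for line in stderr_text.splitlines():
        for name, pattern in CRASH_PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

A non-zero count for any key would flag that PMD version and project combination for a closer look.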

ryan-gustafson · 2017-04-21T22:52:10Z

I'm experimenting with a Gradle build script to try this, since it's pretty easy to pull down and analyze dependencies dynamically, allowing multiple versions of PMD and real projects. My thought would be to run a combination matrix of sorts, with some analysis of results between PMD versions. Not far into it, I'll share when I have something interesting. Anyone, feel free to reach out if you'd like to pick it up in the meantime.

ryan-gustafson · 2017-05-06T03:04:52Z

I've got something working and producing a report. Many things I could keep polishing and improving, but I'd like to get feedback before putting further effort in. To that end, this weekend I'm going to clean it up, collect and organize the loose ends, and submit a PR for discussion purposes.

In the meantime, I've attached an example report, produced using PMD versions 5.0.0 to 5.6.1, against some Hibernate, Solr, and Spring Framework dependencies, using all the Java rules grabbed from java-core/.../rulesets.properties. It took about 40 minutes to produce on my box (which is a bit dated). The report has 4 sections:

The PMDs used.
The Source used.
A matrix of PMD vs Source runs, including the PMD text report, stdout, stderr.
A matrix of diffs between the PMD text reports between adjacent PMD versions and Source.

No analysis is performed, although if a file is empty a link is omitted from the table.
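To make the relationship between the two matrices concrete, here is a minimal sketch in Python (not the actual Gradle script). The run matrix is the cross product of PMD versions and sources, and the diff matrix pairs each version with its successor for every source. The function names and source labels are hypothetical.

```python
from itertools import product

def run_matrix(pmd_versions, sources):
    """Every (PMD version, source) combination that needs one PMD run."""
    return list(product(pmd_versions, sources))

def adjacent_diffs(pmd_versions, sources):
    """(old version, new version, source) triples to diff, per source."""
    pairs = list(zip(pmd_versions, pmd_versions[1:]))
    return [(old, new, src) for src in sources for old, new in pairs]

versions = ["5.0.0", "5.6.0", "5.6.1"]      # illustrative subset only
sources = ["hibernate", "solr", "spring"]   # hypothetical labels
```

With N versions and M sources this gives N × M runs and (N − 1) × M diffs, which is why the full report takes a while to produce.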

Feel free to share thoughts!

Also, good news is, I was able to reliably manifest the problem from #364 in 5.6.0 when running with more than 1 thread.

This file was compressed with 7-Zip to get it under 3MB (it's over 30MB as a plain ZIP): reports.zip. Extract it and open the index.html file.

jsotuyod · 2017-06-13T19:27:11Z

@ryan-gustafson sorry it took me this long to look at it, but that report is amazing! It would be of great help to both avoid regressions, and battle test fixes and improvements beyond our test cases before a release.

One interesting thing about those diffs is the number of differences between builds for DFA results, especially considering we haven't touched that code directly in quite some time (check this and this). It seems that module is more fragile than I ever thought...

We definitely need to move this forward with master vs PR for PRs. We could upload diffs to chunk.io for free from Travis :) Please, contact me if you need help to set this up.

ryan-gustafson · 2017-06-14T01:28:28Z

@jsotuyod I've been so busy lately I've not been able to get back to this. This weekend however is looking rather clear, so I'll see about getting a PR up, that should enable progress on other fronts. I've not looked at all into chunk.io or Travis, I assume one of you guys could work on that.

Glad you found it interesting. My hope is that it has practical promise for greatly expanding coverage and regression detection, not just between releases, but also when comparing your local dev build against the latest CI SNAPSHOT build, or on a per-PR basis.

The two things I know I'd like to add, but likely not before I send a PR, would be:

Multiple language support. Currently just doing Java, but that's a gimme with Gradle and Maven repositories. Any non-Java repositories out there?
The recent CPD-related issue "[cpp] CPD gives wrong duplication blocks for CPP code #431" makes me think it would be good to add a CPD report too.

As for DFA, it's heavily dependent upon analysis of the symbol table data (here), so changes there could indirectly change DFA results (for better or worse).

jsotuyod · 2017-06-14T04:22:05Z

> @jsotuyod I've been so busy lately I've not been able to get back to this. This weekend however is looking rather clear, so I'll see about getting a PR up, that should enable progress on other fronts. I've not looked at all into chunk.io or Travis, I assume one of you guys could work on that.

That's exactly the kind of thing I was offering my assistance with. Let me know if you need anything.

> Glad you found it interesting. My hope is that it has practical promise for greatly expanding coverage and regression detection, not just between releases, but also when comparing your local dev build against the latest CI SNAPSHOT build, or on a per-PR basis.

Definitely. As I said, I'm really looking forward to having this on all PRs by making Travis do PR vs master.

> Multiple language support. Currently just doing Java, but that's a gimme with Gradle and Maven repositories. Any non-Java repositories out there?

Not sure how you are getting the sources now, but for JS at least there are several big open source projects to look at. For Apex, Visualforce, PLSQL, Apache Velocity, XML and XSL things may be harder... But we can ask our Salesforce guys if there is a good OSS project for Apex / VF to use as a benchmark.

> The recent CPD-related issue "[cpp] CPD gives wrong duplication blocks for CPP code #431" makes me think it would be good to add a CPD report too.

Definitely, but as you said, at a later stage. Just rolling this out as-is is the most valuable step.

> As for DFA, it's heavily dependent upon analysis of the symbol table data (here), so changes there could indirectly change DFA results (for better or worse).

I had no idea, good to know. Reports should be better then, assuming the DFA code is right, since we improved the symbol table a lot for some scenarios such as anonymous inner classes.

jsotuyod · 2018-01-15T23:33:52Z

@ryan-gustafson any chance we can get our hands on this, whatever state it's in? We would love for this to see the light, maybe even as part of GSoC 2018, and what you had shown us would be an amazing starting point.


ryan-gustafson · 2018-05-08T03:41:44Z

@jsotuyod I totally missed the ask for the code on this. My apologies! GSoC is in flight already, is it too late for the code to be useful? I could dig it up this week sometime yet.

jsotuyod · 2018-05-08T03:42:49Z

@ryan-gustafson it's never too late! @djydewang has already started on his own version, but yours may give him some insight or ideas.

ryan-gustafson · 2018-05-08T07:37:26Z

See the attached ZIP. It is Gradle based (Groovy), version 3.5, using the Gradle wrapper. Depending on the available Source dependencies/configurations and PMD versions, it will dynamically create the appropriate tasks (a lot of them!). The pmdRegressionDiffReport task will run everything; it can take a long time, depending on the product of the number of PMD versions and Source dependencies. There are other, smaller tasks for the different parts; you need to look at the code to understand how they are all wired together. Roughly, the parts are:

a clean target to delete the regression directory
extract source code for all dependencies into regression directory
run all PMD versions setup against all dependencies, save off stdout, stderr, and pmd report
generate diffs between adjacent PMD versions for a given dependency
generate the HTML report

The code isn't pretty, but it worked. There's no shortage of kludges and workarounds, and not all PMD versions worked right. The FIXME comments, in the code that creates the task that runs PMD, outline quite a few of the considerations I never got to.
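For readers who don't want to open the ZIP, the "diffs between adjacent PMD versions" step can be sketched roughly as follows. This is a Python approximation using the standard `difflib` module in place of whatever diff mechanism the Gradle script actually uses. The function names and the shape of the `reports` input are assumptions for illustration.

```python
import difflib

def diff_reports(old_text, new_text, old_label, new_label):
    """Unified diff between two PMD text reports; empty string if identical."""
    lines = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=old_label,
        tofile=new_label,
    )
    return "".join(lines)

def diffs_for_dependency(reports):
    """reports: list of (pmd_version, report_text) pairs in version order.

    Returns {(old_version, new_version): diff_text} for adjacent versions,
    leaving out pairs whose reports are identical.
    """
    result = {}
    for (old_v, old_t), (new_v, new_t) in zip(reports, reports[1:]):
        diff_text = diff_reports(old_t, new_t, old_v, new_v)
        if diff_text:
            result[(old_v, new_v)] = diff_text
    return result
```

Skipping identical pairs mirrors the report's behaviour of omitting the link when a file is empty.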

pmd-regression.zip

djydewang · 2018-05-08T09:12:59Z

@ryan-gustafson Amazing! I had never thought of using dependencies to generate PMD reports. Maybe I can refactor my code to generate PMD reports this way. But since I'm not familiar with Gradle, I may not be able to use the code directly. There is no doubt that your code has inspired me a lot, e.g. the TODO and FIXME comments in the code are worth thinking about. Thank you for showing us an amazing starting point again :)

adangel added the in:pmd-internals Affects PMD's internals label Apr 21, 2017

adangel mentioned this issue Apr 21, 2017

[java] AccessorClassGeneration throws ClassCastException when seeing array construction #352

Closed

ryan-gustafson mentioned this issue Apr 25, 2017

[core] Stream closed exception when running through maven #364

Closed

jsotuyod changed the title ~~[core] Add integration tests agains real projects~~ [core] Add integration tests against real projects Jan 8, 2018

jsotuyod mentioned this issue Feb 1, 2018

[java] Maven PMD plugin fails to process some files without any explanation #894

Closed

oowekyala added a commit to oowekyala/pmd that referenced this issue Mar 26, 2018

Ensure java rules don't use deprecated attributes

74bb759

This could be extended to other languages once we tackle pmd#360

jsotuyod closed this as completed Jan 9, 2019
