
[core] Improve benchmarking #4365

Merged: 5 commits merged into pmd:pmd/7.0.x on Jan 29, 2023

Conversation

@jsotuyod (Member) commented Jan 26, 2023

Describe the PR

Rule benchmarking:

  • Restore the old behavior, where the number of rule applications is reported separately from the number of tree traversals.
    • Rules that were applied but didn't match any node were previously unlisted; they now show that they were applied to all files and evaluated 0 nodes.
    • This also makes it clearer which rules make use of the rulechain and which don't.
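To illustrate the reporting change, here is a minimal sketch of a per-rule timing entry that keeps the file-application count separate from the node-evaluation count. The class and method names are purely illustrative, not PMD's real benchmarking classes:

```java
// Hypothetical sketch: a rule's timing entry tracks how many files it was
// applied to separately from how many nodes it actually evaluated, so a rule
// that never matched still appears in the report with 0 nodes.
public class RuleTimingEntry {
    final String ruleName;
    long filesApplied;   // how many files the rule ran against
    long nodesEvaluated; // how many nodes it was handed (0 if it never matched)

    RuleTimingEntry(String ruleName) {
        this.ruleName = ruleName;
    }

    void recordApplication(long nodes) {
        filesApplied++;
        nodesEvaluated += nodes;
    }

    // Previously an entry with nodesEvaluated == 0 was dropped from the
    // report; always listing it also reveals rulechain usage, since a
    // rulechain rule only sees the node types it registered for.
    String reportLine() {
        return String.format("%s: applied to %d files, evaluated %d nodes",
                ruleName, filesApplied, nodesEvaluated);
    }

    public static void main(String[] args) {
        RuleTimingEntry entry = new RuleTimingEntry("UnusedLocalVariable");
        entry.recordApplication(0); // ran on a file, matched nothing
        entry.recordApplication(3); // ran on another file, evaluated 3 nodes
        System.out.println(entry.reportLine());
    }
}
```

With the old behavior the first call would have left the rule invisible in the report; with the split counts it shows up as applied twice with 3 nodes evaluated in total.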

Language Processing Stages:

  • Remove numbers from Java LPS names
    • Only Java used them, so they were inconsistent
    • The numbers reflect the order in which the stages execute, but don't necessarily imply a required order (i.e., Comment Assignment could in reality be done at any point)
    • Since the benchmark report sorts by time spent on each stage, the numbered labels are simply confusing
    • AST Disambiguation is only tracked in the dedicated phase started by the AstProcessor; having the AstDisambiguation pass track this as well was inconsistent.
      • Since these passes can be triggered by the symbol table LPS phase, we ended up double counting total time (2 nested "LPS phases" were each counting total time on top of the other).
      • This, however, attributes all disambiguation passes done for the symbol table to symbol table cost. While accurate, that
        may not help to identify speed-up opportunities as clearly, but the benchmark is not a profiling tool.

Unaccounted:

  • Unaccounted time was significantly larger than it used to be, amounting to about 10% of total execution time. Properly account for obtaining ruleset copies, creating TextDocuments, and overall parser and rule setup as processing time.
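The setup work listed above can be attributed to named phases instead of falling into the "unaccounted" bucket. The following is a hypothetical sketch of that idea using try-with-resources; the class, phase names, and API are illustrative and differ from PMD's actual time tracker:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: setup work (ruleset copies, TextDocument creation,
// parser and rule setup) is wrapped in a named timed phase so its cost is
// attributed rather than left as unaccounted time.
public class PhaseTracker {
    private final Map<String, Long> nanosByPhase = new LinkedHashMap<>();

    // AutoCloseable so a phase can be timed with try-with-resources.
    class Timed implements AutoCloseable {
        private final String phase;
        private final long start = System.nanoTime();
        Timed(String phase) { this.phase = phase; }
        @Override public void close() {
            nanosByPhase.merge(phase, System.nanoTime() - start, Long::sum);
        }
    }

    Timed start(String phase) { return new Timed(phase); }
    boolean hasPhase(String phase) { return nanosByPhase.containsKey(phase); }

    public static void main(String[] args) {
        PhaseTracker tracker = new PhaseTracker();
        try (Timed t = tracker.start("rule-setup")) {
            // ... copy the ruleset for this thread, initialize rules ...
        }
        try (Timed t = tracker.start("text-document")) {
            // ... load file contents into a TextDocument ...
        }
        System.out.println("tracked phases: " + tracker.nanosByPhase.keySet());
    }
}
```

Anything not wrapped in such a phase ends up in the "unaccounted" line of the report, which is why instrumenting the setup steps shrinks that bucket.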

Ready?

  • Added unit tests for fixed bug/feature
  • Passing all unit tests
  • Complete build ./mvnw clean verify passes (checked automatically by GitHub Actions)
  • Added (in-code) documentation (if needed)

 - Recover the old behavior, where the number of rule applications is split from
the number of tree traversals.
 - This means that rules that are applied but didn't match any node were previously
unlisted, but now show they were applied to all files, and evaluated 0 nodes.
 - We also count towards the rule's benchmark the cost of getting to the nodes it wants
to evaluate, so poor XPath selectors actually show up for the particular rule.
 - The numbers relate to how they are executed, but don't necessarily imply a required order
(i.e., Comment Assignment could be done at any point)
 - Since the benchmark report sorts by time spent on each one, the numbered labels are simply confusing
 - First off, all LPS are benchmarked in the AstProcessor; having this here was inconsistent
 - Since these passes can be triggered by the symbol table LPS phase, we ended up double counting total time.
 - This, however, puts all disambiguation passes done for the symbol table under symbol table cost, which, although accurate,
may not help to identify speed-up opportunities as clearly, but the benchmark is not a profiling tool.
 - Unaccounted time was significantly larger than it used to be,
and in proportion took about 10% of total execution time. Properly account for
obtaining ruleset copies, TextDocuments, and overall parser and rule setup
as processing time.
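The point about poor XPath selectors can be illustrated with a toy traversal. A rule evaluated by walking the whole tree pays for every node it passes through, while a rulechain rule is only dispatched the node types it registered for; counting the nodes each strategy touches makes an expensive selector visible. The classes below are an illustrative sketch, not PMD's real AST or rulechain API:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of why the cost of *reaching* the nodes matters:
// a full-tree walk touches every node, while rulechain dispatch only
// hands the rule the node kinds it registered for.
public class TraversalCost {
    static class Node {
        final String kind;
        final List<Node> children = new ArrayList<>();
        Node(String kind) { this.kind = kind; }
        Node add(Node child) { children.add(child); return this; }
    }

    // Full traversal: every node is visited to evaluate the selector,
    // so this count is charged to the rule's benchmark.
    static long fullTraversalCost(Node root) {
        long count = 1;
        for (Node c : root.children) {
            count += fullTraversalCost(c);
        }
        return count;
    }

    // Rulechain: only nodes of the registered kind reach the rule.
    static long ruleChainCost(Node root, String registeredKind) {
        long count = root.kind.equals(registeredKind) ? 1 : 0;
        for (Node c : root.children) {
            count += ruleChainCost(c, registeredKind);
        }
        return count;
    }

    public static void main(String[] args) {
        Node root = new Node("CompilationUnit")
                .add(new Node("ClassDeclaration")
                        .add(new Node("MethodDeclaration"))
                        .add(new Node("FieldDeclaration")));
        System.out.println("full walk touches " + fullTraversalCost(root) + " nodes");
        System.out.println("rulechain sees " + ruleChainCost(root, "MethodDeclaration") + " node(s)");
    }
}
```

On real code bases the gap between the two counts is what a slow descendant-axis selector looks like in the benchmark report.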
@jsotuyod jsotuyod added this to the 7.0.0 milestone Jan 26, 2023
@jsotuyod jsotuyod added this to In progress in PMD 7 via automation Jan 26, 2023
@pmd-test commented Jan 26, 2023

2 Messages
📖 Compared to pmd/7.0.x:
This changeset changes 6 violations,
introduces 1 new violations, 0 new errors and 0 new configuration errors,
removes 1 violations, 0 errors and 0 configuration errors.
Full report
📖 Compared to master:
This changeset changes 49641 violations,
introduces 33909 new violations, 1446 new errors and 0 new configuration errors,
removes 195793 violations, 4 errors and 7 configuration errors.
Full report

Generated by 🚫 Danger

@oowekyala (Member) left a comment

LGTM, thanks!

@oowekyala oowekyala merged commit cde72a6 into pmd:pmd/7.0.x Jan 29, 2023
PMD 7 automation moved this from In progress to Done Jan 29, 2023
@jsotuyod jsotuyod deleted the improve-benchmarking branch January 29, 2023 16:17