Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

maven indexing: local repo indexing optimizations #4095

Merged
merged 1 commit into from May 23, 2022

Conversation

mbien
Copy link
Member

@mbien mbien commented May 10, 2022

flamegraphs showed that ClassDependencyIndexCreator.addDependencyToMap
had some potential for optimizations.

  • use stream instead of HashSet which gives a very small perf improvement
    but also slightly more readable code IMO
  • use simple startsWith() loop for prefix filtering instead of
    stateful impl (MatchWords), the JVM is likely very good at
    optimizing/vectorizing String intrinsics

benchmark showed ~20% speedup

full indexing takes on my system and with my current local maven repo ~53s (down from ~66s)

@timboudreau what is your opinion on this? You wrote most of the original optimizations (including the MatchWords class). I can't see you in the reviewer list for some reason.

sidenote:
I did also write a parallel version using an ExecutorService which could reduce it further to ~38s, but it made the code slightly more complex which is probably not worth the trouble. Parallelizing on a higher level (outside of ClassDependencyIndexCreator) might give better results (better scaling) since this is only a small slice of the task.

flamegraphs:
stream_match
stream_startWith

@mbien mbien added performance Maven [ci] enable "build tools" tests labels May 10, 2022
@mbien mbien force-pushed the local-indexing-performance branch 2 times, most recently from 7753ee9 to 646ff2c Compare May 10, 2022 08:57
flamegraphs showed that ClassDependencyIndexCreator.addDependencyToMap
had some potential for optimizations.

 - use stream instead of HashSet which gives a small perf improvement
   but also slightly more readable code
 - use simple startsWith() loop for prefix filtering instead of
   stateful impl (MatchWords), the JVM is likely very good at
   optimizing/vectorizing String intrinsics

benchmark showed ~20% speedup
@mbien mbien force-pushed the local-indexing-performance branch from 646ff2c to f8be12c Compare May 22, 2022 07:38
@mbien
Copy link
Member Author

mbien commented May 22, 2022

rebased to latest master so that its easier to test together with the latest fix #4136

@mbien mbien marked this pull request as ready for review May 22, 2022 07:40
Copy link
Contributor

@vieiro vieiro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me!

Copy link
Contributor

@matthiasblaesing matthiasblaesing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I did a short test and can verify speedups in the promised ranged. Finding dependencies based on class name still worked.

I left a minimal nitpick inline. Not critical though, so feel free to apply or not.

Nice cleanup - shorter code, higher performance.

@mbien mbien merged commit 1c82904 into apache:master May 23, 2022
@neilcsmith-net neilcsmith-net added this to the NB15 milestone Jul 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Maven [ci] enable "build tools" tests performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants