[performance improvement] Delete outdated hack for legacy JVM #128

Merged
merged 6 commits into from
Feb 18, 2017

KengoTODA
Member

According to my benchmark, SpotBugs is 6% slower than FindBugs 3.0.1. Some of the bottlenecks can be fixed easily without adding a new dependency or upgrading a library. This pull request proposes these easy fixes.

Benchmark script

I use Google Guava as the target because it has no large dependencies, yet it depends on JSR305, so we can exercise almost the full feature set of SpotBugs.

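#!/bin/bash -e
# Analyze guava-19.0.jar 20 times; each run prints one line of timings.
for i in {1..20}; do
  # /usr/bin/time reports on stderr, hence the 2>&1 before the pipe.
  # With the BSD/OS X output format, fields 2,4,6 are the real, user and sys seconds.
  /usr/bin/time java -Xmx2g -Xms2g \
  -jar findbugs/build/distributions/spotbugs-3.1.0-SNAPSHOT/lib/spotbugs.jar \
  -textui -auxclasspath ~/.m2/repository/com/google/code/findbugs/jsr305/3.0.1/jsr305-3.0.1.jar \
  -output /dev/null ~/.m2/repository/com/google/guava/guava/19.0/guava-19.0.jar 2>&1 | \
  grep real | sed 's/ \{1,\}/ /g' | cut -d ' ' -f 2,4,6
done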

Note

  • I ran this script on OS X; your timings may differ if you use a different OS.
  • I ran the benchmark 20 times and used the MEDIAN as the performance indicator.
  • After applying these optimizations, bottlenecks still remain, especially in collection handling. The biggest ones are listed below; I will consider introducing Guava, FastUtil or other libraries.
    • edu.umd.cs.findbugs.classfile.DescriptorFactory.getMethodDescriptor(String, String, String, boolean)
      • We create an instance first, and then check for an existing instance in the collection.
      • HashMap#get(key) takes a lot of time even though MethodDescriptor implements hashCode() and equals() with acceptable quality.
      • Deleting this cache doesn't improve performance (26.2 sec).
    • edu.umd.cs.findbugs.classfile.impl.AnalysisCache.getClassAnalysis(Class, ClassDescriptor)
      • HashMap#get(key) takes a lot of time even though ClassDescriptor implements hashCode() and equals() with acceptable quality.
    • edu.umd.cs.findbugs.classfile.DescriptorFactory.getClassDescriptor(String)
      • HashMap#get(key) takes a lot of time even though we use String as the key.
    • edu.umd.cs.findbugs.classfile.analysis.ClassInfo.findMethod(String, String, boolean)
      • We use an O(n) algorithm to search. It would be better to introduce a Table or some other collection that can be searched with an O(log(n)) algorithm.
  • For the detailed report, please check SpotBugsPerformance_3adfcbf.ods.zip
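
A minimal sketch of the "create the instance first, then check the collection" pattern described in the notes above. The `MethodKey` and `MethodKeyCache` classes are hypothetical stand-ins, not the actual SpotBugs code; the point is only to show why each call pays for an allocation plus `hashCode()` and `equals()`, even on a cache hit:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a descriptor class; NOT the SpotBugs implementation.
final class MethodKey {
    final String className;
    final String name;
    final String signature;
    final boolean isStatic;

    MethodKey(String className, String name, String signature, boolean isStatic) {
        this.className = className;
        this.name = name;
        this.signature = signature;
        this.isStatic = isStatic;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof MethodKey)) return false;
        MethodKey other = (MethodKey) o;
        return isStatic == other.isStatic
                && className.equals(other.className)
                && name.equals(other.name)
                && signature.equals(other.signature);
    }

    @Override
    public int hashCode() {
        return Objects.hash(className, name, signature, isStatic);
    }
}

// Hypothetical cache that hands out one canonical instance per distinct key.
final class MethodKeyCache {
    private final Map<MethodKey, MethodKey> cache = new HashMap<>();

    MethodKey get(String className, String name, String signature, boolean isStatic) {
        // A candidate is allocated on every call, including cache hits.
        MethodKey candidate = new MethodKey(className, name, signature, isStatic);
        // The lookup hashes the candidate and compares it with equals().
        MethodKey existing = cache.putIfAbsent(candidate, candidate);
        return existing != null ? existing : candidate;
    }

    int size() {
        return cache.size();
    }
}
```

Calling `get(...)` twice with equal arguments returns the same instance, but the second call still builds and hashes a throwaway candidate.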

This change improves performance by about 16%. I also tested performance
with `s.intern()`, but it made performance 33% slower.

note: test environment is OS X 10.11.6,
Java(TM) SE Runtime Environment (build 1.8.0_92-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.92-b14, mixed mode)

This optimization can improve performance by about 12%.

note: test environment is OS X 10.11.6,
Java(TM) SE Runtime Environment (build 1.8.0_92-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.92-b14, mixed mode)
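
For readers unfamiliar with the `s.intern()` call mentioned in the first note above, here is a small illustration of what it does. This is only a sketch of the standard `String.intern()` behaviour; it does not reproduce the code changed in this pull request, it measures nothing, and the class name `InternDemo` and the sample string are made up for the example:

```java
// Hypothetical demo class; illustrates String.intern() only.
final class InternDemo {
    static boolean internReturnsCanonicalInstance() {
        // Two distinct heap objects with equal contents.
        String a = new String("edu/umd/cs/findbugs/Example");
        String b = new String("edu/umd/cs/findbugs/Example");
        boolean distinctObjects = a != b;
        boolean equalContents = a.equals(b);
        // intern() returns the single canonical instance from the JVM string pool.
        boolean sameAfterIntern = a.intern() == b.intern();
        return distinctObjects && equalContents && sameAfterIntern;
    }

    public static void main(String[] args) {
        System.out.println(internReturnsCanonicalInstance());
    }
}
```

Each `intern()` call performs a lookup in the JVM-wide string pool, so calling it on a hot path has a cost of its own.
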
@KengoTODA KengoTODA self-assigned this Feb 18, 2017
@coveralls

Coverage Status

Coverage decreased (-0.01%) to 57.53% when pulling fdb6b99 on KengoTODA:optimise-performance into edd7da9 on spotbugs:master.

@jsotuyod
Member

I wonder how we are 6% slower if we didn't add code, but on the contrary killed quite some...

This PR covers the contents of findbugsproject/findbugs#132 regarding string canonicalization (original implementation); I strongly support it.

The rest is an alternative to part (not all) of the changes suggested in findbugsproject/findbugs#135; you may want to look at the other changes proposed in that PR.

findbugsproject/findbugs#134 also introduces performance improvements (it avoids an array allocation and initialization).


}
/*
* Andrei, 27.02.2008: "optimized" code below takes ~18% overall FB

this comment should be removed if the implementation is dropped.

@KengoTODA
Member Author

KengoTODA commented Feb 18, 2017

This PR covers the contents of findbugsproject/findbugs#132 regarding string canonicalization (original implementation)

I've run the same benchmark with Apache Cassandra 3.1.0, which I believe is a really big project (6.3 MB, about 3,600 classes). This optimisation improved performance from 284s to 265s. I think we can merge this fix unless another OSS project reveals a potential problem.

https://github.com/findbugsproject/findbugs/pull/134/files

I will try to introduce this change to this branch.

@KengoTODA
Member Author

findbugsproject/findbugs/pull/134 doesn't affect performance, but it's a more intuitive implementation, so I will add it to this pull request.

jsotuyod and others added 2 commits February 18, 2017 15:39
 There is no real need to copy things over and over
@coveralls

Coverage Status

Coverage decreased (-0.008%) to 57.532% when pulling 6454648 on KengoTODA:optimise-performance into edd7da9 on spotbugs:master.

@coveralls

Coverage Status

Coverage decreased (-0.01%) to 57.53% when pulling e8d8839 on KengoTODA:optimise-performance into edd7da9 on spotbugs:master.

@coveralls

Coverage Status

Coverage increased (+0.01%) to 57.554% when pulling 19c99a3 on KengoTODA:optimise-performance into edd7da9 on spotbugs:master.

@jsotuyod jsotuyod merged commit 6bc2ff9 into spotbugs:master Feb 18, 2017
@KengoTODA KengoTODA deleted the optimise-performance branch February 18, 2017 23:23
@KengoTODA KengoTODA modified the milestone: SpotBugs 3.1.0 May 16, 2017
@KengoTODA
Member Author

  • edu.umd.cs.findbugs.classfile.analysis.ClassInfo.findMethod(String, String, boolean)
    • We use an O(n) algorithm to search. It would be better to introduce a Table or some other collection that can be searched with an O(log(n)) algorithm.

I tried introducing Table in a9ac183, but it did not improve performance (25.83s -> 26.46s) in the microbenchmark with Guava 19.0.

Refs: SpotBugsPerformance_a9ac1838.o.zip

@jsotuyod
Member

@KengoTODA I'm theorizing here, but the main key for the table is the method name, so under heavy overloading this means several conflicts... Also, 2 nested maps + an array mean lots of jumps in memory to get to the value...

Possibly simplifying the table to:

class MethodDescriptor {
  private final String name;
  private final String signature;
  private final boolean isStatic; // "static" is a reserved word, so it cannot be a field name

  // hashCode() and equals()
}

Map<MethodDescriptor, XMethod> table = new HashMap<>();

may produce better results. But there is only one way to know for sure...
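
A runnable sketch of the flat, single-map idea suggested above. Everything here is an assumption for illustration: `FlatMethodTable` and its nested `Key` are hypothetical names, and a plain `String` stands in for the `XMethod` value type so the example stays self-contained. It has not been benchmarked against the nested-table version:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical single-level lookup table keyed by (name, signature, isStatic).
final class FlatMethodTable {

    // Composite key mirroring the three fields from the suggestion above.
    private static final class Key {
        private final String name;
        private final String signature;
        private final boolean isStatic;

        Key(String name, String signature, boolean isStatic) {
            this.name = name;
            this.signature = signature;
            this.isStatic = isStatic;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Key)) return false;
            Key other = (Key) o;
            return isStatic == other.isStatic
                    && name.equals(other.name)
                    && signature.equals(other.signature);
        }

        @Override
        public int hashCode() {
            // Mixing in the signature keeps overloads of one name in different buckets.
            return Objects.hash(name, signature, isStatic);
        }
    }

    // String is a placeholder for the real value type (an assumption of this sketch).
    private final Map<Key, String> table = new HashMap<>();

    void add(String name, String signature, boolean isStatic, String method) {
        table.put(new Key(name, signature, isStatic), method);
    }

    // Returns the registered value, or null when no method matches.
    String find(String name, String signature, boolean isStatic) {
        return table.get(new Key(name, signature, isStatic));
    }
}
```

Because the hash covers the signature as well as the name, heavily overloaded methods do not all collide on the method name, and a lookup touches one map instead of two nested maps plus an array.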

sewe pushed a commit to sewe/spotbugs that referenced this pull request Jul 13, 2017
These methods have been deprecated and turned into no-ops with
spotbugs#128. This change removes the calls (but not the methods
themselves, so not to break API) to reduce the number of
distracting warnings.
henrik242 pushed a commit that referenced this pull request Jul 18, 2017
These methods have been deprecated and turned into no-ops with
#128. This change removes the calls (but not the methods
themselves, so not to break API) to reduce the number of
distracting warnings.
@KengoTODA
Member Author

note: I ran a microbenchmark in the cloud (so it could be unstable), and it shows that SpotBugs 3.1.12 and 4.0.6 are faster than FindBugs 3.0.1 (4% slower) and SpotBugs 4.1.4 (1% slower).

https://github.com/KengoTODA/spotbugs-benchmark/runs/1362695598?check_suite_focus=true#step:7:36

Benchmark #1: SpotBugs 4.1.4
  Time (mean ± σ):     69.282 s ±  1.517 s    [User: 134.499 s, System: 1.448 s]
  Range (min … max):   67.957 s … 70.950 s    5 runs
 
Benchmark #2: SpotBugs 4.0.6
  Time (mean ± σ):     68.837 s ±  2.301 s    [User: 134.689 s, System: 1.511 s]
  Range (min … max):   66.999 s … 72.762 s    5 runs
 
Benchmark #3: SpotBugs 3.1.12
  Time (mean ± σ):     68.759 s ±  1.073 s    [User: 134.635 s, System: 1.490 s]
  Range (min … max):   67.019 s … 69.968 s    5 runs
 
Benchmark #4: FindBugs 3.0.1
  Time (mean ± σ):     71.121 s ±  1.054 s    [User: 139.361 s, System: 1.468 s]
  Range (min … max):   69.945 s … 72.370 s    5 runs
 
Summary
  'SpotBugs 3.1.12' ran
    1.00 ± 0.04 times faster than 'SpotBugs 4.0.6'
    1.01 ± 0.03 times faster than 'SpotBugs 4.1.4'
    1.03 ± 0.02 times faster than 'FindBugs 3.0.1'
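
The ratios in the Summary above can be reproduced from the reported mean times. The sketch below divides each mean by the fastest mean (68.759 s for SpotBugs 3.1.12) and rounds to two decimals; the class and method names are made up for the example, and the ± uncertainty shown in the Summary is not recomputed here:

```java
import java.util.Locale;

// Hypothetical helper that recomputes the "times faster" ratios from mean run times.
final class BenchmarkRatios {

    // Ratio of a slower mean to the fastest mean, formatted with two decimals.
    static String ratio(double slowerMeanSeconds, double fastestMeanSeconds) {
        return String.format(Locale.ROOT, "%.2f", slowerMeanSeconds / fastestMeanSeconds);
    }

    public static void main(String[] args) {
        double fastest = 68.759; // SpotBugs 3.1.12 mean, taken from the output above
        System.out.println("SpotBugs 4.0.6: " + ratio(68.837, fastest));
        System.out.println("SpotBugs 4.1.4: " + ratio(69.282, fastest));
        System.out.println("FindBugs 3.0.1: " + ratio(71.121, fastest));
    }
}
```
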
