Convert to using public JDK API #618

Arthurm1 · 2023-07-23T22:33:14Z

This is an attempt to use the JDK public API instead of the private one.

The advantage is that for JDK 17+ there should be (mostly) no need for --add-exports, which should make things easier for tools that use this plugin, e.g. Metals.

I can't test this completely locally as I'm using Windows and some tests won't run.
I'm also not entirely sure what the outputs should be. In some cases they look better e.g. enums and for some parameterised types.
For the library tests there are a lot of differences but the correct packages seem to now be displayed in test results instead of _root_.

I have no idea how this affects performance. Types, Trees and Elements now sometimes have to be queried to look up data, instead of querying the internal classes directly, so it may be slower 🤷

I've changed the -targetroot:javac-classes-directory option to use reflection so --add-exports jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED only has to be supplied if that option is specified.
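A minimal sketch of the reflection idea, not the code in this PR: the class is resolved by name at the moment it is needed, so nothing links against it unless that code path actually runs. `LazyInternalApi` and `invokeStatic` are hypothetical names chosen for illustration, and the usage below targets `java.lang.Integer` only so the example runs on any JDK.

```java
import java.lang.reflect.Method;

public final class LazyInternalApi {
  /**
   * Looks up a public static method by name at call time and invokes it.
   * Because the class is only referenced as a string, there is no static
   * dependency on it; a missing export surfaces here, when the option is used.
   */
  public static Object invokeStatic(
      String className, String methodName, Class<?>[] paramTypes, Object... args) {
    try {
      Class<?> cls = Class.forName(className);
      Method method = cls.getMethod(methodName, paramTypes);
      return method.invoke(null, args);
    } catch (ReflectiveOperationException e) {
      // Covers a missing class or method, and IllegalAccessException
      // when the package is not exported to the caller.
      throw new IllegalStateException(
          "Cannot access " + className + "#" + methodName
              + "; an --add-exports flag may be required",
          e);
    }
  }
}
```

For example, `LazyInternalApi.invokeStatic("java.lang.Integer", "parseInt", new Class<?>[] {String.class}, "42")` returns the `Integer` 42, while an unknown class name fails with an `IllegalStateException` only when the call is made.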

Test plan

Rerun all the tests I guess

olafurpg

Thank you for this contribution @Arthurm1! This PR represents a lot of work and the changes are a super valuable improvement. I'm very much in favor of merging this PR.

It's OK to update the snapshots for the failing tests. I can execute snapshots/run locally and push to your branch if there are problems running that command on Windows.
The only blocking comment is about getOverriddenMethods.

olafurpg · 2023-07-25T07:37:55Z

tests/snapshots/src/main/generated/tests/minimized/src/main/java/minimized/Enums.java

@@ -20,21 +19,21 @@ enum Enums {
  A("A", 420),
 //^ definition semanticdb maven . . minimized/Enums#A.
 //  documentation ```java\nEnums.A("A", 420) /* ordinal 0 */\n```
+//^ reference semanticdb maven . . minimized/Enums#`<init>`().


Nice improvement 👍🏻

olafurpg · 2023-07-25T07:38:50Z

tests/snapshots/src/main/generated/tests/minimized/src/main/java/minimized/Fields.java

@@ -61,6 +61,7 @@ public static String app() {
 //  ^^^^^^^^^^^ reference semanticdb maven . . minimized/Fields#InnerFields#
 //              ^^^^^^^^^^^ definition local 1
 //                          documentation ```java\nInnerFields innerFields\n```
+//                            ^^^^^^ reference local 0


Great improvement 👍🏻

olafurpg · 2023-07-25T07:41:31Z

tests/snapshots/src/main/generated/tests/minimized/src/main/java/minimized/InnerClasses.java

@@ -41,12 +41,15 @@ public enum InnerEnum {
    A,
 //  ^ definition semanticdb maven . . minimized/InnerClasses#InnerEnum#A.
 //    documentation ```java\nInnerEnum.A /* ordinal 0 */\n```
+//  ^ reference semanticdb maven . . minimized/InnerClasses#InnerEnum#`<init>`().


While this is technically correct, the navigation here might be confusing since the constructor is auto-generated. I'm OK with merging this PR with this change, but it would be nice to follow up with an issue or separate PR to undo this change.

olafurpg · 2023-07-25T07:41:47Z

tests/snapshots/src/main/generated/tests/minimized/src/main/java/minimized/InnerClasses.java

@@ -167,9 +170,6 @@ public static void testEnum(InnerEnum magicEnum) {
 //                   ^^^^^^^^ definition semanticdb maven . . minimized/InnerClasses#testEnum().
 //                            documentation ```java\npublic static void testEnum(InnerEnum magicEnum)\n```
 //                            ^^^^^^^^^ reference semanticdb maven . . minimized/InnerClasses#InnerEnum#
-//                            ^^^^^^^^^ reference semanticdb maven . . minimized/InnerClasses#InnerEnum#`<init>`().


Nice improvement 👍🏻

olafurpg · 2023-07-25T07:44:05Z

tests/snapshots/src/main/generated/tests/minimized/src/main/java/minimized/LombokBuilder.java

+//                      documentation ```java\nfinal String message\n```
+//              ^^^^^^^ definition local 1
+//                      documentation ```java\nfinal String message\n```
+//              ^^^^^^^ definition semanticdb maven . . minimized/Hello#HelloBuilder#message().


Great improvement 👍🏻

olafurpg · 2023-07-25T07:44:24Z

.../snapshots/src/main/generated/tests/minimized/src/main/java/minimized/MinimizedJavaMain.java

@@ -18,6 +18,7 @@ public static void main(String[] args) {
    TypeVariables.app(new TypeVariables.CT());
 //  ^^^^^^^^^^^^^ reference semanticdb maven . . minimized/TypeVariables#
 //                ^^^ reference semanticdb maven . . minimized/TypeVariables#app().
+//                        ^^^^^^^^^^^^^ reference semanticdb maven . . minimized/TypeVariables#


Very nice improvement!

olafurpg · 2023-07-25T07:44:34Z

.../snapshots/src/main/generated/tests/minimized/src/main/java/minimized/MinimizedJavaMain.java

@@ -46,6 +47,8 @@ public static void main(String[] args) {
 //                       ^^^ reference semanticdb maven . . minimized/Primitives#app().
            + new ParameterizedTypes<Integer, String>().app(42, "42")
 //                ^^^^^^^^^^^^^^^^^^ reference semanticdb maven . . minimized/ParameterizedTypes#`<init>`().
+//                                   ^^^^^^^ reference semanticdb maven jdk 11 java/lang/Integer#


semanticdb-javac/src/main/java/com/sourcegraph/semanticdb_javac/GlobalSymbolsCache.java

semanticdb-javac/src/main/java/com/sourcegraph/semanticdb_javac/SemanticdbVisitor.java

Arthurm1 · 2023-07-25T15:29:18Z

@olafurpg I've reverted the overrides search to use the internal API. I've also switched to regex split for package names.

I've also managed to run the tests under WSL.

I guess I could fix the enums <init> reference by creating an <init> definition on enum itself - as is done with classes that don't have constructors.

This should be good enough to merge though.

I've highlighted internal API usage with warning comments. Is it possible to test the performance of the overrides code? I've left the public API version in the file but commented it out, so switching between public and private implementations should be easy for whoever is able to benchmark it. There is another option for implementing using the public API if this one sucks.

olafurpg · 2023-07-25T15:37:39Z

Thank you @Arthurm1 ! The repo has JMH benchmarks that we can use to confirm performance claims. I will review the changes tomorrow and share instructions on how to run the benchmarks.

olafurpg

LGTM 👍🏻 Thank you @Arthurm1 !

I opened #621 to get the benchmarks running again. The following command should target only the relevant benchmarks to measure overhead of the compiler plugin

bench/jmh:run -i 10 -wi 10 -f1 -t1 -p lib=guava .*.CompileBench

You can reduce the number of iterations from 10 to a smaller number like 5, but you may not get as accurate numbers then. I suspect the performance difference is negligible when using your reimplementation of overrides computation, but it's nice to at least confirm it with actual benchmarks instead of relying on intuition.

olafurpg · 2023-07-26T09:29:39Z

To fix the override suite, you need to manually update the string literal in the assertion. There's no automatic way to update that snapshot

Arthurm1 · 2023-07-26T11:25:04Z

@olafurpg

Using the public API is about 3% slower if I'm reading these results right? I'm no expert on benchmarking

Private API

[info] Result "benchmarks.CompileBench.compileSemanticdb":
[info]   N = 10
[info]   mean =  16387.775 ±(99.9%) 281.073 ms/op
[info]   Histogram, ms/op:
[info]     [16100.000, 16150.000) = 0 
[info]     [16150.000, 16200.000) = 2 
[info]     [16200.000, 16250.000) = 1 
[info]     [16250.000, 16300.000) = 0 
[info]     [16300.000, 16350.000) = 1 
[info]     [16350.000, 16400.000) = 2 
[info]     [16400.000, 16450.000) = 0 
[info]     [16450.000, 16500.000) = 2 
[info]     [16500.000, 16550.000) = 0 
[info]     [16550.000, 16600.000) = 0 
[info]     [16600.000, 16650.000) = 1 
[info]     [16650.000, 16700.000) = 0 
[info]     [16700.000, 16750.000) = 1
[info]   Percentiles, ms/op:
[info]       p(0.0000) =  16151.304 ms/op
[info]      p(50.0000) =  16379.456 ms/op
[info]      p(90.0000) =  16724.546 ms/op
[info]      p(95.0000) =  16737.928 ms/op
[info]      p(99.0000) =  16737.928 ms/op
[info]      p(99.9000) =  16737.928 ms/op
[info]      p(99.9900) =  16737.928 ms/op
[info]      p(99.9990) =  16737.928 ms/op
[info]      p(99.9999) =  16737.928 ms/op
[info]     p(100.0000) =  16737.928 ms/op
[info] # Run complete. Total time: 00:09:34
[info] REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
[info] why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
[info] experiments, perform baseline and negative tests that provide experimental control, make sure
[info] the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
[info] Do not assume the numbers tell you what you want them to tell.
[info] Benchmark                       (lib)  Mode  Cnt      Score     Error  Units
[info] CompileBench.compile            guava    ss   10  10235.191 ± 239.764  ms/op
[info] CompileBench.compileSemanticdb  guava    ss   10  16387.775 ± 281.073  ms/op
[success] Total time: 585 s (09:45), completed Jul 26, 2023, 10:52:52 AM

Public API

[info] Result "benchmarks.CompileBench.compileSemanticdb":
[info]   N = 10
[info]   mean =  16835.790 ±(99.9%) 166.706 ms/op
[info]   Histogram, ms/op:
[info]     [16600.000, 16650.000) = 0 
[info]     [16650.000, 16700.000) = 2 
[info]     [16700.000, 16750.000) = 0 
[info]     [16750.000, 16800.000) = 1 
[info]     [16800.000, 16850.000) = 1 
[info]     [16850.000, 16900.000) = 4 
[info]     [16900.000, 16950.000) = 1 
[info]     [16950.000, 17000.000) = 0 
[info]     [17000.000, 17050.000) = 1 
[info]   Percentiles, ms/op:
[info]       p(0.0000) =  16653.959 ms/op
[info]      p(50.0000) =  16853.743 ms/op
[info]      p(90.0000) =  17013.641 ms/op
[info]      p(95.0000) =  17026.258 ms/op
[info]      p(99.0000) =  17026.258 ms/op
[info]      p(99.9000) =  17026.258 ms/op
[info]      p(99.9900) =  17026.258 ms/op
[info]      p(99.9990) =  17026.258 ms/op
[info]      p(99.9999) =  17026.258 ms/op
[info]     p(100.0000) =  17026.258 ms/op
[info] # Run complete. Total time: 00:09:38
[info] REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
[info] why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
[info] experiments, perform baseline and negative tests that provide experimental control, make sure
[info] the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
[info] Do not assume the numbers tell you what you want them to tell.
[info] Benchmark                       (lib)  Mode  Cnt      Score     Error  Units
[info] CompileBench.compile            guava    ss   10   9931.021 ± 134.260  ms/op
[info] CompileBench.compileSemanticdb  guava    ss   10  16835.790 ± 166.706  ms/op
[success] Total time: 585 s (09:45), completed Jul 26, 2023, 11:51:50 AM

You should be able to merge this PR now as tests pass

Arthurm1 · 2023-08-02T12:39:22Z

I think I'll have to look at performance some more. The above figures were just for when swapping out the overrides function.

The figures for public API vs private API are...

Public...

[info] Benchmark                       (lib)  Mode  Cnt      Score     Error  Units
[info] CompileBench.compile            guava    ss   10   9832.928 ± 233.608  ms/op
[info] CompileBench.compileSemanticdb  guava    ss   10  16445.371 ± 286.690  ms/op

Private...

[info] Benchmark                       (lib)  Mode  Cnt      Score     Error  Units
[info] CompileBench.compile            guava    ss   10   9915.177 ± 189.343  ms/op
[info] CompileBench.compileSemanticdb  guava    ss   10  12442.871 ± 295.333  ms/op

So Private API adds 25% to compile time and Public API adds 67%
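For reference, those percentages come from comparing each compileSemanticdb score with the plain compile score from the same run. A small sketch of the arithmetic (the class and method names are made up for illustration):

```java
public final class Overhead {
  /** Relative overhead of the plugin, as a percentage of the baseline compile time. */
  public static double overheadPercent(double baselineMs, double withPluginMs) {
    return (withPluginMs - baselineMs) / baselineMs * 100.0;
  }

  public static void main(String[] args) {
    // Scores copied from the two benchmark tables above (ms/op).
    System.out.printf("private: %.0f%%%n", overheadPercent(9915.177, 12442.871)); // private: 25%
    System.out.printf("public: %.0f%%%n", overheadPercent(9832.928, 16445.371)); // public: 67%
  }
}
```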

Arthurm1 · 2023-08-04T19:39:05Z

Reworked to cache all Trees in a Map until the full scan is complete, so tree references can then be looked up directly in the Map. Those lookups were what caused a lot of the performance drop-off.
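A rough sketch of the caching idea using only the public `com.sun.source` API. This is not the code in the PR: `TreePathCache`, `index` and `parse` are hypothetical names, and the real change may key or store things differently. One pass over the compilation unit records every node's `TreePath`, so later lookups are a map read rather than a fresh search of the tree.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.Tree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreePath;
import com.sun.source.util.TreePathScanner;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

public final class TreePathCache {
  /** Walks the compilation unit once and records the path of every node. */
  public static Map<Tree, TreePath> index(CompilationUnitTree unit) {
    // Identity semantics: tree nodes are distinguished by reference.
    Map<Tree, TreePath> paths = new IdentityHashMap<>();
    new TreePathScanner<Void, Void>() {
      @Override
      public Void scan(Tree tree, Void unused) {
        if (tree != null) {
          // getCurrentPath() is still the parent's path at this point
          // (null for the compilation unit itself, which TreePath accepts).
          paths.put(tree, new TreePath(getCurrentPath(), tree));
        }
        return super.scan(tree, unused);
      }
    }.scan(unit, null);
    return paths;
  }

  /** Parses an in-memory source file; only here to make the sketch runnable. */
  public static CompilationUnitTree parse(String fileName, String source) {
    JavaFileObject file =
        new SimpleJavaFileObject(
            URI.create("string:///" + fileName), JavaFileObject.Kind.SOURCE) {
          @Override
          public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return source;
          }
        };
    JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
    JavacTask task =
        (JavacTask) compiler.getTask(null, null, null, null, null, List.of(file));
    try {
      return task.parse().iterator().next();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

With the map built, `paths.get(someTree)` replaces a call such as `TreePath.getPath(unit, someTree)`, which has to scan the tree to find its target on every call.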

Now performance is on par with current release...

Private API (current release)
[info] Benchmark                       (lib)  Mode  Cnt      Score     Error  Units
[info] CompileBench.compile            guava    ss   10   9915.177 ± 189.343  ms/op
[info] CompileBench.compileSemanticdb  guava    ss   10  12442.871 ± 295.333  ms/op

Public API (this PR)
[info] Benchmark                       (lib)  Mode  Cnt      Score     Error  Units
[info] CompileBench.compile            guava    ss   10  10338.916 ± 246.845  ms/op
[info] CompileBench.compileSemanticdb  guava    ss   10  12200.354 ± 716.330  ms/op

keynmol · 2023-08-08T13:42:28Z

Hey @Arthurm1 sorry for radio silence on this PR - I will take a look and re-run benchmarks just to reproduce what you are seeing.

Exciting times!

Arthurm1 · 2023-08-08T13:58:35Z

@keynmol No worries. I've got a final commit that I want to push through - changing a HashMap to a LinkedHashMap to preserve ordering so the tests don't keep showing different local XX numbers depending on what machine they're run on.
Should make the differences easier to review.
Give me 10 mins.
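To show why the map type matters, here is a simplified sketch (hypothetical names, not the PR's code). `Sym` stands in for a compiler object that uses `Object`'s identity-based `hashCode`, so a plain `HashMap` could iterate such keys in a different order on each run. `LinkedHashMap` iterates in insertion order, so numbers assigned while iterating follow the order in which the symbols were discovered.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public final class StableLocalNumbering {
  /** Stand-in for a compiler element; inherits Object's identity hashCode. */
  public static final class Sym {
    final String name;

    public Sym(String name) {
      this.name = name;
    }
  }

  /** Registers symbols in discovery order, then numbers them in a second pass. */
  public static List<String> number(List<Sym> symbolsInSourceOrder) {
    Map<Sym, Integer> ids = new LinkedHashMap<>();
    for (Sym sym : symbolsInSourceOrder) {
      ids.put(sym, -1); // placeholder until the numbering pass
    }
    List<String> out = new ArrayList<>();
    int next = 0;
    // Iteration order equals insertion order, independent of hash codes.
    for (Map.Entry<Sym, Integer> entry : ids.entrySet()) {
      entry.setValue(next);
      out.add(entry.getKey().name + " -> local " + next);
      next++;
    }
    return out;
  }
}
```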

Arthurm1 · 2023-08-08T14:13:35Z

@keynmol Sorry - accidentally included some notes so pushed twice

All done I think - you should be able to test performance.

keynmol · 2023-08-16T08:11:46Z

@Arthurm1 This morning I ran the benchmarks on my linux machine, glanced at them, put them in a file and... left the house without pasting the results 🤦

From my quick glance I could confirm the performance was on par over 10 warmup iterations and 10 iterations. I'll review the PR with that in mind, and will paste the results later today to make sure there's no regression.

Arthurm1 · 2023-08-16T14:22:17Z

@keynmol doh. Just merge it - what's the worst that could happen 😉

keynmol · 2023-08-16T17:53:24Z

I finally got back to that machine - benchmarks look good!

main:

[info] CompileBench.compile              bytebuddy    ss   10  2293.563 ± 147.362  ms/op
[info] CompileBench.compile                  guava    ss   10  3065.379 ± 107.811  ms/op
[info] CompileBench.compileSemanticdb    bytebuddy    ss   10  2871.277 ± 109.309  ms/op
[info] CompileBench.compileSemanticdb        guava    ss   10  3856.719 ±  61.454  ms/op
[info] ScipSemanticdbBench.json                N/A    ss   10  3154.109 ±  41.637  ms/op
[info] ScipSemanticdbBench.jsonParallel        N/A    ss   10  2208.542 ±  53.408  ms/op

This PR

[info] CompileBench.compile              bytebuddy    ss   10  2294.170 ± 97.459  ms/op
[info] CompileBench.compile                  guava    ss   10  2921.679 ± 37.208  ms/op
[info] CompileBench.compileSemanticdb    bytebuddy    ss   10  3016.769 ± 35.410  ms/op
[info] CompileBench.compileSemanticdb        guava    ss   10  3789.773 ± 76.272  ms/op
[info] ScipSemanticdbBench.json                N/A    ss   10  3144.038 ± 33.890  ms/op
[info] ScipSemanticdbBench.jsonParallel        N/A    ss   10  2226.563 ± 60.759  ms/op

I'll take one more look tomorrow morning and we'll merge and release this

olafurpg

I'm sorry for going completely silent for so long 🙇🏻

Thank you @Arthurm1 for closing the performance gap. It's great that we are able to move to the public API without performance penalties. This PR is a huge contribution 🙏🏻 I really appreciate your work on this. I'll let Anton do a last round of review before we merge. LGTM 👍🏻

keynmol

Amazing effort!

I ran benchmarks again in a codespace:

Benchmarks - on Arthur's PR

[info] Benchmark                             (lib)  Mode  Cnt     Score     Error  Units
[info] CompileBench.compile              bytebuddy    ss   10  2523.846 ±  75.302  ms/op
[info] CompileBench.compile                  guava    ss   10  3288.573 ±  85.234  ms/op
[info] CompileBench.compileSemanticdb    bytebuddy    ss   10  3227.423 ±  78.047  ms/op
[info] CompileBench.compileSemanticdb        guava    ss   10  4270.741 ± 119.716  ms/op
[info] ScipSemanticdbBench.json                N/A    ss   10  2311.662 ±  48.781  ms/op
[info] ScipSemanticdbBench.jsonParallel        N/A    ss   10  1280.459 ±  61.906  ms/op

On main 

[info] CompileBench.compile              bytebuddy    ss   10  2565.034 ±  81.598  ms/op
[info] CompileBench.compile                  guava    ss   10  3456.703 ±  87.916  ms/op
[info] CompileBench.compileSemanticdb    bytebuddy    ss   10  3149.586 ±  79.741  ms/op
[info] CompileBench.compileSemanticdb        guava    ss   10  4406.520 ± 134.021  ms/op
[info] ScipSemanticdbBench.json                N/A    ss   10  2750.678 ±  76.973  ms/op
[info] ScipSemanticdbBench.jsonParallel        N/A    ss   10  1495.191 ±  54.868  ms/op

Strum355 requested review from keynmol and olafurpg July 24, 2023 12:32

olafurpg requested changes Jul 25, 2023

Arthurm1 force-pushed the public_api branch from a6a169f to b2fb0b9 Compare July 25, 2023 15:22

olafurpg approved these changes Jul 26, 2023

Arthurm1 force-pushed the public_api branch from b2fb0b9 to 91ea659 Compare July 26, 2023 09:35

Arthurm1 force-pushed the public_api branch from 91ea659 to 2f6fdb0 Compare August 4, 2023 19:14

Arthurm1 force-pushed the public_api branch from 2f6fdb0 to 9a75d16 Compare August 8, 2023 14:08

convert to public API

70e7ec6

Arthurm1 force-pushed the public_api branch from 9a75d16 to 70e7ec6 Compare August 8, 2023 14:12

olafurpg approved these changes Aug 17, 2023

keynmol approved these changes Aug 17, 2023

keynmol merged commit 4ca307a into sourcegraph:main Aug 17, 2023
11 checks passed

Arthurm1 deleted the public_api branch August 17, 2023 13:30

ckipp01 added a commit to ckipp01/mill-scip that referenced this pull request Sep 1, 2023

refactor: remove the extra --add-exports

3f43424

Thanks to the work in sourcegraph/scip-java#618 these are no longer necessary!

This was referenced Mar 25, 2024

semanticdb, the silent killer scalameta/metals#6191

Open

semanticdb-javac terminates in error "silently", due to a NullPointerException #686

Open
