Convert to using public JDK API #618

Merged
merged 1 commit into from
Aug 17, 2023

Arthurm1
Contributor

This is an attempt to use the JDK public API instead of the private one.

The advantage is that for JDK 17+ there should be (mostly) no need for --add-exports which should make it easier for tools that use this plugin e.g. Metals.

I can't test this completely locally as I'm using Windows and some tests won't run.
I'm also not entirely sure what the outputs should be. In some cases they look better e.g. enums and for some parameterised types.
For the library tests there are a lot of differences but the correct packages seem to now be displayed in test results instead of _root_.

I have no idea how this affects performance. Types, Trees and Elements now sometimes have to be queried to look up data instead of being able to query the internal classes directly, so it may be slower 🤷

I've changed the -targetroot:javac-classes-directory option to use reflection so --add-exports jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED only has to be supplied if that option is specified.
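As a rough illustration of the reflection idea above (not the code from this PR): if an internal class is only reached through `Class.forName` and `Method.invoke`, the calling code has no compile-time dependency on it, so the `--add-exports` flag only matters when that code path actually runs. The class `ReflectiveAccessSketch` and the helper `invokeStaticIfAvailable` are hypothetical names, and the demo calls a harmless public JDK method rather than anything in `com.sun.tools.javac.util`.

```java
import java.lang.reflect.Method;
import java.util.Optional;

public class ReflectiveAccessSketch {

  /**
   * Hypothetical helper: looks up a class by name at runtime and calls a
   * public static no-arg method on it. Returns empty if the class or method
   * is missing or not accessible (for example, a non-exported package).
   */
  static Optional<Object> invokeStaticIfAvailable(String className, String methodName) {
    try {
      Class<?> cls = Class.forName(className);
      Method method = cls.getMethod(methodName);
      return Optional.ofNullable(method.invoke(null));
    } catch (ReflectiveOperationException | RuntimeException e) {
      // Covers ClassNotFoundException, NoSuchMethodException,
      // IllegalAccessException and InvocationTargetException.
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    // A public JDK method stands in for the internal API here.
    System.out.println(invokeStaticIfAvailable("java.lang.System", "lineSeparator").isPresent());
    // A class that does not exist simply yields an empty result.
    System.out.println(invokeStaticIfAvailable("no.such.Class", "anything").isPresent());
  }
}
```

The trade-off with this approach is that failures surface at runtime instead of compile time, so the caller has to decide what to do when the lookup comes back empty.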

Test plan

Rerun all the tests I guess

Member

@olafurpg olafurpg left a comment

Thank you for this contribution @Arthurm1! This PR represents a lot of work and the changes are a super valuable improvement. I'm very much in favor of merging this PR.

  • It's OK to update the snapshots for the failing tests. I can execute snapshots/run locally and push to your branch if there are problems running that command on Windows.
  • The only blocking comment is about getOverriddenMethods.

@@ -20,21 +19,21 @@ enum Enums {
A("A", 420),
//^ definition semanticdb maven . . minimized/Enums#A.
// documentation ```java\nEnums.A("A", 420) /* ordinal 0 */\n```
//^ reference semanticdb maven . . minimized/Enums#`<init>`().
Member

Nice improvement 👍🏻

@@ -61,6 +61,7 @@ public static String app() {
// ^^^^^^^^^^^ reference semanticdb maven . . minimized/Fields#InnerFields#
// ^^^^^^^^^^^ definition local 1
// documentation ```java\nInnerFields innerFields\n```
// ^^^^^^ reference local 0
Member

Great improvement 👍🏻

@@ -41,12 +41,15 @@ public enum InnerEnum {
A,
// ^ definition semanticdb maven . . minimized/InnerClasses#InnerEnum#A.
// documentation ```java\nInnerEnum.A /* ordinal 0 */\n```
// ^ reference semanticdb maven . . minimized/InnerClasses#InnerEnum#`<init>`().
Member

While this is technically correct, the navigation here might be confusing since the constructor is auto-generated. I'm OK with merging this PR with this change, but it would be nice to follow up with an issue or separate PR to undo this change.

@@ -167,9 +170,6 @@ public static void testEnum(InnerEnum magicEnum) {
// ^^^^^^^^ definition semanticdb maven . . minimized/InnerClasses#testEnum().
// documentation ```java\npublic static void testEnum(InnerEnum magicEnum)\n```
// ^^^^^^^^^ reference semanticdb maven . . minimized/InnerClasses#InnerEnum#
// ^^^^^^^^^ reference semanticdb maven . . minimized/InnerClasses#InnerEnum#`<init>`().
Member

Nice improvement 👍🏻

// documentation ```java\nfinal String message\n```
// ^^^^^^^ definition local 1
// documentation ```java\nfinal String message\n```
// ^^^^^^^ definition semanticdb maven . . minimized/Hello#HelloBuilder#message().
Member

Great improvement 👍🏻

@@ -18,6 +18,7 @@ public static void main(String[] args) {
TypeVariables.app(new TypeVariables.CT());
// ^^^^^^^^^^^^^ reference semanticdb maven . . minimized/TypeVariables#
// ^^^ reference semanticdb maven . . minimized/TypeVariables#app().
// ^^^^^^^^^^^^^ reference semanticdb maven . . minimized/TypeVariables#
Member

Very nice improvement!

@@ -46,6 +47,8 @@ public static void main(String[] args) {
// ^^^ reference semanticdb maven . . minimized/Primitives#app().
+ new ParameterizedTypes<Integer, String>().app(42, "42")
// ^^^^^^^^^^^^^^^^^^ reference semanticdb maven . . minimized/ParameterizedTypes#`<init>`().
// ^^^^^^^ reference semanticdb maven jdk 11 java/lang/Integer#
Member

👍🏻

@Arthurm1
Contributor Author

@olafurpg I've reverted the overrides search to use the internal API. I've also switched to regex split for package names.

I've also managed to run the tests under WSL.

I guess I could fix the enums <init> reference by creating an <init> definition on enum itself - as is done with classes that don't have constructors.

This should be good enough to merge though.

I've highlighted internal API usage with warning comments. Is it possible to test the performance of the overrides code? I've left the public API version in the file but commented it out, so switching between public and private implementations should be easy for whoever is able to benchmark it. There is another option for implementing using the public API if this one sucks.

@olafurpg
Member

Thank you @Arthurm1 ! The repo has JMH benchmarks that we can use to confirm performance claims. I will review the changes tomorrow and share instructions on how to run the benchmarks.

Member

@olafurpg olafurpg left a comment

LGTM 👍🏻 Thank you @Arthurm1 !

I opened #621 to get the benchmarks running again. The following command should target only the relevant benchmarks to measure the overhead of the compiler plugin:

bench/jmh:run -i 10 -wi 10 -f1 -t1 -p lib=guava .*.CompileBench

You can reduce the number of iterations from 10 to a smaller number like 5, but you may not get as accurate numbers then. I suspect the performance difference is negligible when using your reimplementation of overrides computation, but it's nice to at least confirm it with actual benchmarks instead of relying on intuition.

@olafurpg
Member

To fix the override suite, you need to manually update the string literal in the assertion. There's no automatic way to update that snapshot.

@Arthurm1
Contributor Author

@olafurpg

Using the public API is about 3% slower if I'm reading these results right? I'm no expert on benchmarking

Private API

[info] Result "benchmarks.CompileBench.compileSemanticdb":
[info]   N = 10
[info]   mean =  16387.775 ±(99.9%) 281.073 ms/op
[info]   Histogram, ms/op:
[info]     [16100.000, 16150.000) = 0 
[info]     [16150.000, 16200.000) = 2 
[info]     [16200.000, 16250.000) = 1 
[info]     [16250.000, 16300.000) = 0 
[info]     [16300.000, 16350.000) = 1 
[info]     [16350.000, 16400.000) = 2 
[info]     [16400.000, 16450.000) = 0 
[info]     [16450.000, 16500.000) = 2 
[info]     [16500.000, 16550.000) = 0 
[info]     [16550.000, 16600.000) = 0 
[info]     [16600.000, 16650.000) = 1 
[info]     [16650.000, 16700.000) = 0 
[info]     [16700.000, 16750.000) = 1
[info]   Percentiles, ms/op:
[info]       p(0.0000) =  16151.304 ms/op
[info]      p(50.0000) =  16379.456 ms/op
[info]      p(90.0000) =  16724.546 ms/op
[info]      p(95.0000) =  16737.928 ms/op
[info]      p(99.0000) =  16737.928 ms/op
[info]      p(99.9000) =  16737.928 ms/op
[info]      p(99.9900) =  16737.928 ms/op
[info]      p(99.9990) =  16737.928 ms/op
[info]      p(99.9999) =  16737.928 ms/op
[info]     p(100.0000) =  16737.928 ms/op
[info] # Run complete. Total time: 00:09:34
[info] REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
[info] why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
[info] experiments, perform baseline and negative tests that provide experimental control, make sure
[info] the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
[info] Do not assume the numbers tell you what you want them to tell.
[info] Benchmark                       (lib)  Mode  Cnt      Score     Error  Units
[info] CompileBench.compile            guava    ss   10  10235.191 ± 239.764  ms/op
[info] CompileBench.compileSemanticdb  guava    ss   10  16387.775 ± 281.073  ms/op
[success] Total time: 585 s (09:45), completed Jul 26, 2023, 10:52:52 AM

Public API

[info] Result "benchmarks.CompileBench.compileSemanticdb":
[info]   N = 10
[info]   mean =  16835.790 ±(99.9%) 166.706 ms/op
[info]   Histogram, ms/op:
[info]     [16600.000, 16650.000) = 0 
[info]     [16650.000, 16700.000) = 2 
[info]     [16700.000, 16750.000) = 0 
[info]     [16750.000, 16800.000) = 1 
[info]     [16800.000, 16850.000) = 1 
[info]     [16850.000, 16900.000) = 4 
[info]     [16900.000, 16950.000) = 1 
[info]     [16950.000, 17000.000) = 0 
[info]     [17000.000, 17050.000) = 1 
[info]   Percentiles, ms/op:
[info]       p(0.0000) =  16653.959 ms/op
[info]      p(50.0000) =  16853.743 ms/op
[info]      p(90.0000) =  17013.641 ms/op
[info]      p(95.0000) =  17026.258 ms/op
[info]      p(99.0000) =  17026.258 ms/op
[info]      p(99.9000) =  17026.258 ms/op
[info]      p(99.9900) =  17026.258 ms/op
[info]      p(99.9990) =  17026.258 ms/op
[info]      p(99.9999) =  17026.258 ms/op
[info]     p(100.0000) =  17026.258 ms/op
[info] # Run complete. Total time: 00:09:38
[info] REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
[info] why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
[info] experiments, perform baseline and negative tests that provide experimental control, make sure
[info] the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
[info] Do not assume the numbers tell you what you want them to tell.
[info] Benchmark                       (lib)  Mode  Cnt      Score     Error  Units
[info] CompileBench.compile            guava    ss   10   9931.021 ± 134.260  ms/op
[info] CompileBench.compileSemanticdb  guava    ss   10  16835.790 ± 166.706  ms/op
[success] Total time: 585 s (09:45), completed Jul 26, 2023, 11:51:50 AM

You should be able to merge this PR now as tests pass

@Arthurm1
Contributor Author

Arthurm1 commented Aug 2, 2023

I think I'll have to look at performance some more. The above figures were just for when swapping out the overrides function.

The figures for public API vs private API are...

Public...

[info] Benchmark                       (lib)  Mode  Cnt      Score     Error  Units
[info] CompileBench.compile            guava    ss   10   9832.928 ± 233.608  ms/op
[info] CompileBench.compileSemanticdb  guava    ss   10  16445.371 ± 286.690  ms/op

Private...

[info] Benchmark                       (lib)  Mode  Cnt      Score     Error  Units
[info] CompileBench.compile            guava    ss   10   9915.177 ± 189.343  ms/op
[info] CompileBench.compileSemanticdb  guava    ss   10  12442.871 ± 295.333  ms/op

So Private API adds 25% to compile time and Public API adds 67%
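For readers who want to check the percentages, here is a small sketch of the arithmetic: overhead is taken as the extra time of `compileSemanticdb` relative to plain `compile` from the same run. The class and method names (`OverheadSketch`, `overheadPercent`) are made up for illustration; the scores are the ones pasted above.

```java
public class OverheadSketch {

  /** Relative overhead of the plugin, in percent of the baseline compile time. */
  static double overheadPercent(double baselineMs, double withPluginMs) {
    return (withPluginMs - baselineMs) / baselineMs * 100.0;
  }

  public static void main(String[] args) {
    // Private API run: compile vs compileSemanticdb (ms/op)
    double privateOverhead = overheadPercent(9915.177, 12442.871);
    // Public API run: compile vs compileSemanticdb (ms/op)
    double publicOverhead = overheadPercent(9832.928, 16445.371);

    System.out.println("private API overhead %: " + Math.round(privateOverhead));
    System.out.println("public API overhead %:  " + Math.round(publicOverhead));
  }
}
```

Each overhead is computed against the baseline from its own run, since the plain `compile` score also moves a little between runs.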

@Arthurm1
Contributor Author

Arthurm1 commented Aug 4, 2023

Reworked to cache all Trees in a Map until the full scan is complete. Tree references can then be looked up directly in the Map rather than being queried each time (those lookups caused a lot of the performance drop-off).
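A minimal sketch of that caching idea, using only the public `com.sun.source` API. This is an assumption about the general shape, not the code in this PR: the names `TreeCacheSketch`, `Indexer` and `index` are hypothetical, and the choice to key the map by `Element` and to record only class, method and variable declarations is made purely for illustration.

```java
import com.sun.source.tree.ClassTree;
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.tree.Tree;
import com.sun.source.tree.VariableTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreePathScanner;
import com.sun.source.util.Trees;
import java.net.URI;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.lang.model.element.Element;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

public class TreeCacheSketch {

  /** Single pass that records Element -> declaring Tree (hypothetical helper). */
  static final class Indexer extends TreePathScanner<Void, Map<Element, Tree>> {
    private final Trees trees;

    Indexer(Trees trees) {
      this.trees = trees;
    }

    private void record(Tree node, Map<Element, Tree> cache) {
      Element element = trees.getElement(getCurrentPath());
      if (element != null) {
        cache.put(element, node);
      }
    }

    @Override
    public Void visitClass(ClassTree node, Map<Element, Tree> cache) {
      record(node, cache);
      return super.visitClass(node, cache);
    }

    @Override
    public Void visitMethod(MethodTree node, Map<Element, Tree> cache) {
      record(node, cache);
      return super.visitMethod(node, cache);
    }

    @Override
    public Void visitVariable(VariableTree node, Map<Element, Tree> cache) {
      record(node, cache);
      return super.visitVariable(node, cache);
    }
  }

  /** Parses and attributes one in-memory source file, then builds the cache. */
  static Map<Element, Tree> index(String className, String source) throws Exception {
    JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); // null on a runtime without javac
    JavaFileObject file =
        new SimpleJavaFileObject(
            URI.create("string:///" + className + ".java"), JavaFileObject.Kind.SOURCE) {
          @Override
          public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return source;
          }
        };
    JavacTask task = (JavacTask) compiler.getTask(null, null, null, null, null, List.of(file));
    Iterable<? extends CompilationUnitTree> units = task.parse();
    task.analyze(); // attribution, so that Trees.getElement can resolve symbols
    Trees trees = Trees.instance(task);
    Map<Element, Tree> cache = new HashMap<>();
    for (CompilationUnitTree unit : units) {
      new Indexer(trees).scan(unit, cache);
    }
    return cache;
  }
}
```

After the scan, a lookup such as `cache.get(element)` is a plain map access, so repeated queries for the same declaration no longer walk the tree again.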

Now performance is on par with current release...

Private API (current release)
[info] Benchmark                       (lib)  Mode  Cnt      Score     Error  Units
[info] CompileBench.compile            guava    ss   10   9915.177 ± 189.343  ms/op
[info] CompileBench.compileSemanticdb  guava    ss   10  12442.871 ± 295.333  ms/op

Public API (this PR)
[info] Benchmark                       (lib)  Mode  Cnt      Score     Error  Units
[info] CompileBench.compile            guava    ss   10  10338.916 ± 246.845  ms/op
[info] CompileBench.compileSemanticdb  guava    ss   10  12200.354 ± 716.330  ms/op
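
One rough way to read "on par" from these tables is to check whether the `Score ± Error` ranges of two runs overlap. The sketch below does only that; it is a quick heuristic and not a proper statistical test, and the names `IntervalOverlapSketch` and `overlaps` are hypothetical.

```java
public class IntervalOverlapSketch {

  /** True if [scoreA - errA, scoreA + errA] intersects [scoreB - errB, scoreB + errB]. */
  static boolean overlaps(double scoreA, double errA, double scoreB, double errB) {
    double lowA = scoreA - errA;
    double highA = scoreA + errA;
    double lowB = scoreB - errB;
    double highB = scoreB + errB;
    return lowA <= highB && lowB <= highA;
  }

  public static void main(String[] args) {
    // compileSemanticdb, private API vs public API after the caching rework
    System.out.println(overlaps(12442.871, 295.333, 12200.354, 716.330));
    // compileSemanticdb, private API vs public API before the rework
    System.out.println(overlaps(12442.871, 295.333, 16445.371, 286.690));
  }
}
```

With the numbers above, the ranges overlap after the rework and are clearly separated before it.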

@keynmol
Contributor

keynmol commented Aug 8, 2023

Hey @Arthurm1 sorry for radio silence on this PR - I will take a look and re-run benchmarks just to reproduce what you are seeing.

Exciting times!

@Arthurm1
Contributor Author

Arthurm1 commented Aug 8, 2023

@keynmol No worries. I've got a final commit that I want to push through - changing a HashMap to a LinkedHashMap to preserve ordering so the tests don't keep showing different local XX numbers depending on what machine they're run on.
Should make the differences easier to review.
Give me 10 mins.
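
To illustrate why the map type matters here: if `local N` numbers are handed out while iterating a map whose keys hash by identity, a `HashMap` can yield a different order on each run or machine, whereas a `LinkedHashMap` always iterates in insertion order. The sketch below is an assumption about the general mechanism, not the PR's code; `LocalNumberingSketch` and `numberLocals` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LocalNumberingSketch {

  /** Assigns "local N" numbers in the order the declarations were discovered. */
  static List<String> numberLocals(List<String> labelsInScanOrder) {
    // Keys are fresh objects with identity-based hashCode, standing in for
    // compiler symbols. LinkedHashMap keeps them in insertion order.
    Map<Object, String> discovered = new LinkedHashMap<>();
    for (String label : labelsInScanOrder) {
      discovered.put(new Object(), label);
    }

    List<String> numbered = new ArrayList<>();
    int next = 0;
    for (String label : discovered.values()) {
      numbered.add("local " + next + " = " + label);
      next++;
    }
    return numbered;
  }

  public static void main(String[] args) {
    numberLocals(List.of("message", "innerFields")).forEach(System.out::println);
  }
}
```

Swapping `LinkedHashMap` for `HashMap` in this sketch would still compile, but the numbering would then depend on identity hash codes and could differ between runs.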

@Arthurm1
Contributor Author

Arthurm1 commented Aug 8, 2023

@keynmol Sorry - accidentally included some notes so pushed twice

All done I think - you should be able to test performance.

@keynmol
Contributor

keynmol commented Aug 16, 2023

@Arthurm1 This morning I ran the benchmarks on my Linux machine, glanced at them, put them in a file and... left the house without pasting the results 🤦

From my quick glance I could confirm the performance was on par over 10 warmup iterations and 10 iterations. I'll review the PR with that in mind, and will paste the results later today to make sure there's no regression.

@Arthurm1
Contributor Author

@keynmol doh. Just merge it - what's the worst that could happen 😉

@keynmol
Contributor

keynmol commented Aug 16, 2023

I finally got back to that machine - benchmarks look good!

main:

[info] CompileBench.compile              bytebuddy    ss   10  2293.563 ± 147.362  ms/op
[info] CompileBench.compile                  guava    ss   10  3065.379 ± 107.811  ms/op
[info] CompileBench.compileSemanticdb    bytebuddy    ss   10  2871.277 ± 109.309  ms/op
[info] CompileBench.compileSemanticdb        guava    ss   10  3856.719 ±  61.454  ms/op
[info] ScipSemanticdbBench.json                N/A    ss   10  3154.109 ±  41.637  ms/op
[info] ScipSemanticdbBench.jsonParallel        N/A    ss   10  2208.542 ±  53.408  ms/op

This PR

[info] CompileBench.compile              bytebuddy    ss   10  2294.170 ± 97.459  ms/op
[info] CompileBench.compile                  guava    ss   10  2921.679 ± 37.208  ms/op
[info] CompileBench.compileSemanticdb    bytebuddy    ss   10  3016.769 ± 35.410  ms/op
[info] CompileBench.compileSemanticdb        guava    ss   10  3789.773 ± 76.272  ms/op
[info] ScipSemanticdbBench.json                N/A    ss   10  3144.038 ± 33.890  ms/op
[info] ScipSemanticdbBench.jsonParallel        N/A    ss   10  2226.563 ± 60.759  ms/op

I'll take one more look tomorrow morning and we'll merge and release this

Member

@olafurpg olafurpg left a comment

I'm sorry for going completely silent for so long 🙇🏻

Thank you @Arthurm1 for closing the performance gap. It's great that we are able to move to the public API without performance penalties. This PR is a huge contribution 🙏🏻 I really appreciate your work on this. I'll let Anton do a last round of review before we merge. LGTM 👍🏻

Contributor

@keynmol keynmol left a comment

Amazing effort!

I ran benchmarks again in a codespace:

Benchmarks - on Arthur's PR

[info] Benchmark                             (lib)  Mode  Cnt     Score     Error  Units
[info] CompileBench.compile              bytebuddy    ss   10  2523.846 ±  75.302  ms/op
[info] CompileBench.compile                  guava    ss   10  3288.573 ±  85.234  ms/op
[info] CompileBench.compileSemanticdb    bytebuddy    ss   10  3227.423 ±  78.047  ms/op
[info] CompileBench.compileSemanticdb        guava    ss   10  4270.741 ± 119.716  ms/op
[info] ScipSemanticdbBench.json                N/A    ss   10  2311.662 ±  48.781  ms/op
[info] ScipSemanticdbBench.jsonParallel        N/A    ss   10  1280.459 ±  61.906  ms/op

On main 

[info] CompileBench.compile              bytebuddy    ss   10  2565.034 ±  81.598  ms/op
[info] CompileBench.compile                  guava    ss   10  3456.703 ±  87.916  ms/op
[info] CompileBench.compileSemanticdb    bytebuddy    ss   10  3149.586 ±  79.741  ms/op
[info] CompileBench.compileSemanticdb        guava    ss   10  4406.520 ± 134.021  ms/op
[info] ScipSemanticdbBench.json                N/A    ss   10  2750.678 ±  76.973  ms/op
[info] ScipSemanticdbBench.jsonParallel        N/A    ss   10  1495.191 ±  54.868  ms/op

@keynmol keynmol merged commit 4ca307a into sourcegraph:main Aug 17, 2023
11 checks passed
@Arthurm1 Arthurm1 deleted the public_api branch August 17, 2023 13:30
ckipp01 added a commit to ckipp01/mill-scip that referenced this pull request Sep 1, 2023
Thanks to the work in sourcegraph/scip-java#618 these
are no longer necessary!