Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Detekt gradle is much slower than the jar #2035

Closed
BraisGabin opened this issue Oct 17, 2019 · 18 comments
Closed

Detekt gradle is much slower than the jar #2035

BraisGabin opened this issue Oct 17, 2019 · 18 comments

Comments

@BraisGabin
Copy link
Member

./gradlew detekt is much slower than java -jar ~/projects/detekt/detekt-cli/build/libs/detekt-cli-1.1.1-all.jar -c detekt-config.yml -b baseline.xml in multi-module projects.

I did some measure with a project where I have gralde detekt installed (I can't share it because is the code of my employer).

./gradlew detekt with org.gradle.parallel=true (8 workers) ./gradlew detekt with org.gradle.parallel=false java -jar ~/projects/detekt/detekt-cli/build/libs/detekt-cli-1.1.1-all.jar -c detekt-config.yml -b baseline.xml
~51s ~78s ~11s

This project has 44 modules right now.

Any idea why is this happening? How could we improve this numbers?

@3flex
Copy link
Member

3flex commented Oct 18, 2019

This is a known problem (though we might not specifically have an issue open for it). There is a cost to setting detekt up for each module when run over a large Gradle project that you don't see when it's run as a single invocation on the CLI.

Gradle's worker API may help but I haven't investigated in any detail.

@BraisGabin
Copy link
Member Author

Does it have sense to be that heavy? Maybe is something that we can fix.

And if it can't be optimized we could do something to pay it only once.

I'm curios about why the non parallel work is just 1.5 times slower and no ~8 times

I'm going to expend some time researching this. Any idea is welcome!

@BraisGabin
Copy link
Member Author

BraisGabin commented Oct 18, 2019

Ok, I moved to an open soruce project to do all the metrics so I can share all the data. I peaked: LeakCanary

I created a fork and update detekt to the last version and that's what I'm using to get this numbers

./gradlew detekt with org.gradle.parallel=true (8 workers) ./gradlew detekt with org.gradle.parallel=false java -jar ~/projects/detekt/detekt-cli/build/libs/detekt-cli-1.1.1-all.jar -c detekt-config.yml -b baseline.xml
~11s ~20s ~3s

A gradle scan: ./gradlew detekt --scan

I took the time of each task and then substract the time that detekt reports in each log:

gradle task (ms) detekt (ms) gradle task - detekt (ms)
6092 4804 1288
5107 3993 1114
5585 4133 1452
4952 3993 959
5907 4391 1516
4183 3074 1109
6729 4538 2191
5140 4196 944
4422 2934 1488
5103 3975 1128
4653 3200 1453
4829 3801 1028
4976 3366 1610

The average is that we are using 1200ms "doing something that it's not checking rules" per task. I checked with time java -jar ... and the average is around 350ms "doing something that it's not checking rules". So here there's a lose of performance.

The total time that we use checking rules is 56,243ms. But running java -jar detekt can check all those files in ~3000ms. I know that the last version of detekt runs type resolution and it's probably running on gradle but not with java -jar. But the different is too big to be just that.

@3flex
Copy link
Member

3flex commented Oct 18, 2019

the last version of detekt runs type resolution and it's probably running on gradle but not with java -jar

Be aware that type resolution is only used in Gradle when running one of the auto-generated tasks like detektMain - so if you're comparing just against running gradlew detekt that assumption probably isn't correct.

@BraisGabin
Copy link
Member Author

Great, so my numbers are more acurate.

I'm having big troubles profiling a jvm application. Profile an Android app is much simplier.

For now I checked that this code takes around 150ms per module:

// DetektInvoker.kt
val proc = project.javaexec {
    it.main = DETEKT_MAIN
    it.classpath = classpath
    it.args = listOf("@${argsFile.absolutePath}")
    it.isIgnoreExitValue = true
}

That's the 10% of the time if we think in the non-parallel case. What do you think about add implementation(project(":detekt-cli")) to detekt-gradle-plugin?

And probably we'll decrease the time more than 10%. Because we woudn't need to load all the classes per each module. They would be load just for the first module.

@BraisGabin
Copy link
Member Author

Okay, I wrote some TERRIBLE code just to probe my theory: 1c2e586.

I'm not an gradle expert and I don't understand how the detekt gradle plugin works exactly. But with this change the command ./gradlew detekt in this project moves from 22s to just 4s. And that's a HUGE speed up.

How to test this code:

  • ./gradlew assemble
  • ./gradlew detekt

Now you have all the things you need builded. So to measure you just need to:

  • find . -name "detekt.html" -delete // this will force the detekt gradle plugin to run again
  • ./gradlew detekt // check the time!

And if you want to know what was before just use the last two commands.

So, with this data I think that we should consider to change how detekt-gradle-plugin uses the detekt-cli. But I have no idea how to do that properly. Any ideas?

@arturbosch
Copy link
Member

@BraisGabin thanks for the investigation and the late response!
I would like to come back to this after #1998.

I think the idea behind removing the hard dependency on detekt-cli was to support toolVersion - basically allowing other cli versions compatible with the gradle plugin like all the other code quality do.

Be aware that #1991 now merged can also inpact the gradle plugin performance when using detekt.parallel=false together with gradle.pralallel=true.

@BraisGabin
Copy link
Member Author

@BraisGabin thanks for the investigation and the late response!
I would like to come back to this after #1998.

Sure, we can check this after that PR.

I think the idea behind removing the hard dependency on detekt-cli was to support toolVersion - basically allowing other cli versions compatible with the gradle plugin like all the other code quality do.

I understand that but I think that the performance impact is far too big here. I have an idea but I don't know if we can do it with gradle: We can add the dependency to detekt-cli as compileOnly. And then add to the classpath the jar selected by the user.

This is an idea. If we can do it, perfect. If it's not posible I belive that we should think about removing this feature to get that performance boost because it's huge.

Be aware that #1991 now merged can also inpact the gradle plugin performance when using detekt.parallel=false together with gradle.pralallel=true.

I'll try it, but I think that the bottleneck right now is how we launch detekt-cli.

@3flex
Copy link
Member

3flex commented Oct 30, 2019

A simple workaround would be to create a custom task that runs over all source in the project at once. That way Gradle only forks once for the entire build instead of once for each module. See detektAll task at the end of detekt's root build file.

The problem with this approach is that rules relying on type resolution will not work, and there will be more of these in future.

Also remember that the Gradle plugin supports incremental builds so once it's run once for each module it won't run again for that module unless needed. Though the first build will take a little while it will be much quicker after that, especially for multi module builds.

@arturbosch
Copy link
Member

A simple workaround would be to create a custom task that runs over all source in the project at once. That way Gradle only forks once for the entire build instead of once for each module. See detektAll task at the end of detekt's root build file.

The problem with this approach is that rules relying on type resolution will not work, and there will be more of these in future.

Also remember that the Gradle plugin supports incremental builds so once it's run once for each module it won't run again for that module unless needed. Though the first build will take a little while it will be much quicker after that, especially for multi module builds.

But this will also work when we hardcode the cli version for the gradle plugin.
As we release the gradle plugin and cli together, I'm fine with dropping this feature.
I've talked to some users at meetups which also mentioned the slow gradle plugin compared to the cli.

I ping @marschwar for his gradle knowledge and hope he can take part in this discussion :)

@3flex
Copy link
Member

3flex commented Nov 3, 2019

Will there be classpath issues if detekt CLI is run in the same process as Gradle itself? Detekt and Gradle both depend on embedded Kotlin compiler.

Any change also has to make sure the project version of CLI can be used by the Gradle plugin so it's easy to still run detekt over itself using Gradle, or to use the Gradle plugin as an included build in other projects so changes in rules/CLI can be tested easily in other projects. This is very convenient when testing rules.

As long as there's a workflow that still works for development and we know there won't be classpath conflicts then I support this change.

@marschwar
Copy link
Contributor

I played around with it a little bit and I have to admit, the difference in performance is huge. Great find @BraisGabin.

As long as there's a workflow that still works for development and we know there won't be classpath conflicts then I support this change.
I totally agree.

While testing I excluded excluded everything with org.jetbrains.kotlin from the cli-all.jar and that did get rid of the warning that kotlin is on the path multiple times. But I don't know enough about classpaths with gradle plugins to say how this would play out.
I also ran into the chicken-and-egg-problem @3flex mentioned above since the project tries to build the plugin first to use it on itself but that depends on the cli which cannot be built without the plugin.

@BraisGabin
Copy link
Member Author

The chickend/egg problem I think that you can fix it depending on the cli jar itself. It's not perfect because you need to assemble first and then use it.... But it works. kotlin-mockito does that. I'm not a huge fan but if we want this feature, it's an option.

The main problem that I see here is the classpath... I'm just pointing some ideas that I have:

  • We could create a daemon (I have no idea how exactly). This way we have a different process so there is a different classpath. The problem is: when do we stop the daemon? Do gradle give us a callback to do that? And what happen if something go wrong and gradle doesn't call it? I think that this solution would increse the complexity A LOT. But I just want to mention it as an option.
  • Can we assume that the dependencies that we need are already in the gradle classpath? I mean, assume that the embedded Kotlin compiler is going to be there so we can add it as a compileOnly dependecy. I have no idea... but we can do a little research.

@BraisGabin
Copy link
Member Author

I just saw the issue #39. It should help here.

@arturbosch arturbosch modified the milestones: 1.3.0, 1.3.1 Dec 5, 2019
arturbosch added a commit that referenced this issue Dec 27, 2019
…#2035

This deprecates setting the detekt version via the toolVersion property of the extension.
@arturbosch arturbosch modified the milestones: 1.3.1, 1.4.0 Dec 27, 2019
@arturbosch
Copy link
Member

The way we bootstrap the gradle plugin now, brings a lot of complexity to our build and even slows down contributions.
Most of the time we are waiting for some gradle plugin tests when most of the time changes only for rules, cli or api are done.

Having the gradle plugin in this repository does not help with faster releases too.
I still need to increment just the version in gradle.properties, build, open a PR, wait for CI, publish everything except the gradle plugin to bintray, increment just the gradle plugin version, build, publish the plugin and then increment the used gradle plugin version for the gradle plugin itself.

As we now have our organization I propose to move the gradle plugin to it's own repository after 1.3.1. Merge #2200 in that repo and profit from faster build and ci times in the main detekt repo :).

@arturbosch
Copy link
Member

  • Can we assume that the dependencies that we need are already in the gradle classpath? I mean, assume that the embedded Kotlin compiler is going to be there so we can add it as a compileOnly dependecy. I have no idea... but we can do a little research.

I've done this for the compiler-plugin prototype, excluded the embedded compiler and it seems to work. Gradle itself comes with a file kotlin-compiler-embeddable-1.3.50-patched-for-gradle-6.0.1.jar which should be compatible to our usage of the shaddowed IntelliJ API und the kotlin compiler API's.

We are using the shadowed stuff a lot org.jetbrains.kotlin.com.intellij.psi.PsiElement so I guess it is safe to include the our cli module?

arturbosch added a commit that referenced this issue Jan 9, 2020
…#2035

This deprecates setting the detekt version via the toolVersion property of the extension.
arturbosch added a commit that referenced this issue Jan 12, 2020
…#2035

This deprecates setting the detekt version via the toolVersion property of the extension.
@arturbosch arturbosch modified the milestones: 1.4.0, 1.5.0 Jan 12, 2020
arturbosch added a commit that referenced this issue Jan 17, 2020
…#2035

This deprecates setting the detekt version via the toolVersion property of the extension.
@arturbosch
Copy link
Member

arturbosch commented Jan 27, 2020

I've written down my observations of inspecting this issue in #2282.
It is hard to come up with a solution which is faster and same flexible as our current approach with all the classloader loading and our plugin mechanism.
My next step would be to try to cache the detekt class loader to avoid metaspace OOM and improve the performance.
If this won't work reliably, I would not want to invest more time in this but further invest in the compiler plugin approach as we might have to revamp our plugin mechanism for compiler plugins.

@arturbosch arturbosch removed this from the 1.5.0 milestone Jan 27, 2020
@arturbosch arturbosch added this to the 1.7.0 milestone Mar 5, 2020
@arturbosch
Copy link
Member

I will release a beta/test version of 1.7.0 today so we can get some feedback for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants