Ivy resolve is not the most efficient #4

eed3si9n · 2014-05-06T17:56:12Z

background

There have been reports filed to sbt/sbt project on artifact resolution being "slow" especially for multi-project builds, such as sbt/sbt#413 and SBT hangs resolving dependencies. The problem has been exacerbated by the fact that sbt treats individual subprojects as independent library dependency graph (or "deps graph"), which is reasonable assumption to make for many cases.

However, there is a growing number of projects used in corporate settings in which a large project is split into subprojects with identical deps graph. In these projects, call to sbt update is repeated for dozens of times each time the deps graph is invalidated. If Ivy resolution takes 30 seconds, 20 * 30 becomes 10 minutes. The fact has lead sbt team to investigate the performance characteristics of Ivy resolution.

sample data

I have been working closely with a proprietary project to analyze this behavior, but in order to make the optimization confirmable, I have created a project with similar characteristic using only open source dependencies. eed3si9n/large-graph-project#v1. The original project contains hundreds of intertwined dependencies, and large-graph project attempts to emulate the graph. As it is an emulation, the demonstration of the speedup could be limited in for some of the fixes, but it's still useful to have an objective sample data. Here's a baseline measurement. To prime the submodules and Ivy cache, you might have to run it several times first.

$ sbt ";common/clean; common/update"

trial	sbt 0.13.2
run1	16s
run2	16s
run3	18s
run4	18s
run5	17s
median	17±1s

profiling info

Here are some analysis from running YourKit profiler on this project. Start sbt on large-graph-project, navigate to common by typing in:

> project common

Then, to make sure Ivy cache is loaded run the following:

> ;clean; update; clean

The next time you run update it should mostly measure Ivy resolution only. Run YourKit, attach it to sbt, and start CPU Profiling with Tracing mode with "Profile J2EE" unchecked, and all filtered unchecked as well.

common> update
[info] Updating {file:/Users/eyokota/workspace/large-graph-project/}common...
[info] Resolving org.fusesource.jansi#jansi;1.4 ...
[info] Done updating.
[success] Total time: 390 s, completed May 5, 2014 12:13:40 PM

Note that baseline is 17±1s so 390s is 23x slowdown caused by the profiler. Here's a link to YourKit sbt-0.13.2-large-graph-project-2014-05-05.snapshot

From Threads tab > Select the time span when I performed update task by looking at the CPU % visually. Next, expand the thread call tree that's spending 99% of the time.

sbt.Casspaths.updateTask

Eventually we find a relevant method Classpaths.updateTask.

sbt.Classpaths$$anonfun$updateTask$1.apply(Object) **372781** 99% 369

This is our starting point of analysis. The above indicates that 99% of the selected range is spent in updateTask taking 372781ms with 369 sample points. Although the actual timing of 372781ms is somewhat artificial, we can see how this timing consists of further down the call chain.

sbt.IvyActions.update

For a while the call tree remains linear, meaning all of 372781ms is spent a call which calls another method. The first diversion occurs in sbt.IvyActions.update.

sbt.IvyActions$$anonfun$update$1.apply(Ivy, DefaultModuleDescriptor, String) **372781** 99% 369

This breaks into two parts:

sbt.IvyActions$.sbt$IvyActions$$resolve(Enumeration$Value, Ivy, DefaultModuleDescriptor, String) **370724** 99% 367
sbt.IvyRetrieve$.updateReport(ResolveReport, File) 2056 1% 2

Of 372s, it spends 2 seconds updating report. Given that this 2s is stretched out 23x time, the effect should be negligible.
Of 370724ms in sbt.IvyActions.revolve, all of its time in org.apache.ivy.Ivy.resolve, which in turn spends all of its time in org.apache.ivy.core.resove.ResolveEngine.resolve.

ResolveEngine.resolve

org.apache.ivy.core.resolve.ResolveEngine.resolve(ModuleDescriptor, ResolveOptions) **370724** 99% 367

Ivy's ResolveEngine.resolve is the first time we see some breakdown. This method breaks down into five parts:

org.apache.ivy.core.resolve.ResolveEngine.getDependencies(ModuleDescriptor, ResolveOptions, ResolveReport) **363593** 97% 360
org.apache.ivy.core.resolve.ResolveEngine.outputReport(ResolveReport, ResolutionCacheManager, ResolveOptions) 4063 1% 4
org.apache.ivy.core.resolve.IvyNode.isCompletelyEvicted() 1042 0% 1
org.apache.ivy.core.report.ResolveReport.setDependencies(List, Filter) 1019 0% 1
org.apache.ivy.core.resolve.ResolveEngine.downloadArtifacts(ResolveReport, Filter, DownloadOptions) 1006 0% 1

Of 370s, it's spending 363s in ResolveEngine.getDependencies and 4s writing Ivy resolution report. The resulting XML file is quite large, so it makes sense that it takes some time. Let's keep the focus on the ResolveEngine.getDependencies method.

ResolveEngine.fetchDependencies (first call)

org.apache.ivy.core.resolve.ResolveEngine.fetchDependencies(VisitNode, String, boolean) 362555 97% 359

All of 362555ms in ResolveEngine.fecthDependencies is spent in ResolveEngine.doFetchDependencies. ResolveEngine.doFetchDependencies in turn spends all of its time in ResolveEngine.fetchDependencies.

ResolveEngine.fetchDependencies (second call)

org.apache.ivy.core.resolve.ResolveEngine.fetchDependencies(VisitNode, String, boolean) 362555 97% 359

The second call to fetchDependencies breaks down into two parts:

org.apache.ivy.core.resolve.ResolveEngine.doFetchDependencies(VisitNode, String) **361909** 97% 358
org.apache.ivy.core.resolve.VisitNode.loadData(String, boolean) **645** 0% 1

fetchDependencies is written in a recursive way. Note that the number of the sampling point has branched between the two methods. 358 sample points took doFetchDependencies, while one took visitNode.loadData.
Again, doFetchDependencies spends all of 361909ms in ResolveEngine.fetchDependencies.

ResolveEngine.fetchDependencies (third call)

org.apache.ivy.core.resolve.ResolveEngine.fetchDependencies(VisitNode, String, boolean) **361909** 97% 358

This time 358 sample points splits into three ways:

org.apache.ivy.core.resolve.ResolveEngine.doFetchDependencies(VisitNode, String) **352680** 94% 349
org.apache.ivy.core.resolve.VisitNode.loadData(String, boolean) **8194** 2% 8
org.apache.ivy.core.resolve.ResolveEngine.resolveConflict(VisitNode, String) **1035** 0% 1

So, 8 sample points calling VisitNode.loadData taking 8s, and one sample point calling ResolveEngine.resolveConflict taking 1s, and the rest branches out to yet another call to ResolveEngine.doFetchDependencies.
There seems to be a pattern here.

ResolveEngine.fetchDependencies (all calls)

The following is the call graph from the first ResolveEngine.fetchDependencies with 359 sample points. Around 90% of the sample points are expanded out at this point.

Eventually one of the following methods are being called:

org.apache.ivy.core.resolve.ResolveEngine.resolveConflict(VisitNode, String)
org.apache.ivy.core.resolve.VisitNode.getDependencies(String)
org.apache.ivy.core.resolve.VisitNode.gotoNode(IvyNode)
org.apache.ivy.core.resolve.VisitNode.isEvicted()
org.apache.ivy.core.resolve.VisitNode.isCircular()
org.apache.ivy.core.resolve.VisitNode.loadData(String, boolean)

Each spends around 1s, but with 359 sample points each spending 1s would add up to roughly to 362555ms.

Here's the callee list sorted by time:

This is useful, but likely have some overlaps. Here's the callee list sorted by the "own time."

What's concerning is the invocation count for some of these methods. For example,

org.apache.ivy.core.resolve.IvyNode.loadData(String, IvyNode, String, String, boolean, IvyNodeUsage) 19567 0 872 **32374**

This seems to be a partial sample, so the actual count could be higher.

extra logging

To find out just how many times doFetchDependencies is called for which library dependencies, I have added a simple log entry in Ivy code.

$ git diff HEAD^ HEAD
diff --git a/src/java/org/apache/ivy/core/resolve/ResolveEngine.java b/src/java/org/apache/ivy/core/resolve/ResolveEngine.java
index bb3bc95..fddfbf3 100644
--- a/src/java/org/apache/ivy/core/resolve/ResolveEngine.java
+++ b/src/java/org/apache/ivy/core/resolve/ResolveEngine.java
@@ -793,6 +793,7 @@ public class ResolveEngine {

         // now we can actually resolve this configuration dependencies
         if (!isDependenciesFetched(node.getNode(), conf) && node.isTransitive()) {
+            Message.debug("- about to get dependencies for " + node.toString());
             Collection/* <VisitNode> */dependencies = node.getDependencies(conf);
             for (Iterator iter = dependencies.iterator(); iter.hasNext();) {
                 VisitNode dep = (VisitNode) iter.next();

We can publish this Ivy locally, and modify sbt to use that version and also to pass through Ivy debug log into sbt's debug log. Next, run grep to grab only the added log entry from the debug log as follows:

$ grep "about to *" update-debug-log.txt > doFetchDepependencies-log.txt

Here is a link to doFetchDepependencies-log.txt.

[debug] - about to get dependencies for com.example.large#common_2.10;0.1.0-SNAPSHOT
[debug] - about to get dependencies for org.scala-lang#scala-library;2.10.3
[debug] - about to get dependencies for org.scala-lang#scala-library;2.10.3
[debug] - about to get dependencies for org.scala-lang#scala-library;2.10.3
[debug] - about to get dependencies for org.scala-lang#scala-library;2.10.3
[debug] - about to get dependencies for com.example.large#util1_2.10;0.1.0-SNAPSHOT
[debug] - about to get dependencies for org.scalaz#scalaz-effect_2.10;7.0.6
[debug] - about to get dependencies for org.scalaz#scalaz-core_2.10;7.0.6
[debug] - about to get dependencies for org.scalaz#scalaz-core_2.10;7.0.6
[debug] - about to get dependencies for org.scalaz#scalaz-core_2.10;7.0.6
....

If we count the lines, doFetchDepependencies is being called 7929 times.

$ cat doFetchDepependencies-log.txt | wc -l
    7929

Next run uniq to group by the library dependencies to see how much of this are duplicates.

$ cat doFetchDepependencies-log.txt | sort | uniq -c | sort -rn > sorted-doFetchDependencies-log.txt

Here is a link to sorted-doFetchDependencies-log.txt.

  29 [debug] - about to get dependencies for org.scala-lang#scala-library;2.10.3
  27 [debug] - about to get dependencies for com.example.large#common_2.10;0.1.0-SNAPSHOT
  24 [debug] - about to get dependencies for org.slf4j#slf4j-api;1.7.5
  24 [debug] - about to get dependencies for org.scalaz#scalaz-effect_2.10;7.0.6
  24 [debug] - about to get dependencies for org.scala-lang#scala-library;2.10.4
  24 [debug] - about to get dependencies for org.json4s#json4s-native_2.10;3.2.6
  24 [debug] - about to get dependencies for org.json#json-simple;1.1.1
  24 [debug] - about to get dependencies for org.eclipse.jetty.orbit#javax.servlet;3.0.0.v201112011016
  24 [debug] - about to get dependencies for org.eclipse.jetty#jetty-webapp;9.1.0.v20131115
  24 [debug] - about to get dependencies for org.eclipse.jetty#jetty-plus;9.1.0.v20131115
....

The number of occurrence is prepended to the line. For example, doFetchDependencies was called for org.scala-lang#scala-library 29 times. Of the total 449 dependencies are tracked, on average doFetchDependencies was called 17 times per library dependency. Due to eviction, certain duplications are necessary to recalculate the dependency graph, but there may be some opportunities for optimization for example around IO overhead. In the field, we are observing even higher the number of occurrence (50+ times for Jackson Json parser etc.).

The text was updated successfully, but these errors were encountered:

retronym · 2014-05-06T18:13:57Z

🍻 for showing your working!

This is for resolution performance analysis. See #4

omnilinguist · 2014-11-18T01:11:56Z

@eed3si9n Any update on this?

eed3si9n · 2014-11-18T01:16:01Z

@omnilinguist There are several ways to tackle it, but as sbt, the current approach is to aggressively cache resolved result like cached resolution (sbt/sbt#1631) feature that'll be available in sbt 0.13.7.

eed3si9n added a commit that referenced this issue May 6, 2014

Add debug log to doFetchDependencies

37b9a1f

This is for resolution performance analysis. See #4

This was referenced May 6, 2014

Cache IvyNode#getdependencies #5

Merged

Increase default size for ModuleDescriptorMemoryCache #6

Merged

eed3si9n added a commit that referenced this issue May 20, 2014

Add debug log to doFetchDependencies

dfc4712

This is for resolution performance analysis. See #4

omnilinguist mentioned this issue Nov 18, 2014

Cached resolution (minigraph caching) sbt/sbt#1631

Merged

eed3si9n added the uncategorized Used for Waffle integration label Sep 18, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ivy resolve is not the most efficient #4

Ivy resolve is not the most efficient #4

eed3si9n commented May 6, 2014

retronym commented May 6, 2014

omnilinguist commented Nov 18, 2014

eed3si9n commented Nov 18, 2014

Ivy resolve is not the most efficient #4

Ivy resolve is not the most efficient #4

Comments

eed3si9n commented May 6, 2014

background

sample data

profiling info

sbt.Casspaths.updateTask

sbt.IvyActions.update

ResolveEngine.resolve

ResolveEngine.fetchDependencies (first call)

ResolveEngine.fetchDependencies (second call)

ResolveEngine.fetchDependencies (third call)

ResolveEngine.fetchDependencies (all calls)

extra logging

retronym commented May 6, 2014

omnilinguist commented Nov 18, 2014

eed3si9n commented Nov 18, 2014