-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
override heap / jvm params for tests in gradle build [LUCENE-9160] #10200
Comments
Robert Muir (@rmuir) (migrated from JIRA) Here's a patch that works for me. It allows specifying these parameters similar to how you can with ant:
I tried to make the parameters match the ant build as much as possible, to reduce confusion, but I'm not stuck on the naming, just want to make it possible :) |
Robert Muir (@rmuir) (migrated from JIRA) fwiw adding |
Michael McCandless (@mikemccand) (migrated from JIRA)
I tested this option, on 72 core box, using JDK 11. In
And then ran again with this option (to tell hotspot to not try so hard?),
Net/net this is a crazy crazy speedup for our tests!!! |
Uwe Schindler (@uschindler) (migrated from JIRA) But we should not hardcode the JVM opts, as we would like to test all combinations (also C2 optimizations) on Jenkins. So we can add sane defaults, but |
Uwe Schindler (@uschindler) (migrated from JIRA) Basically |
Robert Muir (@rmuir) (migrated from JIRA) Yes, i'd like to just set |
Robert Muir (@rmuir) (migrated from JIRA) Updated patch, it sets the default, but you can override of course. I changed name to I also updated the help page. I think its ready. |
Uwe Schindler (@uschindler) (migrated from JIRA) OK, +1 |
ASF subversion and git services (migrated from JIRA) Commit 9dae566 in lucene-solr's branch refs/heads/master from Robert Muir LUCENE-9160: add params/docs to override jvm params in gradle build, default C2 off in tests. Adds some build parameters to tune how tests run. There is an example Default C2 off in tests as it is wasteful locally and causes slowdown of Some crazy lucene stress tests may need to be toned down after the |
Dawid Weiss (@dweiss) (migrated from JIRA) Correct that running tests effectively stresses the compiler. I'd do it slightly differently so that options are self-documenting but this is something that I can follow-up later on. LGTM overall. Great speedup for the common case. |
ASF subversion and git services (migrated from JIRA) Commit 9dae566 in lucene-solr's branch refs/heads/gradle-master from Robert Muir LUCENE-9160: add params/docs to override jvm params in gradle build, default C2 off in tests. Adds some build parameters to tune how tests run. There is an example Default C2 off in tests as it is wasteful locally and causes slowdown of Some crazy lucene stress tests may need to be toned down after the |
David Smiley (@dsmiley) (migrated from JIRA) While this change might improve Lucene tests (I didn't check yet), I'm finding this to be a large degradation in Solr tests. A machine I use to run tests normally takes around 38 minutes but is now taking 52 minutes. It's a corporate VM that supposedly has 16 CPUs; "ant test" uses 4 runners. I passed "-Dargs=" to undo the change args change and I'm back to normal test run times. |
Robert Muir (@rmuir) (migrated from JIRA) The solr tests are generally sleep()'ing and hence leave the cpu with a lot of cycles to run background compilation. so there is no downside for the overcompilation of tests, only the benefits. For solr I can't recommend anything, the tests are really hopeless: i'd just use as many runners as possible. |
Robert Muir (@rmuir) (migrated from JIRA) Also, if you have a machine with 16 cpus, and you are running with just 4 runners, that is a misconfigured system: you are leaving 75% of your machine free. So it shouldnt be any surprise that background compilation (even if its some insane amount) causes you no problems: 75% of your resources are wasted. Set the jvms to 16 if you want to do a comparison. |
Robert Muir (@rmuir) (migrated from JIRA) Similar comparison: if you have a 16 cpu machine and only use 4 runners, i can speed up your tests by spawning 12 background threads from the build: 12 threads that spend 80% of their time mining cryptocurrency and only 20% of their time running tests. You'd see a nice speedup, even though overall its wasting all of your resources. And if you set test jvms to 16 you'd see that these background threads only caused contention and slowed you down, because they are wasting your CPU overall. This is what C2 compiler does in our tests :) |
David Smiley (@dsmiley) (migrated from JIRA) Yep; I hear you, and thanks for your amusing comparative explanation :). I recently acquired use of this VM and hadn't tuned the build. I tried tests.jvms=4,8,10,12,16 and ultimately found 10 yielded the best times on this VM – 17:24m. |
Robert Muir (@rmuir) (migrated from JIRA) are you sure you really have 16?
|
Robert Muir (@rmuir) (migrated from JIRA) keep in mind with a vm, the admin may not have taken the time to expose resources "correctly" as far as hyperthreads and so on. I can pass |
David Smiley (@dsmiley) (migrated from JIRA) I don't have the psutil module and I'm not versed in python but I ran lscpu:
(and 48GB of RAM) |
Robert Muir (@rmuir) (migrated from JIRA) My guess is there is only really 8. Impossible to tell inside a VM :) I will open an issue, the gradle build divides number of cpus by 2, then artifically caps this at 4, I think we should change that. Divide by 2 is fine, but machines have more cores these days. 8 would have been a better default here. |
Dawid Weiss (@dweiss) (migrated from JIRA) I never had a chance to experiment on those super-beefy machines but I'm sure we can alter the defaults. // Approximate a common-sense default for running gradle with parallel
// workers: half the count of available cpus but not more than 12.
def cpus = Runtime.runtime.availableProcessors()
def maxWorkers = (int) Math.max(1d, Math.min(cpus * 0.5d, 12))
def testsJvms = (int) Math.max(1d, Math.min(cpus * 0.5d, 4)) My machines quickly saturate I/O and memory bandwidth for higher test parallelism, especially for Solr. The above is just off-the-top-off-my-head default. It can be certainly improved. |
Robert Muir (@rmuir) (migrated from JIRA) Dawid I opened #10205 to discuss further. Also keep in mindthis jira ticket alters the defaults in ways that impact this. That's because I don't have 3 CICompiler threads per JVM doing a lot of useless C2 recompilation. So it makes things more efficient and I think we should raise the hard cap of 4 jvms to 8 or 12. |
ASF subversion and git services (migrated from JIRA) Commit 16f240e in lucene-solr's branch refs/heads/branch_8x from Robert Muir LUCENE-9160: add params/docs to override jvm params in gradle build, default C2 off in tests. Adds some build parameters to tune how tests run. There is an example Default C2 off in tests as it is wasteful locally and causes slowdown of Some crazy lucene stress tests may need to be toned down after the |
Adrien Grand (@jpountz) (migrated from JIRA) Closing after the 9.0.0 release |
Currently the gradle.properties that is generated lets you control the heap and flags for the gradle build jvms.
But there is no way to control these flags for the actual forked VMs running the unit tests. For example, minHeap is hardcoded at 256m and maxHeap at 512m.
I would like to change minHeap to 512m as well, for a fixed heap, and set some other jvm flags, such as
-XX:+UseParallelGC
so that my tests are not slow for silly reasons :)I think it is stuff jenkins CI would need as well.
Migrated from LUCENE-9160 by Robert Muir (@rmuir), resolved Jan 22 2020
Attachments: LUCENE-9160.patch (versions: 2)
Linked issues:
The text was updated successfully, but these errors were encountered: