This repository has been archived by the owner on Sep 8, 2022. It is now read-only.

speedup: don't fork a new JVM for tests that don't actually need one #75

Closed
SethTisue opened this issue Feb 2, 2017 · 17 comments

@SethTisue
Member

SethTisue commented Feb 2, 2017

Each run (and jvm) test always forks a fresh JVM; this costs a lot in startup time, and we're paying that cost about 1800 times. @retronym observes that tests that actually need this level of isolation are probably relatively few. Most of the time it should be enough to stay in partest's own JVM and just isolate each test in a fresh classloader. We'll need to capture stdout and stderr and trap System.exit.

@vpetro, @dwijnand, @retronym and I wrote wip code for this today at a Lightbend engineering retreat. Our preliminary results indicate an average overall speedup of about 2x. If that holds, this will substantially speed up PR validation.

PR to follow. It may take a while to identify tests that need the old level of isolation. (We've already seen cases where test A does something later causing test B to fail, so it isn't a trivial matter of just running any failed test the old way.)
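
For concreteness, here's a minimal sketch of the fresh-classloader idea described above. This is not the actual WIP code; the method name, the null-parent classloader, and capturing only scala.Console output are illustrative assumptions.

```scala
import java.io.{ByteArrayOutputStream, PrintStream}
import java.net.{URL, URLClassLoader}

// Hypothetical sketch: run a test's main method inside partest's JVM in a
// throwaway classloader, capturing its output, instead of forking a new JVM.
def runInProcess(classpath: Array[URL], mainClass: String, args: Array[String]): String = {
  val out    = new ByteArrayOutputStream()
  val loader = new URLClassLoader(classpath, null) // null parent: don't leak partest's own classes
  val main   = loader.loadClass(mainClass).getMethod("main", classOf[Array[String]])
  // Captures scala.Console output only; real code would also redirect
  // System.out/err and trap System.exit, as noted above.
  Console.withOut(new PrintStream(out)) {
    Console.withErr(new PrintStream(out)) {
      main.invoke(null, args.asInstanceOf[AnyRef]) // pass the String[] as a single reflective argument
    }
  }
  out.toString("UTF-8")
}
```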

@SethTisue SethTisue self-assigned this Feb 2, 2017
@SethTisue
Member Author

in a similar vein, see also the new ticket scala/scala-dev#295, which proposes removing a lot of tests from partest's clutches entirely

(not to mention the recently merged scala/scala#5661, which removed our ScalaCheck-based tests from partest's clutches)

@SethTisue
Member Author

SethTisue commented Feb 2, 2017

example failure causes to watch out for:

  • SI-5941 makes printing of type names sensitive to classloader details
  • use of security manager calls
  • calls to System.exit (Dale is working on it)
  • altering of system properties (Dale added code that saves & restores them; see the sketch after this list)
  • @retronym says using Scala reflection may alter the state of the VM in ways that may influence later tests
  • ...?
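
A minimal sketch of the save-and-restore idea from the list above (illustrative only, not the code Dale actually wrote):

```scala
import java.util.Properties

// Hypothetical sketch: snapshot system properties before a test and restore
// them afterwards, so one test can't leak property changes into the next.
def withRestoredSystemProperties[A](body: => A): A = {
  val saved = new Properties()
  saved.putAll(System.getProperties)
  try body
  finally System.setProperties(saved)
}
```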

@SethTisue
Member Author

Very rough numbers: in our local tests so far this saves about 2/3 of a second per run test. If we can enable this for 1500 (out of 1800) tests, we'll save roughly 1000 seconds, or about 16 minutes. PR validation currently takes 71 minutes total.

@retronym
Member

retronym commented Feb 2, 2017

Some progress towards fixing SI-5941: https://github.com/retronym/scala/tree/ticket/5941

@SethTisue
Member Author

so it isn't a trivial matter of just running any failed test the old way

I think Dale convinced me that wouldn't be so bad as a starting point. When test A does something that later causes test B to fail, it's hard to say who's at fault. Is it test A for changing something, or test B for depending on something that could change? So yes, mass quarantine of all the test Bs will catch some innocent victims, but only some.

@dwijnand dwijnand self-assigned this Feb 21, 2017
@dwijnand
Member

I've not forgotten this. It's on my todo list (of out-of-hours todos).

@SethTisue
Member Author

It's still very much on my mind as well.

@retronym
Member

retronym commented Feb 14, 2018

To rekindle this, I've created a branch of scala/scala that uses partest-as-a-library from a JUnit suite, overriding the way that run tests are executed as discussed here.

I wrote this from scratch as I can't remember what we did with the code we wrote in Banff: system property setup, capturing stdout/stderr, trapping sys.exit.
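
For reference, the sys.exit trapping piece usually looks something like the sketch below. This is my illustration of the general SecurityManager technique, not necessarily what the branch does.

```scala
import java.security.Permission

// Hypothetical sketch: intercept System.exit / sys.exit from test code by
// installing a SecurityManager that converts the exit into an exception.
class ExitTrapped(val status: Int) extends SecurityException(s"exit($status) trapped")

def trappingExit[A](body: => A): Either[Int, A] = {
  val previous = System.getSecurityManager
  System.setSecurityManager(new SecurityManager {
    override def checkExit(status: Int): Unit = throw new ExitTrapped(status)
    override def checkPermission(perm: Permission): Unit = () // permit everything else
  })
  try Right(body)
  catch { case e: ExitTrapped => Left(e.status) }
  finally System.setSecurityManager(previous)
}
```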

https://github.com/retronym/scala/tree/topic/jptest

18 minutes later:

[info] Test run finished: 37 failed, 0 ignored, 1872 total, 1051.299s
[error] Failed: Total 1872, Failed 37, Errors 0, Passed 1835
[error] Failed tests:
[error] 	partest.RunTests
[error] (junit/test:testOnly) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 1057 s, completed 14/02/2018 6:36:51 PM

Running all 1877 partests under the existing sbt 'partest --run' is glacially slow -- 71 mins -- on the same machine:

[error] Failed: Total 1877, Failed 3, Errors 0, Passed 1874
[error] Failed tests:
[error] 	partest
[error] (test/it:testOnly) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 4255 s, completed 14/02/2018 7:50:01 PM

... and that's with use of all four cores!

Maybe something is off with my measurements, but they suggest an even greater improvement than the 2x we got from the back-of-the-envelope calcs above.

@retronym
Member

One advantage of using JUnit as the harness is that we get some tooling for free. For example, running this suite in IntelliJ, I can sort by test duration.

[screenshot: IntelliJ JUnit runner showing the suite's tests sorted by duration]

I'm not sure why the IntelliJ runner took 29min in total, rather than the 17min I measured through SBT. But in that context, the slowest 10 tests account for > 30% of the runtime.

Some of these tests are doing black-box testing for the absence of memory leaks in ways that stress GC (t6969, reflection-mem-tests*). Others are performing stress tests on collections (t6853, ctries-new, ctries-old, t3989).

Tests of reify tend to be slow-ish, and there are lots of them (135).

Tests that themselves perform another layer of programmatic compilation (through Toolbox.eval or through ReplTest) are also sluggish.

At the other end of the scale, the fastest executing tests still take around 130ms. This is likely limited by the time taken to do a hello world compile with a new Global, even with a warmed up JVM.
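
For context, that ~130ms floor corresponds roughly to the cost of something like the following (a sketch using the standard nsc API; the output directory and source path are placeholders):

```scala
import scala.tools.nsc.{Global, Settings}
import scala.tools.nsc.reporters.ConsoleReporter

// Rough sketch of the per-test baseline: even on a warm JVM, constructing a
// fresh Global and compiling a trivial source takes on the order of 100ms.
val settings = new Settings()
settings.usejavacp.value = true              // compile against the JVM's classpath
settings.outdir.value    = "/tmp/test-out"   // placeholder output directory
val global = new Global(settings, new ConsoleReporter(settings))
val run    = new global.Run
run.compile(List("/tmp/HelloWorld.scala"))   // placeholder source file
```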

@retronym
Member

retronym commented Feb 15, 2018

breadcrumb to self:

[info] Test partest.RunTests.t7046-2 started
32 tests failed with partest.RunTests, rerun with:
  junit/testOnly partest.RunTests -- --tests=\Qlazy$minusconcurrent\E|\QbcodeInlinerMixed\E|\Qt7805$minusrepl$minusi\E|\Qdynamic$minusupdateDynamic\E|\Qt2318\E|\Qt4332\E|\Qt4788\E|\Qt5938\E|\Qt6669\E|\Qt7106\E|\Qt7852\E|\Qt7974\E|\Qt8046\E|\Qt8433\E|\Qt9359\E|\Qt9403\E|\Qt7817$minustree$minusgen\E|\Qicode$minusreader$minusdead$minuscode\E|\QjunitForwarders\E|\Qt8266$minusoctal$minusinterp\E|\Qreflection$minusvalueclasses$minusmagic\E|\Qt6240$minusuniverse$minuscode$minusgen\E|\Qshutdownhooks\E|\QscalapInvokedynamic\E|\Qt4788$minusseparate$minuscompilation\E|\QWeakHashSetTest\E|\Qt8601$minusclosure$minuselim\E|\Qdynamic$minusselectDynamic\E|\Qreify_copypaste1\E|\Qdynamic$minusapplyDynamic\E|\Qreflection$minusmagicsymbols$minusinvoke\E|\Qt7046$minus1\E
[info] Test run finished: 33 failed, 0 ignored, 1872 total, 1441.14s
[error] Failed: Total 1872, Failed 33, Errors 0, Passed 1839
[error] Failed tests:
[error] 	partest.RunTests
[error] (junit/test:testOnly) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 1547 s, completed 15/02/2018 1:46:14 PM

@retronym
Member

At the other end of the scale, the fastest executing tests still take around 130ms. This is likely limited by the time taken to do a hello world compile with a new Global, even with a warmed up JVM.

It occurs to me that reusing a Global instance for this use case is probably quite achievable. After compilation, we can walk the Symbol table and unlink symbols that originate from the just-compiled set of source files. We can leave the rest of the symbols in place (e.g. standard library) without needing to worry about invalidating them if the corresponding JAR changes.
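
A very rough, untested sketch of what that unlinking might look like against the internal compiler API; the method name and details are assumptions, not working code from any branch.

```scala
// Hypothetical sketch: after a run, unlink the top-level symbols introduced by
// the just-compiled units so the same Global instance can be reused.
// Assumes a finished `run` on a given `global: scala.tools.nsc.Global`.
def unlinkCompiledSymbols(global: scala.tools.nsc.Global)(run: global.Run): Unit = {
  import global._
  for {
    unit <- run.units
    sym  <- unit.body.collect { case impl: ImplDef => impl.symbol } // classes, traits, objects
    if sym != NoSymbol && sym.owner.hasPackageFlag                  // only top-level definitions
  } sym.owner.info.decls.unlink(sym)
}
```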

@lrytz mentioned that the JUnit tests that perform programmatic compilation already do Global instance re-use. They don't actually bother unlinking symbols from the previous run at all, and just make sure the tests use distinct class names or that overlaps don't trigger buggy behaviour.

@retronym
Member

Another means of speedup might be reducing IO by using virtual directories for the target, which could be used directly as the classpath of the executing project.
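
A sketch of that idea using the compiler's VirtualDirectory support (illustrative assumptions only; the source path is a placeholder):

```scala
import scala.reflect.io.VirtualDirectory
import scala.tools.nsc.{Global, Settings}

// Hypothetical sketch: direct compiler output to an in-memory directory and
// use it as the classpath for executing the test, avoiding disk IO.
val settings = new Settings()
settings.usejavacp.value = true
val virtualOut = new VirtualDirectory("(memory)", None)
settings.outputDirs.setSingleOutput(virtualOut)

val global = new Global(settings)
new global.Run().compile(List("/tmp/HelloWorld.scala")) // placeholder source
// A classloader over `virtualOut` (e.g. an AbstractFileClassLoader) could then
// run the compiled test without ever touching disk.
```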

@lrytz
Member

lrytz commented Feb 21, 2018

make sure the tests use distinct class names

The tests reuse the same names often, actually; there were only a few bugs. I fixed some, for example scala/scala@59d6dbc

@lrytz
Member

lrytz commented Feb 21, 2018

Ah, there are still open issues: scala/scala-dev#214

@retronym
Member

I'm currently working on a branch of partest to support JVM reuse: https://github.com/retronym/scala-partest/tree/ticket/75

I think it makes sense to get this change through before switching to JUnit as the front end.

@retronym
Member

Okay, I've got a working prototype of Global reuse in https://github.com/retronym/scala-partest/tree/ticket/75. The complementary config changes to scala/scala are in topic/jptest.

That gets the single-threaded partest --run down to 14m on my laptop.

@retronym
Member

JVM reuse is now merged in scala/scala#6413

PR validation is down to ~23 mins, from 55 mins.

I'm closing this ticket as fixed; the cost/benefit of the Global reuse idea doesn't seem worth pursuing right now.
