This repository has been archived by the owner on Sep 8, 2022. It is now read-only.

speedup: don't fork a new JVM for tests that don't actually need one #75

Closed
SethTisue opened this issue Feb 2, 2017 · 17 comments

@SethTisue
Member

SethTisue commented Feb 2, 2017

Each run (and jvm) test always forks a fresh JVM; this costs a lot in startup time, and we're paying that cost about 1800 times. @retronym observes that tests that actually need this level of isolation are probably relatively few. Most of the time it should be enough to stay in partest's own JVM and just isolate each test in a fresh classloader. We'll need to capture stdout and stderr and trap System.exit.

@vpetro, @dwijnand, @retronym and I wrote wip code for this today at a Lightbend engineering retreat. Our preliminary results indicate an average overall speedup of about 2x. If that holds, this will substantially speed up PR validation.

PR to follow. It may take a while to identify tests that need the old level of isolation. (We've already seen cases where test A does something later causing test B to fail, so it isn't a trivial matter of just running any failed test the old way.)
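
For concreteness, here's a minimal sketch of the fresh-classloader idea described above. This is not the actual WIP code; the method name, the null-parent classloader, and capturing only scala.Console output are illustrative assumptions.

```scala
import java.io.{ByteArrayOutputStream, PrintStream}
import java.net.{URL, URLClassLoader}

// Hypothetical sketch: run a test's main method inside partest's JVM in a
// throwaway classloader, capturing its output, instead of forking a new JVM.
def runInProcess(classpath: Array[URL], mainClass: String, args: Array[String]): String = {
  val out    = new ByteArrayOutputStream()
  val loader = new URLClassLoader(classpath, null) // null parent: don't leak partest's own classes
  val main   = loader.loadClass(mainClass).getMethod("main", classOf[Array[String]])
  // Captures scala.Console output only; real code would also redirect
  // System.out/err and trap System.exit, as noted above.
  Console.withOut(new PrintStream(out)) {
    Console.withErr(new PrintStream(out)) {
      main.invoke(null, args.asInstanceOf[AnyRef]) // pass the String[] as a single reflective argument
    }
  }
  out.toString("UTF-8")
}
```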

@SethTisue SethTisue self-assigned this Feb 2, 2017
@SethTisue
Member Author

in a similar vein, see also the new ticket scala/scala-dev#295, which proposes removing a lot of tests from partest's clutches entirely

(not to mention the recently merged scala/scala#5661, which removed our ScalaCheck-based tests from partest's clutches)

@SethTisue
Member Author

SethTisue commented Feb 2, 2017

example failure causes to watch out for:

  • SI-5941 makes printing of type names sensitive to classloader details
  • use of security manager calls
  • calls to System.exit (Dale is working on it)
  • altering of system properties (Dale added code that saves & restores them; see the sketch after this list)
  • @retronym says using Scala reflection may alter the state of the VM in ways that may influence later tests
  • ...?
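
A minimal sketch of the save-and-restore idea from the list above (illustrative only, not the code Dale actually wrote):

```scala
import java.util.Properties

// Hypothetical sketch: snapshot system properties before a test and restore
// them afterwards, so one test can't leak property changes into the next.
def withRestoredSystemProperties[A](body: => A): A = {
  val saved = new Properties()
  saved.putAll(System.getProperties)
  try body
  finally System.setProperties(saved)
}
```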

@SethTisue
Member Author

Very rough numbers: in our local tests so far this saves about 2/3 of a second per run test. If we can enable this for 1500 (out of 1800) tests, we'll save roughly 1000 seconds, or about 16 minutes. PR validation currently takes 71 minutes total.

@retronym
Member

retronym commented Feb 2, 2017

Some progress towards fixing SI-5941: https://github.com/retronym/scala/tree/ticket/5941

@SethTisue
Member Author

so it isn't a trivial matter of just running any failed test the old way

I think Dale convinced me that wouldn't be so bad as a starting point. When test A does something that later causes test B to fail, it's hard to say who's at fault. Is it test A for changing something, or test B for depending on something that could change? So yes, mass quarantine of all the test Bs will catch some innocent victims, but only some.

@dwijnand dwijnand self-assigned this Feb 21, 2017
@dwijnand
Member

I've not forgotten this. It's on my todo list (of out-of-hours todos).

@SethTisue
Member Author

It's still very much on my mind as well.

@retronym
Member

retronym commented Feb 14, 2018

To rekindle this, I've created a branch of scala/scala that uses partest-as-a-library from a JUnit suite, overriding the way that run tests are executed as discussed here.

I wrote this from scratch as I can't remember what we did with the code we wrote in Banff: system property setup, capturing stdout/stderr, trapping sys.exit.
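
For reference, the sys.exit trapping piece usually looks something like the sketch below. This is my illustration of the general SecurityManager technique, not necessarily what the branch does.

```scala
import java.security.Permission

// Hypothetical sketch: intercept System.exit / sys.exit from test code by
// installing a SecurityManager that converts the exit into an exception.
class ExitTrapped(val status: Int) extends SecurityException(s"exit($status) trapped")

def trappingExit[A](body: => A): Either[Int, A] = {
  val previous = System.getSecurityManager
  System.setSecurityManager(new SecurityManager {
    override def checkExit(status: Int): Unit = throw new ExitTrapped(status)
    override def checkPermission(perm: Permission): Unit = () // permit everything else
  })
  try Right(body)
  catch { case e: ExitTrapped => Left(e.status) }
  finally System.setSecurityManager(previous)
}
```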

https://github.com/retronym/scala/tree/topic/jptest

18 minutes later:

[info] Test run finished: 37 failed, 0 ignored, 1872 total, 1051.299s
[error] Failed: Total 1872, Failed 37, Errors 0, Passed 1835
[error] Failed tests:
[error] 	partest.RunTests
[error] (junit/test:testOnly) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 1057 s, completed 14/02/2018 6:36:51 PM

Running all 1877 partests under the existing sbt 'partest --run' is glacially slow -- 71 mins -- on the same machine:

[error] Failed: Total 1877, Failed 3, Errors 0, Passed 1874
[error] Failed tests:
[error] 	partest
[error] (test/it:testOnly) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 4255 s, completed 14/02/2018 7:50:01 PM

... and that's with use of all four cores!

Maybe something is off with my measurements, but they suggest an even greater improvement than the 2x we got from the back-of-the-envelope calcs above.

@retronym
Member

One advantage of using JUnit as the harness is that we get some tooling for free. For example, running this suite in IntelliJ, I can sort by test duration.

[screenshot: IntelliJ JUnit runner showing the suite's tests sorted by duration]

I'm not sure why the IntelliJ runner took 29min in total, rather than the 17min I measured through SBT. But in that context, the slowest 10 tests account for > 30% of the runtime.

Some of these tests are doing black-box testing for the absence of memory leaks in ways that stress GC (t6969, reflection-mem-tests*). Others are performing stress tests on collections (t6853, ctries-new, ctries-old, t3989).

Tests of reify tend to be slow-ish, and there are lots of them (135).

Tests that themselves perform another layer of programmatic compilation (through Toolbox.eval or through ReplTest) are also sluggish.

At the other end of the scale, the fastest executing tests still take around 130ms. This is likely limited by the time taken to do a hello world compile with a new Global, even with a warmed up JVM.
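
For context, that ~130ms floor corresponds roughly to the cost of something like the following (a sketch using the standard nsc API; the output directory and source path are placeholders):

```scala
import scala.tools.nsc.{Global, Settings}
import scala.tools.nsc.reporters.ConsoleReporter

// Rough sketch of the per-test baseline: even on a warm JVM, constructing a
// fresh Global and compiling a trivial source takes on the order of 100ms.
val settings = new Settings()
settings.usejavacp.value = true              // compile against the JVM's classpath
settings.outdir.value    = "/tmp/test-out"   // placeholder output directory
val global = new Global(settings, new ConsoleReporter(settings))
val run    = new global.Run
run.compile(List("/tmp/HelloWorld.scala"))   // placeholder source file
```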

@retronym
Member

retronym commented Feb 15, 2018

breadcrumb to self:

[info] Test partest.RunTests.t7046-2 started
32 tests failed with partest.RunTests, rerun with:
  junit/testOnly partest.RunTests -- --tests=\Qlazy$minusconcurrent\E|\QbcodeInlinerMixed\E|\Qt7805$minusrepl$minusi\E|\Qdynamic$minusupdateDynamic\E|\Qt2318\E|\Qt4332\E|\Qt4788\E|\Qt5938\E|\Qt6669\E|\Qt7106\E|\Qt7852\E|\Qt7974\E|\Qt8046\E|\Qt8433\E|\Qt9359\E|\Qt9403\E|\Qt7817$minustree$minusgen\E|\Qicode$minusreader$minusdead$minuscode\E|\QjunitForwarders\E|\Qt8266$minusoctal$minusinterp\E|\Qreflection$minusvalueclasses$minusmagic\E|\Qt6240$minusuniverse$minuscode$minusgen\E|\Qshutdownhooks\E|\QscalapInvokedynamic\E|\Qt4788$minusseparate$minuscompilation\E|\QWeakHashSetTest\E|\Qt8601$minusclosure$minuselim\E|\Qdynamic$minusselectDynamic\E|\Qreify_copypaste1\E|\Qdynamic$minusapplyDynamic\E|\Qreflection$minusmagicsymbols$minusinvoke\E|\Qt7046$minus1\E
[info] Test run finished: 33 failed, 0 ignored, 1872 total, 1441.14s
[error] Failed: Total 1872, Failed 33, Errors 0, Passed 1839
[error] Failed tests:
[error] 	partest.RunTests
[error] (junit/test:testOnly) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 1547 s, completed 15/02/2018 1:46:14 PM

@retronym
Member

At the other end of the scale, the fastest executing tests still take around 130ms. This is likely limited by the time taken to do a hello world compile with a new Global, even with a warmed up JVM.

It occurs to me that reusing a Global instance for this use case is probably quite achievable. After compilation, we can walk the Symbol table and unlink symbols that originate from the just-compiled set of source files. We can leave the rest of the symbols in place (e.g. standard library) without needing to worry about invalidating them if the corresponding JAR changes.
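
A very rough, untested sketch of what that unlinking might look like against the internal compiler API; the method name and details are assumptions, not working code from any branch.

```scala
// Hypothetical sketch: after a run, unlink the top-level symbols introduced by
// the just-compiled units so the same Global instance can be reused.
// Assumes a finished `run` on a given `global: scala.tools.nsc.Global`.
def unlinkCompiledSymbols(global: scala.tools.nsc.Global)(run: global.Run): Unit = {
  import global._
  for {
    unit <- run.units
    sym  <- unit.body.collect { case impl: ImplDef => impl.symbol } // classes, traits, objects
    if sym != NoSymbol && sym.owner.hasPackageFlag                  // only top-level definitions
  } sym.owner.info.decls.unlink(sym)
}
```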

@lrytz mentioned that the JUnit tests that perform programmatic compilation already do Global instance re-use. They don't actually bother unlinking symbols from the previous run at all, and just make sure the tests use distinct class names or that overlaps don't trigger buggy behaviour.

@retronym
Member

Another means of speedup might be reducing IO by using virtual directories for the target, which could be used directly as the classpath of the executing project.
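
A sketch of that idea using the compiler's VirtualDirectory support (illustrative assumptions only; the source path is a placeholder):

```scala
import scala.reflect.io.VirtualDirectory
import scala.tools.nsc.{Global, Settings}

// Hypothetical sketch: direct compiler output to an in-memory directory and
// use it as the classpath for executing the test, avoiding disk IO.
val settings = new Settings()
settings.usejavacp.value = true
val virtualOut = new VirtualDirectory("(memory)", None)
settings.outputDirs.setSingleOutput(virtualOut)

val global = new Global(settings)
new global.Run().compile(List("/tmp/HelloWorld.scala")) // placeholder source
// A classloader over `virtualOut` (e.g. an AbstractFileClassLoader) could then
// run the compiled test without ever touching disk.
```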

@lrytz
Member

lrytz commented Feb 21, 2018

make sure the tests use distinct class names

The tests reuse the same names often, actually; there were only a few bugs. I fixed some, for example scala/scala@59d6dbc

@lrytz
Member

lrytz commented Feb 21, 2018

Ah, there are still open issues: scala/scala-dev#214

@retronym
Member

I'm currently working on a branch of partest to support JVM reuse: https://github.com/retronym/scala-partest/tree/ticket/75

I think it makes sense to get this change through before switching to JUnit as the front end.

@retronym
Member

Okay, I've got a working prototype of Global reuse in https://github.com/retronym/scala-partest/tree/ticket/75. The complementary config changes to scala/scala are in topic/jptest.

That gets the single-threaded partest --run down to 14m on my laptop.

@retronym
Member

JVM reuse is now merged in scala/scala#6413

PR validation is down to ~23 mins, from 55 mins.

I'm closing this ticket as fixed; the cost/benefit of the Global reuse idea doesn't seem worth pursuing right now.
