New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memoize org.scalatest.Suite class loading #5614

Merged
merged 15 commits into from Mar 22, 2018

Conversation

Projects
None yet
4 participants
@iantabolt
Copy link
Contributor

iantabolt commented Mar 19, 2018

Problem

JUnit runner has very long hangs when running many tests in our repo (around 5 minutes). The line that it seems to be hanging on is the class loading of org.scalatest.Suite class.

It might be a deadlock, or it might be that the reflection is just slow... either way, saving the result of the class load avoids the extra work and fixes the issue.

Solution

Only load org.scalatest.Suite class once and save the result.

I should note that another (maybe more straightforward) solution is to reference org.scalatest.Suite.class directly, but this adds a scalatest dependency to the junit runner.

Result

The hang is gone!

@mateor mateor requested review from stuhood , mateor and cheister Mar 19, 2018

* org.scalatest.Suite.class, loaded with runtime reflection to avoid the
* scalatest dependency.
*/
private static Class<?> suiteClass = null;

This comment has been minimized.

@stuhood

stuhood Mar 20, 2018

Member

Do you still see the slowdown if you load this eagerly in a static block?

private static Class<?> suiteClass = null;
static {
  suiteClass = Class.forName("org.scalatest.Suite", true, clazz.getClassLoader());
};

This comment has been minimized.

@iantabolt

iantabolt Mar 20, 2018

Contributor

This uses the ClassLoader from the test instead of the runner, so it needs the reference to clazz from inside the method. The reasoning behind this isn't totally clear to me but I didn't want to change any behavior.

This comment has been minimized.

@stuhood

stuhood Mar 20, 2018

Member

There is only one classloader in play, afaik: we shade the testrunner and then directly merge the classpaths.

This comment has been minimized.

@iantabolt

iantabolt Mar 20, 2018

Contributor

Makes sense to me. I made the change.

@stuhood

This comment has been minimized.

Copy link
Member

stuhood commented Mar 20, 2018

Also, note that there is an actual checkstyle failure in CI.

@iantabolt

This comment has been minimized.

Copy link
Contributor

iantabolt commented Mar 20, 2018

Weird, I might be reading this wrong but this error is in a file that I didn't touch

[ERROR] /home/travis/build/pantsbuild/pants/src/java/org/pantsbuild/tools/junit/impl/ConsoleRunnerImpl.java:735: Line is longer than 100 characters (found 102). [LineLength]

Also I don't actually see any 100+ character lines in that file.

@stuhood

This comment has been minimized.

Copy link
Member

stuhood commented Mar 20, 2018

I'll look at the style failure later.

private static final Class<?> suiteClass;
static {
try {
suiteClass = Class.forName("org.scalatest.Suite");

This comment has been minimized.

@baroquebobcat

baroquebobcat Mar 20, 2018

Contributor

If we're doing this for the suiteClass, maybe it'd make sense to cache the runner class in the same way, since it'll be looked up once per scala test.

This comment has been minimized.

@iantabolt

iantabolt Mar 20, 2018

Contributor

Yeah that makes sense. Any idea if it would make sense to cache the actual Runner instance that is returned instead of the Class<Runner> that is used to construct it?

@@ -30,11 +44,6 @@ public static Runner getJUnitRunner(Class<?> clazz) {
* @return true if the test class is a scalatest test, false if not.

This comment has been minimized.

@baroquebobcat

baroquebobcat Mar 20, 2018

Contributor

You should probably also update the javadoc because we're no longer using the test classes class loader.

I'm not sure if it is worth it, but it might make sense to assert that the clazz classloader is the same as the suiteClass's or is in the same loader hierarchy. I think Stu's right that we just use one, but this feels like a place that could be forgotten if that gets changed in the future. A test case that would fail if they were different would also work for me.

This comment has been minimized.

@iantabolt

iantabolt Mar 20, 2018

Contributor

Ok I'll update the doc and look into adding a unit test.

This comment has been minimized.

@iantabolt

iantabolt Mar 21, 2018

Contributor

In regards to

assert that the clazz classloader is the same as the suiteClass's

I was able to do this as a sanity check locally, but in the long run I think if something broke then suiteClass wouldn't be loaded and would be null, making the assert tricky.

I think there is enough coverage around this code in UtilTest and SpecParserTest that a test would fail if something were broken, but I don't know how to test the class loading directly.

Any suggestions on where/how I could test this?

This comment has been minimized.

@baroquebobcat

baroquebobcat Mar 21, 2018

Contributor

I think the way I'd approach it is to create a one-off classloader that has ScalaSuite and a subclass of ScalaSuite loaded into it, then assert that passing the subclass into the util method fails in a understandable way. Constructing that test might be a bit tricky if you're not familiar with classloaders.

My guess for the failure mode is that the separate classloader owned ScalaSuite would fail assignFrom and wouldn't be considered a scalatest. If that's the case the user facing behavior would be that the test would either not be run or would fail to run. But, I'm not sure.

@mateor

mateor approved these changes Mar 20, 2018

@stuhood

This comment has been minimized.

Copy link
Member

stuhood commented Mar 20, 2018

The lint failure is super strange... not able to reproduce it locally. Have cleared caches and restarted that travis shard.

@stuhood

This comment has been minimized.

Copy link
Member

stuhood commented Mar 20, 2018

@iantabolt , @mateor : Have you guys tested this build of the runner internally?

@mateor

This comment has been minimized.

Copy link
Member

mateor commented Mar 20, 2018

Yes we have tested it but have not deployed. We plan to ship widely as soon as this lands though. If we land this commit and publish, we can deploy to our users and report back before bumping the version in the junit subsystem

@mateor

This comment has been minimized.

Copy link
Member

mateor commented Mar 20, 2018

On other note is that we only see this on tests marked SERIAL, aka the legacy code path. Everything else is fine, which we run as PARALLEL_METHOD_AND_CLASSES

@stuhood

This comment has been minimized.

Copy link
Member

stuhood commented Mar 21, 2018

Attempting to repro the CI failure. Sorry for the trouble!

EDIT: Current failure appears to be due to #5619... sorry for the trouble!

@stuhood stuhood referenced this pull request Mar 21, 2018

Merged

Drop example fsnotify dep #5619

@stuhood

This comment has been minimized.

Copy link
Member

stuhood commented Mar 21, 2018

The fix for #5619 is in: would you mind rebasing/merging and re-running CI?

In the meantime, I'll smoketest this.

@stuhood
Copy link
Member

stuhood left a comment

Looks good on our infrastructure.

@iantabolt

This comment has been minimized.

Copy link
Contributor

iantabolt commented Mar 21, 2018

Looks like the CI is failing on that mysterious line length error again 🤔

@stuhood

This comment has been minimized.

Copy link
Member

stuhood commented Mar 22, 2018

Sigh. Well, it's deterministic at least. Just not reproducible.

@stuhood

This comment has been minimized.

Copy link
Member

stuhood commented Mar 22, 2018

@mateor : Do you have time to look into that? I'm unfortunately going to be out from Thursday afternoon through Monday. If you figure out what it is, feel free to merge.

@mateor

This comment has been minimized.

Copy link
Member

mateor commented Mar 22, 2018

Yeah, we can tackle to some degree. We will also try and smoke test the junit change as an asynchronous operation if need be

@iantabolt

This comment has been minimized.

Copy link
Contributor

iantabolt commented Mar 22, 2018

Ok so I was able to repro the failure

./pants fmt src/java/org/pantsbuild/tools/junit::
 ./pants lint src/java/org/pantsbuild/tools/junit::

so the issue is that the CI server is rewriting the code with fmt-google-java-format prior to linting, and the formatter itself is causing the lint failures.

I'm going to open an issue with more details.

@baroquebobcat
Copy link
Contributor

baroquebobcat left a comment

The doc updates look good and thanks for moving the runner class lookup too!

I reread my comment and I don't think I was as clear as I could have been. I don't consider assertions/tests about the class loaders to be a blocker for landing this because we have decent test coverage, that's not an issue we've run into before, and this is probably not the only place that would be affected.

@mateor

This comment has been minimized.

Copy link
Member

mateor commented Mar 22, 2018

Great, thanks for the feedback all. I am going to land these patches today so we can get it running internally this weekend.

I will land #5623 first (when green) and then circle back here

mateor added a commit that referenced this pull request Mar 22, 2018

Disable google java format by default (#5623)
See issue at #5622 and the original PR that was failing to build at #5614

### Problem

CI is running a formatter prior to linting, and the formatter causes the linting (and the build) to fail.

### Solution

Disable the linter by default, much like scalafmt and scalafix are disabled by default.

### Result

The build (#5614) went green!

@mateor mateor merged commit 7534992 into pantsbuild:master Mar 22, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

mateor added a commit to foursquare/pants that referenced this pull request Mar 25, 2018

Memoize org.scalatest.Suite class loading (pantsbuild#5614)
* Memoize org.scalatest.Suite class loading

* Load Suite class in static initialization

* No final with try catch

* Also load junit runner class in static init

* Fix small style issues

* Touch file in attempt to get CI to work

* Revert "Touch file in attempt to get CI to work"

This reverts commit c69dfc7.

* Edit the line that CI is complaining about ¯\_(ツ)_/¯

* Another edit to be safe

* Revert "Another edit to be safe"

This reverts commit 694724e.

* Revert "Edit the line that CI is complaining about ¯\_(ツ)_/¯"

This reverts commit 7ec7423.

* Disable fmt.google-java-format
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment