Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Testsuite slim down #1515

Closed
3 of 6 tasks
abrokenjester opened this issue Aug 27, 2019 · 19 comments · Fixed by #1648 or #1770
Closed
3 of 6 tasks

Testsuite slim down #1515

abrokenjester opened this issue Aug 27, 2019 · 19 comments · Fixed by #1648 or #1770
Assignees
Labels
📶 enhancement issue is a new feature or improvement

Comments

@abrokenjester
Copy link
Contributor

abrokenjester commented Aug 27, 2019

The current tests run (on Jenkins) for close to an hour (54 minutes). Even when disabling compliance tests, building and executing all unit tests still takes over 20 minutes.

We should look into "slimming down" the testuite: identify large/long-running tests, and determine if these tests are still necessary. If not, scrap them. If they are, see if they can be rewritten to run faster.

While we're at it we should also pay attention to the output from tests. Some tests spit out thousands of logger or even STDOUT lines, and it makes trawling through the build log near-impossible.

  • look into updating/removing SailConcurrency test
  • look into streamlining RepositoryConnection test suite
  • look into not executing full RepositoryConnection test suite on ever possible combination of stacked + backend sail.
  • investigate if all maven build plugins currently used are still necessary
  • investigate use of Jenkins incremental or parallel builds
  • look into options for fully isolated Jenkins agents.
@abrokenjester
Copy link
Contributor Author

@hmottestad could you take a closer look at the SHACL tests? Executing the build+tests on the SHACL module takes almost 8 minutes (it's responsible for nearly half the build time, on its own).

I have a hunch this is related to the recent changes you made to include benchmark tests.

I've also noticed that if Jenkins executes two builds in parallel, the JMH benchmarks become flaky: it complains about not being able to create a lock:

Error Message

ERROR: Unable to acquire the JMH lock (/tmp/genie.rdf4j/jmh.lock): already taken by another JMH instance, exiting. Use -Djmh.ignoreLock=true to forcefully continue.

Stacktrace

org.openjdk.jmh.runner.RunnerException: ERROR: Unable to acquire the JMH lock (/tmp/genie.rdf4j/jmh.lock): already taken by another JMH instance, exiting. Use -Djmh.ignoreLock=true to forcefully continue.
at org.eclipse.rdf4j.sail.shacl.benchmark.BenchmarkJunitTests.test(BenchmarkJunitTests.java:36)

I think we should make sure not to run JMH benchmarks as part of the Jenkins build, at least not as part of the regular build-verification jobs. There is an option available to set Jenkins up using fully independent agents, which we could potentially use to schedule regular (nightly?) runs of certain benchmark suites, instead.

@hmottestad
Copy link
Contributor

Why does each parallel build server in Jenkins share a temporary directory. Must be some way to enable some decent isolation in Jenkins.

@hmottestad
Copy link
Contributor

hmottestad commented Aug 27, 2019

Btw. Benchmarks only take 1 minute.

The most time consuming tests are the multithreaded fuzzing tests that check for deadlocks and threading issues. These take >4 minutes.

AB6DB3C3-5E6B-4836-BAB5-35CB9E66BC62

@hmottestad
Copy link
Contributor

These are really good tests that have found important bugs in the ShaclSail.

@hmottestad
Copy link
Contributor

Some googling didn’t give me much info about what the intention of the jmh lock is, so maybe we should try with -Djmh.ignoreLock=true.

@abrokenjester
Copy link
Contributor Author

Thanks for looking into this, I'll first focus on trimming down the other tests. There's big wins there I'm sure.

@abrokenjester
Copy link
Contributor Author

abrokenjester commented Aug 27, 2019

These are really good tests that have found important bugs in the ShaclSail.

Just had a closer look. When you first mentioned this I thought these were shared tests that deal with concurrency and transaction isolation, but they are tests specific to the ShaclSail. Yet they get executed for both native and memory store. If these tests are specifically about how the ShaclSail deals with multithreading, testing against just one backend Sail (either memory or native) should be enough. They both honor the same contracts.

Anyway, we can revisit this later. I'll try and focus on some of the shared test cases first, especially the SailConcurrency and RepositoryConnectionTest suite, which I think eat up a lot of time in general. Actually, looking at those more closely, we do a lot of unnecessary duplicate runs there as well, e.g. the RepositoryConnectionTest suite is run against both RDFS inferencers (old and new schema-caching), with both a memory and a native store underneath. I'll try and reorganize this.

@hmottestad
Copy link
Contributor

They detected a Concurrent Modification in the TreeModel used by the NativeStore when run with Serializable isolation. This was “fixed” by reordering the calls to .prepare() inside the ShaclSailConnection. I still think it’s actually a bug somewhere in the transaction handling in RDF4J in general, but it’s super hard to figure out since it doesn’t happen when debugging!

@hmottestad
Copy link
Contributor

Btw. I’ve previously used maven profiles to run tests based on if the name of the test class ended in something specific. So we could separate out tests that are slow and we don’t need to run that often by renaming to classes to *CI_NIGHTLY_TESTS of something like that. Then have Jenkins fun them periodically or only on commits to develop/master.

It should also be possible to do this with java annotations but I’ve never managed to get that to work.

Slimming down the maven build plugins might also help. mvn install -dSkipTests has always been fairly slow, even when we had multiple repos. Maybe this is what partially due to the maven plugins?

@abrokenjester
Copy link
Contributor Author

They detected a Concurrent Modification in the TreeModel used by the NativeStore when run with Serializable isolation. This was “fixed” by reordering the calls to .prepare() inside the ShaclSailConnection. I still think it’s actually a bug somewhere in the transaction handling in RDF4J in general, but it’s super hard to figure out since it doesn’t happen when debugging!

That does sound like a bug in the native store. Could we try and reproduce this in a test for the native store directly (that is, without using the SHACLSail on top)? If we know what the SHACLSail tried to do it should be possible to recreate the circumstances.

@hmottestad
Copy link
Contributor

I would love that. But I’m barely able to reproduce it through the SHACL sail. And I still don’t know why it’s happening or where the concurrent modification is coming from.

@barthanssens barthanssens added the 📶 enhancement issue is a new feature or improvement label Aug 28, 2019
@hmottestad
Copy link
Contributor

I made a PR for you to look at @jeenbroekstra wrt. the concurrent modification exception.

Btw. My biggest issue right now is the build time. A full clean build took me 11 minutes on my (aging) macbook.

[INFO] Reactor Summary for Eclipse RDF4J 3.0.1-SNAPSHOT:
[INFO] 
[INFO] Eclipse RDF4J ...................................... SUCCESS [  3.878 s]
[INFO] RDF4J Core ......................................... SUCCESS [  0.803 s]
[INFO] RDF4J: util ........................................ SUCCESS [ 16.591 s]
[INFO] RDF4J: Model ....................................... SUCCESS [ 11.620 s]
[INFO] RDF4J: SparqlBuilder ............................... SUCCESS [  8.321 s]
[INFO] RDF4J: Rio ......................................... SUCCESS [  0.747 s]
[INFO] RDF4J: Rio - API ................................... SUCCESS [  8.512 s]
[INFO] RDF4J: Rio - Languages ............................. SUCCESS [  1.749 s]
[INFO] RDF4J: Rio - Datatypes ............................. SUCCESS [  2.351 s]
[INFO] RDF4J: Query ....................................... SUCCESS [  6.804 s]
[INFO] RDF4J: Rio - Binary ................................ SUCCESS [  2.499 s]
[INFO] RDF4J: Rio - JSON-LD ............................... SUCCESS [  6.298 s]
[INFO] RDF4J: Rio - N-Triples ............................. SUCCESS [  3.614 s]
[INFO] RDF4J: Rio - Turtle ................................ SUCCESS [  6.313 s]
[INFO] RDF4J: Rio - N3 (writer-only) ...................... SUCCESS [  1.568 s]
[INFO] RDF4J: Rio - N-Quads ............................... SUCCESS [  2.622 s]
[INFO] RDF4J: Rio - RDF/JSON .............................. SUCCESS [  5.498 s]
[INFO] RDF4J: Rio - RDF/XML ............................... SUCCESS [  3.757 s]
[INFO] RDF4J: Rio - TriX .................................. SUCCESS [  2.838 s]
[INFO] RDF4J: Rio - TriG .................................. SUCCESS [  4.891 s]
[INFO] RDF4J: Query result IO ............................. SUCCESS [  0.746 s]
[INFO] RDF4J: Query result IO - API ....................... SUCCESS [  3.581 s]
[INFO] RDF4J: Query result IO - binary .................... SUCCESS [  4.885 s]
[INFO] RDF4J: Query result IO - SPARQL/JSON ............... SUCCESS [  3.012 s]
[INFO] RDF4J: Query result IO - SPARQL/XML ................ SUCCESS [  3.008 s]
[INFO] RDF4J: Query result IO - plain text booleans ....... SUCCESS [  5.362 s]
[INFO] RDF4J: Query algebra ............................... SUCCESS [  0.683 s]
[INFO] RDF4J: Query algebra - model ....................... SUCCESS [  5.932 s]
[INFO] RDF4J: Query parser ................................ SUCCESS [  2.970 s]
[INFO] RDF4J: Query parser - API .......................... SUCCESS [  9.131 s]
[INFO] RDF4J: Query parser - SeRQL ........................ SUCCESS [  9.964 s]
[INFO] RDF4J: Repository .................................. SUCCESS [  1.497 s]
[INFO] RDF4J: Repository - API ............................ SUCCESS [ 12.313 s]
[INFO] RDF4J: HTTP ........................................ SUCCESS [  1.121 s]
[INFO] RDF4J: HTTP protocol ............................... SUCCESS [  6.526 s]
[INFO] RDF4J: HTTP client ................................. SUCCESS [  4.717 s]
[INFO] RDF4J: Query parser - SPARQL ....................... SUCCESS [ 14.338 s]
[INFO] RDF4J: SPARQL Repository ........................... SUCCESS [  7.719 s]
[INFO] RDF4J: Query algebra - evaluation .................. SUCCESS [ 18.387 s]
[INFO] RDF4J: Sail ........................................ SUCCESS [  2.599 s]
[INFO] RDF4J: Sail API .................................... SUCCESS [  7.632 s]
[INFO] RDF4J: Sail base implementations ................... SUCCESS [  7.172 s]
[INFO] RDF4J: MemoryStore ................................. SUCCESS [ 10.929 s]
[INFO] RDF4J: SailRepository .............................. SUCCESS [ 12.587 s]
[INFO] RDF4J: Query algebra - GeoSPARQL ................... SUCCESS [ 21.465 s]
[INFO] RDF4J: Query Rendering ............................. SUCCESS [  7.526 s]
[INFO] RDF4J: Sail Model .................................. SUCCESS [  6.463 s]
[INFO] RDF4J: Inferencer Sails ............................ SUCCESS [  7.762 s]
[INFO] RDF4J: NativeStore ................................. SUCCESS [ 11.996 s]
[INFO] RDF4J: Repository - event (wrapper) ................ SUCCESS [  6.996 s]
[INFO] RDF4J: HTTPRepository .............................. SUCCESS [  4.824 s]
[INFO] RDF4J: Repository manager .......................... SUCCESS [ 11.940 s]
[INFO] RDF4J: Federation SAIL ............................. SUCCESS [  8.425 s]
[INFO] RDF4J: SPIN ........................................ SUCCESS [  7.992 s]
[INFO] RDF4J: SPIN SAIL ................................... SUCCESS [  5.209 s]
[INFO] RDF4J: SHACL ....................................... SUCCESS [ 15.160 s]
[INFO] RDF4J Lucene Sail API .............................. SUCCESS [  4.260 s]
[INFO] RDF4J Lucene Sail Index ............................ SUCCESS [  7.355 s]
[INFO] RDF4J Lucene Sail Spin ............................. SUCCESS [  5.270 s]
[INFO] RDF4J Solr Sail Index .............................. SUCCESS [  7.620 s]
[INFO] RDF4J Elastic Search Sail Index .................... SUCCESS [  4.454 s]
[INFO] RDF4J: DatasetRepository (wrapper) ................. SUCCESS [  1.933 s]
[INFO] RDF4J: Repository - context aware (wrapper) ........ SUCCESS [  3.287 s]
[INFO] RDF4J: Client Library .............................. SUCCESS [01:51 min]
[INFO] RDF4J: Storage ..................................... SUCCESS [ 45.788 s]
[INFO] RDF4J Tools ........................................ SUCCESS [  1.451 s]
[INFO] RDF4J: application configuration ................... SUCCESS [  3.461 s]
[INFO] RDF4J: Console ..................................... SUCCESS [  8.431 s]
[INFO] RDF4J: HTTP server - core .......................... SUCCESS [  8.968 s]
[INFO] RDF4J: HTTP server ................................. SUCCESS [  8.333 s]
[INFO] RDF4J Workbench .................................... SUCCESS [ 14.971 s]
[INFO] RDF4J: Runtime ..................................... SUCCESS [ 37.649 s]
[INFO] RDF4J: Runtime - OSGi .............................. SUCCESS [  9.886 s]
[INFO] RDF4J : Test Suites ................................ SUCCESS [  0.706 s]
[INFO] RDF4J Rio compliance test suite .................... SUCCESS [  5.635 s]
[INFO] RDF4J SPARQL compliance test suite ................. SUCCESS [  7.198 s]
[INFO] RDF4J SeRQL test suite ............................. SUCCESS [  2.147 s]
[INFO] RDF4J SAIL and Repository API compliance test suite  SUCCESS [  3.537 s]
[INFO] RDF4J SHACL compliance test suite .................. SUCCESS [  2.347 s]
[INFO] RDF4J Lucene Sail Tests ............................ SUCCESS [  3.328 s]
[INFO] RDF4J GeoSPARQL compliance test suite .............. SUCCESS [  1.888 s]
[INFO] RDF4J: benchmarks .................................. SUCCESS [  5.815 s]
[INFO] RDF4J BOM .......................................... SUCCESS [  0.689 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  11:39 min
[INFO] Finished at: 2019-08-28T23:29:01+02:00
[INFO] ------------------------------------------------------------------------

@abrokenjester
Copy link
Contributor Author

I made a PR for you to look at @jeenbroekstra wrt. the concurrent modification exception.

Good stuff, thanks @hmottestad . Let's continue discussion on that particular issue there.

@abrokenjester
Copy link
Contributor Author

Btw. My biggest issue right now is the build time. A full clean build took me 11 minutes on my (aging) macbook.

Actually, 11 minutes for a full clean project build doesn't sound too bad (I assume that includes quite a few dependency downloads as well). What exact command did you use to do this? From the times I it looks it doesn't include running the compliance tests, but did this include running unit tests?

I'm also noting that creating the rdf4j-client and rdf4j-storage shaded jars together takes up nearly 3 minutes. That might be an easy win: getting rid of those shaded artifacts.

@hmottestad
Copy link
Contributor

Actually, 11 minutes for a full clean project build doesn't sound too bad (I assume that includes quite a few dependency downloads as well). What exact command did you use to do this?

mvn clean install -DskipTests

All dependencies were downloaded already.

abrokenjester added a commit that referenced this issue Aug 31, 2019
duplicate of CustomGraphcQueryInferencerRepositoryConnectionTest in
rdf4j-sail-inferencer, use of native store instead of memory adds little
value.
@abrokenjester abrokenjester self-assigned this Aug 31, 2019
abrokenjester pushed a commit that referenced this issue Sep 8, 2019
abrokenjester added a commit that referenced this issue Nov 3, 2019
* GH-1515 moved inferencing tests out of compliance

* GH-1515 eliminated rdf4j-store-testsuite and compliance module

* GH-1515 fixed-version dependency for rio testsuite no longer necessary
@abrokenjester abrokenjester reopened this Nov 3, 2019
@abrokenjester
Copy link
Contributor Author

Not quite done, I want to take a closer look at some of the native store tests and the SHACL test suite as well, since those are the two biggest chunks of build time.

@abrokenjester
Copy link
Contributor Author

For some reason, after merging this into master, Eclipse IDE is giving me a hard time on project builds. It looks as if its maven project builder can't handle the new structure and gets stuck in a loop. At a loss as to why this didn't happen on the feature branch, tbh.

Everything works as expected using command line maven though so for now assuming this is an Eclipse M2E bug. I may need to revisit the idea to use test-jars though.

@abrokenjester
Copy link
Contributor Author

A full cleanup of my Eclipse workspace and a re-import of the entire project as a generic Java project (after running mvn eclipse:eclipse from command line) seems to have done the trick. For now asssuming it was a glitch.

Folks: do let me know if your IDE of choice is having trouble with the changed test structure. The point of this exercise is to make it easier, not harder :)

@abrokenjester
Copy link
Contributor Author

Unfortunately the issue with Eclipse builds getting stuck in a loop is recurring. Investigating cause and possible fixes (either in our project setup or some config setting in Eclipse I'm overlooking).

abrokenjester added a commit that referenced this issue Dec 27, 2019
* GH-1515 remove unnecessary stdout printing from tests

* GH-1695 GH-1515 standardized logback, warn level for test output

* GH-1515 avoid compliance test deployment, fix test output

* remove code for tweaking logger
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
📶 enhancement issue is a new feature or improvement
Projects
None yet
3 participants