[BROOKLYN-356] Transformer value resolver test by neykov · Pull Request #365 · apache/brooklyn-server

neykov · 2016-10-02T13:39:26Z

Adds thread startup jittering so we can lure out any concurrency bugs hiding in the code base. Testing core with this results in a handful of failures. Similar problems are very apparent on the windows builds where the machines have a single CPU and are frequently failing. I'd enable this by default for testing only if it wasn't slowing the tests to a craw.
Adds a test which shows the problem with the transformer resolver. It's failing under normal conditions infrequently, so use with the above enabled.

Tests are marked as Broken because there's no fix available for BOOKLYN-356 yet so they might fail builds. The tests are otherwise fully implemented/working.

aledsage · 2016-10-05T09:21:54Z

core/src/main/java/org/apache/brooklyn/util/core/task/BasicExecutionManager.java

        invokeCallback(flags.get("newTaskStartCallback"), task);
    }

+    private void jitterThreadStart(Task<?> task) {


Great idea!

I wonder if we should make it non-static so it can be set by a test, and unset in its tearDown method. But let's go with static. For now, we can use it by setting the system property when doing mvn clean install.

Once we've fixed (most of) the failures that it exposes, then we can think about enabling it for some tests in jenkins - for regression testing purposes. But we want jenkins to be deterministic!

aledsage · 2016-10-05T09:23:19Z

core/src/main/java/org/apache/brooklyn/util/core/task/BasicExecutionManager.java

                Throwable error = null;
-                String oldThreadName = Thread.currentThread().getName();
                try {
-                    if (RENAME_THREADS) {


I presume this is removed from here because internalBeforeStart() and internalAfterEnd() are already doing the thread renaming? If so, then great.

Correct. Details in the commit message:

Remove duplicate thread renaming The code in beforeStartAtomicTask overrides the value set here. Keeping the other one as it's called for scheduled tasks as well.

aledsage · 2016-10-05T09:31:23Z

...src/test/java/org/apache/brooklyn/camp/brooklyn/enricher/TransformerEnricherWithDslTest.java

+    }
+
+    @Test(groups={"Integration", "Broken"}, invocationCount=10)
+    public void testTransformerResolvesResolvableValuesIntegration() {


Nice test. Without the jitter, I ran it with invocationCount=100 and it failed 2 out of 100 times with errors like:

java.lang.AssertionError: entity=Application[s01l91es]; attribute=Sensor: port.transformed (java.lang.String) expected [10889] but found [null]

aledsage · 2016-10-05T09:32:04Z

...src/test/java/org/apache/brooklyn/camp/brooklyn/enricher/TransformerEnricherWithDslTest.java

+        testTransformerResolvesResolvableValues(START_PORT, 200);
+    }
+
+    @Test(groups={"Integration", "Broken"}, invocationCount=10)


Worth including a comment above the test that points people to https://issues.apache.org/jira/browse/BROOKLYN-356

aledsage · 2016-10-05T09:39:45Z

Jenkins build failure looks unrelated: when building "Brooklyn CAMP REST API", got Build timed out (after 60 minutes). Marking the build as aborted.

It was in the middle of invoking DslAndRebindYamlTest.testDslAttributeWhenReadyPersistedWithoutLeakingResolvedValue(). Below is the logging. Note the long GC pause times (approx 0.2 seconds each time). That suggests to me that it was doing a full GC. But it's hard to say on a shared build machine that might be starved of resources.

We should (separately) investigate DslAndRebindYamlTest to see how resource-hungry it is.

Unfortunately we don't have the "brooklyn gc" logging to tell us about the memory usage over time (it's logged at debug). But then for all tests except that one, the test would complete within one minute so we'd have deleted the managementContext before the brooklynGC did any logging.

Maybe we want something like the BrooklynLeakListener to spawn a scheduled executor in the background, and to log the memory usage of the VM periodically so we can see what the test suite is doing over time?

2016-10-02 21:04:32,889 INFO  TESTNG INVOKING: "Surefire test" - org.apache.brooklyn.camp.brooklyn.DslAndRebindYamlTest.testDslAttributeWhenReadyPersistedWithoutLeakingResolvedValue()
2016-10-02 21:04:33,052 INFO  Started application BasicApplicationImpl{id=gpt66t7160}
2016-10-02 21:04:33,052 INFO  Waiting on 3 task(s)
SUREFIRE-859: [GC 538982K->318659K(759808K), 0.1438800 secs]
SUREFIRE-859: [GC 542915K->355674K(767488K), 0.3288260 secs]
SUREFIRE-859: [GC 579930K->383114K(718336K), 0.2902200 secs]
SUREFIRE-859: [GC 558218K->401746K(736256K), 0.2247390 secs]
SUREFIRE-859: [GC 576850K->402834K(729088K), 0.2136890 secs]
SUREFIRE-859: [GC 559506K->395786K(711680K), 0.1718160 secs]
SUREFIRE-859: [GC 552458K->398330K(730624K), 0.1999770 secs]
SUREFIRE-859: [GC 550906K->397842K(709632K), 0.1918430 secs]
SUREFIRE-859: [GC 550418K->397994K(730112K), 0.1925270 secs]
SUREFIRE-859: [GC 548522K->395114K(730624K), 0.1828260 secs]
SUREFIRE-859: [GC 545642K->396522K(732160K), 0.1813480 secs]
SUREFIRE-859: [GC 549610K->398146K(731648K), 0.1988880 secs]
SUREFIRE-859: [GC 551234K->398202K(732672K), 0.1862850 secs]
SUREFIRE-859: [GC 552314K->396794K(732160K), 0.1816440 secs]
SUREFIRE-859: [GC 550906K->397530K(734208K), 0.1915190 secs]
SUREFIRE-859: [GC 554202K->398810K(733184K), 0.1886670 secs]
SUREFIRE-859: [GC 555482K->398810K(734720K), 0.1962980 secs]
SUREFIRE-859: [GC 557530K->398066K(734720K), 0.1893480 secs]
SUREFIRE-859: [GC 556786K->398322K(736768K), 0.1909740 secs]
SUREFIRE-859: [GC 560114K->400122K(735744K), 0.1974540 secs]
SUREFIRE-859: [GC 561914K->400274K(736768K), 0.2081040 secs]
SUREFIRE-859: [GC 563090K->399050K(736768K), 0.1930720 secs]
SUREFIRE-859: [GC 561866K->399466K(738304K), 0.1916980 secs]
SUREFIRE-859: [GC 564842K->400842K(737792K), 0.2071120 secs]
SUREFIRE-859: [GC 566218K->400866K(738816K), 0.2107460 secs]
SUREFIRE-859: [GC 567266K->399898K(738304K), 0.1974350 secs]
SUREFIRE-859: [GC 566298K->400474K(739840K), 0.2096590 secs]
SUREFIRE-859: [GC 568922K->401506K(739328K), 0.2051750 secs]
SUREFIRE-859: [GC 569954K->401882K(740352K), 0.2205450 secs]
SUREFIRE-859: [GC 571866K->400362K(740352K), 0.1994660 secs]
SUREFIRE-859: [GC 570346K->401922K(740864K), 0.2073610 secs]
SUREFIRE-859: [GC 572930K->402458K(740864K), 0.2085150 secs]
SUREFIRE-859: [GC 573466K->402618K(741376K), 0.2088340 secs]
SUREFIRE-859: [GC 574138K->400634K(740864K), 0.1965650 secs]
SUREFIRE-859: [GC 572154K->402330K(742400K), 0.2098550 secs]
SUREFIRE-859: [GC 575898K->403098K(741888K), 0.2152220 secs]
SUREFIRE-859: [GC 576666K->401906K(742400K), 0.2178950 secs]
SUREFIRE-859: [GC 575986K->401714K(742400K), 0.2041080 secs]
Build timed out (after 60 minutes). Marking the build as aborted.

aledsage · 2016-10-05T09:40:53Z

@neykov looks very good to me. A couple of extremely minor things relating to this PR (the comment, and confirming about the thread renaming). Can you kick off the jenkins build again, and then merge this?

aledsage · 2016-10-05T09:49:02Z

@neykov previous build failure here could well be fixed by #368 - our BrooklynLeakListener wasn't finding any left-behind management contexts to stop them, so our memory usage might have been climbing over time.

Once that PR is confirmed to work in jenkins (including checking its stdout that the leak-detector really executed) then I'll merge #368).

Useful for troubleshooting/testing concurrency related code.

The code in beforeStartAtomicTask overrides the value set here. Keeping the other one as it's called for scheduled tasks as well.

neykov · 2016-10-11T06:21:30Z

Addressed comments and kicked off another build.

neykov force-pushed the transformer-resolver-test branch from 361574c to 1f82109 Compare October 2, 2016 20:12

aledsage reviewed Oct 5, 2016

View reviewed changes

neykov added 4 commits October 11, 2016 09:20

Adds optional task startup jitter

42db864

Useful for troubleshooting/testing concurrency related code.

Logging - link thread name to running task

5fc8344

Remove duplicate thread renaming

05622be

The code in beforeStartAtomicTask overrides the value set here. Keeping the other one as it's called for scheduled tasks as well.

Test Transformer reliability when resolving DSL

670f83d

neykov force-pushed the transformer-resolver-test branch from 1f82109 to 670f83d Compare October 11, 2016 06:20

asfgit merged commit 670f83d into apache:master Oct 12, 2016

asfgit pushed a commit that referenced this pull request Oct 12, 2016

This closes #365

f2ce1cf

neykov deleted the transformer-resolver-test branch October 12, 2016 13:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BROOKLYN-356] Transformer value resolver test#365

[BROOKLYN-356] Transformer value resolver test#365
asfgit merged 4 commits intoapache:masterfrom
neykov:transformer-resolver-test

neykov commented Oct 2, 2016 •

edited

Loading

Uh oh!

aledsage Oct 5, 2016

Uh oh!

aledsage Oct 5, 2016

Uh oh!

neykov Oct 11, 2016

Uh oh!

aledsage Oct 5, 2016

Uh oh!

aledsage Oct 5, 2016

Uh oh!

aledsage commented Oct 5, 2016

Uh oh!

aledsage commented Oct 5, 2016

Uh oh!

aledsage commented Oct 5, 2016

Uh oh!

neykov commented Oct 11, 2016 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

neykov commented Oct 2, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

aledsage Oct 5, 2016

Choose a reason for hiding this comment

Uh oh!

aledsage Oct 5, 2016

Choose a reason for hiding this comment

Uh oh!

neykov Oct 11, 2016

Choose a reason for hiding this comment

Uh oh!

aledsage Oct 5, 2016

Choose a reason for hiding this comment

Uh oh!

aledsage Oct 5, 2016

Choose a reason for hiding this comment

Uh oh!

aledsage commented Oct 5, 2016

Uh oh!

aledsage commented Oct 5, 2016

Uh oh!

aledsage commented Oct 5, 2016

Uh oh!

neykov commented Oct 11, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

neykov commented Oct 2, 2016 •

edited

Loading

neykov commented Oct 11, 2016 •

edited

Loading