Make RuntimeAsyncCommandsTest more reliable by hubertp · Pull Request #12602 · enso-org/enso

hubertp · 2025-03-21T15:46:15Z

Pull Request Description

Tests do more harm than good when they fail randomly. Also, the test suite shouldn't take minutes.

This PR uses monitors instead of relying on flaky Thread.sleep calls. It also changes the expected output of one test, which relied on a sometimes odd interaction with safepoints.

This change should make RuntimeAsyncCommandsTest more reliable while still giving us some feedback about interrupts.
The whole test suite should also run in a couple of seconds.

Closes #11576 🤞

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

All code follows the Scala, Java, TypeScript, and Rust style guides. In case you are using a language not listed above, follow the Rust style guide.
Unit tests have been written where possible.

JaroslavTulach

wait is better than sleep
thanks for improving the code
suggestion: wrap the awaiting code into a single helper method
enter synchronized first and only then loop
check condition first and only if not satisfied wait
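
The review points above (a single helper, entering synchronized before looping, and checking the condition before waiting) could be sketched roughly as follows. This is only an illustration: `OutputMonitor`, `append` and `awaitCondition` are hypothetical names and not the helper that was actually added in this PR.

```scala
// Hypothetical helper, not the code from this PR: a guarded wait on a monitor.
final class OutputMonitor {
  private val monitor               = new Object
  private var lines: Vector[String] = Vector.empty

  /** Called by the producer, e.g. the thread capturing program output. */
  def append(line: String): Unit = monitor.synchronized {
    lines :+= line
    monitor.notifyAll() // wake up any thread blocked in awaitCondition
  }

  /** Returns true once `condition` holds for the collected lines,
    * or false when `timeoutMillis` elapses first.
    */
  def awaitCondition(timeoutMillis: Long)(
    condition: Vector[String] => Boolean
  ): Boolean =
    monitor.synchronized { // enter synchronized first, then loop
      val deadline  = System.nanoTime() + timeoutMillis * 1000000L
      var satisfied = condition(lines) // check the condition first
      var remaining = timeoutMillis
      while (!satisfied && remaining > 0) {
        monitor.wait(remaining) // wait only if the condition is not met
        satisfied = condition(lines)
        remaining = (deadline - System.nanoTime()) / 1000000L
      }
      satisfied
    }
}
```

Because the condition is re-checked after every wake-up, spurious wake-ups and notifications for unrelated lines are harmless; the deadline keeps the total wait bounded.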

Akirathan · 2025-03-24T13:37:49Z

...ion-tests/src/test/scala/org/enso/interpreter/test/instrument/RuntimeAsyncCommandsTest.scala


-    val failures = responses.filter(_.payload.isInstanceOf[Api.ExecutionFailed])
+    val failures =
+      responses.filter(_.payload.isInstanceOf[Api.ExecutionComplete])


Is this change deliberate?

Yes, see my comment in the PR

Maybe Windows is just slow by definition.

JaroslavTulach

possibly there is problem calling reset too early
otherwise I like the simplification of the testing code

JaroslavTulach · 2025-03-25T09:42:57Z

...ion-tests/src/test/scala/org/enso/interpreter/test/instrument/RuntimeAsyncCommandsTest.scala

+      val expectedList      = expected.toList
+      monitor.synchronized {
+        while (!receivedExpected && iteration < 20) {
+          out = readAndReset()


readAndReset() is destructive

it cleans the already collected text

it is suspicious to reset() before the condition is met

what if two lines are expected and ...

... out contains just the first one and the check fails

... then another line is added to out, but ...

... the first line is already reset()?
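
The scenario described in this comment can be reproduced in isolation. The sketch below uses a hypothetical `CapturedOutput` class as a stand-in for the test's output buffer (only the name `readAndReset` comes from the diff; `peek` is an assumption made for illustration) and compares a destructive poll with a non-destructive one.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical stand-in for the test's output buffer; illustrative only.
final class CapturedOutput {
  private val buffer = ListBuffer.empty[String]

  def append(line: String): Unit = synchronized { buffer += line }

  /** Destructive: returns the collected lines and clears them. */
  def readAndReset(): List[String] = synchronized {
    val out = buffer.toList
    buffer.clear()
    out
  }

  /** Non-destructive: returns the collected lines and keeps them. */
  def peek(): List[String] = synchronized { buffer.toList }
}

object ResetTooEarly {
  /** Returns (destructive poll succeeded, non-destructive poll succeeded). */
  def demo(): (Boolean, Boolean) = {
    val expected = List("first", "second")

    // Destructive polling: the first, partial read discards "first".
    val a = new CapturedOutput
    a.append("first")
    val partial = a.readAndReset() // only "first" so far, so the check fails
    a.append("second")
    val destructiveOk = partial != expected && a.readAndReset() == expected

    // Non-destructive polling: nothing is lost between checks.
    val b = new CapturedOutput
    b.append("first")
    val early = b.peek() == expected // still false, but "first" is kept
    b.append("second")
    val nonDestructiveOk = !early && b.peek() == expected
    if (nonDestructiveOk) b.readAndReset() // reset only once the condition is met

    (destructiveOk, nonDestructiveOk)
  }
}
```

In this sketch the destructive variant never observes both lines together, while the variant that resets only after the condition is met does.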

JaroslavTulach · 2025-03-25T09:45:18Z

...ion-tests/src/test/scala/org/enso/interpreter/test/instrument/RuntimeAsyncCommandsTest.scala

-      isProgramStarted = out == List("started")
-      iteration += 1
-    }
+    val isProgramStarted = context.out.awaitOnText("started")


Simpler.

context.out.assertAwaitOnText("started", "Program start time out");

Having an assert-like check would remove one local variable and the three subsequent lines that report the failure.
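
An assert-style wrapper of the kind suggested here could look roughly like this. `AwaitAsserts.assertAwait` is a hypothetical name used only for this sketch; it is not the `assertAwaitOnText` method proposed above, and the by-name `await` parameter stands in for whatever boolean-returning wait the test already has.

```scala
object AwaitAsserts {

  /** Hypothetical assert-style wrapper: evaluates `await` once and
    * throws an AssertionError carrying `message` when it returns false.
    */
  def assertAwait(message: String)(await: => Boolean): Unit =
    if (!await) throw new AssertionError(message)
}
```

A call site would then shrink to a single line, for example `AwaitAsserts.assertAwait("Program start time out")(context.out.awaitOnText("started"))`, assuming `awaitOnText` returns a Boolean as in the diff.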

JaroslavTulach · 2025-03-25T09:48:45Z

...ion-tests/src/test/scala/org/enso/interpreter/test/instrument/RuntimeAsyncCommandsTest.scala

-      iteration += 1
-    }
-
+    val reallyFinished = context.out.awaitOnText(exact = false, "True")


Suggested change

val reallyFinished = context.out.awaitOnText(exact = false, "True")

context.out.awaitOnText(exact = false, "True") shouldBe true

is relatively short as well. I still like assertAwaitOnText a bit more. Probably because it feels more JUnit4 like...

JaroslavTulach · 2025-03-26T06:19:05Z

The changes in this PR improve robustness. But I have been debugging RuntimeAsyncCommandsTest for the last few days and, alas, the changes in this PR are unlikely to help its stability:

"command-pool-1"
	at java.lang.Thread.interrupt(Thread.java:1713)
	at java.util.concurrent.FutureTask.cancel(FutureTask.java:173)
	at org.enso.interpreter.instrument.execution.JobExecutionEngine.maybeForceCancelRunningJob(JobExecutionEngine.scala:159)
	at org.enso.interpreter.instrument.execution.JobExecutionEngine.$anonfun$abortJobs$3(JobExecutionEngine.scala:330)

The great variability of the results comes from the randomness of when FutureTask.cancel and Thread.interrupt are called and where in the execution of the program thread the interrupt lands. This is hard to track under a debugger, but I have already seen many different results, including ExecutionFailed as well as ExecutionComplete. I have also witnessed just InterruptedContextResponse without any execution result. This might be caused by

Try(
  executeProgram(contextId, stackItem, localCalls)
).toEither.left
  .map(onExecutionError(stackItem.item, _))

which (in my opinion) may not be catching InterruptedException when FutureTask.cancel triggers it at a random moment of execution. For example, when it happens while acquiring the lock in ReentrantLocking.withWriteCompilationLock, the exception is logged as Failed [{}] to acquire lock: interrupted and the method yields null.asInstanceOf[T].

    try {
      lockTimestamp = acquireWriteCompilationLock(where)
      callable.call()
    } catch {
      case _: InterruptedException =>
        logger.debug(
          "Failed [{}] to acquire lock: interrupted",
          Array[Any](where.getSimpleName)
        )
        null.asInstanceOf[T]
    } finally {

If the InterruptedException happens in callable.call(), then it is lost and null is returned.
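
The effect can be shown with a small, self-contained sketch. It uses a plain `java.util.concurrent.locks.ReentrantLock` instead of Enso's compilation lock, and `LockingSketch`, `withLockSwallowing` and `withLockPropagating` are hypothetical names. The second method is just one possible way to keep an interrupt raised by the callable visible; it is not a fix taken from this PR.

```scala
import java.util.concurrent.Callable
import java.util.concurrent.locks.ReentrantLock

object LockingSketch {
  private val lock = new ReentrantLock()

  /** Mirrors the shape of the quoted snippet: a single try covers both
    * the lock acquisition and the call, so an InterruptedException thrown
    * by `callable` is swallowed and null is returned.
    */
  def withLockSwallowing[T](callable: Callable[T]): T =
    try {
      lock.lockInterruptibly()
      try callable.call()
      finally lock.unlock()
    } catch {
      case _: InterruptedException => null.asInstanceOf[T]
    }

  /** Alternative shape (an assumption, not the project's code): only an
    * interrupt during acquisition is handled; an interrupt raised inside
    * `callable` propagates to the caller.
    */
  def withLockPropagating[T](callable: Callable[T]): Option[T] = {
    val acquired =
      try { lock.lockInterruptibly(); true }
      catch { case _: InterruptedException => false }
    if (!acquired) None
    else
      try Some(callable.call())
      finally lock.unlock()
  }
}
```

With the first shape the caller cannot tell an interrupted acquisition from an interrupted computation; with the second, `None` means the lock was never taken and an exception means the computation itself was interrupted.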

JaroslavTulach · 2025-03-26T08:59:53Z

Another problem is in ProgramExecutionSupport:

    def onFailure(): Option[Api.ExecutionResult] = error match {
      case _: ThreadInterruptedException =>
        val message = s"Execution of function $itemName interrupted."
        logger.trace(message)
        None
      case _ =>
        val message = s"Execution of function $itemName failed ($reason)."
        logger.trace(message, error)
        Some(ExecutionResult.Failure(message, None))

If the interrupt happens in guest code (for example while Enso is calling Thread.sleep 100), then the exception here is going to be a HostException wrapping InterruptedException; that is what the test is checking for. However, if the interrupt happens elsewhere (not in Thread.sleep 100), then the exception may be a ThreadInterruptedException, which yields None, and the test fails.
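
The two outcomes described above can be illustrated with stand-in types. In the sketch below, `ThreadInterruptedException` and `HostException` are local placeholder classes and not the real runtime or polyglot types, and the result is simplified to an `Option[String]`; only the shape of the pattern match follows the quoted `onFailure`.

```scala
object OnFailureSketch {
  // Placeholder types for illustration; the real classes live elsewhere.
  final class ThreadInterruptedException extends RuntimeException
  final class HostException(cause: Throwable) extends RuntimeException(cause)

  /** Same dispatch as the quoted code: a ThreadInterruptedException yields
    * no result at all, anything else becomes a failure message.
    */
  def onFailure(itemName: String, error: Throwable): Option[String] =
    error match {
      case _: ThreadInterruptedException =>
        None
      case other =>
        val reason = other.getClass.getName
        Some(s"Execution of function $itemName failed ($reason).")
    }
}
```

An interrupt surfacing as a wrapped host exception therefore produces a failure result that a test can observe, while the same interrupt surfacing as `ThreadInterruptedException` produces nothing.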

hubertp added the "CI: No changelog needed" label Mar 21, 2025

hubertp requested review from 4e6, Akirathan and JaroslavTulach as code owners March 21, 2025 15:46

JaroslavTulach approved these changes Mar 21, 2025

4e6 approved these changes Mar 21, 2025

hubertp added 2 commits March 24, 2025 13:13

refactoring

53d98fa

More refactorings

530b05f

Akirathan reviewed Mar 24, 2025

hubertp added 2 commits March 24, 2025 15:20

Merge branch 'develop' into wip/hubert/11576-flaky-test

b836e43

Double iterations when waiting on output

3ee1e92

JaroslavTulach approved these changes Mar 25, 2025

JaroslavTulach mentioned this pull request Mar 25, 2025

Reschedule guest code execution into dedicated thread pool #12613

Merged

3 tasks

hubertp added 3 commits March 25, 2025 12:48

Bring back old timeout

bd9e7d8

nit

eb5331e

Merge branch 'develop' into wip/hubert/11576-flaky-test

ed0b580

hubertp added the "CI: Clean build required" label Mar 25, 2025

hubertp added the "CI: Ready to merge" label Mar 26, 2025

mergify bot merged commit 0d6db87 into develop Mar 26, 2025
73 of 74 checks passed

mergify bot deleted the wip/hubert/11576-flaky-test branch March 26, 2025 08:15

jdunkerley added this to the 2025-Q2 Release milestone Mar 26, 2025

JaroslavTulach mentioned this pull request Mar 26, 2025

Rely on ThreadManager when interrupting Enso execution #12655

Merged

2 tasks

This was referenced Mar 26, 2025

Flaky RuntimeAsyncCommandsTest #11576

Closed

Optimize runtime Docker image size #11529

Merged

JaroslavTulach mentioned this pull request May 4, 2025

Adjust RuntimeAsyncCommandsTest program start timeouts #13024

Closed

5 tasks

5 participants
