Skip to content

Comments

Make RuntimeAsyncCommandsTest more reliable#12602

Merged
mergify[bot] merged 8 commits intodevelopfrom
wip/hubert/11576-flaky-test
Mar 26, 2025
Merged

Make RuntimeAsyncCommandsTest more reliable#12602
mergify[bot] merged 8 commits intodevelopfrom
wip/hubert/11576-flaky-test

Conversation

@hubertp
Copy link
Collaborator

@hubertp hubertp commented Mar 21, 2025

Pull Request Description

Tests are doing more harm than good when they fail randomly. Also, test suite shouldn't take minutes.

Using monitors instead of relaying on flaky Thread.sleep. Changed the output of one test which was relying on sometimes odd interaction with safepoints.

This change should make RuntimeAsyncCommandsTest more reliable while still giving us some feedback about interrupts.
The whole test suite should also run in a couple of seconds.

Closes #11576 🤞

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • All code follows the
    Scala,
    Java,
    TypeScript,
    and
    Rust
    style guides. In case you are using a language not listed above, follow the Rust style guide.
  • Unit tests have been written where possible.

Tests are doing more harm than good when they fail randomly.
Also, test suite shouldn't take minutes.

Using monitors instead of relaying on flaky `Thread.sleep`.
Changed the output of one test which was relying on sometimes odd
interaction with safepoints.

This change should make `RuntimeAsyncCommandsTest` more reliable while
still giving us some feedback about interrupts.
@hubertp hubertp added the CI: No changelog needed Do not require a changelog entry for this PR. label Mar 21, 2025
Copy link
Member

@JaroslavTulach JaroslavTulach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • wait is better than sleep
  • thanks for improving the code
  • suggestion: wrap the awaiting code into a single helper method
  • enter synchronized first and only then loop
  • check condition first and only if not satisfied wait


val failures = responses.filter(_.payload.isInstanceOf[Api.ExecutionFailed])
val failures =
responses.filter(_.payload.isInstanceOf[Api.ExecutionComplete])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change deliberate?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, see my comment in the PR

Copy link
Member

@JaroslavTulach JaroslavTulach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • possibly there is problem calling reset too early
  • otherwise I like the simplification of the testing code

val expectedList = expected.toList
monitor.synchronized {
while (!receivedExpected && iteration < 20) {
out = readAndReset()
Copy link
Member

@JaroslavTulach JaroslavTulach Mar 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

readAndReset() is destructive

  • it cleans the already collected text
  • it is suspicious to reset() before the condition is met
  • what if two lines are expected and ...
  • ... out contains just the first one and the check fails
  • ... then another line is added to out, but ...
  • ... the first line is already reset()?

isProgramStarted = out == List("started")
iteration += 1
}
val isProgramStarted = context.out.awaitOnText("started")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simpler.

context.out.assertAwaitOnText("started", "Program start time out");

Having an assert like check would remove one local variable and three subsequent lines reporting the failure.

iteration += 1
}

val reallyFinished = context.out.awaitOnText(exact = false, "True")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
val reallyFinished = context.out.awaitOnText(exact = false, "True")
context.out.awaitOnText(exact = false, "True") shouldBe true

is relatively short as well. I still like assertAwaitOnText a bit more. Probably because it feels more JUnit4 like...

@hubertp hubertp added the CI: Clean build required CI runners will be cleaned before and after this PR is built. label Mar 25, 2025
@JaroslavTulach
Copy link
Member

JaroslavTulach commented Mar 26, 2025

The changes in this PR improve robustness. But: I am debugging RuntimeAsyncCommandsTest for last few days and alas, the changes in this PR are unlikely to help its stability:

"command-pool-1"
	at java.lang.Thread.interrupt(Thread.java:1713)
	at java.util.concurrent.FutureTask.cancel(FutureTask.java:173)
	at org.enso.interpreter.instrument.execution.JobExecutionEngine.maybeForceCancelRunningJob(JobExecutionEngine.scala:159)
	at org.enso.interpreter.instrument.execution.JobExecutionEngine.$anonfun$abortJobs$3(JobExecutionEngine.scala:330)

The great variability of the results comes from the randomness when this FutureTask.cancel and Thread.interrupt is called and where (in the process of executing the program thread) this happens. This is hard to track under debugger, but I have already seen many results - including ExecutionFailed as well as ExecutionComplete. I also witnessed just InterruptedContextResponse without execution result. This might be caused by

Try(
          executeProgram(contextId, stackItem, localCalls)
).toEither.left
          .map(onExecutionError(stackItem.item, _))

which (in my opinion) may not be catching InterruptedException when it happens by FutureTask.cancel at random moment of execution (for example during acquiring ReentrantLocking.withWriteCompilationLock such an exception is notified as Failed [{}] to acquire lock: interrupted and yields null.asInstanceOf[T]).

    try {
      lockTimestamp = acquireWriteCompilationLock(where)
      callable.call()
    } catch {
      case _: InterruptedException =>
        logger.debug(
          "Failed [{}] to acquire lock: interrupted",
          Array[Any](where.getSimpleName)
        )
        null.asInstanceOf[T]
    } finally {

if InterruptedException happens in callable.call(), then it is lost and null is returned.

@hubertp hubertp added the CI: Ready to merge This PR is eligible for automatic merge label Mar 26, 2025
@mergify mergify bot merged commit 0d6db87 into develop Mar 26, 2025
73 of 74 checks passed
@mergify mergify bot deleted the wip/hubert/11576-flaky-test branch March 26, 2025 08:15
@JaroslavTulach
Copy link
Member

Another problem is in ProgramExecutionSupport:

 def onFailure(): Option[Api.ExecutionResult] = error match {
      case _: ThreadInterruptedException =>
        val message = s"Execution of function $itemName interrupted."
        logger.trace(message)
       None
      case _ =>
        val message = s"Execution of function $itemName failed ($reason)."
        logger.trace(message, error)
        Some(ExecutionResult.Failure(message, None))

if the interrupt happens in guest code (like in Enso calling Thread.sleep 100), then the exception here is going to be HostException wrapping InterruptException - that's what the test is checking for. However, if the exception happens elsewhere (not in Thread.sleep 100), then it may be ThreadInterruptedException and that yields None and the test fails.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CI: Clean build required CI runners will be cleaned before and after this PR is built. CI: No changelog needed Do not require a changelog entry for this PR. CI: Ready to merge This PR is eligible for automatic merge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Flaky RuntimeAsyncCommandsTest

5 participants