[SPARK-57073][SS][PYTHON][TESTS] Fix flaky test_listener_events_spark_command by zhengruifeng · Pull Request #56109 · apache/spark

zhengruifeng · 2026-05-26T07:08:22Z

What changes were proposed in this pull request?

Reintroduce the time.sleep(10) before reading the listener-written tables in test_listener_events_spark_command, and add a sticky comment explaining why the sleep must not be removed or replaced with @eventually.

Why are the changes needed?

SPARK-54957 replaced a time.sleep(60) with @eventually(timeout=60, catch_assertions=True). eventually only retries AssertionError by default. spark.read.table() on a missing table raises AnalysisException (TABLE_OR_VIEW_NOT_FOUND), which is not in the caught set, so the decorator aborts on the first attempt instead of polling.

The onQueryTerminated callback fires asynchronously after q.stop(), so the listener_terminated_events table is sometimes not yet written when the test first reads it. This makes the test flaky; it has failed on the Build / Python-only (master, Python 3.12, MacOS26) scheduled workflow on:

2026-05-23 — https://github.com/apache/spark/actions/runs/26346300968/job/77556662680
2026-05-25 — https://github.com/apache/spark/actions/runs/26423905857/job/77783724134

Both failed with the same exception at the same line.

A 10s sleep gives an order-of-magnitude safety margin over the observed sub-second race window, while avoiding the 60s waste that SPARK-54957 was trying to eliminate.

Does this PR introduce any user-facing change?

No, test-only change.

How was this patch tested?

Existing test pyspark.sql.tests.connect.streaming.test_parity_listener.StreamingListenerParityTests.test_listener_events_spark_command.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.7

### What changes were proposed in this pull request? Reintroduce the `time.sleep(30)` before reading the listener-written tables in `test_listener_events_spark_command`, and add a sticky comment explaining why the sleep must not be removed or replaced with `@eventually`. ### Why are the changes needed? SPARK-54957 replaced a `time.sleep(60)` with `@eventually(timeout=60, catch_assertions=True)`. `eventually` only retries `AssertionError` by default. `spark.read.table()` on a missing table raises `AnalysisException` (`TABLE_OR_VIEW_NOT_FOUND`), which is not in the caught set, so the decorator aborts on the first attempt instead of polling. The `onQueryTerminated` callback fires asynchronously after `q.stop()`, so the `listener_terminated_events` table is sometimes not yet written when the test first reads it. This makes the test flaky; it has failed on the `Build / Python-only (master, Python 3.12, MacOS26)` scheduled workflow on: - 2026-05-23 — https://github.com/apache/spark/actions/runs/26346300968/job/77556662680 - 2026-05-25 — https://github.com/apache/spark/actions/runs/26423905857/job/77783724134 Both failed with the same exception at the same line. ### Does this PR introduce _any_ user-facing change? No, test-only change. ### How was this patch tested? Existing test `pyspark.sql.tests.connect.streaming.test_parity_listener.StreamingListenerParityTests.test_listener_events_spark_command`. ### Was this patch authored or co-authored using generative AI tooling? Generated-by: Claude Opus 4.7

gaogaotiantian · 2026-05-26T17:10:00Z

I think the better way to solve this is to catch the potential exception in eventually. We can either add AnalysisException in expected_exceptions, or do it explicitly in the wrapped function. We should avoid using time.sleep as much as possible. That's a potential cause for unstable tests.

zhengruifeng force-pushed the fix-listener-events-spark-command-dev4 branch from 0c58ed1 to 5c34f7e Compare May 26, 2026 09:01

zhengruifeng force-pushed the fix-listener-events-spark-command-dev4 branch from 5c34f7e to 7065ea1 Compare May 26, 2026 09:35

zhengruifeng changed the title ~~[SS][PYTHON][TESTS] Fix flaky test_listener_events_spark_command~~ [SPARK-57073][SS][PYTHON][TESTS] Fix flaky test_listener_events_spark_command May 26, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SPARK-57073][SS][PYTHON][TESTS] Fix flaky test_listener_events_spark_command#56109

[SPARK-57073][SS][PYTHON][TESTS] Fix flaky test_listener_events_spark_command#56109
zhengruifeng wants to merge 1 commit into
apache:masterfrom
zhengruifeng:fix-listener-events-spark-command-dev4

zhengruifeng commented May 26, 2026 •

edited

Loading

Uh oh!

gaogaotiantian commented May 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

zhengruifeng commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What changes were proposed in this pull request?

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

Was this patch authored or co-authored using generative AI tooling?

Uh oh!

gaogaotiantian commented May 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

zhengruifeng commented May 26, 2026 •

edited

Loading