[SPARK-47233][CONNECT][SS][2/2] Client & Server logic for Client side streaming query listener #46037

Closed
wants to merge 6 commits into from

@WweiL WweiL commented Apr 13, 2024

What changes were proposed in this pull request?

This PR adds the server and client sides of the client-side streaming query listener.

The client sends an `add_listener_bus_listener` RPC the first time a listener is added.
On receiving the RPC, the server starts a long-running thread and registers a new `SparkConnectListenerBusListener`. The listener streams listener events back to the client through the `responseObserver` created in the `executeHandler` of the `add_listener_bus_listener` call.
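
The server-side registration step can be modeled roughly as follows. This is a minimal Python sketch of the idea only, not the actual (Scala) server code: `ListenerBus`, `ForwardingListener`, `handle_add_listener_bus_listener`, and the `stop` event are hypothetical names, and a plain callable stands in for the `responseObserver`.

```python
import threading


class ListenerBus:
    """Hypothetical stand-in for the server's streaming query listener bus."""

    def __init__(self):
        self.listeners = []

    def post(self, event):
        # Deliver the event to every registered listener.
        for listener in list(self.listeners):
            listener.on_event(event)


class ForwardingListener:
    """Plays the role of SparkConnectListenerBusListener in this sketch."""

    def __init__(self, response_observer):
        # Assumption: the observer is just a callable that writes to the client.
        self._response_observer = response_observer

    def on_event(self, event):
        self._response_observer(event)


def handle_add_listener_bus_listener(bus, response_observer, stop):
    """Register a forwarding listener and keep the RPC open on its own thread."""
    listener = ForwardingListener(response_observer)
    bus.listeners.append(listener)
    # The long-running thread only waits; it keeps the response stream alive
    # until something sets the stop event.
    thread = threading.Thread(target=stop.wait, daemon=True)
    thread.start()
    return listener, thread


bus = ListenerBus()
sent_to_client = []
stop = threading.Event()
listener, thread = handle_add_listener_bus_listener(bus, sent_to_client.append, stop)
bus.post("QueryStartedEvent")
```

In this model the events flow over the stream that belongs to the original `add_listener_bus_listener` call, so no second connection is needed to deliver them.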

On the client side, a new Spark client method, `execute_long_running_command`, continuously receives new events from the server through a long-running iterator. The client starts a new thread to handle these events. Please see the graphs below for a more detailed illustration.
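
A rough Python sketch of the client-side pattern, under stated assumptions: `FakeServerStream` and `ClientListenerBus` are hypothetical names, the `execute_long_running_command` argument is a stand-in callable rather than the real client method, and a private marker object plays the role of the final `ResultComplete`.

```python
import queue
import threading

_DONE = object()  # stands in for the final ResultComplete message


class FakeServerStream:
    """Hypothetical stand-in for the server end of the long-running RPC."""

    def __init__(self):
        self._queue = queue.Queue()

    def push(self, event):
        self._queue.put(event)

    def complete(self):
        self._queue.put(_DONE)

    def __iter__(self):
        # Blocks on each get() and stops once the completion marker arrives.
        return iter(self._queue.get, _DONE)


class ClientListenerBus:
    def __init__(self, execute_long_running_command):
        self._execute = execute_long_running_command
        self._listeners = []
        self.handler_thread = None

    def add_listener(self, listener):
        self._listeners.append(listener)
        if self.handler_thread is None:
            # First listener ever added: open the stream once and start
            # a background thread that consumes it.
            responses = self._execute("add_listener_bus_listener")
            self.handler_thread = threading.Thread(
                target=self._handle, args=(responses,), daemon=True
            )
            self.handler_thread.start()

    def _handle(self, responses):
        # Runs until the long-running iterator is exhausted.
        for event in responses:
            for listener in list(self._listeners):
                listener(event)


stream = FakeServerStream()
client_bus = ClientListenerBus(lambda command: iter(stream))
received = []
client_bus.add_listener(received.append)
stream.push("QueryStartedEvent")
stream.push("QueryProgressEvent")
stream.complete()
client_bus.handler_thread.join(timeout=2)
```

Handling events on a separate thread keeps the caller of `add_listener` from blocking on a stream that may stay open for the lifetime of the session.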

The long-running server thread stops in either of two cases: the last client-side listener is removed and the client sends a `remove_listener_bus_listener` call, or the `send` method of `SparkConnectListenerBusListener` throws. As a result, the final `ResultComplete` is sent to the client, which closes the client's long-running iterator.
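
The two stop conditions can be sketched in Python as below. This is an illustrative model only: `ServerListenerHolder`, `remove`, and the string `"ResultComplete"` are hypothetical stand-ins, and swallowing a failure while sending the final message is an assumption of this sketch, not something the PR description states.

```python
import threading

RESULT_COMPLETE = "ResultComplete"  # stand-in for the final response message


class ServerListenerHolder:
    """Models the long-running server thread and its two stop conditions."""

    def __init__(self, send):
        self._send = send  # callable that writes to the client; may raise
        self._stopped = threading.Event()
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self):
        # Wait until either stop condition fires, then finish the stream.
        self._stopped.wait()
        try:
            self._send(RESULT_COMPLETE)
        except Exception:
            pass  # assumption: the client is already gone, nothing to finish

    def on_event(self, event):
        if self._stopped.is_set():
            return
        try:
            self._send(event)
        except Exception:
            # Stop condition 2: sending to the client failed.
            self._stopped.set()

    def remove(self):
        # Stop condition 1: the client asked to remove the listener.
        self._stopped.set()


# Path 1: explicit removal ends the stream with a final completion message.
sent = []
holder = ServerListenerHolder(sent.append)
holder.on_event("QueryStartedEvent")
holder.remove()
holder.thread.join(timeout=2)


# Path 2: a failing send stops the thread without any explicit removal.
def broken_send(message):
    raise ConnectionError("client disconnected")


broken_holder = ServerListenerHolder(broken_send)
broken_holder.on_event("QueryStartedEvent")
broken_holder.thread.join(timeout=2)
```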

Why are the changes needed?

Development of Spark Connect streaming.

Does this PR introduce any user-facing change?

How was this patch tested?

Added unit test. Removed the old unit test that was created to verify server-side listener limitations.

Was this patch authored or co-authored using generative AI tooling?

No

@WweiL WweiL changed the title Spark 47233 client side listener 2 [SPARK-47233][CONNECT][SS][2/2] Client & Server logic for Client side streaming query listener Apr 13, 2024
@WweiL WweiL marked this pull request as ready for review April 13, 2024 02:16
@HyukjinKwon

Merged to master.

WweiL added a commit to WweiL/oss-spark that referenced this pull request May 2, 2024

Closes apache#46037 from WweiL/SPARK-47233-client-side-listener-2.

Authored-by: Wei Liu <wei.liu@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>