[SPARK-47035][SS][CONNECT] Protocol for Client-Side Listener
What changes were proposed in this pull request?

Currently, the StreamingQueryListener for Connect runs on the server side.

From the user's point of view, the main purpose of a listener is to provide a push mechanism that notifies them when queries start, end, or make progress. The server-side listener essentially loses this functionality, and we have found that there are internal needs for a client-side listener.

The new listener runs on the client side: the server continuously pushes new listener events to the client, and the client calls the corresponding callback function for each listener event type.

This is the first PR that defines the protocol of this new listener.
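
The push-and-callback flow described above can be sketched as follows. This is only an illustrative sketch, not the implementation from this PR: the callback names (`on_query_progress`, `on_query_terminated`, `on_query_idle`), the `RecordingListener` class, and the `dispatch` helper are hypothetical, and plain Python values stand in for the protobuf messages. The numeric event types are taken from the `StreamingQueryEventType` enum defined in this commit.

```python
import json
from enum import IntEnum


class EventType(IntEnum):
    # Same numeric values as the StreamingQueryEventType enum in this commit.
    QUERY_PROGRESS_UNSPECIFIED = 0
    QUERY_PROGRESS_EVENT = 1
    QUERY_TERMINATED_EVENT = 2
    QUERY_IDLE_EVENT = 3


# Hypothetical callback names; a real listener class may name them differently.
CALLBACK_FOR_TYPE = {
    EventType.QUERY_PROGRESS_EVENT: "on_query_progress",
    EventType.QUERY_TERMINATED_EVENT: "on_query_terminated",
    EventType.QUERY_IDLE_EVENT: "on_query_idle",
}


class RecordingListener:
    """Toy listener that records which callback fired and with what payload."""

    def __init__(self):
        self.calls = []

    def on_query_progress(self, event):
        self.calls.append(("progress", event))

    def on_query_terminated(self, event):
        self.calls.append(("terminated", event))

    def on_query_idle(self, event):
        self.calls.append(("idle", event))


def dispatch(listeners, event_type, event_json):
    """Decode one pushed event and call the matching callback on each listener.

    Returns True if a callback was selected, False for unspecified or
    unknown event types.
    """
    try:
        callback_name = CALLBACK_FOR_TYPE.get(EventType(event_type))
    except ValueError:  # value not defined in the enum
        callback_name = None
    if callback_name is None:
        return False
    event = json.loads(event_json)  # event_json carries the serialized event
    for listener in listeners:
        getattr(listener, callback_name)(event)
    return True


# Usage: one progress event pushed by the server reaches the listener.
demo_listener = RecordingListener()
dispatch([demo_listener], 1, '{"id": "q1"}')
```

Note that there is no started-event entry in the mapping: per the protocol in this commit, the query started event is delivered through `WriteStreamOperationStartResult` rather than through the event stream.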

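Why are the changes needed?

To add a client-side listener, which makes more sense than the server-side one.

Does this PR introduce any user-facing change?

Not in this PR.

How was this patch tested?

No tests are needed for this PR; new tests will be added later.

Was this patch authored or co-authored using generative AI tooling?

No.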

Closes apache#45091 from WweiL/SPARK-47035-protocol.

Authored-by: Wei Liu <wei.liu@databricks.com>
Signed-off-by: Takuya UESHIN <ueshin@databricks.com>
WweiL committed May 2, 2024
1 parent d82403f commit 974eb16
Showing 3 changed files with 245 additions and 49 deletions.
@@ -234,6 +234,9 @@ message WriteStreamOperationStartResult {
  // An optional query name.
  string name = 2;

  // Optional query started event if there is any listener registered on the client side.
  optional string query_started_event_json = 3;

  // TODO: How do we indicate errors?
  // TODO: Consider adding status, last progress etc here.
}
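
As a rough illustration of how a client might consume the new optional `query_started_event_json` field, the sketch below delivers a started event only when the field is present. The `StartResult` dataclass is a hypothetical stand-in for the generated protobuf class, and `notify_query_started` is an assumed helper name, not part of this commit.

```python
import json
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class StartResult:
    """Hypothetical stand-in for the WriteStreamOperationStartResult fields used here."""

    name: str
    query_started_event_json: Optional[str] = None  # None models "field not set"


def notify_query_started(
    result: StartResult, callbacks: List[Callable[[dict], None]]
) -> bool:
    """Call every started-callback if the server attached a started event."""
    if result.query_started_event_json is None:
        # The field is optional: it is only expected when a listener is
        # registered on the client side, so there is nothing to deliver.
        return False
    event = json.loads(result.query_started_event_json)
    for callback in callbacks:
        callback(event)
    return True
```

Modelling the unset field as `None` mirrors the presence check that a proto3 `optional` field allows, so the client can tell "no event" apart from an empty string.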
@@ -407,6 +410,40 @@ message StreamingQueryManagerCommandResult {
}
}

// The protocol for client-side StreamingQueryListener.
// This command will only be set when either the first listener is added to the client, or the last
// listener is removed from the client.
// The add_listener_bus_listener command will only be set true in the first case.
// The remove_listener_bus_listener command will only be set true in the second case.
message StreamingQueryListenerBusCommand {
  oneof command {
    bool add_listener_bus_listener = 1;
    bool remove_listener_bus_listener = 2;
  }
}

// The enum used for client side streaming query listener event
// There is no QueryStartedEvent defined here,
// it is added as a field in WriteStreamOperationStartResult
enum StreamingQueryEventType {
  QUERY_PROGRESS_UNSPECIFIED = 0;
  QUERY_PROGRESS_EVENT = 1;
  QUERY_TERMINATED_EVENT = 2;
  QUERY_IDLE_EVENT = 3;
}

// The protocol for the returned events in the long-running response channel.
message StreamingQueryListenerEvent {
  // (Required) The json serialized event, all StreamingQueryListener events have a json method
  string event_json = 1;
  // (Required) Query event type used by client to decide how to deserialize the event_json
  StreamingQueryEventType event_type = 2;
}

message StreamingQueryListenerEventsResult {
  repeated StreamingQueryListenerEvent events = 1;
}

// Command to get the output of 'SparkContext.resources'
message GetResourcesCommand { }
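
The comments on `StreamingQueryListenerBusCommand` in the hunk above say the command is only sent when the first listener is added or the last one is removed. A minimal sketch of that bookkeeping on the client might look like the following. The `ClientListenerBus` class and its `sent_commands` list are hypothetical: the list simply records which `oneof` field would be set, instead of building and sending a real protobuf command.

```python
class ClientListenerBus:
    """Toy client-side registry that decides when a bus command is needed."""

    def __init__(self):
        self._listeners = []
        # Records the name of the oneof field that would be set to true.
        self.sent_commands = []

    def add_listener(self, listener):
        self._listeners.append(listener)
        if len(self._listeners) == 1:
            # First listener on this client: ask the server to start pushing events.
            self.sent_commands.append("add_listener_bus_listener")

    def remove_listener(self, listener):
        if listener not in self._listeners:
            return  # unknown listener: nothing changes, nothing is sent
        self._listeners.remove(listener)
        if not self._listeners:
            # Last listener removed: ask the server to stop pushing events.
            self.sent_commands.append("remove_listener_bus_listener")


# Usage: two listeners are added and removed, but only two commands are recorded.
bus = ClientListenerBus()
first, second = object(), object()
bus.add_listener(first)
bus.add_listener(second)
bus.remove_listener(first)
bus.remove_listener(second)
```

Under these assumptions the server sees one add command and one remove command per client, no matter how many listeners the client registers in between.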
