[SPARK-41628][CONNECT][SERVER] The Design for support async query execution #40649
Hisoka-X wants to merge 3 commits into apache:master
Conversation
|
@grundprinzip @zhenlineo @LuciferYang Could you take a look and let me know if you see any problems? Thanks. |
|
I suggest placing the design doc in Google Docs and starting a discussion on the dev mailing list so that more people can participate. Additionally, Spark Connect is not limited to Scala clients, so Python clients should also be considered. Meanwhile, there is still a lot of unfinished work on Spark Connect (needed to maintain the same behavior as the native Spark API), so I am not sure everyone has the energy to discuss this new feature at the moment. |
Thanks for the suggestion. I will add a Python design and move the doc to Google Docs later, then send an email to the mailing list. Before starting on this feature, I will try to work on other missing Spark Connect features that need to be added.
|
@Hisoka-X thanks for the write-up. We should be able to support most of this at the moment. gRPC supports this type of execution out of the box. The reason we did not really go for this is API compatibility. The thing that Martin was getting at in the ticket is more about what to do when a disconnect happens. You probably want to reconnect in these cases, and this does require some architectural rework. We are discussing how we should do this, and there are quite a few trade-offs here. Do you mind shelving this until we can provide a bit more clarity? Please let me know if you want in on these conversations. |
OK with me. I would be happy to join the discussion.
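To make the reconnect concern above concrete, here is a minimal, self-contained sketch of a client that resumes a result stream after a dropped connection. It does not use gRPC or any Spark Connect API. `FakeQueryServer`, `submit`, `stream`, `fetch_all`, and `run_demo` are hypothetical names invented for illustration, and the design under discussion may look quite different. The sketch assumes the server buffers results under an operation ID so that a client can reattach at a given offset.

```python
import asyncio
import uuid


class FakeQueryServer:
    """Hypothetical in-memory stand-in for a query server (not a Spark Connect API)."""

    def __init__(self):
        # operation_id -> buffered result rows (assumption: results are kept server-side)
        self._results = {}

    def submit(self, rows):
        # Register the "query" and return an ID the client can use to reattach later.
        operation_id = str(uuid.uuid4())
        self._results[operation_id] = list(rows)
        return operation_id

    async def stream(self, operation_id, start=0, fail_at=None):
        # Yield (index, row) pairs from `start`; optionally simulate a dropped connection.
        rows = self._results[operation_id]
        for index in range(start, len(rows)):
            if fail_at is not None and index == fail_at:
                raise ConnectionError("simulated disconnect")
            await asyncio.sleep(0)  # hand control back to the event loop
            yield index, rows[index]


async def fetch_all(server, operation_id, fail_at=None):
    # Collect every row, reattaching at the last received offset after a disconnect.
    received = []
    while True:
        try:
            async for _index, row in server.stream(
                operation_id, start=len(received), fail_at=fail_at
            ):
                received.append(row)
            return received
        except ConnectionError:
            fail_at = None  # the simulated fault only happens once


def run_demo():
    server = FakeQueryServer()
    operation_id = server.submit(["row-0", "row-1", "row-2", "row-3"])
    return asyncio.run(fetch_all(server, operation_id, fail_at=2))
```

Calling `run_demo()` drops the stream once partway through and still returns all four rows in order. The trade-off this hints at is that the server has to keep results (or enough state to regenerate them) until the client confirms receipt.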
What changes were proposed in this pull request?
The design for supporting async query execution.
Why are the changes needed?
To prepare for implementing async query execution.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Not needed; this PR only contains a design document.