[SPARK-51688][PYTHON][FOLLOW-UP] Implement UDS in Accumulators#50587
Closed
HyukjinKwon wants to merge 2 commits into apache:master
Member
Author
cc @ueshin
dongjoon-hyun
approved these changes
Apr 15, 2025
f906f50 to 384a3ae
HyukjinKwon
commented
Apr 15, 2025
Member
Author
I will disable this again when I merge. It will be disabled by default.
384a3ae to 2817ba4
2817ba4 to 489e564
Member
Author
Tests with the configuration enabled: https://github.com/HyukjinKwon/spark/actions/runs/14461375675
Contributor
Do we also need to add jobs for core/sql/etc. in the new scheduled job (https://github.com/apache/spark/pull/50585/files)?
It seems core is already tested: https://github.com/apache/spark/actions/runs/14462775938/job/40558388704
zhengruifeng
approved these changes
Apr 15, 2025
Member
Author
Merged to master.
HyukjinKwon
added a commit
that referenced
this pull request
Apr 17, 2025
…variable in SparkContext initialization

### What changes were proposed in this pull request?

This PR is a followup of #50587 that makes SparkContext initialization respect the `PYSPARK_UDS_MODE` environment variable used in the scheduled build.

### Why are the changes needed?

To make the scheduled build pass (https://github.com/apache/spark/actions/workflows/build_uds.yml).

### Does this PR introduce _any_ user-facing change?

No, test-only.

### How was this patch tested?

Manually tested via:

```bash
PYSPARK_UDS_MODE=true ./python/run-tests --python-executables=python3 --testnames "pyspark.sql.tests.test_udf UDFTests.test_same_accumulator_in_udfs"
```

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #50616 from HyukjinKwon/SPARK-51800-followup.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
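As a rough illustration of what "respecting" an environment variable such as `PYSPARK_UDS_MODE` can look like, here is a minimal Python sketch. The helper name `uds_mode_enabled` and the parsing rule (only a case-insensitive `true` enables the mode) are assumptions made for this example; they are not taken from the Spark code base.

```python
import os
from typing import Mapping, Optional


def uds_mode_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Hypothetical helper: report whether UDS mode is requested via the environment.

    Assumption: the variable is treated as a boolean flag and only the
    case-insensitive string "true" turns it on. Spark's real parsing may differ.
    """
    env = os.environ if env is None else env
    return env.get("PYSPARK_UDS_MODE", "false").strip().lower() == "true"


# Passing an explicit mapping keeps the check easy to exercise in tests.
print(uds_mode_enabled({"PYSPARK_UDS_MODE": "true"}))
print(uds_mode_enabled({}))
```

Accepting an optional mapping rather than reading `os.environ` directly is a design choice for testability only.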
cloud-fan
reviewed
Apr 18, 2025
if (socket == null || !socket.isOpen) {
  if (isUnixDomainSock) {
    socket = SocketChannel.open(UnixDomainSocketAddress.of(socketPath.get))
    logInfo(log"Connected to AccumulatorServer at socket: ${MDC(SOCKET_ADDRESS, serverHost)}")
cloud-fan
reviewed
Apr 18, 2025
    logInfo(log"Connected to AccumulatorServer at socket: ${MDC(SOCKET_ADDRESS, serverHost)}")
  } else {
    socket = SocketChannel.open(new InetSocketAddress(serverHost.get, serverPort.get))
    logInfo(log"Connected to AccumulatorServer at host: ${MDC(HOST, serverHost)}" +
cloud-fan
reviewed
Apr 18, 2025
  } else {
    socket = SocketChannel.open(new InetSocketAddress(serverHost.get, serverPort.get))
    logInfo(log"Connected to AccumulatorServer at host: ${MDC(HOST, serverHost)}" +
      log" port: ${MDC(PORT, serverPort)}")
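The reviewed snippet picks between a Unix domain socket and a TCP connection depending on whether a socket path is available. The same branching can be sketched in Python with the standard `socket` module. This is only an illustration under stated assumptions: `connect_to_server`, `EchoHandler`, and `demo` are hypothetical names, the echo server stands in for the real accumulator server, and the example requires a platform that supports `AF_UNIX`.

```python
import os
import socket
import socketserver
import tempfile
import threading
from typing import Optional


def connect_to_server(
    socket_path: Optional[str] = None,
    host: Optional[str] = None,
    port: Optional[int] = None,
) -> socket.socket:
    """Open a Unix domain socket if a path is given, otherwise a TCP socket."""
    if socket_path is not None:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.connect(socket_path)
        return sock
    if host is None or port is None:
        raise ValueError("host and port are required when socket_path is not set")
    return socket.create_connection((host, port))


class EchoHandler(socketserver.StreamRequestHandler):
    """Stand-in server logic: echo one line back to the client."""

    def handle(self) -> None:
        self.wfile.write(self.rfile.readline())


def demo() -> bytes:
    """Start a one-shot echo server on a temporary socket path and talk to it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "acc.sock")
        with socketserver.UnixStreamServer(path, EchoHandler) as server:
            worker = threading.Thread(target=server.handle_request, daemon=True)
            worker.start()
            with connect_to_server(socket_path=path) as sock:
                sock.sendall(b"ping\n")
                with sock.makefile("rb") as reader:
                    reply = reader.readline()
            worker.join(timeout=5)
            return reply


print(demo())
```

Placing the socket file in a temporary directory keeps it private to the test run and ensures it is removed afterwards.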
Member
Author
Thanks, made a PR at #50648
HyukjinKwon
added a commit
that referenced
this pull request
Apr 21, 2025
### What changes were proposed in this pull request?

This PR is a followup of #50587 that logs strings properly instead of `Option`s.

### Why are the changes needed?

To log strings properly instead of `Option` instances.

### Does this PR introduce _any_ user-facing change?

No, the main change has not been released yet.

### How was this patch tested?

Manually.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #50648 from HyukjinKwon/SPARK-51688-followup3.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
### What changes were proposed in this pull request?

This PR is a followup of #50466 that enables Unix Domain Sockets (UDS) in Accumulators as well. This will be the last PR to complete UDS in PySpark.

### Why are the changes needed?

This was not handled in the original PR because it was somewhat complicated, so it was separated out. See the original PR.

### Does this PR introduce _any_ user-facing change?

See #50466.

### How was this patch tested?

CI in this PR, with the feature enabled by default. A scheduled build will also be set up through #50585.

### Was this patch authored or co-authored using generative AI tooling?

No.