[SPARK-42099][SPARK-41845][CONNECT][PYTHON] Fix count(*) and count(col(*)) #39622
I think we should fix it on the Scala side, but I didn't find the right place.
I checked that #39636 can resolve the count(*), count(col(*)), count(expr(*))
Looks fine
CI seems to be failing with a compilation error. Could you double-check and re-trigger, @zhengruifeng?
Error: SparkConnectPlannerSuite.scala:555: value addTarget is not a member of org.apache.spark.connect.proto.Expression.UnresolvedStar.Builder
+1, LGTM.
Merged into master, thank you @cloud-fan @HyukjinKwon @dongjoon-hyun!
What changes were proposed in this pull request?
1, add UnresolvedStar to expressions.py;
2, fix count(*) and count(col(*)), which should return Column(UnresolvedStar(None)) instead of Column(UnresolvedAttribute("*")); see spark/sql/core/src/main/scala/org/apache/spark/sql/Column.scala, lines 144 to 150 in 68531ad;
3, remove the count(*) -> count(1) transformation in group.py, since it's no longer needed.
Why are the changes needed?
#39636 fixed the count(*) issue on the server side, and count(expr(*)) works after that PR.
This PR makes the corresponding changes on the Python client side, in order to support count(*) and count(col(*)).
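The client-side change described above can be illustrated with a small sketch. This is not the actual code in expressions.py: the classes below are simplified, hypothetical stand-ins that only share their names with the real ones, and the `target` field name is an assumption. The sketch shows only the dispatch idea, namely that a bare "*" becomes a star expression rather than an attribute literally named "*".

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class UnresolvedStar:
    """Toy stand-in for a star expression; target=None means a bare `*`."""
    target: Optional[str] = None


@dataclass(frozen=True)
class UnresolvedAttribute:
    """Toy stand-in for a named column reference."""
    name: str


@dataclass(frozen=True)
class Column:
    """Toy stand-in for a client-side Column wrapping one expression."""
    expr: Union[UnresolvedStar, UnresolvedAttribute]


def col(name: str) -> Column:
    # Hypothetical helper: "*" is special-cased so that it is sent as a star
    # expression instead of being treated as a column called "*".
    if name == "*":
        return Column(UnresolvedStar(None))
    return Column(UnresolvedAttribute(name))


star = col("*")
plain = col("id")
print(star)
print(plain)
```

With this dispatch in place, an aggregate such as count(col("*")) receives a star expression directly, which is why a separate count(*) -> count(1) rewrite is no longer needed on the client.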
Does this PR introduce any user-facing change?
yes
How was this patch tested?
enabled UT and added UT