[SPARK-41201][CONNECT][PYTHON] Implement DataFrame.SelectExpr in Python client #38723
amaliujia wants to merge 2 commits into apache:master
So this becomes an unresolved attribute and just works out of the box?
Actually, it becomes Expression.Builder().setExpressionString(), which carries a SQL expression string.
A str can mean different things in the DataFrame API.
That was not quite the question I had, but I was misremembering the type interface of Project:
def __init__(self, child: Optional["LogicalPlan"], *columns: "ExpressionOrString") -> None:
In this case SQLExpression is an expression and it just works.
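To make the point above concrete, here is a minimal sketch of how a select-expression helper could wrap each string before handing it to Project. It does not import pyspark: `LogicalPlan`, `Expression`, `SQLExpression`, and `Project` are simplified stand-ins, and `select_expr` is a hypothetical helper, not the code from this PR.

```python
from typing import List, Optional, Union


class Expression:
    """Stand-in for the client's expression base class (assumption)."""


class SQLExpression(Expression):
    """Wraps a raw SQL expression string; parsing is left to the server."""

    def __init__(self, expr: str) -> None:
        self._expr = expr

    def __repr__(self) -> str:
        return f"SQLExpression({self._expr!r})"


class LogicalPlan:
    """Stand-in for the client's plan base class (assumption)."""


ExpressionOrString = Union[Expression, str]


class Project(LogicalPlan):
    """Simplified projection node mirroring the signature quoted above."""

    def __init__(
        self, child: Optional["LogicalPlan"], *columns: "ExpressionOrString"
    ) -> None:
        self.child = child
        self.columns: List[ExpressionOrString] = list(columns)


def select_expr(child: Optional[LogicalPlan], *exprs: str) -> Project:
    # Wrap every string explicitly so it is treated as a SQL expression
    # and not as a plain column name.
    return Project(child, *[SQLExpression(e) for e in exprs])


plan = select_expr(None, "a + 1", "upper(b) AS b_upper")
print(plan.columns)
```

Because `SQLExpression` is itself an `Expression`, it satisfies `ExpressionOrString` without any change to `Project`.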
python/pyspark/sql/connect/column.py
Should we assert here that expr is a string and not another expression?
In this implementation, we don't need to because the caller has verified the type (thus mypy does not complain).
I think this is a good question, though.
Generally speaking, a public API should throw a user-facing exception, while an internal API can use an assert when we want a defensive check against unexpected input.
So the question is whether we want to enforce such checks across all public and private APIs, each in its corresponding way. Maybe not now, but it is worth doing at the right time (maybe before the 3.4 release).
I think that's fine. One point of having type hints is to avoid asserts on those types too.
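The distinction discussed in this thread could look roughly like the sketch below. The class and function are illustrative stand-ins, not the PR's code, and the error message is an assumption: the public entry point raises a user-facing `TypeError`, while the internal constructor relies on type hints plus an optional defensive assert.

```python
from typing import List


class SQLExpression:
    """Internal class: callers are expected to pass a str already."""

    def __init__(self, expr: str) -> None:
        # Defensive check only. Asserts are stripped under `python -O`,
        # so this must not be the sole guard for user input.
        assert isinstance(expr, str), "internal caller passed a non-str"
        self._expr = expr


def select_expr(*exprs: object) -> List[SQLExpression]:
    """Public entry point (hypothetical): validate and raise a clear error."""
    for e in exprs:
        if not isinstance(e, str):
            raise TypeError(
                f"select_expr expects str arguments, got {type(e).__name__}"
            )
    return [SQLExpression(e) for e in exprs]
```

With type hints on the internal constructor, a checker such as mypy flags bad internal callers statically, which is why the runtime assert there is optional.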
Can one of the admins verify this patch?
python/pyspark/sql/connect/column.py
I still don't like this kind of naming, but it is at least somewhat consistent with what we have in DSv2, so I am fine.
Let's see in the future... I guess we will need to name more...
Didn't we fix this to grpc.RpcError?
Hmm, I guess it was lost during conflict resolution, so the good fix is gone.
3df9929 to 7802dcc
Merged to master.
.toPandas(),
)

@unittest.skip("test_fill_na is flaky")
Ah, I didn't notice this. Can we re-enable it?
I think so, will send a followup for it
I am pretty sure I removed this after conflict resolution.
Actually, Martin pointed out another case: #38723 (comment)
Basically, it seems to have happened more than once that, after resolving a conflict, the code I wanted to keep is gone.
Maybe I should always squash my commits with an interactive rebase (-i), in case rebasing more than one commit causes unexpected results.
I will follow up on this soon.
My guess is that when I have more than one commit locally and resolve the conflict in the first one, a following commit might silently add something back.
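As a reminder of why a stray skip decorator is easy to miss, the sketch below shows that a skipped test still leaves the run successful. The test class name and test bodies are hypothetical; only the decorator line matches the one quoted in this thread.

```python
import unittest


class FillNaTests(unittest.TestCase):
    """Hypothetical test class; only the decorator mirrors the PR diff."""

    @unittest.skip("test_fill_na is flaky")
    def test_fill_na(self) -> None:
        # Never executed while the decorator is present.
        self.fail("should not run")

    def test_other(self) -> None:
        self.assertEqual(1 + 1, 2)


def run_suite() -> unittest.TestResult:
    # Run the suite in-process and collect results without printing.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(FillNaTests)
    result = unittest.TestResult()
    suite.run(result)
    return result


result = run_suite()
# The run is reported as successful even though one test was skipped.
print(result.wasSuccessful(), [reason for _, reason in result.skipped])
```

Checking the list of skipped tests in CI output is one way to notice a test that was disabled unintentionally.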
### What changes were proposed in this pull request?
Reenable test_fill_na

### Why are the changes needed?
`test_fill_na` was disabled by mistake in #38723

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
reenabled test

Closes #38763 from zhengruifeng/connect_reenable_test_fillna.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
…thon client

### What changes were proposed in this pull request?
Implement `DataFrame.SelectExpr` in Python client. `SelectExpr` also has a good amount of usage.

### Why are the changes needed?
API coverage.

### Does this PR introduce _any_ user-facing change?
NO

### How was this patch tested?
UT

Closes apache#38723 from amaliujia/support_select_expr_in_python.

Authored-by: Rui Wang <rui.wang@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
What changes were proposed in this pull request?
Implement DataFrame.SelectExpr in Python client. SelectExpr also has a good amount of usage.
Why are the changes needed?
API coverage.
Does this PR introduce any user-facing change?
NO
How was this patch tested?
UT