[SPARK-41002][CONNECT][PYTHON] Compatible `take`, `head` and `first` API in Python client #38488
Nice! LGTM!
Title changed from "`take` and `head` API in Python client" to "`take`, `head` and `first` API in Python client".
        :class:`Row`
           First row if :class:`DataFrame` is not empty, otherwise ``None``.
        """
        return self.head()
Maybe this is copied from PySpark, but isn't it better to signal intent here by passing `1` as an explicit parameter?
This is an implementation detail, but I updated it to `self.head(1)`.
Ah, actually we cannot. `self.head()` returns `Optional[Row]` but `self.head(n)` returns `List[Row]`. Keeping `self.head()` makes sure the mypy check passes.
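One way to express the typing constraint described in this comment is with `typing.overload`. The sketch below is only an illustration under assumptions, not the code from this PR: `Row` is a stand-in type alias and `ToyDataFrame` is a hypothetical in-memory class.

```python
from typing import List, Optional, Tuple, Union, overload

# Stand-in for pyspark's Row; an assumption to keep the sketch self-contained.
Row = Tuple[object, ...]


class ToyDataFrame:
    """Hypothetical in-memory frame used only to illustrate the return types."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = list(rows)

    @overload
    def head(self) -> Optional[Row]: ...

    @overload
    def head(self, n: int) -> List[Row]: ...

    def head(self, n: Optional[int] = None) -> Union[Optional[Row], List[Row]]:
        if n is None:
            # No argument: a single row, or None when the frame is empty.
            return self._rows[0] if self._rows else None
        # Explicit n: always a list, even for n == 1.
        return self._rows[:n]

    def first(self) -> Optional[Row]:
        # head() resolves to the Optional[Row] overload. head(1) would resolve
        # to List[Row], which does not match the declared return type.
        return self.head()
```

With overloads like these, a type checker should reject `return self.head(1)` inside `first()`, which matches the reasoning in the comment.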
a03a8dd to 05afe6e
Actually, I will follow https://spark.apache.org/contributing.html and add a short description for the test cases in this PR.
Can one of the admins verify this patch?
OK, added a short description for the new test cases.
Merged into master.
…API in Python client

### What changes were proposed in this pull request?

1. Add `take(n)` API.
2. Change `head(n)` API to return `Union[Optional[Row], List[Row]]`.
3. Update `first()` to return `Optional[Row]`.

### Why are the changes needed?

Improve API coverage.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

UT

Closes apache#38488 from amaliujia/SPARK-41002.

Authored-by: Rui Wang <rui.wang@databricks.com>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>
### What changes were proposed in this pull request?

1. Add `take(n)` API.
2. Change `head(n)` API to return `Union[Optional[Row], List[Row]]`.
3. Update `first()` to return `Optional[Row]`.

### Why are the changes needed?

Improve API coverage.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

UT
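The return shapes listed in the PR description can be illustrated on plain Python lists. This is a rough sketch under assumptions, not the Spark Connect client: `Row` is a stand-in alias, and the free functions `take` and `first` are hypothetical helpers that only mirror the documented return types.

```python
from typing import List, Optional, Tuple

# Stand-in for pyspark's Row; an assumption made for illustration only.
Row = Tuple[object, ...]


def take(rows: List[Row], n: int) -> List[Row]:
    """Return at most the first n rows, always as a list."""
    return rows[:n]


def first(rows: List[Row]) -> Optional[Row]:
    """Return the first row, or None when there are no rows."""
    taken = take(rows, 1)
    return taken[0] if taken else None


rows: List[Row] = [(1, "a"), (2, "b"), (3, "c")]
print(take(rows, 2))  # list of the first two rows
print(first(rows))    # a single row, not a list
print(first([]))      # None for an empty input
```

The point of the sketch is the difference in shape: `take(n)` yields a list even for `n == 1`, while `first()` yields a single row or `None`.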