-
Notifications
You must be signed in to change notification settings - Fork 28k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-47132][DOCS][PYTHON] Correct docstring for pyspark's dataframe.head #45197
Conversation
The docstring claimed that `head(n)` would return a `Row` (rather than a list of rows) iff n == 1, but that's incorrect. Type hints, example, and implementation show that the difference between row or list of rows lies in whether n is supplied at all -- if it isn't, `head()` returns a `Row`, if it is, even if it is 1, `head(n)` returns a list. This commit corrects the docstring.
Good catch! Thanks for working on that! Would you create a JIRA and add the JIRA number to the PR title, along with Also feel free to answer no to |
python/pyspark/sql/dataframe.py
Outdated
@@ -3526,8 +3526,8 @@ def head(self, n: Optional[int] = None) -> Union[Optional[Row], List[Row]]: | |||
|
|||
Returns | |||
------- | |||
If n is greater than 1, return a list of :class:`Row`. | |||
If n is 1, return a single Row. | |||
If n is supplied, return a list of :class:`Row`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
>>> spark.range(3).head(0)
[]
I'm wondering if we should call out the empty list when n=0
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I'm suggesting adding this as an example, and extending the returns
section to mention the length of the list (see commit below).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
Thank you @xinrong-meng ! I've created the JIRA and edited the PR. |
@@ -3526,8 +3526,9 @@ def head(self, n: Optional[int] = None) -> Union[Optional[Row], List[Row]]: | |||
|
|||
Returns | |||
------- | |||
If n is greater than 1, return a list of :class:`Row`. | |||
If n is 1, return a single Row. | |||
If n is supplied, return a list of :class:`Row` of length n |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The description is so much better! Thanks!
Merged to master, thank you! |
….head ### What changes were proposed in this pull request? Change description of `returns` in docstring for pyspark's dataframe's `head` to match actual behaviour. ### Why are the changes needed? The docstring claimed that `head(n)` would return a `Row` (rather than a list of rows) iff n == 1, but that's incorrect. Type hints, example, and implementation show that the difference between row or list of rows lies in whether n is supplied at all -- if it isn't, `head()` returns a `Row`, if it is, even if it is 1, `head(n)` returns a list. ### Does this PR introduce _any_ user-facing change? No, only a Python docstring is changed. ### How was this patch tested? No tests were added since only docs changed. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#45197 from wunderalbert/patch-1. Authored-by: Albert Ziegler <55346556+wunderalbert@users.noreply.github.com> Signed-off-by: Xinrong Meng <xinrong@apache.org>
What changes were proposed in this pull request?
Change description of
returns
in docstring for pyspark's dataframe'shead
to match actual behaviour.Why are the changes needed?
The docstring claimed that
head(n)
would return aRow
(rather than a list of rows) iff n == 1, but that's incorrect.Type hints, example, and implementation show that the difference between row or list of rows lies in whether n is supplied at all -- if it isn't,
head()
returns aRow
, if it is, even if it is 1,head(n)
returns a list.Does this PR introduce any user-facing change?
No, only a Python docstring is changed.
How was this patch tested?
No tests were added since only docs changed.
Was this patch authored or co-authored using generative AI tooling?
No.