[SPARK-41213][CONNECT][PYTHON] Implement DataFrame.__repr__ and DataFrame.dtypes
#38735
@@ -115,6 +115,9 @@ def __init__(
        self._cache: Dict[str, Any] = {}
        self._session: "RemoteSparkSession" = session

    def __repr__(self) -> str:
        return "DataFrame[%s]" % (", ".join("%s: %s" % c for c in self.dtypes))
This follows the default behavior of the existing PySpark DataFrame, in spark/python/pyspark/sql/dataframe.py (lines 860 to 869 in 40a9a6e):
    def __repr__(self) -> str:
        if not self._support_repr_html and self.sparkSession._jconf.isReplEagerEvalEnabled():
            vertical = False
            return self._jdf.showString(
                self.sparkSession._jconf.replEagerEvalMaxNumRows(),
                self.sparkSession._jconf.replEagerEvalTruncate(),
                vertical,
            )
        else:
            return "DataFrame[%s]" % (", ".join("%s: %s" % c for c in self.dtypes))
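The two branches quoted above can be sketched without a JVM. This is only an illustration under stated assumptions: `repr_sketch`, `render_table`, and the single `eager_eval` flag are hypothetical names, not PySpark APIs, and the flag stands in for the combined `_support_repr_html` / `isReplEagerEvalEnabled()` check. The table rendering is a crude placeholder for `showString`, not its real output format.

```python
from typing import Any, List, Sequence, Tuple


def render_table(columns: List[str], rows: Sequence[Sequence[Any]], max_rows: int) -> str:
    # Hypothetical placeholder for showString: header line plus up to max_rows rows.
    header = " | ".join(columns)
    body = [" | ".join(str(v) for v in row) for row in rows[:max_rows]]
    return "\n".join([header] + body)


def repr_sketch(
    dtypes: List[Tuple[str, str]],
    rows: Sequence[Sequence[Any]],
    eager_eval: bool,
    max_rows: int = 20,
) -> str:
    if eager_eval:
        # Eager evaluation: show actual data, truncated to max_rows.
        return render_table([name for name, _ in dtypes], rows, max_rows)
    # Default: schema-only summary, the branch the Connect client implements in this PR.
    return "DataFrame[%s]" % (", ".join("%s: %s" % c for c in dtypes))


dtypes = [("age", "bigint"), ("name", "string")]
rows = [(10, "Alice"), (5, "Bob")]
print(repr_sketch(dtypes, rows, eager_eval=False))
print(repr_sketch(dtypes, rows, eager_eval=True, max_rows=1))
```

The change under review corresponds only to the `else` branch; the eager-evaluation path is not part of this diff.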
Question: is this public API?
It is invoked whenever the REPL displays a DataFrame, for example:
In [1]: df = spark.createDataFrame([(10, 80, "Alice"), (5, None, "Bob"), (None, 10, "Tom"), (None, None, None)], schema=["age", "height", "name"])
In [2]: df
Out[2]: DataFrame[age: bigint, height: bigint, name: string]
In [3]: df.__repr__()
Out[3]: 'DataFrame[age: bigint, height: bigint, name: string]'
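To reproduce the formatting without a Spark session, here is a minimal sketch. `FakeDataFrame` is a hypothetical stand-in, not a PySpark or Spark Connect class; it assumes only that `dtypes` yields (column name, type name) pairs, as the output above suggests.

```python
from typing import List, Tuple


class FakeDataFrame:
    """Hypothetical stand-in for illustration; not part of PySpark."""

    def __init__(self, schema: List[Tuple[str, str]]) -> None:
        self._schema = schema

    @property
    def dtypes(self) -> List[Tuple[str, str]]:
        # Assumed shape: one (column name, type name) tuple per column.
        return list(self._schema)

    def __repr__(self) -> str:
        # Same expression as in the diff: each 2-tuple fills the "%s: %s" template.
        return "DataFrame[%s]" % (", ".join("%s: %s" % c for c in self.dtypes))


df = FakeDataFrame([("age", "bigint"), ("height", "bigint"), ("name", "string")])
print(repr(df))  # DataFrame[age: bigint, height: bigint, name: string]
```

Because `"%s: %s" % c` unpacks the tuple positionally, the expression relies on every `dtypes` entry having exactly two elements.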
thanks for the example!
Merged to master.
@HyukjinKwon thanks for the reviews
…taFrame.dtypes` ### What changes were proposed in this pull request? Implement `DataFrame.__repr__` and `DataFrame.dtypes` ### Why are the changes needed? For api coverage ### Does this PR introduce _any_ user-facing change? yes ### How was this patch tested? added UT Closes apache#38735 from zhengruifeng/connect_df_repr. Authored-by: Ruifeng Zheng <ruifengz@apache.org> Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
What changes were proposed in this pull request?
Implement DataFrame.__repr__ and DataFrame.dtypes
Why are the changes needed?
For API coverage
Does this PR introduce any user-facing change?
yes
How was this patch tested?
Added unit tests